Message ID | alpine.DEB.2.20.2208021117360.10833@tpp.orcam.me.uk |
---|---|
State | New, archived |
Headers |
From: "Maciej W. Rozycki" <macro@embecosm.com> To: gcc-patches@gcc.gnu.org Cc: Andrew Waterman <andrew@sifive.com>, Kito Cheng <kito.cheng@gmail.com> Date: Wed, 3 Aug 2022 10:54:08 +0100 (BST) Subject: [PATCH] RISC-V: Avoid redundant sign-extension for SImode SGE, SGEU, SLE, SLEU Message-ID: <alpine.DEB.2.20.2208021117360.10833@tpp.orcam.me.uk> List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> |
Series |
RISC-V: Avoid redundant sign-extension for SImode SGE, SGEU, SLE, SLEU
|
|
Commit Message
Maciej W. Rozycki
Aug. 3, 2022, 9:54 a.m. UTC
We produce inefficient code for some synthesized SImode conditional set
operations (i.e. ones that are not directly implemented in hardware) on
RV64.  For example, a piece of C code like this:

int
sleu (unsigned int x, unsigned int y)
{
  return x <= y;
}

gets compiled (at `-O2') to this:

sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
        xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
        sext.w  a0,a0           # 16    [c=4 l=4]  extendsidi2/0
        ret                     # 25    [c=0 l=4]  simple_return

This is because the middle end expands a SLEU operation missing from
RISC-V hardware into a sequence of a SImode SGTU operation followed by
an explicit SImode XORI operation with immediate 1.  And while the SGTU
machine instruction (alias SLTU with the input operands swapped) gives a
properly sign-extended 32-bit result, which is valid both as a SImode and
as a DImode operand, the middle end does not see that through a SImode
XORI operation, because we tell the middle end that the RISC-V target
(unlike MIPS) may hold values in DImode integer registers that are valid
for SImode operations even if not properly sign-extended.

However the RISC-V psABI requires that 32-bit function arguments and
results passed in 64-bit integer registers be properly sign-extended, so
this is explicitly done at the conclusion of the function.

Fix this by making the backend use a sequence of a DImode SGTU operation
followed by a SImode SEQZ operation instead.  The latter operation is
known by the middle end to produce a properly sign-extended 32-bit
result and therefore combine gets rid of the sign-extension operation
that follows and actually folds it into the very same XORI machine
operation resulting in:

sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_didi
        xori    a0,a0,1         # 16    [c=4 l=4]  xordi3/1
        ret                     # 25    [c=0 l=4]  simple_return

instead (although the SEQZ alias SLTIU against immediate 1 machine
instruction would equally do and is actually retained at `-O0').  This
is handled analogously for the remaining synthesized operations of this
kind, i.e. `SLE', `SGEU', and `SGE'.

        gcc/
        * config/riscv/riscv.cc (riscv_emit_int_order_test): Use EQ 0
        rather than XOR 1 for LE and LEU operations.

        gcc/testsuite/
        * gcc.target/riscv/sge.c: New test.
        * gcc.target/riscv/sgeu.c: New test.
        * gcc.target/riscv/sle.c: New test.
        * gcc.target/riscv/sleu.c: New test.
---
Hi,

 Regression-tested with the `riscv64-linux-gnu' target.  OK to apply?

  Maciej
---
 gcc/config/riscv/riscv.cc             |    4 ++--
 gcc/testsuite/gcc.target/riscv/sge.c  |   11 +++++++++++
 gcc/testsuite/gcc.target/riscv/sgeu.c |   11 +++++++++++
 gcc/testsuite/gcc.target/riscv/sle.c  |   11 +++++++++++
 gcc/testsuite/gcc.target/riscv/sleu.c |   11 +++++++++++
 5 files changed, 46 insertions(+), 2 deletions(-)

gcc-riscv-int-order-inv-seqz.diff
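The equivalence the commit message relies on can be checked outside the compiler. The following is a minimal C sketch that models the two instruction sequences on 64-bit registers. The helper names (`sext_w`, `sgtu`, `sleu_old`, `sleu_new`, `check_sleu`) are invented for illustration and are not part of GCC, and the model assumes a compiler where narrowing to `int32_t` wraps, as GCC does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of `sext.w': sign-extend the low 32 bits of a 64-bit register.
   Assumes the narrowing conversion wraps (true for GCC).  */
static int64_t sext_w (int64_t reg)
{
  return (int64_t) (int32_t) (uint32_t) reg;
}

/* Model of `sgtu rd,rs1,rs2': unsigned 64-bit compare, result 0 or 1.  */
static int64_t sgtu (int64_t a, int64_t b)
{
  return (uint64_t) a > (uint64_t) b;
}

/* Old sequence: sgtu; xori 1; sext.w.  Arguments arrive sign-extended,
   as the psABI requires for 32-bit values in 64-bit registers.  */
static int64_t sleu_old (uint32_t x, uint32_t y)
{
  int64_t a0 = sext_w (x), a1 = sext_w (y);
  a0 = sgtu (a0, a1);
  a0 ^= 1;
  return sext_w (a0);
}

/* New sequence: sgtu; seqz (i.e. EQ 0), with no trailing sext.w.  */
static int64_t sleu_new (uint32_t x, uint32_t y)
{
  int64_t a0 = sext_w (x), a1 = sext_w (y);
  a0 = sgtu (a0, a1);
  return a0 == 0;
}

/* Hypothetical self-check over a few boundary values.  */
static void check_sleu (void)
{
  static const uint32_t v[] = { 0, 1, 0x7fffffffu, 0x80000000u, 0xffffffffu };
  for (size_t i = 0; i < sizeof v / sizeof v[0]; i++)
    for (size_t j = 0; j < sizeof v / sizeof v[0]; j++)
      {
        int64_t r = sleu_new (v[i], v[j]);
        assert (r == sleu_old (v[i], v[j]));  /* same value as before */
        assert (r == (v[i] <= v[j]));         /* matches C semantics */
        assert (r == sext_w (r));             /* already sign-extended */
      }
}
```

Calling `check_sleu ()` exercises both sequences. The last assertion is the property that lets combine drop the explicit sign-extension: a 0/1 result is unchanged by `sext.w`.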
Comments
LGTM, but with a nit: I don't get sext.w but get an andi like below, so
maybe we should also scan-assembler-not andi? Feel free to commit
directly with that fix.

```asm
sleu:
        sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
        xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
        andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
        ret                     # 25    [c=0 l=4]  simple_return
```
On Thu, 11 Aug 2022, Kito Cheng wrote:

> LGTM, but with a nit, I don't get sext.w but get an andi like below, so
> maybe we should also scan-assembler-not andi? feel free to commit that
> directly with that fix
>
> ```asm
> sleu:
>         sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
>         xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
>         andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
>         ret                     # 25    [c=0 l=4]  simple_return
> ```

 Interesting.  I can do that, but can you please share the compilation
options, given or defaulted (from `--with...' configuration options), this
happens with?

  Maciej
Hi Kito,

On Fri, 12 Aug 2022, Maciej W. Rozycki wrote:

> > LGTM, but with a nit, I don't get sext.w but get an andi like below, so
> > maybe we should also scan-assembler-not andi? feel free to commit that
> > directly with that fix
> >
> > ```asm
> > sleu:
> >         sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
> >         xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
> >         andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
> >         ret                     # 25    [c=0 l=4]  simple_return
> > ```
>
>  Interesting.  I can do that, but can you please share the compilation
> options, given or defaulted (from `--with...' configuration options), this
> happens with?

 I have noticed it went nowhere.  Can you please check what compilation
options lead to this discrepancy so that we can have the fix included in
GCC 13?  I'd like to understand what's going on here.

  Maciej
On 11/25/22 07:07, Maciej W. Rozycki wrote:
> Hi Kito,
>
> On Fri, 12 Aug 2022, Maciej W. Rozycki wrote:
>
>>> LGTM, but with a nit, I don't get sext.w but get an andi like below, so
>>> maybe we should also scan-assembler-not andi? feel free to commit that
>>> directly with that fix
>>>
>>> ```asm
>>> sleu:
>>>         sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
>>>         xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
>>>         andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
>>>         ret                     # 25    [c=0 l=4]  simple_return
>>> ```
>>  Interesting.  I can do that, but can you please share the compilation
>> options, given or defaulted (from `--with...' configuration options), this
>> happens with?
>  I have noticed it went nowhere.  Can you please check what compilation
> options lead to this discrepancy so that we can have the fix included in
> GCC 13?  I'd like to understand what's going on here.

FWIW, I don't see the redundant sign extension with this testcase at -O2
on the trunk.  Is it possible the patch has been made redundant over the
last few months?

Jeff
On Mon, 28 Nov 2022, Jeff Law wrote:

> > > > LGTM, but with a nit, I don't get sext.w but get an andi like below, so
> > > > maybe we should also scan-assembler-not andi? feel free to commit that
> > > > directly with that fix
> > > >
> > > > ```asm
> > > > sleu:
> > > >         sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
> > > >         xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
> > > >         andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
> > > >         ret                     # 25    [c=0 l=4]  simple_return
> > > > ```
> > >  Interesting.  I can do that, but can you please share the compilation
> > > options, given or defaulted (from `--with...' configuration options), this
> > > happens with?
> >  I have noticed it went nowhere.  Can you please check what compilation
> > options lead to this discrepancy so that we can have the fix included in
> > GCC 13?  I'd like to understand what's going on here.
>
> FWIW, I don't see the redundant sign extension with this testcase at -O2 on
> the trunk.  Is it possible the patch has been made redundant over the last few
> months?

 Maybe at -O2, but the test cases continue to fail in my configuration for
other optimisation levels:

FAIL: gcc.target/riscv/sge.c   -O1   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sge.c   -Og -g   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sgeu.c   -O1   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sgeu.c   -Og -g   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sle.c   -O1   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sle.c   -Og -g   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sleu.c   -O1   scan-assembler-not sext\\.w
FAIL: gcc.target/riscv/sleu.c   -Og -g   scan-assembler-not sext\\.w

when applied on top of:

$ riscv64-linux-gnu-gcc --version
riscv64-linux-gnu-gcc (GCC) 13.0.0 20221128 (experimental)

Not anymore with the whole patch applied.

 Does it make sense to bisect the change that removed the pessimisation at
-O2 to understand what is going on here?

 I think my change is worthwhile anyway: why rely on the optimiser to get
things sorted when we can produce the best code in the backend right away
in the first place?

  Maciej
On 11/28/22 08:38, Maciej W. Rozycki wrote:
> On Mon, 28 Nov 2022, Jeff Law wrote:
>
>>>>> LGTM, but with a nit, I don't get sext.w but get an andi like below, so
>>>>> maybe we should also scan-assembler-not andi? feel free to commit that
>>>>> directly with that fix
>>>>>
>>>>> ```asm
>>>>> sleu:
>>>>>         sgtu    a0,a0,a1        # 9     [c=4 l=4]  *sgtu_disi
>>>>>         xori    a0,a0,1         # 10    [c=4 l=4]  *xorsi3_internal/1
>>>>>         andi    a0,a0,1         # 16    [c=4 l=4]  anddi3/1
>>>>>         ret                     # 25    [c=0 l=4]  simple_return
>>>>> ```
>>>>  Interesting.  I can do that, but can you please share the compilation
>>>> options, given or defaulted (from `--with...' configuration options), this
>>>> happens with?
>>>  I have noticed it went nowhere.  Can you please check what compilation
>>> options lead to this discrepancy so that we can have the fix included in
>>> GCC 13?  I'd like to understand what's going on here.
>> FWIW, I don't see the redundant sign extension with this testcase at -O2 on
>> the trunk.  Is it possible the patch has been made redundant over the last few
>> months?
>  Maybe at -O2, but the test cases continue to fail in my configuration for
> other optimisation levels:
>
> FAIL: gcc.target/riscv/sge.c   -O1   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sge.c   -Og -g   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sgeu.c   -O1   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sgeu.c   -Og -g   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sle.c   -O1   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sle.c   -Og -g   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sleu.c   -O1   scan-assembler-not sext\\.w
> FAIL: gcc.target/riscv/sleu.c   -Og -g   scan-assembler-not sext\\.w

I may have been running an rv32 toolchain...  So I'll start over and
ensure that I'm running rv64 :-)

With the trunk, I get code like Kito's (AND with a 0x1 mask).

The key difference is Roger's patch:

commit c23a9c87cc62bd177fd0d4db6ad34b34e1b9a31f
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Wed Aug 3 08:55:35 2022 +0100

    Some additional zero-extension related optimizations in simplify-rtx.

    This patch implements some additional zero-extension and sign-extension
    related optimizations in simplify-rtx.cc.  The original motivation comes
    from PR rtl-optimization/71775, where in comment #2 Andrew Pinksi sees:

    Failed to match this instruction:
    (set (reg:DI 88 [ _1 ])
        (sign_extend:DI (subreg:SI (ctz:DI (reg/v:DI 86 [ x ])) 0)))

[ ... ]

With that patch the sign extension is removed and instead we generate
the AND with 0x1.

Old, from combine dump:

  Successfully matched this instruction:
  (set (reg/i:DI 10 a0)
!     (sign_extend:DI (reg:SI 78)))

New, from combine dump:

  (set (reg/i:DI 10 a0)
!     (and:DI (subreg:DI (reg:SI 78) 0)
!         (const_int 1 [0x1])))

Note the date on Roger's patch, roughly the same time as yours.  I
suspect Kito had tested the trunk with Roger's patch.

Your patch is probably still useful.  I think Kito's only concern was to
make sure we don't have the ANDI instruction in addition to not having
the SEXT instruction.  So still approved for trunk, just update the
testcases to make sure we don't have the ANDI too.

jeff
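The substitution seen in the combine dump can be illustrated with a small C sketch. It assumes, as the dump suggests, that the SImode value is known to be 0 or 1 while the upper 32 bits of the 64-bit register are unspecified. The function names (`via_sext_w`, `via_andi_1`, `check_normalisation`) are made up for this example, and the narrowing conversion is assumed to wrap as it does with GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Old form: (sign_extend:DI (reg:SI 78)), i.e. `sext.w'.  */
static int64_t via_sext_w (uint64_t reg)
{
  return (int64_t) (int32_t) (uint32_t) reg;
}

/* New form: (and:DI (subreg:DI (reg:SI 78) 0) (const_int 1)), i.e.
   `andi rd,rs,1'.  */
static int64_t via_andi_1 (uint64_t reg)
{
  return (int64_t) (reg & 1);
}

/* Both forms agree whenever the low 32 bits hold 0 or 1, whatever the
   upper 32 bits contain.  */
static void check_normalisation (void)
{
  static const uint64_t upper[] = {
    0, 0xdeadbeef00000000ull, 0xffffffff00000000ull
  };
  for (size_t i = 0; i < sizeof upper / sizeof upper[0]; i++)
    for (uint64_t bit = 0; bit <= 1; bit++)
      assert (via_sext_w (upper[i] | bit) == via_andi_1 (upper[i] | bit));
}
```

Either instruction is still one more than the patched backend needs, which is why the review asks the tests to reject `andi' as well as `sext.w'.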
Index: gcc/gcc/config/riscv/riscv.cc
===================================================================
--- gcc.orig/gcc/config/riscv/riscv.cc
+++ gcc/gcc/config/riscv/riscv.cc
@@ -2500,9 +2500,9 @@ riscv_emit_int_order_test (enum rtx_code
             }
           else if (invert_ptr == 0)
             {
-              rtx inv_target = riscv_force_binary (GET_MODE (target),
+              rtx inv_target = riscv_force_binary (word_mode,
                                                    inv_code, cmp0, cmp1);
-              riscv_emit_binary (XOR, target, inv_target, const1_rtx);
+              riscv_emit_binary (EQ, target, inv_target, const0_rtx);
             }
           else
             {
Index: gcc/gcc/testsuite/gcc.target/riscv/sge.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.target/riscv/sge.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sge (int x, int y)
+{
+  return x >= y;
+}
+
+/* { dg-final { scan-assembler-not "sext\\.w" } } */
Index: gcc/gcc/testsuite/gcc.target/riscv/sgeu.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.target/riscv/sgeu.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sgeu (unsigned int x, unsigned int y)
+{
+  return x >= y;
+}
+
+/* { dg-final { scan-assembler-not "sext\\.w" } } */
Index: gcc/gcc/testsuite/gcc.target/riscv/sle.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.target/riscv/sle.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sle (int x, int y)
+{
+  return x <= y;
+}
+
+/* { dg-final { scan-assembler-not "sext\\.w" } } */
Index: gcc/gcc/testsuite/gcc.target/riscv/sleu.c
===================================================================
--- /dev/null
+++ gcc/gcc/testsuite/gcc.target/riscv/sleu.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+
+int
+sleu (unsigned int x, unsigned int y)
+{
+  return x <= y;
+}
+
+/* { dg-final { scan-assembler-not "sext\\.w" } } */
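The review asks that the tests also reject ANDI. One way the updated `sleu.c' might look is sketched below. The extra `scan-assembler-not "andi"' directive is an assumption based on the reviewers' request and is not taken from the posted diff, so the exact pattern used in any committed version may differ.

```c
/* { dg-do compile } */
/* { dg-require-effective-target rv64 } */
/* { dg-skip-if "" { *-*-* } { "-O0" } } */

int
sleu (unsigned int x, unsigned int y)
{
  return x <= y;
}

/* As in the posted patch: no explicit sign-extension.  */
/* { dg-final { scan-assembler-not "sext\\.w" } } */
/* Assumed addition, per the review: no leftover AND with 1 either.  */
/* { dg-final { scan-assembler-not "andi" } } */
```

The same extra directive would presumably apply to `sge.c', `sgeu.c' and `sle.c', since all four tests exercise the same code path in `riscv_emit_int_order_test'.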