Message ID | 20220905214437.1275139-1-philipp.tomsich@vrull.eu |
---|---|
State | New, archived |
Headers |
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5044:0:0:0:0:0 with SMTP id h4csp358502wrt; Mon, 5 Sep 2022 14:45:13 -0700 (PDT) X-Google-Smtp-Source: AA6agR68uw1q9/J5DuI47eIGQV/39BSlYC0bax0qwnIwtJoYtQZPvvOdHbG2bHlXru3iVx7M4xy7 X-Received: by 2002:a05:6402:538a:b0:43a:298e:bc2b with SMTP id ew10-20020a056402538a00b0043a298ebc2bmr44399962edb.125.1662414313254; Mon, 05 Sep 2022 14:45:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1662414313; cv=none; d=google.com; s=arc-20160816; b=0jTihBNl2FRCHLLqcxEnHCoDc/XQLeza14ZniD2Mo7YwD9M5x0ON+05fHkROT1U+Cj KxFJ/1RPpHfrzL5Gp+XwGDOwNIRqVG1uOOAzN0aFMfowYLxl8u79pZfq8B98JfGncg4x akNXi5FfPv+m3BhmvAU9rnhfwg9ounuDJ4TtCAuRJJ2nTIwoOuwM1B7J3nsZacGYyvJF nH1A3ppNh29QetDDoQ/U2RdOOD3fcLzKAH4sI7+JvNOFDZ7gzHuDtQvKpMf03FMhmH/R HhMSgG0tp7Oc/DlyvydAK4BaYfzUqYwjO/DvLn5ahNgJM+SNdisSq8A5h2/oYxKwd7Yj E8ow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:message-id:date:subject:to:from:dkim-signature :dmarc-filter:delivered-to; bh=t5l2bMV3GS0EaeyF3s32OLO/VihmJy6fwRf4sJjrrsY=; b=cbkKEthebd2unOpiv4dYLhE7/mmhxZ7YCf0qeELkOsoHlj2F5DwTq/NShpk9I8p7yl et9Nn47L1zsbNifFovExBFUiMJsAnow3pLH4qxiKT9JOmZDTZ1Hx7zinpI4hIMyIV/Ib th/1KPiWs/2yPj9BATprzhmlTafxTLr//wpsZTSrAE2/ZYjTKfF8vEiHP2gVVpzKxYXH M60iP+iff0HmUm+Xiegn3I5DCpuegopIF0Tx9FiWpSR6r2zl7SwkRgB80MDi+Uqnd5ma XxYhZmOouQ9hyQTm9WM5p0FvJOUBo6QQb+42zJUBzTr0UBmsrVJ5rA1Cmv/v6vAvyAfR SwTQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@vrull.eu header.s=google header.b=p+CUi2Vj; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id f11-20020a056402354b00b00448763c66d3si8244102edd.46.2022.09.05.14.45.13 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Sep 2022 14:45:13 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=fail header.i=@vrull.eu header.s=google header.b=p+CUi2Vj; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4E1863839DCD for <ouuuleilei@gmail.com>; Mon, 5 Sep 2022 21:45:08 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by sourceware.org (Postfix) with ESMTPS id EB3053858D28 for <gcc-patches@gcc.gnu.org>; Mon, 5 Sep 2022 21:44:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org EB3053858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wr1-x42e.google.com with SMTP id b16so12822223wru.7 for <gcc-patches@gcc.gnu.org>; Mon, 05 Sep 2022 14:44:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date; bh=t5l2bMV3GS0EaeyF3s32OLO/VihmJy6fwRf4sJjrrsY=; b=p+CUi2VjpdQWGatj5kHn8df1xnJ59PVlPKgk4HrZl5NHeWcUnHYZWbP5uZTbfCmpnk xwC7x7F4q3TdeqUQwJjS9jL9Jdr3JiOYHA8PGkOAXQmxqc4H1QieWTRYMDeK5gP9Chwk 09YDflE2fgjDmvnLwzGRo1ux/MDwNw0Hyvrpw+86j8boH4dV63IrPKUTy5yi9J8EjYQt VIfJJ2BTZf6h2+2Yh6cS+Bz3qVv1fBxlAALXVpD7l7SB9kGF4WjxWiwD19qcR1t+0uTV rrvu45pooKhNlkAhRGsAR9z0/e1w7ewfR+AOEeZCOkn5IDZHXdPz6HsLk+iSWQLIQxm8 114g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date; bh=t5l2bMV3GS0EaeyF3s32OLO/VihmJy6fwRf4sJjrrsY=; b=naCrTjcV5fInBJxrMAIHFrFuS+ICoZ1yymng21YnW5vTkEv6Od8iGekM8RaQG+1iDQ /Wl2XRBP7sG2G6WmNx2Fe/leh6enPnmVsbvUOwCHq5ZU9M/cODKJt02y8vg5GFHSjFEA 7VrPVNYnGr2y6kF5HDKXndyZB4yYvWlrNVicr2aJFAt1r6ICxMButTPmwc1W2zPWMzjm kKNC6aCavsChg9zSMnl+RloaPAfCyI7ZvRvrzmJn6W2fZ5rNxbfCatrWJkJ3OaPx5EOu hnYzDrhHwbTICkOZw5t+g14ECr1ELr7u0rbf6Fhutub3ZJN63Wr32eNII/Mgnvy8qC4a 4Kqg== X-Gm-Message-State: ACgBeo1rIj5T/tn+3qI5xcuChKuVrKv0RJdcAZyULbIfNrcspMTF45gh /OI1vH/CktABPBT/yKnfBAJ/67RORuertCYa X-Received: by 2002:adf:d1c7:0:b0:226:eb3b:29b0 with SMTP id b7-20020adfd1c7000000b00226eb3b29b0mr15386373wrd.365.1662414283405; Mon, 05 Sep 2022 14:44:43 -0700 (PDT) Received: from ubuntu-focal.. ([2a01:4f9:3a:1e26::2]) by smtp.gmail.com with ESMTPSA id bt24-20020a056000081800b0022377df817fsm10394967wrb.58.2022.09.05.14.44.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Sep 2022 14:44:43 -0700 (PDT) From: Philipp Tomsich <philipp.tomsich@vrull.eu> To: gcc-patches@gcc.gnu.org Subject: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED Date: Mon, 5 Sep 2022 23:44:37 +0200 Message-Id: <20220905214437.1275139-1-philipp.tomsich@vrull.eu> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Cc: Vineet Gupta <vineetg@rivosinc.com>, Kito Cheng <kito.cheng@gmail.com>, Philipp Tomsich <philipp.tomsich@vrull.eu>, Jojo R <rjiejie@linux.alibaba.com> Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1743167750858088450?= X-GMAIL-MSGID: =?utf-8?q?1743167750858088450?= |
Series |
riscv: implement TARGET_MODE_REP_EXTENDED
|
|
Commit Message
Philipp Tomsich
Sept. 5, 2022, 9:44 p.m. UTC
TARGET_MODE_REP_EXTENDED is supposed to match LOAD_EXTEND_OP, so this
adds an implementation using the same logic as in LOAD_EXTEND_OP.
This reduces the number of extension operations, as evidenced in the
reduction of dynamic instructions for the xz benchmark in SPEC CPU:
# dynamic instructions
baseline new improvement
xz, workload 1 384681308026 374464538911 2.66%
xz, workload 2 985995327109 974304030498 1.19%
xz, workload 3 545372994523 533717744260 2.14%
The shift-shift-2.c testcase needs to be adjusted, as it will no
longer use slliw/slriw for sub5, but will instead emit slli/slri.
No new regressions runnung the riscv.exp suite.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_mode_rep_extended):
(TARGET_MODE_REP_EXTENDED): Implement.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/shift-shift-2.c: Adjust.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---
gcc/config/riscv/riscv.cc | 15 +++++++++++++++
gcc/testsuite/gcc.target/riscv/shift-shift-2.c | 2 --
2 files changed, 15 insertions(+), 2 deletions(-)
Comments
On Mon, 5 Sep 2022, Philipp Tomsich wrote: > +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep) > +{ > + /* On 64-bit targets, SImode register values are sign-extended to DImode. */ > + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) > + return SIGN_EXTEND; I think this leads to a counter-intuitive requirement that a hand-written inline asm must sign-extend its output operands that are bound to either signed or unsigned 32-bit lvalues. Will compiler users be aware of that? Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing hook says that nothing needs to be done to truncate, but the new hook says that the result of the truncation is properly sign-extended. The documentation for TARGET_MODE_REP_EXTENDED warns about that: In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION should return false when truncating to mode. Alexander
On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote: > On Mon, 5 Sep 2022, Philipp Tomsich wrote: > >> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep) >> +{ >> + /* On 64-bit targets, SImode register values are sign-extended to DImode. */ >> + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) >> + return SIGN_EXTEND; > I think this leads to a counter-intuitive requirement that a hand-written > inline asm must sign-extend its output operands that are bound to either > signed or unsigned 32-bit lvalues. Will compiler users be aware of that? Is this significantly different than on MIPS? Hand-written code there also has to ensure that the results are properly sign extended and it's been that way for 20+ years since the introduction of mips64 IIRC. Though I don't think we had MODE_REP_EXTENDED that long. Haha, MIPS is the only target that currently defines TARGET_MODE_REP_EXTENDED :-) > > Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause > miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing > hook says that nothing needs to be done to truncate, but the new hook says > that the result of the truncation is properly sign-extended. > > The documentation for TARGET_MODE_REP_EXTENDED warns about that: > > In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION > should return false when truncating to mode. This may well need adjusting in Philipp's patch. I'd be surprised if the MIPS definition wasn't usable nearly verbatim here. jeff
On Fri, 16 Sep 2022 16:48:24 PDT (-0700), gcc-patches@gcc.gnu.org wrote: > > On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote: >> On Mon, 5 Sep 2022, Philipp Tomsich wrote: >> >>> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep) >>> +{ >>> + /* On 64-bit targets, SImode register values are sign-extended to DImode. */ >>> + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) >>> + return SIGN_EXTEND; >> I think this leads to a counter-intuitive requirement that a hand-written >> inline asm must sign-extend its output operands that are bound to either >> signed or unsigned 32-bit lvalues. Will compiler users be aware of that? > > Is this significantly different than on MIPS? Hand-written code there > also has to ensure that the results are properly sign extended and it's > been that way for 20+ years since the introduction of mips64 IIRC. > Though I don't think we had MODE_REP_EXTENDED that long. IMO the problem isn't so much that asm has this constraint, it's that it's a new constraint and thus risks breaking code that used to work. That said... > Haha, MIPS is the only target that currently defines > TARGET_MODE_REP_EXTENDED :-) > > > >> >> Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause >> miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing >> hook says that nothing needs to be done to truncate, but the new hook says >> that the result of the truncation is properly sign-extended. >> >> The documentation for TARGET_MODE_REP_EXTENDED warns about that: >> >> In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION >> should return false when truncating to mode. > > This may well need adjusting in Philipp's patch. I'd be surprised if > the MIPS definition wasn't usable nearly verbatim here. Yes, and we have a few MIPS-isms in the ISA but don't have the same flavor of TRULY_NOOP_TRUNCATION. It's been pointed out a handful of times and I'm not sure what the right way to go is here, every time I try and reason about which is going to produce better code I come up with a different answer. IIRC last time I looked at this I came to the conclusion that we're doing the right thing for RISC-V because most of our instructions implicitly truncate. It's pretty easy to generate bad code here and I'm pretty sure we could fix some of that by moving to a more MIPS-like TRULY_MODE_TRUNCATION, but I think we'd end up just pushing the problems around. Every time I look at this I also get worried that we've leaked some of these internal promotion rules into something visible to inline asm, but when I poke around it seems like things generally work. > > > jeff
Alexander, I had missed your comment until now. On Tue, 6 Sept 2022 at 13:39, Alexander Monakov <amonakov@ispras.ru> wrote: > > On Mon, 5 Sep 2022, Philipp Tomsich wrote: > > > +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep) > > +{ > > + /* On 64-bit targets, SImode register values are sign-extended to DImode. */ > > + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) > > + return SIGN_EXTEND; > > I think this leads to a counter-intuitive requirement that a hand-written > inline asm must sign-extend its output operands that are bound to either > signed or unsigned 32-bit lvalues. Will compiler users be aware of that? I am not sure if I fully understand your concern, as the mode of the asm-output will be derived from the variable type. So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type of a. The concern, as far as I understand would be the case where the assembly-sequence leaves an incompatible extension in the register. So l understand the problem being: int a; asm volatile ("zext.w %0, %1" : "=r"(a) : "r"(b)); // possible pitfall -- the programmer wants a SImode representation (so it needs to be extended) but not long long a; asm volatile ("zext.w %0, %1" : "=r"(a): "r"(b)); // possible pitfall -- the programmer wants the DImode representation (don't extend) If we look at the output of expand for the first case (I made sure to consume "a" again using a "asm volatile ("nop" : : "r"(a))"), we see that an explicit extension is created from SImode to DImode (the "(subreg:SI (reg:DI ...) 0)" in the asm-operand for the next block is a bigger concern — this may be an artifact of TARGET_TRULY_NOOP_TRUNCATION not being properly defined on RISC-V; but this issue didn't originate from this patch): ;; __asm__ __volatile__("zext.w %0,%1" : "=r" a_2 : "r" b_1(D)); (insn 7 5 6 (set (reg:SI 75 [ aD.1490 ]) (asm_operands/v:SI ("zext.w %0,%1") ("=r") 0 [ (reg/v:DI 74 [ bD.1487 ]) ] [ (asm_input:DI ("r") ./rep-asm.c:5) ] [] ./rep-asm.c:5)) "./rep-asm.c":5:3 -1 (nil)) (insn 6 7 0 (set (reg/v:DI 72 [ aD.1490 ]) (sign_extend:DI (reg:SI 75 [ aD.1490 ]))) "./rep-asm.c":5:3 -1 (nil)) ;; __asm__ __volatile__("nop" : : "r" a_2); (insn 8 6 0 (asm_operands/v ("nop") ("") 0 [ (subreg/s/u:SI (reg/v:DI 72 [ aD.1490 ]) 0) ] [ (asm_input:SI ("r") ./rep-asm.c:6) ] [] ./rep-asm.c:6) "./rep-asm.c":6:3 -1 (nil)) which translates to: f: #APP # 5 "./rep-asm.c" 1 zext.w a0,a0 # 0 "" 2 #NO_APP sext.w a0,a0 #APP # 6 "./rep-asm.c" 1 nop # 0 "" 2 #NO_APP If this patch is not applied (and TARGET_MODE_REP_EXTENDED is not defined), we see the sext.w missing. So in fact, we may have unintentionally fixed an issue? > Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause > miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing > hook says that nothing needs to be done to truncate, but the new hook says > that the result of the truncation is properly sign-extended. > > The documentation for TARGET_MODE_REP_EXTENDED warns about that: > > In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION > should return false when truncating to mode. Good point. This should indeed be added and we'll look into this in more detail for v2. Looking into this in more detail, it seems that this change doesn't make things worse (we already had that problem before, as an explicit extension might also lead to the same reasoning). So somehow we've been avoiding this bullet so far, although I don't know yet how... Philipp.
hi, This patch is try to fix the issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, would you like to give me some suggestion, thanks. ~/source/gccUpstreamDir/gcc/testsuite(cfg) » git format-patch -1 --start-number=00 HEAD -o ~/patch /home/zhongyunde/patch/0000-PHIOPT-Add-A-B-CST-B-match-and-simplify-optimization.patch
On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> wrote: > > hi, > This patch is try to fix the issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, > would you like to give me some suggestion, thanks. This seems like a "simplified" version of https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html which just handles power of 2 constants where we know the cond will be removed. We could do even more "simplified" of 1 if needed really. What is the IR before PHI-OPT? Is it just + 1? Also your pattern can be simplified to use integer_pow2p in the match part instead of INTEGER_CST. Thanks, Andrew > > ~/source/gccUpstreamDir/gcc/testsuite(cfg) » git format-patch -1 --start-number=00 HEAD -o ~/patch > /home/zhongyunde/patch/0000-PHIOPT-Add-A-B-CST-B-match-and-simplify-optimization.patch
> -----Original Message----- > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org] > Sent: Saturday, November 5, 2022 2:34 PM > To: Zhongyunde <zhongyunde@huawei.com> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan) > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler) > <weiwei64@huawei.com>; zhong_1985624@163.com > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify > optimizations > > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> > wrote: > > > > hi, > > This patch is try to fix the issue > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, > > would you like to give me some suggestion, thanks. > > This seems like a "simplified" version of > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html > which just handles power of 2 constants where we know the cond will be > removed. > We could do even more "simplified" of 1 if needed really. > What is the IR before PHI-OPT? Is it just + 1? Thanks for your attention. It is + 4294967296 before PHI-OPT (See detail https://gcc.godbolt.org/z/6zEc6ja1z) So we should keep matching the power of 2 constants ? > Also your pattern can be simplified to use integer_pow2p in the match part > instead of INTEGER_CST. > Apply your comment, thanks > Thanks, > Andrew
On Sat, 5 Nov 2022, Philipp Tomsich wrote: > Alexander, > > I had missed your comment until now. Please make sure to read replies from Jeff and Palmer as well (their responses went to gcc-patches with empty Cc list): https://inbox.sourceware.org/gcc-patches/ba895f78-7f47-0f4-5bfe-e21893c4bcb@ispras.ru/T/#m7b7e5708b82de3b05ba8007ae6544891a03bdc42 For now let me respond to some of the more prominent points: > > I think this leads to a counter-intuitive requirement that a hand-written > > inline asm must sign-extend its output operands that are bound to either > > signed or unsigned 32-bit lvalues. Will compiler users be aware of that? > > I am not sure if I fully understand your concern, as the mode of the > asm-output will be derived from the variable type. > So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type > of a. Yes. The problem arises when 'a' later undergoes conversion to a wider type. > The concern, as far as I understand would be the case where the > assembly-sequence leaves an incompatible extension in the register. Existing documentation like the psABI does not constrain compiler users in any way, so their inline asm snippets are free to leave garbage in upper bits. Thus there's no "incompatibility" to speak of. To give a specific example that will be problematic if you go far enough down the road of matching MIPS64 behavior: long f(void) { int x; asm("" : "=r"(x)); return x; } here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output must have been sign-extended to 64 bits by the programmer. Alexander
On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > > > -----Original Message----- > > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org] > > Sent: Saturday, November 5, 2022 2:34 PM > > To: Zhongyunde <zhongyunde@huawei.com> > > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan) > > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler) > > <weiwei64@huawei.com>; zhong_1985624@163.com > > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify > > optimizations > > > > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> > > wrote: > > > > > > hi, > > > This patch is try to fix the issue > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, > > > would you like to give me some suggestion, thanks. > > > > This seems like a "simplified" version of > > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html > > which just handles power of 2 constants where we know the cond will be > > removed. > > We could do even more "simplified" of 1 if needed really. > > What is the IR before PHI-OPT? Is it just + 1? > > Thanks for your attention. It is + 4294967296 before PHI-OPT (See detail https://gcc.godbolt.org/z/6zEc6ja1z) > So we should keep matching the power of 2 constants ? > > > Also your pattern can be simplified to use integer_pow2p in the match part > > instead of INTEGER_CST. > > > Apply your comment, thanks How does the patch fix the mentioned bug? match.pd patterns should make things "simpler" but x + (a << C') isn't simpler than a ? x + C : x. It looks you are targeting PHI-OPT here, so maybe instead extend value_replacement to handle this case, it does look similar to the case with neutral/absorbing element there? Richard. > > > Thanks, > > Andrew > >
At 2022-11-08 22:58:34, "Richard Biener" <richard.guenther@gmail.com> wrote: >On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches ><gcc-patches@gcc.gnu.org> wrote: >> >> >> > -----Original Message----- >> > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org] >> > Sent: Saturday, November 5, 2022 2:34 PM >> > To: Zhongyunde <zhongyunde@huawei.com> >> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan) >> > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler) >> > <weiwei64@huawei.com>; zhong_1985624@163.com >> > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify >> > optimizations >> > >> > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> >> > wrote: >> > > >> > > hi, >> > > This patch is try to fix the issue >> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, >> > > would you like to give me some suggestion, thanks. >> > >> > This seems like a "simplified" version of >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html >> > which just handles power of 2 constants where we know the cond will be >> > removed. >> > We could do even more "simplified" of 1 if needed really. >> > What is the IR before PHI-OPT? Is it just + 1? >> >> Thanks for your attention. It is + 4294967296 before PHI-OPT (See detail https://gcc.godbolt.org/z/6zEc6ja1z) >> So we should keep matching the power of 2 constants ? >> >> > Also your pattern can be simplified to use integer_pow2p in the match part >> > instead of INTEGER_CST. >> > >> Apply your comment, thanks > >How does the patch fix the mentioned bug? match.pd patterns should make things >"simpler" but x + (a << C') isn't simpler than a ? x + C : x. It >looks you are targeting >PHI-OPT here, so maybe instead extend value_replacement to handle this case, >it does look similar to the case with neutral/absorbing element there? > >Richard. Thanks. This patch try to fix the 1st issued mentioned in107090 – [aarch64] sequence logic should be combined with mul and umulh (gnu.org) Sure, I'll take a look at the function value_replacement. I have also noticed that the function of two_value_replacement is very close to patch I want to achieve, and it may be easy to extend. It seems can be expressed equally in match.pd (called by match_simplify_replacement), so how do we choose where to implement may be better? ``` | /* Do the replacement of conditional if it can be done. */if (!early_p && !diamond_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1)) cfgchanged = true; elseif (!diamond_p && match_simplify_replacement (bb, bb1, e1, e2, phi, arg0, arg1, early_p)) cfgchanged = true; | ``` >> > Thanks, >> > Andrew >> >>
On Mon, 7 Nov 2022 at 14:55, Alexander Monakov <amonakov@ispras.ru> wrote: > > > > On Sat, 5 Nov 2022, Philipp Tomsich wrote: > > > Alexander, > > > > I had missed your comment until now. > > Please make sure to read replies from Jeff and Palmer as well (their responses > went to gcc-patches with empty Cc list): > https://inbox.sourceware.org/gcc-patches/ba895f78-7f47-0f4-5bfe-e21893c4bcb@ispras.ru/T/#m7b7e5708b82de3b05ba8007ae6544891a03bdc42 > > For now let me respond to some of the more prominent points: > > > > I think this leads to a counter-intuitive requirement that a hand-written > > > inline asm must sign-extend its output operands that are bound to either > > > signed or unsigned 32-bit lvalues. Will compiler users be aware of that? > > > > I am not sure if I fully understand your concern, as the mode of the > > asm-output will be derived from the variable type. > > So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type > > of a. > > Yes. The problem arises when 'a' later undergoes conversion to a wider type. > > > The concern, as far as I understand would be the case where the > > assembly-sequence leaves an incompatible extension in the register. > > Existing documentation like the psABI does not constrain compiler users in any > way, so their inline asm snippets are free to leave garbage in upper bits. Thus > there's no "incompatibility" to speak of. > > To give a specific example that will be problematic if you go far enough down > the road of matching MIPS64 behavior: > > long f(void) > { > int x; > asm("" : "=r"(x)); > return x; > } > > here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output > must have been sign-extended to 64 bits by the programmer. In fact, with the proposed patch (but also without it), GCC will sign-extend: f: sext.w a0,a0 ret .size f, .-f To make sure that this is not just an extension to promote the int to long for the function return, I next added another empty asm to consume 'x'. This clearly shows that the extension is performed to postprocess the output of the asm-statement: f: # ./asm2.c:4: asm("" : "=r"(x)); sext.w a0,a0 # x, x # ./asm2.c:5: asm("" : : "r"(x)); # ./asm2.c:7: } ret Thanks, Philipp.
On Tue, Nov 8, 2022 at 4:51 PM 钟云德 <zhong_1985624@163.com> wrote: > At 2022-11-08 22:58:34, "Richard Biener" <richard.guenther@gmail.com> > wrote: > > >On Sat, Nov 5, 2022 at 10:03 AM Zhongyunde via Gcc-patches > ><gcc-patches@gcc.gnu.org> wrote: > >> > >> > >> > -----Original Message----- > >> > From: Andrew Pinski [mailto:pinskia@gcc.gnu.org] > >> > Sent: Saturday, November 5, 2022 2:34 PM > >> > To: Zhongyunde <zhongyunde@huawei.com> > >> > Cc: hongtao.liu@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan) > >> > <zwzhangwen.zhang@huawei.com>; Weiwei (weiwei, Compiler) > >> > <weiwei64@huawei.com>; zhong_1985624@163.com > >> > Subject: Re: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify > >> > optimizations > >> > > >> > On Fri, Nov 4, 2022 at 11:17 PM Zhongyunde <zhongyunde@huawei.com> > >> > wrote: > >> > > > >> > > hi, > >> > > This patch is try to fix the issue > >> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, > >> > > would you like to give me some suggestion, thanks. > >> > > >> > This seems like a "simplified" version of > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584411.html > >> > which just handles power of 2 constants where we know the cond will be > >> > removed. > >> > We could do even more "simplified" of 1 if needed really. > >> > What is the IR before PHI-OPT? Is it just + 1? > >> > >> Thanks for your attention. It is + 4294967296 before PHI-OPT (See detail https://gcc.godbolt.org/z/6zEc6ja1z) > >> So we should keep matching the power of 2 constants ? > >> > >> > Also your pattern can be simplified to use integer_pow2p in the match part > >> > instead of INTEGER_CST. > >> > > >> Apply your comment, thanks > > > >How does the patch fix the mentioned bug? match.pd patterns should make things > >"simpler" but x + (a << C') isn't simpler than a ? x + C : x. It > >looks you are targeting > >PHI-OPT here, so maybe instead extend value_replacement to handle this case, > >it does look similar to the case with neutral/absorbing element there? > > > >Richard. > > Thanks. This patch try to fix the 1st issued mentioned in 107090 – [aarch64] sequence logic should be combined with mul and umulh (gnu.org) <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090> > Sure, I'll take a look at the function value_replacement. > I have also noticed that the function of two_value_replacement is very close to patch I want to achieve, and it may be easy to extend. > It seems can be expressed equally in match.pd (called by match_simplify_replacement), so how do we > choose where to implement may be better? > > I think that if we realize that we don't want a simplification to always apply (like by checking for canonicalize_math_p ()), then we should look for a pass which is a good fit and since you add the pattern to trigger a PHI-OPT optimization that looked like an obvious choice ... ``` > > /* Do the replacement of conditional if it can be done. */ if (!early_p && !diamond_p && two_value_replacement (bb, bb1, e2, phi, arg0, arg1)) cfgchanged = true; else if (!diamond_p && match_simplify_replacement (bb, bb1, e1, e2, phi, arg0, arg1, early_p)) cfgchanged = true; > > ``` > >> > Thanks, >> > Andrew >> >> > >
On Wed, 9 Nov 2022, Philipp Tomsich wrote: > > To give a specific example that will be problematic if you go far enough down > > the road of matching MIPS64 behavior: > > > > long f(void) > > { > > int x; > > asm("" : "=r"(x)); > > return x; > > } > > > > here GCC (unlike LLVM) omits sign extension of 'x', assuming that asm output > > must have been sign-extended to 64 bits by the programmer. > > In fact, with the proposed patch (but also without it), GCC will sign-extend: > f: > sext.w a0,a0 > ret > .size f, .-f I'm aware. I said "will be problematic if ...", meaning that GCC omits sign extension when targeting MIPS64, and if you match MIPS64 behavior on RISC-V, you'll get in that situation as well. > To make sure that this is not just an extension to promote the int to > long for the function return, I next added another empty asm to > consume 'x'. > This clearly shows that the extension is performed to postprocess the > output of the asm-statement: > > f: > # ./asm2.c:4: asm("" : "=r"(x)); > sext.w a0,a0 # x, x > # ./asm2.c:5: asm("" : : "r"(x)); > # ./asm2.c:7: } > ret No, you cannot distinguish post-processing the output of the first asm vs. pre-processing the input of the second asm. Try asm("" : "+r"(x)); as the second asm instead, and you'll get f: # t.c:17: asm("" : "=r"(x)); # t.c:18: asm("" : "+r"(x)); # t.c:20: } sext.w a0,a0 #, x ret If it helps, here's a Compiler Explorer link comparing with MIPS64: https://godbolt.org/z/7eobvdKdK Alexander
On 11/4/22 17:00, Philipp Tomsich wrote: > Alexander, > > I had missed your comment until now. > > On Tue, 6 Sept 2022 at 13:39, Alexander Monakov <amonakov@ispras.ru> wrote: >> On Mon, 5 Sep 2022, Philipp Tomsich wrote: >> >>> +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode > mode_rep) >>> +{ >>> + /* On 64-bit targets, SImode register values are sign-extended to > DImode. */ >>> + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) >>> + return SIGN_EXTEND; >> I think this leads to a counter-intuitive requirement that a hand-written >> inline asm must sign-extend its output operands that are bound to either >> signed or unsigned 32-bit lvalues. Will compiler users be aware of that? > I am not sure if I fully understand your concern, as the mode of the > asm-output will be derived from the variable type. > So "asm (... : "=r" (a))" will take DI/SI/HI/QImode depending on the type > of a. Correct. > > The concern, as far as I understand would be the case where the > assembly-sequence leaves an incompatible extension in the register. Right. The question in my mind is whether or not the responsibility should be on the compiler or on the developer to ensure the ASM output is properly extended. If someone's writing ASMs, then to a large degree, I consider it their responsibility to make sure things are properly extended -- even more so if having the compiler do it results in slower code independent of ASMs. Jeff
On Sun, 20 Nov 2022, Jeff Law wrote: > > The concern, as far as I understand would be the case where the > > assembly-sequence leaves an incompatible extension in the register. > > Right. The question in my mind is whether or not the responsibility should be > on the compiler or on the developer to ensure the ASM output is properly > extended. If someone's writing ASMs, then to a large degree, I consider it > their responsibility to make sure things are properly extended Right. They should also find out the hard way, with zero documentation telling them they had to (let alone *why* they had to), and without a hypothetical -fsanitize=abi that would catch exactly the point of the missing extension instead of letting the program crash mysteriously at a much later point. Uphill both ways, in a world where LLVM does not place such burden on the programmer, and even among GCC targets only mips64 made a precedent (also without documenting the new requirement). > -- even more so > if having the compiler do it results in slower code independent of ASMs. I think LLVM demonstrates well enough that a compiler can do a better job than GCC at eliminating redundant extensions without upgrading requirements for inline asm (in the usual C code, not for sub-word outputs of an asm). Alexander
On 11/21/22 06:49, Alexander Monakov wrote: > On Sun, 20 Nov 2022, Jeff Law wrote: > >>> The concern, as far as I understand would be the case where the >>> assembly-sequence leaves an incompatible extension in the register. >> Right. The question in my mind is whether or not the responsibility should be >> on the compiler or on the developer to ensure the ASM output is properly >> extended. If someone's writing ASMs, then to a large degree, I consider it >> their responsibility to make sure things are properly extended > Right. They should also find out the hard way, with zero documentation telling > them they had to (let alone *why* they had to), and without a hypothetical > -fsanitize=abi that would catch exactly the point of the missing extension > instead of letting the program crash mysteriously at a much later point. > Uphill both ways, in a world where LLVM does not place such burden on the > programmer, and even among GCC targets only mips64 made a precedent (also > without documenting the new requirement). They're writing assembly code -- in my book that means they'd better have a pretty good understanding of the architecture, its limitations and quirks. I'm happy to give them some diagnostics, guardrails, etc etc, but slowing down standard C code to placate someone who doesn't really know the architecture, but is trying to write assembly code is a step too far for me. > >> -- even more so >> if having the compiler do it results in slower code independent of ASMs. > I think LLVM demonstrates well enough that a compiler can do a better job > than GCC at eliminating redundant extensions without upgrading requirements > for inline asm (in the usual C code, not for sub-word outputs of an asm). Perhaps. But it's also the case the LLVM and GCC have some significant differences in implementation. Sometimes those differences in impementations have notable impacts on how code is generated. jeff > > Alexander
On Mon, 21 Nov 2022, Jeff Law wrote: > They're writing assembly code -- in my book that means they'd better have a > pretty good understanding of the architecture, its limitations and quirks. That GCC ties together optimization and inline asm interface via its internal TARGET_MODE_REP_EXTENDED macro is not a quirk of the RISC-V architecture. It's a quirk of GCC. Alexander
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 675d92c0961..cf829f390ab 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -5053,6 +5053,18 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode) return true; } +/* Implement TARGET_MODE_REP_EXTENDED. */ + +static int +riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep) +{ + /* On 64-bit targets, SImode register values are sign-extended to DImode. */ + if (TARGET_64BIT && mode == SImode && mode_rep == DImode) + return SIGN_EXTEND; + + return UNKNOWN; +} + /* Implement TARGET_MODES_TIEABLE_P. Don't allow floating-point modes to be tied, since type punning of @@ -6153,6 +6165,9 @@ riscv_init_libfuncs (void) #undef TARGET_HARD_REGNO_MODE_OK #define TARGET_HARD_REGNO_MODE_OK riscv_hard_regno_mode_ok +#undef TARGET_MODE_REP_EXTENDED +#define TARGET_MODE_REP_EXTENDED riscv_mode_rep_extended + #undef TARGET_MODES_TIEABLE_P #define TARGET_MODES_TIEABLE_P riscv_modes_tieable_p diff --git a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c index 5f93be15ac5..2f38b3f0fec 100644 --- a/gcc/testsuite/gcc.target/riscv/shift-shift-2.c +++ b/gcc/testsuite/gcc.target/riscv/shift-shift-2.c @@ -38,5 +38,3 @@ sub5 (unsigned int i) } /* { dg-final { scan-assembler-times "slli" 5 } } */ /* { dg-final { scan-assembler-times "srli" 5 } } */ -/* { dg-final { scan-assembler-times "slliw" 1 } } */ -/* { dg-final { scan-assembler-times "srliw" 1 } } */