Message ID | 20220727034437.154625-2-juzhe.zhong@rivai.ai |
---|---|
State | New, archived |
Headers |
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6a10:b5d6:b0:2b9:3548:2db5 with SMTP id v22csp154645pxt; Tue, 26 Jul 2022 20:45:56 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sDbC2xKQjIPxfuWerfvd57hpqjM0ZBfKnN3k/zEW3XN4zNkxqaHarU9jGD51IdiCOdvNBt X-Received: by 2002:a05:6402:1102:b0:43a:9cf7:68a3 with SMTP id u2-20020a056402110200b0043a9cf768a3mr21264462edv.68.1658893556228; Tue, 26 Jul 2022 20:45:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1658893556; cv=none; d=google.com; s=arc-20160816; b=j22iiZtwcRdTSRVKiheY5uumMBoCgay0iBZz6aLY8WmAuWKJkrqigONf6cJR8WnGg4 KdZfcLwuoLNxh/pRzYBRYcrI5kJCxXxcivrIC3tUWp5ExJjdtLfrusl7OnU9PBKu/3D8 T8jMOP6TavwAw7w43dAuj0ZrqfHAxnWkuPS2Wkal461KzV3K9ZBo3WAyPlEcotYQDuGP zcJyDjszHVHsqNMfyr30jDrosxRUdCO/McFVpDlC9EnQZ36VGxiDzq4TNj+cZLtlkqVb pj4sYQRcvNGNFIIK7d4Nufb4lm4T4M31WAt9hQuwXVklPlAR6NknfVok/wvFeibYewmU Vmsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:dmarc-filter:delivered-to; bh=isT0ivaH3kHqReY3bUBzAnjBsiYhR249+54/Ysmb7/Y=; b=b64PdPTSSovfb6gciAqKgoQFquRsz+bfdpOAFzSy+9QujMdm5e9kzUu8XmitJcXkl5 XvCLEoq251SQhRYLzp3Na8uNHaaqo960ay34zm4AmnY08CrV2bJbM1ynMzKxzFn0MtQJ 3bgw5aY85Yun6w0+cEKNLPsRYHE4p0dBwwGJ70ezDF71qfvhyoRh4rtU3PgklRpoLw+g ef8eZLB3/V0XfZOj8kCiPR6dZnJltZr9MbYrDtYyjN+aO1hH0HmFHoqUDHqEQ8WQI3aL 4Kq+OtFDvGxUt6028baiwUu3pUxWJ4UFV/FprN6jq49nlDFm2tWxYLYG02b+cSw5Yf7s Ye6g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id cw15-20020a170906478f00b0072b749835e9si5645414ejc.850.2022.07.26.20.45.55 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Jul 2022 20:45:56 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 838CF385AE45 for <ouuuleilei@gmail.com>; Wed, 27 Jul 2022 03:45:29 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgau1.qq.com (smtpbgau1.qq.com [54.206.16.166]) by sourceware.org (Postfix) with ESMTPS id D992D3857C52 for <gcc-patches@gcc.gnu.org>; Wed, 27 Jul 2022 03:45:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D992D3857C52 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp82t1658893489tp2nk9yr Received: from server1.localdomain ( [42.247.22.66]) by bizesmtp.qq.com (ESMTP) with id ; Wed, 27 Jul 2022 11:44:48 +0800 (CST) X-QQ-SSF: 01400000000000C0H000000A0000000 X-QQ-FEAT: nMuci6qhceP+zavxxo4VgM/xD97PW+7y4occEKBzZVTQBVXXUOsMwOlPEYLJJ J2fVBYAjiYPZbWAbDJgoruPrJ1aPz7PLG3TMxEpR9cmzhC81Yl4lkg0kQRsxeSH/YMppyFX xY0Lsi7L2KiYes/eavjtqt7+WspBTtPoSlPwAl5Tp95U6r46hfBqx2NdoGwjyxoauQQ8zC4 a+qAs872DvKfCYBSOJrRFjuejaU69UTUH/VC5NiGBIRTxyHpQb3t4YnfC7uYAwjGr3l0lNK 3NQnkuhpgvGVixbzLl6chnDQYGf206igYklONz7+v+5Wn2nwzl8o84J2GqkSisAkvJ9ru1q TbJnanUwZL+IByE7ZpwABzTQx63yf3IGaEGDD0LC0+6EEtWTu1UbCzE6ASp4sKM3MNcM8xb X-QQ-GoodBg: 2 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Subject: [PATCH 1/1] Fix bit-position comparison Date: Wed, 27 Jul 2022 11:44:37 +0800 Message-Id: <20220727034437.154625-2-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220727034437.154625-1-juzhe.zhong@rivai.ai> References: <20220727034437.154625-1-juzhe.zhong@rivai.ai> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybgforeign:qybgforeign3 X-QQ-Bgrelay: 1 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Cc: jakub@redhat.com, richard.earnshaw@arm.com, rguenther@suse.de, richard.sandiford@arm.com, gnu@the-meissners.org, jlaw@tachyum.com, zhongjuzhe <juzhe.zhong@rivai.ai>, davem@redhat.com, kenner@nyu.edu, ian@airs.com, wilson@tuliptree.org, joseph@codesourcery.com Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1739475969830841128?= X-GMAIL-MSGID: =?utf-8?q?1739475969830841128?= |
Series |
middle-end: Fix bit position comparison
|
|
Commit Message
juzhe.zhong@rivai.ai
July 27, 2022, 3:44 a.m. UTC
From: zhongjuzhe <juzhe.zhong@rivai.ai>
gcc/ChangeLog:
* expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE
---
gcc/expr.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > gcc/ChangeLog: > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > --- > gcc/expr.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/expr.cc b/gcc/expr.cc > index 80bb1b8a4c5..ac2b3c07df6 100644 > --- a/gcc/expr.cc > +++ b/gcc/expr.cc > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > code contains an out-of-bounds access to a small array. */ > if (!MEM_P (to_rtx) > && GET_MODE (to_rtx) != BLKmode > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) I think this has the chance to go wrong with regard to endianess. Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos is relative to the start of to_rtx as if it were in memory and bitsize determines the contiguous region affected. But since we are actually storing into a non-memory endianess comes into play. So no, I don't think the patch is correct, it would be way more complicated to get the desired improvement. Richard. > { > expand_normal (from); > result = NULL; >
Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, The machine_mode PRECISION is calculate by component size which is different from BITSIZE Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. Can you help me with this? This is important for the following RVV upstream support. Thanks. juzhe.zhong@rivai.ai From: Richard Biener Date: 2022-07-27 14:46 To: zhongjuzhe CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson Subject: Re: [PATCH 1/1] Fix bit-position comparison On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > gcc/ChangeLog: > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > --- > gcc/expr.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/expr.cc b/gcc/expr.cc > index 80bb1b8a4c5..ac2b3c07df6 100644 > --- a/gcc/expr.cc > +++ b/gcc/expr.cc > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > code contains an out-of-bounds access to a small array. */ > if (!MEM_P (to_rtx) > && GET_MODE (to_rtx) != BLKmode > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) I think this has the chance to go wrong with regard to endianess. Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos is relative to the start of to_rtx as if it were in memory and bitsize determines the contiguous region affected. But since we are actually storing into a non-memory endianess comes into play. So no, I don't think the patch is correct, it would be way more complicated to get the desired improvement. Richard. > { > expand_normal (from); > result = NULL; > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). Can you explain in terms of GCCs generic vectors what vint8m1_t and vint8mf2_t are? > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > Can you help me with this? This is important for the following RVV upstream support. Thanks. So you have that vint8mf2_t type and registers + instructions to operate on them but no way to load/store? How do you implement vint8mf2_t foo (vint8mf2_t *p) { return *p; } ? > > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 14:46 > To: zhongjuzhe > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > Subject: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > gcc/ChangeLog: > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > --- > > gcc/expr.cc | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > --- a/gcc/expr.cc > > +++ b/gcc/expr.cc > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > code contains an out-of-bounds access to a small array. */ > > if (!MEM_P (to_rtx) > > && GET_MODE (to_rtx) != BLKmode > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > I think this has the chance to go wrong with regard to endianess. > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > is relative to the start of to_rtx as if it were in memory and bitsize > determines the contiguous region affected. But since we are actually > storing into a non-memory endianess comes into play. > > So no, I don't think the patch is correct, it would be way more > complicated to get the desired improvement. > > Richard. > > > { > > expand_normal (from); > > result = NULL; > > > >
Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, zero + vle8.v). You can see it in foo function. In this case we don't need to confuse compiler the size of vint8mf2. However, The second case is I write assembly to occupy the vector register to generate register spilling cases. You can see LLVM implementation: First use vsetvli + vle8.v load (vint8mf2_t) data from the base pointer, then spill the vector to memory using vs1r (this is the whole register store which store the whole vector to the memory and then use vl1r load the whole register and finally return it. In LLVM implementation, it insert vsetvli instructions for RVV instruction using LLVM PASS before RA (register allocation). So after RA, compiler generate the spilling loads/stores. We can either choose to use vsetvli + vle/vse (load/store fractional vector) or vl1r/vs1r (load/store whole vector which enlarge the spill size). Because implementing insert vsetvli PASS after RA (register allocation) is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement RVV GCC reference to LLVM. So that why I want to define the machine_mode for `vint8mf2` with smaller element-size but same byte-size from `vint8m1'. Thank you for your reply. juzhe.zhong@rivai.ai From: Richard Biener Date: 2022-07-27 15:35 To: juzhe.zhong@rivai.ai CC: gcc-patches Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). Can you explain in terms of GCCs generic vectors what vint8m1_t and vint8mf2_t are? > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > Can you help me with this? This is important for the following RVV upstream support. Thanks. So you have that vint8mf2_t type and registers + instructions to operate on them but no way to load/store? How do you implement vint8mf2_t foo (vint8mf2_t *p) { return *p; } ? > > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 14:46 > To: zhongjuzhe > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > Subject: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > gcc/ChangeLog: > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > --- > > gcc/expr.cc | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > --- a/gcc/expr.cc > > +++ b/gcc/expr.cc > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > code contains an out-of-bounds access to a small array. */ > > if (!MEM_P (to_rtx) > > && GET_MODE (to_rtx) != BLKmode > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > I think this has the chance to go wrong with regard to endianess. > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > is relative to the start of to_rtx as if it were in memory and bitsize > determines the contiguous region affected. But since we are actually > storing into a non-memory endianess comes into play. > > So no, I don't think the patch is correct, it would be way more > complicated to get the desired improvement. > > Richard. > > > { > > expand_normal (from); > > result = NULL; > > > > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In > RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, > zero + vle8.v). You can see it in foo function. In this case we don't > need to confuse compiler the size of vint8mf2. However, The second case > is I write assembly to occupy the vector register to generate register > spilling cases. You can see LLVM implementation: First use vsetvli + > vle8.v load (vint8mf2_t) data from the base pointer, then spill the > vector to memory using vs1r (this is the whole register store which > store the whole vector to the memory and then use vl1r load the whole > register and finally return it. In LLVM implementation, it insert > vsetvli instructions for RVV instruction using LLVM PASS before RA > (register allocation). So after RA, compiler generate the spilling > loads/stores. We can either choose to use vsetvli + vle/vse (load/store > fractional vector) or vl1r/vs1r (load/store whole vector which enlarge > the spill size). > > Because implementing insert vsetvli PASS after RA (register allocation) > is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement > RVV GCC reference to LLVM. So that why I want to define the machine_mode > for `vint8mf2` with smaller element-size but same byte-size from > `vint8m1'. The LLVM strathegy might not be the best one for GCC. I still don't quite understand what issue you face with the code you try to patch. How does the machmode.def portion of the port look like for vint8m1 and vint8mf2? What are the corresponding modes? > Thank you for your reply. > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 15:35 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). > > Can you explain in terms of GCCs generic vectors what vint8m1_t and > vint8mf2_t are? > > > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > > > Can you help me with this? This is important for the following RVV upstream support. Thanks. > > So you have that vint8mf2_t type and registers + instructions to operate > on them but no way to load/store? How do you implement > > vint8mf2_t foo (vint8mf2_t *p) > { > return *p; > } > > ? > > > > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-07-27 14:46 > > To: zhongjuzhe > > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > > Subject: Re: [PATCH 1/1] Fix bit-position comparison > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > > > gcc/ChangeLog: > > > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > > > --- > > > gcc/expr.cc | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > > --- a/gcc/expr.cc > > > +++ b/gcc/expr.cc > > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > code contains an out-of-bounds access to a small array. */ > > > if (!MEM_P (to_rtx) > > > && GET_MODE (to_rtx) != BLKmode > > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > > > I think this has the chance to go wrong with regard to endianess. > > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > > is relative to the start of to_rtx as if it were in memory and bitsize > > determines the contiguous region affected. But since we are actually > > storing into a non-memory endianess comes into play. > > > > So no, I don't think the patch is correct, it would be way more > > complicated to get the desired improvement. > > > > Richard. > > > > > { > > > expand_normal (from); > > > result = NULL; > > > > > > > > >
For vint8m1_t: VECTOR_MODES_WITH_PREFIX (VNx, INT, 16, 0) ADJUST_NUNITS (VNx16QI, riscv_vector_chunks * 8); ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); For vint8mf2_t: VECTOR_MODES_WITH_PREFIX (VNx, INT, 8, 0) ADJUST_NUNITS (VNx8QI, riscv_vector_chunks * 4); ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); riscv_vector_chunks is a compile-time unknown poly value, I use ADJUST_BYTES to specify these 2 machine_modes with same bytesize but different element number. This way can tell GCC that vint8mf2_t has half element nunber of vint8m1_t but occupy the same size during the register spilling. May be we can add a new function that call ADJUST_PRECISION that I can adjust these two precision? I am not sure. I am look forward another solution to deal with this issuse. Thank you so much. juzhe.zhong@rivai.ai From: Richard Biener Date: 2022-07-27 16:12 To: juzhe.zhong@rivai.ai CC: gcc-patches Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In > RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, > zero + vle8.v). You can see it in foo function. In this case we don't > need to confuse compiler the size of vint8mf2. However, The second case > is I write assembly to occupy the vector register to generate register > spilling cases. You can see LLVM implementation: First use vsetvli + > vle8.v load (vint8mf2_t) data from the base pointer, then spill the > vector to memory using vs1r (this is the whole register store which > store the whole vector to the memory and then use vl1r load the whole > register and finally return it. In LLVM implementation, it insert > vsetvli instructions for RVV instruction using LLVM PASS before RA > (register allocation). So after RA, compiler generate the spilling > loads/stores. We can either choose to use vsetvli + vle/vse (load/store > fractional vector) or vl1r/vs1r (load/store whole vector which enlarge > the spill size). > > Because implementing insert vsetvli PASS after RA (register allocation) > is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement > RVV GCC reference to LLVM. So that why I want to define the machine_mode > for `vint8mf2` with smaller element-size but same byte-size from > `vint8m1'. The LLVM strathegy might not be the best one for GCC. I still don't quite understand what issue you face with the code you try to patch. How does the machmode.def portion of the port look like for vint8m1 and vint8mf2? What are the corresponding modes? > Thank you for your reply. > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 15:35 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). > > Can you explain in terms of GCCs generic vectors what vint8m1_t and > vint8mf2_t are? > > > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > > > Can you help me with this? This is important for the following RVV upstream support. Thanks. > > So you have that vint8mf2_t type and registers + instructions to operate > on them but no way to load/store? How do you implement > > vint8mf2_t foo (vint8mf2_t *p) > { > return *p; > } > > ? > > > > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-07-27 14:46 > > To: zhongjuzhe > > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > > Subject: Re: [PATCH 1/1] Fix bit-position comparison > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > > > gcc/ChangeLog: > > > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > > > --- > > > gcc/expr.cc | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > > --- a/gcc/expr.cc > > > +++ b/gcc/expr.cc > > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > code contains an out-of-bounds access to a small array. */ > > > if (!MEM_P (to_rtx) > > > && GET_MODE (to_rtx) != BLKmode > > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > > > I think this has the chance to go wrong with regard to endianess. > > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > > is relative to the start of to_rtx as if it were in memory and bitsize > > determines the contiguous region affected. But since we are actually > > storing into a non-memory endianess comes into play. > > > > So no, I don't think the patch is correct, it would be way more > > complicated to get the desired improvement. > > > > Richard. > > > > > { > > > expand_normal (from); > > > result = NULL; > > > > > > > > > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
Richard Biener <rguenther@suse.de> writes: > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > >> From: zhongjuzhe <juzhe.zhong@rivai.ai> >> >> gcc/ChangeLog: >> >> * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE >> >> --- >> gcc/expr.cc | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/gcc/expr.cc b/gcc/expr.cc >> index 80bb1b8a4c5..ac2b3c07df6 100644 >> --- a/gcc/expr.cc >> +++ b/gcc/expr.cc >> @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) >> code contains an out-of-bounds access to a small array. */ >> if (!MEM_P (to_rtx) >> && GET_MODE (to_rtx) != BLKmode >> - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) >> + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > I think this has the chance to go wrong with regard to endianess. > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > is relative to the start of to_rtx as if it were in memory and bitsize > determines the contiguous region affected. But since we are actually > storing into a non-memory endianess comes into play. Not sure I follow the code well enough to comment, but: this condition is saying when we can drop the assignment on the floor and just expand the rhs (for side effects). From that point of view, using bitsize should be more conservative than using precision, since bitsize is always >= precision. Thanks, Richard > > So no, I don't think the patch is correct, it would be way more > complicated to get the desired improvement. > > Richard. > >> { >> expand_normal (from); >> result = NULL; >>
On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > For vint8m1_t: > VECTOR_MODES_WITH_PREFIX (VNx, INT, 16, 0) > ADJUST_NUNITS (VNx16QI, riscv_vector_chunks * 8); > ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); > For vint8mf2_t: > VECTOR_MODES_WITH_PREFIX (VNx, INT, 8, 0) > ADJUST_NUNITS (VNx8QI, riscv_vector_chunks * 4); > ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); ^^^ ADJUST_BYTES (VNx8QI, riscv_vector_chunks * 8); probably. As said, I'm not sure this is a good idea just for the sake of spill slots. Maybe there's a mode_for_spill one could use and spill a paradoxical subreg of that mode ... (don't search for mode_for_spill, I just made that up). Maybe you can somehow forbit spilling of vint8mf2_t and instead define spill_class for those so they spill first to vint8m1_t and then those get spilled to the stack? > riscv_vector_chunks is a compile-time unknown poly value, I use ADJUST_BYTES to specify these 2 machine_modes > with same bytesize but different element number. > This way can tell GCC that vint8mf2_t has half element nunber of vint8m1_t but occupy the same size during the register spilling. > > May be we can add a new function that call ADJUST_PRECISION that I can adjust these two precision? > I am not sure. I am look forward another solution to deal with this issuse. Thank you so much. > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 16:12 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In > > RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, > > zero + vle8.v). You can see it in foo function. In this case we don't > > need to confuse compiler the size of vint8mf2. However, The second case > > is I write assembly to occupy the vector register to generate register > > spilling cases. You can see LLVM implementation: First use vsetvli + > > vle8.v load (vint8mf2_t) data from the base pointer, then spill the > > vector to memory using vs1r (this is the whole register store which > > store the whole vector to the memory and then use vl1r load the whole > > register and finally return it. In LLVM implementation, it insert > > vsetvli instructions for RVV instruction using LLVM PASS before RA > > (register allocation). So after RA, compiler generate the spilling > > loads/stores. We can either choose to use vsetvli + vle/vse (load/store > > fractional vector) or vl1r/vs1r (load/store whole vector which enlarge > > the spill size). > > > > Because implementing insert vsetvli PASS after RA (register allocation) > > is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement > > RVV GCC reference to LLVM. So that why I want to define the machine_mode > > for `vint8mf2` with smaller element-size but same byte-size from > > `vint8m1'. > > The LLVM strathegy might not be the best one for GCC. I still don't > quite understand what issue you face with the code you try to patch. > How does the machmode.def portion of the port look like for > vint8m1 and vint8mf2? What are the corresponding modes? > > > Thank you for your reply. > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-07-27 15:35 > > To: juzhe.zhong@rivai.ai > > CC: gcc-patches > > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > > > > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). > > > > Can you explain in terms of GCCs generic vectors what vint8m1_t and > > vint8mf2_t are? > > > > > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > > > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > > > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > > > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > > > > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > > > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > > > > > Can you help me with this? This is important for the following RVV upstream support. Thanks. > > > > So you have that vint8mf2_t type and registers + instructions to operate > > on them but no way to load/store? How do you implement > > > > vint8mf2_t foo (vint8mf2_t *p) > > { > > return *p; > > } > > > > ? > > > > > > > > > > > > > > > > > juzhe.zhong@rivai.ai > > > > > > From: Richard Biener > > > Date: 2022-07-27 14:46 > > > To: zhongjuzhe > > > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > > > Subject: Re: [PATCH 1/1] Fix bit-position comparison > > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > > > > > gcc/ChangeLog: > > > > > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > > > > > --- > > > > gcc/expr.cc | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > > > --- a/gcc/expr.cc > > > > +++ b/gcc/expr.cc > > > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > > code contains an out-of-bounds access to a small array. */ > > > > if (!MEM_P (to_rtx) > > > > && GET_MODE (to_rtx) != BLKmode > > > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > > > > > I think this has the chance to go wrong with regard to endianess. > > > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > > > is relative to the start of to_rtx as if it were in memory and bitsize > > > determines the contiguous region affected. But since we are actually > > > storing into a non-memory endianess comes into play. > > > > > > So no, I don't think the patch is correct, it would be way more > > > complicated to get the desired improvement. > > > > > > Richard. > > > > > > > { > > > > expand_normal (from); > > > > result = NULL; > > > > > > > > > > > > > > > >
After consideration. Maybe I can try another solution. I aggree with you that it is not good idea that fake the BYTESIZE of vint8mf2_t and let GCC think it is a full vector. I think the best way is let GCC understand the real size of 'vint8mf2_t' as a half of a vector and then RISC-V backend insert vsetvli after RA (register allocation) so that it will not enlarge the spill size. Thank you so much. juzhe.zhong@rivai.ai From: Richard Biener Date: 2022-07-27 20:56 To: juzhe.zhong@rivai.ai CC: gcc-patches; vmakarov Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > For vint8m1_t: > VECTOR_MODES_WITH_PREFIX (VNx, INT, 16, 0) > ADJUST_NUNITS (VNx16QI, riscv_vector_chunks * 8); > ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); > For vint8mf2_t: > VECTOR_MODES_WITH_PREFIX (VNx, INT, 8, 0) > ADJUST_NUNITS (VNx8QI, riscv_vector_chunks * 4); > ADJUST_BYTES (VNx16QI, riscv_vector_chunks * 8); ^^^ ADJUST_BYTES (VNx8QI, riscv_vector_chunks * 8); probably. As said, I'm not sure this is a good idea just for the sake of spill slots. Maybe there's a mode_for_spill one could use and spill a paradoxical subreg of that mode ... (don't search for mode_for_spill, I just made that up). Maybe you can somehow forbit spilling of vint8mf2_t and instead define spill_class for those so they spill first to vint8m1_t and then those get spilled to the stack? > riscv_vector_chunks is a compile-time unknown poly value, I use ADJUST_BYTES to specify these 2 machine_modes > with same bytesize but different element number. > This way can tell GCC that vint8mf2_t has half element nunber of vint8m1_t but occupy the same size during the register spilling. > > May be we can add a new function that call ADJUST_PRECISION that I can adjust these two precision? > I am not sure. I am look forward another solution to deal with this issuse. Thank you so much. > > > > juzhe.zhong@rivai.ai > > From: Richard Biener > Date: 2022-07-27 16:12 > To: juzhe.zhong@rivai.ai > CC: gcc-patches > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > Let's take look at these 2 cases: https://godbolt.org/z/zP16frPnb. In > > RVV, we have vle8 and vsetvli to specify loading vint8mf2 (vsetvli a1, > > zero + vle8.v). You can see it in foo function. In this case we don't > > need to confuse compiler the size of vint8mf2. However, The second case > > is I write assembly to occupy the vector register to generate register > > spilling cases. You can see LLVM implementation: First use vsetvli + > > vle8.v load (vint8mf2_t) data from the base pointer, then spill the > > vector to memory using vs1r (this is the whole register store which > > store the whole vector to the memory and then use vl1r load the whole > > register and finally return it. In LLVM implementation, it insert > > vsetvli instructions for RVV instruction using LLVM PASS before RA > > (register allocation). So after RA, compiler generate the spilling > > loads/stores. We can either choose to use vsetvli + vle/vse (load/store > > fractional vector) or vl1r/vs1r (load/store whole vector which enlarge > > the spill size). > > > > Because implementing insert vsetvli PASS after RA (register allocation) > > is so complicated, LLVM choose to use vl1r/vs1r. Frankly, I implement > > RVV GCC reference to LLVM. So that why I want to define the machine_mode > > for `vint8mf2` with smaller element-size but same byte-size from > > `vint8m1'. > > The LLVM strathegy might not be the best one for GCC. I still don't > quite understand what issue you face with the code you try to patch. > How does the machmode.def portion of the port look like for > vint8m1 and vint8mf2? What are the corresponding modes? > > > Thank you for your reply. > > > > > > > > juzhe.zhong@rivai.ai > > > > From: Richard Biener > > Date: 2022-07-27 15:35 > > To: juzhe.zhong@rivai.ai > > CC: gcc-patches > > Subject: Re: Re: [PATCH 1/1] Fix bit-position comparison > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > Thank you so much for the fast reply. Ok, it is true that I didn't think about it carefully. Can you help me with the following the issue? > > > > > > For RVV (RISC-V 'V' Extension), we have full vector type 'vint8m1_t' (LMUL = 1) and fractional vector type 'vint8mf2_t' (LMUL = 1/2). > > > > Can you explain in terms of GCCs generic vectors what vint8m1_t and > > vint8mf2_t are? > > > > > Because in the ISA, we don't have whole register load/store for fractional vector. I reference the LLVM implementation and I adjust BITSIZE of > > > fractional vector same as full vector (It will confuse GCC the bytesize of fractional vector and consider the spill size of a fractional vector is same as LMUL = 1) > > > so that I can use whole register load/store directly during the register spilling. (Even though it will enlarge the spill size). According to the machine_mode definition, > > > The machine_mode PRECISION is calculate by component size which is different from BITSIZE > > > > > > Now, here is the question. For array type: vint8mf2x4_t, if I want to access vint8mf2x4_t[2], because the PRECISION and BITSIZE are different. Because bitops is calculated by > > > bitsize and compare to precision in the codes that the patch mentioned. It will make a out-of-bounds access to small array. > > > > > > Can you help me with this? This is important for the following RVV upstream support. Thanks. > > > > So you have that vint8mf2_t type and registers + instructions to operate > > on them but no way to load/store? How do you implement > > > > vint8mf2_t foo (vint8mf2_t *p) > > { > > return *p; > > } > > > > ? > > > > > > > > > > > > > > > > > juzhe.zhong@rivai.ai > > > > > > From: Richard Biener > > > Date: 2022-07-27 14:46 > > > To: zhongjuzhe > > > CC: gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; richard.sandiford; bernds_cb1; ian; wilson > > > Subject: Re: [PATCH 1/1] Fix bit-position comparison > > > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > > > > > > > From: zhongjuzhe <juzhe.zhong@rivai.ai> > > > > > > > > gcc/ChangeLog: > > > > > > > > * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE > > > > > > > > --- > > > > gcc/expr.cc | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/gcc/expr.cc b/gcc/expr.cc > > > > index 80bb1b8a4c5..ac2b3c07df6 100644 > > > > --- a/gcc/expr.cc > > > > +++ b/gcc/expr.cc > > > > @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) > > > > code contains an out-of-bounds access to a small array. */ > > > > if (!MEM_P (to_rtx) > > > > && GET_MODE (to_rtx) != BLKmode > > > > - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) > > > > + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > > > > > I think this has the chance to go wrong with regard to endianess. > > > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > > > is relative to the start of to_rtx as if it were in memory and bitsize > > > determines the contiguous region affected. But since we are actually > > > storing into a non-memory endianess comes into play. > > > > > > So no, I don't think the patch is correct, it would be way more > > > complicated to get the desired improvement. > > > > > > Richard. > > > > > > > { > > > > expand_normal (from); > > > > result = NULL; > > > > > > > > > > > > > > > > -- Richard Biener <rguenther@suse.de> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)
Thank you for your reply. I am gonna try another to implement the fractional vector spilling of RVV in RISC-V backend. If this patch is really having a bad impact on other targets. I think this patch may needs to be abandon. juzhe.zhong@rivai.ai From: Richard Sandiford Date: 2022-07-27 20:47 To: Richard Biener CC: zhongjuzhe; gcc-patches; richard.earnshaw; jakub; kenner; jlaw; gnu; jason; davem; joseph; bernds_cb1; ian; wilson Subject: Re: [PATCH 1/1] Fix bit-position comparison Richard Biener <rguenther@suse.de> writes: > On Wed, 27 Jul 2022, juzhe.zhong@rivai.ai wrote: > >> From: zhongjuzhe <juzhe.zhong@rivai.ai> >> >> gcc/ChangeLog: >> >> * expr.cc (expand_assignment): Change GET_MODE_PRECISION to GET_MODE_BITSIZE >> >> --- >> gcc/expr.cc | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/gcc/expr.cc b/gcc/expr.cc >> index 80bb1b8a4c5..ac2b3c07df6 100644 >> --- a/gcc/expr.cc >> +++ b/gcc/expr.cc >> @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) >> code contains an out-of-bounds access to a small array. */ >> if (!MEM_P (to_rtx) >> && GET_MODE (to_rtx) != BLKmode >> - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) >> + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) > > I think this has the chance to go wrong with regard to endianess. > Consider to_rtx with 32bit mode size but 12bit mode precision. bitpos > is relative to the start of to_rtx as if it were in memory and bitsize > determines the contiguous region affected. But since we are actually > storing into a non-memory endianess comes into play. Not sure I follow the code well enough to comment, but: this condition is saying when we can drop the assignment on the floor and just expand the rhs (for side effects). From that point of view, using bitsize should be more conservative than using precision, since bitsize is always >= precision. Thanks, Richard > > So no, I don't think the patch is correct, it would be way more > complicated to get the desired improvement. > > Richard. > >> { >> expand_normal (from); >> result = NULL; >>
diff --git a/gcc/expr.cc b/gcc/expr.cc index 80bb1b8a4c5..ac2b3c07df6 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -5574,7 +5574,7 @@ expand_assignment (tree to, tree from, bool nontemporal) code contains an out-of-bounds access to a small array. */ if (!MEM_P (to_rtx) && GET_MODE (to_rtx) != BLKmode - && known_ge (bitpos, GET_MODE_PRECISION (GET_MODE (to_rtx)))) + && known_ge (bitpos, GET_MODE_BITSIZE (GET_MODE (to_rtx)))) { expand_normal (from); result = NULL;