[v1] RISC-V: Bugfix for scalar move with merged operand

Message ID	20230917074234.1541088-1-pan2.li@intel.com
State	Unresolved
Headers	To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, pan2.li@intel.com, yanzhang.wang@intel.com, kito.cheng@gmail.com, rdapp.gcc@gmail.com Subject: [PATCH v1] RISC-V: Bugfix for scalar move with merged operand Date: Sun, 17 Sep 2023 15:42:34 +0800 Message-Id: <20230917074234.1541088-1-pan2.li@intel.com> From: Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: pan2.li@intel.com
Series	[v1] RISC-V: Bugfix for scalar move with merged operand

Checks

Context	Check	Description
snail/gcc-patch-check	warning	Git am fail log

Commit Message

Li, Pan2 via Gcc-patches Sept. 17, 2023, 7:42 a.m. UTC

  From: Pan Li <pan2.li@intel.com>

Given the example below for VLS mode:

void
test (vl_t *u)
{
  vl_t t;
  long long *p = (long long *)&t;

  p[0] = p[1] = 2;

  *u = t;
}

When the index is 0, vec_set simplifies the insn to vmv.s.x without a
merge operand. That causes problems in DCE, as shown below:

1:  137[DI] = a0
2:  138[V2DI] = 134[V2DI]                              // deleted by DCE
3:  139[DI] = #2                                       // deleted by DCE
4:  140[DI] = #2                                       // deleted by DCE
5:  141[V2DI] = vec_dup:V2DI (139[DI])                 // deleted by DCE
6:  138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
7:  135[V2DI] = 138[V2DI]                              // deleted by DCE
8:  142[V2DI] = 135[V2DI]                              // deleted by DCE
9:  143[DI] = #2
10: 142[V2DI] = vec_dup:V2DI (143[DI])
11: (137[DI]) = 142[V2DI]

Because insn 10 does not read the previous value of 142[V2DI], DCE treats
insns 2 through 8 as dead and removes them. The upper 64 bits of 142[V2DI]
are therefore undefined here, and incorrect code is generated when the
value is stored back to memory. This patch fixes the issue by adding a new
SCALAR_MOVE_MERGED_OP for vec_set.

Please note that this patch doesn't enable VLS for vec_set; follow-up
patches will support this soon.
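
Editorial sketch: going by the riscv-protos.h hunk in this patch, the new
insn type appears to differ from SCALAR_MOVE_OP in exactly one flag,
USE_VUNDEF_MERGE_P. The C fragment below only illustrates that. The flag
names and the two compositions are copied from the patch, but the bit
positions and the helper names (sketch_flag, sketch_insn_type,
flag_difference, merge_is_undef) are invented for this example and do not
match the real header.

```c
#include <assert.h>

/* Flag names come from the patch; the bit positions are made up for
   this sketch and are NOT the values used in riscv-protos.h.  */
enum sketch_flag
{
  HAS_DEST_P          = 1u << 0,
  HAS_MASK_P          = 1u << 1,
  USE_ONE_TRUE_MASK_P = 1u << 2,
  HAS_MERGE_P         = 1u << 3,
  USE_VUNDEF_MERGE_P  = 1u << 4,
  TDEFAULT_POLICY_P   = 1u << 5,
  MDEFAULT_POLICY_P   = 1u << 6,
  UNARY_OP_P          = 1u << 7
};

/* The two compositions, as written in the patch.  */
enum sketch_insn_type
{
  SCALAR_MOVE_OP = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P
                   | HAS_MERGE_P | USE_VUNDEF_MERGE_P | TDEFAULT_POLICY_P
                   | MDEFAULT_POLICY_P | UNARY_OP_P,

  SCALAR_MOVE_MERGED_OP = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P
                          | HAS_MERGE_P | TDEFAULT_POLICY_P
                          | MDEFAULT_POLICY_P | UNARY_OP_P
};

/* Hypothetical helper: the set of flags present in only one of the two
   insn types.  */
static unsigned
flag_difference (void)
{
  return (unsigned) SCALAR_MOVE_OP ^ (unsigned) SCALAR_MOVE_MERGED_OP;
}

/* Hypothetical helper: nonzero if the insn type asks for an undefined
   merge operand instead of one supplied by the caller.  */
static int
merge_is_undef (unsigned type)
{
  return (type & USE_VUNDEF_MERGE_P) != 0;
}
```

With the undefined-merge flag dropped, the caller has to pass the merge
operand itself, which is why the autovec.md hunk passes operands[0] a
second time in ops[].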

gcc/ChangeLog:

	* config/riscv/autovec.md: Bugfix.
	* config/riscv/riscv-protos.h (SCALAR_MOVE_MERGED_OP): New enum.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/config/riscv/autovec.md                   |  4 +--
 gcc/config/riscv/riscv-protos.h               |  4 +++
 .../riscv/rvv/base/scalar-move-merged-run-1.c | 29 +++++++++++++++++++
 3 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c

Comments

Jeff Law Sept. 17, 2023, 3:52 p.m. UTC | #1

On 9/17/23 01:42, Pan Li via Gcc-patches wrote:
> From: Pan Li <pan2.li@intel.com>
> 
> Given below example for VLS mode
> 
> void
> test (vl_t *u)
> {
>    vl_t t;
>    long long *p = (long long *)&t;
> 
>    p[0] = p[1] = 2;
> 
>    *u = t;
> }
> 
> The vec_set will simplify the insn to vmv.s.x when index is 0, without
> merged operand. That will result in some problems in DCE, aka:
> 
> 1:  137[DI] = a0
> 2:  138[V2DI] = 134[V2DI]                              // deleted by DCE
> 3:  139[DI] = #2                                       // deleted by DCE
> 4:  140[DI] = #2                                       // deleted by DCE
> 5:  141[V2DI] = vec_dup:V2DI (139[DI])                 // deleted by DCE
> 6:  138[V2DI] = vslideup_imm (138[V2DI], 141[V2DI], 1) // deleted by DCE
> 7:  135[V2DI] = 138[V2DI]                              // deleted by DCE
> 8:  142[V2DI] = 135[V2DI]                              // deleted by DCE
> 9:  143[DI] = #2
> 10: 142[V2DI] = vec_dup:V2DI (143[DI])
> 11: (137[DI]) = 142[V2DI]
> 
> The higher 64 bits of 142[V2DI] is unknown here and it generated
> incorrect code when store back to memory. This patch would like to
> fix this issue by adding a new SCALAR_MOVE_MERGED_OP for vec_set.
I must be missing something.  Doesn't insn 10 broadcast the immediate 
0x2 to both elements of r142?!?  What am I missing?

Jeff

Li, Pan2 via Gcc-patches Sept. 18, 2023, 1:34 a.m. UTC | #2

> I must be missing something.  Doesn't insn 10 broadcast the immediate 
> 0x2 to both elements of r142?!?  What am I missing?

Thanks Jeff for comments.

Insn 10 is a VECTOR_SCALAR_MOV, i.e. vmv.s.x in the generated asm code.

Pan

Robin Dapp Sept. 18, 2023, 10 a.m. UTC | #3

> I must be missing something.  Doesn't insn 10 broadcast the immediate
> 0x2 to both elements of r142?!?  What am I missing?
It is indeed a bit misleading.  The difference is in the mask which
is not displayed in the short form.  So we actually use a vec_dup
for a single-element move, essentially a masked vec_dup where only
one element is masked in.

The problem was that the original doesn't use a merging "vec_set"
but a "destructive" one where the other elements get ignored.

The fix is OK IMHO. 

Regards
 Robin
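
Editorial sketch: the difference between a "destructive" and a merging
single-element move can be modelled in plain C. This is not GCC code. The
names v2di_model, UNSPECIFIED, scalar_move_undef, scalar_move_merged and
check_model are hypothetical, and UNSPECIFIED is only a stand-in for
"whatever happens to be in the register", since an undefined merge operand
leaves the other element without a meaningful value.

```c
#include <assert.h>

/* Hypothetical model of a V2DI value: two 64-bit elements.  */
typedef struct { long long e[2]; } v2di_model;

/* Placeholder for an element whose value is not specified
   (an assumption made for this sketch only).  */
#define UNSPECIFIED 0x5a5a5a5a5a5a5a5aLL

/* "Destructive" scalar move: element 0 is written and nothing is
   merged in, so element 1 carries no meaningful value.  */
static v2di_model
scalar_move_undef (long long x)
{
  v2di_model r = { { x, UNSPECIFIED } };
  return r;
}

/* Merging scalar move: element 0 is written and element 1 is taken
   from the old destination value.  */
static v2di_model
scalar_move_merged (v2di_model old, long long x)
{
  v2di_model r = old;
  r.e[0] = x;
  return r;
}

/* Mirrors p[0] = p[1] = 2 from the test case: element 1 has already
   been set to 2, then element 0 is written by the scalar move.  */
static void
check_model (void)
{
  v2di_model t = { { 0, 2 } };
  v2di_model good = scalar_move_merged (t, 2);
  v2di_model bad = scalar_move_undef (2);

  assert (good.e[0] == 2 && good.e[1] == 2);
  assert (bad.e[0] == 2 && bad.e[1] != 2);
}
```

In the model, only the merging variant reads the old value. That is the
dependence that keeps the earlier insns from being treated as dead.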

Li, Pan2 via Gcc-patches Sept. 18, 2023, 10:24 a.m. UTC | #4

Thanks Robin, let's wait for Jeff's confirmation on this.

Pan

Jeff Law Sept. 18, 2023, 5:44 p.m. UTC | #5

On 9/18/23 04:00, Robin Dapp wrote:
>> I must be missing something.  Doesn't insn 10 broadcast the immediate
>> 0x2 to both elements of r142?!?  What am I missing?
> It is indeed a bit misleading.  The difference is in the mask which
> is not displayed in the short form.  So we actually use a vec_dup
> for a single-element move, essentially a masked vec_dup where only
> one element is masked in.
Ah :-)

> 
> The problem was that the original doesn't use a merging "vec_set"
> but a "destructive" one where the other elements get ignored.
> 
> The fix is OK IMHO.
Agreed.

jeff

Li, Pan2 via Gcc-patches Sept. 18, 2023, 11:03 p.m. UTC | #6

Committed, thanks Jeff and Robin.

Pan

Patch

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index aca86554a94..01291ad9830 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1401,9 +1401,9 @@  (define_expand "vec_set<mode>"
   /* If we set the first element, emit an v(f)mv.s.[xf].  */
   if (operands[2] == const0_rtx)
     {
-      rtx ops[] = {operands[0], operands[1]};
+      rtx ops[] = {operands[0], operands[0], operands[1]};
       riscv_vector::emit_nonvlmax_insn (code_for_pred_broadcast (<MODE>mode),
-                                         riscv_vector::SCALAR_MOVE_OP, ops, CONST1_RTX (Pmode));
+					riscv_vector::SCALAR_MOVE_MERGED_OP, ops, CONST1_RTX (Pmode));
     }
   else
     {
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a2d218d67b..6d9367d9602 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -345,6 +345,10 @@  enum insn_type : unsigned int
   SCALAR_MOVE_OP = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P | HAS_MERGE_P
 		   | USE_VUNDEF_MERGE_P | TDEFAULT_POLICY_P | MDEFAULT_POLICY_P
 		   | UNARY_OP_P,
+
+  SCALAR_MOVE_MERGED_OP = HAS_DEST_P | HAS_MASK_P | USE_ONE_TRUE_MASK_P
+			  | HAS_MERGE_P | TDEFAULT_POLICY_P | MDEFAULT_POLICY_P
+			  | UNARY_OP_P,
 };
 
 enum vlmul_type
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c
new file mode 100644
index 00000000000..7aee75c6940
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/scalar-move-merged-run-1.c
@@ -0,0 +1,29 @@ 
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "-O3 -Wno-psabi" } */
+
+#define TEST_VAL 2
+
+typedef long long vl_t __attribute__((vector_size(2 * sizeof (long long))));
+
+void init_vl (vl_t *u)
+{
+  vl_t t;
+  long long *p = (long long *)&t;
+
+  p[0] = p[1] = TEST_VAL;
+
+  *u = t;
+}
+
+int
+main ()
+{
+  vl_t vl = {};
+
+  init_vl (&vl);
+
+  if (vl[0] != TEST_VAL || vl[1] != TEST_VAL)
+    __builtin_abort ();
+
+  return 0;
+}