[GCC14,QUEUE] RISC-V: Optimize fault only first load

Message ID 20230330012804.110539-1-juzhe.zhong@rivai.ai
State Accepted
Headers
Series [GCC14,QUEUE] RISC-V: Optimize fault only first load |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

juzhe.zhong@rivai.ai March 30, 2023, 1:28 a.m. UTC
  From: Juzhe-Zhong <juzhe.zhong@rivai.ai>

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Adapt PASS.
        * config/riscv/vector-iterators.md: New unspec.
        * config/riscv/vector.md: Optimize fault only first load pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/vsetvl/ffload-1.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-2.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-3.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-4.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-5.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-6.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-7.c: New test.

---
 gcc/config/riscv/riscv-vsetvl.cc              |  3 +-
 gcc/config/riscv/vector-iterators.md          |  1 +
 gcc/config/riscv/vector.md                    | 10 ++++-
 .../gcc.target/riscv/rvv/vsetvl/ffload-1.c    | 21 +++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-2.c    | 28 ++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-3.c    | 28 ++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-4.c    | 37 +++++++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-5.c    | 29 +++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-6.c    | 29 +++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-7.c    | 32 ++++++++++++++++
 10 files changed, 216 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c
  

Comments

Jeff Law April 22, 2023, 3:18 a.m. UTC | #1
On 3/29/23 19:28, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
> 
> gcc/ChangeLog:
> 
>          * config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Adapt PASS.
This doesn't provide any useful information as far as I can tell. 
Perhaps something like:
Erase AVL from instructions with the fault first load property.

OK with a better ChangeLog entry.

Related.  As a separate patch, can you add a function comment to 
cleanup_insns?  It doesn't have one and it should.

Thanks,
jeff
  
juzhe.zhong@rivai.ai April 23, 2023, 10:58 p.m. UTC | #2
Hi, Jeff.
I have fixed patches as you suggested:
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616515.html 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616518.html 
https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616516.html 

Can you merge these patches?


juzhe.zhong@rivai.ai
 
From: Jeff Law
Date: 2023-04-22 11:18
To: juzhe.zhong; gcc-patches
CC: kito.cheng; palmer
Subject: Re: [GCC14 QUEUE PATCH] RISC-V: Optimize fault only first load
 
 
On 3/29/23 19:28, juzhe.zhong@rivai.ai wrote:
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
> 
> gcc/ChangeLog:
> 
>          * config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Adapt PASS.
This doesn't provide any useful information as far as I can tell. 
Perhaps something like:
Erase AVL from instructions with the fault first load property.
 
OK with a better ChangeLog entry.
 
Related.  As a separate patch, can you add a function comment to 
cleanup_insns?  It doesn't have one and it should.
 
Thanks,
jeff
  
Jeff Law April 24, 2023, 11:47 p.m. UTC | #3
On 4/23/23 16:58, 钟居哲 wrote:
> Hi, Jeff.
> I have fixed patches as you suggested:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616515.html 
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616515.html>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616518.html 
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616518.html>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616516.html 
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-April/616516.html>
> 
> Can you merge these patches?
I would really prefer you get to the point where you're committing your 
own patches.  I'm already quite overloaded and having to apply your 
patches isn't going to help.

I'm willing to invest some time to address concerns/problems you may 
have with the commit flow as that ultimately makes both of us more 
effective.  But I really don't have the time to sit here and push patches.

So let's start with the  basics.  Have you applied for and received 
write permissions?  If so, add yourself to the MAINTAINERS file.  If 
not, please fill out this form:

> https://sourceware.org/cgi-bin/pdw/ps_form.cgi



Jeff
  

Patch

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 58568b45010..4d043c0645b 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -4003,7 +4003,8 @@  pass_vsetvl::cleanup_insns (void) const
 	  if (!has_vl_op (rinsn) || !REG_P (get_vl (rinsn)))
 	    continue;
 	  rtx avl = get_vl (rinsn);
-	  if (count_occurrences (PATTERN (rinsn), avl, 0) == 1)
+	  if (count_occurrences (PATTERN (rinsn), avl, 0) == 1
+	      || fault_first_load_p (rinsn))
 	    {
 	      /* Get the list of uses for the new instruction.  */
 	      auto attempt = crtl->ssa->new_change_attempt ();
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 34e486e48ca..8fff61eff30 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -80,6 +80,7 @@ 
   UNSPEC_VRGATHEREI16
   UNSPEC_VCOMPRESS
   UNSPEC_VLEFF
+  UNSPEC_MODIFY_VL
 ])
 
 (define_mode_iterator V [
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index b0a4d4cea69..92adfb06122 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7537,7 +7537,15 @@ 
 	  (unspec:V
 	    [(match_operand:V 3 "memory_operand"         "    m,     m,     m,     m")] UNSPEC_VLEFF)
 	  (match_operand:V 2 "vector_merge_operand"      "   vu,     0,    vu,     0")))
-   (set (reg:SI VL_REGNUM) (unspec:SI [(match_dup 0)] UNSPEC_VLEFF))]
+   (set (reg:SI VL_REGNUM)
+   	  (unspec:SI
+	    [(if_then_else:V
+	       (unspec:<VM>
+		[(match_dup 1) (match_dup 4) (match_dup 5)
+		 (match_dup 6) (match_dup 7)
+	 	 (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+	       (unspec:V [(match_dup 3)] UNSPEC_VLEFF)
+	       (match_dup 2))] UNSPEC_MODIFY_VL))]
   "TARGET_VECTOR"
   "vle<sew>ff.v\t%0,%3%p1"
   [(set_attr "type" "vldff")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
new file mode 100644
index 00000000000..b2b7eafa945
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int cond,size_t *new_vl,size_t *new_vl2)
+{
+  size_t vl = 101;
+  
+  vint8mf8_t v = __riscv_vle8_v_i8mf8 (in, vl);
+  __riscv_vse8_v_i8mf8 (out, v, vl);
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, vl);
+  vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + 100, new_vl, vl);
+  __riscv_vse8_v_i8mf8 (out + 100, v2, *new_vl);
+  v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v2, in + 200, new_vl2, vl);
+  __riscv_vse8_v_i8mf8 (out + 200, v2, *new_vl2);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {csrr} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-not {vmv} { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
new file mode 100644
index 00000000000..c0e21d461e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
new file mode 100644
index 00000000000..9e90b189bd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < m; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
new file mode 100644
index 00000000000..eee027e4d48
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
@@ -0,0 +1,37 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (int i = 0 ; i < n * n; i++)
+    out[i] = out[i] + out[i];
+  
+  for (int i = 0 ; i < n * n * n; i++)
+    out[i] = out[i] * out[i];
+
+  for (int i = 0 ; i < n * n * n * n; i++)
+    out[i] = out[i] * out[i];
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
new file mode 100644
index 00000000000..895180cc54e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  size_t new_vl;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &new_vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, new_vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, new_vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
new file mode 100644
index 00000000000..1b32f4ab24b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  size_t new_vl;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &new_vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, new_vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 3 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c
new file mode 100644
index 00000000000..1c08b75873d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  if (cond)
+    vl = m * 2;
+  else
+    vl = m * 2 * vl;
+  
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+      
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+      
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */