[v3] RISC-V: Fix regression of -fzero-call-used-regs=all

Message ID 20230407123249.2600968-1-yanzhang.wang@intel.com
State Accepted
Headers
Series [v3] RISC-V: Fix regression of -fzero-call-used-regs=all |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Li, Pan2 via Gcc-patches April 7, 2023, 12:32 p.m. UTC
  From: Yanzhang Wang <yanzhang.wang@intel.com>

This patch registers a riscv specific function to
TARGET_ZERO_CALL_USED_REGS instead of default in targhooks.cc. It will
clean gpr and vector relevant registers.

	PR 109104

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (emit_hard_vlmax_vsetvl):
	* config/riscv/riscv-v.cc (emit_hard_vlmax_vsetvl):
	(emit_vlmax_vsetvl):
	* config/riscv/riscv.cc (vector_zero_call_used_regs):
	(riscv_zero_call_used_regs):
	(TARGET_ZERO_CALL_USED_REGS):

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zero-scratch-regs-1.c: New test.
	* gcc.target/riscv/zero-scratch-regs-2.c: New test.
	* gcc.target/riscv/zero-scratch-regs-3.c: New test.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
Co-authored-by: Pan Li <pan2.li@intel.com>
Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
---
 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv-v.cc                   | 15 +++-
 gcc/config/riscv/riscv.cc                     | 71 +++++++++++++++++++
 .../gcc.target/riscv/zero-scratch-regs-1.c    |  9 +++
 .../gcc.target/riscv/zero-scratch-regs-2.c    | 24 +++++++
 .../gcc.target/riscv/zero-scratch-regs-3.c    | 57 +++++++++++++++
 6 files changed, 174 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c
  

Comments

Jeff Law April 8, 2023, 6:39 p.m. UTC | #1
On 4/7/23 06:32, yanzhang.wang--- via Gcc-patches wrote:
> From: Yanzhang Wang <yanzhang.wang@intel.com>
> 
> This patch registers a riscv specific function to
> TARGET_ZERO_CALL_USED_REGS instead of default in targhooks.cc. It will
> clean gpr and vector relevant registers.
> 
> 	PR 109104
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-protos.h (emit_hard_vlmax_vsetvl):
> 	* config/riscv/riscv-v.cc (emit_hard_vlmax_vsetvl):
> 	(emit_vlmax_vsetvl):
> 	* config/riscv/riscv.cc (vector_zero_call_used_regs):
> 	(riscv_zero_call_used_regs):
> 	(TARGET_ZERO_CALL_USED_REGS):
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/zero-scratch-regs-1.c: New test.
> 	* gcc.target/riscv/zero-scratch-regs-2.c: New test.
> 	* gcc.target/riscv/zero-scratch-regs-3.c: New test.
Presumably the difficulty here is we need to find a suitable hard 
register so that we can emit the vsetvl.


> 
> Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
> Co-authored-by: Pan Li <pan2.li@intel.com>
> Co-authored-by: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
> Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
> ---
>   gcc/config/riscv/riscv-protos.h               |  1 +
>   gcc/config/riscv/riscv-v.cc                   | 15 +++-
>   gcc/config/riscv/riscv.cc                     | 71 +++++++++++++++++++
>   .../gcc.target/riscv/zero-scratch-regs-1.c    |  9 +++
>   .../gcc.target/riscv/zero-scratch-regs-2.c    | 24 +++++++
>   .../gcc.target/riscv/zero-scratch-regs-3.c    | 57 +++++++++++++++
>   6 files changed, 174 insertions(+), 3 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c
>   create mode 100644 gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c
> 
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 2e91d019f6c..13dd6639c9f 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -118,6 +118,17 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
>   	  && IN_RANGE (INTVAL (elt), minval, maxval));
>   }
>   
> +/* Emit a vlmax vsetvl instruction with side effect, this should be only used
> +   when optimization is tune off or emit after vsetvl insertion pass.  */
Minor grammar errors.  I'd probably write it as:

/* Emit a vlvax vsetvl instruction.  This should only be used
    when optimization is disabled or after the vsetvl insertion pass.  */


Do you need to save/restore the vector configuration before and after 
clearing the vector registers?    If so, that seems to be missing.  If 
not, it seems like a comment explaining why would be useful.




Jeff
  
Li, Pan2 via Gcc-patches April 10, 2023, 2:21 a.m. UTC | #2
Thanks Jeff's comment.

> Presumably the difficulty here is we need to find a suitable hard
> register so that we can emit the vsetvl.
 
Yes. We use the GPR which has been flagged in the need_zeroed_regs to
hold the vl. There should be one GPR we can use, otherwise, will throw
an exception.
 
> Do you need to save/restore the vector configuration before and after
> clearing the vector registers?    If so, that seems to be missing.  If
> not, it seems like a comment explaining why would be useful.

I'll add some comments in the code and want to explain here first.
We need not save/restore the vector configurations. Because, by design,
the RVV requires vsetvl when using vector instructions. When users want to
use the RVV insns next, they should have to issue vsetvl first.

Thanks,
Yanzhang
  
Kito Cheng April 10, 2023, 3:11 a.m. UTC | #3
> > Do you need to save/restore the vector configuration before and after
> > clearing the vector registers?    If so, that seems to be missing.  If
> > not, it seems like a comment explaining why would be useful.
>
> I'll add some comments in the code and want to explain here first.
> We need not save/restore the vector configurations. Because, by design,
> the RVV requires vsetvl when using vector instructions. When users want to
> use the RVV insns next, they should have to issue vsetvl first.

I think one keypoint here is -fzero-call-used-regs=* emit zeroing
instruction before return,
that means there won't be any vector operations between return and
zeroing instructions,
so we don't need to restore the vcsr after zeroing.
  
Jeff Law April 10, 2023, 8:57 p.m. UTC | #4
On 4/9/23 21:11, Kito Cheng wrote:

> I think one keypoint here is -fzero-call-used-regs=* emit zeroing
> instruction before return, that means there won't be any vector
> operations between return and zeroing instructions, so we don't need
> to restore the vcsr after zeroing.
Oh yea, makes perfect sense.  Thanks.
jeff
  

Patch

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 4611447ddde..5244e8dcbf0 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -159,6 +159,7 @@  bool check_builtin_call (location_t, vec<location_t>, unsigned int,
 bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 bool legitimize_move (rtx, rtx, machine_mode);
 void emit_vlmax_vsetvl (machine_mode, rtx);
+void emit_hard_vlmax_vsetvl (machine_mode, rtx);
 void emit_vlmax_op (unsigned, rtx, rtx, machine_mode);
 void emit_vlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
 void emit_nonvlmax_op (unsigned, rtx, rtx, rtx, machine_mode);
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 2e91d019f6c..13dd6639c9f 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -118,6 +118,17 @@  const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval,
 	  && IN_RANGE (INTVAL (elt), minval, maxval));
 }
 
+/* Emit a vlmax vsetvl instruction with side effect, this should be only used
+   when optimization is tune off or emit after vsetvl insertion pass.  */
+void
+emit_hard_vlmax_vsetvl (machine_mode vmode, rtx vl)
+{
+  unsigned int sew = get_sew (vmode);
+  emit_insn (gen_vsetvl (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode),
+			 gen_int_mode (get_vlmul (vmode), Pmode), const0_rtx,
+			 const0_rtx));
+}
+
 void
 emit_vlmax_vsetvl (machine_mode vmode, rtx vl)
 {
@@ -126,9 +137,7 @@  emit_vlmax_vsetvl (machine_mode vmode, rtx vl)
   unsigned int ratio = calculate_ratio (sew, vlmul);
 
   if (!optimize)
-    emit_insn (gen_vsetvl (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode),
-			   gen_int_mode (get_vlmul (vmode), Pmode), const0_rtx,
-			   const0_rtx));
+    emit_hard_vlmax_vsetvl (vmode, vl);
   else
     emit_insn (gen_vlmax_avl (Pmode, vl, gen_int_mode (ratio, Pmode)));
 }
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5f542932d13..a6a610f5901 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -7066,6 +7066,74 @@  riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
   return shamt == ctz_hwi (mask);
 }
 
+HARD_REG_SET
+vector_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
+{
+  HARD_REG_SET zeroed_hardregs;
+  CLEAR_HARD_REG_SET (zeroed_hardregs);
+
+  /* Find a register to hold vl.  */
+  unsigned vl_regno = INVALID_REGNUM;
+  /* Skip the first GPR, otherwise the existing vl is kept due to the same
+     between vl and avl.  */
+  for (unsigned regno = GP_REG_FIRST + 1; regno <= GP_REG_LAST; regno++)
+    {
+      if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
+	{
+	  vl_regno = regno;
+	  break;
+	}
+    }
+
+  if (vl_regno > GP_REG_LAST)
+    sorry ("cannot allocate vl register for %qs on this target",
+	   "-fzero-call-used-regs");
+
+  bool emitted_vlmax_vsetvl = false;
+  rtx vl = gen_rtx_REG (Pmode, vl_regno); /* vl is VLMAX.  */
+  for (unsigned regno = V_REG_FIRST; regno <= V_REG_LAST; ++regno)
+    {
+      if (TEST_HARD_REG_BIT (need_zeroed_hardregs, regno))
+	{
+	  rtx target = regno_reg_rtx[regno];
+	  machine_mode mode = GET_MODE (target);
+	  poly_uint16 nunits = GET_MODE_NUNITS (mode);
+	  machine_mode mask_mode = riscv_vector::get_vector_mode (BImode,
+								  nunits)
+	    .require ();
+
+	  if (!emitted_vlmax_vsetvl)
+	    {
+	      riscv_vector::emit_hard_vlmax_vsetvl (mode, vl);
+	      emitted_vlmax_vsetvl = true;
+	    }
+
+	  riscv_vector::emit_vlmax_op (code_for_pred_mov (mode), target,
+				       CONST0_RTX (mode), vl, mask_mode);
+
+	  SET_HARD_REG_BIT (zeroed_hardregs, regno);
+	}
+    }
+
+  return zeroed_hardregs;
+}
+
+/* Generate a sequence of instructions that zero registers specified by
+   NEED_ZEROED_HARDREGS.  Return the ZEROED_HARDREGS that are actually
+   zeroed.  */
+HARD_REG_SET
+riscv_zero_call_used_regs (HARD_REG_SET need_zeroed_hardregs)
+{
+  HARD_REG_SET zeroed_hardregs;
+  CLEAR_HARD_REG_SET (zeroed_hardregs);
+
+  if (TARGET_VECTOR)
+    zeroed_hardregs |= vector_zero_call_used_regs (need_zeroed_hardregs);
+
+  return zeroed_hardregs | default_zero_call_used_regs (need_zeroed_hardregs
+							& ~zeroed_hardregs);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -7317,6 +7385,9 @@  riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
 #undef TARGET_DWARF_POLY_INDETERMINATE_VALUE
 #define TARGET_DWARF_POLY_INDETERMINATE_VALUE riscv_dwarf_poly_indeterminate_value
 
+#undef TARGET_ZERO_CALL_USED_REGS
+#define TARGET_ZERO_CALL_USED_REGS riscv_zero_call_used_regs
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-riscv.h"
diff --git a/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c
new file mode 100644
index 00000000000..41d94ab921a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-1.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64 -O2 -fzero-call-used-regs=used -fno-stack-protector -fno-PIC" } */
+
+void
+foo (void)
+{
+}
+
+/* { dg-final { scan-assembler-not "li\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c
new file mode 100644
index 00000000000..9161dd3d4ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-2.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64 -O2 -fzero-call-used-regs=all-gpr" } */
+
+void
+foo (void)
+{
+}
+
+/* { dg-final { scan-assembler-not "vsetvli" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t0,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t1,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t2,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a0,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a1,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a2,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a3,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a4,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a5,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a6,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a7,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t3,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t4,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t5,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t6,0" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c
new file mode 100644
index 00000000000..824fe9e548f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zero-scratch-regs-3.c
@@ -0,0 +1,57 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64 -O2 -fzero-call-used-regs=all" } */
+
+void
+foo (void)
+{
+}
+
+/* { dg-final { scan-assembler "vsetvli\[ \t\]*t0,zero,e32,m1,tu,mu" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v0,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v1,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v2,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v3,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v4,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v5,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v6,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v7,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v8,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v9,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v10,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v11,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v12,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v13,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v14,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v15,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v16,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v17,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v18,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v19,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v20,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v21,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v22,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v23,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v24,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v25,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v26,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v27,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v28,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v29,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v30,0" } } */
+/* { dg-final { scan-assembler "vmv.v.i\[ \t\]*v31,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t0,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t1,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t2,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a0,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a1,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a2,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a3,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a4,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a5,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a6,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*a7,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t3,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t4,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t5,0" } } */
+/* { dg-final { scan-assembler "li\[ \t\]*t6,0" } } */
+/* { dg-final { scan-assembler "fmv.d.x\[ \t\]*ft0,zero" } } */