RISC-V: Enable movmisalign for VLS modes
Checks
Commit Message
Prevous patch (which removed VLA modes movmisalign pattern) to fix run-time bug.
Such patch disable vectorization for misalign data movement.
After I check LLVM codes, LLVM supports misalign for VLS modes.
Before this patch:
sll a5,a4,0x1
add a5,a5,a1
lhu a3,64(a5)
lbu a5,66(a5)
addw a4,a4,1
srl a3,a3,0x8
sll a5,a5,0x8
or a5,a5,a3
sh a5,0(a2)
add a2,a2,2
bne a4,a0,101f8 <foo+0x14>
After this patch:
foo:
lui a0,%hi(.LANCHOR0)
addi a0,a0,%lo(.LANCHOR0)
addi sp,sp,-16
addi a1,a0,1
li a2,64
sd ra,8(sp)
vsetvli zero,a2,e8,m4,ta,ma
addi a0,a0,128
vle8.v v4,0(a1)
vse8.v v4,0(a0)
call memcmp
bne a0,zero,.L6
ld ra,8(sp)
addi sp,sp,16
jr ra
.L6:
call abort
Note this patch has passed all testcases in "vect" which are related to alignment.
gcc/ChangeLog:
* config/riscv/autovec-vls.md (movmisalign<mode>): New pattern.
* config/riscv/riscv.cc (riscv_support_vector_misalignment): Support VLS misalign.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vls/misalign-1.c: New test.
---
gcc/config/riscv/autovec-vls.md | 19 +++++++++++++
gcc/config/riscv/riscv.cc | 16 +++++++----
.../riscv/rvv/autovec/vls/misalign-1.c | 27 +++++++++++++++++++
3 files changed, 57 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/misalign-1.c
Comments
> + /* To support misalign data movement, we should use
> + minimum element alignment load/store. */
> + unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode));
> + poly_int64 nunits = GET_MODE_NUNITS (<MODE>mode) * size;
> + machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require ();
> + operands[0] = gen_lowpart (mode, operands[0]);
> + operands[1] = gen_lowpart (mode, operands[1]);
> + if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
> + operands[1] = force_reg (mode, operands[1]);
Does force_reg safe for movmisalign?
On 8/29/23 07:54, Kito Cheng via Gcc-patches wrote:
>> + /* To support misalign data movement, we should use
>> + minimum element alignment load/store. */
>> + unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode));
>> + poly_int64 nunits = GET_MODE_NUNITS (<MODE>mode) * size;
>> + machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require ();
>> + operands[0] = gen_lowpart (mode, operands[0]);
>> + operands[1] = gen_lowpart (mode, operands[1]);
>> + if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
>> + operands[1] = force_reg (mode, operands[1]);
>
> Does force_reg safe for movmisalign?
It should be. It's a pretty common idiom. Essentially it's going to
result in generating this for the MEM->MEM case:
MEM->REG
REG->MEM
Both of which are likely to go through the misalign expander.
I was about to ACK when I had to leave for a few minutes.
OK for the trunk.
jeff
> OK for the trunk.
Thanks. Will commit it soon.
> Does force_reg safe for movmisalign?
Both operands[0] and operands[1] are vector QImode already, so it's safe to force reg.
And we have fully tested MEM->MEM and CONST->MEM in gcc.dg/vect.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-08-29 22:23
To: Kito Cheng; Juzhe-Zhong
CC: gcc-patches; kito.cheng
Subject: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes
On 8/29/23 07:54, Kito Cheng via Gcc-patches wrote:
>> + /* To support misalign data movement, we should use
>> + minimum element alignment load/store. */
>> + unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode));
>> + poly_int64 nunits = GET_MODE_NUNITS (<MODE>mode) * size;
>> + machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require ();
>> + operands[0] = gen_lowpart (mode, operands[0]);
>> + operands[1] = gen_lowpart (mode, operands[1]);
>> + if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
>> + operands[1] = force_reg (mode, operands[1]);
>
> Does force_reg safe for movmisalign?
It should be. It's a pretty common idiom. Essentially it's going to
result in generating this for the MEM->MEM case:
MEM->REG
REG->MEM
Both of which are likely to go through the misalign expander.
I was about to ACK when I had to leave for a few minutes.
OK for the trunk.
jeff
Committed, thanks Jeff and Kito.
Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of ???
Sent: Wednesday, August 30, 2023 6:27 AM
To: Jeff Law <jeffreyalaw@gmail.com>; kito.cheng <kito.cheng@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>; kito.cheng <kito.cheng@sifive.com>
Subject: Re: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes
> OK for the trunk.
Thanks. Will commit it soon.
> Does force_reg safe for movmisalign?
Both operands[0] and operands[1] are vector QImode already, so it's safe to force reg.
And we have fully tested MEM->MEM and CONST->MEM in gcc.dg/vect.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-08-29 22:23
To: Kito Cheng; Juzhe-Zhong
CC: gcc-patches; kito.cheng
Subject: Re: [PATCH] RISC-V: Enable movmisalign for VLS modes
On 8/29/23 07:54, Kito Cheng via Gcc-patches wrote:
>> + /* To support misalign data movement, we should use
>> + minimum element alignment load/store. */
>> + unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode));
>> + poly_int64 nunits = GET_MODE_NUNITS (<MODE>mode) * size;
>> + machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require ();
>> + operands[0] = gen_lowpart (mode, operands[0]);
>> + operands[1] = gen_lowpart (mode, operands[1]);
>> + if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
>> + operands[1] = force_reg (mode, operands[1]);
>
> Does force_reg safe for movmisalign?
It should be. It's a pretty common idiom. Essentially it's going to
result in generating this for the MEM->MEM case:
MEM->REG
REG->MEM
Both of which are likely to go through the misalign expander.
I was about to ACK when I had to leave for a few minutes.
OK for the trunk.
jeff
@@ -140,6 +140,25 @@
[(set_attr "type" "vmov")
(set_attr "mode" "<MODE>")])
+(define_expand "movmisalign<mode>"
+ [(set (match_operand:VLS 0 "nonimmediate_operand")
+ (match_operand:VLS 1 "general_operand"))]
+ "TARGET_VECTOR"
+ {
+ /* To support misalign data movement, we should use
+ minimum element alignment load/store. */
+ unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (<MODE>mode));
+ poly_int64 nunits = GET_MODE_NUNITS (<MODE>mode) * size;
+ machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require ();
+ operands[0] = gen_lowpart (mode, operands[0]);
+ operands[1] = gen_lowpart (mode, operands[1]);
+ if (MEM_P (operands[0]) && !register_operand (operands[1], mode))
+ operands[1] = force_reg (mode, operands[1]);
+ riscv_vector::emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP, operands);
+ DONE;
+ }
+)
+
;; -----------------------------------------------------------------
;; ---- Duplicate Operations
;; -----------------------------------------------------------------
@@ -8063,12 +8063,18 @@ riscv_support_vector_misalignment (machine_mode mode,
int misalignment,
bool is_packed ATTRIBUTE_UNUSED)
{
- /* TODO: For RVV scalable vector auto-vectorization, we should allow
- movmisalign<mode> pattern to handle misalign data movement to unblock
- possible auto-vectorization.
+ /* Only enable misalign data movements for VLS modes. */
+ if (TARGET_VECTOR_VLS && STRICT_ALIGNMENT)
+ {
+ /* Return if movmisalign pattern is not supported for this mode. */
+ if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing)
+ return false;
- RVV VLS auto-vectorization or SIMD auto-vectorization can be supported here
- in the future. */
+ /* Misalignment factor is unknown at compile time. */
+ if (misalignment == -1)
+ return false;
+ }
+ /* Disable movmisalign for VLA auto-vectorization. */
return default_builtin_support_vector_misalignment (mode, type, misalignment,
is_packed);
}
new file mode 100644
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2 --param=riscv-autovec-lmul=m4 -fno-tree-loop-distribute-patterns" } */
+
+#include <stdlib.h>
+
+typedef union U { unsigned short s; unsigned char c; } __attribute__((packed)) U;
+struct S { char e __attribute__((aligned (64))); U s[32]; };
+struct S t = {0, {{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8},
+ {9}, {10}, {11}, {12}, {13}, {14}, {15}, {16},
+ {17}, {18}, {19}, {20}, {21}, {22}, {23}, {24},
+ {25}, {26}, {27}, {28}, {29}, {30}, {31}, {32}}};
+unsigned short d[32] = { 1 };
+
+__attribute__((noinline, noclone)) void
+foo ()
+{
+ int i;
+ for (i = 0; i < 32; i++)
+ d[i] = t.s[i].s;
+ if (__builtin_memcmp (d, t.s, sizeof d))
+ abort ();
+}
+
+/* { dg-final { scan-assembler-times {vle8\.v} 1 } } */
+/* { dg-final { scan-assembler-times {vle8\.v} 1 } } */
+/* { dg-final { scan-assembler-not {vle16\.v} } } */
+/* { dg-final { scan-assembler-not {vle16\.v} } } */