[Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]
Checks
Commit Message
The mem-to-mem insn pattern is splitted from reg-to-mem/mem-to-reg/reg-to-reg
causes ICE in RA since RA prefer they stay together.
Now, we split mem-to-mem as a pure pre-RA split pattern and only allow
define_insn match mem-to-mem VLS move in pre-RA stage (Forbid mem-to-mem move after RA).
Tested no difference. Committed.
PR target/111566
gcc/ChangeLog:
* config/riscv/vector.md (*mov<mode>_mem_to_mem):
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/fortran/pr111566.f90: New test.
---
gcc/config/riscv/vector.md | 19 +++++++++---
.../gcc.target/riscv/rvv/fortran/pr111566.f90 | 31 +++++++++++++++++++
2 files changed, 45 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/fortran/pr111566.f90
Comments
On 9/26/23 07:15, Juzhe-Zhong wrote:
> The mem-to-mem insn pattern is splitted from reg-to-mem/mem-to-reg/reg-to-reg
> causes ICE in RA since RA prefer they stay together.
>
> Now, we split mem-to-mem as a pure pre-RA split pattern and only allow
> define_insn match mem-to-mem VLS move in pre-RA stage (Forbid mem-to-mem move after RA).
>
> Tested no difference. Committed.
>
> PR target/111566
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md (*mov<mode>_mem_to_mem):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/fortran/pr111566.f90: New test.
ChangeLog for the vector.md is missing.
In general we shouldn't be allowing mem->mem in most patterns since the
hardware doesn't actually implement such instructions. I suspect that's
the real problem here and that ultimately you're just papering over it.
Jeff
Thanks Jeff.
Address comments:
[PATCH V2] RISC-V: Fix mem-to-mem VLS move pattern[PR111566] (gnu.org)
Actually, we only allow mem-to-mem move for VLS modes size <= MAX_BITS_PER_WORD.
Since we want to optimize this case:
- typedef int8_t v2qi __attribute__ ((vector_size (2)));
- v2qi v = *(v2qi*)in;
- *(v2qi*)out = v;
using scalar load/store.
Does it look more reasonable ?
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-09-26 21:35
To: Juzhe-Zhong; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]
On 9/26/23 07:15, Juzhe-Zhong wrote:
> The mem-to-mem insn pattern is splitted from reg-to-mem/mem-to-reg/reg-to-reg
> causes ICE in RA since RA prefer they stay together.
>
> Now, we split mem-to-mem as a pure pre-RA split pattern and only allow
> define_insn match mem-to-mem VLS move in pre-RA stage (Forbid mem-to-mem move after RA).
>
> Tested no difference. Committed.
>
> PR target/111566
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md (*mov<mode>_mem_to_mem):
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/fortran/pr111566.f90: New test.
ChangeLog for the vector.md is missing.
In general we shouldn't be allowing mem->mem in most patterns since the
hardware doesn't actually implement such instructions. I suspect that's
the real problem here and that ultimately you're just papering over it.
Jeff
On 9/26/23 08:51, 钟居哲 wrote:
> Thanks Jeff.
>
> Address comments:
> [PATCH V2] RISC-V: Fix mem-to-mem VLS move pattern[PR111566] (gnu.org)
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631395.html>
>
> Actually, we only allow mem-to-mem move for VLS modes size <=
> MAX_BITS_PER_WORD.
> Since we want to optimize this case:
>
> - typedef int8_t v2qi __attribute__ ((vector_size (2)));
> - v2qi v = *(v2qi*)in;
> - *(v2qi*)out = v;
>
> using scalar load/store.
That should be do-able without resorting to a pattern that allowed
mem->mem moves.
THe thing you have to be careful about is in the effort to optimize this
case, you can end up confusing the register allocator into making poor
choices elsewhere. ie, once you expose a small vector move as
implementable in GPRs you run the very real risk of pessimizing other code.
But even with that caveat, the better way to go here is to disallow the
mem->mem case.
jeff
OK。
Remove mem-to-mem pattern:
[PATCH V3] RISC-V: Remove mem-to-mem VLS move pattern[PR111566] (gnu.org)
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-09-26 23:15
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]
On 9/26/23 08:51, 钟居哲 wrote:
> Thanks Jeff.
>
> Address comments:
> [PATCH V2] RISC-V: Fix mem-to-mem VLS move pattern[PR111566] (gnu.org)
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631395.html>
>
> Actually, we only allow mem-to-mem move for VLS modes size <=
> MAX_BITS_PER_WORD.
> Since we want to optimize this case:
>
> - typedef int8_t v2qi __attribute__ ((vector_size (2)));
> - v2qi v = *(v2qi*)in;
> - *(v2qi*)out = v;
>
> using scalar load/store.
That should be do-able without resorting to a pattern that allowed
mem->mem moves.
THe thing you have to be careful about is in the effort to optimize this
case, you can end up confusing the register allocator into making poor
choices elsewhere. ie, once you expose a small vector move as
implementable in GPRs you run the very real risk of pessimizing other code.
But even with that caveat, the better way to go here is to disallow the
mem->mem case.
jeff
Hi, Jeff.
I removed mem-to-mem patterns as you suggested that means we don't have scalar move optimization for small size vector modes.
Is it ok for trunk?
Since it is a bug fix patch, I hope we can land it soon. We may will find another way to optimize small size vector mode mem-to-mem.
juzhe.zhong@rivai.ai
From: Jeff Law
Date: 2023-09-26 23:15
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [Committed] RISC-V: Fix mem-to-mem VLS move pattern[PR111566]
On 9/26/23 08:51, 钟居哲 wrote:
> Thanks Jeff.
>
> Address comments:
> [PATCH V2] RISC-V: Fix mem-to-mem VLS move pattern[PR111566] (gnu.org)
> <https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631395.html>
>
> Actually, we only allow mem-to-mem move for VLS modes size <=
> MAX_BITS_PER_WORD.
> Since we want to optimize this case:
>
> - typedef int8_t v2qi __attribute__ ((vector_size (2)));
> - v2qi v = *(v2qi*)in;
> - *(v2qi*)out = v;
>
> using scalar load/store.
That should be do-able without resorting to a pattern that allowed
mem->mem moves.
THe thing you have to be careful about is in the effort to optimize this
case, you can end up confusing the register allocator into making poor
choices elsewhere. ie, once you expose a small vector move as
implementable in GPRs you run the very real risk of pessimizing other code.
But even with that caveat, the better way to go here is to disallow the
mem->mem case.
jeff
On 9/26/23 17:08, 钟居哲 wrote:
> Hi, Jeff.
>
> I removed mem-to-mem patterns as you suggested that means we don't have
> scalar move optimization for small size vector modes.
> Is it ok for trunk?
> Since it is a bug fix patch, I hope we can land it soon. We may will
> find another way to optimize small size vector mode mem-to-mem.
It's OK with me.
jeff
@@ -1222,12 +1222,14 @@
DONE;
})
-(define_insn_and_split "*mov<mode>_mem_to_mem"
+;; Some VLS modes (like V2SImode) have size <= a general purpose
+;; register width, we optimize such mem-to-mem move into mem-to-mem
+;; scalar move. Otherwise, we always force operands[1] into register
+;; so that we will never get mem-to-mem move after RA.
+(define_split
[(set (match_operand:VLS_AVL_IMM 0 "memory_operand")
(match_operand:VLS_AVL_IMM 1 "memory_operand"))]
"TARGET_VECTOR && can_create_pseudo_p ()"
- "#"
- "&& 1"
[(const_int 0)]
{
if (GET_MODE_BITSIZE (<MODE>mode).to_constant () <= MAX_BITS_PER_WORD)
@@ -1256,14 +1258,21 @@
}
DONE;
}
- [(set_attr "type" "vmov")]
)
+;; We recognize mem-to-mem move in pre-RA stage so that we won't have
+;; ICE (unrecognizable insn: (set (mem) (mem))). Then, the previous
+;; mem-to-mem split pattern will force operands[1] into a register so
+;; that mem-to-mem move will never happen after RA.
+;;
+;; We don't allow mem-to-mem move in post-RA stage since we
+;; don't have an instruction to split mem-to-mem move after RA.
(define_insn_and_split "*mov<mode>"
[(set (match_operand:VLS_AVL_IMM 0 "reg_or_mem_operand" "=vr, m, vr")
(match_operand:VLS_AVL_IMM 1 "reg_or_mem_operand" " m,vr, vr"))]
"TARGET_VECTOR
- && (register_operand (operands[0], <MODE>mode)
+ && (can_create_pseudo_p ()
+ || register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
"@
#
new file mode 100644
@@ -0,0 +1,31 @@
+! { dg-do compile }
+! { dg-options "-march=rv64gcv -mabi=lp64d -Ofast -fallow-argument-mismatch -fmax-stack-var-size=65536 -S -std=legacy -w" }
+
+module a
+ integer,parameter :: SHR_KIND_R8 = selected_real_kind(12)
+end module a
+module b
+ use a, c => shr_kind_r8
+contains
+ subroutine d(cg , km, i1, i2)
+ real (c) ch(i2,km)
+ real (c) cg(4,i1:i2,km)
+ real dc(i2,km)
+ real(c) ci(i2,km)
+ real(c) cj(i2,km)
+ do k=2,ck
+ do i=i1,0
+ cl = ci(i,k) *ci(i,1) / cj(i,k)+ch(i,1)
+ cm = cg(1,i,k) - min(e,cg(1,i,co))
+ dc(i,k) = sign(cm, cl)
+ enddo
+ enddo
+ if ( cq == 0 ) then
+ do i=i1,i2
+ if( cr <= cs ) then
+ cg= sign( min(ct, cg), cg)
+ endif
+ enddo
+ endif
+ end subroutine d
+end module b