RISC-V: Avoid unnecessary slideup in compress pattern of vec_perm

Message ID 20230910035538.2034153-1-juzhe.zhong@rivai.ai
State Unresolved
Series RISC-V: Avoid unnecessary slideup in compress pattern of vec_perm

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

juzhe.zhong@rivai.ai Sept. 10, 2023, 3:55 a.m. UTC
  If all elements of a const vector are the same, the slideup is unnecessary.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (shuffle_compress_patterns): Avoid unnecessary slideup.

---
 gcc/config/riscv/riscv-v.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Jeff Law Sept. 10, 2023, 1:34 p.m. UTC | #1
On 9/9/23 21:55, Juzhe-Zhong wrote:
> If a const vector all elements are same, the slide up is unnecessary.
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv-v.cc (shuffle_compress_patterns): Avoid unnecessary slideup.
> 
> ---
>   gcc/config/riscv/riscv-v.cc | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index bee60de1d26..7ef884907b8 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -2697,7 +2697,7 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
>     rtx mask = force_reg (mask_mode, builder.build ());
>   
>     rtx merge = d->op1;
> -  if (need_slideup_p)
> +  if (need_slideup_p && !const_vec_duplicate_p (d->op1))
>       {
>         int slideup_cnt = vlen - (d->perm[vlen - 1].to_constant () % vlen) - 1;
>         rtx ops[] = {d->target, d->op1, gen_int_mode (slideup_cnt, Pmode)};
Would it be better to adjust how we compute need_slideup_p to check 
!const_vec_duplicate_p (d->op1) instead of doing it here?

That way the name "need_slideup_p" stays consistent with the intent of 
the code.  It would also mean we wouldn't need to duplicate the 
additional check if we wanted to model the use of slideup in the cost 
calculations.

Jeff
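
The suggestion above can be sketched outside of GCC as a toy model. This is not the riscv-v.cc code: is_splat, slideup_count and need_slideup are hypothetical helpers that work on plain std::vector values rather than RTL, and the "count is non-zero" condition is an assumption derived only from the slideup_cnt formula in the quoted hunk, not from how need_slideup_p is actually computed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for const_vec_duplicate_p: true if every element is equal.
bool is_splat (const std::vector<int> &v)
{
  for (std::size_t i = 1; i < v.size (); ++i)
    if (v[i] != v[0])
      return false;
  return true;
}

// Slide count using the formula from the quoted hunk:
//   vlen - (perm[vlen - 1] % vlen) - 1
// Assumes perm is non-empty.
std::size_t slideup_count (const std::vector<std::size_t> &perm)
{
  std::size_t vlen = perm.size ();
  return vlen - (perm[vlen - 1] % vlen) - 1;
}

// Hypothetical predicate with the splat check folded in, so the name keeps
// matching what the code does and a cost model could reuse the same answer.
bool need_slideup (const std::vector<std::size_t> &perm,
                   const std::vector<int> &op1)
{
  return slideup_count (perm) != 0 && !is_splat (op1);
}
```

With this shape, the expansion code and any later cost calculation would both ask need_slideup and never repeat the splat test themselves.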
  
juzhe.zhong@rivai.ai Sept. 10, 2023, 2:31 p.m. UTC | #2
Comment addressed in V2: [PATCH V2] RISC-V: Avoid unnecessary slideup in compress pattern of vec_perm (gnu.org)




Patch

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index bee60de1d26..7ef884907b8 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -2697,7 +2697,7 @@  shuffle_compress_patterns (struct expand_vec_perm_d *d)
   rtx mask = force_reg (mask_mode, builder.build ());
 
   rtx merge = d->op1;
-  if (need_slideup_p)
+  if (need_slideup_p && !const_vec_duplicate_p (d->op1))
     {
       int slideup_cnt = vlen - (d->perm[vlen - 1].to_constant () % vlen) - 1;
       rtx ops[] = {d->target, d->op1, gen_int_mode (slideup_cnt, Pmode)};
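
To see why skipping the slideup is safe for a duplicated constant, here is a minimal toy model in plain C++. It is a sketch under stated assumptions, not GCC code: Vec, slideup, compress and expand are hypothetical names, vectors are std::vector<int>, and the two helpers only approximate the vslideup and vcompress behaviour that matters here (lower elements of the slide destination unchanged, unselected tail taken from the merge operand).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<int>;

// Toy vslideup: dest[i] = src[i - offset] for i >= offset; elements of dest
// below offset are left unchanged.
Vec slideup (Vec dest, const Vec &src, std::size_t offset)
{
  for (std::size_t i = offset; i < dest.size (); ++i)
    dest[i] = src[i - offset];
  return dest;
}

// Toy vcompress: pack the elements of src selected by mask into the low
// positions of the result; the remaining positions keep the merge values.
Vec compress (const Vec &src, const std::vector<bool> &mask, Vec merge)
{
  std::size_t out = 0;
  for (std::size_t i = 0; i < src.size (); ++i)
    if (mask[i])
      merge[out++] = src[i];
  return merge;
}

// Expand a compress-style permutation. last_index plays the role of
// d->perm[vlen - 1]; do_slideup chooses whether op1 is slid up first.
Vec expand (const Vec &op0, const Vec &op1, const std::vector<bool> &mask,
            std::size_t last_index, bool do_slideup)
{
  std::size_t vlen = op0.size ();
  Vec merge = op1;
  if (do_slideup)
    {
      std::size_t cnt = vlen - (last_index % vlen) - 1;
      Vec scratch (vlen, -1); // stands in for the old contents of the target
      merge = slideup (scratch, op1, cnt);
    }
  return compress (op0, mask, merge);
}
```

For an 8-element example with the permutation {0, 2, 5, 7, 8, 9, 10, 11}, the slide count is 4. When op1 holds the same value in every lane, expand gives the same result with and without the slide, because every tail position of the merge operand already holds that value. When op1 holds distinct values, the two results differ, which is why the slide has to stay in the general case.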