[V2] VECT: Fix ICE of variable stride on strided load/store with SELECT_VL loop control.

Message ID 20230706065135.3448078-1-juzhe.zhong@rivai.ai
State Unresolved
Series [V2] VECT: Fix ICE of variable stride on strided load/store with SELECT_VL loop control.

Checks

Context                Check    Description
snail/gcc-patch-check  warning  Git am fail log

Commit Message

juzhe.zhong@rivai.ai July 6, 2023, 6:51 a.m. UTC
  From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

Hi, Richi.

Sorry for the mistake I made in LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE
with SELECT_VL loop control.

Consider the following case:
#define TEST_LOOP(DATA_TYPE, BITS)                                             \
  void __attribute__ ((noinline, noclone))                                     \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
			  INDEX##BITS stride, INDEX##BITS n)                   \
  {                                                                            \
    for (INDEX##BITS i = 0; i < n; ++i)                                        \
      dest[i] += src[i * stride];                                              \
  }
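
A minimal sketch of one instantiation of this macro, for concreteness. The
INDEX64 typedef and the choice of int32_t as DATA_TYPE are assumptions made
for illustration; the testsuite file that defines them is not shown here.

```c
#include <stdint.h>

/* Assumed for illustration: the index type that INDEX##BITS pastes to.  */
typedef int64_t INDEX64;

#define TEST_LOOP(DATA_TYPE, BITS)                                             \
  void __attribute__ ((noinline, noclone))                                     \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
			  INDEX##BITS stride, INDEX##BITS n)                   \
  {                                                                            \
    for (INDEX##BITS i = 0; i < n; ++i)                                        \
      dest[i] += src[i * stride];                                              \
  }

/* Expands to f_int32_t_64, whose stride is only known at run time.  */
TEST_LOOP (int32_t, 64)
```

Because "stride" is a function parameter here, the step of the src data
reference is a variable expression rather than a constant.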

When "stride" is a constant, the current flow works fine.
However, when "stride" is a variable, it causes an ICE:
# vectp_src.67_85 = PHI <vectp_src.67_86(6), src_21(D)(12)>
...
_96 = .SELECT_VL (ivtmp_94, 4);
...
ivtmp_78 = ((sizetype) _39 * (sizetype) _96) * 4;
vect__11.69_87 = .LEN_MASK_GATHER_LOAD (vectp_src.67_85, _84, 4, { 0, 0, 0, 0 }, { -1, -1, -1, -1 }, _96, 0);
...
vectp_src.67_86 = vectp_src.67_85 + ivtmp_78;

Because the IR: ivtmp_78 = ((sizetype) _39 * (sizetype) _96) * 4;
is a nested expression in a single assignment, which is not valid GIMPLE.

Instead, I split the IR into:

step_stride = _39
step = step_stride * 4
ivtmp_78 = step * _96
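
As a rough scalar model of the above (not the vectorizer's actual code), the
sketch below mimics a SELECT_VL-controlled loop: each iteration handles vl
elements and then advances the source pointer by step * vl bytes, with the
bump built one operation per statement as in the split form. The names
select_vl, VF, bump_nested, bump_split and strided_add are hypothetical, and
the min-style select_vl is only one behaviour a target might choose.

```c
#include <stddef.h>
#include <stdint.h>

#define VF 4 /* assumed vectorization factor, matching the "4" in the dump */

/* Hypothetical stand-in for .SELECT_VL: elements handled this iteration.  */
static size_t
select_vl (size_t remaining)
{
  return remaining < VF ? remaining : VF;
}

/* Bump written as one nested expression, as in the failing IR.  */
static size_t
bump_nested (size_t stride, size_t vl)
{
  return (stride * vl) * sizeof (int32_t);
}

/* Bump after the split: one operation per statement.  */
static size_t
bump_split (size_t stride, size_t vl)
{
  size_t step_stride = stride;
  size_t step = step_stride * sizeof (int32_t);
  size_t ivtmp = step * vl;
  return ivtmp;
}

/* Scalar model of dest[i] += src[i * stride] under SELECT_VL control.  */
static void
strided_add (int32_t *dest, const int32_t *src, size_t stride, size_t n)
{
  const char *vectp_src = (const char *) src;
  size_t remaining = n;

  while (remaining > 0)
    {
      size_t vl = select_vl (remaining);

      for (size_t lane = 0; lane < vl; ++lane)
	{
	  /* Models the gather: lane L reads at byte offset L * stride * 4.  */
	  size_t offset = lane * stride * sizeof (int32_t);
	  dest[lane] += *(const int32_t *) (vectp_src + offset);
	}

      /* The pointer bump depends on vl, so it is recomputed every time.  */
      vectp_src += bump_split (stride, vl);
      dest += vl;
      remaining -= vl;
    }
}
```

Since size_t arithmetic wraps modulo a power of two, reassociating the
multiplication does not change the value; only the statement shape differs.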

Thanks.

gcc/ChangeLog:

        * tree-vect-stmts.cc (vect_get_strided_load_store_ops): Fix ICE.

---
 gcc/tree-vect-stmts.cc | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
  

Comments

Richard Biener July 6, 2023, 7:08 a.m. UTC | #1
On Thu, 6 Jul 2023, juzhe.zhong@rivai.ai wrote:

> From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
> 
> Hi, Richi.
> 
> Sorry for the mistake I made in LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE
> with SELECT_VL loop control.

OK.

  
Li, Pan2 via Gcc-patches July 6, 2023, 7:12 a.m. UTC | #2
Committed, thanks Richard.

Pan

  

Patch

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index c10a4be60eb..10e71178ce7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -3176,10 +3176,8 @@  vect_get_strided_load_store_ops (stmt_vec_info stmt_info,
 	= fold_build2 (MULT_EXPR, sizetype,
 		       fold_convert (sizetype, unshare_expr (DR_STEP (dr))),
 		       loop_len);
-      tree bump = make_temp_ssa_name (sizetype, NULL, "ivtmp");
-      gassign *assign = gimple_build_assign (bump, tmp);
-      gsi_insert_before (gsi, assign, GSI_SAME_STMT);
-      *dataref_bump = bump;
+      *dataref_bump = force_gimple_operand_gsi (gsi, tmp, true, NULL_TREE, true,
+						GSI_SAME_STMT);
     }
   else
     {