Optimize identical permutation in my last r13-3212-gb88adba751da63
Commit Message
Add an extra index check when merging VEC_CST. This handles the case where exactly op1 needs to be returned.
This fixes:
FAIL: gcc.dg/tree-ssa/forwprop-19.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR"
gcc/ChangeLog:
PR target/107220
* match.pd: Check the index of VEC_CST and return the op1 if needed.
---
gcc/match.pd | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
Comments
On Thu, Oct 13, 2022 at 5:15 AM Liwei Xu <liwei.xu@intel.com> wrote:
>
> Add an extra index check when merging VEC_CST. This handles the case where exactly op1 needs to be returned.
>
> This fixes:
> FAIL: gcc.dg/tree-ssa/forwprop-19.c scan-tree-dump-not forwprop1 "VEC_PERM_EXPR"
>
> gcc/ChangeLog:
>
> PR target/107220
> * match.pd: Check the index of VEC_CST and return the op1 if needed.
> ---
> gcc/match.pd | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 3550c16aaa6..1efdc3abb5d 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8106,6 +8106,7 @@ and,
> vec_perm_builder builder0;
> vec_perm_builder builder1;
> vec_perm_builder builder2 (nelts, nelts, 1);
> + bool ident_to_1 = true;
>
> if (!tree_to_vec_perm_builder (&builder0, @3)
> || !tree_to_vec_perm_builder (&builder1, @4))
> @@ -8115,7 +8116,15 @@ and,
> vec_perm_indices sel1 (builder1, 1, nelts);
>
> for (int i = 0; i < nelts; i++)
> - builder2.quick_push (sel0[sel1[i].to_constant ()]);
> + {
> + int tmp_index = sel0[sel1[i].to_constant ()].to_constant ();
> + builder2.quick_push (sel0[sel1[i].to_constant ()]);
> + if ( i != tmp_index)
> + ident_to_1 = false;
> + }
> +
> + if (ident_to_1)
> + return @1;
You can't "return" in match.pd code. I think the code was fine and the testcase
can be adjusted to scan the subsequent DSE or DCE pass instead.
The "correct" match.pd approach would be to do the if (ident_to_1) check here:
@@ -8124,7 +8124,9 @@ and,
op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
}
- (vec_perm @1 @2 { op0; })))
+ (if (ident_to_1)
+ @1
+ (vec_perm @1 @2 { op0; }))))
I'll look into rejecting 'return' in c-exprs ;)
Richard.
>
> vec_perm_indices sel2 (builder2, 2, nelts);
>
> --
> 2.18.2
>
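For readers following the review: the loop in the patch composes the two constant selectors (element i of the merged selector is sel0[sel1[i]]) and records whether the result maps every lane i to index i, in which case the merged permute is just the first operand. A minimal standalone sketch of that check follows. It assumes plain integer selectors in place of GCC's vec_perm_indices and in-range indices; the names compose_selectors and selects_first_input_unchanged are hypothetical and do not exist in GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compose two selectors as the patch's loop does: out[i] = sel0[sel1[i]].
// Assumption: sel0 indexes the concatenation of two inputs, sel1 indexes
// the result of the first permute, and every sel1 entry is in range.
std::vector<int> compose_selectors(const std::vector<int>& sel0,
                                   const std::vector<int>& sel1)
{
    std::vector<int> out;
    out.reserve(sel1.size());
    for (int idx : sel1) {
        assert(idx >= 0 && static_cast<std::size_t>(idx) < sel0.size());
        out.push_back(sel0[static_cast<std::size_t>(idx)]);
    }
    return out;
}

// True when lane i selects index i for every i, i.e. the permute would
// reproduce the first input unchanged (the ident_to_1 condition).
bool selects_first_input_unchanged(const std::vector<int>& sel)
{
    for (std::size_t i = 0; i < sel.size(); ++i)
        if (sel[i] != static_cast<int>(i))
            return false;
    return true;
}
```

With sel0 = {3, 0, 1, 2} and sel1 = {1, 2, 3, 0}, the composed selector is {0, 1, 2, 3}, so the two permutes cancel and the result is the first operand (@1 in the pattern). With sel0 = {4, 1, 2, 3} and sel1 = {0, 1, 2, 3}, lane 0 comes from the second input and the check fails.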