match: Delay folding of 1/x into `(x+1u)<2u?x:0` until late [PR113301]
Checks
Commit Message
Since currently ranger does not work with the complexity of COND_EXPR in
some cases so delaying the simplification of `1/x` for signed types
help code generation.
tree-ssa/divide-8.c is a new testcase where this can help.
Bootstrapped and tested on x86_64-linux-gnu with no regressions.
PR tree-optimization/113301
gcc/ChangeLog:
* match.pd (`1/x`): Delay signed case until late.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/divide-8.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
---
gcc/match.pd | 12 +++++++-----
gcc/testsuite/gcc.dg/tree-ssa/divide-8.c | 13 +++++++++++++
2 files changed, 20 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
Comments
On Thu, Jan 11, 2024 at 5:24 AM Andrew Pinski <quic_apinski@quicinc.com> wrote:
>
> Since currently ranger does not work with the complexity of COND_EXPR in
> some cases so delaying the simplification of `1/x` for signed types
> help code generation.
> tree-ssa/divide-8.c is a new testcase where this can help.
Huh. It looks like this pattern was added for PR95424 (and not say,
for vectorization). IMO the COND_EXPR isn't a good canonical
representation and the optimization should have been done
during either instruction selection or RTL expansion.
Of course ranger should be better in handling this as well and
we should be able to canonicalize (unsigned)(x+1)<=2 ? x : 0
to 1/x.
The patch works towards this, keeping it as possible vectorization
enabler.
Thus OK.
Thanks,
Richard.
> Bootstrapped and tested on x86_64-linux-gnu with no regressions.
>
> PR tree-optimization/113301
>
> gcc/ChangeLog:
>
> * match.pd (`1/x`): Delay signed case until late.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/divide-8.c: New test.
>
> Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
> ---
> gcc/match.pd | 12 +++++++-----
> gcc/testsuite/gcc.dg/tree-ssa/divide-8.c | 13 +++++++++++++
> 2 files changed, 20 insertions(+), 5 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index d75babd86c2..81a389057cf 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -560,7 +560,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> 1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
> But not for 1 / 0 so that we can get proper warnings and errors,
> and not for 1-bit integers as they are edge cases better handled
> - elsewhere. */
> + elsewhere. Delay the conversion of the signed division until late
> + because `1 / X` is simplier to handle than the resulting COND_EXPR. */
> (simplify
> (trunc_div integer_onep@0 @1)
> (if (INTEGRAL_TYPE_P (type)
> @@ -569,10 +570,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> && (!flag_non_call_exceptions || tree_expr_nonzero_p (@1)))
> (if (TYPE_UNSIGNED (type))
> (convert (eq:boolean_type_node @1 { build_one_cst (type); }))
> - (with { tree utype = unsigned_type_for (type); }
> - (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
> - { build_int_cst (utype, 2); })
> - @1 { build_zero_cst (type); })))))
> + (if (!canonicalize_math_p ())
> + (with { tree utype = unsigned_type_for (type); }
> + (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
> + { build_int_cst (utype, 2); })
> + @1 { build_zero_cst (type); }))))))
>
> /* Combine two successive divisions. Note that combining ceil_div
> and floor_div is trickier and combining round_div even more so. */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> new file mode 100644
> index 00000000000..b8149088177
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/divide-8.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-tree-optimized" } */
> +/* PR tree-optimization/113301 */
> +/* We should figure out that 1/(x+1) range is [-1,1]
> + and then /2 is always 0. */
> +
> +void link_error(void);
> +void func(int x){
> + int c=(1/(x+1))/2;
> + if (c != 0)
> + link_error();
> +}
> +/* { dg-final { scan-tree-dump-not "link_error " "optimized" } } */
> --
> 2.39.3
>
@@ -560,7 +560,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
1 / X -> X >= -1 && X <= 1 ? X : 0 for signed integer X.
But not for 1 / 0 so that we can get proper warnings and errors,
and not for 1-bit integers as they are edge cases better handled
- elsewhere. */
+ elsewhere. Delay the conversion of the signed division until late
+ because `1 / X` is simplier to handle than the resulting COND_EXPR. */
(simplify
(trunc_div integer_onep@0 @1)
(if (INTEGRAL_TYPE_P (type)
@@ -569,10 +570,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& (!flag_non_call_exceptions || tree_expr_nonzero_p (@1)))
(if (TYPE_UNSIGNED (type))
(convert (eq:boolean_type_node @1 { build_one_cst (type); }))
- (with { tree utype = unsigned_type_for (type); }
- (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
- { build_int_cst (utype, 2); })
- @1 { build_zero_cst (type); })))))
+ (if (!canonicalize_math_p ())
+ (with { tree utype = unsigned_type_for (type); }
+ (cond (le (plus (convert:utype @1) { build_one_cst (utype); })
+ { build_int_cst (utype, 2); })
+ @1 { build_zero_cst (type); }))))))
/* Combine two successive divisions. Note that combining ceil_div
and floor_div is trickier and combining round_div even more so. */
new file mode 100644
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized" } */
+/* PR tree-optimization/113301 */
+/* We should figure out that 1/(x+1) range is [-1,1]
+ and then /2 is always 0. */
+
+void link_error(void);
+void func(int x){
+ int c=(1/(x+1))/2;
+ if (c != 0)
+ link_error();
+}
+/* { dg-final { scan-tree-dump-not "link_error " "optimized" } } */