match.pd: Canonicalize (signed x << c) >> c [PR101955]
Commit Message
Canonicalize (signed x << c) >> c into a sign extension of the lowest
precision(type) - c bits of x, if those bits have a mode precision or a
precision of 1. Also combine this rule with (unsigned x << c) >> c -> x &
((unsigned)-1 >> c) to avoid a duplicate pattern. Tested successfully on
x86_64 and x86 targets.
PR middle-end/101955
gcc/ChangeLog:
* match.pd ((signed x << c) >> c): New canonicalization.
gcc/testsuite/ChangeLog:
* gcc.dg/pr101955.c: New test.
---
gcc/match.pd | 20 +++++++----
gcc/testsuite/gcc.dg/pr101955.c | 63 +++++++++++++++++++++++++++++++++
2 files changed, 77 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr101955.c
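The canonicalization described in the commit message can be illustrated in plain C. This is only a sketch of the value-level equivalence, not the match.pd rule itself. It assumes 32-bit int, 8-bit char, and GCC's documented implementation-defined behavior (arithmetic right shift of negative values, modular conversion to signed types). The function names are made up for illustration, and the left shift is done on unsigned so the sketch does not rely on signed overflow.

```c
#include <assert.h>
#include <limits.h>

/* Assumption for this sketch: int is exactly 32 bits wide.  */
_Static_assert (sizeof (int) * CHAR_BIT == 32, "sketch assumes 32-bit int");

/* The shift pair with c = 24.  The left shift is performed on unsigned
   to avoid signed-overflow issues; the right shift is arithmetic.  */
int
shift_pair_24 (int x)
{
  return (int) ((unsigned) x << 24) >> 24;
}

/* The canonical form: truncate to the low 32 - 24 = 8 bits as a signed
   type, then widen back (sign extension).  */
int
narrow_then_widen_24 (int x)
{
  return (int) (signed char) x;
}

/* The width == 1 case (c = 31): the result is 0 or -1 depending on the
   lowest bit of x.  */
int
shift_pair_31 (int x)
{
  return (int) ((unsigned) x << 31) >> 31;
}

/* Compare both forms on a few sample values.  */
void
check_shift_pair_equivalence (void)
{
  static const int samples[] = { 0, 1, 2, 127, 128, 255, 256, 0x1FF, -1, -129 };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      assert (shift_pair_24 (samples[i]) == narrow_then_widen_24 (samples[i]));
      assert (shift_pair_31 (samples[i]) == -(samples[i] & 1));
    }
}
```

Calling check_shift_pair_equivalence () from a small main should pass silently under the stated assumptions.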
Comments
On Tue, Aug 01, 2023 at 03:20:33PM -0400, Drew Ross via Gcc-patches wrote:
> Canonicalizes (signed x << c) >> c into the lowest
> precision(type) - c bits of x IF those bits have a mode precision or a
> precision of 1. Also combines this rule with (unsigned x << c) >> c -> x &
> ((unsigned)-1 >> c) to prevent duplicate pattern. Tested successfully on
> x86_64 and x86 targets.
>
> PR middle-end/101955
>
> gcc/ChangeLog:
>
> * match.pd ((signed x << c) >> c): New canonicalization.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr101955.c: New test.
> ---
> gcc/match.pd | 20 +++++++----
> gcc/testsuite/gcc.dg/pr101955.c | 63 +++++++++++++++++++++++++++++++++
> 2 files changed, 77 insertions(+), 6 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/pr101955.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 8543f777a28..62f7c84f565 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3758,13 +3758,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> - TYPE_PRECISION (TREE_TYPE (@2)))))
> (bit_and (convert @0) (lshift { build_minus_one_cst (type); } @1))))
>
> -/* Optimize (x << c) >> c into x & ((unsigned)-1 >> c) for unsigned
> - types. */
> +/* For (x << c) >> c, optimize into x & ((unsigned)-1 >> c) for
> + unsigned x OR truncate into the precision(type) - c lowest bits
> + of signed x (if they have mode precision or a precision of 1) */
There should be a "." between the ")" and the " */" above.
> (simplify
> - (rshift (lshift @0 INTEGER_CST@1) @1)
> - (if (TYPE_UNSIGNED (type)
> - && (wi::ltu_p (wi::to_wide (@1), element_precision (type))))
> - (bit_and @0 (rshift { build_minus_one_cst (type); } @1))))
> + (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
> + (if (wi::ltu_p (wi::to_wide (@1), element_precision (type)))
> + (if (TYPE_UNSIGNED (type))
> + (bit_and @0 (rshift { build_minus_one_cst (type); } @1))
This needs to be (convert @0) instead of @0, because now that there is
a nop_convert? in between, @0 could have a different type than type.
I certainly see regressions on
gcc.c-torture/compile/950612-1.c
on i686-linux because of this:
/home/jakub/src/gcc/gcc/testsuite/gcc.c-torture/compile/950612-1.c:17:1: error: type mismatch in binary expression
long long unsigned int
long long int
long long unsigned int
_346 = _3 & 4294967295;
during GIMPLE pass: forwprop
/home/jakub/src/gcc/gcc/testsuite/gcc.c-torture/compile/950612-1.c:17:1: internal compiler error: verify_gimple failed
0x9018a4e verify_gimple_in_cfg(function*, bool, bool)
../../gcc/tree-cfg.cc:5646
0x8e81eb5 execute_function_todo
../../gcc/passes.cc:2088
0x8e8234c do_per_function
../../gcc/passes.cc:1687
0x8e82431 execute_todo
../../gcc/passes.cc:2142
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
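As a rough C-level analogue of the point about (convert @0): when the shifted operand is signed but the result type is unsigned, the operand has to be converted to the result type before the mask is applied. The sketch below assumes 64-bit long long and uses c = 32, which matches the 4294967295 mask in the dump above. The function names are hypothetical, and the left shift is done on the unsigned type so the sketch can be run without relying on signed overflow.

```c
#include <assert.h>

/* Shift pair whose result type is unsigned long long while the operand
   x is signed long long.  */
unsigned long long
shift_pair_u32 (long long x)
{
  return ((unsigned long long) x << 32) >> 32;
}

/* Masked form: convert x to the result type first, then AND with
   (unsigned)-1 >> c.  Both bit_and operands now have the same type.  */
unsigned long long
mask_after_convert (long long x)
{
  return (unsigned long long) x & (~0ULL >> 32);
}

/* Compare both forms on a few sample values.  */
void
check_mask_equivalence (void)
{
  static const long long samples[] = { 0, 1, -1, 0x123456789LL, -0x123456789LL };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (shift_pair_u32 (samples[i]) == mask_after_convert (samples[i]));
}
```

In C the conversion is implicit in many contexts, whereas GIMPLE requires both operands of the AND to have the same type, which is what the verifier complains about.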
> + (if (INTEGRAL_TYPE_P (type))
> + (with {
> + int width = element_precision (type) - tree_to_uhwi (@1);
> + tree stype = build_nonstandard_integer_type (width, 0);
> + }
> + (if (width == 1 || type_has_mode_precision_p (stype))
> + (convert (convert:stype @0))))))))
Use just one space before == instead of two.
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr101955.c
> @@ -0,0 +1,63 @@
> +/* { dg-do compile } */
The above line should be
/* { dg-do compile { target int32 } } */
because the test relies on 32-bit int, and some targets have just
16-bit int.
Alternatively, you could make the testcase more portable by
using, say,
#define CHAR_BITS __CHAR_BIT__
#define INT_BITS (__SIZEOF_INT__ * __CHAR_BIT__)
#define LLONG_BITS (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)
and replacing all the 31, 24, 56 etc. constants with (INT_BITS - 1),
(INT_BITS - CHAR_BITS), (LLONG_BITS - CHAR_BITS) etc.
It would still fail on some AVR configurations, though, which have
(invalid for C) just 8-bit int, and the question is what to do with
the 16, because (INT_BITS - 2 * CHAR_BITS) is 0 on 16-bit ints, so
it would need to be (INT_BITS / 2) instead. C requires that
long long is at least 64-bit, so that is less problematic (no known
target has > 64-bit long long, though it is theoretically possible).
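The portable variant suggested above might look roughly like the following. This is a sketch, not the committed testcase: the function names p1, p4, p5 and p7 are made up, the noipa attributes and dg- directives are omitted, and the left shifts are done on unsigned types so the functions can also be executed without relying on signed overflow (the original test shifts signed values and is compile-only). It assumes GCC's predefined __CHAR_BIT__, __SIZEOF_INT__ and __SIZEOF_LONG_LONG__ macros and an arithmetic right shift.

```c
#include <assert.h>

#define CHAR_BITS __CHAR_BIT__
#define INT_BITS (__SIZEOF_INT__ * __CHAR_BIT__)
#define LLONG_BITS (__SIZEOF_LONG_LONG__ * __CHAR_BIT__)

/* Replaces the constant 31: keeps only the lowest bit of x.  */
int
p1 (int x)
{
  return (int) ((unsigned) x << (INT_BITS - 1)) >> (INT_BITS - 1);
}

/* Replaces the constant 24: keeps the lowest CHAR_BITS bits of x.  */
int
p4 (int x)
{
  return (int) ((unsigned) x << (INT_BITS - CHAR_BITS)) >> (INT_BITS - CHAR_BITS);
}

/* Replaces the constant 16: INT_BITS / 2 stays nonzero on 16-bit int,
   unlike INT_BITS - 2 * CHAR_BITS.  */
int
p5 (int x)
{
  return (int) ((unsigned) x << (INT_BITS / 2)) >> (INT_BITS / 2);
}

/* Replaces the constant 56: keeps the lowest CHAR_BITS bits of x.  */
long long
p7 (long long x)
{
  return (long long) ((unsigned long long) x << (LLONG_BITS - CHAR_BITS))
	 >> (LLONG_BITS - CHAR_BITS);
}
```

With these macros the shift counts follow the target's type widths, so no constant in the test depends on int being 32 bits wide.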
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
Jakub
@@ -3758,13 +3758,21 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
- TYPE_PRECISION (TREE_TYPE (@2)))))
(bit_and (convert @0) (lshift { build_minus_one_cst (type); } @1))))
-/* Optimize (x << c) >> c into x & ((unsigned)-1 >> c) for unsigned
- types. */
+/* For (x << c) >> c, optimize into x & ((unsigned)-1 >> c) for
+ unsigned x OR truncate into the precision(type) - c lowest bits
+ of signed x (if they have mode precision or a precision of 1) */
(simplify
- (rshift (lshift @0 INTEGER_CST@1) @1)
- (if (TYPE_UNSIGNED (type)
- && (wi::ltu_p (wi::to_wide (@1), element_precision (type))))
- (bit_and @0 (rshift { build_minus_one_cst (type); } @1))))
+ (rshift (nop_convert? (lshift @0 INTEGER_CST@1)) @@1)
+ (if (wi::ltu_p (wi::to_wide (@1), element_precision (type)))
+ (if (TYPE_UNSIGNED (type))
+ (bit_and @0 (rshift { build_minus_one_cst (type); } @1))
+ (if (INTEGRAL_TYPE_P (type))
+ (with {
+ int width = element_precision (type) - tree_to_uhwi (@1);
+ tree stype = build_nonstandard_integer_type (width, 0);
+ }
+ (if (width == 1 || type_has_mode_precision_p (stype))
+ (convert (convert:stype @0))))))))
/* Optimize x >> x into 0 */
(simplify
new file mode 100644
@@ -0,0 +1,63 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+__attribute__((noipa)) int
+t1 (int x)
+{
+ int y = x << 31;
+ int z = y >> 31;
+ return z;
+}
+
+__attribute__((noipa)) int
+t2 (unsigned int x)
+{
+ int y = x << 31;
+ int z = y >> 31;
+ return z;
+}
+
+__attribute__((noipa)) int
+t3 (int x)
+{
+ return (x << 31) >> 31;
+}
+
+__attribute__((noipa)) int
+t4 (int x)
+{
+ return (x << 24) >> 24;
+}
+
+__attribute__((noipa)) int
+t5 (int x)
+{
+ return (x << 16) >> 16;
+}
+
+__attribute__((noipa)) long long
+t6 (long long x)
+{
+ return (x << 63) >> 63;
+}
+
+__attribute__((noipa)) long long
+t7 (long long x)
+{
+ return (x << 56) >> 56;
+}
+
+__attribute__((noipa)) long long
+t8 (long long x)
+{
+ return (x << 48) >> 48;
+}
+
+__attribute__((noipa)) long long
+t9 (long long x)
+{
+ return (x << 32) >> 32;
+}
+
+/* { dg-final { scan-tree-dump-not " >> " "optimized" } } */
+/* { dg-final { scan-tree-dump-not " << " "optimized" } } */