lower-bitint: Punt .*_OVERFLOW optimization if cast from IMAGPART_EXPR appears before REALPART_EXPR [PR113119]
Commit Message
Hi!
_BitInt lowering for .{ADD,SUB,MUL}_OVERFLOW calls which have both
REALPART_EXPR and IMAGPART_EXPR used and have a cast from the IMAGPART_EXPR
to a boolean or normal integral type lowers them at the point of
the REALPART_EXPR statement (which is especially needed if the lhs of
the call is complex with large/huge _BitInt element type); we emit the
stmt to set the lhs of the cast at the same spot as well.
Normally, the lowering of __builtin_{add,sub,mul}_overflow arranges
the REALPART_EXPR to come before the IMAGPART_EXPR, followed by a cast
from the latter, but as the testcase shows, a redundant
__builtin_*_overflow call and VN can reorder those, and we then ICE
because the def-stmt of the cast from the IMAGPART_EXPR may appear
after its uses.
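Schematically (with hypothetical SSA names, not taken from the PR), the problematic GIMPLE after VN has the cast from the IMAGPART_EXPR before the REALPART_EXPR:

```
  _1 = .ADD_OVERFLOW (a_2(D), 0);
  _3 = IMAGPART_EXPR <_1>;
  _4 = (_Bool) _3;           /* cast from IMAGPART_EXPR */
  _5 = REALPART_EXPR <_1>;   /* REALPART_EXPR only afterwards */
```

Since the lowering emits all the replacement code at the REALPART_EXPR statement, the new definition of _4 would then come after uses of _4.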
We already check that all of REALPART_EXPR, IMAGPART_EXPR and the cast
from the latter appear in the same bb as the .{ADD,SUB,MUL}_OVERFLOW call
in the optimization; the following patch just extends that to make sure
the cast appears after the REALPART_EXPR. If not, we punt on the
optimization and expand the call as a store of a complex _BitInt at the
location of the ifn call.
Only the testcase in the testsuite is changed by the patch, all other
__builtin_*_overflow* calls in the bitint* tests (and there are quite a few)
have REALPART_EXPR first.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2024-01-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/113119
* gimple-lower-bitint.cc (optimizable_arith_overflow): Punt if
both REALPART_EXPR and cast from IMAGPART_EXPR appear, but cast
is before REALPART_EXPR.
* gcc.dg/bitint-61.c: New test.
Jakub
Comments
On Thu, 4 Jan 2024, Jakub Jelinek wrote:
> Hi!
>
> _BitInt lowering for .{ADD,SUB,MUL}_OVERFLOW calls which have both
> REALPART_EXPR and IMAGPART_EXPR used and have a cast from the IMAGPART_EXPR
> to a boolean or normal integral type lowers them at the point of
> the REALPART_EXPR statement (which is especially needed if the lhs of
> the call is complex with large/huge _BitInt element type); we emit the
> stmt to set the lhs of the cast at the same spot as well.
> Normally, the lowering of __builtin_{add,sub,mul}_overflow arranges
> the REALPART_EXPR to come before the IMAGPART_EXPR, followed by a cast
> from the latter, but as the testcase shows, a redundant
> __builtin_*_overflow call and VN can reorder those, and we then ICE
> because the def-stmt of the cast from the IMAGPART_EXPR may appear
> after its uses.
> We already check that all of REALPART_EXPR, IMAGPART_EXPR and the cast
> from the latter appear in the same bb as the .{ADD,SUB,MUL}_OVERFLOW call
> in the optimization; the following patch just extends that to make sure
> the cast appears after the REALPART_EXPR. If not, we punt on the
> optimization and expand the call as a store of a complex _BitInt at the
> location of the ifn call.
> Only the testcase in the testsuite is changed by the patch, all other
> __builtin_*_overflow* calls in the bitint* tests (and there are quite a few)
> have REALPART_EXPR first.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
OK.
Richard.
> 2024-01-04 Jakub Jelinek <jakub@redhat.com>
>
> PR tree-optimization/113119
> * gimple-lower-bitint.cc (optimizable_arith_overflow): Punt if
> both REALPART_EXPR and cast from IMAGPART_EXPR appear, but cast
> is before REALPART_EXPR.
>
> * gcc.dg/bitint-61.c: New test.
>
> --- gcc/gimple-lower-bitint.cc.jj 2023-12-22 12:27:58.497437164 +0100
> +++ gcc/gimple-lower-bitint.cc 2023-12-23 10:44:05.586522553 +0100
> @@ -305,6 +305,7 @@ optimizable_arith_overflow (gimple *stmt
> imm_use_iterator ui;
> use_operand_p use_p;
> int seen = 0;
> + gimple *realpart = NULL, *cast = NULL;
> FOR_EACH_IMM_USE_FAST (use_p, ui, lhs)
> {
> gimple *g = USE_STMT (use_p);
> @@ -317,6 +318,7 @@ optimizable_arith_overflow (gimple *stmt
> if ((seen & 1) != 0)
> return 0;
> seen |= 1;
> + realpart = g;
> }
> else if (gimple_assign_rhs_code (g) == IMAGPART_EXPR)
> {
> @@ -338,13 +340,35 @@ optimizable_arith_overflow (gimple *stmt
> if (!INTEGRAL_TYPE_P (TREE_TYPE (lhs2))
> || TREE_CODE (TREE_TYPE (lhs2)) == BITINT_TYPE)
> return 0;
> + cast = use_stmt;
> }
> else
> return 0;
> }
> if ((seen & 2) == 0)
> return 0;
> - return seen == 3 ? 2 : 1;
> + if (seen == 3)
> + {
> + /* Punt if the cast stmt appears before realpart stmt, because
> + if both appear, the lowering wants to emit all the code
> + at the location of realpart stmt. */
> + gimple_stmt_iterator gsi = gsi_for_stmt (realpart);
> + unsigned int cnt = 0;
> + do
> + {
> + gsi_prev_nondebug (&gsi);
> + if (gsi_end_p (gsi) || gsi_stmt (gsi) == cast)
> + return 0;
> + if (gsi_stmt (gsi) == stmt)
> + return 2;
> + /* If realpart is too far from stmt, punt as well.
> + Usually it will appear right after it. */
> + if (++cnt == 32)
> + return 0;
> + }
> + while (1);
> + }
> + return 1;
> }
>
> /* If STMT is some kind of comparison (GIMPLE_COND, comparison assignment)
> --- gcc/testsuite/gcc.dg/bitint-61.c.jj 2023-12-23 10:46:17.808658852 +0100
> +++ gcc/testsuite/gcc.dg/bitint-61.c 2023-12-23 10:46:02.482874865 +0100
> @@ -0,0 +1,17 @@
> +/* PR tree-optimization/113119 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +_BitInt(8) b;
> +_Bool c;
> +#if __BITINT_MAXWIDTH__ >= 8445
> +_BitInt(8445) a;
> +
> +void
> +foo (_BitInt(4058) d)
> +{
> + c = __builtin_add_overflow (a, 0ULL, &d);
> + __builtin_add_overflow (a, 0ULL, &d);
> + b = d;
> +}
> +#endif
>
> Jakub
>
>