i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]
Checks
Commit Message
Hi!
The Intel extended format has the various weird number categories,
pseudo denormals, pseudo infinities, pseudo NaNs and unnormals.
Those are not representable in the GCC real_value and so neither
GIMPLE nor RTX VIEW_CONVERT_EXPR/SUBREG folding folds those into
constants.
As can be seen on the following testcase, because it isn't folded
(since GCC 12, before that we were folding it) we can end up with
a SUBREG of a CONST_VECTOR or similar constant, which isn't valid
general_operand, so we ICE during vregs pass trying to recognize
the move instruction.
Initially I thought it is a middle-end bug, the movxf instruction
has general_operand predicate, but the middle-end certainly never
tests that predicate, seems moves are special optabs.
And looking at other mov optabs, e.g. for vector modes the i386
patterns use nonimmediate_operand predicate on the input, yet
ix86_expand_vector_move deals with CONSTANT_P and SUBREG of CONSTANT_P
arguments which if the predicate was checked couldn't ever make it through.
The following patch handles this case similarly to the
ix86_expand_vector_move's SUBREG of CONSTANT_P case, does it just for XFmode
because I believe that is the only mode that needs it from the scalar ones,
others should just be folded.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2024-03-04 Jakub Jelinek <jakub@redhat.com>
PR target/114184
* config/i386/i386-expand.cc (ix86_expand_move): If XFmode op1
is SUBREG of CONSTANT_P, force the SUBREG_REG into memory or
register.
* gcc.target/i386/pr114184.c: New test.
Jakub
Comments
On Mon, Mar 4, 2024 at 9:25 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The Intel extended format has the various weird number categories,
> pseudo denormals, pseudo infinities, pseudo NaNs and unnormals.
> Those are not representable in the GCC real_value and so neither
> GIMPLE nor RTX VIEW_CONVERT_EXPR/SUBREG folding folds those into
> constants.
>
> As can be seen on the following testcase, because it isn't folded
> (since GCC 12, before that we were folding it) we can end up with
> a SUBREG of a CONST_VECTOR or similar constant, which isn't valid
> general_operand, so we ICE during vregs pass trying to recognize
> the move instruction.
> Initially I thought it is a middle-end bug, the movxf instruction
> has general_operand predicate, but the middle-end certainly never
> tests that predicate, seems moves are special optabs.
> And looking at other mov optabs, e.g. for vector modes the i386
> patterns use nonimmediate_operand predicate on the input, yet
> ix86_expand_vector_move deals with CONSTANT_P and SUBREG of CONSTANT_P
> arguments which if the predicate was checked couldn't ever make it through.
>
> The following patch handles this case similarly to the
> ix86_expand_vector_move's SUBREG of CONSTANT_P case, does it just for XFmode
> because I believe that is the only mode that needs it from the scalar ones,
> others should just be folded.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-03-04 Jakub Jelinek <jakub@redhat.com>
>
> PR target/114184
> * config/i386/i386-expand.cc (ix86_expand_move): If XFmode op1
> is SUBREG of CONSTANT_P, force the SUBREG_REG into memory or
> register.
>
> * gcc.target/i386/pr114184.c: New test.
OK, with a question inline.
Thanks,
Uros.
>
> --- gcc/config/i386/i386-expand.cc.jj 2024-03-01 14:56:34.120925989 +0100
> +++ gcc/config/i386/i386-expand.cc 2024-03-03 18:41:08.278793046 +0100
> @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
> && GET_MODE (SUBREG_REG (op1)) == DImode
> && SUBREG_BYTE (op1) == 0)
> op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> + /* As not all values in XFmode are representable in real_value,
> + we might be called with unfoldable SUBREGs of constants. */
> + if (mode == XFmode
> + && CONSTANT_P (SUBREG_REG (op1))
> + && can_create_pseudo_p ())
We have quite some unguarded force_regs in ix86_expand_move. While it
doesn't hurt to have an extra safety net, is there a particular reason
for can_create_pseudo_p check in the added code?
> + {
> + machine_mode imode = GET_MODE (SUBREG_REG (op1));
> + rtx r = force_const_mem (imode, SUBREG_REG (op1));
> + if (r)
> + r = validize_mem (r);
> + else
> + r = force_reg (imode, SUBREG_REG (op1));
> + op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
> + }
> break;
> }
>
> --- gcc/testsuite/gcc.target/i386/pr114184.c.jj 2024-03-03 18:45:45.912964030 +0100
> +++ gcc/testsuite/gcc.target/i386/pr114184.c 2024-03-03 18:45:37.639078138 +0100
> @@ -0,0 +1,22 @@
> +/* PR target/114184 */
> +/* { dg-do compile } */
> +/* { dg-options "-Og -mavx2" } */
> +
> +typedef unsigned char V __attribute__((vector_size (32)));
> +typedef unsigned char W __attribute__((vector_size (16)));
> +
> +_Complex long double
> +foo (void)
> +{
> + _Complex long double d;
> + *(V *)&d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
> + return d;
> +}
> +
> +long double
> +bar (void)
> +{
> + long double d;
> + *(W *)&d = (W) { 149, 136, 89, 42, 38, 240, 196, 194 };
> + return d;
> +}
>
> Jakub
>
On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote:
> > --- gcc/config/i386/i386-expand.cc.jj 2024-03-01 14:56:34.120925989 +0100
> > +++ gcc/config/i386/i386-expand.cc 2024-03-03 18:41:08.278793046 +0100
> > @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
> > && GET_MODE (SUBREG_REG (op1)) == DImode
> > && SUBREG_BYTE (op1) == 0)
> > op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> > + /* As not all values in XFmode are representable in real_value,
> > + we might be called with unfoldable SUBREGs of constants. */
> > + if (mode == XFmode
> > + && CONSTANT_P (SUBREG_REG (op1))
> > + && can_create_pseudo_p ())
>
> We have quite some unguarded force_regs in ix86_expand_move. While it
> doesn't hurt to have an extra safety net, is there a particular reason
> for can_create_pseudo_p check in the added code?
Various other places in ix86_expand_move do check can_create_pseudo_p, the
case I've mostly copied this from in ix86_expand_vector_move also does that,
and then there is the
Therefore, when given such a pair of operands, the pattern must
generate RTL which needs no reloading and needs no temporary
registers--no registers other than the operands. For example, if
you support the pattern with a 'define_expand', then in such a case
the 'define_expand' mustn't call 'force_reg' or any other such
function which might generate new pseudo registers.
in mov<M> description, which initially scared me off from using it at all.
Guess we'll ICE either way if something like that appears during RA.
Jakub
On Mon, Mar 4, 2024 at 9:41 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote:
> > > --- gcc/config/i386/i386-expand.cc.jj 2024-03-01 14:56:34.120925989 +0100
> > > +++ gcc/config/i386/i386-expand.cc 2024-03-03 18:41:08.278793046 +0100
> > > @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
> > > && GET_MODE (SUBREG_REG (op1)) == DImode
> > > && SUBREG_BYTE (op1) == 0)
> > > op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> > > + /* As not all values in XFmode are representable in real_value,
> > > + we might be called with unfoldable SUBREGs of constants. */
> > > + if (mode == XFmode
> > > + && CONSTANT_P (SUBREG_REG (op1))
> > > + && can_create_pseudo_p ())
> >
> > We have quite some unguarded force_regs in ix86_expand_move. While it
> > doesn't hurt to have an extra safety net, is there a particular reason
> > for can_create_pseudo_p check in the added code?
>
> Various other places in ix86_expand_move do check can_create_pseudo_p, the
> case I've mostly copied this from in ix86_expand_vector_move also does that,
> and then there is the
> Therefore, when given such a pair of operands, the pattern must
> generate RTL which needs no reloading and needs no temporary
> registers--no registers other than the operands. For example, if
> you support the pattern with a 'define_expand', then in such a case
> the 'define_expand' mustn't call 'force_reg' or any other such
> function which might generate new pseudo registers.
> in mov<M> description, which initially scared me off from using it at all.
> Guess we'll ICE either way if something like that appears during RA.
Thanks for the insight - it was PIC handling in ix86_expand_move that
catched my eye, especially the TARGET_MACHO part that looks like it
was somehow left behind. OTOH, the whole ix86_expand_move would need
some TLC anyway.
FAOD - the patch is OK as is.
Thanks,
Uros.
@@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
&& GET_MODE (SUBREG_REG (op1)) == DImode
&& SUBREG_BYTE (op1) == 0)
op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
+ /* As not all values in XFmode are representable in real_value,
+ we might be called with unfoldable SUBREGs of constants. */
+ if (mode == XFmode
+ && CONSTANT_P (SUBREG_REG (op1))
+ && can_create_pseudo_p ())
+ {
+ machine_mode imode = GET_MODE (SUBREG_REG (op1));
+ rtx r = force_const_mem (imode, SUBREG_REG (op1));
+ if (r)
+ r = validize_mem (r);
+ else
+ r = force_reg (imode, SUBREG_REG (op1));
+ op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
+ }
break;
}
@@ -0,0 +1,22 @@
+/* PR target/114184 */
+/* { dg-do compile } */
+/* { dg-options "-Og -mavx2" } */
+
+typedef unsigned char V __attribute__((vector_size (32)));
+typedef unsigned char W __attribute__((vector_size (16)));
+
+_Complex long double
+foo (void)
+{
+ _Complex long double d;
+ *(V *)&d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
+ return d;
+}
+
+long double
+bar (void)
+{
+ long double d;
+ *(W *)&d = (W) { 149, 136, 89, 42, 38, 240, 196, 194 };
+ return d;
+}