[v2] Do not emulate vectors containing floats.
Checks
Commit Message
The emulation via word mode tries to perform integer arithmetic on floating
point values instead of floating point arithmetic. This leads to
mis-compilations.
Failure occured on s390x on these existing test cases:
gcc.dg/vect/tsvc/vect-tsvc-s112.c
gcc.dg/vect/tsvc/vect-tsvc-s113.c
gcc.dg/vect/tsvc/vect-tsvc-s119.c
gcc.dg/vect/tsvc/vect-tsvc-s121.c
gcc.dg/vect/tsvc/vect-tsvc-s131.c
gcc.dg/vect/tsvc/vect-tsvc-s132.c
gcc.dg/vect/tsvc/vect-tsvc-s2233.c
gcc.dg/vect/tsvc/vect-tsvc-s421.c
gcc.dg/vect/vect-alias-check-14.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
point vectors
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
---
gcc/tree-vect-stmts.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
Juergen Christ <jchrist@linux.ibm.com> writes:
> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic. This leads to
> mis-compilations.
Is the bug ref + test missing?
>
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
>
> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> point vectors
>
> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
> gcc/tree-vect-stmts.cc | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> those through even when the mode isn't word_mode. For
> ops we have to lower the lowering code assumes we are
> dealing with word_mode. */
> - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> || !target_support_p)
> && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> /* Check only during analysis. */
Am Fri, Feb 23, 2024 at 01:57:12PM +0000 schrieb Sam James:
>
> Juergen Christ <jchrist@linux.ibm.com> writes:
>
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic. This leads to
> > mis-compilations.
>
> Is the bug ref + test missing?
Sorry, forgot to add the "bootstrapped and tested on s390x and x86_64".
Not sure how to reference a bugzilla here. There is 114075 that
should be solved with this, too.
> >
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> > point vectors
> >
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> > gcc/tree-vect-stmts.cc | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > those through even when the mode isn't word_mode. For
> > ops we have to lower the lowering code assumes we are
> > dealing with word_mode. */
> > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > || !target_support_p)
> > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > /* Check only during analysis. */
>
On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic. This leads to
> mis-compilations.
>
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
>
> gcc/ChangeLog:
>
Please add
PR tree-optimization/114075
above the * tree-vect-stmts line.
> * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> point vectors
This line should be tab indented like the first one, and end with .
And given what the patch does, perhaps say non-integral instead of floating
point.
As for testcase, I'll handle it separately, given that it already
fixes some pre-existing tests.
> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
> gcc/tree-vect-stmts.cc | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> those through even when the mode isn't word_mode. For
> ops we have to lower the lowering code assumes we are
> dealing with word_mode. */
> - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> || !target_support_p)
> && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> /* Check only during analysis. */
LGTM, but please wait until Monday evening so that Richi or Richard
have a chance to chime in.
Jakub
On Fri, 23 Feb 2024, Jakub Jelinek wrote:
> On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic. This leads to
> > mis-compilations.
> >
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> >
> > gcc/ChangeLog:
> >
>
> Please add
> PR tree-optimization/114075
> above the * tree-vect-stmts line.
> > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> > point vectors
>
> This line should be tab indented like the first one, and end with .
> And given what the patch does, perhaps say non-integral instead of floating
> point.
>
> As for testcase, I'll handle it separately, given that it already
> fixes some pre-existing tests.
>
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> > gcc/tree-vect-stmts.cc | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > those through even when the mode isn't word_mode. For
> > ops we have to lower the lowering code assumes we are
> > dealing with word_mode. */
> > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > || !target_support_p)
> > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > /* Check only during analysis. */
I think it will work fine. Even after the last TLC this feels like in
the need of more TLC ;)
So OK. Also for affected branches - the effective check should be the
same in GCC 13 at least, but with some added ad-hoc costing which might
make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a
word_mode that can cover 4 FP elements. Possibly triggerable with
HFmode?
Thanks,
Richard.
> LGTM, but please wait until Monday evening so that Richi or Richard
> have a chance to chime in.
>
> Jakub
>
>
On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > those through even when the mode isn't word_mode. For
> > > ops we have to lower the lowering code assumes we are
> > > dealing with word_mode. */
> > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > || !target_support_p)
> > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > /* Check only during analysis. */
>
> I think it will work fine. Even after the last TLC this feels like in
> the need of more TLC ;)
Note, at least in theory, floating point NEGATE_EXPR could be handled just
fine in the emulated vectors, just xor the sign bit, and
BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
floating point modes (though e.g. in RTL they are used even for them),
it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth
it though. In any case, that wouldn't be stage4 material.
Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > those through even when the mode isn't word_mode. For
> > > > ops we have to lower the lowering code assumes we are
> > > > dealing with word_mode. */
> > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > || !target_support_p)
> > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > /* Check only during analysis. */
> >
> > I think it will work fine. Even after the last TLC this feels like in
> > the need of more TLC ;)
>
> Note, at least in theory, floating point NEGATE_EXPR could be handled just
> fine in the emulated vectors, just xor the sign bit, and
> BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> floating point modes (though e.g. in RTL they are used even for them),
Hmm, I think vector types not supported only ever get integer modes
assigned, not FP modes so the operations would be on integer modes.
Unless I'm missing the point ...
> it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth
> it though. In any case, that wouldn't be stage4 material.
Agreed.
Richard.
On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> On Mon, 26 Feb 2024, Jakub Jelinek wrote:
>
> > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > > those through even when the mode isn't word_mode. For
> > > > > ops we have to lower the lowering code assumes we are
> > > > > dealing with word_mode. */
> > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > || !target_support_p)
> > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > > /* Check only during analysis. */
> > >
> > > I think it will work fine. Even after the last TLC this feels like in
> > > the need of more TLC ;)
> >
> > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > fine in the emulated vectors, just xor the sign bit, and
> > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > floating point modes (though e.g. in RTL they are used even for them),
>
> Hmm, I think vector types not supported only ever get integer modes
> assigned, not FP modes so the operations would be on integer modes.
I mean one could still SLP vectorize using the emulated vectors
float a[2], b[2];
...
a[0] = -b[0];
a[1] = -b[1];
as effectively
*(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
if we handled that case later on (except the above !INTEGRAL_TYPE_P check
would prevent that).
As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
disallow
float f, g;
f_3 |= g_4;
etc. (but the FEs do), though maybe we'd ICE on some targets during
expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.
Emulation using integer mode would be well defined.
Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> >
> > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > > > those through even when the mode isn't word_mode. For
> > > > > > ops we have to lower the lowering code assumes we are
> > > > > > dealing with word_mode. */
> > > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > > || !target_support_p)
> > > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > > > /* Check only during analysis. */
> > > >
> > > > I think it will work fine. Even after the last TLC this feels like in
> > > > the need of more TLC ;)
> > >
> > > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > > fine in the emulated vectors, just xor the sign bit, and
> > > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > > floating point modes (though e.g. in RTL they are used even for them),
> >
> > Hmm, I think vector types not supported only ever get integer modes
> > assigned, not FP modes so the operations would be on integer modes.
>
> I mean one could still SLP vectorize using the emulated vectors
> float a[2], b[2];
> ...
> a[0] = -b[0];
> a[1] = -b[1];
> as effectively
> *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
> if we handled that case later on (except the above !INTEGRAL_TYPE_P check
> would prevent that).
Yep.
> As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
> disallow
> float f, g;
> f_3 |= g_4;
> etc. (but the FEs do), though maybe we'd ICE on some targets during
> expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.
I think that's an omission in the verifier (we allow the ops on
pointers though).
Richard.
@@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
those through even when the mode isn't word_mode. For
ops we have to lower the lowering code assumes we are
dealing with word_mode. */
- if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
+ if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
+ || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
|| !target_support_p)
&& maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
/* Check only during analysis. */