vect: Handle demoting FLOAT and promoting FIX_TRUNC.
Commit Message
Hi,
the recent changes that allowed multi-step conversions for
"non-packing/unpacking" targets, i.e. modifier == NONE, included
promoting to-float and demoting to-int variants. This patch
adds demoting to-float and promoting to-int handling.
Bootstrapped and regtested on x86 and aarch64.
A question that seems related: Why do we require !flag_trapping_math
for the "NONE" multistep conversion but not for the "NARROW_DST"
case when both seem to handle float -> int and there are float
values that do not have an int representation? If a backend
can guarantee that the conversion traps, should it just implement
a multistep conversion in a matching expander?
Regards
Robin
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_conversion): Handle
more demotion/promotion for modifier == NONE.
---
gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
1 file changed, 29 insertions(+), 11 deletions(-)
Comments
On Thu, Jul 13, 2023 at 12:31 PM Robin Dapp via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> the recent changes that allowed multi-step conversions for
> "non-packing/unpacking", i.e. modifier == NONE targets included
> promoting to-float and demoting to-int variants. This patch
> adds demoting to-float and promoting to-int handling.
Can you add testcases? Also, the current restriction exists because
the variants you add are not always correct, and I don't see any
checks that the intermediate type doesn't lose significant bits.
Richard.
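
To illustrate the concern about significant bits, here is a minimal scalar sketch in C. It is not taken from the patch: the function names are hypothetical, and the two-step variant is only an assumed model of a demoting int -> float conversion that first narrows the integer. For inputs that fit in 32 bits both paths agree; for larger inputs the narrowing step discards the high bits before the float conversion ever sees them.

```c
#include <stdint.h>
#include <string.h>

/* Direct scalar conversion: what the source loop asks for.  */
float convert_direct (int64_t v)
{
  return (float) v;
}

/* Hypothetical two-step variant: keep only the low 32 bits of the
   integer, then convert the narrowed value to float.  */
float convert_via_int32 (int64_t v)
{
  /* Unsigned conversion is well-defined (modulo 2^32).  */
  uint32_t low = (uint32_t) (uint64_t) v;
  int32_t narrowed;
  /* Reinterpret the bits as a two's-complement int32_t.  */
  memcpy (&narrowed, &low, sizeof narrowed);
  return (float) narrowed;
}
```

With v = 2^40 the direct conversion yields 2^40 exactly, while the two-step variant yields 0, so a vectorized multi-step sequence of this shape would need a guarantee about the value range before it could replace the scalar conversion.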
> Bootstrapped and regtested on x86 and aarch64.
>
> A question that seems related: Why do we require !flag_trapping_math
> for the "NONE" multistep conversion but not for the "NARROW_DST"
> case when both seem to handle float -> int and there are float
> values that do not have an int representation? If a backend
> can guarantee that the conversion traps, should it just implement
> a multistep conversion in a matching expander?
>
> Regards
> Robin
>
>
> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_conversion): Handle
> more demotion/promotion for modifier == NONE.
> ---
> gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
> 1 file changed, 29 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 10e71178ce7..78e0510be7e 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -5324,28 +5324,46 @@ vectorizable_conversion (vec_info *vinfo,
> break;
> }
>
> - /* For conversions between float and smaller integer types try whether we
> - can use intermediate signed integer types to support the
> + /* For conversions between float and larger integer types try whether
> + we can use intermediate signed integer types to support the
> conversion. */
> if ((code == FLOAT_EXPR
> - && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
> + && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
> || (code == FIX_TRUNC_EXPR
> - && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
> - && !flag_trapping_math))
> + && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
> + && !flag_trapping_math)
> + || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
> {
> + bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
> bool float_expr_p = code == FLOAT_EXPR;
> - scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
> - fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
> + unsigned short target_size;
> + scalar_mode intermediate_mode;
> + if (demotion)
> + {
> + intermediate_mode = lhs_mode;
> + target_size = GET_MODE_SIZE (rhs_mode);
> + }
> + else
> + {
> + target_size = GET_MODE_SIZE (lhs_mode);
> + tree itype
> + = build_nonstandard_integer_type (GET_MODE_BITSIZE
> + (rhs_mode), 0);
> + intermediate_mode = SCALAR_TYPE_MODE (itype);
> + }
> code1 = float_expr_p ? code : NOP_EXPR;
> codecvt1 = float_expr_p ? NOP_EXPR : code;
> - FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
> + opt_scalar_mode mode_iter;
> + FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
> {
> - imode = rhs_mode_iter.require ();
> - if (GET_MODE_SIZE (imode) > fltsz)
> + intermediate_mode = mode_iter.require ();
> +
> + if (GET_MODE_SIZE (intermediate_mode) > target_size)
> break;
>
> cvt_type
> - = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
> + = build_nonstandard_integer_type (GET_MODE_BITSIZE
> + (intermediate_mode),
> 0);
> cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
> slp_node);
> --
> 2.41.0
>
> Can you add testcases? Also the current restriction is because
> the variants you add are not always correct and I don't see any
> checks that the intermediate type doesn't lose significant bits?
I wanted to add the testcases with a follow-up RISC-V patch, but
I can also try an aarch64 one.

So for my understanding (please correct me), we have:

promoting int -> float: this should always be safe. We currently
vectorize this with WIDEN and NONE.

demoting float -> int: this is safe as long as the float
value can be represented in the int type, otherwise we must
trap.
We currently vectorize this on x86 using NARROW (regardless
of -ftrapping-math) and using NONE only with -fno-trapping-math.

demoting int -> float: this is safe as long as the
intermediate types can hold the initial value? How is
this different from demoting e.g. int64_t -> int8_t?
We currently do not vectorize this with either NARROW or NONE.
LLVM vectorizes this, but only with their default(?) -fno-trapping-math.
Yet I don't see how we could trap here?

promoting float -> int: this is safe as long as the float
value can be represented (as above)? We currently vectorize
this (regardless of -ftrapping-math) with WIDEN but not NONE.

So apart from unifying the -ftrapping-math behavior, I think only
the third variant is somewhat critical?
Regards
Robin
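
For reference, the four variants listed above correspond to scalar loops of the following shape. This is only a sketch of what candidate testcases might look like: the function names and the chosen type pairs are assumptions, and no testsuite directives are included.

```c
#include <stddef.h>
#include <stdint.h>

/* 1. Promoting int -> float (FLOAT_EXPR, result wider than source).  */
void promote_int_to_float (double *restrict dst,
                           const int32_t *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (double) src[i];
}

/* 2. Demoting float -> int (FIX_TRUNC_EXPR, result narrower).  */
void demote_float_to_int (int32_t *restrict dst,
                          const double *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int32_t) src[i];
}

/* 3. Demoting int -> float (FLOAT_EXPR, result narrower).  */
void demote_int_to_float (float *restrict dst,
                          const int64_t *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (float) src[i];
}

/* 4. Promoting float -> int (FIX_TRUNC_EXPR, result wider).  */
void promote_float_to_int (int64_t *restrict dst,
                           const float *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (int64_t) src[i];
}
```

Variants 3 and 4 are the ones this patch aims to enable for modifier == NONE. Whether a given target actually vectorizes any of these loops depends on its available conversion patterns.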
On Thu, Jul 13, 2023 at 2:19 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> > Can you add testcases? Also the current restriction is because
> > the variants you add are not always correct and I don't see any
> > checks that the intermediate type doesn't lose significant bits?
>
> The testcases I wanted to add with a follow-up RISC-V patch but
> I can also try an aarch64 one.
>
> So for my understanding, please correct, we have:
>
> promoting int -> float, should always be safe. We currently
> vectorize this with WIDEN and NONE.
>
> demoting float -> int, this is safe as long as the float
> value can be represented in the int type, otherwise we must
> trap.
> We currently vectorize this on x86 using NARROW (regardless
> of -ftrapping-math) and using NONE only with -fno-trapping-math.
>
> demoting int -> float, this is safe as long as the
> intermediate types can hold the initial value? How is
> this different to demoting e.g. int64_t -> int8_t?
> We currently do not vectorize this with either NARROW or NONE.
> LLVM vectorizes but only with their default(?) -fno-trapping-math.
> Yet I don't see how we could trap here?
>
> promoting float -> int, this is safe as long as the float
> value can be represented (as above)? We currently vectorize
> this (regardless of -ftrapping-math) with WIDEN but not NONE.
>
> So apart from unifying the -ftrapping-math behavior I think only
> the third variant is somewhat critical?
I think all demoting cases need checks that are not present right now,
irrespective of whether they trap properly.
Richard.
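
As a rough illustration of the kind of check meant here, consider a demoting float -> int conversion from double to a 32-bit integer that is assumed to go through a 64-bit intermediate integer. The helpers below are hypothetical and only express the representability conditions on scalar values; they are not part of the patch or of GCC.

```c
#include <stdbool.h>

/* True if truncating D toward zero yields a value that fits in a
   32-bit signed integer.  NaN compares false and is rejected.  */
bool fits_int32 (double d)
{
  return d > -2147483649.0 && d < 2147483648.0;
}

/* True if truncating D toward zero yields a value that fits in a
   64-bit signed integer.  -2^63 is exactly representable as a
   double; 2^63 is the first value that is too large.  */
bool fits_int64 (double d)
{
  return d >= -9223372036854775808.0 && d < 9223372036854775808.0;
}
```

A value such as 2^40 passes fits_int64 but fails fits_int32. A direct double -> int32 conversion of that value is an invalid operation under IEEE 754, whereas a first step into a 64-bit integer succeeds silently and a later truncation produces an unrelated result. That difference is one plausible reason the multi-step path needs either a range guarantee or -fno-trapping-math.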
> Regards
> Robin
>
@@ -5324,28 +5324,46 @@ vectorizable_conversion (vec_info *vinfo,
break;
}
- /* For conversions between float and smaller integer types try whether we
- can use intermediate signed integer types to support the
+ /* For conversions between float and larger integer types try whether
+ we can use intermediate signed integer types to support the
conversion. */
if ((code == FLOAT_EXPR
- && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+ && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
|| (code == FIX_TRUNC_EXPR
- && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
- && !flag_trapping_math))
+ && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
+ && !flag_trapping_math)
+ || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
{
+ bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
bool float_expr_p = code == FLOAT_EXPR;
- scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
- fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+ unsigned short target_size;
+ scalar_mode intermediate_mode;
+ if (demotion)
+ {
+ intermediate_mode = lhs_mode;
+ target_size = GET_MODE_SIZE (rhs_mode);
+ }
+ else
+ {
+ target_size = GET_MODE_SIZE (lhs_mode);
+ tree itype
+ = build_nonstandard_integer_type (GET_MODE_BITSIZE
+ (rhs_mode), 0);
+ intermediate_mode = SCALAR_TYPE_MODE (itype);
+ }
code1 = float_expr_p ? code : NOP_EXPR;
codecvt1 = float_expr_p ? NOP_EXPR : code;
- FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+ opt_scalar_mode mode_iter;
+ FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
{
- imode = rhs_mode_iter.require ();
- if (GET_MODE_SIZE (imode) > fltsz)
+ intermediate_mode = mode_iter.require ();
+
+ if (GET_MODE_SIZE (intermediate_mode) > target_size)
break;
cvt_type
- = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+ = build_nonstandard_integer_type (GET_MODE_BITSIZE
+ (intermediate_mode),
0);
cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
slp_node);