vect: Handle demoting FLOAT and promoting FIX_TRUNC.

Message ID d7abbe5a-2b77-00e6-a2ba-b390891d2a99@gmail.com
State Unresolved
Series vect: Handle demoting FLOAT and promoting FIX_TRUNC.

Checks

Context                Check    Description
snail/gcc-patch-check  warning  Git am fail log

Commit Message

Robin Dapp July 13, 2023, 10:30 a.m. UTC
  Hi,

The recent changes that allowed multi-step conversions for
"non-packing/unpacking" targets (i.e. modifier == NONE) covered the
promoting to-float and demoting to-int variants.  This patch
adds handling for demoting to-float and promoting to-int.
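
For illustration, the two newly handled variants correspond to scalar
loops like the ones below.  This is only a sketch: the function names
are hypothetical, float is assumed to be 32 bits wide, and nothing here
is taken from the patch or from the GCC testsuite.

```cpp
#include <cstddef>
#include <cstdint>

// Demoting to-float (FLOAT_EXPR): 64-bit integer source, 32-bit float result.
void demote_to_float (float *dst, const std::int64_t *src, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    dst[i] = static_cast<float> (src[i]);
}

// Promoting to-int (FIX_TRUNC_EXPR): 32-bit float source, 64-bit integer
// result.  Inputs are assumed to be representable in the result type.
void promote_to_int (std::int64_t *dst, const float *src, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    dst[i] = static_cast<std::int64_t> (src[i]);
}
```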

Bootstrapped and regtested on x86 and aarch64.

A question that seems related: Why do we require !flag_trapping_math
for the "NONE" multistep conversion but not for the "NARROW_DST"
case when both seem to handle float -> int and there are float
values that do not have an int representation?  If a backend
can guarantee that the conversion traps, should it just implement
a multistep conversion in a matching expander?

Regards
 Robin


gcc/ChangeLog:

	* tree-vect-stmts.cc (vectorizable_conversion): Handle
	more demotion/promotion for modifier == NONE.
---
 gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)
  

Comments

Richard Biener July 13, 2023, 10:37 a.m. UTC | #1
On Thu, Jul 13, 2023 at 12:31 PM Robin Dapp via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> the recent changes that allowed multi-step conversions for
> "non-packing/unpacking", i.e. modifier == NONE targets included
> promoting to-float and demoting to-int variants.  This patch
> adds demoting to-float and promoting to-int handling.

Can you add testcases?  Also the current restriction is because
the variants you add are not always correct and I don't see any
checks that the intermediate type doesn't lose significant bits?
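
A scalar sketch of the concern, assuming a demoting int -> float
conversion were split into a narrowing integer step followed by the
float conversion.  The helper names are hypothetical and this is not
the code the vectorizer emits; it only shows how an intermediate type
that is too narrow changes the result.

```cpp
#include <cstdint>

// Direct conversion: the full 64-bit value is rounded to float once.
float direct_to_float (std::int64_t v)
{
  return static_cast<float> (v);
}

// Hypothetical two-step variant: narrow to 32 bits first, then convert.
// Going through uint32_t keeps the truncation well-defined (modulo 2^32);
// the final cast to int32_t is only portable before C++20 when the low
// 32 bits fit into int32_t.
float via_int32_to_float (std::int64_t v)
{
  std::uint32_t low = static_cast<std::uint32_t> (v);
  std::int32_t narrowed = static_cast<std::int32_t> (low);
  return static_cast<float> (narrowed);
}
```

For small inputs both functions agree.  For an input such as 2^32 + 1
the narrowing step discards the upper bits, so the two results differ.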

Richard.

> Bootstrapped and regtested on x86 and aarch64.
>
> A question that seems related: Why do we require !flag_trapping_math
> for the "NONE" multistep conversion but not for the "NARROW_DST"
> case when both seem to handle float -> int and there are float
> values that do not have an int representation?  If a backend
> can guarantee that the conversion traps, should it just implement
> a multistep conversion in a matching expander?
>
> Regards
>  Robin
>
>
> gcc/ChangeLog:
>
>         * tree-vect-stmts.cc (vectorizable_conversion): Handle
>         more demotion/promotion for modifier == NONE.
> ---
>  gcc/tree-vect-stmts.cc | 40 +++++++++++++++++++++++++++++-----------
>  1 file changed, 29 insertions(+), 11 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 10e71178ce7..78e0510be7e 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -5324,28 +5324,46 @@ vectorizable_conversion (vec_info *vinfo,
>         break;
>        }
>
> -      /* For conversions between float and smaller integer types try whether we
> -        can use intermediate signed integer types to support the
> +      /* For conversions between float and larger integer types try whether
> +        we can use intermediate signed integer types to support the
>          conversion.  */
>        if ((code == FLOAT_EXPR
> -          && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
> +          && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
>           || (code == FIX_TRUNC_EXPR
> -             && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
> -             && !flag_trapping_math))
> +             && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
> +                 && !flag_trapping_math)
> +                 || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
>         {
> +         bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
>           bool float_expr_p = code == FLOAT_EXPR;
> -         scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
> -         fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
> +         unsigned short target_size;
> +         scalar_mode intermediate_mode;
> +         if (demotion)
> +           {
> +             intermediate_mode = lhs_mode;
> +             target_size = GET_MODE_SIZE (rhs_mode);
> +           }
> +         else
> +           {
> +             target_size = GET_MODE_SIZE (lhs_mode);
> +             tree itype
> +               = build_nonstandard_integer_type (GET_MODE_BITSIZE
> +                                                 (rhs_mode), 0);
> +             intermediate_mode = SCALAR_TYPE_MODE (itype);
> +           }
>           code1 = float_expr_p ? code : NOP_EXPR;
>           codecvt1 = float_expr_p ? NOP_EXPR : code;
> -         FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
> +         opt_scalar_mode mode_iter;
> +         FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
>             {
> -             imode = rhs_mode_iter.require ();
> -             if (GET_MODE_SIZE (imode) > fltsz)
> +             intermediate_mode = mode_iter.require ();
> +
> +             if (GET_MODE_SIZE (intermediate_mode) > target_size)
>                 break;
>
>               cvt_type
> -               = build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
> +               = build_nonstandard_integer_type (GET_MODE_BITSIZE
> +                                                 (intermediate_mode),
>                                                   0);
>               cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
>                                                       slp_node);
> --
> 2.41.0
>
  
Robin Dapp July 13, 2023, 12:19 p.m. UTC | #2
> Can you add testcases?  Also the current restriction is because
> the variants you add are not always correct and I don't see any
> checks that the intermediate type doesn't lose significant bits?

I wanted to add the testcases with a follow-up RISC-V patch, but
I can also try an aarch64 one.

So, for my understanding (please correct me if I'm wrong), we have:
  
  promoting int -> float, should always be safe.  We currently
   vectorize this with WIDEN and NONE.

  demoting float -> int, this is safe as long as the float
   value can be represented in the int type, otherwise we must
   trap.
   We currently vectorize this on x86 using NARROW (regardless
   of -ftrapping-math) and using NONE only with -fno-trapping-math.

  demoting int -> float, this is safe as long as the
   intermediate types can hold the initial value?  How is
   this different to demoting e.g. int64_t -> int8_t?
   We currently do not vectorize this with either NARROW or NONE.
   LLVM vectorizes but only with their default(?) -fno-trapping-math.
   Yet I don't see how we could trap here?

  promoting float -> int, this is safe as long as the float
   value can be represented (as above)?  We currently vectorize
   this (regardless of -ftrapping-math) with WIDEN but not NONE.

So apart from unifying the -ftrapping-math behavior I think only
the third variant is somewhat critical?
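
To make "the float value can be represented" concrete, here is a
minimal scalar sketch of a float -> int32_t conversion guarded by a
range check.  The helper name is hypothetical, float is assumed to be
IEEE binary32, and C++17 is assumed for std::optional; it is an
illustration of the condition, not something GCC generates.

```cpp
#include <cstdint>
#include <optional>

// Returns the truncated value if it fits into int32_t, otherwise nothing.
std::optional<std::int32_t> checked_fix_trunc (float v)
{
  // -2^31 and 2^31 are exactly representable as float, so these bounds
  // are exact.  NaN fails both comparisons and is rejected as well.
  if (v >= -2147483648.0f && v < 2147483648.0f)
    return static_cast<std::int32_t> (v);
  return std::nullopt;
}
```

Without such a guard the unrepresentable cases are undefined in C and
C++, which is where the -ftrapping-math question comes in.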

Regards
 Robin
  
Richard Biener July 13, 2023, 12:36 p.m. UTC | #3
On Thu, Jul 13, 2023 at 2:19 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> > Can you add testcases?  Also the current restriction is because
> > the variants you add are not always correct and I don't see any
> > checks that the intermediate type doesn't lose significant bits?
>
> The testcases I wanted to add with a follow-up RISC-V patch but
> I can also try an aarch64 one.
>
> So for my understanding, please correct, we have:
>
>   promoting int -> float, should always be safe.  We currently
>    vectorize this with WIDEN and NONE.
>
>   demoting float -> int, this is safe as long as the float
>    value can be represented in the int type, otherwise we must
>    trap.
>    We currently vectorize this on x86 using NARROW (regardless
>    of -ftrapping-math) and using NONE only with -fno-trapping-math.
>
>   demoting int -> float, this is safe as long as the
>    intermediate types can hold the initial value?  How is
>    this different to demoting e.g. int64_t -> int8_t?
>    We currently do not vectorize this with either NARROW or NONE.
>    LLVM vectorizes but only with their default(?) -fno-trapping-math.
>    Yet I don't see how we could trap here?
>
>   promoting float -> int, this is safe as long as the float
>    value can be represented (as above)?  We currently vectorize
>    this (regardless of -ftrapping-math) with WIDEN but not NONE.
>
> So apart from unifying the -ftrapping-math behavior I think only
> the third variant is somewhat critical?

I think all demoting cases need checks that are not present right now
irrespective of properly trapping.
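
One possible shape for such a check, purely as a scalar illustration:
given a known value range [lo, hi] for the source operand, a narrowing
integer step is lossless only if that range fits into the intermediate
type.  The helper below is hypothetical, is meant for signed
intermediate types of at most 64 bits, and says nothing about how the
vectorizer would obtain the range.

```cpp
#include <cstdint>
#include <limits>

// True if every value in [lo, hi] survives narrowing to the signed
// integer type Narrow without losing bits.
template <typename Narrow>
constexpr bool range_fits (std::int64_t lo, std::int64_t hi)
{
  return lo >= std::numeric_limits<Narrow>::min ()
	 && hi <= std::numeric_limits<Narrow>::max ();
}
```

With a range of [-100, 100] an int32_t intermediate is fine; with an
upper bound of 2^32 + 1 it is not, and the multi-step conversion would
have to be rejected.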

Richard.

> Regards
>  Robin
>
  

Patch

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 10e71178ce7..78e0510be7e 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5324,28 +5324,46 @@  vectorizable_conversion (vec_info *vinfo,
 	break;
       }
 
-      /* For conversions between float and smaller integer types try whether we
-	 can use intermediate signed integer types to support the
+      /* For conversions between float and larger integer types try whether
+	 we can use intermediate signed integer types to support the
 	 conversion.  */
       if ((code == FLOAT_EXPR
-	   && GET_MODE_SIZE (lhs_mode) > GET_MODE_SIZE (rhs_mode))
+	   && GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode))
 	  || (code == FIX_TRUNC_EXPR
-	      && GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
-	      && !flag_trapping_math))
+	      && ((GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode)
+		  && !flag_trapping_math)
+		  || GET_MODE_SIZE (rhs_mode) < GET_MODE_SIZE (lhs_mode))))
 	{
+	  bool demotion = GET_MODE_SIZE (rhs_mode) > GET_MODE_SIZE (lhs_mode);
 	  bool float_expr_p = code == FLOAT_EXPR;
-	  scalar_mode imode = float_expr_p ? rhs_mode : lhs_mode;
-	  fltsz = GET_MODE_SIZE (float_expr_p ? lhs_mode : rhs_mode);
+	  unsigned short target_size;
+	  scalar_mode intermediate_mode;
+	  if (demotion)
+	    {
+	      intermediate_mode = lhs_mode;
+	      target_size = GET_MODE_SIZE (rhs_mode);
+	    }
+	  else
+	    {
+	      target_size = GET_MODE_SIZE (lhs_mode);
+	      tree itype
+		= build_nonstandard_integer_type (GET_MODE_BITSIZE
+						  (rhs_mode), 0);
+	      intermediate_mode = SCALAR_TYPE_MODE (itype);
+	    }
 	  code1 = float_expr_p ? code : NOP_EXPR;
 	  codecvt1 = float_expr_p ? NOP_EXPR : code;
-	  FOR_EACH_2XWIDER_MODE (rhs_mode_iter, imode)
+	  opt_scalar_mode mode_iter;
+	  FOR_EACH_2XWIDER_MODE (mode_iter, intermediate_mode)
 	    {
-	      imode = rhs_mode_iter.require ();
-	      if (GET_MODE_SIZE (imode) > fltsz)
+	      intermediate_mode = mode_iter.require ();
+
+	      if (GET_MODE_SIZE (intermediate_mode) > target_size)
 		break;
 
 	      cvt_type
-		= build_nonstandard_integer_type (GET_MODE_BITSIZE (imode),
+		= build_nonstandard_integer_type (GET_MODE_BITSIZE
+						  (intermediate_mode),
 						  0);
 	      cvt_type = get_vectype_for_scalar_type (vinfo, cvt_type,
 						      slp_node);