[v2] vect: Fix vectorized BIT_FIELD_REF for signed bit-fields [PR110557]

Message ID 20230707131857.2386125-2-xry111@xry111.site
State Accepted
Series [v2] vect: Fix vectorized BIT_FIELD_REF for signed bit-fields [PR110557]

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Xi Ruoyao July 7, 2023, 1:18 p.m. UTC
  If a bit-field is signed and wider than the output type, we must
ensure the extracted result is sign-extended.  But this was not handled
correctly.

For example:

    int x : 8;
    long y : 55;
    bool z : 1;

The vectorized extraction of y was:

    vect__ifc__49.29_110 =
      MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.27_108];
    vect_patt_38.30_112 =
      vect__ifc__49.29_110 & { 9223372036854775552, 9223372036854775552 };
    vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
    vect_patt_40.32_114 =
      VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
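In scalar terms (a minimal sketch, not code from the patch; the function
name is illustrative), the old sequence masks out y and then performs a
*logical* right shift in the unsigned container type, so for a negative
y the upper bits are zero-filled and the sign is lost:

```cpp
#include <cstdint>

// Scalar rendition of the buggy vectorized sequence above: mask out the
// 55-bit field y (bits 8..62), shift right *logically* in the unsigned
// container type, then reinterpret the result as signed.
// 0x7FFFFFFFFFFFFF00 == 9223372036854775552, the mask in the dump above.
int64_t buggy_extract_y (uint64_t container)
{
  uint64_t masked  = container & 0x7FFFFFFFFFFFFF00ULL;
  uint64_t shifted = masked >> 8;   // zero-fills the top 9 bits
  return (int64_t) shifted;         // sign bit of y never reaches bit 63
}
```

For y == -1 this yields 2^55 - 1 (36028797018963967) instead of -1.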

This is obviously incorrect.  This patch implements it as:

    vect__ifc__25.16_62 =
      MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.14_60];
    vect_patt_31.17_63 =
      VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
    vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
    vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;
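The corrected sequence is the classic scalar idiom for extracting a
signed bit-field: convert to a signed type, shift the field up against
the sign bit, then arithmetic-shift back down.  A minimal scalar sketch
(the helper name is illustrative, not from the patch; right-shifting a
negative value is arithmetic on all GCC targets and guaranteed since
C++20):

```cpp
#include <cstdint>

// Extract a signed `width`-bit field at bit position `bitpos` from a
// 64-bit container, mirroring the fixed sequence above: for y (bitpos
// 8, width 55) this is `container << 1` followed by `>> 9`.
int64_t extract_signed_field (uint64_t container, int bitpos, int width)
{
  const int prec = 64;
  // The left shift moves the field's top bit into the sign bit...
  int64_t shifted = (int64_t) (container << (prec - bitpos - width));
  // ...and the arithmetic right shift replicates it downwards.
  return shifted >> (prec - width);
}
```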

gcc/ChangeLog:

	PR tree-optimization/110557
	* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern):
	Ensure the output is sign-extended if necessary.

gcc/testsuite/ChangeLog:

	PR tree-optimization/110557
	* g++.dg/vect/pr110557.cc: New test.
---

Change v1 -> v2:

- Rename two variables for readability.
- Remove a redundant useless_type_conversion_p check.
- Edit the comment for early conversion to show the rationale of
  "|| ref_sext".

Bootstrapped (with BOOT_CFLAGS="-O3 -mavx2") and regtested on
x86_64-linux-gnu.  Ok for trunk and gcc-13?

 gcc/testsuite/g++.dg/vect/pr110557.cc | 37 ++++++++++++++++
 gcc/tree-vect-patterns.cc             | 62 ++++++++++++++++++++-------
 2 files changed, 83 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr110557.cc
  

Comments

Richard Biener July 10, 2023, 10:33 a.m. UTC | #1
On Fri, 7 Jul 2023, Xi Ruoyao wrote:

> If a bit-field is signed and wider than the output type, we must
> ensure the extracted result is sign-extended.  But this was not handled
> correctly.
> 
> For example:
> 
>     int x : 8;
>     long y : 55;
>     bool z : 1;
> 
> The vectorized extraction of y was:
> 
>     vect__ifc__49.29_110 =
>       MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.27_108];
>     vect_patt_38.30_112 =
>       vect__ifc__49.29_110 & { 9223372036854775552, 9223372036854775552 };
>     vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
>     vect_patt_40.32_114 =
>       VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
> 
> This is obviously incorrect.  This patch implements it as:
> 
>     vect__ifc__25.16_62 =
>       MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.14_60];
>     vect_patt_31.17_63 =
>       VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
>     vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
>     vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 	PR tree-optimization/110557
> 	* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern):
> 	Ensure the output is sign-extended if necessary.
> 
> gcc/testsuite/ChangeLog:
> 
> 	PR tree-optimization/110557
> 	* g++.dg/vect/pr110557.cc: New test.
> ---
> 
> Change v1 -> v2:
> 
> - Rename two variables for readability.
> - Remove a redundant useless_type_conversion_p check.
> - Edit the comment for early conversion to show the rationale of
>   "|| ref_sext".
> 
> Bootstrapped (with BOOT_CFLAGS="-O3 -mavx2") and regtested on
> x86_64-linux-gnu.  Ok for trunk and gcc-13?
> 
>  gcc/testsuite/g++.dg/vect/pr110557.cc | 37 ++++++++++++++++
>  gcc/tree-vect-patterns.cc             | 62 ++++++++++++++++++++-------
>  2 files changed, 83 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/vect/pr110557.cc
> 
> diff --git a/gcc/testsuite/g++.dg/vect/pr110557.cc b/gcc/testsuite/g++.dg/vect/pr110557.cc
> new file mode 100644
> index 00000000000..e1fbe1caac4
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/vect/pr110557.cc
> @@ -0,0 +1,37 @@
> +// { dg-additional-options "-mavx" { target { avx_runtime } } }
> +
> +static inline long
> +min (long a, long b)
> +{
> +  return a < b ? a : b;
> +}
> +
> +struct Item
> +{
> +  int x : 8;
> +  long y : 55;
> +  bool z : 1;
> +};
> +
> +__attribute__ ((noipa)) long
> +test (Item *a, int cnt)
> +{
> +  long size = 0;
> +  for (int i = 0; i < cnt; i++)
> +    size = min ((long)a[i].y, size);
> +  return size;
> +}
> +
> +int
> +main ()
> +{
> +  struct Item items[] = {
> +    { 1, -1 },
> +    { 2, -2 },
> +    { 3, -3 },
> +    { 4, -4 },
> +  };
> +
> +  if (test (items, 4) != -4)
> +    __builtin_trap ();
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 1bc36b043a0..c0832e8679f 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -2566,7 +2566,7 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
>     Widening with mask first, shift later:
>     container = (type_out) container;
>     masked = container & (((1 << bitsize) - 1) << bitpos);
> -   result = patt2 >> masked;
> +   result = masked >> bitpos;
>  
>     Widening with shift first, mask last:
>     container = (type_out) container;
> @@ -2578,6 +2578,15 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
>     result = masked >> bitpos;
>     result = (type_out) result;
>  
> +   If the bitfield is signed and it's wider than type_out, we need to
> +   keep the result sign-extended:
> +   container = (type) container;
> +   masked = container << (prec - bitsize - bitpos);
> +   result = (type_out) (masked >> (prec - bitsize));
> +
> +   Here type is the signed variant of the wider of type_out and the type
> +   of container.
> +
>     The shifting is always optional depending on whether bitpos != 0.
>  
>  */
> @@ -2636,14 +2645,22 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>    if (BYTES_BIG_ENDIAN)
>      shift_n = prec - shift_n - mask_width;
>  
> +  bool ref_sext = (!TYPE_UNSIGNED (TREE_TYPE (bf_ref)) &&
> +		   TYPE_PRECISION (ret_type) > mask_width);
> +  bool load_widen = (TYPE_PRECISION (TREE_TYPE (container)) <
> +		     TYPE_PRECISION (ret_type));
> +
>    /* We move the conversion earlier if the loaded type is smaller than the
> -     return type to enable the use of widening loads.  */
> -  if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
> -      && !useless_type_conversion_p (TREE_TYPE (container), ret_type))
> -    {
> -      pattern_stmt
> -	= gimple_build_assign (vect_recog_temp_ssa_var (ret_type),
> -			       NOP_EXPR, container);
> +     return type to enable the use of widening loads.  And if we need a
> +     sign extension, we need to convert the loaded value early to a signed
> +     type as well.  */
> +  if (ref_sext || load_widen)
> +    {
> +      tree type = load_widen ? ret_type : container_type;
> +      if (ref_sext)
> +	type = gimple_signed_type (type);
> +      pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type),
> +					  NOP_EXPR, container);
>        container = gimple_get_lhs (pattern_stmt);
>        container_type = TREE_TYPE (container);
>        prec = tree_to_uhwi (TYPE_SIZE (container_type));
> @@ -2671,7 +2688,7 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>      shift_first = true;
>  
>    tree result;
> -  if (shift_first)
> +  if (shift_first && !ref_sext)
>      {
>        tree shifted = container;
>        if (shift_n)
> @@ -2694,14 +2711,27 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>      }
>    else
>      {
> -      tree mask = wide_int_to_tree (container_type,
> -				    wi::shifted_mask (shift_n, mask_width,
> -						      false, prec));
> -      pattern_stmt
> -	= gimple_build_assign (vect_recog_temp_ssa_var (container_type),
> -			       BIT_AND_EXPR, container, mask);
> -      tree masked = gimple_assign_lhs (pattern_stmt);
> +      tree temp = vect_recog_temp_ssa_var (container_type);
> +      if (!ref_sext)
> +	{
> +	  tree mask = wide_int_to_tree (container_type,
> +					wi::shifted_mask (shift_n,
> +							  mask_width,
> +							  false, prec));
> +	  pattern_stmt = gimple_build_assign (temp, BIT_AND_EXPR,
> +					      container, mask);
> +	}
> +      else
> +	{
> +	  HOST_WIDE_INT shl = prec - shift_n - mask_width;
> +	  shift_n += shl;
> +	  pattern_stmt = gimple_build_assign (temp, LSHIFT_EXPR,
> +					      container,
> +					      build_int_cst (sizetype,
> +							     shl));
> +	}
>  
> +      tree masked = gimple_assign_lhs (pattern_stmt);
>        append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
>        pattern_stmt
>  	= gimple_build_assign (vect_recog_temp_ssa_var (container_type),
>
  
Xi Ruoyao July 10, 2023, 11:12 a.m. UTC | #2
On Mon, 2023-07-10 at 10:33 +0000, Richard Biener wrote:
> On Fri, 7 Jul 2023, Xi Ruoyao wrote:
> 
> > If a bit-field is signed and wider than the output type, we must
> > ensure the extracted result is sign-extended.  But this was not
> > handled correctly.
> > 
> > For example:
> > 
> >     int x : 8;
> >     long y : 55;
> >     bool z : 1;
> > 
> > The vectorized extraction of y was:
> > 
> >     vect__ifc__49.29_110 =
> >       MEM <vector(2) long unsigned int> [(struct Item
> > *)vectp_a.27_108];
> >     vect_patt_38.30_112 =
> >       vect__ifc__49.29_110 & { 9223372036854775552,
> > 9223372036854775552 };
> >     vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
> >     vect_patt_40.32_114 =
> >       VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
> > 
> > This is obviously incorrect.  This patch implements it as:
> > 
> >     vect__ifc__25.16_62 =
> >       MEM <vector(2) long unsigned int> [(struct Item
> > *)vectp_a.14_60];
> >     vect_patt_31.17_63 =
> >       VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
> >     vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
> >     vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;
> 
> OK.

Pushed r14-2407 and r13-7553.
  
Prathamesh Kulkarni July 11, 2023, 7:34 a.m. UTC | #3
On Mon, 10 Jul 2023 at 16:43, Xi Ruoyao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Mon, 2023-07-10 at 10:33 +0000, Richard Biener wrote:
> > On Fri, 7 Jul 2023, Xi Ruoyao wrote:
> >
> > > If a bit-field is signed and wider than the output type, we must
> > > ensure the extracted result is sign-extended.  But this was not
> > > handled correctly.
> > >
> > > For example:
> > >
> > >     int x : 8;
> > >     long y : 55;
> > >     bool z : 1;
> > >
> > > The vectorized extraction of y was:
> > >
> > >     vect__ifc__49.29_110 =
> > >       MEM <vector(2) long unsigned int> [(struct Item
> > > *)vectp_a.27_108];
> > >     vect_patt_38.30_112 =
> > >       vect__ifc__49.29_110 & { 9223372036854775552,
> > > 9223372036854775552 };
> > >     vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
> > >     vect_patt_40.32_114 =
> > >       VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
> > >
> > > This is obviously incorrect.  This patch implements it as:
> > >
> > >     vect__ifc__25.16_62 =
> > >       MEM <vector(2) long unsigned int> [(struct Item
> > > *)vectp_a.14_60];
> > >     vect_patt_31.17_63 =
> > >       VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
> > >     vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
> > >     vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;
> >
> > OK.
>
> Pushed r14-2407 and r13-7553.
Hi Xi,
Your commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63ae6bc60c0f67fb2791991bf4b6e7e0a907d420,

seems to cause following regressions on arm-linux-gnueabihf:
FAIL: g++.dg/vect/pr110557.cc  -std=c++98 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++14 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++17 (test for excess errors)
FAIL: g++.dg/vect/pr110557.cc  -std=c++20 (test for excess errors)

Excess error:
gcc/testsuite/g++.dg/vect/pr110557.cc:12:8: warning: width of
'Item::y' exceeds its type

Thanks,
Prathamesh
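
The warning arises because `long` is 32 bits on arm-linux-gnueabihf, so
the 55-bit `long` bit-field exceeds its type's width there.  A variant
of the testcase struct using a fixed 64-bit type (a hypothetical
adjustment, not what the thread settled on) sidesteps the warning on
both ILP32 and LP64 targets:

```cpp
#include <cstdint>

// int64_t is 64 bits everywhere, so the 55-bit field fits regardless
// of the target's `long` width and the excess-width warning is avoided.
struct ItemPortable
{
  int32_t x : 8;
  int64_t y : 55;
  bool    z : 1;
};
```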
>
> --
> Xi Ruoyao <xry111@xry111.site>
> School of Aerospace Science and Technology, Xidian University
  

Patch

diff --git a/gcc/testsuite/g++.dg/vect/pr110557.cc b/gcc/testsuite/g++.dg/vect/pr110557.cc
new file mode 100644
index 00000000000..e1fbe1caac4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr110557.cc
@@ -0,0 +1,37 @@ 
+// { dg-additional-options "-mavx" { target { avx_runtime } } }
+
+static inline long
+min (long a, long b)
+{
+  return a < b ? a : b;
+}
+
+struct Item
+{
+  int x : 8;
+  long y : 55;
+  bool z : 1;
+};
+
+__attribute__ ((noipa)) long
+test (Item *a, int cnt)
+{
+  long size = 0;
+  for (int i = 0; i < cnt; i++)
+    size = min ((long)a[i].y, size);
+  return size;
+}
+
+int
+main ()
+{
+  struct Item items[] = {
+    { 1, -1 },
+    { 2, -2 },
+    { 3, -3 },
+    { 4, -4 },
+  };
+
+  if (test (items, 4) != -4)
+    __builtin_trap ();
+}
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1bc36b043a0..c0832e8679f 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -2566,7 +2566,7 @@  vect_recog_widen_sum_pattern (vec_info *vinfo,
    Widening with mask first, shift later:
    container = (type_out) container;
    masked = container & (((1 << bitsize) - 1) << bitpos);
-   result = patt2 >> masked;
+   result = masked >> bitpos;
 
    Widening with shift first, mask last:
    container = (type_out) container;
@@ -2578,6 +2578,15 @@  vect_recog_widen_sum_pattern (vec_info *vinfo,
    result = masked >> bitpos;
    result = (type_out) result;
 
+   If the bitfield is signed and it's wider than type_out, we need to
+   keep the result sign-extended:
+   container = (type) container;
+   masked = container << (prec - bitsize - bitpos);
+   result = (type_out) (masked >> (prec - bitsize));
+
+   Here type is the signed variant of the wider of type_out and the type
+   of container.
+
    The shifting is always optional depending on whether bitpos != 0.
 
 */
@@ -2636,14 +2645,22 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
   if (BYTES_BIG_ENDIAN)
     shift_n = prec - shift_n - mask_width;
 
+  bool ref_sext = (!TYPE_UNSIGNED (TREE_TYPE (bf_ref)) &&
+		   TYPE_PRECISION (ret_type) > mask_width);
+  bool load_widen = (TYPE_PRECISION (TREE_TYPE (container)) <
+		     TYPE_PRECISION (ret_type));
+
   /* We move the conversion earlier if the loaded type is smaller than the
-     return type to enable the use of widening loads.  */
-  if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
-      && !useless_type_conversion_p (TREE_TYPE (container), ret_type))
-    {
-      pattern_stmt
-	= gimple_build_assign (vect_recog_temp_ssa_var (ret_type),
-			       NOP_EXPR, container);
+     return type to enable the use of widening loads.  And if we need a
+     sign extension, we need to convert the loaded value early to a signed
+     type as well.  */
+  if (ref_sext || load_widen)
+    {
+      tree type = load_widen ? ret_type : container_type;
+      if (ref_sext)
+	type = gimple_signed_type (type);
+      pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type),
+					  NOP_EXPR, container);
       container = gimple_get_lhs (pattern_stmt);
       container_type = TREE_TYPE (container);
       prec = tree_to_uhwi (TYPE_SIZE (container_type));
@@ -2671,7 +2688,7 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
     shift_first = true;
 
   tree result;
-  if (shift_first)
+  if (shift_first && !ref_sext)
     {
       tree shifted = container;
       if (shift_n)
@@ -2694,14 +2711,27 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
     }
   else
     {
-      tree mask = wide_int_to_tree (container_type,
-				    wi::shifted_mask (shift_n, mask_width,
-						      false, prec));
-      pattern_stmt
-	= gimple_build_assign (vect_recog_temp_ssa_var (container_type),
-			       BIT_AND_EXPR, container, mask);
-      tree masked = gimple_assign_lhs (pattern_stmt);
+      tree temp = vect_recog_temp_ssa_var (container_type);
+      if (!ref_sext)
+	{
+	  tree mask = wide_int_to_tree (container_type,
+					wi::shifted_mask (shift_n,
+							  mask_width,
+							  false, prec));
+	  pattern_stmt = gimple_build_assign (temp, BIT_AND_EXPR,
+					      container, mask);
+	}
+      else
+	{
+	  HOST_WIDE_INT shl = prec - shift_n - mask_width;
+	  shift_n += shl;
+	  pattern_stmt = gimple_build_assign (temp, LSHIFT_EXPR,
+					      container,
+					      build_int_cst (sizetype,
+							     shl));
+	}
 
+      tree masked = gimple_assign_lhs (pattern_stmt);
       append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
       pattern_stmt
 	= gimple_build_assign (vect_recog_temp_ssa_var (container_type),