vect: Fix vectorized BIT_FIELD_REF for signed bit-fields [PR110557]
Commit Message
If a bit-field is signed and it's narrower than the output type, we must
ensure the extracted result is sign-extended. But this was not handled
correctly.
For example:
int x : 8;
long y : 55;
bool z : 1;
The vectorized extraction of y was:
vect__ifc__49.29_110 =
MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.27_108];
vect_patt_38.30_112 =
vect__ifc__49.29_110 & { 9223372036854775552, 9223372036854775552 };
vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
vect_patt_40.32_114 =
VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
This is incorrect: masking followed by a logical right shift zero-extends
the 55-bit field, so negative values of y are lost. This patch implements
it as:
vect__ifc__25.16_62 =
MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.14_60];
vect_patt_31.17_63 =
VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;
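The difference between the two sequences can be modelled on a single scalar lane. The following is only a sketch, not the vectorizer's code: it assumes y occupies bits 8..62 of a 64-bit container (as the mask and shift counts in the dumps suggest), the helper names pack, extract_mask_shift and extract_shl_sar are hypothetical, and it relies on right shifts of negative values being arithmetic (guaranteed since C++20, and GCC's behaviour before that).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack x (8 bits), y (55 bits) and z (1 bit) into one
// 64-bit container, assuming y sits in bits 8..62.
inline std::uint64_t pack(std::int64_t y, unsigned x, bool z)
{
  const std::uint64_t y_bits
    = static_cast<std::uint64_t>(y) & ((1ULL << 55) - 1);
  return (static_cast<std::uint64_t>(z) << 63) | (y_bits << 8) | (x & 0xffu);
}

// Old sequence: mask, logical shift right, then reinterpret as signed.
// The upper 9 bits of the result are always zero.
inline std::int64_t extract_mask_shift(std::uint64_t container)
{
  const std::uint64_t mask = 9223372036854775552ULL; // ((1 << 55) - 1) << 8
  return static_cast<std::int64_t>((container & mask) >> 8);
}

// New sequence: shift left by prec - bitsize - bitpos = 1 so the field's
// sign bit lands in bit 63, then shift the signed value right by
// prec - bitsize = 9, which replicates the sign bit.
inline std::int64_t extract_shl_sar(std::uint64_t container)
{
  const std::int64_t shifted = static_cast<std::int64_t>(container << 1);
  return shifted >> 9;
}

// Small self-check contrasting the two sequences for y = -3.
inline void self_check()
{
  const std::uint64_t c = pack(-3, 3, true);
  assert(extract_shl_sar(c) == -3);
  assert(extract_mask_shift(c) == (1LL << 55) - 3);
}
```

For non-negative values of y both sequences agree, which is why the problem only shows up once a negative value is stored in the bit-field.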
gcc/ChangeLog:
PR tree-optimization/110557
* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern):
Ensure the output is sign-extended if necessary.
gcc/testsuite/ChangeLog:
PR tree-optimization/110557
* g++.dg/vect/pr110557.cc: New test.
---
Bootstrapped and regtested on x86_64-linux-gnu. Ok for trunk and gcc-13
branch?
gcc/testsuite/g++.dg/vect/pr110557.cc | 37 +++++++++++++++++
gcc/tree-vect-patterns.cc | 58 ++++++++++++++++++++-------
2 files changed, 81 insertions(+), 14 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/vect/pr110557.cc
Comments
On Thu, Jul 6, 2023 at 6:18 PM Xi Ruoyao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> If a bit-field is signed and it's narrower than the output type, we must
> ensure the extracted result is sign-extended. But this was not handled
> correctly.
>
> For example:
>
> int x : 8;
> long y : 55;
> bool z : 1;
>
> The vectorized extraction of y was:
>
> vect__ifc__49.29_110 =
> MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.27_108];
> vect_patt_38.30_112 =
> vect__ifc__49.29_110 & { 9223372036854775552, 9223372036854775552 };
> vect_patt_39.31_113 = vect_patt_38.30_112 >> 8;
> vect_patt_40.32_114 =
> VIEW_CONVERT_EXPR<vector(2) long int>(vect_patt_39.31_113);
>
> This is obviously incorrect. This patch has implemented it as:
>
> vect__ifc__25.16_62 =
> MEM <vector(2) long unsigned int> [(struct Item *)vectp_a.14_60];
> vect_patt_31.17_63 =
> VIEW_CONVERT_EXPR<vector(2) long int>(vect__ifc__25.16_62);
> vect_patt_32.18_64 = vect_patt_31.17_63 << 1;
> vect_patt_33.19_65 = vect_patt_32.18_64 >> 9;
>
> gcc/ChangeLog:
>
> PR tree-optimization/110557
> * tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern):
> Ensure the output is sign-extended if necessary.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/110557
> * g++.dg/vect/pr110557.cc: New test.
> ---
>
> Bootstrapped and regtested on x86_64-linux-gnu. Ok for trunk and gcc-13
> branch?
>
> gcc/testsuite/g++.dg/vect/pr110557.cc | 37 +++++++++++++++++
> gcc/tree-vect-patterns.cc | 58 ++++++++++++++++++++-------
> 2 files changed, 81 insertions(+), 14 deletions(-)
> create mode 100644 gcc/testsuite/g++.dg/vect/pr110557.cc
>
> diff --git a/gcc/testsuite/g++.dg/vect/pr110557.cc b/gcc/testsuite/g++.dg/vect/pr110557.cc
> new file mode 100644
> index 00000000000..e1fbe1caac4
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/vect/pr110557.cc
> @@ -0,0 +1,37 @@
> +// { dg-additional-options "-mavx" { target { avx_runtime } } }
> +
> +static inline long
> +min (long a, long b)
> +{
> + return a < b ? a : b;
> +}
> +
> +struct Item
> +{
> + int x : 8;
> + long y : 55;
> + bool z : 1;
> +};
> +
> +__attribute__ ((noipa)) long
> +test (Item *a, int cnt)
> +{
> + long size = 0;
> + for (int i = 0; i < cnt; i++)
> + size = min ((long)a[i].y, size);
> + return size;
> +}
> +
> +int
> +main ()
> +{
> + struct Item items[] = {
> + { 1, -1 },
> + { 2, -2 },
> + { 3, -3 },
> + { 4, -4 },
> + };
> +
> + if (test (items, 4) != -4)
> + __builtin_trap ();
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 1bc36b043a0..20412c27ead 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -2566,7 +2566,7 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
> Widening with mask first, shift later:
> container = (type_out) container;
> masked = container & (((1 << bitsize) - 1) << bitpos);
> - result = patt2 >> masked;
> + result = masked >> bitpos;
>
> Widening with shift first, mask last:
> container = (type_out) container;
> @@ -2578,6 +2578,15 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
> result = masked >> bitpos;
> result = (type_out) result;
>
> + If the bitfield is signed and it's wider than type_out, we need to
> + keep the result sign-extended:
> + container = (type) container;
> + masked = container << (prec - bitsize - bitpos);
> + result = (type_out) (masked >> (prec - bitsize));
> +
> + Here type is the signed variant of the wider of type_out and the type
> + of container.
> +
> The shifting is always optional depending on whether bitpos != 0.
>
> */
> @@ -2636,14 +2645,22 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
> if (BYTES_BIG_ENDIAN)
> shift_n = prec - shift_n - mask_width;
>
> + bool sign_ext = (!TYPE_UNSIGNED (TREE_TYPE (bf_ref)) &&
> + TYPE_PRECISION (ret_type) > mask_width);
> + bool widening = ((TYPE_PRECISION (TREE_TYPE (container)) <
> + TYPE_PRECISION (ret_type))
> + && !useless_type_conversion_p (TREE_TYPE (container),
> + ret_type));
The !useless_type_conversion_p check isn't necessary: when the TYPE_PRECISIONs
aren't equal, the conversion is never useless.
I'll also note that ret_type == TREE_TYPE (bf_ref).
Can you rename 'widening' to 'load_widen' and 'sign_ext' to 'ref_sext'? As
named, they suggest that they apply to the same thing, so I originally thought
sign_ext should be widening && !TYPE_UNSIGNED.
Otherwise looks reasonable.
Thanks,
Richard.
> +
> /* We move the conversion earlier if the loaded type is smaller than the
> return type to enable the use of widening loads. */
> - if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
> - && !useless_type_conversion_p (TREE_TYPE (container), ret_type))
> + if (sign_ext || widening)
> {
> - pattern_stmt
> - = gimple_build_assign (vect_recog_temp_ssa_var (ret_type),
> - NOP_EXPR, container);
> + tree type = widening ? ret_type : container_type;
> + if (sign_ext)
> + type = gimple_signed_type (type);
> + pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type),
> + NOP_EXPR, container);
> container = gimple_get_lhs (pattern_stmt);
> container_type = TREE_TYPE (container);
> prec = tree_to_uhwi (TYPE_SIZE (container_type));
> @@ -2671,7 +2688,7 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
> shift_first = true;
>
> tree result;
> - if (shift_first)
> + if (shift_first && !sign_ext)
> {
> tree shifted = container;
> if (shift_n)
> @@ -2694,14 +2711,27 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
> }
> else
> {
> - tree mask = wide_int_to_tree (container_type,
> - wi::shifted_mask (shift_n, mask_width,
> - false, prec));
> - pattern_stmt
> - = gimple_build_assign (vect_recog_temp_ssa_var (container_type),
> - BIT_AND_EXPR, container, mask);
> - tree masked = gimple_assign_lhs (pattern_stmt);
> + tree temp = vect_recog_temp_ssa_var (container_type);
> + if (!sign_ext)
> + {
> + tree mask = wide_int_to_tree (container_type,
> + wi::shifted_mask (shift_n,
> + mask_width,
> + false, prec));
> + pattern_stmt = gimple_build_assign (temp, BIT_AND_EXPR,
> + container, mask);
> + }
> + else
> + {
> + HOST_WIDE_INT shl = prec - shift_n - mask_width;
> + shift_n += shl;
> + pattern_stmt = gimple_build_assign (temp, LSHIFT_EXPR,
> + container,
> + build_int_cst (sizetype,
> + shl));
> + }
>
> + tree masked = gimple_assign_lhs (pattern_stmt);
> append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
> pattern_stmt
> = gimple_build_assign (vect_recog_temp_ssa_var (container_type),
> --
> 2.41.0
>
On Fri, 2023-07-07 at 08:15 +0200, Richard Biener wrote:
/* snip */
> > + bool sign_ext = (!TYPE_UNSIGNED (TREE_TYPE (bf_ref)) &&
> > + TYPE_PRECISION (ret_type) > mask_width);
> > + bool widening = ((TYPE_PRECISION (TREE_TYPE (container)) <
> > + TYPE_PRECISION (ret_type))
> > + && !useless_type_conversion_p (TREE_TYPE (container),
> > + ret_type));
>
> The !useless_type_conversion_p check isn't necessary: when the TYPE_PRECISIONs
> aren't equal, the conversion is never useless.
I'll drop it.
> I'll also note that ret_type == TREE_TYPE (bf_ref).
No, ret_type == TREE_TYPE (ret), not TREE_TYPE (bf_ref). For something
like
struct Item
{
int x : 30;
int y : 30;
};
Item *p = get();
unsigned long t = p->y;
Then TREE_TYPE (ret) is unsigned long, and TREE_TYPE (bf_ref) is int.
In this case we still need to perform the sign extension: if p->y is -1
we should have -1ul in t. So we need to check the signedness of
TREE_TYPE (bf_ref).
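The source-level behaviour described here can be checked with a small stand-alone program. This is only a sketch: load_y and self_check_y are hypothetical names, and it assumes a plain int bit-field is signed (GCC's default, and required since C++14).

```cpp
#include <cassert>

struct Item
{
  int x : 30;
  int y : 30;
};

// The bit-field is read as a signed int first and only then converted to
// unsigned long, so a stored -1 must come out as all ones.
unsigned long load_y(const Item *p)
{
  return p->y;
}

inline void self_check_y()
{
  Item it = {0, -1};
  assert(load_y(&it) == -1ul);
}
```

The result type here is unsigned, so checking the signedness of the result type alone would miss the needed sign extension.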
> Can you rename 'widening' to 'load_widen' and 'sign_ext' to 'ref_sext'? As
> named, they suggest that they apply to the same thing, so I originally thought
> sign_ext should be widening && !TYPE_UNSIGNED.
I'll rename them.
I'll send a v2 after testing it.
>
new file mode 100644
@@ -0,0 +1,37 @@
+// { dg-additional-options "-mavx" { target { avx_runtime } } }
+
+static inline long
+min (long a, long b)
+{
+ return a < b ? a : b;
+}
+
+struct Item
+{
+ int x : 8;
+ long y : 55;
+ bool z : 1;
+};
+
+__attribute__ ((noipa)) long
+test (Item *a, int cnt)
+{
+ long size = 0;
+ for (int i = 0; i < cnt; i++)
+ size = min ((long)a[i].y, size);
+ return size;
+}
+
+int
+main ()
+{
+ struct Item items[] = {
+ { 1, -1 },
+ { 2, -2 },
+ { 3, -3 },
+ { 4, -4 },
+ };
+
+ if (test (items, 4) != -4)
+ __builtin_trap ();
+}
@@ -2566,7 +2566,7 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
Widening with mask first, shift later:
container = (type_out) container;
masked = container & (((1 << bitsize) - 1) << bitpos);
- result = patt2 >> masked;
+ result = masked >> bitpos;
Widening with shift first, mask last:
container = (type_out) container;
@@ -2578,6 +2578,15 @@ vect_recog_widen_sum_pattern (vec_info *vinfo,
result = masked >> bitpos;
result = (type_out) result;
+ If the bitfield is signed and it's wider than type_out, we need to
+ keep the result sign-extended:
+ container = (type) container;
+ masked = container << (prec - bitsize - bitpos);
+ result = (type_out) (masked >> (prec - bitsize));
+
+ Here type is the signed variant of the wider of type_out and the type
+ of container.
+
The shifting is always optional depending on whether bitpos != 0.
*/
@@ -2636,14 +2645,22 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
if (BYTES_BIG_ENDIAN)
shift_n = prec - shift_n - mask_width;
+ bool sign_ext = (!TYPE_UNSIGNED (TREE_TYPE (bf_ref)) &&
+ TYPE_PRECISION (ret_type) > mask_width);
+ bool widening = ((TYPE_PRECISION (TREE_TYPE (container)) <
+ TYPE_PRECISION (ret_type))
+ && !useless_type_conversion_p (TREE_TYPE (container),
+ ret_type));
+
/* We move the conversion earlier if the loaded type is smaller than the
return type to enable the use of widening loads. */
- if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
- && !useless_type_conversion_p (TREE_TYPE (container), ret_type))
+ if (sign_ext || widening)
{
- pattern_stmt
- = gimple_build_assign (vect_recog_temp_ssa_var (ret_type),
- NOP_EXPR, container);
+ tree type = widening ? ret_type : container_type;
+ if (sign_ext)
+ type = gimple_signed_type (type);
+ pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type),
+ NOP_EXPR, container);
container = gimple_get_lhs (pattern_stmt);
container_type = TREE_TYPE (container);
prec = tree_to_uhwi (TYPE_SIZE (container_type));
@@ -2671,7 +2688,7 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
shift_first = true;
tree result;
- if (shift_first)
+ if (shift_first && !sign_ext)
{
tree shifted = container;
if (shift_n)
@@ -2694,14 +2711,27 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
}
else
{
- tree mask = wide_int_to_tree (container_type,
- wi::shifted_mask (shift_n, mask_width,
- false, prec));
- pattern_stmt
- = gimple_build_assign (vect_recog_temp_ssa_var (container_type),
- BIT_AND_EXPR, container, mask);
- tree masked = gimple_assign_lhs (pattern_stmt);
+ tree temp = vect_recog_temp_ssa_var (container_type);
+ if (!sign_ext)
+ {
+ tree mask = wide_int_to_tree (container_type,
+ wi::shifted_mask (shift_n,
+ mask_width,
+ false, prec));
+ pattern_stmt = gimple_build_assign (temp, BIT_AND_EXPR,
+ container, mask);
+ }
+ else
+ {
+ HOST_WIDE_INT shl = prec - shift_n - mask_width;
+ shift_n += shl;
+ pattern_stmt = gimple_build_assign (temp, LSHIFT_EXPR,
+ container,
+ build_int_cst (sizetype,
+ shl));
+ }
+ tree masked = gimple_assign_lhs (pattern_stmt);
append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
pattern_stmt
= gimple_build_assign (vect_recog_temp_ssa_var (container_type),