[V2] SCCVN: Add LEN_MASK_STORE and fix LEN_STORE
Commit Message
From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
Hi, Richi. It seems that we can use nunits, which is len + bias, to drive the
iteration, and thereby simplify the code.
Also, I fixed the size computation for LEN_STORE:
Before this patch:
(len - bias) * BITS_PER_UNIT
After this patch:
(len + bias) * BITS_PER_UNIT
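To make the fix concrete, here is a worked example (the values are
hypothetical; the LEN_STORE bias operand is always 0 or -1, and
BITS_PER_UNIT is 8 on typical targets):

  /* len = 16 with bias = -1 means 15 bytes are actually stored.  */
  unsigned HOST_WIDE_INT len = 16;
  HOST_WIDE_INT bias = -1;
  /* Before: (16 - (-1)) * 8 = 136 bits, i.e. 17 bytes -- two bytes more
     than the store writes, so SCCVN could treat unwritten bytes as
     defined by this store.  */
  /* After:  (16 + (-1)) * 8 = 120 bits, matching the 15 stored bytes.  */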
gcc/ChangeLog:
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Add LEN_MASK_STORE and fix LEN_STORE.
---
gcc/tree-ssa-sccvn.cc | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
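For context, the statement this patch teaches SCCVN about looks roughly
like the following gimple call (operand order illustrative, except that
the patch reads len from call argument 2 and bias from argument 5):

  .LEN_MASK_STORE (&a, 32B, len, ..., bias);

When len, bias and the mask are all compile-time constants,
vn_reference_lookup_3 can treat the call as a partial definition of
(len + bias) elements of a, gated by the mask, and value-number later
loads from a through it.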
Comments
On Mon, 26 Jun 2023, juzhe.zhong@rivai.ai wrote:
> [...]
> @@ -3344,6 +3354,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
> tree vectype = TREE_TYPE (def_rhs);
> unsigned HOST_WIDE_INT elsz
> = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (vectype)));
> + poly_uint64 nunits;
> + if (len)
> + nunits = tree_to_uhwi (len) + tree_to_shwi (bias);
> + else
> + nunits = TYPE_VECTOR_SUBPARTS (vectype);
Are the _LEN ifns accessible via intrinsics as well? If so I think
we should use MIN (nunits, len + bias) here as otherwise we risk
out-of-bounds accesses.
Otherwise looks good to me.
Thanks,
Richard.
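A minimal sketch of the clamp suggested here (hypothetical, not the
committed form), using only values the patch already computes:

  /* If user-visible intrinsics can pass a LEN larger than the vector
     length, trusting len + bias alone could walk mask elements that
     do not exist, so bound the iteration count by the vector length.  */
  poly_uint64 nunits = TYPE_VECTOR_SUBPARTS (vectype);
  if (len)
    {
      unsigned HOST_WIDE_INT limit
        = tree_to_uhwi (len) + tree_to_shwi (bias);
      if (known_le (limit, nunits))
        nunits = limit;
    }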
> [...]
Hi, Richi. I am wondering whether it is true that we can use
TYPE_VECTOR_SUBPARTS (vectype).to_constant () here?
Thanks.
juzhe.zhong@rivai.ai
From: Richard Biener
Date: 2023-06-26 19:18
To: Ju-Zhe Zhong
CC: gcc-patches; richard.sandiford
Subject: Re: [PATCH V2] SCCVN: Add LEN_MASK_STORE and fix LEN_STORE
> [...]
> Are the _LEN ifns accessible via intrinsics as well? If so I think
> we should use MIN (nunits, len + bias) here as otherwise we risk
> out-of-bounds accesses.
>
> Otherwise looks good to me.
>
> Thanks,
> Richard.
> [...]
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
> HRB 36809 (AG Nuernberg)
On Mon, 26 Jun 2023, juzhe.zhong@rivai.ai wrote:
> Hi, Richi. I am wondering whether it is true that we can use
> TYPE_VECTOR_SUBPARTS (vectype).to_constant () here?
Not necessarily.
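Presumably "not necessarily" because vectype can be a variable-length
vector (e.g. with SVE or RVV VLA modes), where TYPE_VECTOR_SUBPARTS is a
non-constant poly_uint64 and calling .to_constant () on it would fail a
checking assert. A sketch of the usual guarded pattern (illustrative,
not from this thread):

  unsigned HOST_WIDE_INT const_nunits;
  if (TYPE_VECTOR_SUBPARTS (vectype).is_constant (&const_nunits))
    {
      /* Fixed-length vector: const_nunits is a plain integer.  */
    }
  else
    {
      /* VLA vector: the element count is only known symbolically, so
         keep using poly_uint64 with known_lt / known_le.  */
    }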
> [...]
Patch
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 11061a374a2..228ec117ff3 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3304,6 +3304,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
return (void *)-1;
break;
+ case IFN_LEN_MASK_STORE:
+ len = gimple_call_arg (call, 2);
+ bias = gimple_call_arg (call, 5);
+ if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
+ return (void *)-1;
+ mask = gimple_call_arg (call, internal_fn_mask_index (fn));
+ mask = vn_valueize (mask);
+ if (TREE_CODE (mask) != VECTOR_CST)
+ return (void *)-1;
+ break;
default:
return (void *)-1;
}
@@ -3344,6 +3354,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
tree vectype = TREE_TYPE (def_rhs);
unsigned HOST_WIDE_INT elsz
= tree_to_uhwi (TYPE_SIZE (TREE_TYPE (vectype)));
+ poly_uint64 nunits;
+ if (len)
+ nunits = tree_to_uhwi (len) + tree_to_shwi (bias);
+ else
+ nunits = TYPE_VECTOR_SUBPARTS (vectype);
if (mask)
{
HOST_WIDE_INT start = 0, length = 0;
@@ -3373,7 +3388,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
length += elsz;
mask_idx++;
}
- while (known_lt (mask_idx, TYPE_VECTOR_SUBPARTS (vectype)));
+ while (known_lt (mask_idx, nunits));
if (length != 0)
{
pd.rhs_off = start;
@@ -3389,7 +3404,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
{
pd.offset = offset2i;
pd.size = (tree_to_uhwi (len)
- + -tree_to_shwi (bias)) * BITS_PER_UNIT;
+ + tree_to_shwi (bias)) * BITS_PER_UNIT;
if (BYTES_BIG_ENDIAN)
pd.rhs_off = pd.size - tree_to_uhwi (TYPE_SIZE (vectype));
else
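For reference, a standalone model of the mask walk in the hunk above
(illustrative C, not the GCC code; the mask contents and element size
are hypothetical):

  #include <stdio.h>

  int main (void)
  {
    int mask[] = { 1, 1, 0, 1 }; /* stand-in for the VECTOR_CST mask */
    int nunits = 4;              /* len + bias for a LEN_MASK_STORE  */
    int elsz = 32;               /* element size in bits             */

    /* Coalesce adjacent stored elements into bit ranges, the same way
       the loop above builds partial definitions from the mask.  */
    int start = 0, length = 0;
    for (int i = 0; i < nunits; i++)
      {
        if (!mask[i])
          {
            if (length != 0)
              printf ("defined bits [%d, %d)\n", start, start + length);
            start = (i + 1) * elsz;
            length = 0;
          }
        else
          length += elsz;
      }
    if (length != 0)
      printf ("defined bits [%d, %d)\n", start, start + length);
    return 0;
  }

With the hypothetical mask above this prints the two ranges [0, 64) and
[96, 128): only the elements whose mask bits are nonzero are treated as
defined by the store.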