rs6000: Enable const_anchor for 'addi'

Message ID 20221014031748.55813-1-guojiufu@linux.ibm.com
State Accepted

Checks

Context                Check    Description
snail/gcc-patch-check  success  Github commit url

Commit Message

Jiufu Guo Oct. 14, 2022, 3:17 a.m. UTC
Hi,

cse.cc provides a const_anchor facility, which can generate a new
constant by adding a small offset to an existing constant.  For
example:

void __attribute__ ((noinline)) foo (long long *a)
{
  *a++ = 0x2351847027482577LL;
  *a++ = 0x2351847027482578LL;
}

The second constant (0x2351847027482578LL) can be computed by adding '1'
to the first constant (0x2351847027482577LL).
This is profitable if more than one instruction is needed to build the
second constant.
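
A minimal sketch of the idea, not of how cse.cc actually looks constants up: the check below asks whether one constant can be formed from another by adding a signed 16-bit immediate, which is the range a single PowerPC 'addi' accepts. The helper name `reachable_by_small_add` is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for illustration only: returns true when TARGET can be
   formed from BASE with one add of a signed 16-bit immediate, i.e. when the
   difference lies in [-0x8000, 0x7fff].  */
static bool
reachable_by_small_add (int64_t base, int64_t target)
{
  /* Subtract in unsigned arithmetic so that wraparound is well defined.  */
  uint64_t diff = (uint64_t) target - (uint64_t) base;

  /* Adding 0x8000 maps the range [-0x8000, 0x7fff] onto [0, 0xffff].  */
  return diff + 0x8000u < 0x10000u;
}
```

For the two constants in foo, `reachable_by_small_add (0x2351847027482577LL, 0x2351847027482578LL)` is true, since the difference is 1.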

* For rs6000, we can enable this functionality, because the 'addi'
instruction does exactly this when the gap is smaller than 0x8000.

* Besides enabling TARGET_CONST_ANCHOR on rs6000, this patch also fixes
one issue:
"gcc_assert (SCALAR_INT_MODE_P (mode))" is a requirement of the function
"try_const_anchors", but cse_insn can reach it with other modes.  For
example, there is no need to check const_anchor for {[%1:DI]=0;}, which
is in BLK mode.  "SCALAR_INT_MODE_P (mode)" is already checked when
invoking insert_const_anchors.
So this patch adds the same check before calling try_const_anchors.

* One potential side effect of this patch:
Compared with
"r101=0x2351847027482577LL
...
r201=0x2351847027482578LL"
the new r201 will be "r201=r101+1", so r101 will live longer, which
could increase register pressure during register allocation.
But I feel this is acceptable for the const_anchor feature.

* With this patch, I checked the performance change on SPEC2017.  The
change is not significant, since this functionality is not hit on any
hot path.  There is some runtime fluctuation/noise (e.g. on
povray_r/xalancbmk_r/xz_r) that is not caused by the patch.

With this patch, I also checked the changes in object files (from
GCC bootstrap and SPEC).  The significant change is the improvement
of a single "addi" replacing "2 or more insns: lis+or..".  The patch
also exposes opportunities for other optimizations, such as
combine/jump2.  In a few cases, code to store/load one more register
also appears, but it does not impact overall performance.

* To refine this patch, I referred to some earlier discussions:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33699
https://gcc.gnu.org/pipermail/gcc-patches/2009-April/260421.html
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566744.html


Bootstrap and regtest pass on ppc64 and ppc64le for this patch.
Is this ok for trunk?


BR,
Jeff (Jiufu)

gcc/ChangeLog:

	* config/rs6000/rs6000.cc (TARGET_CONST_ANCHOR): New define.
	* cse.cc (cse_insn): Add guard condition.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/const_anchors.c: New test.
	* gcc.target/powerpc/try_const_anchors_ice.c: New test.

---
 gcc/config/rs6000/rs6000.cc                   |  4 ++++
 gcc/cse.cc                                    |  3 ++-
 .../gcc.target/powerpc/const_anchors.c        | 20 +++++++++++++++++++
 .../powerpc/try_const_anchors_ice.c           | 16 +++++++++++++++
 4 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/const_anchors.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c
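
To make the role of the 0x8000 value more concrete, here is a rough sketch of how a constant can be split into an anchor plus an offset. It is modeled on a reading of compute_const_anchors in cse.cc and should be treated as an assumption rather than a copy of that code; the `CONST_ANCHOR` macro, the `struct anchors` type and the `compute_anchors` name are all hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the TARGET_CONST_ANCHOR value chosen for rs6000.  */
#define CONST_ANCHOR 0x8000

/* Hypothetical result type: the nearest anchors below and above a constant,
   and the constant's signed offset from each of them.  */
struct anchors
{
  int64_t lower_base, lower_offs;
  int64_t upper_base, upper_offs;
};

/* Split N into anchor + offset pairs.  Returns false when N is itself a
   multiple of CONST_ANCHOR; in that case only lower_base is filled in.  */
static bool
compute_anchors (int64_t n, struct anchors *out)
{
  /* Work in unsigned arithmetic so that rounding up cannot overflow.  */
  uint64_t mask = ~(uint64_t) (CONST_ANCHOR - 1);
  uint64_t un = (uint64_t) n;
  uint64_t lower = un & mask;
  uint64_t upper = (un + (CONST_ANCHOR - 1)) & mask;

  out->lower_base = (int64_t) lower;
  if (lower == un)
    return false;

  out->upper_base = (int64_t) upper;
  out->lower_offs = (int64_t) (un - lower);
  out->upper_offs = (int64_t) (un - upper);
  return true;
}
```

Under these assumptions, both 0x2351847027482577 and 0x2351847027482578 share the lower anchor 0x2351847027480000, with offsets 0x2577 and 0x2578, so the second constant differs from the first by 1 relative to the same anchor.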
  

Comments

Jiufu Guo Nov. 9, 2022, 3:18 a.m. UTC | #1
Hi,

I would like to ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603530.html

BR,
Jeff(Jiufu)

Jiufu Guo <guojiufu@linux.ibm.com> writes:

> * One potential side effect of this patch:
> Comparing with
> "r101=0x2351847027482577LL
> ...
> r201=0x2351847027482578LL"
> The new r201 will be "r201=r101+1", and then r101 will live longer,
> and would increase pressure when allocating registers.
After r201 is changed to "r201=r101+1", r101 will live longer, and
r201 depends on r101.  This would be the major concern for this patch,
I guess.
> But I feel, this would be acceptable for this const_anchor feature.
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index d2743f7bce6..80cded6dec1 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1760,6 +1760,10 @@ static const struct attribute_spec rs6000_attribute_table[] =
>  
>  #undef TARGET_UPDATE_IPA_FN_TARGET_INFO
>  #define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
> +
> +#undef TARGET_CONST_ANCHOR
> +#define TARGET_CONST_ANCHOR 0x8000
> +
>  
>  
>  /* Processor table.  */
> diff --git a/gcc/cse.cc b/gcc/cse.cc
> index b13afd4ba72..56542b91c1e 100644
> --- a/gcc/cse.cc
> +++ b/gcc/cse.cc
> @@ -5005,7 +5005,8 @@ cse_insn (rtx_insn *insn)
>        if (targetm.const_anchor
>  	  && !src_related
>  	  && src_const
> -	  && GET_CODE (src_const) == CONST_INT)
> +	  && GET_CODE (src_const) == CONST_INT
> +	  && SCALAR_INT_MODE_P (mode))
>  	{
>  	  src_related = try_const_anchors (src_const, mode);
>  	  src_related_is_const_anchor = src_related != NULL_RTX;
> diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
> new file mode 100644
> index 00000000000..39958ff9765
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
> @@ -0,0 +1,20 @@
> +/* { dg-do compile { target has_arch_ppc64 } } */
> +/* { dg-options "-O2" } */
> +
> +#define C1 0x2351847027482577ULL
> +#define C2 0x2351847027482578ULL
> +
> +void __attribute__ ((noinline)) foo (long long *a)
> +{
> +  *a++ = C1;
> +  *a++ = C2;
> +}
> +
> +void __attribute__ ((noinline)) foo1 (long long *a, long long b)
> +{
> +  *a++ = C1;
> +  if (b)
> +    *a++ = C2;
> +}
> +
> +/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c b/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c
> new file mode 100644
> index 00000000000..4c8a892e803
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* __builtin_stack_restore could generate {[%1:DI]=0;} in BLK mode,
> +   which could cause an ICE in try_const_anchors (SCALAR_INT only).  */
> +
> +long
> +foo (const int val)
> +{
> +  if (val == (0))
> +    return 0;
> +  void *p = __builtin_stack_save ();
> +  char c = val;
> +  __builtin_stack_restore (p);
> +  return c;
> +}
  
Jiufu Guo April 26, 2023, 5:39 a.m. UTC | #2
Hi,

I think we may be able to enable this patch in stage1, so I am pinging it.
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603530.html

BR,
Jeff (Jiufu)

  
Jiufu Guo May 17, 2023, 6:47 a.m. UTC | #3
Gentle ping...

  
Jiufu Guo May 31, 2023, 3 a.m. UTC | #4
Gentle ping...

  
David Edelsohn May 31, 2023, 1:40 p.m. UTC | #5
On Tue, May 30, 2023 at 11:00 PM Jiufu Guo <guojiufu@linux.ibm.com> wrote:

>
> Gentle ping...
>
> Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
> > Gentle ping...
> >
> > Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >
> >> Hi,
> >>
> >> I'm thinking that we may enable this patch for stage1, so ping it.
> >> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603530.html
> >>
> >> BR,
> >> Jeff (Jiufu)
> >>
> >> Jiufu Guo <guojiufu@linux.ibm.com> writes:
> >>
> >>> Hi,
> >>>
> >>> There is a functionality as const_anchor in cse.cc.  This const_anchor
> >>> supports to generate new constants through adding small gap/offsets to
> >>> existing constant.  For example:
> >>>
> >>> void __attribute__ ((noinline)) foo (long long *a)
> >>> {
> >>>   *a++ = 0x2351847027482577LL;
> >>>   *a++ = 0x2351847027482578LL;
> >>> }
> >>> The second constant (0x2351847027482578LL) can be compated by adding
> '1'
> >>> to the first constant (0x2351847027482577LL).
> >>> This is profitable if more than one instructions are need to build the
> >>> second constant.
> >>>
> >>> * For rs6000, we can enable this functionality, as the instruction
> >>> 'addi' is just for this when gap is smaller than 0x8000.
> >>>
> >>> * Besides enabling TARGET_CONST_ANCHOR on rs6000, this patch also fixed
> >>> one issue. The issue is:
> >>> "gcc_assert (SCALAR_INT_MODE_P (mode))" is an requirement for function
> >>> "try_const_anchors".
> >>>
> >>> * One potential side effect of this patch:
> >>> Comparing with
> >>> "r101=0x2351847027482577LL
> >>> ...
> >>> r201=0x2351847027482578LL"
> >>> The new r201 will be "r201=r101+1", and then r101 will live longer,
> >>> and would increase pressure when allocating registers.
> >>> But I feel, this would be acceptable for this const_anchor feature.
> >>>
> >>> * With this patch, I checked the performance change on SPEC2017, while,
> >>> and the performance is not aggressive, since this functionality is not
> >>> hit on any hot path. There are runtime wavings/noise(e.g. on
> >>> povray_r/xalancbmk_r/xz_r), that are not caused by the patch.
> >>>
> >>> With this patch, I also checked the changes in object files (from
> >>> GCC bootstrap and SPEC), the significant changes are the improvement
> >>> that: "addi" vs. "2 or more insns: lis+or.."; it also exposes some
> >>> other optimizations opportunities: like combine/jump2. While the
> >>> code to store/load one more register is also occurring in few cases,
> >>> but it does not impact overall performance.
> >>>
> >>> * To refine this patch, some history discussions are referenced:
> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33699
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2009-April/260421.html
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566744.html
> >>>
> >>>
> >>> Bootstrap and regtest pass on ppc64 and ppc64le for this patch.
> >>> Is this ok for trunk?
>

Hi, Jiufu

Thanks for developing this patch and your persistence.

The rs6000.cc part of the patch (TARGET_CONST_ANCHOR) is okay for Stage 1.
This is approved.

I don't have the authority to approve the change to cse_insn.  Is the
cse_insn change a prerequisite?  Will the rs6000 change break or produce
wrong code without the cse change?  The second part of the patch should be
posted separately to the mailing list, with a cc for appropriate
maintainers, because most maintainers will not be following this specific
thread to approve the other part of the patch.

Thanks, David


  
Jiufu Guo June 2, 2023, 4:03 a.m. UTC | #6
Hi David,

Thanks!

David Edelsohn <dje.gcc@gmail.com> writes:

> On Tue, May 30, 2023 at 11:00 PM Jiufu Guo <guojiufu@linux.ibm.com> wrote:
>
>  Gentle ping...
>
>  Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>
>  > Gentle ping...
>  >
>  > Jiufu Guo via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>  >
>  >> Hi,
>  >>
>  >> I'm thinking that we may enable this patch for stage1, so ping it.
>  >> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603530.html
>  >>
>  >> BR,
>  >> Jeff (Jiufu)
>  >>
>  >> Jiufu Guo <guojiufu@linux.ibm.com> writes:
>  >>
>  >>> Hi,
>  >>>
>  >>> cse.cc provides a feature called const_anchor.  It allows a new
>  >>> constant to be generated by adding a small gap/offset to an
>  >>> existing constant.  For example:
>  >>>
>  >>> void __attribute__ ((noinline)) foo (long long *a)
>  >>> {
>  >>>   *a++ = 0x2351847027482577LL;
>  >>>   *a++ = 0x2351847027482578LL;
>  >>> }
>  >>> The second constant (0x2351847027482578LL) can be computed by adding '1'
>  >>> to the first constant (0x2351847027482577LL).
>  >>> This is profitable if more than one instruction is needed to build the
>  >>> second constant.
>  >>>
>  >>> * For rs6000, we can enable this functionality, because the 'addi'
>  >>> instruction does exactly this when the gap is smaller than 0x8000.
>  >>>
>  >>> * Besides enabling TARGET_CONST_ANCHOR on rs6000, this patch also fixes
>  >>> one issue:
>  >>> "gcc_assert (SCALAR_INT_MODE_P (mode))" is a requirement of the function
>  >>> "try_const_anchors", so this patch adds that check before calling it.
>  >>>
>  >>> * One potential side effect of this patch:
>  >>> Compared with
>  >>> "r101=0x2351847027482577LL
>  >>> ...
>  >>> r201=0x2351847027482578LL"
>  >>> the new r201 will be "r201=r101+1", so r101 will live longer,
>  >>> which may increase pressure when allocating registers.
>  >>> But I feel this is acceptable for the const_anchor feature.
>  >>>
>  >>> * With this patch, I checked the performance on SPEC2017.  The
>  >>> change is not significant, since this functionality is not
>  >>> hit on any hot path.  There are run-time fluctuations/noise (e.g. on
>  >>> povray_r/xalancbmk_r/xz_r) that are not caused by the patch.
>  >>>
>  >>> With this patch, I also checked the changes in object files (from
>  >>> GCC bootstrap and SPEC).  The significant change is the improvement
>  >>> of "addi" vs. "2 or more insns: lis+or..".  It also exposes some
>  >>> other optimization opportunities, e.g. in combine/jump2.  Code
>  >>> that stores/loads one more register also occurs in a few cases,
>  >>> but it does not impact overall performance.
>  >>>
>  >>> * In refining this patch, I referred to some earlier discussions:
>  >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33699
>  >>> https://gcc.gnu.org/pipermail/gcc-patches/2009-April/260421.html
>  >>> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566744.html
>  >>>
>  >>>
>  >>> Bootstrap and regtest pass on ppc64 and ppc64le for this patch.
>  >>> Is this ok for trunk?
>
> Hi, Jiufu
>
> Thanks for developing this patch and your persistence.
>
> The rs6000.cc part of the patch (TARGET_CONST_ANCHOR) is okay for
> Stage 1.  This is approved. 
>
> I don't have the authority to approve the change to cse_insn.  Is the
> cse_insn change a prerequisite?  Will the rs6000 change break or
> produce wrong code 
> without the cse change?  The second part of the patch should be posted
> separately to the mailing list, with a cc for appropriate maintainers,
> because most maintainers will not be following this specific thread
> to approve the other part of the patch.

I will extract the cse part as a separate patch.
Yes, the cse part is a prerequisite: the bug could be exposed by the
rs6000 part of the change.

BR,
Jeff (Jiufu Guo)

>
> Thanks, David
>  
  
Jiufu Guo June 19, 2023, 3:38 a.m. UTC | #7
Hi!

David Edelsohn <dje.gcc@gmail.com> writes:

> Hi, Jiufu
>
> Thanks for developing this patch and your persistence.
>
> The rs6000.cc part of the patch (TARGET_CONST_ANCHOR) is okay for
> Stage 1.  This is approved.

Pushed as r14-1919-g41f42d120c4a66.  Thanks!

BR,
Jeff (Jiufu Guo)

>
> I don't have the authority to approve the change to cse_insn.  Is the cse_insn change a prerequisite?  Will the rs6000 change break or produce wrong
> code without the cse change?  The second part of the patch should be posted separately to the mailing list, with a cc for appropriate maintainers,
> because most maintainers will not be following this specific thread to approve the other part of the patch.
>
> Thanks, David
>  
  

Patch

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d2743f7bce6..80cded6dec1 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -1760,6 +1760,10 @@  static const struct attribute_spec rs6000_attribute_table[] =
 
 #undef TARGET_UPDATE_IPA_FN_TARGET_INFO
 #define TARGET_UPDATE_IPA_FN_TARGET_INFO rs6000_update_ipa_fn_target_info
+
+#undef TARGET_CONST_ANCHOR
+#define TARGET_CONST_ANCHOR 0x8000
+
 
 
 /* Processor table.  */
diff --git a/gcc/cse.cc b/gcc/cse.cc
index b13afd4ba72..56542b91c1e 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -5005,7 +5005,8 @@  cse_insn (rtx_insn *insn)
       if (targetm.const_anchor
 	  && !src_related
 	  && src_const
-	  && GET_CODE (src_const) == CONST_INT)
+	  && GET_CODE (src_const) == CONST_INT
+	  && SCALAR_INT_MODE_P (mode))
 	{
 	  src_related = try_const_anchors (src_const, mode);
 	  src_related_is_const_anchor = src_related != NULL_RTX;
diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
new file mode 100644
index 00000000000..39958ff9765
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile { target has_arch_ppc64 } } */
+/* { dg-options "-O2" } */
+
+#define C1 0x2351847027482577ULL
+#define C2 0x2351847027482578ULL
+
+void __attribute__ ((noinline)) foo (long long *a)
+{
+  *a++ = C1;
+  *a++ = C2;
+}
+
+void __attribute__ ((noinline)) foo1 (long long *a, long long b)
+{
+  *a++ = C1;
+  if (b)
+    *a++ = C2;
+}
+
+/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c b/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c
new file mode 100644
index 00000000000..4c8a892e803
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/try_const_anchors_ice.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* __builtin_stack_restore could generate {[%1:DI]=0;} in BLK mode,
+   which could cause an ICE in try_const_anchors, which only supports SCALAR_INT.  */
+
+long
+foo (const int val)
+{
+  if (val == (0))
+    return 0;
+  void *p = __builtin_stack_save ();
+  char c = val;
+  __builtin_stack_restore (p);
+  return c;
+}
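
For reference, the rounding behind TARGET_CONST_ANCHOR can be modelled outside GCC.  The following is a minimal sketch, not the cse.cc implementation: it assumes the anchor size is a power of two and that each constant gets a lower and an upper anchor by rounding down and up to a multiple of that size.  The names model_compute_anchors, fits_addi_immediate and struct anchor_pair are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the two candidate anchors around a constant
   and the signed offsets from each anchor back to that constant.  */
struct anchor_pair
{
  uint64_t lower_base;
  int64_t lower_offs;
  uint64_t upper_base;
  int64_t upper_offs;
};

/* Round N down and up to a multiple of ANCHOR (assumed to be a power of
   two, e.g. 0x8000 as in the patch).  Returns false when N is already a
   multiple of ANCHOR, in which case there is no offset to record.  */
static bool
model_compute_anchors (uint64_t n, uint64_t anchor, struct anchor_pair *out)
{
  assert (anchor != 0 && (anchor & (anchor - 1)) == 0);

  out->lower_base = n & ~(anchor - 1);
  if (out->lower_base == n)
    return false;

  out->upper_base = (n + (anchor - 1)) & ~(anchor - 1);
  out->lower_offs = (int64_t) (n - out->lower_base);
  out->upper_offs = (int64_t) (n - out->upper_base);
  return true;
}

/* addi takes a signed 16-bit immediate, so an offset is usable only if
   it lies in [-0x8000, 0x7fff].  */
static bool
fits_addi_immediate (int64_t offs)
{
  return offs >= -0x8000 && offs <= 0x7fff;
}
```

With an anchor size of 0x8000, both constants from const_anchors.c round down to the same lower anchor, 0x2351847027480000, and their offsets from it differ by 1, which fits the signed 16-bit immediate of addi.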