[v3] RISC-V: elide unnecessary sign extend when expanding cmp_and_jump

Message ID 20231030032122.757369-1-vineetg@rivosinc.com
State Accepted
Headers
Series [v3] RISC-V: elide unnecessary sign extend when expanding cmp_and_jump |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Vineet Gupta Oct. 30, 2023, 3:21 a.m. UTC
  RV64 compare and branch instructions only support 64-bit operands.
At Expand time, the backend conservatively zero/sign extends
its operands even if not needed, such as incoming 32-bit function args
which ABI/ISA guarantee to be sign-extended already.

And subsequently REE fails to eliminate them as
   "missing defintion(s)" or "multiple definition(s)
since function args don't have explicit definition.

So during expand riscv_extend_comparands (), if an operand is a
subreg-promoted SI with inner DI, which is representative of a function
arg, just peel away the subreg to expose the DI, eliding the sign
extension. As Jeff noted this routine is also used in if-conversion so
also helps there.

Note there's currently patches floating around to improve REE and also a
new pass to eliminate unneccesary extensions, but it is still beneficial
to not generate those extra extensions in first place. It is obviously
less work for post-reload passes such as REE, but even for earlier
passes, such as combine, having to deal with one less thing and ensuing
fewer combinations is a win too.

Way too many existing tests used to observe this issue.
e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc
It elimiates the SEXT.W

Tested with rv64gc with no regressions, I'm relying on PAtrick's
pre-commit CI to do the full testing.

gcc/ChangeLog:
	* config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New.
	* (riscv_extend_comparands): Call New function on operands.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
Changes since v2:
  - Fix linting issues flagged by pre-commit CI
Changes since v1:
  - Elide sign extension for 32-bit operarnds only
  - Apply elison for both arguments
---
 gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
  

Comments

Patrick O'Neill Oct. 30, 2023, 4:43 p.m. UTC | #1
On 10/29/23 20:21, Vineet Gupta wrote:
> RV64 compare and branch instructions only support 64-bit operands.
> At Expand time, the backend conservatively zero/sign extends
> its operands even if not needed, such as incoming 32-bit function args
> which ABI/ISA guarantee to be sign-extended already.
>
> And subsequently REE fails to eliminate them as
>     "missing defintion(s)" or "multiple definition(s)
> since function args don't have explicit definition.
>
> So during expand riscv_extend_comparands (), if an operand is a
> subreg-promoted SI with inner DI, which is representative of a function
> arg, just peel away the subreg to expose the DI, eliding the sign
> extension. As Jeff noted this routine is also used in if-conversion so
> also helps there.
>
> Note there's currently patches floating around to improve REE and also a
> new pass to eliminate unneccesary extensions, but it is still beneficial
> to not generate those extra extensions in first place. It is obviously
> less work for post-reload passes such as REE, but even for earlier
> passes, such as combine, having to deal with one less thing and ensuing
> fewer combinations is a win too.
>
> Way too many existing tests used to observe this issue.
> e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc
> It elimiates the SEXT.W
>
> Tested with rv64gc with no regressions, I'm relying on PAtrick's
> pre-commit CI to do the full testing.
Testing on the pre-commit CI has completed.
https://github.com/ewlu/gcc-precommit-ci/issues/499#issuecomment-1784446631

The patch was applied to this baseline:
https://github.com/gcc-mirror/gcc/commit/c6929b085580cf00cbc52b0f5b0afe2b9caa2a22

and no new failures or resolved failures were found when running the 
testsuite.

Tested-by: Patrick O'Neill <patrick@rivosinc.com>

Thanks!
Patrick
>
> gcc/ChangeLog:
> 	* config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New.
> 	* (riscv_extend_comparands): Call New function on operands.
>
> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
> ---
> Changes since v2:
>    - Fix linting issues flagged by pre-commit CI
> Changes since v1:
>    - Elide sign extension for 32-bit operarnds only
>    - Apply elison for both arguments
> ---
>   gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++--
>   1 file changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index ca9a2ca81d53..269beb3b159b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1)
>   		       cmp0, cmp1, 0, 0, OPTAB_DIRECT);
>   }
>   
> +/* Helper function for riscv_extend_comparands to Sign-extend the OP.
> +   However if the OP is SI subreg promoted with an inner DI, such as
> +       (subreg/s/v:SI (reg/v:DI) 0)
> +   just peel off the SUBREG to get DI, avoiding extraneous extension.  */
> +
> +static void
> +riscv_sign_extend_if_not_subreg_prom (rtx *op)
> +{
> +  if (GET_MODE (*op) == SImode
> +      && GET_CODE (*op) == SUBREG
> +      && SUBREG_PROMOTED_VAR_P (*op)
> +      && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant ()
> +	 == GET_MODE_SIZE (word_mode))
> +    *op = XEXP (*op, 0);
> +  else
> +    *op = gen_rtx_SIGN_EXTEND (word_mode, *op);
> +}
> +
>   /* Sign- or zero-extend OP0 and OP1 for integer comparisons.  */
>   
>   static void
> @@ -3707,9 +3725,10 @@ riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1)
>   	}
>         else
>   	{
> -	  *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0);
> +	  riscv_sign_extend_if_not_subreg_prom (op0);
> +
>   	  if (*op1 != const0_rtx)
> -	    *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1);
> +	    riscv_sign_extend_if_not_subreg_prom (op1);
>   	}
>       }
>   }
  
Jeff Law Oct. 30, 2023, 8:33 p.m. UTC | #2
On 10/29/23 21:21, Vineet Gupta wrote:
> RV64 compare and branch instructions only support 64-bit operands.
> At Expand time, the backend conservatively zero/sign extends
> its operands even if not needed, such as incoming 32-bit function args
> which ABI/ISA guarantee to be sign-extended already.
> 
> And subsequently REE fails to eliminate them as
>     "missing defintion(s)" or "multiple definition(s)
> since function args don't have explicit definition.
> 
> So during expand riscv_extend_comparands (), if an operand is a
> subreg-promoted SI with inner DI, which is representative of a function
> arg, just peel away the subreg to expose the DI, eliding the sign
> extension. As Jeff noted this routine is also used in if-conversion so
> also helps there.
> 
> Note there's currently patches floating around to improve REE and also a
> new pass to eliminate unneccesary extensions, but it is still beneficial
> to not generate those extra extensions in first place. It is obviously
> less work for post-reload passes such as REE, but even for earlier
> passes, such as combine, having to deal with one less thing and ensuing
> fewer combinations is a win too.
> 
> Way too many existing tests used to observe this issue.
> e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc
> It elimiates the SEXT.W
> 
> Tested with rv64gc with no regressions, I'm relying on PAtrick's
> pre-commit CI to do the full testing.
> 
> gcc/ChangeLog:
> 	* config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New.
> 	* (riscv_extend_comparands): Call New function on operands.
> 
> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
> ---
> Changes since v2:
>    - Fix linting issues flagged by pre-commit CI
> Changes since v1:
>    - Elide sign extension for 32-bit operarnds only
>    - Apply elison for both arguments
> ---
>   gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++--
>   1 file changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index ca9a2ca81d53..269beb3b159b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1)
>   		       cmp0, cmp1, 0, 0, OPTAB_DIRECT);
>   }
>   
> +/* Helper function for riscv_extend_comparands to Sign-extend the OP.
> +   However if the OP is SI subreg promoted with an inner DI, such as
> +       (subreg/s/v:SI (reg/v:DI) 0
> +   just peel off the SUBREG to get DI, avoiding extraneous extension.  */
> +
> +static void
> +riscv_sign_extend_if_not_subreg_prom (rtx *op)
> +{
> +  if (GET_MODE (*op) == SImode
> +      && GET_CODE (*op) == SUBREG
> +      && SUBREG_PROMOTED_VAR_P (*op)
> +      && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant ()
> +	 == GET_MODE_SIZE (word_mode))
> +    *op = XEXP (*op, 0);
> +  else
> +    *op = gen_rtx_SIGN_EXTEND (word_mode, *op);
So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and indent 
the "==" clause.  ie

   && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant ()
       == GET_MODE_SIZE (word_mode))

Don't you also need to verify that the subreg was sign extended?  The 
PROMOTED_VAR_P just notes that it was promoted, not *how* it was 
promoted.  I think you just need to add a test like this:

   && SUBREG_PROMOTED_SIGNED_P (*op)


I don't guess you have data on how this impacts dynamic instruction 
counts on anything significant do you?

OK with the formatting nit fixed and adding the additional check to 
ensure the value was sign extended.


jeff
  
Vineet Gupta Oct. 30, 2023, 11:21 p.m. UTC | #3
On 10/30/23 13:33, Jeff Law wrote:
>
>
> On 10/29/23 21:21, Vineet Gupta wrote:
>> RV64 compare and branch instructions only support 64-bit operands.
>> At Expand time, the backend conservatively zero/sign extends
>> its operands even if not needed, such as incoming 32-bit function args
>> which ABI/ISA guarantee to be sign-extended already.
>>
>> And subsequently REE fails to eliminate them as
>>     "missing defintion(s)" or "multiple definition(s)
>> since function args don't have explicit definition.
>>
>> So during expand riscv_extend_comparands (), if an operand is a
>> subreg-promoted SI with inner DI, which is representative of a function
>> arg, just peel away the subreg to expose the DI, eliding the sign
>> extension. As Jeff noted this routine is also used in if-conversion so
>> also helps there.
>>
>> Note there's currently patches floating around to improve REE and also a
>> new pass to eliminate unneccesary extensions, but it is still beneficial
>> to not generate those extra extensions in first place. It is obviously
>> less work for post-reload passes such as REE, but even for earlier
>> passes, such as combine, having to deal with one less thing and ensuing
>> fewer combinations is a win too.
>>
>> Way too many existing tests used to observe this issue.
>> e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc
>> It elimiates the SEXT.W
>>
>> Tested with rv64gc with no regressions, I'm relying on PAtrick's
>> pre-commit CI to do the full testing.
>>
>> gcc/ChangeLog:
>>     * config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New.
>>     * (riscv_extend_comparands): Call New function on operands.
>>
>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
>> ---
>> Changes since v2:
>>    - Fix linting issues flagged by pre-commit CI
>> Changes since v1:
>>    - Elide sign extension for 32-bit operarnds only
>>    - Apply elison for both arguments
>> ---
>>   gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++--
>>   1 file changed, 21 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> index ca9a2ca81d53..269beb3b159b 100644
>> --- a/gcc/config/riscv/riscv.cc
>> +++ b/gcc/config/riscv/riscv.cc
>> @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1)
>>                  cmp0, cmp1, 0, 0, OPTAB_DIRECT);
>>   }
>>   +/* Helper function for riscv_extend_comparands to Sign-extend the OP.
>> +   However if the OP is SI subreg promoted with an inner DI, such as
>> +       (subreg/s/v:SI (reg/v:DI) 0
>> +   just peel off the SUBREG to get DI, avoiding extraneous 
>> extension.  */
>> +
>> +static void
>> +riscv_sign_extend_if_not_subreg_prom (rtx *op)
>> +{
>> +  if (GET_MODE (*op) == SImode
>> +      && GET_CODE (*op) == SUBREG
>> +      && SUBREG_PROMOTED_VAR_P (*op)
>> +      && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant ()
>> +     == GET_MODE_SIZE (word_mode))
>> +    *op = XEXP (*op, 0);
>> +  else
>> +    *op = gen_rtx_SIGN_EXTEND (word_mode, *op);
> So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and 
> indent the "==" clause.  ie
>
>   && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant ()
>       == GET_MODE_SIZE (word_mode))

Ok. FWIW I was using the wrong checker: git_check_commit.py vs. 
check_GNU_style.sh

>
> Don't you also need to verify that the subreg was sign extended? The 
> PROMOTED_VAR_P just notes that it was promoted, not *how* it was 
> promoted.  I think you just need to add a test like this:
>
>   && SUBREG_PROMOTED_SIGNED_P (*op)

Thx for catching this.
The orig test case I used to spot the issue had an unsigned promoted 
subreg but I was convinced it could still be removed (wrong on so many 
counts).


> I don't guess you have data on how this impacts dynamic instruction 
> counts on anything significant do you?

No, haven't run it yet. I can fire one though. I doubt if this is as 
significant as the prev one, even if this is the right thing to do.

>
> OK with the formatting nit fixed and adding the additional check to 
> ensure the value was sign extended.

Thx. I just wait for SPEC run before pushing this.

-Vineet
  
Vineet Gupta Oct. 31, 2023, 11:45 p.m. UTC | #4
On 10/30/23 16:21, Vineet Gupta wrote:
>> I don't guess you have data on how this impacts dynamic instruction 
>> counts on anything significant do you?
>
> No, haven't run it yet. I can fire one though. I doubt if this is as 
> significant as the prev one, even if this is the right thing to do. 

Very very small improvement overall

49,318,030,233,258 (w/o patch)
49,318,000,693,233 (W/ patch)
  
Vineet Gupta Nov. 1, 2023, 12:05 a.m. UTC | #5
On 10/30/23 13:33, Jeff Law wrote:
>
>> +/* Helper function for riscv_extend_comparands to Sign-extend the OP.
>> +   However if the OP is SI subreg promoted with an inner DI, such as
>> +       (subreg/s/v:SI (reg/v:DI) 0
>> +   just peel off the SUBREG to get DI, avoiding extraneous 
>> extension.  */
>> +
>> +static void
>> +riscv_sign_extend_if_not_subreg_prom (rtx *op)
>> +{
>> +  if (GET_MODE (*op) == SImode

So I may have been partially wrong about v2 patch being wrong and 
needing this fixup ;-) [1]

It seems we don't have to limit this to SImode. I re-read the calling 
convention doc [2] and it says this

"When passed in registers or on the stack, integer scalars narrower than XLEN
  bits are widened according to the sign of their type up to 32 bits, then
  sign-extended to XLEN bits."

This essentially means signed short, and signed char will be already 
sign-extended at caller site and need not be done in callee: Palmer 
mention in internal slack that unadorned char is unsigned on RISC-V 
hence we don't see the compiler extra work for say 
gcc.dg/torture/pr75964.c. If the test is however tweaked to use signed 
chars (or short), it seems caller is doing the work (adjusting the 
constant being passed to be a sign-extended variant).

This further validates Jeff's comment about checking for 
SUBREG_PROMOTED_SIGNED_P (it was anyhow the right thing to begin with 
anyways).

At this point I feel like I'm into splitting hairs (in vain) territory, 
as fixing this might not matter much in practice ....

I'd suppose we go ahead with the v3 with changes Jeff asked for and 
maybe do a later fixup to relax SI.

>> +      && GET_CODE (*op) == SUBREG
>> +      && SUBREG_PROMOTED_VAR_P (*op)
>> +      && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant ()
>> +     == GET_MODE_SIZE (word_mode))
>> +    *op = XEXP (*op, 0);
>> +  else
>> +    *op = gen_rtx_SIGN_EXTEND (word_mode, *op);
> So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and 
> indent the "==" clause.  ie
>
>   && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant ()
>       == GET_MODE_SIZE (word_mode))
>
> Don't you also need to verify that the subreg was sign extended? The 
> PROMOTED_VAR_P just notes that it was promoted, not *how* it was 
> promoted.  I think you just need to add a test like this:
>
>   && SUBREG_PROMOTED_SIGNED_P (*op)

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634327.html
[2] 
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc 
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc>
  
Jeff Law Nov. 1, 2023, 12:45 a.m. UTC | #6
On 10/31/23 18:05, Vineet Gupta wrote:
> On 10/30/23 13:33, Jeff Law wrote:
>>
>>> +/* Helper function for riscv_extend_comparands to Sign-extend the OP.
>>> +   However if the OP is SI subreg promoted with an inner DI, such as
>>> +       (subreg/s/v:SI (reg/v:DI) 0
>>> +   just peel off the SUBREG to get DI, avoiding extraneous 
>>> extension.  */
>>> +
>>> +static void
>>> +riscv_sign_extend_if_not_subreg_prom (rtx *op)
>>> +{
>>> +  if (GET_MODE (*op) == SImode
> 
> So I may have been partially wrong about v2 patch being wrong and 
> needing this fixup ;-) [1]
> 
> It seems we don't have to limit this to SImode. I re-read the calling 
> convention doc [2] and it says this
> 
> "When passed in registers or on the stack, integer scalars narrower than 
> XLEN
>   bits are widened according to the sign of their type up to 32 bits, then
>   sign-extended to XLEN bits."
> 
> This essentially means signed short, and signed char will be already 
> sign-extended at caller site and need not be done in callee: Palmer 
> mention in internal slack that unadorned char is unsigned on RISC-V 
> hence we don't see the compiler extra work for say 
> gcc.dg/torture/pr75964.c. If the test is however tweaked to use signed 
> chars (or short), it seems caller is doing the work (adjusting the 
> constant being passed to be a sign-extended variant).
> 
> This further validates Jeff's comment about checking for 
> SUBREG_PROMOTED_SIGNED_P (it was anyhow the right thing to begin with 
> anyways).
> 
> At this point I feel like I'm into splitting hairs (in vain) territory, 
> as fixing this might not matter much in practice ...
> 
> I'd suppose we go ahead with the v3 with changes Jeff asked for and 
> maybe do a later fixup to relax SI.
Consider any such fixup pre-approved.  I was thinking that the 8/16 bit 
sub-objects should probably be extended one way or another.  It wouldn't 
make much sense not to.

Not as much of a win as I'd hoped.  But I'll take it

jeff
  

Patch

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ca9a2ca81d53..269beb3b159b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3678,6 +3678,24 @@  riscv_zero_if_equal (rtx cmp0, rtx cmp1)
 		       cmp0, cmp1, 0, 0, OPTAB_DIRECT);
 }
 
+/* Helper function for riscv_extend_comparands to Sign-extend the OP.
+   However if the OP is SI subreg promoted with an inner DI, such as
+       (subreg/s/v:SI (reg/v:DI) 0)
+   just peel off the SUBREG to get DI, avoiding extraneous extension.  */
+
+static void
+riscv_sign_extend_if_not_subreg_prom (rtx *op)
+{
+  if (GET_MODE (*op) == SImode
+      && GET_CODE (*op) == SUBREG
+      && SUBREG_PROMOTED_VAR_P (*op)
+      && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant ()
+	 == GET_MODE_SIZE (word_mode))
+    *op = XEXP (*op, 0);
+  else
+    *op = gen_rtx_SIGN_EXTEND (word_mode, *op);
+}
+
 /* Sign- or zero-extend OP0 and OP1 for integer comparisons.  */
 
 static void
@@ -3707,9 +3725,10 @@  riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1)
 	}
       else
 	{
-	  *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0);
+	  riscv_sign_extend_if_not_subreg_prom (op0);
+
 	  if (*op1 != const0_rtx)
-	    *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1);
+	    riscv_sign_extend_if_not_subreg_prom (op1);
 	}
     }
 }