RISC-V: improve codegen for large constants with same 32-bit lo and hi parts [2]

Message ID 20230518205716.3258223-1-vineetg@rivosinc.com
State Accepted
Headers
Series RISC-V: improve codegen for large constants with same 32-bit lo and hi parts [2] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Vineet Gupta May 18, 2023, 8:57 p.m. UTC
  [part #2 of PR/109279]

SPEC2017 deepsjeng uses large constants which currently generates less than
ideal code. This fix improves codegen for large constants which have
same low and hi parts: e.g.

	long long f(void) { return 0x0101010101010101ull; }

Before
        li      a5,0x1010000
        addi    a5,a5,0x101
        mv      a0,a5
        slli    a5,a5,32
        add     a0,a5,a0
        ret

With patch
	li	a5,0x1010000
	addi	a5,a5,0x101
	slli	a0,a5,32
	add	a0,a0,a5
	ret

This is testsuite clean.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_split_integer): if loval is equal
	  to hival, ASHIFT the corresponding regs.

Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
 gcc/config/riscv/riscv.cc | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)
  

Comments

Jeff Law May 19, 2023, 4:33 p.m. UTC | #1
On 5/18/23 14:57, Vineet Gupta wrote:
> [part #2 of PR/109279]
> 
> SPEC2017 deepsjeng uses large constants which currently generates less than
> ideal code. This fix improves codegen for large constants which have
> same low and hi parts: e.g.
> 
> 	long long f(void) { return 0x0101010101010101ull; }
> 
> Before
>          li      a5,0x1010000
>          addi    a5,a5,0x101
>          mv      a0,a5
>          slli    a5,a5,32
>          add     a0,a5,a0
>          ret
> 
> With patch
> 	li	a5,0x1010000
> 	addi	a5,a5,0x101
> 	slli	a0,a5,32
> 	add	a0,a0,a5
> 	ret
> 
> This is testsuite clean.
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv.cc (riscv_split_integer): if loval is equal
> 	  to hival, ASHIFT the corresponding regs.
LGTM.  Please install.  Thanks for taking care of this!  The updated 
sequence looks good.

Jeff
  
Palmer Dabbelt May 19, 2023, 4:36 p.m. UTC | #2
On Fri, 19 May 2023 09:33:34 PDT (-0700), jeffreyalaw@gmail.com wrote:
>
>
> On 5/18/23 14:57, Vineet Gupta wrote:
>> [part #2 of PR/109279]
>>
>> SPEC2017 deepsjeng uses large constants which currently generates less than
>> ideal code. This fix improves codegen for large constants which have
>> same low and hi parts: e.g.
>>
>> 	long long f(void) { return 0x0101010101010101ull; }
>>
>> Before
>>          li      a5,0x1010000
>>          addi    a5,a5,0x101
>>          mv      a0,a5
>>          slli    a5,a5,32
>>          add     a0,a5,a0
>>          ret
>>
>> With patch
>> 	li	a5,0x1010000
>> 	addi	a5,a5,0x101
>> 	slli	a0,a5,32
>> 	add	a0,a0,a5
>> 	ret
>>
>> This is testsuite clean.
>>
>> gcc/ChangeLog:
>>
>> 	* config/riscv/riscv.cc (riscv_split_integer): if loval is equal
>> 	  to hival, ASHIFT the corresponding regs.
> LGTM.  Please install.  Thanks for taking care of this!  The updated
> sequence looks good.

Works for me.  Did you start that performance backports branch?  Either 
way, I think this should go on it.
  
Vineet Gupta May 19, 2023, 5:21 p.m. UTC | #3
On 5/19/23 09:36, Palmer Dabbelt wrote:
> Works for me.  Did you start that performance backports branch?  
> Either way, I think this should go on it. 

Please note that there is a bit of dependency chain. Assuming the 
aforementioned branch is gcc 13.1 based, this change also needs my 
splitter relaxation fix g0530254413f8 to ensure incremental improvements 
to large const codegen.
  
Vineet Gupta May 19, 2023, 5:21 p.m. UTC | #4
On 5/19/23 09:33, Jeff Law wrote:
>
>
> On 5/18/23 14:57, Vineet Gupta wrote:
>> [part #2 of PR/109279]
>>
>> SPEC2017 deepsjeng uses large constants which currently generates 
>> less than
>> ideal code. This fix improves codegen for large constants which have
>> same low and hi parts: e.g.
>>
>>     long long f(void) { return 0x0101010101010101ull; }
>>
>> Before
>>          li      a5,0x1010000
>>          addi    a5,a5,0x101
>>          mv      a0,a5
>>          slli    a5,a5,32
>>          add     a0,a5,a0
>>          ret
>>
>> With patch
>>     li    a5,0x1010000
>>     addi    a5,a5,0x101
>>     slli    a0,a5,32
>>     add    a0,a0,a5
>>     ret
>>
>> This is testsuite clean.
>>
>> gcc/ChangeLog:
>>
>>     * config/riscv/riscv.cc (riscv_split_integer): if loval is equal
>>       to hival, ASHIFT the corresponding regs.
> LGTM.  Please install.  Thanks for taking care of this!  The updated 
> sequence looks good.
>
> Jeff
  

Patch

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 79122699b6f5..4e1bb2f14cf8 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -703,13 +703,18 @@  riscv_split_integer (HOST_WIDE_INT val, machine_mode mode)
   unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32);
   rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode);
 
-  riscv_move_integer (hi, hi, hival, mode);
   riscv_move_integer (lo, lo, loval, mode);
 
-  hi = gen_rtx_fmt_ee (ASHIFT, mode, hi, GEN_INT (32));
-  hi = force_reg (mode, hi);
+  if (loval == hival)
+      hi = gen_rtx_ASHIFT (mode, lo, GEN_INT (32));
+  else
+    {
+      riscv_move_integer (hi, hi, hival, mode);
+      hi = gen_rtx_ASHIFT (mode, hi, GEN_INT (32));
+    }
 
-  return gen_rtx_fmt_ee (PLUS, mode, hi, lo);
+  hi = force_reg (mode, hi);
+  return gen_rtx_PLUS (mode, hi, lo);
 }
 
 /* Return true if X is a thread-local symbol.  */