RISC-V: Support the ins "rol" with immediate operand
Checks
Commit Message
From: wangfeng <wangfeng@eswincomputing.com>
There is no Immediate operand of ins "rol" accroding to the B-ext,
so the immediate operand should be loaded into register at first.
But we can convert it to the ins "rori" or "roriw", and then one
immediate load ins can be reduced.
Please refer to the following use cases:
unsigned long foo2(unsigned long rs1)
{
return (rs1 << 10) | (rs1 >> 54);
}
The complier result is:
li a1,10
rol a0,a0,a1
This patch will generate one ins
rori a0,a0,54
gcc/ChangeLog:
* config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
gcc/testsuite/ChangeLog:
* gcc.target/riscv/zbb-rol-ror-04.c: New test.
* gcc.target/riscv/zbb-rol-ror-05.c: New test.
---
gcc/config/riscv/bitmanip.md | 36 +++++++++++++++----
.../gcc.target/riscv/zbb-rol-ror-04.c | 24 +++++++++++++
.../gcc.target/riscv/zbb-rol-ror-05.c | 15 ++++++++
3 files changed, 69 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
Comments
On 11/27/22 19:14, Feng Wang wrote:
> From: wangfeng <wangfeng@eswincomputing.com>
>
> There is no Immediate operand of ins "rol" accroding to the B-ext,
> so the immediate operand should be loaded into register at first.
> But we can convert it to the ins "rori" or "roriw", and then one
> immediate load ins can be reduced.
>
> Please refer to the following use cases:
> unsigned long foo2(unsigned long rs1)
> {
> return (rs1 << 10) | (rs1 >> 54);
> }
>
> The complier result is:
> li a1,10
> rol a0,a0,a1
>
> This patch will generate one ins
> rori a0,a0,54
>
> gcc/ChangeLog:
>
> * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-rol-ror-04.c: New test.
> * gcc.target/riscv/zbb-rol-ror-05.c: New test.
So this arrived after stage1 close and I'm not aware of an existing BZ
around this issue, so I'd tend to think this should wait for stage1 to
re-open in the spring.
From a technical standpoint, would it be better to hand this in a more
generic way? ie, when converting from gimple into RTL, if we want to
generate a rotate left by immediate and don't have a suitable insn, then
change it to a rotate right by an adjusted immediate. This could
probably be done in optabs.cc::expand_binop.
We might need similar code in combine.cc or simplify-rtx.cc since some
rotate cases (or exposure of the constant) may not show up until later
in the RTL pipeline.
Anyway, doing this in a more generic way seems like it's worth
investigating.
jeff
on 2022-11-28 23:39 Jeff Law<jeffreyalaw@gmail.com> wrote:
>
>
>On 11/27/22 19:14, Feng Wang wrote:
>> From: wangfeng <wangfeng@eswincomputing.com>
>>
>> There is no Immediate operand of ins "rol" accroding to the B-ext,
>> so the immediate operand should be loaded into register at first.
>> But we can convert it to the ins "rori" or "roriw", and then one
>> immediate load ins can be reduced.
>>
>> Please refer to the following use cases:
>> unsigned long foo2(unsigned long rs1)
>> {
>> return (rs1 << 10) | (rs1 >> 54);
>> }
>>
>> The complier result is:
>> li a1,10
>> rol a0,a0,a1
>>
>> This patch will generate one ins
>> rori a0,a0,54
>>
>> gcc/ChangeLog:
>>
>> * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/zbb-rol-ror-04.c: New test.
>> * gcc.target/riscv/zbb-rol-ror-05.c: New test.
>
>So this arrived after stage1 close and I'm not aware of an existing BZ
>around this issue, so I'd tend to think this should wait for stage1 to
>re-open in the spring.
>
>
> From a technical standpoint, would it be better to hand this in a more
>generic way? ie, when converting from gimple into RTL, if we want to
>generate a rotate left by immediate and don't have a suitable insn, then
>change it to a rotate right by an adjusted immediate. This could
>probably be done in optabs.cc::expand_binop.
>
>
>We might need similar code in combine.cc or simplify-rtx.cc since some
>rotate cases (or exposure of the constant) may not show up until later
>in the RTL pipeline.
>
>
>Anyway, doing this in a more generic way seems like it's worth
>investigating.
>
>
>jeff
>
Hi jeff,
Thanks for your reply. In the currently it will judge the rotate shift number when converting from
gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction.
I think the purpose of this process is to handle rotate shift quickly. I will think about your advice
and try to modify it in the expand pass.
Wang Feng
Best regards
On 11/28/22 18:53, Feng Wang wrote:
> on 2022-11-28 23:39 Jeff Law<jeffreyalaw@gmail.com> wrote:
>>
>>
>> On 11/27/22 19:14, Feng Wang wrote:
>>> From: wangfeng <wangfeng@eswincomputing.com>
>>>
>>> There is no Immediate operand of ins "rol" accroding to the B-ext,
>>> so the immediate operand should be loaded into register at first.
>>> But we can convert it to the ins "rori" or "roriw", and then one
>>> immediate load ins can be reduced.
>>>
>>> Please refer to the following use cases:
>>> unsigned long foo2(unsigned long rs1)
>>> {
>>> return (rs1 << 10) | (rs1 >> 54);
>>> }
>>>
>>> The complier result is:
>>> li a1,10
>>> rol a0,a0,a1
>>>
>>> This patch will generate one ins
>>> rori a0,a0,54
>>>
>>> gcc/ChangeLog:
>>>
>>> * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/riscv/zbb-rol-ror-04.c: New test.
>>> * gcc.target/riscv/zbb-rol-ror-05.c: New test.
>>
>> So this arrived after stage1 close and I'm not aware of an existing BZ
>> around this issue, so I'd tend to think this should wait for stage1 to
>> re-open in the spring.
>>
>>
>> From a technical standpoint, would it be better to hand this in a more
>> generic way? ie, when converting from gimple into RTL, if we want to
>> generate a rotate left by immediate and don't have a suitable insn, then
>> change it to a rotate right by an adjusted immediate. This could
>> probably be done in optabs.cc::expand_binop.
>>
>>
>> We might need similar code in combine.cc or simplify-rtx.cc since some
>> rotate cases (or exposure of the constant) may not show up until later
>> in the RTL pipeline.
>>
>>
>> Anyway, doing this in a more generic way seems like it's worth
>> investigating.
>>
>>
>> jeff
>>
> Hi jeff,
>
> Thanks for your reply. In the currently it will judge the rotate shift number when converting from
> gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction.
> I think the purpose of this process is to handle rotate shift quickly. I will think about your advice
> and try to modify it in the expand pass.
Yea, in the past there were targets where the cost of a shift or rotate
was proportional to the number of bits shifted. So for a rotate in
particular it was advantageous to reverse the rotation if the count was
more than mode_size/2.
I suspect such processors are a lot less common now than in the past.
But we can probably utilize some of that code to suit our needs.
Jeff
@@ -300,25 +300,49 @@
(define_insn "rotlsi3"
[(set (match_operand:SI 0 "register_operand" "=r")
(rotate:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:QI 2 "register_operand" "r")))]
+ (match_operand:QI 2 "arith_operand" "rI")))]
"TARGET_ZBB"
- "rol%~\t%0,%1,%2"
+ {
+ if (immediate_operand(operands[2], QImode))
+ {
+ operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2]));
+ return "rori\t%0,%1,%2";
+ }
+ else
+ return "rol\t%0,%1,%2";
+ }
[(set_attr "type" "bitmanip")])
(define_insn "rotldi3"
[(set (match_operand:DI 0 "register_operand" "=r")
(rotate:DI (match_operand:DI 1 "register_operand" "r")
- (match_operand:QI 2 "register_operand" "r")))]
+ (match_operand:QI 2 "arith_operand" "rI")))]
"TARGET_64BIT && TARGET_ZBB"
- "rol\t%0,%1,%2"
+ {
+ if (immediate_operand(operands[2], QImode))
+ {
+ operands[2] = GEN_INT(GET_MODE_BITSIZE (DImode) - INTVAL(operands[2]));
+ return "rori\t%0,%1,%2";
+ }
+ else
+ return "rol\t%0,%1,%2";
+ }
[(set_attr "type" "bitmanip")])
(define_insn "rotlsi3_sext"
[(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI (rotate:SI (match_operand:SI 1 "register_operand" "r")
- (match_operand:QI 2 "register_operand" "r"))))]
+ (match_operand:QI 2 "arith_operand" "rI"))))]
"TARGET_64BIT && TARGET_ZBB"
- "rolw\t%0,%1,%2"
+ {
+ if (immediate_operand(operands[2], QImode))
+ {
+ operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2]));
+ return "roriw\t%0,%1,%2";
+ }
+ else
+ return "rolw\t%0,%1,%2";
+ }
[(set_attr "type" "bitmanip")])
;; orc.b (or-combine) is added as an unspec for the benefit of the support
new file mode 100644
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+
+unsigned long foo1 (unsigned long rs1)
+{ return (rs1 >> (34)) | (rs1 << 30); }
+
+unsigned long foo2(unsigned long rs1)
+{
+ return (rs1 << 10) | (rs1 >> 54);
+}
+
+unsigned int foo3(unsigned int rs1)
+{
+ return (rs1 >> 20) | (rs1 << 12);
+}
+
+unsigned int foo4(unsigned int rs1)
+{
+ return (rs1 << 10) | (rs1 >> 22);
+}
+
+/* { dg-final { scan-assembler-times "rori\t" 2 } } */
+/* { dg-final { scan-assembler-times "roriw" 2 } } */
\ No newline at end of file
new file mode 100644
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zbb -mabi=ilp32" } */
+/* { dg-skip-if "" { rv64-*-* } { "-g" } } */
+
+unsigned int foo1(unsigned int rs1)
+{
+ return (rs1 >> 20) | (rs1 << 12);
+}
+
+unsigned int foo2(unsigned int rs1)
+{
+ return (rs1 << 10) | (rs1 >> 22);
+}
+
+/* { dg-final { scan-assembler-times "rori" 2 } } */
\ No newline at end of file