RISC-V: Produce better code with complex constants [PR95632] [PR106602]
Commit Message
Due to RISC-V limitations on operations with big constants, combine
fails to match such operations and is unable to produce optimal
code because it keeps splitting them. By pretending we can do those
operations, we get more opportunities to simplify the surrounding
instructions.
2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
Jeff Law <jlaw@ventanamicro.com>
gcc/ChangeLog:
PR target/95632
PR target/106602
* config/riscv/riscv.md: New pattern to simulate complex
const_int loads.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr95632.c: New test.
* gcc.target/riscv/pr106602.c: Likewise.
---
gcc/config/riscv/riscv.md | 16 ++++++++++++++++
gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++++++++++++++
gcc/testsuite/gcc.target/riscv/pr95632.c | 15 +++++++++++++++
3 files changed, 45 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c
Comments
On 12/7/22 13:55, Raphael Moreira Zinsly wrote:
> Due to RISC-V limitations on operations with big constants combine
> is failing to match such operations and is not being able to
> produce optimal code as it keeps splitting them. By pretending we
> can do those operations we can get more opportunities for
> simplification of surrounding instructions.
>
> 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
> Jeff Law <jlaw@ventanamicro.com>
>
> gcc/Changelog:
> PR target/95632
> PR target/106602
> * config/riscv/riscv.md: New pattern to simulate complex
> const_int loads.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr95632.c: New test.
> * gcc.target/riscv/pr106602.c: Likewise.
So to give a little background to others.
The core issue is that when we break down constants early, it can make
it difficult for combine to reconstruct the constant and simplify code
using the reconstructed constant -- you end up trying to do 4->3 or
worse combination sequences which aren't supported by the combiner.
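To make "breaking down a constant" concrete, here is a minimal sketch in C. It assumes the usual RV32 case where a constant that does not fit a signed 12-bit immediate is built from an upper part (what lui would supply) plus a signed 12-bit remainder (what addi would add). The names split_const32, fits_simm12 and struct const_parts are made up for illustration and are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration only, not GCC code.  */
struct const_parts {
  int32_t upper;  /* multiple of 4096, the part a lui would supply */
  int32_t lower;  /* signed 12-bit remainder, the part an addi would add */
};

/* True if V fits the signed 12-bit immediate of addi/xori/andi/ori.  */
static int fits_simm12(int64_t v)
{
  return v >= -2048 && v <= 2047;
}

/* Split VALUE so that upper + lower == value (modulo 2^32).  */
static struct const_parts split_const32(int32_t value)
{
  struct const_parts parts;

  /* Sign-extend the low 12 bits, because addi sign-extends its immediate.  */
  parts.lower = (int32_t)(((uint32_t)value & 0xfff) ^ 0x800) - 0x800;

  /* The remainder has its low 12 bits clear.  */
  parts.upper = (int32_t)((uint32_t)value - (uint32_t)parts.lower);

  assert(((uint32_t)parts.upper & 0xfff) == 0);
  assert(fits_simm12(parts.lower));
  return parts;
}
```

With this sketch, 0x4002 (the constant in the pr95632 test) splits into 0x4000 and 2, so a single xor with it already becomes three insns once the constant is materialized separately. That is where the 4->3 style combinations come from.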
Usually this kind of scenario is handled with a "bridge" pattern. Those
are generally defined as patterns that exist solely for combine and may
not correspond to any real instruction on the target. "bridge" patterns
are typically 2->1 or 3->1 combinations and are intermediate steps for
4->N or even larger combination opportunities. Obviously if the bridge
doesn't allow subsequent simplifications, then the bridge pattern must
generate correct code (either by generating suitable assembly or
splitting later).
Raphael's patch introduces a bridge pattern that pretends we can load up
splittable constants in a single insn. We restrict the bridge pattern
to be active from the point when CSE is no longer expected through the
combiner up to the first splitter pass (where we'll break it down again
if it's still in the IL).
So we get most of the benefit of splitting constants early (CSE, LICM,
etc) while also getting the benefits of splitting late (combine
simplifications).
Given I was working with Raphael on the patch, it's probably best for
someone else to do the review rather than me approving it :-)
Jeff
On Wed, Dec 07, 2022 at 05:55:17PM -0300, Raphael Moreira Zinsly wrote:
> Due to RISC-V limitations on operations with big constants combine
> is failing to match such operations and is not being able to
> produce optimal code as it keeps splitting them. By pretending we
> can do those operations we can get more opportunities for
> simplification of surrounding instructions.
>
> 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
> Jeff Law <jlaw@ventanamicro.com>
Just nits, not a proper review.
2 spaces after date and 2 spaces before <, rather than just 1.
>
> gcc/Changelog:
> PR target/95632
> PR target/106602
> * config/riscv/riscv.md: New pattern to simulate complex
> const_int loads.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr95632.c: New test.
> * gcc.target/riscv/pr106602.c: Likewise.
All lines in the ChangeLog should be tab indented, rather than just some of
them and others with 8 spaces.
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1667,6 +1667,22 @@
> MAX_MACHINE_MODE, &operands[3], TRUE);
> })
>
> +;; Pretend to have the ability to load complex const_int in order to get
> +;; better code generation around them.
> +(define_insn_and_split ""
define_insn_and_split patterns should preferably have a name, even if it
starts with *. It makes dumps more readable, and you can refer to it
in the ChangeLog when it is added or changed etc.
> + [(set (match_operand:GPR 0 "register_operand" "=r")
> + (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> + "cse_not_expected"
> + "#"
> + "&& 1"
> + [(const_int 0)]
> +
Why the empty line?
> +{
> + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
> + <GPR:MODE>mode, TRUE);
You can just use <MODE> if there is only one iterator in the pattern.
Jakub
On Wed, 07 Dec 2022 13:30:32 PST (-0800), gcc-patches@gcc.gnu.org wrote:
> On Wed, Dec 07, 2022 at 05:55:17PM -0300, Raphael Moreira Zinsly wrote:
>> Due to RISC-V limitations on operations with big constants combine
>> is failing to match such operations and is not being able to
>> produce optimal code as it keeps splitting them. By pretending we
>> can do those operations we can get more opportunities for
>> simplification of surrounding instructions.
>>
>> 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
>> Jeff Law <jlaw@ventanamicro.com>
>
> Just nits, not a proper review.
> 2 spaces after date and 2 spaces before <, rather than just 1.
>
>>
>> gcc/Changelog:
>> PR target/95632
>> PR target/106602
>> * config/riscv/riscv.md: New pattern to simulate complex
>> const_int loads.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/riscv/pr95632.c: New test.
>> * gcc.target/riscv/pr106602.c: Likewise.
>
> All lines in the ChangeLog should be tab indented, rather than just some of
> them and others with 8 spaces.
There's also contrib/git-commit-mklog.py, which provides a template for
these (I also have trouble remembering the formatting rules).
>
>> --- a/gcc/config/riscv/riscv.md
>> +++ b/gcc/config/riscv/riscv.md
>> @@ -1667,6 +1667,22 @@
>> MAX_MACHINE_MODE, &operands[3], TRUE);
>> })
>>
>> +;; Pretend to have the ability to load complex const_int in order to get
>> +;; better code generation around them.
>> +(define_insn_and_split ""
>
> define_insn_and_split patterns better should have some name, even if it
> starts with *. It makes dumps more readable, and you can refer to it
> in the ChangeLog when it is added or changed etc.
>
>> + [(set (match_operand:GPR 0 "register_operand" "=r")
>> + (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
>> + "cse_not_expected"
>> + "#"
>> + "&& 1"
>> + [(const_int 0)]
>> +
>
> Why the empty line?
>
>> +{
>> + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
>> + <GPR:MODE>mode, TRUE);
>
> You can just use <MODE> if there is only one iterator in the pattern.
>
> Jakub
On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote:
> Due to RISC-V limitations on operations with big constants combine
> is failing to match such operations and is not being able to
> produce optimal code as it keeps splitting them. By pretending we
> can do those operations we can get more opportunities for
> simplification of surrounding instructions.
I saw Jeff's comments. This is always the kind of thing that worries
me: we're essentially lying to the optimizer in order to trick it into
generating better code, which might just make it generate worse code.
It's always easy to see a small example that improves, but those could
be wiped out by secondary effects in real code. So I'd usually want to
have some benchmarking for a patch like this.
That said, if this is just the standard way of doing things then maybe
it's just fine?
> 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com>
> Jeff Law <jlaw@ventanamicro.com>
>
> gcc/Changelog:
> PR target/95632
> PR target/106602
> * config/riscv/riscv.md: New pattern to simulate complex
> const_int loads.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/riscv/pr95632.c: New test.
> * gcc.target/riscv/pr106602.c: Likewise.
> ---
> gcc/config/riscv/riscv.md | 16 ++++++++++++++++
> gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++++++++++++++
> gcc/testsuite/gcc.target/riscv/pr95632.c | 15 +++++++++++++++
> 3 files changed, 45 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c
>
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index df57e2b0b4a..0a9b5ec22b0 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1667,6 +1667,22 @@
> MAX_MACHINE_MODE, &operands[3], TRUE);
> })
>
> +;; Pretend to have the ability to load complex const_int in order to get
> +;; better code generation around them.
> +(define_insn_and_split ""
> + [(set (match_operand:GPR 0 "register_operand" "=r")
> + (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> + "cse_not_expected"
> + "#"
> + "&& 1"
> + [(const_int 0)]
> +
> +{
> + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
> + <GPR:MODE>mode, TRUE);
> + DONE;
> +})
There are some comments from Jakub on this; I don't see any additional
issues with the code (aside from the "does it help" stuff from above).
> +
> ;; 64-bit integer moves
>
> (define_expand "movdi"
> diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c
> new file mode 100644
> index 00000000000..83b70877012
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr106602.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=rv64gc" } */
There's a DG hook to limit this to 64-bit targets, that way it'll run
with whatever target is being tested.
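As a sketch of what that could look like: the thread does not say which hook is meant, so this assumes the generic lp64 effective-target keyword and drops the explicit -march. A RISC-V-specific selector may be what was intended instead.

```c
/* Assumption: restrict the test with the generic lp64 effective-target
   keyword instead of forcing -march=rv64gc.  */
/* { dg-do compile { target lp64 } } */
/* { dg-options "-O2" } */

unsigned long
foo2 (unsigned long a)
{
  return (unsigned long)(unsigned int) a << 6;
}
```

The dg-final scans would stay as they are in the patch.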
> +
> +unsigned long
> +foo2 (unsigned long a)
> +{
> + return (unsigned long)(unsigned int) a << 6;
> +}
> +
> +/* { dg-final { scan-assembler-times "slli\t" 1 } } */
> +/* { dg-final { scan-assembler-times "srli\t" 1 } } */
> +/* { dg-final { scan-assembler-not "\tli\t" } } */
> +/* { dg-final { scan-assembler-not "addi\t" } } */
> +/* { dg-final { scan-assembler-not "and\t" } } */
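For anyone reading the scans above, a small C sketch of the three equivalent ways to compute foo2 may help. It assumes a 64-bit unsigned long, written here as uint64_t. The function names are invented for illustration. The mapping to slli/srli in the comments is my reading of what the dg-final lines expect, not something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Source form from the test: zero-extend the low 32 bits, then shift.  */
static uint64_t zext_then_shift(uint64_t a)
{
  return (uint64_t)(uint32_t)a << 6;
}

/* Mask form: needs the constant 0x3fffffffc0, which does not fit a
   12-bit immediate and so has to be built separately.  */
static uint64_t shift_then_mask(uint64_t a)
{
  return (a << 6) & UINT64_C(0x3fffffffc0);
}

/* Shift-pair form: one left shift by 32 and one logical right shift by
   26, with no constant at all.  */
static uint64_t shift_pair(uint64_t a)
{
  return (a << 32) >> 26;
}

/* Check that all three forms agree for one input.  */
static void check_equivalent(uint64_t a)
{
  assert(zext_then_shift(a) == shift_then_mask(a));
  assert(zext_then_shift(a) == shift_pair(a));
}
```

The scans want exactly one slli and one srli and no li, addi or and, which corresponds to the shift-pair form.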
> diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c
> new file mode 100644
> index 00000000000..bd316ab1d7b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr95632.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -march=rv32imafc -mabi=ilp32f" } */
Is there a reason to make this rv32-only? Unless I'm missing something
this should generate pretty much the same code for rv64.
> +
> +unsigned short
> +foo (unsigned short crc)
> +{
> + crc ^= 0x4002;
> + crc >>= 1;
> + crc |= 0x8000;
> +
> + return crc;
> +}
> +
> +/* { dg-final { scan-assembler-times "srli\t" 1 } } */
> +/* { dg-final { scan-assembler-not "slli\t" } } */
On 12/8/22 10:53, Palmer Dabbelt wrote:
> On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote:
>> Due to RISC-V limitations on operations with big constants combine
>> is failing to match such operations and is not being able to
>> produce optimal code as it keeps splitting them. By pretending we
>> can do those operations we can get more opportunities for
>> simplification of surrounding instructions.
>
> I saw Jeff's comments. This is always the kind of thing that worries
> me: we're essentially lying to the optimizer in order to trick it into
> generating better code, which might just make it generate worse code.
> It's always easy to see a small example that improves, but those could
> be wiped out by secondary effects in real code. So I'd usually want to
> have some benchmarking for a patch like this.
>
> That said, if this is just the standard way of doing things then maybe
> it's just fine?
Bridge combiner patterns are pretty standard. The insn's condition of
cse_not_expected is also in there to minimize the potential for
surprises by not exposing this too early.
jeff
On Thu, 08 Dec 2022 10:15:47 PST (-0800), gcc-patches@gcc.gnu.org wrote:
>
>
> On 12/8/22 10:53, Palmer Dabbelt wrote:
>> On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote:
>>> Due to RISC-V limitations on operations with big constants combine
>>> is failing to match such operations and is not being able to
>>> produce optimal code as it keeps splitting them. By pretending we
>>> can do those operations we can get more opportunities for
>>> simplification of surrounding instructions.
>>
>> I saw Jeff's comments. This is always the kind of thing that worries
>> me: we're essentially lying to the optimizer in order to trick it into
>> generating better code, which might just make it generate worse code.
>> It's always easy to see a small example that improves, but those could
>> be wiped out by secondary effects in real code. So I'd usually want to
>> have some benchmarking for a patch like this.
>>
>> That said, if this is just the standard way of doing things then maybe
>> it's just fine?
> Bridge combiner patterns are pretty standard. The insn's condition of
> cse_not_expected is also in there to minimize the potential for
> surprises by not exposing this too early.
OK, I'm fine with this, then -- aside from the fairly minor issues
pointed out.
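diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index df57e2b0b4a..0a9b5ec22b0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1667,6 +1667,22 @@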
MAX_MACHINE_MODE, &operands[3], TRUE);
})
+;; Pretend to have the ability to load complex const_int in order to get
+;; better code generation around them.
+(define_insn_and_split ""
+ [(set (match_operand:GPR 0 "register_operand" "=r")
+ (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
+ "cse_not_expected"
+ "#"
+ "&& 1"
+ [(const_int 0)]
+
+{
+ riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]),
+ <GPR:MODE>mode, TRUE);
+ DONE;
+})
+
;; 64-bit integer moves
(define_expand "movdi"
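diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c
new file mode 100644
index 00000000000..83b70877012
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr106602.c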
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc" } */
+
+unsigned long
+foo2 (unsigned long a)
+{
+ return (unsigned long)(unsigned int) a << 6;
+}
+
+/* { dg-final { scan-assembler-times "slli\t" 1 } } */
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "\tli\t" } } */
+/* { dg-final { scan-assembler-not "addi\t" } } */
+/* { dg-final { scan-assembler-not "and\t" } } */
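diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c
new file mode 100644
index 00000000000..bd316ab1d7b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr95632.c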
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv32imafc -mabi=ilp32f" } */
+
+unsigned short
+foo (unsigned short crc)
+{
+ crc ^= 0x4002;
+ crc >>= 1;
+ crc |= 0x8000;
+
+ return crc;
+}
+
+/* { dg-final { scan-assembler-times "srli\t" 1 } } */
+/* { dg-final { scan-assembler-not "slli\t" } } */