RISC-V: Fix CTZ unnecessary sign extension [PR #106888]

Message ID 20230504171421.1829763-1-rzinsly@ventanamicro.com
State Accepted
Headers
Series RISC-V: Fix CTZ unnecessary sign extension [PR #106888] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Raphael Moreira Zinsly May 4, 2023, 5:14 p.m. UTC
  We were not able to match the CTZ sign extend pattern on RISC-V
because it get optimized to zero extend and/or to ANDI patterns.
For the ANDI case, combine scrambles the RTL and generates the
extension by using subregs.

	gcc/ChangeLog:
		PR target/106888
		* config/riscv/bitmanip.md
		(<bitmanip_optab>disi2): Match with any_extend.
		(<bitmanip_optab>disi2_sext): New pattern to match
		with sign extend using an ANDI instruction.

	gcc/testsuite/ChangeLog:
		PR target/106888
		* gcc.target/riscv/pr106888.c: New test.
		* gcc.target/riscv/zbbw.c: Check for ANDI.
---
 gcc/config/riscv/bitmanip.md              | 14 +++++++++++++-
 gcc/testsuite/gcc.target/riscv/pr106888.c | 12 ++++++++++++
 gcc/testsuite/gcc.target/riscv/zbbw.c     |  1 +
 3 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr106888.c
  

Comments

Jeff Law May 6, 2023, 2:57 p.m. UTC | #1
On 5/4/23 11:14, Raphael Moreira Zinsly wrote:
> We were not able to match the CTZ sign extend pattern on RISC-V
> because it get optimized to zero extend and/or to ANDI patterns.
> For the ANDI case, combine scrambles the RTL and generates the
> extension by using subregs.
So to provide a few more details here.

Coming into combine we have:
> (insn 2 4 3 2 (set (reg/v:DI 136 [ i ])
>         (reg:DI 10 a0 [ i ])) "j.c":3:1 179 {*movdi_64bit}
>      (expr_list:REG_DEAD (reg:DI 10 a0 [ i ])
>         (nil)))
> (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
> (insn 6 3 7 2 (set (reg:SI 137)
>         (ctz:SI (subreg/u:SI (reg/v:DI 136 [ i ]) 0))) "j.c":4:13 345 {*ctzsi2}
>      (expr_list:REG_DEAD (reg/v:DI 136 [ i ])
>         (nil)))
> (insn 7 6 12 2 (set (reg/v:DI 135 [ <retval> ])
>         (sign_extend:DI (reg:SI 137))) "j.c":4:13 116 {extendsidi2}
>      (expr_list:REG_DEAD (reg:SI 137)
>         (nil)))


The first key being we're starting with an SImode CTZ.  So we have an 
extension on the result and the original argument is in DImode, so we 
have a subreg to get the input into SImode.  That allows us to match the 
standard ctzw pattern.

Of course we know the result of the ctz is in the range 0..32 because 
it's an SImode operand. Thus the extension is redundant and we'd like to 
remove it.

Even though insn 7 is a SIGN_EXTEND, combine knows the SImode sign bit 
is always zero.  As a result it'll canonicalize to ZERO_EXTEND (there's 
a larger discussion around that in the context of aarch64 that I'm not 
going to wade into at the moment).

So combine ultimately tries to match this:


> Trying 6 -> 7:
>     6: r137:SI=ctz(r139:DI#0)
>       REG_DEAD r139:DI
>     7: r135:DI=sign_extend(r137:SI)
>       REG_DEAD r137:SI
> Successfully matched this instruction:
> (set (reg/v:DI 135 [ <retval> ])
>     (and:DI (subreg:DI (ctz:SI (subreg/u:SI (reg:DI 139) 0)) 0)
>         (const_int 127 [0x7f])))

The inner subreg is (of course) still there and must remain so that we 
can continue to distinguish between an SI and DI mode ctz which generate 
different assembler codes on riscv.

Combine has turned the zero extension into a masking operation.  Of 
course the masking operation has to happen in DImode hence new  subreg 
wrapping the result of the ctz so that it can be used in a DImode operation.






> 
> 	gcc/ChangeLog:
> 		PR target/106888
> 		* config/riscv/bitmanip.md
> 		(<bitmanip_optab>disi2): Match with any_extend.
> 		(<bitmanip_optab>disi2_sext): New pattern to match
> 		with sign extend using an ANDI instruction.
> 
> 	gcc/testsuite/ChangeLog:
> 		PR target/106888
> 		* gcc.target/riscv/pr106888.c: New test.
> 		* gcc.target/riscv/zbbw.c: Check for ANDI.
> ---
>   gcc/config/riscv/bitmanip.md              | 14 +++++++++++++-
>   gcc/testsuite/gcc.target/riscv/pr106888.c | 12 ++++++++++++
>   gcc/testsuite/gcc.target/riscv/zbbw.c     |  1 +
>   3 files changed, 26 insertions(+), 1 deletion(-)
>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr106888.c
> 
> diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
> index a27fc3e34a1..8dc3e85a338 100644
> --- a/gcc/config/riscv/bitmanip.md
> +++ b/gcc/config/riscv/bitmanip.md
> @@ -246,13 +246,25 @@
>   
>   (define_insn "*<bitmanip_optab>disi2"
>     [(set (match_operand:DI 0 "register_operand" "=r")
> -        (sign_extend:DI
> +        (any_extend:DI
>             (clz_ctz_pcnt:SI (match_operand:SI 1 "register_operand" "r"))))]
>     "TARGET_64BIT && TARGET_ZBB"
>     "<bitmanip_insn>w\t%0,%1"
>     [(set_attr "type" "<bitmanip_insn>")
>      (set_attr "mode" "SI")])
>   
> +;; A SImode clz_ctz_pcnt may be extended to DImode via subreg.
> +(define_insn "*<bitmanip_optab>disi2_sext"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +        (and:DI (subreg:DI
> +          (clz_ctz_pcnt:SI (subreg:SI
> +             (match_operand:DI 1 "register_operand" "r") 0)) 0)
> +          (match_operand:DI 2 "const_int_operand")))]
> +  "TARGET_64BIT && TARGET_ZBB && ((INTVAL (operands[2]) & 0x3f) == 0x3f)"
> +  "<bitmanip_insn>w\t%0,%1"
> +  [(set_attr "type" "bitmanip")
> +   (set_attr "mode" "SI")])
Looking at this again after a few months away, I'm pretty sure we can 
eliminate that explicit (subreg:SI ...)).  Instead just use 
(match_operand:SI ...) just like the existing pattern already did.

So the pattern just needs the (subreg:DI ...) on the result to allow us 
to mask in DImode...  So something like this:

> ;; A SImode clz_ctz_pcnt may be extended to DImode via subreg.
> (define_insn "*<bitmanip_optab>disi2_sext"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>         (and:DI (subreg:DI
>           (clz_ctz_pcnt:SI (match_operand:SI 1 "register_operand" "r")) 0)
>           (match_operand:DI 2 "const_int_operand")))]
>   "TARGET_64BIT && TARGET_ZBB && ((INTVAL (operands[2]) & 0x3f) == 0x3f)"
>   "<bitmanip_insn>w\t%0,%1"
>   [(set_attr "type" "bitmanip")
>    (set_attr "mode" "SI")])


I lightly tested this locally and it seems to work just as well as your 
original, but is slightly simpler and avoids the explicit subreg.

OK with that change.

jeff
  

Patch

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index a27fc3e34a1..8dc3e85a338 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -246,13 +246,25 @@ 
 
 (define_insn "*<bitmanip_optab>disi2"
   [(set (match_operand:DI 0 "register_operand" "=r")
-        (sign_extend:DI
+        (any_extend:DI
           (clz_ctz_pcnt:SI (match_operand:SI 1 "register_operand" "r"))))]
   "TARGET_64BIT && TARGET_ZBB"
   "<bitmanip_insn>w\t%0,%1"
   [(set_attr "type" "<bitmanip_insn>")
    (set_attr "mode" "SI")])
 
+;; A SImode clz_ctz_pcnt may be extended to DImode via subreg.
+(define_insn "*<bitmanip_optab>disi2_sext"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (and:DI (subreg:DI
+          (clz_ctz_pcnt:SI (subreg:SI
+             (match_operand:DI 1 "register_operand" "r") 0)) 0)
+          (match_operand:DI 2 "const_int_operand")))]
+  "TARGET_64BIT && TARGET_ZBB && ((INTVAL (operands[2]) & 0x3f) == 0x3f)"
+  "<bitmanip_insn>w\t%0,%1"
+  [(set_attr "type" "bitmanip")
+   (set_attr "mode" "SI")])
+
 (define_insn "*<bitmanip_optab>di2"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (clz_ctz_pcnt:DI (match_operand:DI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/riscv/pr106888.c b/gcc/testsuite/gcc.target/riscv/pr106888.c
new file mode 100644
index 00000000000..77fb8e5b79c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr106888.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64" } */
+
+int
+ctz (int i)
+{
+  int res = __builtin_ctz (i);
+  return res&0xffff;
+}
+
+/* { dg-final { scan-assembler-times "ctzw" 1 } } */
+/* { dg-final { scan-assembler-not "andi" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zbbw.c b/gcc/testsuite/gcc.target/riscv/zbbw.c
index 709743c3b68..f7b2b63853f 100644
--- a/gcc/testsuite/gcc.target/riscv/zbbw.c
+++ b/gcc/testsuite/gcc.target/riscv/zbbw.c
@@ -23,3 +23,4 @@  popcount (int i)
 /* { dg-final { scan-assembler-times "clzw" 1 } } */
 /* { dg-final { scan-assembler-times "ctzw" 1 } } */
 /* { dg-final { scan-assembler-times "cpopw" 1 } } */
+/* { dg-final { scan-assembler-not "andi\t" } } */