[committed] CRIS: Add peephole2 to handle gcc.target/cris/rld-legit1.c for LRA

Message ID 20230328013037.3EFE020417@pchp3.se.axis.com
State Repeat Merge
Headers
Series [committed] CRIS: Add peephole2 to handle gcc.target/cris/rld-legit1.c for LRA |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Hans-Peter Nilsson March 28, 2023, 1:30 a.m. UTC
  The test-case gcc.target/cris/rld-legit1.c is a reduced
test-case that required defining LEGITIMIZE_RELOAD_ADDRESS
to stop the address from being decomposed into several insns
by reload.  Valid but suboptimal code was generated.

(Before implementing that hook for CRIS, the same test-case
also exposed a bug in reload, and a fix was committed to
avoid an ICE; see e.g. git r0-71992-gff0d9879ab0f30 and
related commits.  But, post-cc0, reload no longer handles
this test-case without LEGITIMIZE_RELOAD_ADDRESS helping and
there'd again an be ICE for CRIS (again: only if
LEGITIMIZE_RELOAD_ADDRESS is disabled).  There's a patch to
reload to fix that, at
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612039.html)

But, LRA also does not handle that test-case gracefully, and
like reload without LEGITIMIZE_RELOAD_ADDRESS for CRIS,
decomposes the address into a suboptimal (but valid)
sequence, about as messy as that from reload, and
gcc.target/cris/rld-legit1.c would regress for LRA.  There's
nothing equivalent to LEGITIMIZE_RELOAD_ADDRESS for LRA.
(Stepping through LRA, I can't find an obvious place where
to put such a hook.  Granted, I haven't seen this kind of
messy decomposition in other code, so I'm not insisting a
LEGITIMIZE_RELOAD_ADDRESS-like hook is a good idea.)

These new peephole2's are required to not regress
gcc.target/cris/rld-legit1.c with LRA enabled for CRIS.
They don't appear to otherwise make a difference for neither
libgcc, newlib libc, my own at-a-glance tests nor coremark,
for neither LRA nor reload.

	* config/cris/cris.md (BW2): New mode-iterator.
	(lra_szext_decomposed, lra_szext_decomposed_indirect_with_offset): New
	peephole2s.
---
 gcc/config/cris/cris.md | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)
  

Patch

diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 2bea480a0200..c3de259983c6 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -183,6 +183,10 @@  (define_mode_iterator SI_ [SI])
 
 (define_mode_iterator WD [SI HI])
 (define_mode_iterator BW [HI QI])
+
+; Another "BW" for use where an independent iteration is needed.
+(define_mode_iterator BW2 [HI QI])
+
 (define_mode_attr S [(SI "HI") (HI "QI")])
 (define_mode_attr s [(SI "hi") (HI "qi")])
 (define_mode_attr m [(SI ".d") (HI ".w") (QI ".b")])
@@ -2832,6 +2836,52 @@  (define_peephole2 ; andqu
   operands[3] = gen_rtx_ZERO_EXTEND (SImode, op1);
   operands[4] = GEN_INT (trunc_int_for_mode (INTVAL (operands[1]), QImode));
 })
+
+;; Fix a decomposed szext: fuse it with the memory operand of the
+;; load.  This is typically the sign-extension part of a decomposed
+;; "indirect offset" address.
+(define_peephole2 ; lra_szext_decomposed
+  [(parallel
+    [(set (match_operand:BW 0 "register_operand")
+	  (match_operand:BW 1 "memory_operand"))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (parallel
+    [(set (match_operand:SI 2 "register_operand") (szext:SI (match_dup 0)))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "REGNO (operands[0]) == REGNO (operands[2])
+   || peep2_reg_dead_p (2, operands[0])"
+  [(parallel
+    [(set (match_dup 2) (szext:SI (match_dup 1)))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])])
+
+;; Re-compose a decomposed "indirect offset" address for a szext
+;; operation.  The non-clobbering "addi" is generated by LRA.
+;; This and lra_szext_decomposed is covered by cris/rld-legit1.c.
+(define_peephole2 ; lra_szext_decomposed_indirect_with_offset
+  [(parallel
+    [(set (match_operand:SI 0 "register_operand")
+	  (sign_extend:SI (mem:BW (match_operand:SI 1 "register_operand"))))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])
+   (set (match_dup 0)
+	(plus:SI (match_dup 0) (match_operand:SI 2 "register_operand")))
+   (parallel
+    [(set (match_operand:SI 3 "register_operand")
+	  (szext:SI (mem:BW2 (match_dup 0))))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])]
+  "(REGNO (operands[0]) == REGNO (operands[3])
+    || peep2_reg_dead_p (3, operands[0]))
+   && (REGNO (operands[0]) == REGNO (operands[1])
+       || peep2_reg_dead_p (3, operands[0]))"
+  [(parallel
+    [(set
+      (match_dup 3)
+      (szext:SI
+       (mem:BW2 (plus:SI (szext:SI (mem:BW (match_dup 1))) (match_dup 2)))))
+     (clobber (reg:CC CRIS_CC0_REGNUM))])])
+
+;; Add operations with similar or same decomposed addresses here, when
+;; encountered - but only when covered by mentioned test-cases for at
+;; least one of the cases generalized in the pattern.
 
 ;; Local variables:
 ;; mode:emacs-lisp