[v2] xtensa: Eliminate the use of callee-saved register that saves and restores only once
Commit Message
In the case of the CALL0 ABI, values that must be preserved across
function calls are placed in the callee-saved registers (A12 through
A15) and referenced later. However, it is often the case that such a
register is written only once and read only once, each time by a simple
register-register move. (The frame pointer is needed to recover the
stack pointer and must be excluded.)
For example, if there are no other occurrences of register A14 in the
following code:
;; before
; prologue {
...
s32i.n a14, sp, 16
...
; } prologue
...
mov.n a14, a6
...
call0 foo
...
mov.n a8, a14
...
; epilogue {
...
l32i.n a14, sp, 16
...
; } epilogue
It can be transformed like this:
;; after
; prologue {
...
(deleted)
...
; } prologue
...
s32i.n a6, sp, 16
...
call0 foo
...
l32i.n a8, sp, 16
...
; epilogue {
...
(deleted)
...
; } epilogue
This patch introduces a new peephole2 pattern that implements the above.
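
As a rough illustration of where such code can come from, the C function
below keeps its fifth argument live across a call. Under the CALL0 ABI
incoming arguments arrive in a2 through a7, so the fifth one would start
in a6. The function names are made up, and the actual register choice is
up to the register allocator, so this is only a sketch of the shape and
not a guaranteed reproducer.

```c
static int calls;   /* how many times foo() has run */

/* Hypothetical callee; any call that may clobber caller-saved
   registers would do.  */
void foo (void)
{
  calls++;
}

/* `e` is needed after the call, so the compiler must keep it somewhere
   that survives foo(): a callee-saved register or a stack slot.  */
int keep_across_call (int a, int b, int c, int d, int e)
{
  (void) a; (void) b; (void) c; (void) d;
  foo ();
  return e + calls;
}
```

If the allocator picks a callee-saved register for `e`, that register is
written once before the call and read once after it, which is the
situation described above.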
gcc/ChangeLog:
* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the use of a callee-saved register that only saves and restores the
value of another register once, by using its stack slot directly.
---
gcc/config/xtensa/xtensa.md | 60 +++++++++++++++++++++++++++++++++++++
1 file changed, 60 insertions(+)
Comments
Hi Suwa-san,
On Mon, Jan 16, 2023 at 8:12 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> In the case of the CALL0 ABI, values that must be preserved across
> function calls are placed in the callee-saved registers (A12 through
> A15) and referenced later. However, it is often the case that such a
> register is written only once and read only once, each time by a simple
> register-register move. (The frame pointer is needed to recover the
> stack pointer and must be excluded.)
>
> For example, if there are no other occurrences of register A14 in the
> following code:
>
> ;; before
> ; prologue {
> ...
> s32i.n a14, sp, 16
> ...
> ; } prologue
> ...
> mov.n a14, a6
> ...
> call0 foo
> ...
> mov.n a8, a14
> ...
> ; epilogue {
> ...
> l32i.n a14, sp, 16
> ...
> ; } epilogue
>
> It can be transformed like this:
>
> ;; after
> ; prologue {
> ...
> (deleted)
> ...
> ; } prologue
> ...
> s32i.n a6, sp, 16
> ...
> call0 foo
> ...
> l32i.n a8, sp, 16
> ...
> ; epilogue {
> ...
> (deleted)
> ...
> ; } epilogue
>
> This patch introduces a new peephole2 pattern that implements the above.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the use of a callee-saved register that only saves and restores the
> value of another register once, by using its stack slot directly.
> ---
> gcc/config/xtensa/xtensa.md | 60 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 60 insertions(+)
There are still a few regressions in tests with -fcompare-debug because
the code generated with -g differs from the code generated without it:
+FAIL: gcc.dg/pr41241.c (test for excess errors)
+FAIL: gcc.dg/pr48159-1.c (test for excess errors)
+FAIL: gcc.dg/pr65521.c (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c -O2 (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c -O3 -g (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c -Os (test for excess errors)
+FAIL: gcc.dg/torture/pr42878-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
E.g. check the following test with -g0 and -g:
gcc/cc1 gcc/testsuite/gcc.dg/torture/pr42878-1.c -mlongcalls \
  -mtext-section-literals -fdiagnostics-plain-output -O3 \
  -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer \
  -finline-functions
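
The thread does not say why -g changes the output. One plausible
mechanism, offered purely as an assumption, is that a scan over a
register's uses also sees debug-only uses when -g is given and rejects
the transformation because of them. The toy model below (all names are
hypothetical; this is not GCC code) shows how a strict scan can give
different answers with and without debug uses, and how skipping them
makes the decision independent of -g.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of one use of the callee-saved register.  */
enum use_kind { USE_PROLOGUE_SAVE, USE_REG_COPY, USE_DEBUG, USE_OTHER };

/* Return true if the uses consist of the prologue save plus exactly one
   register-register copy.  Debug-only uses are ignored when SKIP_DEBUG
   is set and count as a reason to give up otherwise.  */
bool
eligible (const enum use_kind *uses, size_t n, bool skip_debug)
{
  bool saw_save = false, saw_copy = false;

  for (size_t i = 0; i < n; i++)
    switch (uses[i])
      {
      case USE_PROLOGUE_SAVE:
        saw_save = true;
        break;
      case USE_REG_COPY:
        if (saw_copy)
          return false;         /* more than one read: give up */
        saw_copy = true;
        break;
      case USE_DEBUG:
        if (skip_debug)
          break;                /* debug info must not affect codegen */
        return false;
      default:
        return false;           /* any other kind of use: give up */
      }
  return saw_save && saw_copy;
}
```

With a use list that differs only by one trailing `USE_DEBUG` entry, the
strict scan accepts one list and rejects the other, while the
debug-skipping scan treats both the same.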
@@ -3024,3 +3024,63 @@ FALLTHRU:;
   operands[1] = GEN_INT (imm0);
   operands[2] = GEN_INT (imm1);
 })
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+        (match_operand:SI 1 "reload_operand"))]
+  "!TARGET_WINDOWED_ABI && df
+   && epilogue_contains (insn)
+   && ! call_used_or_fixed_reg_p (REGNO (operands[0]))
+   && (!frame_pointer_needed
+       || REGNO (operands[0]) != HARD_FRAME_POINTER_REGNUM)"
+  [(const_int 0)]
+{
+  rtx reg = operands[0], pattern;
+  rtx_insn *insnP = NULL, *insnS = NULL, *insnR = NULL;
+  df_ref ref;
+  rtx_insn *insn;
+  for (ref = DF_REG_DEF_CHAIN (REGNO (reg));
+       ref; ref = DF_REF_NEXT_REG (ref))
+    if (DF_REF_CLASS (ref) != DF_REF_REGULAR)
+      continue;
+    else if ((insn = DF_REF_INSN (ref)) == curr_insn)
+      continue;
+    else if (GET_CODE (pattern = PATTERN (insn)) == SET
+             && rtx_equal_p (SET_DEST (pattern), reg)
+             && REG_P (SET_SRC (pattern)))
+      {
+        if (insnS)
+          FAIL;
+        insnS = insn;
+        continue;
+      }
+    else
+      FAIL;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg));
+       ref; ref = DF_REF_NEXT_REG (ref))
+    if (DF_REF_CLASS (ref) != DF_REF_REGULAR)
+      continue;
+    else if (prologue_contains (insn = DF_REF_INSN (ref)))
+      {
+        insnP = insn;
+        continue;
+      }
+    else if (GET_CODE (pattern = PATTERN (insn)) == SET
+             && rtx_equal_p (SET_SRC (pattern), reg)
+             && REG_P (SET_DEST (pattern)))
+      {
+        if (insnR)
+          FAIL;
+        insnR = insn;
+        continue;
+      }
+    else
+      FAIL;
+  if (!insnP || !insnS || !insnR)
+    FAIL;
+  SET_DEST (PATTERN (insnS)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnS);
+  SET_SRC (PATTERN (insnR)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnR);
+  set_insn_deleted (insnP);
+})
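
To make the control flow of the pattern easier to follow, here is a
stand-alone toy model of the same idea in plain C. It is a sketch under
simplifying assumptions, not the patch's logic verbatim: `struct
toy_insn`, `STACK_SLOT` and `use_slot_directly` are hypothetical names,
every insn is reduced to "dest = src", and the DF chains are replaced by
a linear walk over an array.

```c
#include <stdbool.h>
#include <stddef.h>

#define STACK_SLOT (-1)   /* stands in for the register's save slot */

enum region { IN_PROLOGUE, IN_BODY, IN_EPILOGUE };

/* Hypothetical, much simplified stand-in for a "dest = src" insn.  */
struct toy_insn
{
  enum region where;
  int dest;               /* register number, or STACK_SLOT */
  int src;                /* register number, or STACK_SLOT */
  bool deleted;
};

/* If REG is saved in the prologue, written once from a register, read
   once into a register and restored in the epilogue, redirect the write
   and the read to the stack slot and delete the save and the restore.
   Return false and change nothing otherwise.  */
bool
use_slot_directly (struct toy_insn *insns, size_t n, int reg)
{
  struct toy_insn *save = NULL, *set = NULL, *read = NULL, *restore = NULL;

  for (size_t i = 0; i < n; i++)
    {
      struct toy_insn *p = &insns[i];

      if (p->dest == reg && p->src == reg)
        return false;                   /* self-copy: not handled */
      if (p->dest == reg)               /* a definition of REG */
        {
          if (p->where == IN_EPILOGUE && p->src == STACK_SLOT && !restore)
            restore = p;
          else if (p->src != STACK_SLOT && !set)
            set = p;
          else
            return false;               /* second or unexpected def */
        }
      else if (p->src == reg)           /* a use of REG */
        {
          if (p->where == IN_PROLOGUE && p->dest == STACK_SLOT && !save)
            save = p;
          else if (p->dest != STACK_SLOT && !read)
            read = p;
          else
            return false;               /* second or unexpected use */
        }
    }
  if (!save || !set || !read || !restore)
    return false;

  set->dest = STACK_SLOT;               /* mov a14, a6  ->  s32i a6, slot */
  read->src = STACK_SLOT;               /* mov a8, a14  ->  l32i a8, slot */
  save->deleted = true;
  restore->deleted = true;
  return true;
}
```

Feeding it the four A14 insns from the "before" listing yields the
"after" listing; adding a second read of A14 makes it give up without
touching anything, which mirrors the FAIL paths in the pattern.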