[v2] xtensa: Eliminate unnecessary general-purpose reg-reg moves
Commit Message
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:
    /* example */
    double test(double a, double b) {
      return __builtin_copysign(a, b);
    }

    test:
        add.n   a3, a3, a3
        extui   a5, a5, 31, 1
        ssai    1
                                ;; be in the same BB
        src     a7, a5, a3      ;; No '0' in the source constraints
                                ;; No CALL insns in this span
                                ;; Both A3 and A7 are irrelevant to
                                ;;   insns in this span
        mov.n   a3, a7          ;; An unnecessary reg-reg move
                                ;; A7 is not used after this
        ret.n
The last two instructions above, excluding the return instruction,
could be done like this:
src a3, a5, a3
This symptom often occurs when handling DI/DFmode values with SImode
instructions. This patch solves the above problem using a peephole2
pattern.
gcc/ChangeLog:
* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the occurrence of a general-purpose register used only once and only
for transferring an intermediate value.
---
gcc/config/xtensa/xtensa.md | 43 +++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
Comments
Hi Suwa-san,
On Tue, Jan 17, 2023 at 9:25 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> Register-register move instructions that can be easily seen as
> unnecessary by the human eye may remain in the compiled result.
> For example:
>
>     /* example */
>     double test(double a, double b) {
>       return __builtin_copysign(a, b);
>     }
>
>     test:
>         add.n   a3, a3, a3
>         extui   a5, a5, 31, 1
>         ssai    1
>                                 ;; be in the same BB
>         src     a7, a5, a3      ;; No '0' in the source constraints
>                                 ;; No CALL insns in this span
>                                 ;; Both A3 and A7 are irrelevant to
>                                 ;;   insns in this span
>         mov.n   a3, a7          ;; An unnecessary reg-reg move
>                                 ;; A7 is not used after this
>         ret.n
>
> The last two instructions above, excluding the return instruction,
> could be done like this:
>
> src a3, a5, a3
>
> This symptom often occurs when handling DI/DFmode values with SImode
> instructions. This patch solves the above problem using a peephole2
> pattern.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the occurrence of a general-purpose register used only once and only
> for transferring an intermediate value.
> ---
> gcc/config/xtensa/xtensa.md | 43 +++++++++++++++++++++++++++++++++++++
> 1 file changed, 43 insertions(+)
This still generates an ICE, this time while building libstdc++:
during RTL pass: ce3
In file included from build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.h:2687,
                 from build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/locale:42,
                 from gcc/libstdc++-v3/src/c++11/locale-inst.cc:38,
                 from gcc/libstdc++-v3/src/c++11/wlocale-inst.cc:35:
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.tcc: In member function ‘_InIter std::num_get<_CharT, _InIter>::do_get(iter_type, iter_type, std::ios_base&, std::ios_base::iostate&, bool&) const [with _CharT = wchar_t; _InIter = std::istreambuf_iterator<wchar_t, std::char_traits<wchar_t> >]’:
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.tcc:686:5: internal compiler error: in df_refs_verify, at df-scan.cc:4009
686 | }
| ^
0x6eb0dc df_refs_verify
gcc/gcc/df-scan.cc:4009
0xd19a74 df_insn_refs_verify
gcc/gcc/df-scan.cc:4092
0xd1b94c df_bb_verify
gcc/gcc/df-scan.cc:4125
0xd1bd77 df_scan_verify()
gcc/gcc/df-scan.cc:4246
0xd06ca7 df_verify()
gcc/gcc/df-core.cc:1818
0xd06ca7 df_analyze_1
gcc/gcc/df-core.cc:1214
0x1a7287c if_convert
gcc/gcc/ifcvt.cc:5858
0x1a73ddd execute
gcc/gcc/ifcvt.cc:6026
@@ -3091,3 +3091,46 @@ FALLTHRU:;
df_insn_rescan (insnR);
set_insn_deleted (insnP);
})
+
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+        (match_operand 1 "register_operand"))]
+  "GET_MODE_SIZE (GET_MODE (operands[0])) == 4
+   && GET_MODE_SIZE (GET_MODE (operands[1])) == 4
+   && GP_REG_P (REGNO (operands[0])) && GP_REG_P (REGNO (operands[1]))
+   && peep2_reg_dead_p (1, operands[1])"
+  [(const_int 0)]
+{
+  basic_block bb = BLOCK_FOR_INSN (curr_insn);
+  rtx_insn *head = BB_HEAD (bb), *insn;
+  rtx dest = operands[0], src = operands[1], pattern, t_dest;
+  int i;
+  for (insn = PREV_INSN (curr_insn);
+       insn && insn != head;
+       insn = PREV_INSN (insn))
+    if (CALL_P (insn))
+      break;
+    else if (INSN_P (insn))
+      {
+        if (GET_CODE (pattern = PATTERN (insn)) == SET
+            && REG_P (t_dest = SET_DEST (pattern))
+            && GET_MODE_SIZE (GET_MODE (t_dest)) == 4
+            && REGNO (t_dest) == REGNO (src))
+          {
+            extract_constrain_insn (insn);
+            for (i = 1; i < recog_data.n_operands; ++i)
+              if (strchr (recog_data.constraints[i], '0'))
+                goto ABORT;
+            SET_REGNO (t_dest, REGNO (dest));
+            goto FALLTHRU;
+          }
+        if (reg_overlap_mentioned_p (dest, pattern)
+            || reg_overlap_mentioned_p (src, pattern)
+            || set_of (dest, insn)
+            || set_of (src, insn))
+          break;
+      }
+ABORT:
+  FAIL;
+FALLTHRU:;
+})
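
For readers who want to follow the backward scan without a GCC build, here is a minimal standalone C sketch of the same decision logic. It is only a toy model under stated assumptions: `struct toy_insn`, its `tied` flag (standing in for a '0' matching constraint on a source operand), `mentions` and `try_eliminate_move` are hypothetical names, not GCC APIs. It also ignores the mode-size and SET-pattern checks, and it scans to the start of the array rather than stopping at BB_HEAD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction: at most one register written, two read.  */
struct toy_insn {
  bool is_call;   /* stands in for CALL_P */
  bool tied;      /* stands in for a '0' constraint on a source operand */
  int dest;       /* register written, or -1 if none */
  int src[2];     /* registers read, or -1 if unused */
};

/* True if REG is read or written by IN.  */
static bool mentions(const struct toy_insn *in, int reg)
{
  return in->dest == reg || in->src[0] == reg || in->src[1] == reg;
}

/* bb[mov_idx] is a move "dest <- src" whose source is dead afterwards.
   Walk backwards to the insn that defines src.  If nothing in between
   is a call or touches either register, and the defining insn has no
   tied operand, retarget it to write dest directly and return true
   (the caller may then drop the move).  Otherwise return false.  */
static bool try_eliminate_move(struct toy_insn *bb, size_t mov_idx)
{
  int dest = bb[mov_idx].dest;
  int src = bb[mov_idx].src[0];
  assert(dest >= 0 && src >= 0);

  for (size_t i = mov_idx; i-- > 0; ) {
    struct toy_insn *in = &bb[i];
    if (in->is_call)
      return false;
    if (in->dest == src) {
      if (in->tied)
        return false;
      in->dest = dest;   /* models SET_REGNO (t_dest, REGNO (dest)) */
      return true;
    }
    if (mentions(in, dest) || mentions(in, src))
      return false;
  }
  return false;
}
```

Modelling the copysign sequence from the commit message as five toy instructions, calling `try_eliminate_move` on the final move retargets the `src` instruction to write a3. The model says nothing about dataflow bookkeeping inside GCC, which is where the ICE above is reported.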