reload: Handle generating reloads that also clobbers flags
Checks
Commit Message
Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
disabled, where it regresses gcc.target/cris/rld-legit1.c;
as expected, because that test guards proper function of its
LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
decomposed address elements.
LRA also causes a similar decomposition (and worse, in even
smaller bits), but it can create valid insns as-is.
Unfortunately, it doesn't have something equivalent to
LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
cases where that hook helped reload.
I fear reload-related patches these days are treated like a
redheaded stepchild and even worse as this one is intended
for stage 1. Either way, I need to create a reference to
it, and it's properly tested and has been a help when
working towards LRA, thus might help other targets: ok to
install for the next stage 1?
-- >8 --
When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
this code is now required for reload to generate valid insns
from some reload-decomposed addresses, for example the
(plus:SI
(sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
(reg/v/f:SI 33 [ y ]))
generated in gcc.target/cris/rld-legit1.c (a valid address
but with two registers needing reload). Now after decc0:ing,
most SET insns for former cc0 targets need to be a parallel
with a clobber of the flags register. Such targets
typically have TARGET_FLAGS_REGNUM set to a valid register.
* reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
emit_insn_if_valid_for_reload.
(emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
to be recognized, also try emitting a parallel that clobbers
TARGET_FLAGS_REGNUM, as applicable.
---
gcc/reload1.cc | 29 ++++++++++++++++++++++++++---
1 file changed, 26 insertions(+), 3 deletions(-)
Comments
On 2/15/23 08:34, Hans-Peter Nilsson via Gcc-patches wrote:
> Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
> disabled, where it regresses gcc.target/cris/rld-legit1.c;
> as expected, because that test guards proper function of its
> LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
> decomposed address elements.
>
> LRA also causes a similar decomposition (and worse, in even
> smaller bits), but it can create valid insns as-is.
> Unfortunately, it doesn't have something equivalent to
> LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
> cases where that hook helped reload.
>
> I fear reload-related patches these days are treated like a
> redheaded stepchild and even worse as this one is intended
> for stage 1. Either way, I need to create a reference to
> it, and it's properly tested and has been a help when
> working towards LRA, thus might help other targets: ok to
> install for the next stage 1?
>
> -- >8 --
> When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
> this code is now required for reload to generate valid insns
> from some reload-decomposed addresses, for example the
> (plus:SI
> (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
> (reg/v/f:SI 33 [ y ]))
> generated in gcc.target/cris/rld-legit1.c (a valid address
> but with two registers needing reload). Now after decc0:ing,
> most SET insns for former cc0 targets need to be a parallel
> with a clobber of the flags register. Such targets
> typically have TARGET_FLAGS_REGNUM set to a valid register.
>
> * reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
> emit_insn_if_valid_for_reload.
> (emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
> to be recognized, also try emitting a parallel that clobbers
> TARGET_FLAGS_REGNUM, as applicable.
BUt isn't it the case that we're not supposed to be exposing the flags
register until after reload? And if that's the case, then why would
this be necessary? Clearly I must be missing something.
jeff
> Date: Tue, 18 Apr 2023 07:43:41 -0600
> From: Jeff Law <jeffreyalaw@gmail.com>
> On 2/15/23 08:34, Hans-Peter Nilsson via Gcc-patches wrote:
> > Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
> > disabled, where it regresses gcc.target/cris/rld-legit1.c;
> > as expected, because that test guards proper function of its
> > LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
> > decomposed address elements.
> >
> > LRA also causes a similar decomposition (and worse, in even
> > smaller bits), but it can create valid insns as-is.
> > Unfortunately, it doesn't have something equivalent to
> > LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
> > cases where that hook helped reload.
> >
> > I fear reload-related patches these days are treated like a
> > redheaded stepchild and even worse as this one is intended
> > for stage 1. Either way, I need to create a reference to
> > it, and it's properly tested and has been a help when
> > working towards LRA, thus might help other targets: ok to
> > install for the next stage 1?
> >
> > -- >8 --
> > When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
> > this code is now required for reload to generate valid insns
> > from some reload-decomposed addresses, for example the
> > (plus:SI
> > (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
> > (reg/v/f:SI 33 [ y ]))
> > generated in gcc.target/cris/rld-legit1.c (a valid address
> > but with two registers needing reload). Now after decc0:ing,
> > most SET insns for former cc0 targets need to be a parallel
> > with a clobber of the flags register. Such targets
> > typically have TARGET_FLAGS_REGNUM set to a valid register.
> >
> > * reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
> > emit_insn_if_valid_for_reload.
> > (emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
> > to be recognized, also try emitting a parallel that clobbers
> > TARGET_FLAGS_REGNUM, as applicable.
> BUt isn't it the case that we're not supposed to be exposing the flags
> register until after reload? And if that's the case, then why would
> this be necessary? Clearly I must be missing something.
That "supposed to" is only *one* possible implementation.
The one in CRIS - and I believe the preferred one; one I
should advocate more - is to *always* expose clobbering of
the flags. (I managed to do the CRIS decc0ification
transformation without loss of performance. There were much
fewer issues with code taking PATTERN (insn) and failing on
it being PARALLEL than I had expected, much thanks to use of
rtx_single_set.)
Think about it: why should the semantics of a valid insn
change after a "random" pass? That's almost as crazy as the
implied semantics of cc0.
brgds, H-P
> That "supposed to" is only *one* possible implementation.
> The one in CRIS - and I believe the preferred one; one I
> should advocate more - is to *always* expose clobbering of
> the flags.
Yes, both approaches are acceptable IMO and should work.
On 4/18/23 08:12, Hans-Peter Nilsson wrote:
>> Date: Tue, 18 Apr 2023 07:43:41 -0600
>> From: Jeff Law <jeffreyalaw@gmail.com>
>
>> On 2/15/23 08:34, Hans-Peter Nilsson via Gcc-patches wrote:
>>> Regtested cris-elf with its LEGITIMIZE_RELOAD_ADDRESS
>>> disabled, where it regresses gcc.target/cris/rld-legit1.c;
>>> as expected, because that test guards proper function of its
>>> LEGITIMIZE_RELOAD_ADDRESS i.e., that there's no sign of
>>> decomposed address elements.
>>>
>>> LRA also causes a similar decomposition (and worse, in even
>>> smaller bits), but it can create valid insns as-is.
>>> Unfortunately, it doesn't have something equivalent to
>>> LEGITIMIZE_RELOAD_ADDRESS so it generates worse code for
>>> cases where that hook helped reload.
>>>
>>> I fear reload-related patches these days are treated like a
>>> redheaded stepchild and even worse as this one is intended
>>> for stage 1. Either way, I need to create a reference to
>>> it, and it's properly tested and has been a help when
>>> working towards LRA, thus might help other targets: ok to
>>> install for the next stage 1?
>>>
>>> -- >8 --
>>> When LEGITIMIZE_RELOAD_ADDRESS for cris-elf is disabled,
>>> this code is now required for reload to generate valid insns
>>> from some reload-decomposed addresses, for example the
>>> (plus:SI
>>> (sign_extend:SI (mem:HI (reg/v/f:SI 32 [ a ]) [1 *a_6(D)+0 S2 A8]))
>>> (reg/v/f:SI 33 [ y ]))
>>> generated in gcc.target/cris/rld-legit1.c (a valid address
>>> but with two registers needing reload). Now after decc0:ing,
>>> most SET insns for former cc0 targets need to be a parallel
>>> with a clobber of the flags register. Such targets
>>> typically have TARGET_FLAGS_REGNUM set to a valid register.
>>>
>>> * reload1.cc (emit_insn_if_valid_for_reload_1): Rename from
>>> emit_insn_if_valid_for_reload.
>>> (emit_insn_if_valid_for_reload): Call new helper, and if a SET fails
>>> to be recognized, also try emitting a parallel that clobbers
>>> TARGET_FLAGS_REGNUM, as applicable.
>> BUt isn't it the case that we're not supposed to be exposing the flags
>> register until after reload? And if that's the case, then why would
>> this be necessary? Clearly I must be missing something.
>
> That "supposed to" is only *one* possible implementation.
> The one in CRIS - and I believe the preferred one; one I
> should advocate more - is to *always* expose clobbering of
> the flags. (I managed to do the CRIS decc0ification
> transformation without loss of performance. There were much
> fewer issues with code taking PATTERN (insn) and failing on
> it being PARALLEL than I had expected, much thanks to use of
> rtx_single_set.)
>
> Think about it: why should the semantics of a valid insn
> change after a "random" pass? That's almost as crazy as the
> implied semantics of cc0.
Ah, yes, thanks for the reminder that there's multiple approaches here.
If I cared enough it'd probably make more sense at this point to expose
cc0 early on the H8 as doing so would allow easier codegen for overflow
tests which in turn could significantly speed up the testsuite.
OK for the trunk.
jeff
@@ -8377,11 +8377,11 @@ emit_reload_insns (class insn_chain *chain)
reg_reloaded_dead |= reg_reloaded_died;
}
-/* Go through the motions to emit INSN and test if it is strictly valid.
- Return the emitted insn if valid, else return NULL. */
+
+/* Helper for emit_insn_if_valid_for_reload. */
static rtx_insn *
-emit_insn_if_valid_for_reload (rtx pat)
+emit_insn_if_valid_for_reload_1 (rtx pat)
{
rtx_insn *last = get_last_insn ();
int code;
@@ -8403,6 +8403,29 @@ emit_insn_if_valid_for_reload (rtx pat)
return NULL;
}
+/* Go through the motions to emit INSN and test if it is strictly valid.
+ Return the emitted insn if valid, else return NULL. */
+
+static rtx_insn *
+emit_insn_if_valid_for_reload (rtx pat)
+{
+ rtx_insn *insn = emit_insn_if_valid_for_reload_1 (pat);
+
+ if (insn)
+ return insn;
+
+ /* If the pattern is a SET, and this target has a single
+ flags-register, try again with a PARALLEL that clobbers that
+ register. */
+ if (targetm.flags_regnum == INVALID_REGNUM || GET_CODE (pat) != SET)
+ return NULL;
+
+ rtx flags_clobber = gen_hard_reg_clobber (CCmode, targetm.flags_regnum);
+ rtx parpat = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, pat, flags_clobber));
+
+ return emit_insn_if_valid_for_reload (parpat);
+}
+
/* Emit code to perform a reload from IN (which may be a reload register) to
OUT (which may also be a reload register). IN or OUT is from operand
OPNUM with reload type TYPE.