Checks
Commit Message
This small patch changes everything that checks targetm.lra_p behave as
if it returned true.
It has no effect on any primary or secondary target. It also is fine
for nds32 and for nios2, and it works fine for microblaze (which used
old reload before), resulting in smaller code.
I have patches to completely rip out old reload, and more stuff after
that, but of course not everything is nice yet:
It makes a few targets no longer build though. In my testing (of all
linux targets that built before) these are alpha, c6x, h8300, m68k,
32-bit parisc, sh, and xtensa.
alpha builds the compiler, but it then crashes with
/home/segher/src/kernel/drivers/tty/serial/serial_core.c:1029:1: internal compiler error: maximum number of generated reload insns per insn achieved (90)
(and in three more files) which can mean anything unfortunately.
c6x is more exciting:
/home/segher/src/kernel/fs/char_dev.c:598:1: internal compiler error: in priority, at haifa-sched.cc:1597
which is
/* We should not be interested in priority of an already scheduled insn. */
gcc_assert (QUEUE_INDEX (insn) != QUEUE_SCHEDULED);
h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
141 | }
| ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
(reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?
m68k builds the compiler fine, but then has
/home/segher/src/kernel/fs/squashfs/namei.c: In function 'squashfs_lookup':
/home/segher/src/kernel/fs/squashfs/namei.c:232:1: error: insn does not satisfy its constraints:
232 | }
| ^
(insn 270 470 271 30 (set (mem:SI (pre_dec:SI (reg/f:SI 15 %sp)) [1 S4 A16])
(plus:SI (sign_extend:SI (reg:HI 8 %a0 [175]))
(reg:SI 2 %d2 [orig:173 val ] [173]))) "/home/segher/src/kernel/fs/squashfs/namei.c":212:13 77 {pushasi}
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
during RTL pass: postreload
/home/segher/src/kernel/fs/squashfs/namei.c:232:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692
(and similar in two more files).
32-bit parisc now runs into the 90 reloads thing (parisc64 already did
without the patch).
sh doesn't build GCC:
/home/segher/src/gcc/libgcc/libgcc2.c: In function '__divdc3':
/home/segher/src/gcc/libgcc/libgcc2.c:2182:1: error: unable to generate reloads for:
2182 | }
| ^
(call_insn/u 132 131 1855 7 (parallel [
(set (reg:SI 0 r0)
(call (mem:SI (symbol_ref:SI ("__ltdf2") [flags 0x41] <function_decl 0x7fff8a796500 __ltdf2>) [0 S4 A32])
(const_int 0 [0])))
(use (reg:SI 154 fpscr0))
(use (reg:SI 12 r12))
(clobber (reg:SI 146 pr))
(clobber (reg:SI 758))
]) "/home/segher/src/gcc/libgcc/libgcc2.c":2082:7 233 {call_value_pcrel}
(expr_list:REG_DEAD (reg:DF 6 r6)
(expr_list:REG_DEAD (reg:DF 4 r4)
(expr_list:REG_CALL_DECL (symbol_ref:SI ("__ltdf2") [flags 0x41] <function_decl 0x7fff8a796500 __ltdf2>)
(expr_list:REG_EH_REGION (const_int -2147483648 [0xffffffff80000000])
(nil)))))
(expr_list (use (reg:DF 6 r6))
(expr_list (use (reg:DF 4 r4))
(nil))))
during RTL pass: reload
And finally, xtensa does
/home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy its constraints:
840 | }
| ^
(insn 8 7 9 2 (set (reg:SI 9 a9 [57])
(const_int 1431655765 [0x55555555])) "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
(expr_list:REG_EQUIV (const_int 1431655765 [0x55555555])
(nil)))
during RTL pass: postreload
/home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692
- ~ - ~ -
All in all, more worked than expected. I think it is time to finally
make this switch.
I did not test targets without a linux port. I build the linux kernel
as well, to see if the resulting compiler actually works :-)
Suggestions what to use for other targets are welcome.
Is there any way to get the final targets updated for LRA?
Other rants? :-)
Segher
---
gcc/auto-inc-dec.cc | 2 +-
gcc/combine.cc | 2 +-
gcc/ira-lives.cc | 5 -----
gcc/ira.cc | 2 +-
gcc/recog.cc | 2 +-
gcc/targhooks.cc | 4 ----
6 files changed, 4 insertions(+), 13 deletions(-)
Comments
> On Oct 13, 2022, at 7:56 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
> This small patch changes everything that checks targetm.lra_p behave as
> if it returned true.
>
> It has no effect on any primary or secondary target. It also is fine
> for nds32 and for nios2, and it works fine for microblaze (which used
> old reload before), resulting in smaller code.
>
> I have patches to completely rip out old reload, and more stuff after
> that, but of course not everything is nice yet:
I guess I'll have to look harder to see if it's possible to make LRA handle CISC addressing modes like memory indirect, autoincrement, autodecrement, and others that the old reload handles at least somewhat. Ideally LRA should do a better job; right now I believe it doesn't really do these things at all. Targets like pdp11 and vax would like these.
paul
On 10/13/22 17:56, Segher Boessenkool wrote:
>
> h8300 fails during GCC build:
> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
> 141 | }
> | ^
> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> (nil)))
> during RTL pass: final
> which looks like a backend bug, I don't see a pattern that could split
> this (without needing an extra clobber)?
I'm aware of this -- its invalid RTL:
Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.
I'd actually tried to turn on LRA for the H8 port a little while ago and
stumbled over it.
I'm aware of a similar situation involving a general register on the H8,
but using reload instead of LRA. I looked at it a while back and my
recollection was that the insn was actually fine until reload got its
grubby hands on it. And when I wandered reload to hunt for anything
which handled the restriction noted above, I didn't find anything. If
you were to search for H8 bugs in bugzilla, it should be discoverable.
While we could potentially work around this in the backend, it'd be a
hack at best. It hasn't risen to the top of my priority list yet. I
considered suggesting we change this from "invalid" to "target defined"
behavior, but that felt like a cop-out.
jeff
On 10/13/22 17:56, Segher Boessenkool wrote:
>
> h8300 fails during GCC build:
> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
> 141 | }
> | ^
> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> (nil)))
> during RTL pass: final
> which looks like a backend bug, I don't see a pattern that could split
> this (without needing an extra clobber)?
Really smells like an LRA bug to me.
We have this as we leave IRA:
(insn 10 9 11 2 (set (reg/f:SI 30)
(plus:SI (reg/f:SI 11 fp)
(const_int -4 [0xfffffffffffffffc]))) "j.c":31:7 264 {*addsi}
(expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
(const_int -4 [0xfffffffffffffffc]))
(nil)))
(insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
(reg/f:SI 30)) "j.c":31:7 19 {*movsi}
(expr_list:REG_DEAD (reg/f:SI 30)
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil))))
And as we leave LRA:
(note 10 9 11 2 NOTE_INSN_DELETED)
(insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
(reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
Register elimination ultimately discovered that (reg 30) was the same as
the stack pointer and did the natural substitution. The natural
substitution results in invalid RTL and there's really no way to back
out and do something different AFAICT in lra-eliminations.cc.
The only reason we fault is because the H8 backend knows this is invalid
RTL and refuses to split it. If we were to try and re-recognize the
insn in question it would fail to recognize after the substitution
because the auto-inc'd operand overlaps with the other operand.
Jeff
On 2022/10/14 8:56, Segher Boessenkool wrote:
> And finally, xtensa does
> /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy its constraints:
> 840 | }
> | ^
> (insn 8 7 9 2 (set (reg:SI 9 a9 [57])
> (const_int 1431655765 [0x55555555])) "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
> (expr_list:REG_EQUIV (const_int 1431655765 [0x55555555])
> (nil)))
> during RTL pass: postreload
> /home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692
This is a result of knowing that Reload is tolerant of out-of-constraint constants.
LRA support needs to be taken care of before that, ie. in the "split1" pass.
Excuse me in haste.
> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/13/22 17:56, Segher Boessenkool wrote:
>>
>> h8300 fails during GCC build:
>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>> 141 | }
>> | ^
>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>> (nil)))
>> during RTL pass: final
>> which looks like a backend bug, I don't see a pattern that could split
>> this (without needing an extra clobber)?
>
> I'm aware of this -- its invalid RTL:
>
> Uses of the register outside of an address are not permitted within the
> same insn as a use in an embedded side effect expression because such
> insns behave differently on different machines and hence must be treated
> as ambiguous and disallowed.
I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
But the pdp11 case is actually not as restrictive as the rule you mentioned. The problem case is register N source, autoinc/dec rN destination. The opposite case, which we see here -- autoinc/dec Rn source, Rn destination -- is just fine. Perhaps not all that important, but the ISA definition does not object to it. So I'm not sure why there would be a general rule that says it's considered ambiguous when the target machine architecture says it is not.
paul
On 10/14/22 06:37, Koning, Paul wrote:
>
>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>> h8300 fails during GCC build:
>>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>> 141 | }
>>> | ^
>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>> (nil)))
>>> during RTL pass: final
>>> which looks like a backend bug, I don't see a pattern that could split
>>> this (without needing an extra clobber)?
>> I'm aware of this -- its invalid RTL:
>>
>> Uses of the register outside of an address are not permitted within the
>> same insn as a use in an embedded side effect expression because such
>> insns behave differently on different machines and hence must be treated
>> as ambiguous and disallowed.
> I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
I've long suspected the pdp11 was the inspiration for this restriction
(I have memories of noting it before I relocated to Utah, so circa
1992). The key problem is the generic parts of the compiler don't know
what the semantics ought to be -- so it's not obvious when they do a
substitution whether or not the substitution of one reg for another is
actually valid. It's important to remember that sometimes when we
substitute one register for another, we don't have any contextual
information about source vs dest -- it's a long standing wart that
causes problems in other cases as well.
That punts the problem to the backends and the H8 actually tries to deal
with this restriction. Basically in the movxx pattern conditions, when
the destination uses an autoinc addressing mode, the pattern's condition
will check that the source register is different. I would expect other
ports likely to do something similar.
But that approach falls down with reload/lra doing substitutions without
validating the result. I guess it might be possible to cobble together
something with secondary reloads, but it's way way way down on my todo list.
And yes, this case where the autoinc is on the destination works
consistently on the H8 as well. We could consider loosening the
restrictions and let this through. It's certainly simpler as it's a doc
change and removing a bit of code on the H8. It sounds like the pdp11
already assumes that case is valid.
Jeff
On Fri, Oct 14, 2022 at 12:36:47AM +0000, Koning, Paul wrote:
> I guess I'll have to look harder to see if it's possible to make LRA handle CISC addressing modes like memory indirect, autoincrement, autodecrement, and others that the old reload handles at least somewhat. Ideally LRA should do a better job; right now I believe it doesn't really do these things at all. Targets like pdp11 and vax would like these.
So what does it do now? Break every more complex addressing mode apart
again? Or ICE? Or something in between?
Segher
On Fri, Oct 14, 2022 at 03:20:40PM +0900, Takayuki 'January June' Suwa wrote:
> On 2022/10/14 8:56, Segher Boessenkool wrote:
> > And finally, xtensa does
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy its constraints:
> > 840 | }
> > | ^
> > (insn 8 7 9 2 (set (reg:SI 9 a9 [57])
> > (const_int 1431655765 [0x55555555])) "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
> > (expr_list:REG_EQUIV (const_int 1431655765 [0x55555555])
> > (nil)))
> > during RTL pass: postreload
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692
>
> This is a result of knowing that Reload is tolerant of out-of-constraint constants.
> LRA support needs to be taken care of before that, ie. in the "split1" pass.
> Excuse me in haste.
So old reload did a split itself? Or it left it to say the split2 pass?
Thanks for looking into it!
Segher
Hi!
On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:
> On 10/13/22 17:56, Segher Boessenkool wrote:
> >h8300 fails during GCC build:
> >/home/segher/src/gcc/libgcc/unwind.inc: In function
> >'_Unwind_SjLj_RaiseException':
> >/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
> > 141 | }
> > | ^
> >(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
> > (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12
> > 19 {*movsi}
> > (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> > (nil)))
> >during RTL pass: final
> >which looks like a backend bug, I don't see a pattern that could split
> >this (without needing an extra clobber)?
>
> Really smells like an LRA bug to me.
>
>
> We have this as we leave IRA:
>
>
> (insn 10 9 11 2 (set (reg/f:SI 30)
> (plus:SI (reg/f:SI 11 fp)
> (const_int -4 [0xfffffffffffffffc]))) "j.c":31:7 264
> {*addsi}
> (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
> (const_int -4 [0xfffffffffffffffc]))
> (nil)))
> (insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
> (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
> (expr_list:REG_DEAD (reg/f:SI 30)
> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> (nil))))
>
>
> And as we leave LRA:
>
> (note 10 9 11 2 NOTE_INSN_DELETED)
> (insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
> (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> (nil)))
LRA only ever generates insns that pass recog. The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?
> Register elimination ultimately discovered that (reg 30) was the same as
> the stack pointer and did the natural substitution. The natural
> substitution results in invalid RTL and there's really no way to back
> out and do something different AFAICT in lra-eliminations.cc.
>
> The only reason we fault is because the H8 backend knows this is invalid
> RTL and refuses to split it. If we were to try and re-recognize the
> insn in question it would fail to recognize after the substitution
> because the auto-inc'd operand overlaps with the other operand.
But it *did* recog? Or does LRA somehow not always recog() everything?
I always thought that was one of the huge improvements over old reload
(it does everything very locally instead of very globally)!
Segher
> On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/14/22 06:37, Koning, Paul wrote:
>>
>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>
>>>
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>>> h8300 fails during GCC build:
>>>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>>> 141 | }
>>>> | ^
>>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>>> (nil)))
>>>> during RTL pass: final
>>>> which looks like a backend bug, I don't see a pattern that could split
>>>> this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>>
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
>
> I've long suspected the pdp11 was the inspiration for this restriction (I have memories of noting it before I relocated to Utah, so circa 1992). The key problem is the generic parts of the compiler don't know what the semantics ought to be -- so it's not obvious when they do a substitution whether or not the substitution of one reg for another is actually valid. It's important to remember that sometimes when we substitute one register for another, we don't have any contextual information about source vs dest -- it's a long standing wart that causes problems in other cases as well.
>
> That punts the problem to the backends and the H8 actually tries to deal with this restriction. Basically in the movxx pattern conditions, when the destination uses an autoinc addressing mode, the pattern's condition will check that the source register is different. I would expect other ports likely to do something similar.
>
> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
If reload obeys the constraints in the patterns then the back end machine definition can be written to avoid the problematic cases, and it is no longer necessary to have a general (and as I pointed out, overly broad) rule in generic code.
paul
> Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org>:
>
>
>> On 10/14/22 06:37, Koning, Paul wrote:
>>
>>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>
>>>
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>>> h8300 fails during GCC build:
>>>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>>> 141 | }
>>>> | ^
>>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>>> (nil)))
>>>> during RTL pass: final
>>>> which looks like a backend bug, I don't see a pattern that could split
>>>> this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>>
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
>
> I've long suspected the pdp11 was the inspiration for this restriction (I have memories of noting it before I relocated to Utah, so circa 1992). The key problem is the generic parts of the compiler don't know what the semantics ought to be -- so it's not obvious when they do a substitution whether or not the substitution of one reg for another is actually valid. It's important to remember that sometimes when we substitute one register for another, we don't have any contextual information about source vs dest -- it's a long standing wart that causes problems in other cases as well.
>
> That punts the problem to the backends and the H8 actually tries to deal with this restriction. Basically in the movxx pattern conditions, when the destination uses an autoinc addressing mode, the pattern's condition will check that the source register is different. I would expect other ports likely to do something similar.
>
> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>
> And yes, this case where the autoinc is on the destination works consistently on the H8 as well. We could consider loosening the restrictions and let this through. It's certainly simpler as it's a doc change and removing a bit of code on the H8. It sounds like the pdp11 already assumes that case is valid.
But what is the semantics of the RTL IL?
That ought to be documented.
>
> Jeff
>
> On Oct 14, 2022, at 12:18 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
> On Fri, Oct 14, 2022 at 12:36:47AM +0000, Koning, Paul wrote:
>> I guess I'll have to look harder to see if it's possible to make LRA handle CISC addressing modes like memory indirect, autoincrement, autodecrement, and others that the old reload handles at least somewhat. Ideally LRA should do a better job; right now I believe it doesn't really do these things at all. Targets like pdp11 and vax would like these.
>
> So what does it do now? Break every more complex addressing mode apart
> again? Or ICE? Or something in between?
The former. LRA does handle some cases but not all that the target permits and not as many as the old reload.
Example:
unsigned int cksum (unsigned int *buf, unsigned int len)
{
unsigned int ret = 0;
do {
ret += *buf++;
} while (--len != 0);
return ret;
}
The loop looks like this:
L_2:
add (r2)+,r0
sob r1,L_2
which is what I would expect. Now throw in an indirection:
Old reload produces this loop:
L_2:
add @(r2)+,r0
sob r1,L_2
while LRA doesn't understand it can use the autoincrement indirect mode:
L_2:
mov (r2)+,r3
add (r3),r0
sob r1,L_2
This is from a GCC 13.0 test build of last June, with -O2 -m45, with and without -mlra.
paul
On 10/14/22 10:37, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:
>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>> h8300 fails during GCC build:
>>> /home/segher/src/gcc/libgcc/unwind.inc: In function
>>> '_Unwind_SjLj_RaiseException':
>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>> 141 | }
>>> | ^
>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12
>>> 19 {*movsi}
>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>> (nil)))
>>> during RTL pass: final
>>> which looks like a backend bug, I don't see a pattern that could split
>>> this (without needing an extra clobber)?
>> Really smells like an LRA bug to me.
>>
>>
>> We have this as we leave IRA:
>>
>>
>> (insn 10 9 11 2 (set (reg/f:SI 30)
>> (plus:SI (reg/f:SI 11 fp)
>> (const_int -4 [0xfffffffffffffffc]))) "j.c":31:7 264
>> {*addsi}
>> (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
>> (const_int -4 [0xfffffffffffffffc]))
>> (nil)))
>> (insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
>> (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
>> (expr_list:REG_DEAD (reg/f:SI 30)
>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>> (nil))))
>>
>>
>> And as we leave LRA:
>>
>> (note 10 9 11 2 NOTE_INSN_DELETED)
>> (insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
>> (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>> (nil)))
> LRA only ever generates insns that pass recog. The backend allows this
> define_insn, requiring it to be split (it returns template "#"), but
> then somehow it doesn't match in any split pass?
Nope. The elimination code will just change one register without
re-recognizing. That's precisely what happens here.
>
>> Register elimination ultimately discovered that (reg 30) was the same as
>> the stack pointer and did the natural substitution. The natural
>> substitution results in invalid RTL and there's really no way to back
>> out and do something different AFAICT in lra-eliminations.cc.
>>
>> The only reason we fault is because the H8 backend knows this is invalid
>> RTL and refuses to split it. If we were to try and re-recognize the
>> insn in question it would fail to recognize after the substitution
>> because the auto-inc'd operand overlaps with the other operand.
> But it *did* recog? Or does LRA somehow not always recog() everything?
> I always thought that was one of the huge improvements over old reload
> (it does everything very locally instead of very globally)!
No, LRA does not force re-recognition in some cases, particularly for
register eliminations.
jeff
On 10/14/22 10:37, Koning, Paul wrote:
>
>> On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 10/14/22 06:37, Koning, Paul wrote:
>>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>>
>>>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>>>> h8300 fails during GCC build:
>>>>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>>>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>>>> 141 | }
>>>>> | ^
>>>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>>>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>>>> (nil)))
>>>>> during RTL pass: final
>>>>> which looks like a backend bug, I don't see a pattern that could split
>>>>> this (without needing an extra clobber)?
>>>> I'm aware of this -- its invalid RTL:
>>>>
>>>> Uses of the register outside of an address are not permitted within the
>>>> same insn as a use in an embedded side effect expression because such
>>>> insns behave differently on different machines and hence must be treated
>>>> as ambiguous and disallowed.
>>> I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
>> I've long suspected the pdp11 was the inspiration for this restriction (I have memories of noting it before I relocated to Utah, so circa 1992). The key problem is the generic parts of the compiler don't know what the semantics ought to be -- so it's not obvious when they do a substitution whether or not the substitution of one reg for another is actually valid. It's important to remember that sometimes when we substitute one register for another, we don't have any contextual information about source vs dest -- it's a long standing wart that causes problems in other cases as well.
>>
>> That punts the problem to the backends and the H8 actually tries to deal with this restriction. Basically in the movxx pattern conditions, when the destination uses an autoinc addressing mode, the pattern's condition will check that the source register is different. I would expect other ports likely to do something similar.
>>
>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
It's not really a constraint issue -- the pattern's condition would
cause this not to recognize, but LRA doesn't re-recognize the insn. We
might be able to hack something in the constraints to force a reload of
the source operand in this case. Ugly, but a possibility.
Jeff
On 10/14/22 10:39, Richard Biener wrote:
>
>> Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org>:
>>
>>
>>> On 10/14/22 06:37, Koning, Paul wrote:
>>>
>>>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> On 10/13/22 17:56, Segher Boessenkool wrote:
>>>>> h8300 fails during GCC build:
>>>>> /home/segher/src/gcc/libgcc/unwind.inc: In function '_Unwind_SjLj_RaiseException':
>>>>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>>>> 141 | }
>>>>> | ^
>>>>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
>>>>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 {*movsi}
>>>>> (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>>>>> (nil)))
>>>>> during RTL pass: final
>>>>> which looks like a backend bug, I don't see a pattern that could split
>>>>> this (without needing an extra clobber)?
>>>> I'm aware of this -- its invalid RTL:
>>>>
>>>> Uses of the register outside of an address are not permitted within the
>>>> same insn as a use in an embedded side effect expression because such
>>>> insns behave differently on different machines and hence must be treated
>>>> as ambiguous and disallowed.
>>> I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed differently on different machine models. The solution I picked is to create two sets of machine-specific constraint codes, one for "register N" and the other for "autoinc/dec of any register other than N" and pairing those. (You can see this in pdp11.md, the mov<mode> definition.)
>> I've long suspected the pdp11 was the inspiration for this restriction (I have memories of noting it before I relocated to Utah, so circa 1992). The key problem is the generic parts of the compiler don't know what the semantics ought to be -- so it's not obvious when they do a substitution whether or not the substitution of one reg for another is actually valid. It's important to remember that sometimes when we substitute one register for another, we don't have any contextual information about source vs dest -- it's a long standing wart that causes problems in other cases as well.
>>
>> That punts the problem to the backends and the H8 actually tries to deal with this restriction. Basically in the movxx pattern conditions, when the destination uses an autoinc addressing mode, the pattern's condition will check that the source register is different. I would expect other ports likely to do something similar.
>>
>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>>
>> And yes, this case where the autoinc is on the destination works consistently on the H8 as well. We could consider loosening the restrictions and let this through. It's certainly simpler as it's a doc change and removing a bit of code on the H8. It sounds like the pdp11 already assumes that case is valid.
> But what is the semantics of the RTL IL?
> That ought to be documented.
*If* we went a route to relax the restriction (and I'm still not sure
that's a good idea), we absolutely would have to document the semantics.
jeff
On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
> >LRA only ever generates insns that pass recog. The backend allows this
> >define_insn, requiring it to be split (it returns template "#"), but
> >then somehow it doesn't match in any split pass?
>
> Nope. The elimination code will just change one register without
> re-recognizing. That's precisely what happens here.
That is a big oversight then. Please file a PR?
> >>Register elimination ultimately discovered that (reg 30) was the same as
> >>the stack pointer and did the natural substitution. The natural
> >>substitution results in invalid RTL and there's really no way to back
> >>out and do something different AFAICT in lra-eliminations.cc.
> >>
> >>The only reason we fault is because the H8 backend knows this is invalid
> >>RTL and refuses to split it. If we were to try and re-recognize the
> >>insn in question it would fail to recognize after the substitution
> >>because the auto-inc'd operand overlaps with the other operand.
> >But it *did* recog? Or does LRA somehow not always recog() everything?
> >I always thought that was one of the huge improvements over old reload
> >(it does everything very locally instead of very globally)!
>
> No, LRA does not force re-recognition in some cases, particularly for
> register eliminations.
It is the only way it can know if it needs to reload more. Even if it
somehow can assume it doesn't have to check this in some cases, an
assert (inside a CHECKING_P) would be nice?
Segher
> On Oct 14, 2022, at 1:10 PM, Jeff Law <jeffreyalaw@gmail.com> wrote:
>
> On 10/14/22 10:37, Koning, Paul wrote:
>>
>>> ...
>>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
>
> It's not really a constraint issue -- the pattern's condition would cause this not to recognize, but LRA doesn't re-recognize the insn. We might be able to hack something in the constraints to force a reload of the source operand in this case. Ugly, but a possibility.
I find it hard to cope with constraints that don't constrain. Minimally it should be clearly documented exactly what cases fail to obey the constraints and what a target writer can do to deal with those failures.
As it stands, I find myself working hard to write MD code that accurately describes the rules of the machine, and for the core machinery to disregard those instructions is painful.
Is there a compelling argument for every case where LRA fails to obey the constraints? If not, can they just be called bugs and added to the to-be-fixed queue?
paul
On 10/14/22 11:35, Segher Boessenkool wrote:
> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>>> LRA only ever generates insns that pass recog. The backend allows this
>>> define_insn, requiring it to be split (it returns template "#"), but
>>> then somehow it doesn't match in any split pass?
>> Nope. The elimination code will just change one register without
>> re-recognizing. That's precisely what happens here.
> That is a big oversight then. Please file a PR?
Sure. But just recognizing (for this particular case) will just move
the fault from a failure to split to a failure to recognize. From my
wanderings in the elimination code, I don't see that it has a path that
would allow it to reasonably handle this case -- ie, if the insn does
not recognize, what then? Conceptually we need to generate an
input-reload but I don't see a way to do that in the elimination code.
Maybe Vlad knows how it ought to be handled.
> It is the only way it can know if it needs to reload more. Even if it
> somehow can assume it doesn't have to check this in some cases, an
> assert (inside a CHECKING_P) would be nice?
Agreed.
jeff
> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/14/22 11:35, Segher Boessenkool wrote:
>> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>>>> LRA only ever generates insns that pass recog. The backend allows this
>>>> define_insn, requiring it to be split (it returns template "#"), but
>>>> then somehow it doesn't match in any split pass?
>>> Nope. The elimination code will just change one register without
>>> re-recognizing. That's precisely what happens here.
>> That is a big oversight then. Please file a PR?
>
> Sure. But just recognizing (for this particular case) will just move the fault from a failure to split to a failure to recognize. From my wanderings in the elimination code, I don't see that it has a path that would allow it to reasonably handle this case -- ie, if the insn does not recognize, what then? Conceptually we need to generate an input-reload but I don't see a way to do that in the elimination code. Maybe Vlad knows how it ought to be handled.
I probably have too simplistic a view of this, but the way I think of it is that LRA (and reload) make decisions subject to constraints, and among those constraints are the ones specified in the MD file patterns. That to me means that a substitution proposed to be made by the LRA code is subject to those invariants: it can't do that if the constraints say "no" and must then consider some other alternative.
paul
On Fri, Oct 14, 2022 at 07:58:39PM +0000, Koning, Paul wrote:
> > On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > On 10/14/22 11:35, Segher Boessenkool wrote:
> >> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
> >>>> LRA only ever generates insns that pass recog. The backend allows this
> >>>> define_insn, requiring it to be split (it returns template "#"), but
> >>>> then somehow it doesn't match in any split pass?
> >>> Nope. The elimination code will just change one register without
> >>> re-recognizing. That's precisely what happens here.
> >> That is a big oversight then. Please file a PR?
> >
> > Sure. But just recognizing (for this particular case) will just move the fault from a failure to split to a failure to recognize. From my wanderings in the elimination code, I don't see that it has a path that would allow it to reasonably handle this case -- ie, if the insn does not recognize, what then? Conceptually we need to generate an input-reload but I don't see a way to do that in the elimination code. Maybe Vlad knows how it ought to be handled.
>
> I probably have too simplistic a view of this, but the way I think of it is that LRA (and reload) make decisions subject to constraints, and among those constraints are the ones specified in the MD file patterns. That to me means that a substitution proposed to be made by the LRA code is subject to those invariants: it can't do that if the constraints say "no" and must then consider some other alternative.
I think that is exactly right for LRA.
Old reload conceptually changed the whole function all at once, starting
with valid RTL, and ending with strictly valid RTL. LRA works locally,
one instruction at a time essentially, and makes the changes
immediately. If when it has finished work on the function offsets have
changed, it walks over the whole function again, repeat until done.
"Strictly valid" means that the constraints are considered, and the insn
is only valid if some enabled alternative satisfies all constraints.
I hope I got that all right, I'm not an expert! :-)
Segher
> On Oct 14, 2022, at 4:12 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
> On Fri, Oct 14, 2022 at 07:58:39PM +0000, Koning, Paul wrote:
>>> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> On 10/14/22 11:35, Segher Boessenkool wrote:
>>>> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>>>>>> LRA only ever generates insns that pass recog. The backend allows this
>>>>>> define_insn, requiring it to be split (it returns template "#"), but
>>>>>> then somehow it doesn't match in any split pass?
>>>>> Nope. The elimination code will just change one register without
>>>>> re-recognizing. That's precisely what happens here.
>>>> That is a big oversight then. Please file a PR?
>>>
>>> Sure. But just recognizing (for this particular case) will just move the fault from a failure to split to a failure to recognize. From my wanderings in the elimination code, I don't see that it has a path that would allow it to reasonably handle this case -- ie, if the insn does not recognize, what then? Conceptually we need to generate an input-reload but I don't see a way to do that in the elimination code. Maybe Vlad knows how it ought to be handled.
>>
>> I probably have too simplistic a view of this, but the way I think of it is that LRA (and reload) make decisions subject to constraints, and among those constraints are the ones specified in the MD file patterns. That to me means that a substitution proposed to be made by the LRA code is subject to those invariants: it can't do that if the constraints say "no" and must then consider some other alternative.
>
> I think that is exactly right for LRA.
>
> Old reload conceptually changed the whole function all at once, starting
> with valid RTL, and ending with strictly valid RTL. LRA works locally,
> one instruction at a time essentially, and makes the changes
> immediately. If when it has finished work on the function offsets have
> changed, it walks over the whole function again, repeat until done.
>
> "Strictly valid" means that the constraints are considered, and the insn
> is only valid if some enabled alternative satisfies all constraints.
>
> I hope I got that all right, I'm not an expert! :-)
Thanks Segher.
As I said earlier, if for some reason this straightforward understanding is not completely accurate, that can be handled provided it is documented when and why the exceptions arise, and what methods the target author should use to deal with these things when they happen.
As a target maintainer not deeply skilled in the GCC common internals, I tend to trip over these things. With the old reload, and secondary reload in particular, it always felt to me like the answer was "keep tweaking the target definition files until the test cases stop breaking". That isn't how it should be.
Perhaps some of these issues come from out of the ordinary target restrictions. The autoinc/autodec case we're discussing may be an example of that. The one I remember in particular was the pdp11 float instructions, where I have 6 registers but only 4 of these can be loaded from or stored to memory. Putting the other two to work while having spill to memory work right took quite a lot of iteration.
It may be LRA is better in these areas. I haven't spent much time with that, other than to create a way to enable its use and observing that (a) I got about the same test suite numbers either way and (b) the LRA code was not as good in some of the cases.
paul
On 10/14/22 11:36, Koning, Paul wrote:
>
>> On Oct 14, 2022, at 1:10 PM, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>> On 10/14/22 10:37, Koning, Paul wrote:
>>>> ...
>>>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>>> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
>> It's not really a constraint issue -- the pattern's condition would cause this not to recognize, but LRA doesn't re-recognize the insn. We might be able to hack something in the constraints to force a reload of the source operand in this case. Ugly, but a possibility.
> I find it hard to cope with constraints that don't constrain. Minimally it should be clearly documented exactly what cases fail to obey the constraints and what a target writer can do to deal with those failures.
Constraints have a purpose, but as I've noted, they really don't come
into play here. Had LRA tried to see if what it created as a valid
move insn, the backend would have said "nope, that's not valid". That's
a stronger test than checking the constraints. If the insn is not valid
according to its condition, then the constraints simply don't matter.
I'm not aware of a case where constraints are failing to be obeyed and
constraints simply aren't a viable solution here other than to paper
over the problem and hope it doesn't show up elsewhere.
Right now operand 0's constraint is "<" meaning pre-inc operand, operand
1 is "r". How would you define a new constraint for operand 1 that
disallows overlap with operand 0 given that the H8 allows autoinc on any
register operand? You can't look at operand 0 while processing the
constraint for operand 1. Similarly if you try to define a new
constraint for operand0 without looking at operand1.
Hence the h8300_move_ok test in the insn's condition where we can look
at both operands to assess if it's a legitimate insn.
>
> As it stands, I find myself working hard to write MD code that accurately describes the rules of the machine, and for the core machinery to disregard those instructions is painful.
No doubt.
>
> Is there a compelling argument for every case where LRA fails to obey the constraints? If not, can they just be called bugs and added to the to-be-fixed queue?
There was in the reload days, though I honestly don't remember what it
was, I'm much less familiar with LRA in this regard, but I trust Vlad's
engineering skills and strongly believe that failing to recognize was
done for a good reason.
It's also worth repeating, we can get the same fundamental failure on
the H8 with reload. The testcase is different, but the core issue is
the same. We have a move with an autoinc destination and the same
register is also used as a source operand incorerctly created by reload.
What's a bit interesting here is the m68k doesn't do any kind of
checking for these scenarios. It just accepts them and generates the
obvious code. I'm more tempted by the minute to do the same on the H8 :-)
Jeff
> On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/14/22 11:36, Koning, Paul wrote:
>>
>>> On Oct 14, 2022, at 1:10 PM, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>>
>>> On 10/14/22 10:37, Koning, Paul wrote:
>>>>> ...
>>>>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>>>> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
>>> It's not really a constraint issue -- the pattern's condition would cause this not to recognize, but LRA doesn't re-recognize the insn. We might be able to hack something in the constraints to force a reload of the source operand in this case. Ugly, but a possibility.
>> I find it hard to cope with constraints that don't constrain. Minimally it should be clearly documented exactly what cases fail to obey the constraints and what a target writer can do to deal with those failures.
>
> Constraints have a purpose, but as I've noted, they really don't come into play here. Had LRA tried to see if what it created as a valid move insn, the backend would have said "nope, that's not valid". That's a stronger test than checking the constraints. If the insn is not valid according to its condition, then the constraints simply don't matter.
>
> I'm not aware of a case where constraints are failing to be obeyed and constraints simply aren't a viable solution here other than to paper over the problem and hope it doesn't show up elsewhere.
>
> Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is "r". How would you define a new constraint for operand 1 that disallows overlap with operand 0 given that the H8 allows autoinc on any register operand? You can't look at operand 0 while processing the constraint for operand 1. Similarly if you try to define a new constraint for operand0 without looking at operand1.
Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on any register other than N". In pdp11, I called these Z0, Z1... and Za, Zb... respectively. Then the insn gets constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands. As I said, see pdp11.md, the mov insn.
paul
On 10/14/22 15:21, Koning, Paul wrote:
>
>> On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 10/14/22 11:36, Koning, Paul wrote:
>>>> On Oct 14, 2022, at 1:10 PM, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>>>
>>>> On 10/14/22 10:37, Koning, Paul wrote:
>>>>>> ...
>>>>>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>>>>> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
>>>> It's not really a constraint issue -- the pattern's condition would cause this not to recognize, but LRA doesn't re-recognize the insn. We might be able to hack something in the constraints to force a reload of the source operand in this case. Ugly, but a possibility.
>>> I find it hard to cope with constraints that don't constrain. Minimally it should be clearly documented exactly what cases fail to obey the constraints and what a target writer can do to deal with those failures.
>> Constraints have a purpose, but as I've noted, they really don't come into play here. Had LRA tried to see if what it created as a valid move insn, the backend would have said "nope, that's not valid". That's a stronger test than checking the constraints. If the insn is not valid according to its condition, then the constraints simply don't matter.
>>
>> I'm not aware of a case where constraints are failing to be obeyed and constraints simply aren't a viable solution here other than to paper over the problem and hope it doesn't show up elsewhere.
>>
>> Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is "r". How would you define a new constraint for operand 1 that disallows overlap with operand 0 given that the H8 allows autoinc on any register operand? You can't look at operand 0 while processing the constraint for operand 1. Similarly if you try to define a new constraint for operand0 without looking at operand1.
> Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on any register other than N". In pdp11, I called these Z0, Z1... and Za, Zb... respectively. Then the insn gets constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands. As I said, see pdp11.md, the mov insn.
Yea, you're right. It's definitely possible. Painful, but possible.
jeff
On 10/14/22 15:21, Koning, Paul wrote:
>
>> On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 10/14/22 11:36, Koning, Paul wrote:
>>>> On Oct 14, 2022, at 1:10 PM, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>>>
>>>> On 10/14/22 10:37, Koning, Paul wrote:
>>>>>> ...
>>>>>> But that approach falls down with reload/lra doing substitutions without validating the result. I guess it might be possible to cobble together something with secondary reloads, but it's way way way down on my todo list.
>>>>> Aren't the constraints enforced? My experience is that I was getting these bad addressing modes in some test programs, and that the constraints I created to make the requirement explicit cured that. Maybe I'm expecting too much from constraints, but my (admittedly inexperienced) understanding of them is that they inform reload what sort of things it can construct, and what it cannot.
>>>> It's not really a constraint issue -- the pattern's condition would cause this not to recognize, but LRA doesn't re-recognize the insn. We might be able to hack something in the constraints to force a reload of the source operand in this case. Ugly, but a possibility.
>>> I find it hard to cope with constraints that don't constrain. Minimally it should be clearly documented exactly what cases fail to obey the constraints and what a target writer can do to deal with those failures.
>> Constraints have a purpose, but as I've noted, they really don't come into play here. Had LRA tried to see if what it created as a valid move insn, the backend would have said "nope, that's not valid". That's a stronger test than checking the constraints. If the insn is not valid according to its condition, then the constraints simply don't matter.
>>
>> I'm not aware of a case where constraints are failing to be obeyed and constraints simply aren't a viable solution here other than to paper over the problem and hope it doesn't show up elsewhere.
>>
>> Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is "r". How would you define a new constraint for operand 1 that disallows overlap with operand 0 given that the H8 allows autoinc on any register operand? You can't look at operand 0 while processing the constraint for operand 1. Similarly if you try to define a new constraint for operand0 without looking at operand1.
> Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on any register other than N". In pdp11, I called these Z0, Z1... and Za, Zb... respectively. Then the insn gets constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands. As I said, see pdp11.md, the mov insn.
It generally looks sound, but golly gee, this runs into the "reload
doesn't validate insns problem" if it's done in a reload tree rather
than an lra tree. We've got an insn with a pre-inc destination and a
reg source. The source pseudo doesn't get a hard reg, reload replaces
the pseudo with a mem as expected. Reload finishes with something like this:
(insn 100 98 101 15 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [8 S4 A32])
(mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 8 [0x8])) [9 %sfp+-24 S4 A32])) "j.c":62:11
19 {*movsix}
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
Which, isn't a valid instruction on the H8. The insn's condition
verifies that one of the two operands must be a REG. But reload never
bothered to re-recognize the insn after makng the substitution, then
naturally it blows up in reload_cse with a constraint failure because
the pre-inc destination constraints require a reg for the source
operand. But the real culprit here is reload making the substitution
and not validing that the result is valid.
Arggh!
Which brings me back to pondering just removing the autoinc magic
checking in the H8 port :-) I've actually got that spinning in the
tester just to see if there's any obvious fallout. I've already spent
more time on this than I can reasonably justify.
Jeff
On 2022/10/15 1:25, Segher Boessenkool wrote:
> On Fri, Oct 14, 2022 at 03:20:40PM +0900, Takayuki 'January June' Suwa wrote:
>> On 2022/10/14 8:56, Segher Boessenkool wrote:
>>> And finally, xtensa does
>>> /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy its constraints:
>>> 840 | }
>>> | ^
>>> (insn 8 7 9 2 (set (reg:SI 9 a9 [57])
>>> (const_int 1431655765 [0x55555555])) "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
>>> (expr_list:REG_EQUIV (const_int 1431655765 [0x55555555])
>>> (nil)))
>>> during RTL pass: postreload
>>> /home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692
>>
>> This is a result of knowing that Reload is tolerant of out-of-constraint constants.
>> LRA support needs to be taken care of before that, ie. in the "split1" pass.
>> Excuse me in haste.
>
> So old reload did a split itself? Or it left it to say the split2 pass?
Reload did.
[testpiece]
unsigned int test(void) {
return 0x55555555;
}
[282r.ira]
(insn 9 5 10 2 (set (reg/i:SI 2 a2)
(const_int 1431655765 [0x55555555])) "test.c":3:1 27 {movsi_internal}
(expr_list:REG_EQUAL (const_int 1431655765 [0x55555555])
(nil)))
[283r.reload]
;; Function test (test, funcdef_no=0, decl_uid=1386, cgraph_uid=1, symbol_order=0)
insn=9, live_throughout: 1, dead_or_set: 2
insn=10, live_throughout: 1, 2, dead_or_set:
Spilling for insn 9.
Reloads for insn # 9
Reload 0: reload_in (SI) = (const_int 1431655765 [0x55555555])
RL_REGS, RELOAD_FOR_INPUT (opnum = 1)
reload_in_reg: (const_int 1431655765 [0x55555555])
reload_reg_rtx: (reg/i:SI 2 a2)
deleting insn with uid = 9.
(insn 14 5 10 2 (set (reg/i:SI 2 a2)
(mem/u/c:SI (symbol_ref/u:SI ("*.LC0") [flags 0x2]) [0 S4 A32])) "test.c":3:1 27 {movsi_internal}
(nil))
@@ -1443,7 +1443,7 @@ merge_in_block (int max_reg, basic_block bb)
/* Reload should handle auto-inc within a jump correctly, while LRA
is known to have issues with autoinc. */
- if (JUMP_P (insn) && targetm.lra_p ())
+ if (JUMP_P (insn))
continue;
if (dump_file)
@@ -2026,7 +2026,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn *pred ATTRIBUTE_UNUSED,
if (AUTO_INC_DEC)
for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
if (REG_NOTE_KIND (link) == REG_INC
- && ((JUMP_P (i3) && targetm.lra_p ())
+ && (JUMP_P (i3)
|| reg_used_between_p (XEXP (link, 0), insn, i3)
|| (pred != NULL_RTX
&& reg_overlap_mentioned_p (XEXP (link, 0), PATTERN (pred)))
@@ -1133,11 +1133,6 @@ find_call_crossed_cheap_reg (rtx_insn *insn)
rtx
non_conflicting_reg_copy_p (rtx_insn *insn)
{
- /* Reload has issues with overlapping pseudos being assigned to the
- same hard register, so don't allow it. See PR87600 for details. */
- if (!targetm.lra_p ())
- return NULL_RTX;
-
rtx set = single_set (insn);
/* Disallow anything other than a simple register to register copy
@@ -1664,7 +1664,7 @@ ira_init_once (void)
ira_init_costs_once ();
lra_init_once ();
- ira_use_lra_p = targetm.lra_p ();
+ ira_use_lra_p = 1;
}
/* Free ira_max_register_move_cost, ira_may_move_in_cost and
@@ -3242,7 +3242,7 @@ constrain_operands (int strict, alternative_mask alternatives)
|| (strict < 0 && CONSTANT_P (op))
/* Before reload, accept a pseudo or hard register,
since LRA can turn it into a mem. */
- || (strict < 0 && targetm.lra_p () && REG_P (op))
+ || (strict < 0 && REG_P (op))
/* During reload, accept a pseudo */
|| (reload_in_progress && REG_P (op)
&& REGNO (op) >= FIRST_PSEUDO_REGISTER)))
@@ -1380,10 +1380,6 @@ default_secondary_reload (bool in_p ATTRIBUTE_UNUSED, rtx x ATTRIBUTE_UNUSED,
machine_mode
default_secondary_memory_needed_mode (machine_mode mode)
{
- if (!targetm.lra_p ()
- && known_lt (GET_MODE_BITSIZE (mode), BITS_PER_WORD)
- && INTEGRAL_MODE_P (mode))
- return mode_for_size (BITS_PER_WORD, GET_MODE_CLASS (mode), 0).require ();
return mode;
}