Add new target hook: simplify_modecc_const.

Message ID 00fa01d8a0e9$0efdfc60$2cf9f520$@nextmovesoftware.com
State New, archived
Headers
Series Add new target hook: simplify_modecc_const. |

Commit Message

Roger Sayle July 26, 2022, 12:13 p.m. UTC
  This patch is a major revision of the patch I originally proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598040.html

The primary motivation of this patch is to avoid incorrect optimization
of MODE_CC comparisons in simplify_const_relational_operation when/if a
backend represents the (known) contents of a MODE_CC register using a
CONST_INT.  In such cases, the RTL optimizers don't know the semantics
of this integer value, so shouldn't change anything (i.e. should return
NULL_RTX from simplify_const_relational_operation).

The secondary motivation is that by introducing a new target hook, called
simplify_modecc_const, the backend can (optionally) encode and interpret
a target dependent encoding of MODE_CC registers.

The worked example provided with this patch is to allow the i386 backend
to explicitly model the carry flag (MODE_CCC) using 1 to indicate that
the carry flag is set, and 0 to indicate the carry flag is clear.  This
allows the instructions stc (set carry flag), clc (clear carry flag) and
cmc (complement carry flag) to be represented in RTL.

However an even better example would be the rs6000 backend, where this
patch/target hook would allow improved modelling of the condition register
CR.  The powerpc's comparison instructions set fields/bits in the CR
register [where bit 0 indicates less than, bit 1 greater than, bit 2
equal to and bit3 overflow] analogous to x86's flags register [containing
bits for carry, zero, overflow, parity etc.].  These fields can be
manipulated directly using crset (aka creqv) and crclr (aka crxor)
instructions and even transferred from general purpose registers using
mtcr.  However, without a patch like this, it's impossible to safely
model/represent these instructions in rs6000.md.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
and both with and without a patch to add stc, clc and cmc support to
the x86 backend.  I'll resubmit the x86 target pieces again with that
follow-up backend patch, so for now I'm only looking for approval
of the middle-end infrastructure pieces.  The x86 hunks below are
provided as context/documentation for how this hook could/should be
used (but I wouldn't object to pre-approval of those bits by Uros).
Ok for mainline?


2022-07-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * target.def (simplify_modecc_const): New target hook.
        * doc/tm.texi (TARGET_SIMPLIFY_MODECC_CONST): Document here.
        * doc/tm.texi.in (TARGET_SIMPLIFY_MODECC_CONST): Locate @hook here.
        * hooks.cc (hook_rtx_mode_int_rtx_null): Define default hook here.
        * hooks.h (hook_rtx_mode_int_rtx_null): Prototype here.
        * simplify-rtx.c (simplify_const_relational_operation): Avoid
        mis-optimizing MODE_CC comparisons by calling new target hook.

        * config/i386.cc (ix86_simplify_modecc_const): Implement new target
        hook, supporting interpreting MODE_CCC values as the x86 carry flag.
        (TARGET_SIMPLIFY_MODECC_CONST): Define as
ix86_simplify_modecc_const.


Thanks in advance,
Roger
--

> -----Original Message-----
> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: 07 July 2022 23:39
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Be careful with MODE_CC in
> simplify_const_relational_operation.
> 
> Hi!
> 
> On Thu, Jul 07, 2022 at 10:08:04PM +0100, Roger Sayle wrote:
> > I think it's fair to describe RTL's representation of condition flags
> > using MODE_CC as a little counter-intuitive.
> 
> "A little challenging", and you should see that as a good thing, as a
puzzle to
> crack :-)
> 
> > For example, the i386
> > backend represents the carry flag (in adc instructions) using RTL of
> > the form "(ltu:SI (reg:CCC) (const_int 0))", where great care needs to
> > be taken not to treat this like a normal RTX expression, after all LTU
> > (less-than-unsigned) against const0_rtx would normally always be
> > false.
> 
> A comparison of a MODE_CC thing against 0 means the result of a
> *previous* comparison (or other cc setter) is looked at.  Usually it
simply looks
> at some condition bits in a flags register.  It does not do any actual
comparison:
> that has been done before (if at all even).
> 
> > Hence, MODE_CC comparisons need to be treated with caution, and
> > simplify_const_relational_operation returns early (to avoid
> > problems) when GET_MODE_CLASS (GET_MODE (op0)) == MODE_CC.
> 
> Not just to avoid problems: there simply isn't enough information to do a
> correct job.
> 
> > However, consider the (currently) hypothetical situation, where the
> > RTL optimizers determine that a previous instruction unconditionally
> > sets or clears the carry flag, and this gets propagated by combine
> > into the above expression, we'd end up with something that looks like
> > (ltu:SI (const_int 1) (const_int 0)), which doesn't mean what it says.
> > Fortunately, simplify_const_relational_operation is passed the
> > original mode of the comparison (cmp_mode, the original mode of op0)
> > which can be checked for MODE_CC, even when op0 is now VOIDmode
> > (const_int) after the substitution.  Defending against this is clearly
> > the right thing to do.
> >
> > More controversially, rather than just abort
> > simplification/optimization in this case, we can use the comparison
> > operator to infer/select the semantics of the CC_MODE flag.
> > Hopefully, whenever a backend uses LTU, it represents the (set) carry
> > flag (and behaves like i386.md), in which case the result of the
simplified
> expression is the first operand.
> > [If there's no standardization of semantics across backends, then we
> > should always just return 0; but then miss potential optimizations].
> 
> On PowerPC, ltu means the result of an unsigned comparison (we have
> instructions for that, cmpl[wd][i] mainly) was "smaller than".  It does
not mean
> anything is unsigned smaller than zero.  It also has nothing to do with
carries,
> which are done via a different register (the XER).
> 
> > +  /* Handle MODE_CC comparisons that have been simplified to
> > +     constants.  */
> > +  if (GET_MODE_CLASS (mode) == MODE_CC
> > +      && op1 == const0_rtx
> > +      && CONST_INT_P (op0))
> > +    {
> > +      /* LTU represents the carry flag.  */
> > +      if (code == LTU)
> > +	return op0 == const0_rtx ? const0_rtx : const_true_rtx;
> > +      return 0;
> > +    }
> > +
> >    /* We can't simplify MODE_CC values since we don't know what the
> >       actual comparison is.  */
> 
> ^^^
> This comment is 100% true.  We cannot simplify any MODE_CC comparison
> without having more context.  The combiner does have that context when it
> tries to combine the CC setter with the CC consumer, for example.
> 
> Do you have some piece of motivating example code?
> 
> 
> Segher
  

Comments

Segher Boessenkool July 26, 2022, 5:44 p.m. UTC | #1
Hi!

On Tue, Jul 26, 2022 at 01:13:02PM +0100, Roger Sayle wrote:
> This patch is a major revision of the patch I originally proposed here:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598040.html
> 
> The primary motivation of this patch is to avoid incorrect optimization
> of MODE_CC comparisons in simplify_const_relational_operation when/if a
> backend represents the (known) contents of a MODE_CC register using a
> CONST_INT.  In such cases, the RTL optimizers don't know the semantics
> of this integer value, so shouldn't change anything (i.e. should return
> NULL_RTX from simplify_const_relational_operation).

This is invalid RTL.  What would  (set (reg:CC) (const_int 0))  mean,
for example?  If this was valid it would make most existing code using
CC modes do essentially random things :-(

The documentation (in tm.texi, "Condition Code") says
  Alternatively, you can use @code{BImode} if the comparison operator is
  specified already in the compare instruction.  In this case, you are not
  interested in most macros in this section.

> The worked example provided with this patch is to allow the i386 backend
> to explicitly model the carry flag (MODE_CCC) using 1 to indicate that
> the carry flag is set, and 0 to indicate the carry flag is clear.  This
> allows the instructions stc (set carry flag), clc (clear carry flag) and
> cmc (complement carry flag) to be represented in RTL.

Hrm, I wonder how other targets do this.

On Power we have a separate hard register for the carry flag of course
(it is a separate bit in the hardware as well, XER[CA]).

On Arm there is arm_carry_operation (as well as arm_borrow_operation).

Aarch64 directly uses
(define_expand "add<mode>3_carryin"
  [(set (match_operand:GPI 0 "register_operand")
        (plus:GPI
          (plus:GPI
            (ltu:GPI (reg:CC_C CC_REGNUM) (const_int 0))
            (match_operand:GPI 1 "aarch64_reg_or_zero"))
          (match_operand:GPI 2 "aarch64_reg_or_zero")))]
   ""
   ""
)
(CC_Cmode means only the C bit is validly set).

s390 does similar.  sparc does similar.

> However an even better example would be the rs6000 backend, where this
> patch/target hook would allow improved modelling of the condition register
> CR.  The powerpc's comparison instructions set fields/bits in the CR
> register [where bit 0 indicates less than, bit 1 greater than, bit 2
> equal to and bit3 overflow]

There are eight condition register fields which can be used
interchangeably (some insns only write to CR0, CR1, or CR6).  The
meaning of the four bits in a field depends on the instruction that set
them.  For integer comparisons bit 3 does not mean anything to do with a
comparison: instead, it is a copy of the XER[SO] bit ("summary
overflow").  The rs6000 backend does not currently model this (we do not
model the overflow instructions at all!)

> analogous to x86's flags register [containing
> bits for carry, zero, overflow, parity etc.].  These fields can be
> manipulated directly using crset (aka creqv) and crclr (aka crxor)
> instructions

crand, crnand, cror, crxor, crnor, creqv, crandc, crorc insns, or the
extended mnemonics crmove, crclr, crnot, crset, yes.  All these for
setting single bits; there also is mcrf to copy all four bits of a CR
field to another.

> and even transferred from general purpose registers using
> mtcr.  However, without a patch like this, it's impossible to safely
> model/represent these instructions in rs6000.md.

And yet we do.  See for example @cceq_rev_compare_<mode> which
implements crnot.

> +  /* Handle MODE_CC comparisons that have been simplified to
> +     constants.  */
> +  if (GET_MODE_CLASS (mode) == MODE_CC
> +      && op1 == const0_rtx
> +      && CONST_INT_P (op0))
> +    return targetm.simplify_modecc_const (mode, (int)code, op0);

Comparing two integer constants is invalid RTL *in all contexts*.  The
items compared do not have a mode!  From rtl.texi:
  A @code{compare} specifying two @code{VOIDmode} constants is not valid
  since there is no way to know in what mode the comparison is to be
  performed; the comparison must either be folded during the compilation
  or the first operand must be loaded into a register while its mode is
  still known.


Segher
  
Roger Sayle July 26, 2022, 9:04 p.m. UTC | #2
Hi Segher,
It's very important to distinguish the invariants that exist for the RTL
data structures as held in memory (rtx), vs. the use of "enum rtx_code"s,
"machine_mode"s and operands in the various processing functions
of the middle-end.

Yes, it's very true that RTL integer constants don't specify a mode
(are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
don't make sense with all constant operands.  This is (one reason)
why constant-only operands are disallowed from RTL (data structures),
and why in APIs that perform/simplify these operations, the original
operand mode (of the const_int(s)) must be/is always passed as a
parameter.

Hence, for say simplify_const_binary_operation, op0 and op1 can
both be const_int, as the mode argument specifies the mode of the
"code" operation. Likewise, in simplify_relational_operation, both
op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
the mode that the operation is performed in and "mode" specifies
the mode of the result.

Your comment that "comparing two integer constants is invalid
RTL *in all contexts*" is a serious misunderstanding of what's
going on.  At no point is a RTL rtx node ever allocated with two
integer constant operands.  RTL simplification is for hypothetical
"what if" transformations (just like try_combine calls recog with
RTL that may not be real instructions), and these simplifcations
are even sometimes required to preserve the documented RTL
invariants.  Comparisons of two integers must be simplified to
true/false precisely to ensure that they never appear in an actual
COMPARE node.

I worry this fundamental misunderstanding is the same issue that
has been holding up understanding/approving a previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html

For a related bug, consider PR rtl-optimization/67382, that's assigned
to you in bugzilla.  In this case, the RTL optimizers know that both
operands to a COMPARE are integer constants (both -2), yet the
compiler still performs a run-time comparison and conditional jump:

        movl    $-2, %eax
        movl    %eax, 12(%rsp)
        cmpl    $-2, %eax
        je      .L1

Failing to optimize/consider a comparison between two integer
constants *in any context* just leads to poor code.

Hopefully, this clears up that the documented constraints on RTL rtx
aren't exactly the same as the constraints on the use of rtx_codes in
simplify-rtx's functional APIs.  So simplify_subreg really gets called
on operands that are neither REG nor MEM, as this is unrelated to
what the documentation of the SUBREG rtx specifies.

If you don't believe that op0 and op1 can ever both be const_int
in this function, perhaps consider it harmless dead code and humor
me.

Thanks in advance,
Roger
--

> -----Original Message-----
> From: Segher Boessenkool <segher@kernel.crashing.org>
> Sent: 26 July 2022 18:45
> To: Roger Sayle <roger@nextmovesoftware.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Add new target hook: simplify_modecc_const.
> 
> Hi!
> 
> On Tue, Jul 26, 2022 at 01:13:02PM +0100, Roger Sayle wrote:
> > This patch is a major revision of the patch I originally proposed here:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598040.html
> >
> > The primary motivation of this patch is to avoid incorrect
> > optimization of MODE_CC comparisons in
> > simplify_const_relational_operation when/if a backend represents the
> > (known) contents of a MODE_CC register using a CONST_INT.  In such
> > cases, the RTL optimizers don't know the semantics of this integer
> > value, so shouldn't change anything (i.e. should return NULL_RTX from
> simplify_const_relational_operation).
> 
> This is invalid RTL.  What would  (set (reg:CC) (const_int 0))  mean, for
example?
> If this was valid it would make most existing code using CC modes do
essentially
> random things :-(
> 
> The documentation (in tm.texi, "Condition Code") says
>   Alternatively, you can use @code{BImode} if the comparison operator is
>   specified already in the compare instruction.  In this case, you are not
>   interested in most macros in this section.
> 
> > The worked example provided with this patch is to allow the i386
> > backend to explicitly model the carry flag (MODE_CCC) using 1 to
> > indicate that the carry flag is set, and 0 to indicate the carry flag
> > is clear.  This allows the instructions stc (set carry flag), clc
> > (clear carry flag) and cmc (complement carry flag) to be represented in
RTL.
> 
> Hrm, I wonder how other targets do this.
> 
> On Power we have a separate hard register for the carry flag of course (it
is a
> separate bit in the hardware as well, XER[CA]).
> 
> On Arm there is arm_carry_operation (as well as arm_borrow_operation).
> 
> Aarch64 directly uses
> (define_expand "add<mode>3_carryin"
>   [(set (match_operand:GPI 0 "register_operand")
>         (plus:GPI
>           (plus:GPI
>             (ltu:GPI (reg:CC_C CC_REGNUM) (const_int 0))
>             (match_operand:GPI 1 "aarch64_reg_or_zero"))
>           (match_operand:GPI 2 "aarch64_reg_or_zero")))]
>    ""
>    ""
> )
> (CC_Cmode means only the C bit is validly set).
> 
> s390 does similar.  sparc does similar.
> 
> > However an even better example would be the rs6000 backend, where this
> > patch/target hook would allow improved modelling of the condition
> > register CR.  The powerpc's comparison instructions set fields/bits in
> > the CR register [where bit 0 indicates less than, bit 1 greater than,
> > bit 2 equal to and bit3 overflow]
> 
> There are eight condition register fields which can be used
interchangeably
> (some insns only write to CR0, CR1, or CR6).  The meaning of the four bits
in a
> field depends on the instruction that set them.  For integer comparisons
bit 3
> does not mean anything to do with a
> comparison: instead, it is a copy of the XER[SO] bit ("summary overflow").
The
> rs6000 backend does not currently model this (we do not model the overflow
> instructions at all!)
> 
> > analogous to x86's flags register [containing bits for carry, zero,
> > overflow, parity etc.].  These fields can be manipulated directly
> > using crset (aka creqv) and crclr (aka crxor) instructions
> 
> crand, crnand, cror, crxor, crnor, creqv, crandc, crorc insns, or the
extended
> mnemonics crmove, crclr, crnot, crset, yes.  All these for setting single
bits;
> there also is mcrf to copy all four bits of a CR field to another.
> 
> > and even transferred from general purpose registers using mtcr.
> > However, without a patch like this, it's impossible to safely
> > model/represent these instructions in rs6000.md.
> 
> And yet we do.  See for example @cceq_rev_compare_<mode> which
> implements crnot.
> 
> > +  /* Handle MODE_CC comparisons that have been simplified to
> > +     constants.  */
> > +  if (GET_MODE_CLASS (mode) == MODE_CC
> > +      && op1 == const0_rtx
> > +      && CONST_INT_P (op0))
> > +    return targetm.simplify_modecc_const (mode, (int)code, op0);
> 
> Comparing two integer constants is invalid RTL *in all contexts*.  The
items
> compared do not have a mode!  From rtl.texi:
>   A @code{compare} specifying two @code{VOIDmode} constants is not valid
>   since there is no way to know in what mode the comparison is to be
>   performed; the comparison must either be folded during the compilation
>   or the first operand must be loaded into a register while its mode is
>   still known.
> 
> 
> Segher
  
Segher Boessenkool July 26, 2022, 10:11 p.m. UTC | #3
Hi!

On Tue, Jul 26, 2022 at 10:04:45PM +0100, Roger Sayle wrote:
> It's very important to distinguish the invariants that exist for the RTL
> data structures as held in memory (rtx),

"In memory"?  What does that mean here?  RTX are just RTL expressions,
nothing more, nothing less.

> vs. the use of "enum rtx_code"s,
> "machine_mode"s and operands in the various processing functions
> of the middle-end.

Of course.

> Yes, it's very true that RTL integer constants don't specify a mode
> (are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
> don't make sense with all constant operands.

ZERO_EXTEND makes sense for all non-negative operands and no negative
operands.  Anything with some integer mode (so not VOIDmode!) can be
the first argument to EQ, at least structurally.

> This is (one reason)
> why constant-only operands are disallowed from RTL (data structures),
> and why in APIs that perform/simplify these operations, the original
> operand mode (of the const_int(s)) must be/is always passed as a
> parameter.
> 
> Hence, for say simplify_const_binary_operation, op0 and op1 can
> both be const_int, as the mode argument specifies the mode of the
> "code" operation. Likewise, in simplify_relational_operation, both
> op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
> the mode that the operation is performed in and "mode" specifies
> the mode of the result.

> Your comment that "comparing two integer constants is invalid
> RTL *in all contexts*" is a serious misunderstanding of what's
> going on.

Not at all.  I showed the quote from the documentation: it is always
invalid to have two VOIDmode arguments to COMPARE.

> At no point is a RTL rtx node ever allocated with two
> integer constant operands.  RTL simplification is for hypothetical
> "what if" transformations (just like try_combine calls recog with
> RTL that may not be real instructions), and these simplifcations
> are even sometimes required to preserve the documented RTL
> invariants.  Comparisons of two integers must be simplified to
> true/false precisely to ensure that they never appear in an actual
> COMPARE node.

As the function documentation clearly states, two VOIDmode args (and
MODE that as well) is a special case for infinite precision arithmetic.

> I worry this fundamental misunderstanding is the same issue that
> has been holding up understanding/approving a previous patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html

https://patchwork.ozlabs.org/project/gcc/patch/001401d7a2a5$5bf07db0$13d17910$@nextmovesoftware.com/

Let's not discuss this in this thread though.

> For a related bug, consider PR rtl-optimization/67382, that's assigned
> to you in bugzilla.  In this case, the RTL optimizers know that both
> operands to a COMPARE are integer constants (both -2), yet the
> compiler still performs a run-time comparison and conditional jump:
> 
>         movl    $-2, %eax
>         movl    %eax, 12(%rsp)
>         cmpl    $-2, %eax
>         je      .L1
> 
> Failing to optimize/consider a comparison between two integer
> constants *in any context* just leads to poor code.

If combine would ever generate invalid RTL, the resulting insn does not
pass recog(), making the combine attempt fail.  This is not the way.

The simplifier (part of combine) has a function to actually simplify
tautologies and contradictions, the simplify_const_relational_operation
function you edited here.

> Hopefully, this clears up that the documented constraints on RTL rtx
> aren't exactly the same as the constraints on the use of rtx_codes in
> simplify-rtx's functional APIs.  So simplify_subreg really gets called
> on operands that are neither REG nor MEM, as this is unrelated to
> what the documentation of the SUBREG rtx specifies.

Thank you for telling the maintainer of combine the basics of what all
of this does!  I hadn't noticed any of that before.

> If you don't believe that op0 and op1 can ever both be const_int
> in this function, perhaps consider it harmless dead code and humor
> me.

They can be, as clearly documented (and obvious from the code), but you
can not ever have that in the RTL stream, which is needed for your patch
to do anything.

I consider it harmful, and not dead.  Sorry.


Do you have comments on the rest?


Segher
  
Roger Sayle July 27, 2022, 7:51 a.m. UTC | #4
Hi Segher,

> Thank you for telling the maintainer of combine the basics of what all of
this
> does!  I hadn't noticed any of that before.

You're welcome.  I've also been maintaining combine for some time now:
https://gcc.gnu.org/legacy-ml/gcc/2003-10/msg00455.html

> They can be, as clearly documented (and obvious from the code), but you
can
> not ever have that in the RTL stream, which is needed for your patch to do
> anything.

That's the misunderstanding; neither this nor the previous SUBREG patch,
affect/change what is in the RTL stream, no COMPARE nodes are every
changed or modified, only eliminated by the propagation/fusion in combine
(or CSE).

We have --enable-checking=rtl to guarantee that the documented invariants
always hold in the RTL stream.

Cheers,
Roger
  
Segher Boessenkool July 27, 2022, 6:37 p.m. UTC | #5
On Wed, Jul 27, 2022 at 08:51:58AM +0100, Roger Sayle wrote:
> > They can be, as clearly documented (and obvious from the code), but you
> can
> > not ever have that in the RTL stream, which is needed for your patch to do
> > anything.
> 
> That's the misunderstanding; neither this nor the previous SUBREG patch,
> affect/change what is in the RTL stream, no COMPARE nodes are every
> changed or modified, only eliminated by the propagation/fusion in combine
> (or CSE).

There are no guarantees at all for that though?

> We have --enable-checking=rtl to guarantee that the documented invariants
> always hold in the RTL stream.

That unfortunately only checks a few structural constraints, and not
even all.  For example not that STRICT_LOW_PART has a SUBREG as
argument, which is documented, and the only thing that makes sense
anyway.  This is PR106101.  Unfortunately many targets violate this.


Segher
  
Richard Sandiford July 28, 2022, 12:39 p.m. UTC | #6
Seems this thread has become a bit heated, so I'll try to proceed
with caution :-)

In the below, I'll use "X-mode const_int" to mean "a const_int that
is known from context to represent an X-mode value".  Of course,
the const_int itself always stores VOIDmode.

"Roger Sayle" <roger@nextmovesoftware.com> writes:
> Hi Segher,
> It's very important to distinguish the invariants that exist for the RTL
> data structures as held in memory (rtx), vs. the use of "enum rtx_code"s,
> "machine_mode"s and operands in the various processing functions
> of the middle-end.

FWIW, I agree this distinction is important, with the proviso (which
I think you were also adding) that the code never loses track of what
mode an rtx operand (stored in a variable) actually has/is being
interpreted to have.

In other words, the reason (zero_extend (const_int N)) is invalid is
not that constant integers can't be extended in principle (of course
they can).  It's invalid because we've lost track of how many bits
that N actually has.  That problem doesn't apply in contexts where
the operation is described using individual variables (rather than
a single rtx) *provided that* one of those variables says what mode
any potential const_ints actually represent.

> Yes, it's very true that RTL integer constants don't specify a mode
> (are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
> don't make sense with all constant operands.  This is (one reason)
> why constant-only operands are disallowed from RTL (data structures),
> and why in APIs that perform/simplify these operations, the original
> operand mode (of the const_int(s)) must be/is always passed as a
> parameter.
>
> Hence, for say simplify_const_binary_operation, op0 and op1 can
> both be const_int, as the mode argument specifies the mode of the
> "code" operation. Likewise, in simplify_relational_operation, both
> op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
> the mode that the operation is performed in and "mode" specifies
> the mode of the result.

And the mode argument to simplify_const_relational_operation specifies
the mode of the operands, not the mode of the result.  I.e. it specifies
the modes of op0 and op1 rather than the mode that would be attached to
the code in "(code:mode ...)" if an rtx were created with these parameters.

That confused me when I saw the patch initially.  Elsewhere in the
file "mode" tends to be the mode of the result, in cases where the
mode of the result can be different from the modes of the operands,
so using it for the mode of the operands seems a bit confusing
(not your fault of course).

I still struggle with the idea of having CC-mode const_ints though
(using the meaning of "CC-mode const_ints" above).  I realise
(compare (...) (const_int 0)) has been the norm "for ever", but here
it feels like we're also blessing non-zero CC-mode const_ints.
That raises the question of how many significant bits a CC-mode
const_int actually has.  Currently:

         ...  For historical reasons,
	 the size of a CC mode is four units.

But treating CC-mode const_ints as having 32 significant bits is surely
the wrong thing to do.

So if we want to add more interpretation around CC modes, I think we
should first clean up the representation to make the set of valid values
more explicit.  (Preferably without reusing const_int for constant values,
but that's probably a losing battle :-))

Thanks,
Richard
  
H.J. Lu Oct. 10, 2022, 3:50 p.m. UTC | #7
On Thu, Jul 28, 2022 at 5:40 AM Richard Sandiford via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Seems this thread has become a bit heated, so I'll try to proceed
> with caution :-)
>
> In the below, I'll use "X-mode const_int" to mean "a const_int that
> is known from context to represent an X-mode value".  Of course,
> the const_int itself always stores VOIDmode.
>
> "Roger Sayle" <roger@nextmovesoftware.com> writes:
> > Hi Segher,
> > It's very important to distinguish the invariants that exist for the RTL
> > data structures as held in memory (rtx), vs. the use of "enum rtx_code"s,
> > "machine_mode"s and operands in the various processing functions
> > of the middle-end.
>
> FWIW, I agree this distinction is important, with the proviso (which
> I think you were also adding) that the code never loses track of what
> mode an rtx operand (stored in a variable) actually has/is being
> interpreted to have.
>
> In other words, the reason (zero_extend (const_int N)) is invalid is
> not that constant integers can't be extended in principle (of course
> they can).  It's invalid because we've lost track of how many bits
> that N actually has.  That problem doesn't apply in contexts where
> the operation is described using individual variables (rather than
> a single rtx) *provided that* one of those variables says what mode
> any potential const_ints actually represent.
>
> > Yes, it's very true that RTL integer constants don't specify a mode
> > (are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
> > don't make sense with all constant operands.  This is (one reason)
> > why constant-only operands are disallowed from RTL (data structures),
> > and why in APIs that perform/simplify these operations, the original
> > operand mode (of the const_int(s)) must be/is always passed as a
> > parameter.
> >
> > Hence, for say simplify_const_binary_operation, op0 and op1 can
> > both be const_int, as the mode argument specifies the mode of the
> > "code" operation. Likewise, in simplify_relational_operation, both
> > op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
> > the mode that the operation is performed in and "mode" specifies
> > the mode of the result.
>
> And the mode argument to simplify_const_relational_operation specifies
> the mode of the operands, not the mode of the result.  I.e. it specifies
> the modes of op0 and op1 rather than the mode that would be attached to
> the code in "(code:mode ...)" if an rtx were created with these parameters.
>
> That confused me when I saw the patch initially.  Elsewhere in the
> file "mode" tends to be the mode of the result, in cases where the
> mode of the result can be different from the modes of the operands,
> so using it for the mode of the operands seems a bit confusing
> (not your fault of course).
>
> I still struggle with the idea of having CC-mode const_ints though
> (using the meaning of "CC-mode const_ints" above).  I realise
> (compare (...) (const_int 0)) has been the norm "for ever", but here
> it feels like we're also blessing non-zero CC-mode const_ints.
> That raises the question of how many significant bits a CC-mode
> const_int actually has.  Currently:
>
>          ...  For historical reasons,
>          the size of a CC mode is four units.
>
> But treating CC-mode const_ints as having 32 significant bits is surely
> the wrong thing to do.
>
> So if we want to add more interpretation around CC modes, I think we
> should first clean up the representation to make the set of valid values
> more explicit.  (Preferably without reusing const_int for constant values,
> but that's probably a losing battle :-))
>
> Thanks,
> Richard

Here is a testcase to show that combine generates

(set (reg:CCC 17 flags)
       (ltu:SI (const_int 1 [1])
         (const_int 0 [0])))

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107172

This new target hook handles it properly.
  
Jeff Law Oct. 14, 2022, 8:26 p.m. UTC | #8
On 7/26/22 11:44, Segher Boessenkool wrote:
> Hi!
>
> On Tue, Jul 26, 2022 at 01:13:02PM +0100, Roger Sayle wrote:
>> This patch is a major revision of the patch I originally proposed here:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598040.html
>>
>> The primary motivation of this patch is to avoid incorrect optimization
>> of MODE_CC comparisons in simplify_const_relational_operation when/if a
>> backend represents the (known) contents of a MODE_CC register using a
>> CONST_INT.  In such cases, the RTL optimizers don't know the semantics
>> of this integer value, so shouldn't change anything (i.e. should return
>> NULL_RTX from simplify_const_relational_operation).
> This is invalid RTL.  What would  (set (reg:CC) (const_int 0))  mean,
> for example?  If this was valid it would make most existing code using
> CC modes do essentially random things :-(

I'm not sure why you're claiming (set (reg:CC) (const_int 0)) is 
invalid.  I'm not aware of anything that would make it invalid -- but 
generic code doesn't really know how to interpret what it means.  While 
I have concerns in that space, they're pretty obscure and likely don't 
occur in practice due to the use of CC modes.

Not the interpretation of the underlying bits in the condition code is 
already defined as machine specific:

@findex CCmode

@item CCmode
``Condition Code'' mode represents the value of a condition code, which
is a machine-specific set of bits used to represent the result of a
comparison operation.  Other machine-specific modes may also be used for
the condition code.  (@pxref{Condition Code}).


Roger's patch does introduce special meaning to a relational operators 
in MODE_CC with two constants and I think that's really what you're 
concerned with and I would share a similar concern. Though perhaps not 
as severe as yours given how special MODE_CC has to be in many contexts.


I suspect we could probably all agree that rejecting a MODE_CC 
relational in simplify_const_relational_operation which would have been 
the minimal change to address the bug Roger is trying to fix.   I 
wouldn't be surprised if he started with that, then realized "hey, if we 
could ask the backend what 0 or 1 means for CC, then we could actually 
optimize this away completely and here we are...


> Comparing two integer constants is invalid RTL *in all contexts*.  The
> items compared do not have a mode!  From rtl.texi:
>    A @code{compare} specifying two @code{VOIDmode} constants is not valid
>    since there is no way to know in what mode the comparison is to be
>    performed; the comparison must either be folded during the compilation
>    or the first operand must be loaded into a register while its mode is
>    still known.

And I think the counter argument is that MODE_CC has enough special 
properties that it could be an exception to this rule. I'm not sure I'm 
ready to say, yes we should make this change, but I'm also not ready to 
summarily reject the change.


jeff
  
Jeff Law Oct. 14, 2022, 8:31 p.m. UTC | #9
On 10/10/22 09:50, H.J. Lu via Gcc-patches wrote:
> On Thu, Jul 28, 2022 at 5:40 AM Richard Sandiford via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> Seems this thread has become a bit heated, so I'll try to proceed
>> with caution :-)
>>
>> In the below, I'll use "X-mode const_int" to mean "a const_int that
>> is known from context to represent an X-mode value".  Of course,
>> the const_int itself always stores VOIDmode.
>>
>> "Roger Sayle" <roger@nextmovesoftware.com> writes:
>>> Hi Segher,
>>> It's very important to distinguish the invariants that exist for the RTL
>>> data structures as held in memory (rtx), vs. the use of "enum rtx_code"s,
>>> "machine_mode"s and operands in the various processing functions
>>> of the middle-end.
>> FWIW, I agree this distinction is important, with the proviso (which
>> I think you were also adding) that the code never loses track of what
>> mode an rtx operand (stored in a variable) actually has/is being
>> interpreted to have.
>>
>> In other words, the reason (zero_extend (const_int N)) is invalid is
>> not that constant integers can't be extended in principle (of course
>> they can).  It's invalid because we've lost track of how many bits
>> that N actually has.  That problem doesn't apply in contexts where
>> the operation is described using individual variables (rather than
>> a single rtx) *provided that* one of those variables says what mode
>> any potential const_ints actually represent.
>>
>>> Yes, it's very true that RTL integer constants don't specify a mode
>>> (are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
>>> don't make sense with all constant operands.  This is (one reason)
>>> why constant-only operands are disallowed from RTL (data structures),
>>> and why in APIs that perform/simplify these operations, the original
>>> operand mode (of the const_int(s)) must be/is always passed as a
>>> parameter.
>>>
>>> Hence, for say simplify_const_binary_operation, op0 and op1 can
>>> both be const_int, as the mode argument specifies the mode of the
>>> "code" operation. Likewise, in simplify_relational_operation, both
>>> op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
>>> the mode that the operation is performed in and "mode" specifies
>>> the mode of the result.
>> And the mode argument to simplify_const_relational_operation specifies
>> the mode of the operands, not the mode of the result.  I.e. it specifies
>> the modes of op0 and op1 rather than the mode that would be attached to
>> the code in "(code:mode ...)" if an rtx were created with these parameters.
>>
>> That confused me when I saw the patch initially.  Elsewhere in the
>> file "mode" tends to be the mode of the result, in cases where the
>> mode of the result can be different from the modes of the operands,
>> so using it for the mode of the operands seems a bit confusing
>> (not your fault of course).
>>
>> I still struggle with the idea of having CC-mode const_ints though
>> (using the meaning of "CC-mode const_ints" above).  I realise
>> (compare (...) (const_int 0)) has been the norm "for ever", but here
>> it feels like we're also blessing non-zero CC-mode const_ints.
>> That raises the question of how many significant bits a CC-mode
>> const_int actually has.  Currently:
>>
>>           ...  For historical reasons,
>>           the size of a CC mode is four units.
>>
>> But treating CC-mode const_ints as having 32 significant bits is surely
>> the wrong thing to do.
>>
>> So if we want to add more interpretation around CC modes, I think we
>> should first clean up the representation to make the set of valid values
>> more explicit.  (Preferably without reusing const_int for constant values,
>> but that's probably a losing battle :-))
>>
>> Thanks,
>> Richard
> Here is a testcase to show that combine generates
>
> (set (reg:CCC 17 flags)
>         (ltu:SI (const_int 1 [1])
>           (const_int 0 [0])))
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107172
>
> This new target hook handles it properly

ANd does it work if you reject MODE_CC comparisons with two constants in 
simplify_const_relational_operation?


I suspect it will work, but generate suboptimal code.

jeff
  
H.J. Lu Oct. 14, 2022, 9:05 p.m. UTC | #10
On Fri, Oct 14, 2022 at 1:32 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> On 10/10/22 09:50, H.J. Lu via Gcc-patches wrote:
> > On Thu, Jul 28, 2022 at 5:40 AM Richard Sandiford via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >> Seems this thread has become a bit heated, so I'll try to proceed
> >> with caution :-)
> >>
> >> In the below, I'll use "X-mode const_int" to mean "a const_int that
> >> is known from context to represent an X-mode value".  Of course,
> >> the const_int itself always stores VOIDmode.
> >>
> >> "Roger Sayle" <roger@nextmovesoftware.com> writes:
> >>> Hi Segher,
> >>> It's very important to distinguish the invariants that exist for the RTL
> >>> data structures as held in memory (rtx), vs. the use of "enum rtx_code"s,
> >>> "machine_mode"s and operands in the various processing functions
> >>> of the middle-end.
> >> FWIW, I agree this distinction is important, with the proviso (which
> >> I think you were also adding) that the code never loses track of what
> >> mode an rtx operand (stored in a variable) actually has/is being
> >> interpreted to have.
> >>
> >> In other words, the reason (zero_extend (const_int N)) is invalid is
> >> not that constant integers can't be extended in principle (of course
> >> they can).  It's invalid because we've lost track of how many bits
> >> that N actually has.  That problem doesn't apply in contexts where
> >> the operation is described using individual variables (rather than
> >> a single rtx) *provided that* one of those variables says what mode
> >> any potential const_ints actually represent.
> >>
> >>> Yes, it's very true that RTL integer constants don't specify a mode
> >>> (are VOIDmode), so therefore operations like ZERO_EXTEND or EQ
> >>> don't make sense with all constant operands.  This is (one reason)
> >>> why constant-only operands are disallowed from RTL (data structures),
> >>> and why in APIs that perform/simplify these operations, the original
> >>> operand mode (of the const_int(s)) must be/is always passed as a
> >>> parameter.
> >>>
> >>> Hence, for say simplify_const_binary_operation, op0 and op1 can
> >>> both be const_int, as the mode argument specifies the mode of the
> >>> "code" operation. Likewise, in simplify_relational_operation, both
> >>> op0 and op1 may be CONST_INT as "cmp_mode" explicitly specifies
> >>> the mode that the operation is performed in and "mode" specifies
> >>> the mode of the result.
> >> And the mode argument to simplify_const_relational_operation specifies
> >> the mode of the operands, not the mode of the result.  I.e. it specifies
> >> the modes of op0 and op1 rather than the mode that would be attached to
> >> the code in "(code:mode ...)" if an rtx were created with these parameters.
> >>
> >> That confused me when I saw the patch initially.  Elsewhere in the
> >> file "mode" tends to be the mode of the result, in cases where the
> >> mode of the result can be different from the modes of the operands,
> >> so using it for the mode of the operands seems a bit confusing
> >> (not your fault of course).
> >>
> >> I still struggle with the idea of having CC-mode const_ints though
> >> (using the meaning of "CC-mode const_ints" above).  I realise
> >> (compare (...) (const_int 0)) has been the norm "for ever", but here
> >> it feels like we're also blessing non-zero CC-mode const_ints.
> >> That raises the question of how many significant bits a CC-mode
> >> const_int actually has.  Currently:
> >>
> >>           ...  For historical reasons,
> >>           the size of a CC mode is four units.
> >>
> >> But treating CC-mode const_ints as having 32 significant bits is surely
> >> the wrong thing to do.
> >>
> >> So if we want to add more interpretation around CC modes, I think we
> >> should first clean up the representation to make the set of valid values
> >> more explicit.  (Preferably without reusing const_int for constant values,
> >> but that's probably a losing battle :-))
> >>
> >> Thanks,
> >> Richard
> > Here is a testcase to show that combine generates
> >
> > (set (reg:CCC 17 flags)
> >         (ltu:SI (const_int 1 [1])
> >           (const_int 0 [0])))
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107172
> >
> > This new target hook handles it properly
>
> ANd does it work if you reject MODE_CC comparisons with two constants in
> simplify_const_relational_operation?
>
>
> I suspect it will work, but generate suboptimal code.

It doesn't work for

(ltu:SI (const_int 1 [0x1]) (const_int 0 [0]))

simplified from

(ltu:SI (reg:CCC 17 flags) (const_int 0 [0]))

When simplify_const_relational_operation returns NULL for
MODE_CC comparison with two constants, combine will try
it again with VOIDmode comparison with two constants.
  

Patch

diff --git a/gcc/target.def b/gcc/target.def
index 2a7fa68..cd81b4e 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -7107,6 +7107,20 @@  counters are incremented using atomic operations.  Targets not supporting\n\
 type.",
  HOST_WIDE_INT, (void), default_gcov_type_size)
 
+DEFHOOK
+(simplify_modecc_const,
+ "On targets that represent the result of a comparison using a\n\
+@code{MODE_CC} flags register, allow the backend to represent a known\n\
+result using a target-dependent integer constant.  This helper function\n\
+of @code{simplify_const_relational_operation} attempts to simplify a\n\
+comparison operator @var{code} that is defined by a @code{COMPARE} in mode\n\
+@var{mode}, which is known to be of class @code{MODE_CC}, and is encoded\n\
+as @var{val}, which must be a @code{CONST_INT}.  The result must be\n\
+@code{const_true_rtx}, @code{const0_rtx} or @code{NULL_RTX} if no\n\
+simplification is possible.",
+ rtx, (machine_mode mode, int code, rtx val),
+ hook_rtx_mode_int_rtx_null)
+
 /* This value represents whether the shadow call stack is implemented on
    the target platform.  */
 DEFHOOKPOD
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b0ea398..5a1231d 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6631,6 +6631,18 @@  post-reload comparison elimination pass, or the delay slot filler pass,
 then this value should be set appropriately.
 @end deftypevr
 
+@deftypefn {Target Hook} rtx TARGET_SIMPLIFY_MODECC_CONST (machine_mode @var{mode}, int @var{code}, rtx @var{val})
+On targets that represent the result of a comparison using a
+@code{MODE_CC} flags register, allow the backend to represent a known
+result using a target-dependent integer constant.  This helper function
+of @code{simplify_const_relational_operation} attempts to simplify a
+comparison operator @var{code} that is defined by a @code{COMPARE} in mode
+@var{mode}, which is known to be of class @code{MODE_CC}, and is encoded
+as @var{val}, which must be a @code{CONST_INT}.  The result must be
+@code{const_true_rtx}, @code{const0_rtx} or @code{NULL_RTX} if no
+simplification is possible.
+@end deftypefn
+
 @node Costs
 @section Describing Relative Costs of Operations
 @cindex costs of instructions
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f869ddd..8cecde7 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4397,6 +4397,8 @@  like:
 
 @hook TARGET_FLAGS_REGNUM
 
+@hook TARGET_SIMPLIFY_MODECC_CONST
+
 @node Costs
 @section Describing Relative Costs of Operations
 @cindex costs of instructions
diff --git a/gcc/hooks.cc b/gcc/hooks.cc
index b29233f..90be303 100644
--- a/gcc/hooks.cc
+++ b/gcc/hooks.cc
@@ -401,6 +401,14 @@  hook_rtx_tree_int_null (tree, int)
   return NULL;
 }
 
+/* Generic hook that takes a machine mode, an int and an rtx and
+   returns NULL_RTX.  */
+rtx
+hook_rtx_mode_int_rtx_null (machine_mode, int, rtx)
+{
+  return NULL;
+}
+
 /* Generic hook that takes a machine mode and returns an unsigned int 0.  */
 unsigned int
 hook_uint_mode_0 (machine_mode)
diff --git a/gcc/hooks.h b/gcc/hooks.h
index 1056e1e..e978817 100644
--- a/gcc/hooks.h
+++ b/gcc/hooks.h
@@ -123,6 +123,7 @@  extern bool default_can_output_mi_thunk_no_vcall (const_tree, HOST_WIDE_INT,
 extern rtx hook_rtx_rtx_identity (rtx);
 extern rtx hook_rtx_rtx_null (rtx);
 extern rtx hook_rtx_tree_int_null (tree, int);
+extern rtx hook_rtx_mode_int_rtx_null (machine_mode, int, rtx);
 
 extern char *hook_charptr_void_null (void);
 extern const char *hook_constcharptr_void_null (void);
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index fa20665..699e08a 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -6012,6 +6012,13 @@  simplify_const_relational_operation (enum rtx_code code,
 	      || (GET_MODE (op0) == VOIDmode
 		  && GET_MODE (op1) == VOIDmode));
 
+  /* Handle MODE_CC comparisons that have been simplified to
+     constants.  */
+  if (GET_MODE_CLASS (mode) == MODE_CC
+      && op1 == const0_rtx
+      && CONST_INT_P (op0))
+    return targetm.simplify_modecc_const (mode, (int)code, op0);
+
   /* If op0 is a compare, extract the comparison arguments from it.  */
   if (GET_CODE (op0) == COMPARE && op1 == const0_rtx)
     {
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index e03f86d..a73c265 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15995,6 +15995,31 @@  ix86_cc_modes_compatible (machine_mode m1, machine_mode m2)
     }
 }
 
+/* Simplify the results of comparisons with constant operands.
+   MODE is a MODE_CC, CODE is a comparison operator, and VAL is
+   a target (and mode) dependent CONST_INT.  */
+
+rtx
+ix86_simplify_modecc_const (machine_mode mode, int code, rtx val)
+{
+  /* For CCCmode, const0_rtx represents the carry flag is clear,
+     otherwise the carry flag is set.  */
+  switch ((enum rtx_code)code)
+    {
+    case LTU:
+      if (mode == E_CCCmode)
+	return val == const0_rtx ? const0_rtx : const_true_rtx;
+      break;
+    case GEU:
+      if (mode == E_CCCmode)
+	return val == const0_rtx ? const_true_rtx : const0_rtx;
+      break;
+    default:
+      break;
+    }
+  return NULL_RTX;
+}
+
 /* Return strategy to use for floating-point.  We assume that fcomi is always
    preferrable where available, since that is also true when looking at size
    (2 bytes, vs. 3 for fnstsw+sahf and at least 5 for fnstsw+test).  */
@@ -24978,6 +25003,9 @@  static bool ix86_libc_has_fast_function (int fcode ATTRIBUTE_UNUSED)
 #undef TARGET_GEN_MEMSET_SCRATCH_RTX
 #define TARGET_GEN_MEMSET_SCRATCH_RTX ix86_gen_scratch_sse_rtx
 
+#undef TARGET_SIMPLIFY_MODECC_CONST
+#define TARGET_SIMPLIFY_MODECC_CONST ix86_simplify_modecc_const
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::ix86_run_selftests