[x86] Add alternate representation for {and,or,xor}b %ah,%dh.
Commit Message
A patch that I'm working on to improve RTL simplifications in the
middle-end results in the regression of pr78904-1b.c, due to changes in
the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic.
This patch avoids/prevents those failures by adding support for the
alternate representation, duplicating the existing *<code>qi_ext<mode>_2
as *<code>qi_ext<mode>_3 (the new version also replacing any_or with
any_logic to provide *andqi_ext<mode>_3 in the same pattern). Removing
the original pattern isn't trivial, as it's generated by define_split,
but this can be investigated after the other pieces are approved.
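For readers without the testsuite to hand, the kind of source that leads to
high-byte logic looks roughly like the fragment below. This is an
illustrative sketch only, not a copy of pr78904-1b.c: the struct layout and
the name test_xor are assumptions, and whether a given compiler version
actually emits xorb on the %ah-style registers for it depends on the ABI and
on the optimization level.

```c
/* Hypothetical example: a byte field that occupies bits 8..15 of the
   register holding the struct when it is passed by value.  */
struct S1
{
  unsigned char pad1;
  unsigned char val;   /* second byte, i.e. the "high byte" of %ax etc. */
  unsigned short pad2;
  unsigned int pad3;
};

/* XOR only the second byte of A with the second byte of B, leaving all
   other fields of A untouched.  */
struct S1
test_xor (struct S1 a, struct S1 b)
{
  a.val ^= b.val;
  return a;
}
```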
The current representation of this instruction is:
(set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
        (const_int 8 [0x8])
        (const_int 8 [0x8]))
    (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94)
                    (const_int 8 [0x8])
                    (const_int 8 [0x8])) 0)
            (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
                    (const_int 8 [0x8])
                    (const_int 8 [0x8])) 0)) 0))
After my proposed middle-end improvement, we attempt to recognize:
(set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
        (const_int 8 [0x8])
        (const_int 8 [0x8]))
    (zero_extract:DI (xor:DI (reg:DI 94)
            (reg/v:DI 87 [ aD.2763 ]))
        (const_int 8 [0x8])
        (const_int 8 [0x8])))
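The two forms above compute the same value, which can be checked with a small
C model. This is only a sketch under stated assumptions: the helper names
(extract_hi8, insert_hi8, xor_hi8_current, xor_hi8_new, check_forms_agree) are
made up for illustration and are not part of the patch, and the model assumes
that only the low eight bits of the source matter when storing into an 8-bit
zero_extract destination.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extract:DI x (const_int 8) (const_int 8)): bits 8..15.  */
static uint64_t
extract_hi8 (uint64_t x)
{
  return (x >> 8) & 0xff;
}

/* Models a set of that zero_extract: the low 8 bits of VAL replace
   bits 8..15 of DST, all other bits of DST are preserved.  */
static uint64_t
insert_hi8 (uint64_t dst, uint64_t val)
{
  return (dst & ~(uint64_t) 0xff00) | ((val & 0xff) << 8);
}

/* Current form: xor:QI of the two extracted bytes, then widened.  */
static uint64_t
xor_hi8_current (uint64_t dst, uint64_t src)
{
  uint8_t b = (uint8_t) extract_hi8 (src) ^ (uint8_t) extract_hi8 (dst);
  return insert_hi8 (dst, b);
}

/* New form: xor:DI of the full registers, then extract bits 8..15.  */
static uint64_t
xor_hi8_new (uint64_t dst, uint64_t src)
{
  return insert_hi8 (dst, extract_hi8 (src ^ dst));
}

/* Compare both forms for every pair of high-byte values; OTHER_BITS
   supplies arbitrary contents for the bits outside 8..15.  */
static void
check_forms_agree (uint64_t other_bits)
{
  const uint64_t mask = ~(uint64_t) 0xff00;
  for (uint64_t a = 0; a < 256; a++)
    for (uint64_t b = 0; b < 256; b++)
      {
        uint64_t dst = (other_bits & mask) | (a << 8);
        uint64_t src = (~other_bits & mask) | (b << 8);
        assert (xor_hi8_current (dst, src) == xor_hi8_new (dst, src));
        /* Bits outside the high byte of DST must be unchanged.  */
        assert ((xor_hi8_new (dst, src) & mask) == (dst & mask));
      }
}
```

The same argument applies to and and ior, since all three operate bitwise and
therefore commute with extracting a bit range.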
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures. Ok for mainline?
2023-06-18 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn.
Thanks in advance,
Roger
--
Comments
On Sun, Jun 18, 2023 at 11:35 AM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> A patch that I'm working on to improve RTL simplifications in the
> middle-end results in the regression of pr78904-1b.c, due to changes in
> the canonical representation of high-byte (%ah, %bh, %ch, %dh) logic.
> This patch avoids/prevents those failures by adding support for the
> alternate representation, duplicating the existing *<code>qi_ext<mode>_2
> as *<code>qi_ext<mode>_3 (the new version also replacing any_or with
> any_logic to provide *andqi_ext<mode>_3 in the same pattern). Removing
> the original pattern isn't trivial, as it's generated by define_split,
> but this can be investigated after the other pieces are approved.
IIRC, I have added these patterns to please combine, based on what
combine generates for the above mentioned testcases. I believe there
is no canonical representation of high-byte logic, so these patterns
are what was appropriate at the time. So, yes, a canonical
representation is the way to go.
Also, please note PR82524. I have a solution for this: we need a
define_insn_and_split that performs an additional copy of the non-matched
register (one pattern in i386.md already does that, but considering
that these patterns are not that common and are rarely used, I left
the others as they are).
>
> The current representation of this instruction is:
>
> (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
>         (const_int 8 [0x8])
>         (const_int 8 [0x8]))
>     (subreg:DI (xor:QI (subreg:QI (zero_extract:DI (reg:DI 94)
>                     (const_int 8 [0x8])
>                     (const_int 8 [0x8])) 0)
>             (subreg:QI (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
>                     (const_int 8 [0x8])
>                     (const_int 8 [0x8])) 0)) 0))
>
> after my proposed middle-end improvement, we attempt to recognize:
>
> (set (zero_extract:DI (reg/v:DI 87 [ aD.2763 ])
>         (const_int 8 [0x8])
>         (const_int 8 [0x8]))
>     (zero_extract:DI (xor:DI (reg:DI 94)
>             (reg/v:DI 87 [ aD.2763 ]))
>         (const_int 8 [0x8])
>         (const_int 8 [0x8])))
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures. Ok for mainline?
I would rather commit this fix after the regression actually happens. The
middle-end change regresses a relatively rarely used simplification, and
scan-asm regressions are not that critical. So, OK for mainline, but only
once the regression has happened, so that it is clear what this patch fixes.
Thanks,
Uros.
>
>
> 2023-06-18 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> * config/i386/i386.md (*<code>qi_ext<mode>_3): New define_insn.
>
>
> Thanks in advance,
> Roger
> --
>
@@ -10848,6 +10848,8 @@
[(set_attr "type" "alu")
(set_attr "mode" "QI")])
+;; *andqi_ext<mode>_3 is defined via *<code>qi_ext<mode>_3 below.
+
;; Convert wide AND instructions with immediate operand to shorter QImode
;; equivalents when possible.
;; Don't do the splitting with memory operands, since it introduces risk
@@ -11560,6 +11562,26 @@
[(set_attr "type" "alu")
(set_attr "mode" "QI")])
+(define_insn "*<code>qi_ext<mode>_3"
+ [(set (zero_extract:SWI248
+ (match_operand 0 "int248_register_operand" "+Q")
+ (const_int 8)
+ (const_int 8))
+ (zero_extract:SWI248
+ (any_logic:SWI248
+ (match_operand 1 "int248_register_operand" "%0")
+ (match_operand 2 "int248_register_operand" "Q"))
+ (const_int 8)
+ (const_int 8)))
+ (clobber (reg:CC FLAGS_REG))]
+ "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
+ /* FIXME: without this LRA can't reload this pattern, see PR82524. */
+ && (rtx_equal_p (operands[0], operands[1])
+ || rtx_equal_p (operands[0], operands[2]))"
+ "<logic>{b}\t{%h2, %h0|%h0, %h2}"
+ [(set_attr "type" "alu")
+ (set_attr "mode" "QI")])
+
;; Convert wide OR instructions with immediate operand to shorter QImode
;; equivalents when possible.
;; Don't do the splitting with memory operands, since it introduces risk