rs6000: Fix vector parity support [PR108699]
Checks
Commit Message
Hi,
The failures on the original failed case builtin-bitops-1.c
and the associated test case pr108699.c here show that the
current support of parity vector mode is wrong on Power.
The hardware insns vprtyb[wdq] which operate on the least
significant bit of each byte per element, they doesn't match
what RTL opcode parity needs, but the current implementation
expands it with them wrongly.
This patch is to fix the handling with one more pre-insn
vpopcntb. It also fixes an oversight having V8HI in VEC_IP,
replaces VParity with VEC_IP, and adjusts the existing
UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB.
I also noticed that we can make use of vpopcnt[bhwd] on
Power8 (AND with 1 on each element), but it's next stage1
content, I plan to support it with one subsequent patch
and make this patch focus on bug fixing.
Bootstrapped and regtested on powerpc64-linux-gnu P{7,8,9}
and powerpc64le-linux-gnu P10.
Is it ok for trunk?
BR,
Kewen
-----
PR target/108699
gcc/ChangeLog:
* config/rs6000/altivec.md (*p9v_parity<mode>2): Rename to ...
(p9v_parityb<mode>2): ... this. Adjust pattern with UNSPEC_PARITYB,
and replace mode_iterator VParity with VEC_IP.
(mode_iterator VParity): Remove.
* config/rs6000/rs6000-builtins.def (VPRTYBD): Replace parityv2di2 with
p9v_paritybv2di2.
(VPRTYBW): Replace parityv4si2 with p9v_paritybv4si2.
(VPRTYBQ): Replace parityv1ti2 with p9v_paritybv1ti2.
* config/rs6000/rs6000.cc (rs6000_emit_parity): Replace
gen_paritysi2_cmpb with gen_paritybsi2, and replace gen_paritydi2_cmpb
with gen_paritybdi2
* config/rs6000/rs6000.md (parity<mode>2_cmpb): Rename to ...
(parityb<mode>2): ... this.
(UNSPEC_PARITY): Rename to ...
(UNSPEC_PARITYB): ... this.
* config/rs6000/vector.md (mode_iterator VEC_IP): Remove V8HI.
(parity<mode>2 with VEC_IP): Expand with popcountv16qi2 and the
corresponding vector parity byte p9v_parityb<mode>2.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/p9-vparity.c: Add scan-assembler-not for vpopcntb
to distinguish parity byte from parity.
* gcc.target/powerpc/pr108699.c: New test.
---
gcc/config/rs6000/altivec.md | 15 +++----
gcc/config/rs6000/rs6000-builtins.def | 6 +--
gcc/config/rs6000/rs6000.cc | 4 +-
gcc/config/rs6000/rs6000.md | 7 ++--
gcc/config/rs6000/vector.md | 14 +++++--
gcc/testsuite/gcc.target/powerpc/p9-vparity.c | 1 +
gcc/testsuite/gcc.target/powerpc/pr108699.c | 42 +++++++++++++++++++
7 files changed, 68 insertions(+), 21 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr108699.c
--
2.27.0
Comments
Hi!
On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote:
> This patch is to fix the handling with one more pre-insn
> vpopcntb. It also fixes an oversight having V8HI in VEC_IP,
> replaces VParity with VEC_IP, and adjusts the existing
> UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB.
Please don't do that. UNSPEC_PARITYB is worse than UNSPEC_PARITY,
even more so for the prtyw etc. instructions.
You might want to express the vector parity insns separately, but then
*do that*, don't rename the normal stuff as well, and use a more obvious
name like UNSPEC_VPARITY please.
> const vsll __builtin_altivec_vprtybd (vsll);
> - VPRTYBD parityv2di2 {}
> + VPRTYBD p9v_paritybv2di2 {}
Why this? Please keep the simpler names if at all possible.
> {
> emit_insn (gen_popcntbsi2 (tmp, src));
> - emit_insn (gen_paritysi2_cmpb (dst, tmp));
> + emit_insn (gen_paritybsi2 (dst, tmp));
> }
It is completely non-obvious what a "paritybsi2" is. There is no such
thing as a "parityb", not for normal people anyway. It is very
important that names give a hint of what they stand for.
The _cmpb of the existing name indicates that a cmpb insn is generated
here as well. Has that changed>
> -(define_insn "parity<mode>2_cmpb"
> +(define_insn "parityb<mode>2"
> [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> - (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")] UNSPEC_PARITY))]
> + (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")]
> + UNSPEC_PARITYB))]
> "TARGET_CMPB && TARGET_POPCNTB"
> "prty<wd> %0,%1"
> [(set_attr "type" "popcnt")])
Hrm, the original name was not so good apparently. Still, please don't
change multiple independent things in one patch, it makes the patch hard
to read and understand and very hard to spot mistakes in.
> @@ -1226,7 +1225,16 @@ (define_expand "popcount<mode>2"
> (define_expand "parity<mode>2"
> [(set (match_operand:VEC_IP 0 "register_operand")
> (parity:VEC_IP (match_operand:VEC_IP 1 "register_operand")))]
> - "TARGET_P9_VECTOR")
> + "TARGET_P9_VECTOR"
> +{
> + rtx op1 = gen_lowpart (V16QImode, operands[1]);
> + rtx res = gen_reg_rtx (V16QImode);
> + emit_insn (gen_popcountv16qi2 (res, op1));
> + emit_insn (gen_p9v_parityb<mode>2 (operands[0],
> + gen_lowpart (<MODE>mode, res)));
> +
> + DONE;
> +})
So first do a patch that is essentially just this?
Later patches can do all other things (also, not do this expand for
TImode at all, ho hum).
Segher
Hi Segher,
Thanks for the review comments!
on 2023/2/16 19:14, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote:
>> This patch is to fix the handling with one more pre-insn
>> vpopcntb. It also fixes an oversight having V8HI in VEC_IP,
>> replaces VParity with VEC_IP, and adjusts the existing
>> UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB.
>
> Please don't do that. UNSPEC_PARITYB is worse than UNSPEC_PARITY,
> even more so for the prtyw etc. instructions.
I thought the scalar insns prty[wd] also operate on byte
(especially on the least significant bit in each byte),
PARITYB(yte) seems better ...
>
> You might want to express the vector parity insns separately, but then
> *do that*, don't rename the normal stuff as well, and use a more obvious
> name like UNSPEC_VPARITY please.
I'll update for vector only. Maybe it's better with UNSPEC_VPARITY*B*?
since the mnemonic has "b"(yte).
>
>> const vsll __builtin_altivec_vprtybd (vsll);
>> - VPRTYBD parityv2di2 {}
>> + VPRTYBD p9v_paritybv2di2 {}
>
> Why this? Please keep the simpler names if at all possible.
The bif would like to map with the vector parity byte insns
directly, the parity<mode>2 can't work here any more.
The name is updated from previous *p9v_parity<mode>2 (becoming
to a named define_insn), I noticed there are some names with
p8v_, p9v_, meant to keep it consistent with the context.
You want this to be simplified as parity*b*v2di2?
>
>> {
>> emit_insn (gen_popcntbsi2 (tmp, src));
>> - emit_insn (gen_paritysi2_cmpb (dst, tmp));
>> + emit_insn (gen_paritybsi2 (dst, tmp));
>> }
>
> It is completely non-obvious what a "paritybsi2" is. There is no such
> thing as a "parityb", not for normal people anyway. It is very
> important that names give a hint of what they stand for.
>
> The _cmpb of the existing name indicates that a cmpb insn is generated
> here as well. Has that changed>
>
I got the same understanding initially, but as you may have noticed
there isn't a cmpb, it seems just to be different from the name
parity<mode>2 so put the condition as one suffix.
>> -(define_insn "parity<mode>2_cmpb"
>> +(define_insn "parityb<mode>2"
>> [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>> - (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")] UNSPEC_PARITY))]
>> + (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")]
>> + UNSPEC_PARITYB))]
>> "TARGET_CMPB && TARGET_POPCNTB"
>> "prty<wd> %0,%1"
>> [(set_attr "type" "popcnt")])
>
> Hrm, the original name was not so good apparently. Still, please don't
> change multiple independent things in one patch, it makes the patch hard
> to read and understand and very hard to spot mistakes in.
Got it, good point.
>
>> @@ -1226,7 +1225,16 @@ (define_expand "popcount<mode>2"
>> (define_expand "parity<mode>2"
>> [(set (match_operand:VEC_IP 0 "register_operand")
>> (parity:VEC_IP (match_operand:VEC_IP 1 "register_operand")))]
>> - "TARGET_P9_VECTOR")
>> + "TARGET_P9_VECTOR"
>> +{
>> + rtx op1 = gen_lowpart (V16QImode, operands[1]);
>> + rtx res = gen_reg_rtx (V16QImode);
>> + emit_insn (gen_popcountv16qi2 (res, op1));
>> + emit_insn (gen_p9v_parityb<mode>2 (operands[0],
>> + gen_lowpart (<MODE>mode, res)));
>> +
>> + DONE;
>> +})
>
> So first do a patch that is essentially just this?
OK, will update and test it again.
>
> Later patches can do all other things (also, not do this expand for
> TImode at all, ho hum).
OK, I guess all the others are for next stage1. :)
BR,
Kewen
Hi!
On Thu, Feb 16, 2023 at 08:06:02PM +0800, Kewen.Lin wrote:
> on 2023/2/16 19:14, Segher Boessenkool wrote:
> > On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote:
> >> This patch is to fix the handling with one more pre-insn
> >> vpopcntb. It also fixes an oversight having V8HI in VEC_IP,
> >> replaces VParity with VEC_IP, and adjusts the existing
> >> UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB.
> >
> > Please don't do that. UNSPEC_PARITYB is worse than UNSPEC_PARITY,
> > even more so for the prtyw etc. instructions.
>
> I thought the scalar insns prty[wd] also operate on byte
> (especially on the least significant bit in each byte),
> PARITYB(yte) seems better ...
The scalar instruction does not include a "b" in the mnemonic, and it
says nothing "byte" or "bit" in the instruction name either. The
existing name is simpler, less confusing, simply better.
> > You might want to express the vector parity insns separately, but then
> > *do that*, don't rename the normal stuff as well, and use a more obvious
> > name like UNSPEC_VPARITY please.
>
> I'll update for vector only. Maybe it's better with UNSPEC_VPARITY*B*?
> since the mnemonic has "b"(yte).
No, you are right that the semantics are pretty much the same. Please
just keep UNSPEC_PARITY everywhere.
> >> const vsll __builtin_altivec_vprtybd (vsll);
> >> - VPRTYBD parityv2di2 {}
> >> + VPRTYBD p9v_paritybv2di2 {}
> >
> > Why this? Please keep the simpler names if at all possible.
>
> The bif would like to map with the vector parity byte insns
> directly, the parity<mode>2 can't work here any more.
Ah, because it cannot use the expander here, it has to be a define_insn?
Why is that?
> The name is updated from previous *p9v_parity<mode>2 (becoming
> to a named define_insn), I noticed there are some names with
> p8v_, p9v_, meant to keep it consistent with the context.
> You want this to be simplified as parity*b*v2di2?
Without the "b". But that would be better then, yes. This is a great
example why p9v_ in the name is not good: most users do not care at all
what ISA version this insn first appeared in.
> > It is completely non-obvious what a "paritybsi2" is. There is no such
> > thing as a "parityb", not for normal people anyway. It is very
> > important that names give a hint of what they stand for.
> >
> > The _cmpb of the existing name indicates that a cmpb insn is generated
> > here as well. Has that changed>
> >
>
> I got the same understanding initially, but as you may have noticed
> there isn't a cmpb, it seems just to be different from the name
> parity<mode>2 so put the condition as one suffix.
Yeah. Something for a future improvement.
> >> -(define_insn "parity<mode>2_cmpb"
> >> +(define_insn "parityb<mode>2"
> >> [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> >> - (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")] UNSPEC_PARITY))]
> >> + (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")]
> >> + UNSPEC_PARITYB))]
> >> "TARGET_CMPB && TARGET_POPCNTB"
> >> "prty<wd> %0,%1"
> >> [(set_attr "type" "popcnt")])
> >
> > Hrm, the original name was not so good apparently. Still, please don't
> > change multiple independent things in one patch, it makes the patch hard
> > to read and understand and very hard to spot mistakes in.
>
> Got it, good point.
And we are in stage 4 so you really really do not want something that
may be a mistake, that may cause any problems :-)
> > So first do a patch that is essentially just this?
>
> OK, will update and test it again.
Thanks!
> > Later patches can do all other things (also, not do this expand for
> > TImode at all, ho hum).
>
> OK, I guess all the others are for next stage1. :)
Yes exactly. And one (small, self-contained) thing per patch please.
Thanks again,
Segher
Hi Segher,
Thanks for the comments!
on 2023/2/16 23:10, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Feb 16, 2023 at 08:06:02PM +0800, Kewen.Lin wrote:
>> on 2023/2/16 19:14, Segher Boessenkool wrote:
>>> On Thu, Feb 16, 2023 at 05:23:40PM +0800, Kewen.Lin wrote:
>>>> This patch is to fix the handling with one more pre-insn
>>>> vpopcntb. It also fixes an oversight having V8HI in VEC_IP,
>>>> replaces VParity with VEC_IP, and adjusts the existing
>>>> UNSPEC_PARITY to a more meaningful name UNSPEC_PARITYB.
>>>
>>> Please don't do that. UNSPEC_PARITYB is worse than UNSPEC_PARITY,
>>> even more so for the prtyw etc. instructions.
>>
>> I thought the scalar insns prty[wd] also operate on byte
>> (especially on the least significant bit in each byte),
>> PARITYB(yte) seems better ...
>
> The scalar instruction does not include a "b" in the mnemonic, and it
> says nothing "byte" or "bit" in the instruction name either. The
> existing name is simpler, less confusing, simply better.
>
>>> You might want to express the vector parity insns separately, but then
>>> *do that*, don't rename the normal stuff as well, and use a more obvious
>>> name like UNSPEC_VPARITY please.
>>
>> I'll update for vector only. Maybe it's better with UNSPEC_VPARITY*B*?
>> since the mnemonic has "b"(yte).
>
> No, you are right that the semantics are pretty much the same. Please
> just keep UNSPEC_PARITY everywhere.
OK, since it has UNSPEC, I would hope the reader can realize it's
different from RTL opcode parity and mainly operating on byte. :)
>
>>>> const vsll __builtin_altivec_vprtybd (vsll);
>>>> - VPRTYBD parityv2di2 {}
>>>> + VPRTYBD p9v_paritybv2di2 {}
>>>
>>> Why this? Please keep the simpler names if at all possible.
>>
>> The bif would like to map with the vector parity byte insns
>> directly, the parity<mode>2 can't work here any more.
>
> Ah, because it cannot use the expander here, it has to be a define_insn?
No, the above statement seems to cause some misunderstanding, let me clarify:
first, the built-in functions __builtin_altivec_vprtyb[wdq] require to be
mapped to hardware insns vprtyb[wdq] directly as the functions name show.
Before this patch, the standard pattern name parity<mode>2 expands to those
insns directly (wrongly), so it's fine to use those expanders here. After
this patch, those expands get fixed to get parity for each vector element
(vpopcntb + vprtyb*), they are not valid to be used for expanding these
built-in functions (not 1-1 map any more), so this patch fixes it with
the correct name which maps to vprtyb*.
> Why is that?
>
>> The name is updated from previous *p9v_parity<mode>2 (becoming
>> to a named define_insn), I noticed there are some names with
>> p8v_, p9v_, meant to keep it consistent with the context.
>> You want this to be simplified as parity*b*v2di2?
>
> Without the "b". But that would be better then, yes. This is a great
> example why p9v_ in the name is not good: most users do not care at all
> what ISA version this insn first appeared in.
The name without "b" is standard pattern name, whose semantic doesn't align
with what these insns provide and we already have the matched expander with
it ("parity<mode>2"), so we can't use the name here :(. As you felt a name
with "b" is better than "p9v_*", I'll go with "parityb" then. :)
>>> Later patches can do all other things (also, not do this expand for
>>> TImode at all, ho hum).
>>
>> OK, I guess all the others are for next stage1. :)
>
> Yes exactly. And one (small, self-contained) thing per patch please.
Got it, thanks again!
BR,
Kewen
Hi!
On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote:
> on 2023/2/16 23:10, Segher Boessenkool wrote:
> > No, you are right that the semantics are pretty much the same. Please
> > just keep UNSPEC_PARITY everywhere.
>
> OK, since it has UNSPEC, I would hope the reader can realize it's
> different from RTL opcode parity and mainly operating on byte. :)
Yeah. Often, even usually, unspecs differ in some crucial ways from
similarly named RTL expressions: you would not want an unspec at all
otherwise!
> > Ah, because it cannot use the expander here, it has to be a define_insn?
>
> No, the above statement seems to cause some misunderstanding, let me clarify:
> first, the built-in functions __builtin_altivec_vprtyb[wdq] require to be
> mapped to hardware insns vprtyb[wdq] directly as the functions name show.
No, that is not true at all. Builtins do **not** guarantee to expand to
any specific machine instruction. This is one reason why such names are
not so good, are quite misleading.
If you want specific machine insns, you need to use inline asm, that is
what it is there for. Builtins generate code with some specified
semantics, nothing more, nothing less; just like everything else the
compiler does, the "as-if" rule in full swing.
> >> The name is updated from previous *p9v_parity<mode>2 (becoming
> >> to a named define_insn), I noticed there are some names with
> >> p8v_, p9v_, meant to keep it consistent with the context.
> >> You want this to be simplified as parity*b*v2di2?
> >
> > Without the "b". But that would be better then, yes. This is a great
> > example why p9v_ in the name is not good: most users do not care at all
> > what ISA version this insn first appeared in.
>
> The name without "b" is standard pattern name, whose semantic doesn't align
> with what these insns provide
Heh, it is never easy is it? :-)
> and we already have the matched expander with
> it ("parity<mode>2"), so we can't use the name here :(. As you felt a name
> with "b" is better than "p9v_*", I'll go with "parityb" then. :)
Something longer and less confusing please. Or maybe just with the insn
name, that isn't a problem in the machine desription (as it is for
builtin names or other user-facing stuff). "rs6000_vprtyb" maybe?
Segher
Hi Segher,
Thanks for the comments!
on 2023/2/19 20:12, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote:
>> on 2023/2/16 23:10, Segher Boessenkool wrote:
>>> No, you are right that the semantics are pretty much the same. Please
>>> just keep UNSPEC_PARITY everywhere.
>>
>> OK, since it has UNSPEC, I would hope the reader can realize it's
>> different from RTL opcode parity and mainly operating on byte. :)
>
> Yeah. Often, even usually, unspecs differ in some crucial ways from
> similarly named RTL expressions: you would not want an unspec at all
> otherwise!
>
>>> Ah, because it cannot use the expander here, it has to be a define_insn?
>>
>> No, the above statement seems to cause some misunderstanding, let me clarify:
>> first, the built-in functions __builtin_altivec_vprtyb[wdq] require to be
>> mapped to hardware insns vprtyb[wdq] directly as the functions name show.
>
> No, that is not true at all. Builtins do **not** guarantee to expand to
> any specific machine instruction. This is one reason why such names are
> not so good, are quite misleading.
OK, I agree that we don't claim there is a 1-1 map, but for those bifs
*_(vsx|altivec)_<mnemonic>, it looks that we map them with the corresponding hw
insn (mnemonic in the name) all the time. IMHO, it makes sense, since otherwise
it would be quite misleading (it should use one general name instead).
For this particular built-in __builtin_altivec_vprtyb[wdq], I think we all
agree that we don't want to expand it into vpopcntb + vprtyb[wdq]. :)
>
> If you want specific machine insns, you need to use inline asm, that is
> what it is there for. Builtins generate code with some specified
> semantics, nothing more, nothing less; just like everything else the
> compiler does, the "as-if" rule in full swing.
>
>>>> The name is updated from previous *p9v_parity<mode>2 (becoming
>>>> to a named define_insn), I noticed there are some names with
>>>> p8v_, p9v_, meant to keep it consistent with the context.
>>>> You want this to be simplified as parity*b*v2di2?
>>>
>>> Without the "b". But that would be better then, yes. This is a great
>>> example why p9v_ in the name is not good: most users do not care at all
>>> what ISA version this insn first appeared in.
>>
>> The name without "b" is standard pattern name, whose semantic doesn't align
>> with what these insns provide
>
> Heh, it is never easy is it? :-)
Yeah. :)
>
>> and we already have the matched expander with
>> it ("parity<mode>2"), so we can't use the name here :(. As you felt a name
>> with "b" is better than "p9v_*", I'll go with "parityb" then. :)
>
> Something longer and less confusing please. Or maybe just with the insn
> name, that isn't a problem in the machine desription (as it is for
> builtin names or other user-facing stuff). "rs6000_vprtyb" maybe?
Thanks for the suggestion! Will go with "rs6000_vprtyb" if the others in
v2 [1] look good to you.
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612212.html
BR,
Kewen
@@ -215,13 +215,6 @@ (define_mode_iterator VM2 [V4SI
;; versus floating point
(define_mode_attr VS_sxwsp [(V4SI "sxw") (V4SF "sp")])
-;; Specific iterator for parity which does not have a byte/half-word form, but
-;; does have a quad word form
-(define_mode_iterator VParity [V4SI
- V2DI
- V1TI
- TI])
-
(define_mode_attr VI_char [(V2DI "d") (V4SI "w") (V8HI "h") (V16QI "b")])
(define_mode_attr VI_scalar [(V2DI "DI") (V4SI "SI") (V8HI "HI") (V16QI "QI")])
(define_mode_attr VI_unit [(V16QI "VECTOR_UNIT_ALTIVEC_P (V16QImode)")
@@ -4195,9 +4188,11 @@ (define_insn "*p8v_popcount<mode>2"
[(set_attr "type" "vecsimple")])
;; Vector parity
-(define_insn "*p9v_parity<mode>2"
- [(set (match_operand:VParity 0 "register_operand" "=v")
- (parity:VParity (match_operand:VParity 1 "register_operand" "v")))]
+(define_insn "p9v_parityb<mode>2"
+ [(set (match_operand:VEC_IP 0 "register_operand" "=v")
+ (unspec:VEC_IP
+ [(match_operand:VEC_IP 1 "register_operand" "v")]
+ UNSPEC_PARITYB))]
"TARGET_P9_VECTOR"
"vprtyb<wd> %0,%1"
[(set_attr "type" "vecsimple")])
@@ -2666,13 +2666,13 @@
VMSUMUDM altivec_vmsumudm {}
const vsll __builtin_altivec_vprtybd (vsll);
- VPRTYBD parityv2di2 {}
+ VPRTYBD p9v_paritybv2di2 {}
const vsq __builtin_altivec_vprtybq (vsq);
- VPRTYBQ parityv1ti2 {}
+ VPRTYBQ p9v_paritybv1ti2 {}
const vsi __builtin_altivec_vprtybw (vsi);
- VPRTYBW parityv4si2 {}
+ VPRTYBW p9v_paritybv4si2 {}
const vsll __builtin_altivec_vrldmi (vsll, vsll, vsll);
VRLDMI altivec_vrldmi {}
@@ -22973,12 +22973,12 @@ rs6000_emit_parity (rtx dst, rtx src)
if (mode == SImode)
{
emit_insn (gen_popcntbsi2 (tmp, src));
- emit_insn (gen_paritysi2_cmpb (dst, tmp));
+ emit_insn (gen_paritybsi2 (dst, tmp));
}
else
{
emit_insn (gen_popcntbdi2 (tmp, src));
- emit_insn (gen_paritydi2_cmpb (dst, tmp));
+ emit_insn (gen_paritybdi2 (dst, tmp));
}
return;
}
@@ -109,7 +109,7 @@ (define_c_enum "unspec"
UNSPEC_MACHOPIC_OFFSET
UNSPEC_BPERM
UNSPEC_COPYSIGN
- UNSPEC_PARITY
+ UNSPEC_PARITYB
UNSPEC_CMPB
UNSPEC_FCTIW
UNSPEC_FCTID
@@ -2501,9 +2501,10 @@ (define_expand "parity<mode>2"
DONE;
})
-(define_insn "parity<mode>2_cmpb"
+(define_insn "parityb<mode>2"
[(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
- (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")] UNSPEC_PARITY))]
+ (unspec:GPR [(match_operand:GPR 1 "gpc_reg_operand" "r")]
+ UNSPEC_PARITYB))]
"TARGET_CMPB && TARGET_POPCNTB"
"prty<wd> %0,%1"
[(set_attr "type" "popcnt")])
@@ -33,8 +33,7 @@ (define_mode_iterator VEC_IC [V16QI V8HI V4SI V2DI (V1TI "TARGET_POWER10")])
(define_mode_iterator VEC_TI [V1TI TI])
;; Vector int modes for parity
-(define_mode_iterator VEC_IP [V8HI
- V4SI
+(define_mode_iterator VEC_IP [V4SI
V2DI
V1TI
TI])
@@ -1226,7 +1225,16 @@ (define_expand "popcount<mode>2"
(define_expand "parity<mode>2"
[(set (match_operand:VEC_IP 0 "register_operand")
(parity:VEC_IP (match_operand:VEC_IP 1 "register_operand")))]
- "TARGET_P9_VECTOR")
+ "TARGET_P9_VECTOR"
+{
+ rtx op1 = gen_lowpart (V16QImode, operands[1]);
+ rtx res = gen_reg_rtx (V16QImode);
+ emit_insn (gen_popcountv16qi2 (res, op1));
+ emit_insn (gen_p9v_parityb<mode>2 (operands[0],
+ gen_lowpart (<MODE>mode, res)));
+
+ DONE;
+})
;; Same size conversions
@@ -105,3 +105,4 @@ parity_ti_4u (__uint128_t a)
/* { dg-final { scan-assembler "vprtybd" } } */
/* { dg-final { scan-assembler "vprtybq" } } */
/* { dg-final { scan-assembler "vprtybw" } } */
+/* { dg-final { scan-assembler-not "vpopcntb" } } */
new file mode 100644
@@ -0,0 +1,42 @@
+/* { dg-run } */
+/* { dg-options "-O2 -ftree-vectorize -fno-vect-cost-model" } */
+
+#define N 16
+
+unsigned long long vals[N];
+unsigned int res[N];
+unsigned int expects[N] = {0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0};
+
+unsigned long long inputs[N]
+ = {0x0000000000000000ULL, 0x0000000000000001ULL, 0x8000000000000000ULL,
+ 0x0000000000000002ULL, 0x4000000000000000ULL, 0x0000000100000000ULL,
+ 0x0000000080000000ULL, 0xa5a5a5a5a5a5a5a5ULL, 0x5a5a5a5a5a5a5a5aULL,
+ 0xcafecafe00000000ULL, 0x0000cafecafe0000ULL, 0x00000000cafecafeULL,
+ 0x8070600000000000ULL, 0xffffffffffffffffULL};
+
+__attribute__ ((noipa)) void
+init ()
+{
+ for (int i = 0; i < N; i++)
+ vals[i] = inputs[i];
+}
+
+__attribute__ ((noipa)) void
+do_parity ()
+{
+ for (int i = 0; i < N; i++)
+ res[i] = __builtin_parityll (vals[i]);
+}
+
+int
+main (void)
+{
+ init ();
+ do_parity ();
+ for (int i = 0; i < N; i++)
+ if (res[i] != expects[i])
+ __builtin_abort();
+
+ return 0;
+}
+