rs6000: vec_cmpne confusing implementation
Checks
Commit Message
GCC maintainers:
The following patch cleans up the definition for the
__builtin_altivec_vcmpnet. The current implementation implies that the
builtin is only supported on Power 9 since it is defined under the
Power 9 stanza. However the builtin has no ISA restrictions as stated
in the Power Vector Intrinsic Programming Reference document. The
current built-in works because the builtin gets replaced during GIMPLE
folding by a simple not-equal operator so it doesn't get expanded and
checked for Power 9 code generation.
This patch moves the definition to the Altivec stanza in the builtin
definition file. The builtin then generates code for Power 8 and
earlier processors or Power 9 and later processors.
The patch has been tested on Power 8 and Power 9 with no regressions.
Please let me know if the patch is acceptable for mainline. Thanks.
Carl
-----------------------------------------
rs6000: vec_cmpne confusing implementation
__builtin_altivec_vcmpnet does not have any ISA restrictions. The current
built-in definitions for vcmpneb, vcmpneh, vcmpnew, vcmpnet are defined
under the Power 9 section. This implies they are only supported on Power
9 and above when in fact they are defined and work on Power 8.
This patch moves the definitions to the Altivec stanza and maps the
builtin dispatches to different code generation for Power 8 and earlier
or for Power 9 and later.
gcc/ChangeLog:
* config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew.
vcmpnet): Move definitions to Altivec stanza.
* config/rs6000/vsx.md (cmpneb, vcmpneh, vcmpnew, vcmpnet): New
define_expand.
(cmpneb, vcmpneh, vcmpnew, vcmpnet): Rename define_insn.
Patch has been tested on Power 8 and Power 9 with no regressions.
---
gcc/config/rs6000/rs6000-builtins.def | 24 ++++++------
gcc/config/rs6000/vsx.md | 54 +++++++++++++++++++++++++--
2 files changed, 63 insertions(+), 15 deletions(-)
Comments
Hi Carl,
on 2023/5/3 23:30, Carl Love via Gcc-patches wrote:
> GCC maintainers:
>
> The following patch cleans up the definition for the
> __builtin_altivec_vcmpnet. The current implementation implies that the
> builtin is only supported on Power 9 since it is defined under the
> Power 9 stanza. However the builtin has no ISA restrictions as stated
> in the Power Vector Intrinsic Programming Reference document. The
> current built-in works because the builtin gets replaced during GIMPLE
> folding by a simple not-equal operator so it doesn't get expanded and
> checked for Power 9 code generation.
>
> This patch moves the definition to the Altivec stanza in the builtin
> definition file. The builtin then generates code for Power 8 and
> earlier processors or Power 9 and later processors.
>
> The patch has been tested on Power 8 and Power 9 with no regressions.
>
> Please let me know if the patch is acceptable for mainline. Thanks.
>
> Carl
>
> -----------------------------------------
> rs6000: vec_cmpne confusing implementation
A better subject seems to be "Make __builtin_altivec_vcmpne{b,h,w,t}
supported under altivec" to match better what this patch does?
>
> __builtin_altivec_vcmpnet does not have any ISA restrictions. The current
Power Vector Intrinsic Programming Reference doesn't define functions
__builtin_altivec_vcmpne* but only vec_cmpne, which hasn't restricted any ISA yet.
So I think what you proposed is actually to extend the current built-in functions
__builtin_altivec_vcmpne{b,h,w,t} not longer power9 and later only.
This patch needs some test cases for the corresponding changes, but I'm curious
why people want to use these built-ins instead of just using vec_cmpne? The
latter looks more general and better.
> built-in definitions for vcmpneb, vcmpneh, vcmpnew, vcmpnet are defined
> under the Power 9 section. This implies they are only supported on Power
> 9 and above when in fact they are defined and work on Power 8.
>
> This patch moves the definitions to the Altivec stanza and maps the
> builtin dispatches to different code generation for Power 8 and earlier
> or for Power 9 and later.
>
> gcc/ChangeLog:
>
> * config/rs6000/rs6000-builtins.def (vcmpneb, vcmpneh, vcmpnew.
> vcmpnet): Move definitions to Altivec stanza.
> * config/rs6000/vsx.md (cmpneb, vcmpneh, vcmpnew, vcmpnet): New
> define_expand.
> (cmpneb, vcmpneh, vcmpnew, vcmpnet): Rename define_insn.
>
> Patch has been tested on Power 8 and Power 9 with no regressions.
> ---
> gcc/config/rs6000/rs6000-builtins.def | 24 ++++++------
> gcc/config/rs6000/vsx.md | 54 +++++++++++++++++++++++++--
> 2 files changed, 63 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 638d0bc72ca..adb4122be29 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -641,6 +641,18 @@
> const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
> VCMPGTUW_P vector_gtu_v4si_p {pred}
>
> + const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> + VCMPNEB vcmpneb {}
> +
> + const vss __builtin_altivec_vcmpneh (vss, vss);
> + VCMPNEH vcmpneh {}
> +
> + const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> + VCMPNEW vcmpnew {}
> +
The expanders for these three can just start with
separated eq and not, later if vcmpne{b,h,w} get
supported, these two can be combined further.
> + const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> + VCMPNET vcmpnet {}
The current define_expand for vcmpnet guards TARGET_POWER10,
once you have test cases to cover these changes, you will found
the current expander isn't enough for what you want.
> +
> const vsi __builtin_altivec_vctsxs (vf, const int<5>);
> VCTSXS altivec_vctsxs {}
>
> @@ -2599,9 +2611,6 @@
> const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
> VCMPAEW_P vector_ae_v4si_p {}
>
> - const vsc __builtin_altivec_vcmpneb (vsc, vsc);
> - VCMPNEB vcmpneb {}
> -
> const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
> VCMPNEB_P vector_ne_v16qi_p {}
>
> @@ -2614,15 +2623,9 @@
> const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
> VCMPNEFP_P vector_ne_v4sf_p {}
>
> - const vss __builtin_altivec_vcmpneh (vss, vss);
> - VCMPNEH vcmpneh {}
> -
> const signed int __builtin_altivec_vcmpneh_p (vss, vss);
> VCMPNEH_P vector_ne_v8hi_p {}
>
> - const vsi __builtin_altivec_vcmpnew (vsi, vsi);
> - VCMPNEW vcmpnew {}
> -
> const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
> VCMPNEW_P vector_ne_v4si_p {}
>
> @@ -3203,9 +3206,6 @@
> const signed int __builtin_altivec_vcmpgtut_p (signed int, vuq, vuq);
> VCMPGTUT_P vector_gtu_v1ti_p {pred}
>
> - const vbq __builtin_altivec_vcmpnet (vsq, vsq);
> - VCMPNET vcmpnet {}
> -
> const signed int __builtin_altivec_vcmpnet_p (vsq, vsq);
> VCMPNET_P vector_ne_v1ti_p {}
>
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 7d845df5c2d..3f05e3e6d00 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5652,8 +5652,56 @@
> DONE;
> })
>
> +;; Expand for builtin vcmpneb
> +(define_expand "vcmpneb"
> + [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> + (not:V16QI
> + (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> + (match_operand:V16QI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpneb_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequb_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
> +
It's better to have an explicit condition TARGET_ALTIVEC to match the stanza.
As mentioned above, you can just expand it with eq and not with something like:
(define_expand "vcmpneb"
[(set (match_operand:V16QI 3 "altivec_register_operand" "=v")
(eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
(match_operand:V16QI 2 "altivec_register_operand" "v")))
(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
(not:V16QI (match_dup 3))]
"TARGET_ALTIVEC"
{
operands[3] = gen_reg_rtx (V16QImode);
});
And move these expands to altivec.md since this is not vsx specific.
> +;; Expand for builtin vcmpneh
> +(define_expand "vcmpneh"
> + [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
> + (not:V8HI
> + (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> + (match_operand:V8HI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpneh_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequh_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
> +
Ditto,
> +;; Expand for builtin vcmpnew
> +(define_expand "vcmpnew"
> + [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> + (not:V4SI
> + (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
> + (match_operand:V4SI 2 "altivec_register_operand" "v"))))]
> + ""
> + {
> + if (TARGET_P9_VECTOR)
> + emit_insn (gen_vcmpnew_p9 (operands[0], operands[1], operands[2]));
> + else
> + emit_insn (gen_altivec_vcmpequw_p (operands[0], operands[1],
> + operands[2]));
> + DONE;
> + })
Ditto, they can be merged with some mode iterators further.
> +
> ;; Vector Compare Not Equal Byte (specified/not+eq:)
> -(define_insn "vcmpneb"
> +(define_insn "vcmpneb_p9"
> [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
> (not:V16QI
> (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
> @@ -5703,7 +5751,7 @@
> [(set_attr "type" "vecsimple")])
>
> ;; Vector Compare Not Equal Half Word (specified/not+eq:)
> -(define_insn "vcmpneh"
> +(define_insn "vcmpneh_p9"
> [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
> (not:V8HI
> (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
> @@ -5723,7 +5771,7 @@
> [(set_attr "type" "vecsimple")])
>
> ;; Vector Compare Not Equal Word (specified/not+eq:)
> -(define_insn "vcmpnew"
> +(define_insn "vcmpnew_p9"
> [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
> (not:V4SI
> (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
These are not needed with the above changes.
BR,
Kewen
@@ -641,6 +641,18 @@
const int __builtin_altivec_vcmpgtuw_p (int, vsi, vsi);
VCMPGTUW_P vector_gtu_v4si_p {pred}
+ const vsc __builtin_altivec_vcmpneb (vsc, vsc);
+ VCMPNEB vcmpneb {}
+
+ const vss __builtin_altivec_vcmpneh (vss, vss);
+ VCMPNEH vcmpneh {}
+
+ const vsi __builtin_altivec_vcmpnew (vsi, vsi);
+ VCMPNEW vcmpnew {}
+
+ const vbq __builtin_altivec_vcmpnet (vsq, vsq);
+ VCMPNET vcmpnet {}
+
const vsi __builtin_altivec_vctsxs (vf, const int<5>);
VCTSXS altivec_vctsxs {}
@@ -2599,9 +2611,6 @@
const signed int __builtin_altivec_vcmpaew_p (vsi, vsi);
VCMPAEW_P vector_ae_v4si_p {}
- const vsc __builtin_altivec_vcmpneb (vsc, vsc);
- VCMPNEB vcmpneb {}
-
const signed int __builtin_altivec_vcmpneb_p (vsc, vsc);
VCMPNEB_P vector_ne_v16qi_p {}
@@ -2614,15 +2623,9 @@
const signed int __builtin_altivec_vcmpnefp_p (vf, vf);
VCMPNEFP_P vector_ne_v4sf_p {}
- const vss __builtin_altivec_vcmpneh (vss, vss);
- VCMPNEH vcmpneh {}
-
const signed int __builtin_altivec_vcmpneh_p (vss, vss);
VCMPNEH_P vector_ne_v8hi_p {}
- const vsi __builtin_altivec_vcmpnew (vsi, vsi);
- VCMPNEW vcmpnew {}
-
const signed int __builtin_altivec_vcmpnew_p (vsi, vsi);
VCMPNEW_P vector_ne_v4si_p {}
@@ -3203,9 +3206,6 @@
const signed int __builtin_altivec_vcmpgtut_p (signed int, vuq, vuq);
VCMPGTUT_P vector_gtu_v1ti_p {pred}
- const vbq __builtin_altivec_vcmpnet (vsq, vsq);
- VCMPNET vcmpnet {}
-
const signed int __builtin_altivec_vcmpnet_p (vsq, vsq);
VCMPNET_P vector_ne_v1ti_p {}
@@ -5652,8 +5652,56 @@
DONE;
})
+;; Expand for builtin vcmpneb
+(define_expand "vcmpneb"
+ [(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
+ (not:V16QI
+ (eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
+ (match_operand:V16QI 2 "altivec_register_operand" "v"))))]
+ ""
+ {
+ if (TARGET_P9_VECTOR)
+ emit_insn (gen_vcmpneb_p9 (operands[0], operands[1], operands[2]));
+ else
+ emit_insn (gen_altivec_vcmpequb_p (operands[0], operands[1],
+ operands[2]));
+ DONE;
+ })
+
+;; Expand for builtin vcmpneh
+(define_expand "vcmpneh"
+ [(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
+ (not:V8HI
+ (eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
+ (match_operand:V8HI 2 "altivec_register_operand" "v"))))]
+ ""
+ {
+ if (TARGET_P9_VECTOR)
+ emit_insn (gen_vcmpneh_p9 (operands[0], operands[1], operands[2]));
+ else
+ emit_insn (gen_altivec_vcmpequh_p (operands[0], operands[1],
+ operands[2]));
+ DONE;
+ })
+
+;; Expand for builtin vcmpnew
+(define_expand "vcmpnew"
+ [(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
+ (not:V4SI
+ (eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")
+ (match_operand:V4SI 2 "altivec_register_operand" "v"))))]
+ ""
+ {
+ if (TARGET_P9_VECTOR)
+ emit_insn (gen_vcmpnew_p9 (operands[0], operands[1], operands[2]));
+ else
+ emit_insn (gen_altivec_vcmpequw_p (operands[0], operands[1],
+ operands[2]));
+ DONE;
+ })
+
;; Vector Compare Not Equal Byte (specified/not+eq:)
-(define_insn "vcmpneb"
+(define_insn "vcmpneb_p9"
[(set (match_operand:V16QI 0 "altivec_register_operand" "=v")
(not:V16QI
(eq:V16QI (match_operand:V16QI 1 "altivec_register_operand" "v")
@@ -5703,7 +5751,7 @@
[(set_attr "type" "vecsimple")])
;; Vector Compare Not Equal Half Word (specified/not+eq:)
-(define_insn "vcmpneh"
+(define_insn "vcmpneh_p9"
[(set (match_operand:V8HI 0 "altivec_register_operand" "=v")
(not:V8HI
(eq:V8HI (match_operand:V8HI 1 "altivec_register_operand" "v")
@@ -5723,7 +5771,7 @@
[(set_attr "type" "vecsimple")])
;; Vector Compare Not Equal Word (specified/not+eq:)
-(define_insn "vcmpnew"
+(define_insn "vcmpnew_p9"
[(set (match_operand:V4SI 0 "altivec_register_operand" "=v")
(not:V4SI
(eq:V4SI (match_operand:V4SI 1 "altivec_register_operand" "v")