[1/2] x86: correct / simplify @vec_extract_hi_<mode> and vec_extract_hi_v32qi

Message ID 9406368b-8b88-9da9-a89c-1c610eb22f66@suse.com
State Accepted
Series x86: vec_extract_* adjustments

Checks

Context                 Check    Description
snail/gcc-patch-check   success  Github commit url

Commit Message

Jan Beulich July 5, 2023, 8 a.m. UTC
In each pattern, the middle alternative was unusable without enabling
AVX512DQ (in addition to AVX512VL), which is entirely unrelated here. The
last alternative is usable only with AVX512VL (due to type restrictions
on what may be put in the upper 16 YMM registers), and hence was
pointlessly forcing 512-bit mode (without actually reflecting that in
the "mode" attribute).

gcc/

	* config/i386/sse.md (@vec_extract_hi_<mode>): Drop last
	alternative. Switch new last alternative's "isa" attribute to
	"avx512vl".
	(vec_extract_hi_v32qi): Likewise.
---
Like elsewhere I suspect "prefix_extra" is bogus here and should be
dropped.

Is "sselog1" actually appropriate here? Extracts are special forms of
moves after all, not logical operations. Even "sseshuf1" would seem to
come closer.
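
The effect of the "isa" change described in the commit message can be
sketched with a toy model. This is not GCC code: the struct, enum, and
function names below are hypothetical, and picking the first enabled
alternative is a simplification of what the register allocator really
does. The model only encodes what the commit message states: a 256-bit
value can sit in ymm16..ymm31 only with AVX512VL, and each alternative is
gated by its "isa" attribute.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the enabled target features. */
struct target_flags {
    bool avx512f;
    bool avx512vl;
    bool avx512dq;
};

/* The "isa" values that appear in the two patterns ("*" is ISA_ANY). */
enum isa_req { ISA_ANY, ISA_AVX512F, ISA_AVX512VL, ISA_AVX512DQ };

static bool isa_enabled(enum isa_req req, struct target_flags t)
{
    switch (req) {
    case ISA_ANY:      return true;
    case ISA_AVX512F:  return t.avx512f;
    case ISA_AVX512VL: return t.avx512vl;
    case ISA_AVX512DQ: return t.avx512dq;
    }
    return false;
}

/* Return the index of the first enabled alternative that can take a
   source held in ymm16..ymm31, or -1 if there is none.  Alternative 0
   uses the "x" constraint and therefore never qualifies.  */
static int pick_alt_for_high_reg(const enum isa_req *alts, int n_alts,
                                 struct target_flags t)
{
    assert(n_alts >= 1 && alts[0] == ISA_ANY);
    if (!t.avx512vl)
        return -1;  /* the value cannot be in ymm16..ymm31 at all */
    for (int i = 1; i < n_alts; i++)
        if (isa_enabled(alts[i], t))
            return i;
    return -1;
}
```

With AVX512F and AVX512VL enabled but not AVX512DQ, the old list
{*, avx512dq, avx512f} resolves to index 2 (the %g1, 512-bit form) in
this model, while the new list {*, avx512vl} resolves to index 1.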
  

Comments

Hongtao Liu July 5, 2023, 8:40 a.m. UTC | #1
On Wed, Jul 5, 2023 at 4:00 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The middle alternative each was unusable without enabling AVX512DQ (in
> addition to AVX512VL), which is entirely unrelated here. The last
> alternative is usable with AVX512VL only (due to type restrictions on
> what may be put in the upper 16 YMM registers), and hence is pointlessly
> forcing 512-bit mode (without actually reflecting that in the "mode"
> attribute).
Ok.
>
> gcc/
>
>         * config/i386/sse.md (@vec_extract_hi_<mode>): Drop last
>         alternative. Switch new last alternative's "isa" attribute to
>         "avx512vl".
>         (vec_extract_hi_v32qi): Likewise.
> ---
> Like elsewhere I suspect "prefix_extra" is bogus here and should be
> dropped.
>
> Is "sselog1" actually appropriate here? Extracts are special forms of
> moves after all, not logical operations. Even "sseshuf1" would seem to
> come closer.
Honestly, I don't know why it's marked as sselog1, but looking at the
code, almost all vec_extract patterns are marked as sselog1; I guess
it originally came from pextr.
I agree that it should be closer to shuffle instructions.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -12029,9 +12029,9 @@
>    "operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);")
>
>  (define_insn "@vec_extract_hi_<mode>"
> -  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=xm,vm,vm")
> +  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=xm,vm")
>         (vec_select:<ssehalfvecmode>
> -         (match_operand:V16_256 1 "register_operand" "x,v,v")
> +         (match_operand:V16_256 1 "register_operand" "x,v")
>           (parallel [(const_int 8) (const_int 9)
>                      (const_int 10) (const_int 11)
>                      (const_int 12) (const_int 13)
> @@ -12039,13 +12039,12 @@
>    "TARGET_AVX"
>    "@
>     vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
> +   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}"
>    [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "isa" "*,avx512dq,avx512f")
> -   (set_attr "prefix" "vex,evex,evex")
> +   (set_attr "isa" "*,avx512vl")
> +   (set_attr "prefix" "vex,evex")
>     (set_attr "mode" "OI")])
>
>  (define_insn_and_split "vec_extract_lo_v64qi"
> @@ -12144,9 +12143,9 @@
>    "operands[1] = gen_lowpart (V16QImode, operands[1]);")
>
>  (define_insn "vec_extract_hi_v32qi"
> -  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm,vm")
> +  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm")
>         (vec_select:V16QI
> -         (match_operand:V32QI 1 "register_operand" "x,v,v")
> +         (match_operand:V32QI 1 "register_operand" "x,v")
>           (parallel [(const_int 16) (const_int 17)
>                      (const_int 18) (const_int 19)
>                      (const_int 20) (const_int 21)
> @@ -12158,13 +12157,12 @@
>    "TARGET_AVX"
>    "@
>     vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
> -   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
> +   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}"
>    [(set_attr "type" "sselog1")
>     (set_attr "prefix_extra" "1")
>     (set_attr "length_immediate" "1")
> -   (set_attr "isa" "*,avx512dq,avx512f")
> -   (set_attr "prefix" "vex,evex,evex")
> +   (set_attr "isa" "*,avx512vl")
> +   (set_attr "prefix" "vex,evex")
>     (set_attr "mode" "OI")])
>
>  ;; NB: *vec_extract<mode>_0 must be placed before *vec_extracthf.
>
  
Jan Beulich July 5, 2023, 8:54 a.m. UTC | #2
On 05.07.2023 10:40, Hongtao Liu wrote:
> On Wed, Jul 5, 2023 at 4:00 PM Jan Beulich via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> The middle alternative each was unusable without enabling AVX512DQ (in
>> addition to AVX512VL), which is entirely unrelated here. The last
>> alternative is usable with AVX512VL only (due to type restrictions on
>> what may be put in the upper 16 YMM registers), and hence is pointlessly
>> forcing 512-bit mode (without actually reflecting that in the "mode"
>> attribute).
> Ok.

Thanks.

>> ---
>> Like elsewhere I suspect "prefix_extra" is bogus here and should be
>> dropped.
>>
>> Is "sselog1" actually appropriate here? Extracts are special forms of
>> moves after all, not logical operations. Even "sseshuf1" would seem to
>> come closer.
> Honestly, I don't know why it's marked as sselog1, but looking at the
> code,  almost all vec_extract patterns are marked as sselog1, guess
> it's originally from pextr.
> Agree that it's should be more close to shuffle instructions.

Yet as said I think these are special forms of moves. To me "shuffle"
involves more than one element. Yet then I don't really know what
the "type" attributes are used for (other than vaguely "for
scheduling"), and hence whether treating extracts as shuffles would
be more appropriate. (IOW I'd be happy to make a patch to convert all
extracts, but I'd need to know whether the conversion should be to
"sseshuf", "sseshuf1", or "ssemov". In the former two cases knowing
the "Why?" would also help, especially for writing a sensible
description. I also haven't found any explanation of the
difference between sse<type> and sse<type>1: The "memory" attribute
evaluates to "both" for the 1 forms if operand 1 is in memory, yet
that doesn't seem to fit any of the uses here.)

Jan
  
Hongtao Liu July 5, 2023, 10:13 a.m. UTC | #3
On Wed, Jul 5, 2023 at 4:55 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 05.07.2023 10:40, Hongtao Liu wrote:
> > On Wed, Jul 5, 2023 at 4:00 PM Jan Beulich via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> The middle alternative each was unusable without enabling AVX512DQ (in
> >> addition to AVX512VL), which is entirely unrelated here. The last
> >> alternative is usable with AVX512VL only (due to type restrictions on
> >> what may be put in the upper 16 YMM registers), and hence is pointlessly
> >> forcing 512-bit mode (without actually reflecting that in the "mode"
> >> attribute).
> > Ok.
>
> Thanks.
>
> >> ---
> >> Like elsewhere I suspect "prefix_extra" is bogus here and should be
> >> dropped.
> >>
> >> Is "sselog1" actually appropriate here? Extracts are special forms of
> >> moves after all, not logical operations. Even "sseshuf1" would seem to
> >> come closer.
> > Honestly, I don't know why it's marked as sselog1, but looking at the
> > code,  almost all vec_extract patterns are marked as sselog1, guess
> > it's originally from pextr.
> > Agree that it's should be more close to shuffle instructions.
>
> Yet as said I think these are special forms of moves. To me "shuffle"
> involves more than one element. Yet then I don't really know what
I think if it only extracts from the low part, it's close to a move;
otherwise it's more like a shuffle (shuffling the specific elements to
the low part).
I guess one possible reason it's marked as sselog1 is that, from a port
usage perspective, it's closer to vector logic instructions?
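
The low-part versus high-part distinction can be made concrete with a
small model of vec_select. This is only an illustrative sketch: the
function names are hypothetical, 16-bit elements are an assumption
standing in for a 16-element 256-bit mode, and the selector is assumed
to be eight consecutive indices starting at 8 (the part of the selector
visible in the patch begins 8, 9, ..., 13).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of RTL vec_select: out[i] = in[sel[i]] for each i.  */
static void vec_select_model(const uint16_t *in, size_t n_in,
                             const size_t *sel, size_t n_out,
                             uint16_t *out)
{
    for (size_t i = 0; i < n_out; i++) {
        assert(sel[i] < n_in);  /* selector indices must be in range */
        out[i] = in[sel[i]];
    }
}

/* Return 1 if sel is exactly {base, base + 1, ..., base + n - 1}.
   base == 0 is the "low part" case, where no element changes position;
   any other base moves elements down to the low part.  */
static int is_contiguous_from(const size_t *sel, size_t n, size_t base)
{
    for (size_t i = 0; i < n; i++)
        if (sel[i] != base + i)
            return 0;
    return 1;
}
```

In this model a selector of 0..7 leaves every selected element at its
original index, which is why it behaves like a plain move of the low
half, whereas 8..15 relocates each element by eight positions.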
> the "type" attributes are used for (other than vaguely "for
> scheduling"), and hence whether treating extracts as shuffles would
AFAIK, it's only used for scheduling; I don't know whether there are
tools based on GCC's scheduling model.
> be more appropriate. (IOW I'd be happy to make a patch to convert all
> extracts, but I'd need to know whether the conversion should be to
> "sseshuf", "sseshuf1", or "ssemov". In the former two cases knowing
> the "Why?" would also help, especially for writing a sensible
> description. I also haven't found any explanation towards the
> difference between sse<type> and sse<type>1: The "memory" attribute
> evaluates to "both" for the 1 forms if operand 1 is in memory, yet
> that doesn't seem to fit any of the uses here.)
I think sse<type>1 has only one input operand, whereas sse<type> may
have two or more.
From the instruction perspective, they're the same type; sse<type>1 was
introduced to avoid a segmentation fault in the "memory" attribute
definition, which checks operands[2] or operands[3].
(Similarly for the default settings of other attributes.)
>
> Jan




--
BR,
Hongtao
  

Patch

--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12029,9 +12029,9 @@ 
   "operands[1] = gen_lowpart (<ssehalfvecmode>mode, operands[1]);")
 
 (define_insn "@vec_extract_hi_<mode>"
-  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=xm,vm,vm")
+  [(set (match_operand:<ssehalfvecmode> 0 "nonimmediate_operand" "=xm,vm")
 	(vec_select:<ssehalfvecmode>
-	  (match_operand:V16_256 1 "register_operand" "x,v,v")
+	  (match_operand:V16_256 1 "register_operand" "x,v")
 	  (parallel [(const_int 8) (const_int 9)
 		     (const_int 10) (const_int 11)
 		     (const_int 12) (const_int 13)
@@ -12039,13 +12039,12 @@ 
   "TARGET_AVX"
   "@
    vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
-   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
-   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
+   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "isa" "*,avx512dq,avx512f")
-   (set_attr "prefix" "vex,evex,evex")
+   (set_attr "isa" "*,avx512vl")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 (define_insn_and_split "vec_extract_lo_v64qi"
@@ -12144,9 +12143,9 @@ 
   "operands[1] = gen_lowpart (V16QImode, operands[1]);")
 
 (define_insn "vec_extract_hi_v32qi"
-  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm,vm")
+  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm")
 	(vec_select:V16QI
-	  (match_operand:V32QI 1 "register_operand" "x,v,v")
+	  (match_operand:V32QI 1 "register_operand" "x,v")
 	  (parallel [(const_int 16) (const_int 17)
 		     (const_int 18) (const_int 19)
 		     (const_int 20) (const_int 21)
@@ -12158,13 +12157,12 @@ 
   "TARGET_AVX"
   "@
    vextract%~128\t{$0x1, %1, %0|%0, %1, 0x1}
-   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}
-   vextracti32x4\t{$0x1, %g1, %0|%0, %g1, 0x1}"
+   vextracti32x4\t{$0x1, %1, %0|%0, %1, 0x1}"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "isa" "*,avx512dq,avx512f")
-   (set_attr "prefix" "vex,evex,evex")
+   (set_attr "isa" "*,avx512vl")
+   (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
 ;; NB: *vec_extract<mode>_0 must be placed before *vec_extracthf.