[v2,0/8] Support Intel APX EGPR

Message ID	20231102112911.2372810-1-lili.cui@intel.com
Headers	Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 278D13858D32 From: "Cui, Lili" <lili.cui@intel.com> To: binutils@sourceware.org Cc: jbeulich@suse.com, hongjiu.lu@intel.com, ccoutant@gmail.com Subject: [PATCH v2 0/8] Support Intel APX EGPR Date: Thu, 2 Nov 2023 11:29:03 +0000 Message-Id: <20231102112911.2372810-1-lili.cui@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org X-getmail-retrieved-from-mailbox: INBOX
Series	Support Intel APX EGPR \| [v2,0/8] Support Intel APX EGPR [1/8] Support APX GPR32 with rex2 prefix [2/8] Created an empty EVEX_MAP4_ sub-table for EVEX instructions. [3/8] Support APX GPR32 with extend evex prefix [4/8] Add tests for APX GPR32 with extend evex prefix [5/8] Support APX NDD [6/8] Support APX Push2/Pop2 [7/8] Support APX NDD optimized encoding. [8/8] Support APX JMPABS

Message

Cui, Lili Nov. 2, 2023, 11:29 a.m. UTC

This is V2 of the full APX patch series.
1. Merged patch part II 1/6 into patch 1/8.
2. Created a new patch for the empty EVEX_MAP4_ sub-table.
3. The NF patch needs to be suspended; where NF should be placed is still under discussion. Since patch part II 2/6 depends on the NF patch, it is also suspended.
4. There are no comments yet on the APX linker patch.


Cui, Lili (4):
  Support APX GPR32 with rex2 prefix
  Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
  Support APX GPR32 with extend evex prefix
  Add tests for APX GPR32 with extend evex prefix

Hu, Lin1 (2):
  Support APX NDD optimized encoding.
  Support APX JMPABS

Mo, Zewei (1):
  Support APX Push2/Pop2

konglin1 (1):
  Support APX NDD

 gas/config/tc-i386.c                          |  480 +++++-
 gas/doc/c-i386.texi                           |    3 +-
 gas/testsuite/gas/i386/apx-jmpabs-inval.l     |    3 +
 gas/testsuite/gas/i386/apx-jmpabs-inval.s     |    6 +
 gas/testsuite/gas/i386/apx-push2pop2-inval.l  |    5 +
 gas/testsuite/gas/i386/apx-push2pop2-inval.s  |    9 +
 gas/testsuite/gas/i386/i386.exp               |    2 +
 .../i386/ilp32/x86-64-opcode-inval-intel.d    |    4 +-
 .../gas/i386/ilp32/x86-64-opcode-inval.d      |    4 +-
 .../gas/i386/x86-64-apx-egpr-inval.l          |  203 +++
 .../gas/i386/x86-64-apx-egpr-inval.s          |  210 +++
 .../gas/i386/x86-64-apx-egpr-promote-inval.l  |   16 +
 .../gas/i386/x86-64-apx-egpr-promote-inval.s  |   17 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |   20 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |   21 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |   37 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |   38 +
 .../gas/i386/x86-64-apx-evex-promoted-intel.d |  326 +++++
 .../gas/i386/x86-64-apx-evex-promoted.d       |  326 +++++
 .../gas/i386/x86-64-apx-evex-promoted.s       |  322 ++++
 .../gas/i386/x86-64-apx-jmpabs-intel.d        |   14 +
 .../gas/i386/x86-64-apx-jmpabs-inval.d        |   55 +
 .../gas/i386/x86-64-apx-jmpabs-inval.s        |   17 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    |   14 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |   10 +
 .../gas/i386/x86-64-apx-ndd-optimize.d        |  124 ++
 .../gas/i386/x86-64-apx-ndd-optimize.s        |  117 ++
 gas/testsuite/gas/i386/x86-64-apx-ndd.d       |  161 ++
 gas/testsuite/gas/i386/x86-64-apx-ndd.s       |  154 ++
 .../gas/i386/x86-64-apx-push2pop2-intel.d     |   42 +
 .../gas/i386/x86-64-apx-push2pop2-inval.l     |   11 +
 .../gas/i386/x86-64-apx-push2pop2-inval.s     |   15 +
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d |   42 +
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s |   39 +
 gas/testsuite/gas/i386/x86-64-apx-rex2.d      |   83 ++
 gas/testsuite/gas/i386/x86-64-apx-rex2.s      |   86 ++
 gas/testsuite/gas/i386/x86-64-evex.d          |    2 +-
 gas/testsuite/gas/i386/x86-64-inval-movbe.l   |   31 +-
 gas/testsuite/gas/i386/x86-64-inval-movbe.s   |    1 +
 gas/testsuite/gas/i386/x86-64-inval-pseudo.l  |    6 +
 gas/testsuite/gas/i386/x86-64-inval-pseudo.s  |    4 +
 .../gas/i386/x86-64-opcode-inval-intel.d      |    4 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.d  |    4 +-
 gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |   42 +
 gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |   49 +
 gas/testsuite/gas/i386/x86-64-pseudos.d       |   63 +
 gas/testsuite/gas/i386/x86-64-pseudos.s       |   65 +
 gas/testsuite/gas/i386/x86-64.exp             |   15 +
 include/opcode/i386.h                         |    2 +
 opcodes/i386-dis-evex-len.h                   |   10 +
 opcodes/i386-dis-evex-mod.h                   |   52 +
 opcodes/i386-dis-evex-prefix.h                |   73 +
 opcodes/i386-dis-evex-reg.h                   |   77 +
 opcodes/i386-dis-evex-w.h                     |   10 +
 opcodes/i386-dis-evex-x86-64.h                |  140 ++
 opcodes/i386-dis-evex.h                       |  347 ++++-
 opcodes/i386-dis.c                            |  574 ++++++--
 opcodes/i386-gen.c                            |   52 +-
 opcodes/i386-opc.h                            |   27 +-
 opcodes/i386-opc.tbl                          | 1291 ++++++++++-------
 opcodes/i386-reg.tbl                          |   64 +
 61 files changed, 5217 insertions(+), 824 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/apx-jmpabs-inval.l
 create mode 100644 gas/testsuite/gas/i386/apx-jmpabs-inval.s
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
 create mode 100644 opcodes/i386-dis-evex-x86-64.h

Comments

Jan Beulich Nov. 2, 2023, 1:22 p.m. UTC | #1

On 02.11.2023 12:29, Cui, Lili wrote:
> This is V2 of all APX patch.
> 1. Merged patch part II 1/6 into patch 1/8.
> 2. Created a new patch for empty EVEX_MAP4_ sub-table.
> 3. The NF patch needs to be suspended, Where NF should be placed is under discussion. Since the patch part II 2/6 depends on the NF patch, it is also suspended.
> 4. There are no comments yet for APX linker patch.
> 
> 
> Cui, Lili (4):
>   Support APX GPR32 with rex2 prefix
>   Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
>   Support APX GPR32 with extend evex prefix
>   Add tests for APX GPR32 with extend evex prefix
> 
> Hu, Lin1 (2):
>   Support APX NDD optimized encoding.
>   Support APX JMPABS
> 
> Mo, Zewei (1):
>   Support APX Push2/Pop2
> 
> konglin1 (1):
>   Support APX NDD

Mind me asking whether this work is now based on my "x86: split insn
templates' CPU field"? You don't say so here, so my initial assumption
would be that it isn't. That's also supported by me peeking at patch 3.
Yet that patch was specifically created as a prereq for the APX work to
base on top (and it may require further refinement, the need for which
I could only know once you're actually using that patch as a prereq).

Jan

Cui, Lili Nov. 3, 2023, 4:42 p.m. UTC | #2

> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
> 
> On 02.11.2023 12:29, Cui, Lili wrote:
> > This is V2 of all APX patch.
> > 1. Merged patch part II 1/6 into patch 1/8.
> > 2. Created a new patch for empty EVEX_MAP4_ sub-table.
> > 3. The NF patch needs to be suspended, Where NF should be placed is
> under discussion. Since the patch part II 2/6 depends on the NF patch, it is
> also suspended.
> > 4. There are no comments yet for APX linker patch.
> >
> >
> > Cui, Lili (4):
> >   Support APX GPR32 with rex2 prefix
> >   Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
> >   Support APX GPR32 with extend evex prefix
> >   Add tests for APX GPR32 with extend evex prefix
> >
> > Hu, Lin1 (2):
> >   Support APX NDD optimized encoding.
> >   Support APX JMPABS
> >
> > Mo, Zewei (1):
> >   Support APX Push2/Pop2
> >
> > konglin1 (1):
> >   Support APX NDD
> 
> Mind me asking whether this work is now based on my "x86: split insn
> templates' CPU field"? You don't say so here, so my initial assumption would
> be that it isn't. That's also supported by me peeking at patch 3.
> Yet that patch was specifically created as a prereq for the APX work to base on
> top (and it may require further refinement, the need for which I could only
> know once you're actually using that patch as a prereq).
> 

Sorry for missing this patch. I rebased patch 3 on it, and it works without my old code. I will send out a new patch 3.

+//       else if (x.bitfield.cpuapx_f)
+//         {
+//           /* All cpu in x need to be enabled in cpu_arch_flags.  */
+//           if (cpu_flags_not_or_check (&x, &cpu_arch_flags))
+//             match |= CPU_FLAGS_ARCH_MATCH;
+//         }


AMX works with the following changes.
--------------------------------------------------------
opcodes/i386-opc.tbl:

#define APX_F_64 APX_F&x64
ldtilecfg, 0x49/0, AMX_TILE&x64&(AMX_TILE|APX_F), Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }

gas/config/tc-i386.c:

   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
   {
-      if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
-          || maybe_cpu (t, CpuFMA))
-         && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
+    if ((maybe_cpu (t, CpuAVX) || maybe_cpu (t, CpuAVX2)
+        || maybe_cpu (t, CpuFMA) ||  maybe_cpu (t, CpuAMX_TILE))
+       && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)
+           || maybe_cpu (t, CpuAPX_F)))
        {
          if (need_evex_encoding ())
            {
@@ -3725,7 +3726,7 @@ install_template (const insn_template *t)
                i.tm.cpu.bitfield.cpuavx = 1;
              else
                {
-                 gas_assert (!i.tm.cpu.bitfield.isa);
+//               gas_assert (!i.tm.cpu.bitfield.isa);
                  i.tm.cpu.bitfield.isa = i.tm.cpu_any.bitfield.isa;
                }
            }
-----------------------------------------------------------------

But if we want to merge bextr's vex and evex formats, we need to support BMI&(BMI |( APX_F&x64))
....
bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
bextr, 0xf7, BMI&APX_F_64, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_b
...

Thanks,
Lili.
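
For readers unfamiliar with the CPU expressions in the table lines above, the following is a minimal sketch of how a requirement such as `AMX_TILE&x64&(AMX_TILE|APX_F)` could be checked. It assumes a simplified model with one "all of" set and one "any of" set per template. The names `cpu_req`, `cpu_req_match` and the `FEAT_*` bits are hypothetical and are not the assembler's real data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits; the real assembler uses generated bitfields.  */
enum {
  FEAT_X64      = 1u << 0,
  FEAT_AMX_TILE = 1u << 1,
  FEAT_APX_F    = 1u << 2,
};

/* A template's CPU requirement split into two sets: every bit in `all`
   must be enabled, and at least one bit in `any` must be enabled.
   An empty `any` set imposes no constraint.  */
struct cpu_req {
  uint32_t all;
  uint32_t any;
};

/* Return true if the enabled feature set satisfies the requirement.  */
static bool
cpu_req_match (const struct cpu_req *req, uint32_t enabled)
{
  if ((req->all & enabled) != req->all)
    return false;
  return req->any == 0 || (req->any & enabled) != 0;
}

/* AMX_TILE&x64&(AMX_TILE|APX_F), as written on the ldtilecfg line.  */
static const struct cpu_req ldtilecfg_req = {
  .all = FEAT_AMX_TILE | FEAT_X64,
  .any = FEAT_AMX_TILE | FEAT_APX_F,
};
```

In this flat model the nested form `BMI&(BMI|(APX_F&x64))` has no direct representation, because x64 would have to apply to only one of the two alternatives.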

Jan Beulich Nov. 6, 2023, 7:30 a.m. UTC | #3

On 03.11.2023 17:42, Cui, Lili wrote:
> But if we want to merge bextr's vex and evex formats, we need to support BMI&(BMI |( APX_F&x64))

Maybe more like BMI&(<tbd>|APX_F), with further work (which I was considering
anyway) towards x64 becoming a prereq to the increasing number of 64-bit-only
features? (The <tbd> may well be BMI as you suggest, even if that reads
a little odd.)

Jan

Cui, Lili Nov. 6, 2023, 2:20 p.m. UTC | #4

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, November 6, 2023 3:30 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org
> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
> 
> On 03.11.2023 17:42, Cui, Lili wrote:
> > But if we want to merge bextr's vex and evex formats, we need to
> > support BMI&(BMI |( APX_F&x64))
> 
> Maybe more like BMI&(<tbd>|APX_F), with further work (which I was
> considering
> anyway) towards x64 becoming a prereq to the increasing number of 64-bit-
> only features? (The <tbd> may well be BMI as you suggest, even if that reads a
> little odd.
> 

Yes, most VEX instructions don't require x64, but APX_F is x64-based. If the format "BMI&(BMI|(APX_F&x64))" is complicated to implement or looks ugly, maybe we can handle x64 uniformly for APX_F in tc-i386.c.

Lili.

Jan Beulich Nov. 6, 2023, 2:44 p.m. UTC | #5

On 06.11.2023 15:20, Cui, Lili wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Monday, November 6, 2023 3:30 PM
>>
>> On 03.11.2023 17:42, Cui, Lili wrote:
>>> But if we want to merge bextr's vex and evex formats, we need to
>>> support BMI&(BMI |( APX_F&x64))
>>
>> Maybe more like BMI&(<tbd>|APX_F), with further work (which I was
>> considering
>> anyway) towards x64 becoming a prereq to the increasing number of 64-bit-
>> only features? (The <tbd> may well be BMI as you suggest, even if that reads a
>> little odd.
>>
> 
> Yes, most VEX instructions don't require x64, but apx_f is x64 based. If the format "BMI&(BMI |( APX_F&x64))"  is complicated to implement or looks ugly, maybe we can handle x64 uniformly for apx_f in tc-i386.c.

Well, some adjustment is needed there anyway, at the very least for the
equivalent of e.g. the present handling of AVX|AVX512F or FMA|AVX512F.
The goal should be to balance the amount of special casing code against
complications in representing data in the opcode table. One question I
have is: To what extent is it necessary to actually represent APX_F in the
BMI templates? There are two things triggering use of the EVEX encoding,
iirc: Use of an extended register or NF. Use of an extended register is
itself already dependent upon APX_F, and whatever the representation of
NF is going to be, its parsing could be made dependent upon APX_F, too.
No (strong) need then for the template to enforce APX_F yet another time,
hopefully.

Jan
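
The argument above, that the parser can enforce APX_F so the template need not, might be sketched as follows. This is only an illustration under stated assumptions: `asm_state`, `check_gpr` and `check_nf` are hypothetical names, and NF is modelled as a plain boolean because its source-level representation was still undecided at this point in the thread.

```c
#include <stdbool.h>

/* Hypothetical assembler state: whether APX_F is currently enabled.  */
struct asm_state {
  bool apx_f_enabled;
};

enum parse_result {
  PARSE_OK,
  PARSE_ERR_NEEDS_APX_F,
  PARSE_ERR_BAD_REG
};

/* Accept GPR numbers 0..31.  r16..r31 are the APX extended GPRs and are
   refused unless APX_F is enabled.  */
static enum parse_result
check_gpr (const struct asm_state *st, unsigned int regno)
{
  if (regno > 31)
    return PARSE_ERR_BAD_REG;
  if (regno >= 16 && !st->apx_f_enabled)
    return PARSE_ERR_NEEDS_APX_F;
  return PARSE_OK;
}

/* Accept an NF request only when APX_F is enabled.  */
static enum parse_result
check_nf (const struct asm_state *st, bool nf_requested)
{
  if (nf_requested && !st->apx_f_enabled)
    return PARSE_ERR_NEEDS_APX_F;
  return PARSE_OK;
}
```

With both checks done at parse time, any instruction that reaches template matching with an extended register or NF already implies APX_F, so a template would not have to test it again.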

Cui, Lili Nov. 6, 2023, 4:03 p.m. UTC | #6

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, November 6, 2023 10:45 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org
> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
> 
> On 06.11.2023 15:20, Cui, Lili wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Monday, November 6, 2023 3:30 PM
> >>
> >> On 03.11.2023 17:42, Cui, Lili wrote:
> >>> But if we want to merge bextr's vex and evex formats, we need to
> >>> support BMI&(BMI |( APX_F&x64))
> >>
> >> Maybe more like BMI&(<tbd>|APX_F), with further work (which I was
> >> considering
> >> anyway) towards x64 becoming a prereq to the increasing number of
> >> 64-bit- only features? (The <tbd> may well be BMI as you suggest,
> >> even if that reads a little odd.
> >>
> >
> > Yes, most VEX instructions don't require x64, but apx_f is x64 based. If the
> format "BMI&(BMI |( APX_F&x64))"  is complicated to implement or looks ugly,
> maybe we can handle x64 uniformly for apx_f in tc-i386.c.
> 
> Well, some adjustment is needed there anyway, at the very least for the
> equivalent of e.g. the present handling of AVX|AVX512F or FMA|AVX512F.
> The goal wants to be to balance the amount of special casing code against
> complications in representing data in the opcode table. One question I have is:
> In how far is it necessary to actually represent APX_F in the BMI templates?
> There are two things triggering use of the EVEX encoding,
> iirc: Use of an extended register or NF. Use of an extended register is itself
> already dependent upon APX_F, and whatever the representation of NF is
> going to be, its parsing could be made dependent upon APX_F, too.
> No (strong) need then for the template to enforce APX_F yet another time,
> hopefully.
> 

In [patch 2/8] Support APX GPR32 with extend evex prefix, I only merged AMX's VEX and EVEX formats (both require x64). Because of the x64 requirement, BMI and the other VEX instructions listed in 3.1.5 are not merged yet.
NDD also triggers EVEX encoding. Some VEX instructions support NF; their VEX and EVEX formats cannot be merged.

Lili.

Jan Beulich Nov. 6, 2023, 4:10 p.m. UTC | #7

On 06.11.2023 17:03, Cui, Lili wrote:
> 
> 
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Monday, November 6, 2023 10:45 PM
>> To: Cui, Lili <lili.cui@intel.com>
>> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
>> binutils@sourceware.org
>> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
>>
>> On 06.11.2023 15:20, Cui, Lili wrote:
>>>> -----Original Message-----
>>>> From: Jan Beulich <jbeulich@suse.com>
>>>> Sent: Monday, November 6, 2023 3:30 PM
>>>>
>>>> On 03.11.2023 17:42, Cui, Lili wrote:
>>>>> But if we want to merge bextr's vex and evex formats, we need to
>>>>> support BMI&(BMI |( APX_F&x64))
>>>>
>>>> Maybe more like BMI&(<tbd>|APX_F), with further work (which I was
>>>> considering
>>>> anyway) towards x64 becoming a prereq to the increasing number of
>>>> 64-bit- only features? (The <tbd> may well be BMI as you suggest,
>>>> even if that reads a little odd.
>>>>
>>>
>>> Yes, most VEX instructions don't require x64, but apx_f is x64 based. If the
>> format "BMI&(BMI |( APX_F&x64))"  is complicated to implement or looks ugly,
>> maybe we can handle x64 uniformly for apx_f in tc-i386.c.
>>
>> Well, some adjustment is needed there anyway, at the very least for the
>> equivalent of e.g. the present handling of AVX|AVX512F or FMA|AVX512F.
>> The goal wants to be to balance the amount of special casing code against
>> complications in representing data in the opcode table. One question I have is:
>> In how far is it necessary to actually represent APX_F in the BMI templates?
>> There are two things triggering use of the EVEX encoding,
>> iirc: Use of an extended register or NF. Use of an extended register is itself
>> already dependent upon APX_F, and whatever the representation of NF is
>> going to be, its parsing could be made dependent upon APX_F, too.
>> No (strong) need then for the template to enforce APX_F yet another time,
>> hopefully.
>>
> 
> In [patch 2/8] Support APX GPR32 with extend evex prefix,

Yet again patch 2/8, but this time the title reference at least clarifies
you mean 3/8.

> I only merged AMX's vex and evex formats (both vex and evex require x64), due to x64 reasons, BMI and other VEX instructions listed in 3.1.5 are not merged yet.

I see. Looking at patch 3 is next.

> NDD also triggers EVEX encoding.

But not for insns which were previously VEX-encoded?

> There are some VEX instructions that support NF, their vex and evex cannot be merged.

Why not?

Jan

Cui, Lili Nov. 7, 2023, 1:53 a.m. UTC | #8

> >> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
> >>
> >> On 06.11.2023 15:20, Cui, Lili wrote:
> >>>> -----Original Message-----
> >>>> From: Jan Beulich <jbeulich@suse.com>
> >>>> Sent: Monday, November 6, 2023 3:30 PM
> >>>>
> >>>> On 03.11.2023 17:42, Cui, Lili wrote:
> >>>>> But if we want to merge bextr's vex and evex formats, we need to
> >>>>> support BMI&(BMI |( APX_F&x64))
> >>>>
> >>>> Maybe more like BMI&(<tbd>|APX_F), with further work (which I was
> >>>> considering
> >>>> anyway) towards x64 becoming a prereq to the increasing number of
> >>>> 64-bit- only features? (The <tbd> may well be BMI as you suggest,
> >>>> even if that reads a little odd.
> >>>>
> >>>
> >>> Yes, most VEX instructions don't require x64, but apx_f is x64
> >>> based. If the
> >> format "BMI&(BMI |( APX_F&x64))"  is complicated to implement or
> >> looks ugly, maybe we can handle x64 uniformly for apx_f in tc-i386.c.
> >>
> >> Well, some adjustment is needed there anyway, at the very least for
> >> the equivalent of e.g. the present handling of AVX|AVX512F or
> FMA|AVX512F.
> >> The goal wants to be to balance the amount of special casing code
> >> against complications in representing data in the opcode table. One
> question I have is:
> >> In how far is it necessary to actually represent APX_F in the BMI templates?
> >> There are two things triggering use of the EVEX encoding,
> >> iirc: Use of an extended register or NF. Use of an extended register
> >> is itself already dependent upon APX_F, and whatever the
> >> representation of NF is going to be, its parsing could be made dependent
> upon APX_F, too.
> >> No (strong) need then for the template to enforce APX_F yet another
> >> time, hopefully.
> >>
> >
> > In [patch 2/8] Support APX GPR32 with extend evex prefix,
> 
> Yet again patch 2/8, but this time the title reference at least clarifies you mean
> 3/8.
> 

I was all mixed up last night.

> > I only merged AMX's vex and evex formats (both vex and evex require x64),
> due to x64 reasons, BMI and other VEX instructions listed in 3.1.5 are not
> merged yet.
> 
> I see. Looking at patch 3 is next.
> 
> > NDD also triggers EVEX encoding.
> 
> But not for insns which were previously VEX-encoded?
> 
EVEX instructions promoted from VEX have no ND bit. You are right, there are only two factors: extended registers and NF.

> > There are some VEX instructions that support NF, their vex and evex cannot
> be merged.
> 
> Why not?
> 
Take bextr, for example: the new EVEX template has the NF tag, so I think we cannot merge them.

bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
bextr, 0xf7, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }

Lili.

Jan Beulich Nov. 7, 2023, 10:11 a.m. UTC | #9

On 07.11.2023 02:53, Cui, Lili wrote:
>>>> Subject: Re: [PATCH v2 0/8] Support Intel APX EGPR
>>>>
>>>> On 06.11.2023 15:20, Cui, Lili wrote:
>>> There are some VEX instructions that support NF, their vex and evex cannot
>> be merged.
>>
>> Why not?
>>
> Like bextr, new EVEX table has NF tag, I think we cannot merge them.
> 
> bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
> bextr, 0xf7, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }

NF is no different than, say, SAE (which didn't get in the way of folding VEX
and EVEX templates). It's a(nother) reliable indication that EVEX encoding is
going to be needed.

Jan
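
To illustrate the point about folding, here is a rough sketch of how a single folded template could serve both encodings, with NF treated as one more indication that EVEX is needed. The types `folded_template` and `insn_request` and the function `select_encoding` are hypothetical and do not reflect the actual gas implementation.

```c
#include <stdbool.h>

enum encoding { ENC_INVALID, ENC_VEX, ENC_EVEX };

/* Hypothetical folded template: one entry describing both encodings.  */
struct folded_template {
  bool has_vex;
  bool has_evex;
  bool allows_nf;   /* NF only exists in the EVEX form.  */
};

/* What the parsed instruction asked for.  */
struct insn_request {
  bool uses_egpr;   /* Some operand is in r16..r31.  */
  bool wants_nf;
};

/* Pick an encoding for the request, or ENC_INVALID if the template
   cannot express it.  An extended register or NF forces EVEX;
   otherwise VEX is preferred when the template has it.  */
static enum encoding
select_encoding (const struct folded_template *t,
                 const struct insn_request *r)
{
  if (r->wants_nf && !t->allows_nf)
    return ENC_INVALID;
  if (r->uses_egpr || r->wants_nf)
    return t->has_evex ? ENC_EVEX : ENC_INVALID;
  if (t->has_vex)
    return ENC_VEX;
  return t->has_evex ? ENC_EVEX : ENC_INVALID;
}
```

Under this model the two bextr lines quoted above would collapse into one entry with `has_vex`, `has_evex` and `allows_nf` all set.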