[v4,0/9] Support Intel APX EGPR

Message ID	20231219121218.974012-1-lili.cui@intel.com
Headers	Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BACB73858417 From: "Cui, Lili" <lili.cui@intel.com> To: binutils@sourceware.org Cc: hongjiu.lu@intel.com, jbeulich@suse.com Subject: [PATCH v4 0/9] Support Intel APX EGPR Date: Tue, 19 Dec 2023 12:12:09 +0000 Message-Id: <20231219121218.974012-1-lili.cui@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org X-getmail-retrieved-from-mailbox: INBOX
Series	Support Intel APX EGPR \| [v4,0/9] Support Intel APX EGPR [v4,1/9] Support APX GPR32 with rex2 prefix [v4,2/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions. [v4,3/9] Support APX GPR32 with extend evex prefix [v4,4/9] Add tests for APX GPR32 with extend evex prefix [v4,5/9] Support APX NDD [v4,6/9] Support APX Push2/Pop2 [v4,7/9] Support APX PUSHP/POPP [v4,`8/9] Support APX NDD optimized encoding. [v4,9/9] Support APX JMPABS for disassembler

Message ID

20231219121218.974012-1-lili.cui@intel.com

Headers

Received-SPF: pass (google.com: domain of
 binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 client-ip=2620:52:3:1:0:246e:9693:128c;
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BACB73858417
From: "Cui, Lili" <lili.cui@intel.com>
To: binutils@sourceware.org
Cc: hongjiu.lu@intel.com,
	jbeulich@suse.com
Subject: [PATCH v4 0/9] Support Intel APX EGPR
Date: Tue, 19 Dec 2023 12:12:09 +0000
Message-Id: <20231219121218.974012-1-lili.cui@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org

Series

Support Intel APX EGPR |

Message

Cui, Lili Dec. 19, 2023, 12:12 p.m. UTC

  *** BLURB HERE ***
Future optimizations to be made.
1. The current implementation of vexvvvvv needs to be optimized.
2. The handling of double VEX/EVEX templates in check_register() needs to be optimized.
3. Convert vround* with egpr to VRNDSCALE* instead of reporting an error.
4. Find a suitable variable to replace OperandConstraint=REX2_REQUIRED.

Cui, Lili (5):
  Support APX GPR32 with rex2 prefix
  Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
  Support APX GPR32 with extend evex prefix
  Add tests for APX GPR32 with extend evex prefix
  Support APX PUSHP/POPP

Hu, Lin1 (2):
  Support APX NDD optimized encoding.
  Support APX JMPABS for disassembler

Mo, Zewei (1):
  Support APX Push2/Pop2

konglin1 (1):
  Support APX NDD

 gas/config/tc-i386.c                          | 466 +++++++++++-
 gas/doc/c-i386.texi                           |   7 +-
 gas/testsuite/gas/i386/apx-push2pop2-inval.l  |   5 +
 gas/testsuite/gas/i386/apx-push2pop2-inval.s  |   9 +
 gas/testsuite/gas/i386/i386.exp               |   1 +
 .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +-
 .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +-
 .../gas/i386/x86-64-apx-egpr-inval.l          | 202 +++++
 .../gas/i386/x86-64-apx-egpr-inval.s          | 209 +++++
 .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 +
 .../gas/i386/x86-64-apx-egpr-promote-inval.s  |  29 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |  20 +
 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  41 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  43 ++
 .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 ++++++++
 .../gas/i386/x86-64-apx-evex-promoted.d       | 318 ++++++++
 .../gas/i386/x86-64-apx-evex-promoted.s       | 314 ++++++++
 .../gas/i386/x86-64-apx-jmpabs-intel.d        |  12 +
 .../gas/i386/x86-64-apx-jmpabs-inval.d        |  40 +
 .../gas/i386/x86-64-apx-jmpabs-inval.s        |  15 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    |  12 +
 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |   5 +
 .../gas/i386/x86-64-apx-ndd-optimize.d        | 132 ++++
 .../gas/i386/x86-64-apx-ndd-optimize.s        | 125 +++
 gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 ++++
 gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 ++++
 .../gas/i386/x86-64-apx-push2pop2-intel.d     |  42 +
 .../gas/i386/x86-64-apx-push2pop2-inval.l     |  13 +
 .../gas/i386/x86-64-apx-push2pop2-inval.s     |  17 +
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d |  42 +
 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s |  39 +
 .../gas/i386/x86-64-apx-pushp-popp-intel.d    |  14 +
 .../gas/i386/x86-64-apx-pushp-popp-inval.l    |   5 +
 .../gas/i386/x86-64-apx-pushp-popp-inval.s    |   7 +
 .../gas/i386/x86-64-apx-pushp-popp.d          |  14 +
 .../gas/i386/x86-64-apx-pushp-popp.s          |   8 +
 gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 ++
 gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  86 +++
 gas/testsuite/gas/i386/x86-64-evex.d          |   2 +-
 gas/testsuite/gas/i386/x86-64-inval-pseudo.l  |   6 +
 gas/testsuite/gas/i386/x86-64-inval-pseudo.s  |   4 +
 .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
 gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
 gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  75 +-
 gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  74 ++
 gas/testsuite/gas/i386/x86-64-pseudos.d       |  63 ++
 gas/testsuite/gas/i386/x86-64-pseudos.s       |  64 ++
 gas/testsuite/gas/i386/x86-64.exp             |  20 +-
 include/opcode/i386.h                         |   2 +
 opcodes/i386-dis-evex-len.h                   |  10 +
 opcodes/i386-dis-evex-prefix.h                |  66 ++
 opcodes/i386-dis-evex-reg.h                   |  70 ++
 opcodes/i386-dis-evex-w.h                     |  10 +
 opcodes/i386-dis-evex-x86-64.h                |  50 ++
 opcodes/i386-dis-evex.h                       | 347 ++++++++-
 opcodes/i386-dis.c                            | 715 +++++++++++++-----
 opcodes/i386-gen.c                            |  52 +-
 opcodes/i386-opc.h                            |  27 +-
 opcodes/i386-opc.tbl                          | 223 ++++--
 opcodes/i386-reg.tbl                          |  64 ++
 62 files changed, 4688 insertions(+), 455 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-pushp-popp.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
 create mode 100644 opcodes/i386-dis-evex-x86-64.h

Comments

Jan Beulich Dec. 19, 2023, 12:35 p.m. UTC | #1

On 19.12.2023 13:12, Cui, Lili wrote:
> *** BLURB HERE ***
> Future optimizations to be made.
> 1. The current implementation of vexvvvvv needs to be optimized.
> 2. The handling of double VEX/EVEX templates in check_register() needs to be optimized.

I hope this is just stale here, and the dependency on templates was now
removed again from check_register().

Jan

> 3. Convert vround* with egpr to VRNDSCALE* instead of reporting an error.
> 4. Find a suitable variable to replace OperandConstraint=REX2_REQUIRED.
> 
> Cui, Lili (5):
>   Support APX GPR32 with rex2 prefix
>   Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
>   Support APX GPR32 with extend evex prefix
>   Add tests for APX GPR32 with extend evex prefix
>   Support APX PUSHP/POPP
> 
> Hu, Lin1 (2):
>   Support APX NDD optimized encoding.
>   Support APX JMPABS for disassembler
> 
> Mo, Zewei (1):
>   Support APX Push2/Pop2
> 
> konglin1 (1):
>   Support APX NDD

Cui, Lili Dec. 20, 2023, 8:50 a.m. UTC | #2

> On 19.12.2023 13:12, Cui, Lili wrote:
> > *** BLURB HERE ***
> > Future optimizations to be made.
> > 1. The current implementation of vexvvvvv needs to be optimized.
> > 2. The handling of double VEX/EVEX templates in check_register() needs to
> be optimized.
> 
> I hope this is just stale here, and the dependency on templates was now
> removed again from check_register().
> 
> Jan
> 

In fact, I didn't remove it in V4, I didn't find a better place to deal with it. I don't know if you agree with this implementation below.

/* For dual VEX/EVEX templates, evex encoding is required when the input has
   egpr.*/
static INLINE void
vex_with_Egpr_requires_evex_encoding (const insn_template *t)
{
  for (unsigned int op = 0; op < i.operands; op++)
    {
      if (i.types[op].bitfield.class != Reg)
        continue;

      if (i.op[op].regs->reg_flags & RegRex2)
        i.vec_encoding = vex_encoding_evex;
    }

  if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
      || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
    i.vec_encoding = vex_encoding_evex;
}

static INLINE void
install_template (const insn_template *t)
{
  unsigned int l;

  i.tm = *t;

  /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
  if (t->opcode_modifier.vex && t->opcode_modifier.evex)
    {
      vex_with_Egpr_requires_evex_encoding (t);


Regards,
Lili.

Jan Beulich Dec. 20, 2023, 8:57 a.m. UTC | #3

On 20.12.2023 09:50, Cui, Lili wrote:
>> On 19.12.2023 13:12, Cui, Lili wrote:
>>> *** BLURB HERE ***
>>> Future optimizations to be made.
>>> 1. The current implementation of vexvvvvv needs to be optimized.
>>> 2. The handling of double VEX/EVEX templates in check_register() needs to
>> be optimized.
>>
>> I hope this is just stale here, and the dependency on templates was now
>> removed again from check_register().
> 
> In fact, I didn't remove it in V4, I didn't find a better place to deal with it. I don't know if you agree with this implementation below.

I'm afraid I don't, both because it still isn't clear to me what's wrong
with my alternative proposal, and also for the formal reason of ...

> /* For dual VEX/EVEX templates, evex encoding is required when the input has
>    egpr.*/
> static INLINE void
> vex_with_Egpr_requires_evex_encoding (const insn_template *t)
> {
>   for (unsigned int op = 0; op < i.operands; op++)
>     {
>       if (i.types[op].bitfield.class != Reg)
>         continue;
> 
>       if (i.op[op].regs->reg_flags & RegRex2)
>         i.vec_encoding = vex_encoding_evex;

... it not being okay to override i.vec_encoding like this, when it
may already have been set to another value.

Jan

>     }
> 
>   if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
>       || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
>     i.vec_encoding = vex_encoding_evex;
> }
> 
> static INLINE void
> install_template (const insn_template *t)
> {
>   unsigned int l;
> 
>   i.tm = *t;
> 
>   /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
>   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
>     {
>       vex_with_Egpr_requires_evex_encoding (t);
> 
> 
> Regards,
> Lili.

Cui, Lili Dec. 20, 2023, 10:42 a.m. UTC | #4

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Wednesday, December 20, 2023 4:57 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org
> Subject: Re: [PATCH v4 0/9] Support Intel APX EGPR
> 
> On 20.12.2023 09:50, Cui, Lili wrote:
> >> On 19.12.2023 13:12, Cui, Lili wrote:
> >>> *** BLURB HERE ***
> >>> Future optimizations to be made.
> >>> 1. The current implementation of vexvvvvv needs to be optimized.
> >>> 2. The handling of double VEX/EVEX templates in check_register()
> >>> needs to
> >> be optimized.
> >>
> >> I hope this is just stale here, and the dependency on templates was
> >> now removed again from check_register().
> >
> > In fact, I didn't remove it in V4, I didn't find a better place to deal with it. I
> don't know if you agree with this implementation below.
> 
> I'm afraid I don't, both because it still isn't clear to me what's wrong with my
> alternative proposal, and also for the formal reason of ...
> 

For the alternative proposal, do you mean adding a new variable to avoid introducing new loops over all operands? How about this ? or do you want to add other variable and handle it in check_register?

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -464,6 +464,9 @@ struct _i386_insn
     /* Have NOTRACK prefix.  */
     const char *notrack_prefix;

+    /* Has Egpr.  */
+    bool has_egpr;
+
     /* Error message.  */
     enum i386_error error;
   };
@@ -3683,7 +3686,7 @@ install_template (const insn_template *t)
           || maybe_cpu (t, CpuFMA))
          && (maybe_cpu (t, CpuAVX512F) || maybe_cpu (t, CpuAVX512VL)))
        {
-         if (need_evex_encoding ())
+         if (need_evex_encoding () || i.has_egpr)
            {
              i.tm.opcode_modifier.vex = 0;
              i.tm.cpu.bitfield.cpuavx512f = i.tm.cpu_any.bitfield.cpuavx512f;
@@ -3704,7 +3707,7 @@ install_template (const insn_template *t)
       if (APX_F(CpuCMPCCXADD) || APX_F(CpuAMX_TILE) || APX_F(CpuAVX512F)
          || APX_F(CpuAVX512DQ) || APX_F(CpuAVX512BW) || APX_F(CpuBMI)
          || APX_F(CpuBMI2))
-       if (need_evex_encoding ())
+       if (need_evex_encoding () || i.has_egpr)
          i.tm.opcode_modifier.vex = 0;
        else
          i.tm.opcode_modifier.evex = 0;
@@ -14523,11 +14526,12 @@ static bool check_register (const reg_entry *r)
          || flag_code != CODE_64BIT)
        return false;

+      i.has_egpr = true;


Lili.

> > /* For dual VEX/EVEX templates, evex encoding is required when the input
> has
> >    egpr.*/
> > static INLINE void
> > vex_with_Egpr_requires_evex_encoding (const insn_template *t) {
> >   for (unsigned int op = 0; op < i.operands; op++)
> >     {
> >       if (i.types[op].bitfield.class != Reg)
> >         continue;
> >
> >       if (i.op[op].regs->reg_flags & RegRex2)
> >         i.vec_encoding = vex_encoding_evex;
> 
> ... it not being okay to override i.vec_encoding like this, when it may already
> have been set to another value.
> 
> Jan
> 
> >     }
> >
> >   if ((i.index_reg && (i.index_reg->reg_flags & RegRex2))
> >       || (i.base_reg && (i.base_reg->reg_flags & RegRex2)))
> >     i.vec_encoding = vex_encoding_evex; }
> >
> > static INLINE void
> > install_template (const insn_template *t) {
> >   unsigned int l;
> >
> >   i.tm = *t;
> >
> >   /* Dual VEX/EVEX templates need stripping one of the possible variants.  */
> >   if (t->opcode_modifier.vex && t->opcode_modifier.evex)
> >     {
> >       vex_with_Egpr_requires_evex_encoding (t);
> >
> >
> > Regards,
> > Lili.

Jan Beulich Dec. 20, 2023, 11 a.m. UTC | #5

On 20.12.2023 11:42, Cui, Lili wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Wednesday, December 20, 2023 4:57 PM
>>
>> On 20.12.2023 09:50, Cui, Lili wrote:
>>>> On 19.12.2023 13:12, Cui, Lili wrote:
>>>>> *** BLURB HERE ***
>>>>> Future optimizations to be made.
>>>>> 1. The current implementation of vexvvvvv needs to be optimized.
>>>>> 2. The handling of double VEX/EVEX templates in check_register()
>>>>> needs to
>>>> be optimized.
>>>>
>>>> I hope this is just stale here, and the dependency on templates was
>>>> now removed again from check_register().
>>>
>>> In fact, I didn't remove it in V4, I didn't find a better place to deal with it. I
>> don't know if you agree with this implementation below.
>>
>> I'm afraid I don't, both because it still isn't clear to me what's wrong with my
>> alternative proposal, and also for the formal reason of ...
>>
> 
> For the alternative proposal, do you mean adding a new variable to avoid introducing new loops over all operands? How about this ? or do you want to add other variable and handle it in check_register?

No, the alternative proposal continues to be to introduce a new
enumerator to record in i.vec_encoding (vex_encoding_egpr is what
iirc I had suggested before, despite the naming anomaly). What you
outline below would, however, still be better than adding another
loop (as you had it earlier), imo.

> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -464,6 +464,9 @@ struct _i386_insn
>      /* Have NOTRACK prefix.  */
>      const char *notrack_prefix;
> 
> +    /* Has Egpr.  */
> +    bool has_egpr;
> +
>      /* Error message.  */
>      enum i386_error error;
>    };

As a general remark, when you add new fields to a struct, please
try to find a slot that ideally is using existing padding _and_ is
next to related fields, or at least one of the two.

Jan

Cui, Lili Dec. 20, 2023, 11:50 a.m. UTC | #6

> >>>> On 19.12.2023 13:12, Cui, Lili wrote:
> >>>>> *** BLURB HERE ***
> >>>>> Future optimizations to be made.
> >>>>> 1. The current implementation of vexvvvvv needs to be optimized.
> >>>>> 2. The handling of double VEX/EVEX templates in check_register()
> >>>>> needs to
> >>>> be optimized.
> >>>>
> >>>> I hope this is just stale here, and the dependency on templates was
> >>>> now removed again from check_register().
> >>>
> >>> In fact, I didn't remove it in V4, I didn't find a better place to
> >>> deal with it. I
> >> don't know if you agree with this implementation below.
> >>
> >> I'm afraid I don't, both because it still isn't clear to me what's
> >> wrong with my alternative proposal, and also for the formal reason of ...
> >>
> >
> > For the alternative proposal, do you mean adding a new variable to avoid
> introducing new loops over all operands? How about this ? or do you want to
> add other variable and handle it in check_register?
> 
> No, the alternative proposal continues to be to introduce a new enumerator
> to record in i.vec_encoding (vex_encoding_egpr is what iirc I had suggested
> before, despite the naming anomaly). What you outline below would,
> however, still be better than adding another loop (as you had it earlier), imo.
> 

I guessed you want to add a new type like vex_encoding_egpr, but I don't know how to do it differently with before, when the instruction support legacy, vex and evex encodings, if we put the vex and eves templates in front of the legacy templates (in i386-opc.tbl), we'll assign the vex_encoding_egpr for the legacy input, and it will have the same problem as before. And we also need to handle it in check_register(). Maybe you hinted at some other way of handling it, but I didn't get it.


     if (current_templates.start->opcode_modifier.vex
        && current_templates.start->opcode_modifier.evex)
      i.vec_encoding = vex_encoding_egpr;


> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -464,6 +464,9 @@ struct _i386_insn
> >      /* Have NOTRACK prefix.  */
> >      const char *notrack_prefix;
> >
> > +    /* Has Egpr.  */
> > +    bool has_egpr;
> > +
> >      /* Error message.  */
> >      enum i386_error error;
> >    };
> 
> As a general remark, when you add new fields to a struct, please try to find a
> slot that ideally is using existing padding _and_ is next to related fields, or at
> least one of the two.
> 

Moved to

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -438,6 +438,9 @@ struct _i386_insn
     /* Prefer the REX2 prefix in encoding.  */
     bool rex2_encoding;

+    /* Has Egpr.  */
+    bool has_egpr;
+
     /* Disable instruction size optimization.  */
     bool no_optimize;

Lili.

Jan Beulich Dec. 20, 2023, 12:01 p.m. UTC | #7

On 20.12.2023 12:50, Cui, Lili wrote:
>>>>>> On 19.12.2023 13:12, Cui, Lili wrote:
>>>>>>> *** BLURB HERE ***
>>>>>>> Future optimizations to be made.
>>>>>>> 1. The current implementation of vexvvvvv needs to be optimized.
>>>>>>> 2. The handling of double VEX/EVEX templates in check_register()
>>>>>>> needs to
>>>>>> be optimized.
>>>>>>
>>>>>> I hope this is just stale here, and the dependency on templates was
>>>>>> now removed again from check_register().
>>>>>
>>>>> In fact, I didn't remove it in V4, I didn't find a better place to
>>>>> deal with it. I
>>>> don't know if you agree with this implementation below.
>>>>
>>>> I'm afraid I don't, both because it still isn't clear to me what's
>>>> wrong with my alternative proposal, and also for the formal reason of ...
>>>>
>>>
>>> For the alternative proposal, do you mean adding a new variable to avoid
>> introducing new loops over all operands? How about this ? or do you want to
>> add other variable and handle it in check_register?
>>
>> No, the alternative proposal continues to be to introduce a new enumerator
>> to record in i.vec_encoding (vex_encoding_egpr is what iirc I had suggested
>> before, despite the naming anomaly). What you outline below would,
>> however, still be better than adding another loop (as you had it earlier), imo.
>>
> 
> I guessed you want to add a new type like vex_encoding_egpr, but I don't know how to do it differently with before, when the instruction support legacy, vex and evex encodings, if we put the vex and eves templates in front of the legacy templates (in i386-opc.tbl), we'll assign the vex_encoding_egpr for the legacy input, and it will have the same problem as before. And we also need to handle it in check_register(). Maybe you hinted at some other way of handling it, but I didn't get it.
> 
> 
>      if (current_templates.start->opcode_modifier.vex
>         && current_templates.start->opcode_modifier.evex)
>       i.vec_encoding = vex_encoding_egpr;

Since setting of the new encoding type has to happen in check_register(),
using current_templates (as said several times before) is not an option.

Anyway, in the interest of forward progress, feel free to go with ...

>>> --- a/gas/config/tc-i386.c
>>> +++ b/gas/config/tc-i386.c
>>> @@ -464,6 +464,9 @@ struct _i386_insn
>>>      /* Have NOTRACK prefix.  */
>>>      const char *notrack_prefix;
>>>
>>> +    /* Has Egpr.  */
>>> +    bool has_egpr;
>>> +
>>>      /* Error message.  */
>>>      enum i386_error error;
>>>    };
>>
>> As a general remark, when you add new fields to a struct, please try to find a
>> slot that ideally is using existing padding _and_ is next to related fields, or at
>> least one of the two.
>>
> 
> Moved to
> 
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -438,6 +438,9 @@ struct _i386_insn
>      /* Prefer the REX2 prefix in encoding.  */
>      bool rex2_encoding;
> 
> +    /* Has Egpr.  */
> +    bool has_egpr;

... this approach then, and subsequently I'll see if I can re-arrange things
(and if I'm bothered enough to do so). The comment is pretty unhelpful as is,
how about "Need to use an eGPR capable encoding (REX2 or EVEX)" or some such?

Jan

Cui, Lili Dec. 20, 2023, 12:16 p.m. UTC | #8

> On 20.12.2023 12:50, Cui, Lili wrote:
> >>>>>> On 19.12.2023 13:12, Cui, Lili wrote:
> >>>>>>> *** BLURB HERE ***
> >>>>>>> Future optimizations to be made.
> >>>>>>> 1. The current implementation of vexvvvvv needs to be optimized.
> >>>>>>> 2. The handling of double VEX/EVEX templates in check_register()
> >>>>>>> needs to
> >>>>>> be optimized.
> >>>>>>
> >>>>>> I hope this is just stale here, and the dependency on templates
> >>>>>> was now removed again from check_register().
> >>>>>
> >>>>> In fact, I didn't remove it in V4, I didn't find a better place to
> >>>>> deal with it. I
> >>>> don't know if you agree with this implementation below.
> >>>>
> >>>> I'm afraid I don't, both because it still isn't clear to me what's
> >>>> wrong with my alternative proposal, and also for the formal reason of ...
> >>>>
> >>>
> >>> For the alternative proposal, do you mean adding a new variable to
> >>> avoid
> >> introducing new loops over all operands? How about this ? or do you
> >> want to add other variable and handle it in check_register?
> >>
> >> No, the alternative proposal continues to be to introduce a new
> >> enumerator to record in i.vec_encoding (vex_encoding_egpr is what
> >> iirc I had suggested before, despite the naming anomaly). What you
> >> outline below would, however, still be better than adding another loop (as
> you had it earlier), imo.
> >>
> >
> > I guessed you want to add a new type like vex_encoding_egpr, but I don't
> know how to do it differently with before, when the instruction support
> legacy, vex and evex encodings, if we put the vex and eves templates in front
> of the legacy templates (in i386-opc.tbl), we'll assign the vex_encoding_egpr
> for the legacy input, and it will have the same problem as before. And we also
> need to handle it in check_register(). Maybe you hinted at some other way of
> handling it, but I didn't get it.
> >
> >
> >      if (current_templates.start->opcode_modifier.vex
> >         && current_templates.start->opcode_modifier.evex)
> >       i.vec_encoding = vex_encoding_egpr;
> 
> Since setting of the new encoding type has to happen in check_register(),
> using current_templates (as said several times before) is not an option.
> 
> Anyway, in the interest of forward progress, feel free to go with ...
> 
> >>> --- a/gas/config/tc-i386.c
> >>> +++ b/gas/config/tc-i386.c
> >>> @@ -464,6 +464,9 @@ struct _i386_insn
> >>>      /* Have NOTRACK prefix.  */
> >>>      const char *notrack_prefix;
> >>>
> >>> +    /* Has Egpr.  */
> >>> +    bool has_egpr;
> >>> +
> >>>      /* Error message.  */
> >>>      enum i386_error error;
> >>>    };
> >>
> >> As a general remark, when you add new fields to a struct, please try
> >> to find a slot that ideally is using existing padding _and_ is next
> >> to related fields, or at least one of the two.
> >>
> >
> > Moved to
> >
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -438,6 +438,9 @@ struct _i386_insn
> >      /* Prefer the REX2 prefix in encoding.  */
> >      bool rex2_encoding;
> >
> > +    /* Has Egpr.  */
> > +    bool has_egpr;
> 
> ... this approach then, and subsequently I'll see if I can re-arrange things (and
> if I'm bothered enough to do so). The comment is pretty unhelpful as is, how
> about "Need to use an eGPR capable encoding (REX2 or EVEX)" or some such?
> 

Great, thanks!

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -438,6 +438,9 @@ struct _i386_insn
     /* Prefer the REX2 prefix in encoding.  */
     bool rex2_encoding;

+    /* Need to use an Egpr capable encoding (REX2 or EVEX).  */
+    bool has_egpr;
+

Lili.