[01/10] Support Intel AVX-IFMA

Message ID	20221014091248.4920-2-haochen.jiang@intel.com
State	Accepted
Headers	Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97;
	DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2775438582A0
	To: binutils@sourceware.org
	Subject: [PATCH 01/10] Support Intel AVX-IFMA
	Date: Fri, 14 Oct 2022 17:12:39 +0800
	Message-Id: <20221014091248.4920-2-haochen.jiang@intel.com>
	In-Reply-To: <20221014091248.4920-1-haochen.jiang@intel.com>
	References: <20221014091248.4920-1-haochen.jiang@intel.com>
	Precedence: list
	From: Haochen Jiang via Binutils <binutils@sourceware.org>
	Reply-To: Haochen Jiang <haochen.jiang@intel.com>
	Cc: wwwhhhyyy <hongyu.wang@intel.com>
	Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org
	Sender: "Binutils" <binutils-bounces+ouuuleilei=gmail.com@sourceware.org>
	X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
Series	Add new Intel Sierra Forest, Grand Ridge, Granite Rapids Instructions
	[0/10] Add new Intel Sierra Forest, Grand Ridge, Granite Rapids Instructions
	[01/10] Support Intel AVX-IFMA
	[02/10] Support Intel AVX-VNNI-INT8
	[03/10] Support Intel AVX-NE-CONVERT
	[04/10] Support Intel CMPccXADD
	[05/10] Add handler for more i386_cpu_flags
	[06/10] Support Intel RAO-INT
	[07/10] Support Intel WRMSRNS
	[08/10] Support Intel MSRLIST
	[09/10] Support Intel AMX-FP16
	[10/10] Support Intel PREFETCHI

Checks

Context	Check	Description
snail/binutils-gdb-check	success	Github commit url

Commit Message

Jiang, Haochen Oct. 14, 2022, 9:12 a.m. UTC

  From: wwwhhhyyy <hongyu.wang@intel.com>

x86: Support Intel AVX-IFMA

Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
encoding for Intel IFMA instructions.

gas/

	* NEWS: Support Intel AVX-IFMA.
	* config/tc-i386.c (cpu_arch): Add avx_ifma.
	* doc/c-i386.texi: Document .avx_ifma, noavx_ifma and how to
	encode Intel IFMA instructions with VEX prefix.
	* testsuite/gas/i386/avx-ifma.d: New file.
	* testsuite/gas/i386/avx-ifma-intel.d: Likewise.
	* testsuite/gas/i386/avx-ifma.s: Likewise.
	* testsuite/gas/i386/x86-64-avx-ifma.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-ifma-intel.d: Likewise.
	* testsuite/gas/i386/x86-64-avx-ifma.s: Likewise.
	* testsuite/gas/i386/i386.exp: Run AVX IFMA tests.

opcodes/

	* i386-dis.c (PREFIX_VEX_0F38B4): New.
	(PREFIX_VEX_0F38B5): Likewise.
	(VEX_W_0F38B4_P_2): Likewise.
	(VEX_W_0F38B5_P_2): Likewise.
	(prefix_table): Add PREFIX_VEX_0F38B4 and PREFIX_VEX_0F38B5.
	(vex_table): Add VEX_W_0F38B4_P_2 and VEX_W_0F38B5_P_2.
	* i386-gen.c (cpu_flag_init): Clear the CpuAVX_IFMA bit in
	CPU_UNKNOWN_FLAGS. Add CPU_AVX_IFMA_FLAGS and
	CPU_ANY_AVX_IFMA_FLAGS.
	(cpu_flags): Add CpuAVX_IFMA.
	* i386-opc.h (CpuAVX_IFMA): New.
	(i386_cpu_flags): Add cpuavx_ifma.
	* i386-opc.tbl: Add Intel AVX IFMA instructions.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Likewise.
---
 gas/NEWS                                      |    2 +
 gas/config/tc-i386.c                          |    1 +
 gas/doc/c-i386.texi                           |    7 +-
 gas/testsuite/gas/i386/avx-ifma-intel.d       |   30 +
 gas/testsuite/gas/i386/avx-ifma-inval.l       |    2 +
 gas/testsuite/gas/i386/avx-ifma-inval.s       |    6 +
 gas/testsuite/gas/i386/avx-ifma.d             |   30 +
 gas/testsuite/gas/i386/avx-ifma.s             |   21 +
 gas/testsuite/gas/i386/i386.exp               |    6 +
 gas/testsuite/gas/i386/noavx512-1.l           |   24 +-
 .../gas/i386/x86-64-avx-ifma-intel.d          |   34 +
 .../gas/i386/x86-64-avx-ifma-inval.l          |    3 +
 .../gas/i386/x86-64-avx-ifma-inval.s          |    7 +
 gas/testsuite/gas/i386/x86-64-avx-ifma.d      |   34 +
 gas/testsuite/gas/i386/x86-64-avx-ifma.s      |   23 +
 opcodes/i386-dis.c                            |   16 +-
 opcodes/i386-gen.c                            |    5 +
 opcodes/i386-init.h                           |  514 +-
 opcodes/i386-opc.h                            |    3 +
 opcodes/i386-opc.tbl                          |    7 +
 opcodes/i386-tbl.h                            | 7808 +++++++++--------
 21 files changed, 4432 insertions(+), 4151 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/avx-ifma-intel.d
 create mode 100644 gas/testsuite/gas/i386/avx-ifma-inval.l
 create mode 100644 gas/testsuite/gas/i386/avx-ifma-inval.s
 create mode 100644 gas/testsuite/gas/i386/avx-ifma.d
 create mode 100644 gas/testsuite/gas/i386/avx-ifma.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-ifma-intel.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-ifma-inval.l
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-ifma-inval.s
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-ifma.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-avx-ifma.s

Comments

Jan Beulich Oct. 14, 2022, 9:52 a.m. UTC | #1

On 14.10.2022 11:12, Haochen Jiang wrote:
> From: wwwhhhyyy <hongyu.wang@intel.com>
> 
> x86: Support Intel AVX-IFMA
> 
> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> encoding for Intel IFMA instructions.

I firmly object to the proliferation of this mis-feature. As expressed
before for AVX-VNNI, as long as the user has disabled AVX512 (or
respective sub-features thereof), there should be no need to use {vex} in
the source code. There's also no reason at all to make the disassembler
print {vex} prefixes - we don't do so for any other insns (apart from
AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
(when none of the EVEX-specific features is used).

I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
least on the assembler side (which also drops the PseudoVexPrefix
attribute).

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
> @@ -1526,6 +1526,8 @@ enum
>    VEX_W_0F385E_X86_64_P_3,
>    VEX_W_0F3878,
>    VEX_W_0F3879,
> +  VEX_W_0F38B4,
> +  VEX_W_0F38B5,
>    VEX_W_0F38CF,
>    VEX_W_0F3A00_L_1,
>    VEX_W_0F3A01_L_1,
> @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { VEX_W_TABLE (VEX_W_0F38B4) },
> +    { VEX_W_TABLE (VEX_W_0F38B5) },
>      { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
>      /* b8 */
> @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
>      /* VEX_W_0F3879 */
>      { "vpbroadcastw",	{ XM, EXw }, PREFIX_DATA },
>    },
> +  {
> +    /* VEX_W_0F38B4 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52luq",	{ XM, Vex, EXx }, PREFIX_DATA },
> +  },
> +  {
> +    /* VEX_W_0F38B5 */
> +    { Bad_Opcode },
> +    { "%XV vpmadd52huq",	{ XM, Vex, EXx }, PREFIX_DATA },
> +  },

Irrespective of the aspect mentioned at the top I think this is yet
another case where VEX and EVEX table entries can be shared. This would
(if the {vex} printing really needs retaining for whatever obscure
reason) merely require the processing of %XV to do nothing for EVEX-
encoded insns, plus of course the separating blank would then also need
to be included in the processing of %XV.

I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries,
which you could then re-base on top of.
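
The %XV handling suggested above can be sketched roughly as follows. This is a simplified model, not the i386-dis.c implementation: the function name expand_mnemonic, the is_evex parameter and the buffer handling are assumptions made for illustration. The point it shows is that once the separating blank is emitted as part of %XV, a single template such as "%XVvpmadd52luq" can serve both the VEX and the EVEX form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Text emitted for "%XV" on VEX-encoded insns.  The trailing blank is
   part of the expansion, so the template itself carries no space.  */
static const char vex_prefix[] = "{vex} ";

/* Expand TMPL into OUT (at most OUTSZ bytes including the NUL).
   "%XV" expands to "{vex} " for VEX encodings and to nothing for
   EVEX encodings.  Hypothetical helper, for illustration only.  */
void
expand_mnemonic (const char *tmpl, bool is_evex, char *out, size_t outsz)
{
  size_t n = 0;

  if (outsz == 0)
    return;

  while (*tmpl != '\0')
    {
      if (strncmp (tmpl, "%XV", 3) == 0)
        {
          tmpl += 3;
          if (!is_evex)
            for (const char *p = vex_prefix; *p != '\0' && n + 1 < outsz; p++)
              out[n++] = *p;
        }
      else
        {
          if (n + 1 < outsz)
            out[n++] = *tmpl;
          tmpl++;
        }
    }
  out[n] = '\0';
}

/* Usage: one shared template, two renderings.  */
void
check_expand_mnemonic (void)
{
  char buf[64];

  expand_mnemonic ("%XVvpmadd52luq", false, buf, sizeof buf);
  assert (strcmp (buf, "{vex} vpmadd52luq") == 0);

  expand_mnemonic ("%XVvpmadd52luq", true, buf, sizeof buf);
  assert (strcmp (buf, "vpmadd52luq") == 0);
}
```

With this shape the VEX-only table entries become unnecessary, because the only difference between the two renderings is decided at print time.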

> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
>      "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
>    { "CPU_AVX512_FP16_FLAGS",
>      "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> +  { "CPU_AVX_IFMA_FLAGS",
> +    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
>    { "CPU_IAMCU_FLAGS",
>      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
>    { "CPU_ADX_FLAGS",
> @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
>      "CpuHRESET" },
>    { "CPU_ANY_AVX512_FP16_FLAGS",
>      "CpuAVX512_FP16" },
> +  { "CPU_ANY_AVX_IFMA_FLAGS",
> +    "CpuAVX_IFMA" },

If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
adjustment, such that disabling of AVX2 also results in disabling of
AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)
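
The prerequisite issue can be illustrated with a small bitmask model. The bit values and macro names below are hypothetical and do not correspond to the generated i386-init.h contents; the sketch only shows why a "disable" mask has to list every feature that names the disabled one as a prerequisite.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define F_AVX2      (UINT32_C (1) << 0)
#define F_AVX_IFMA  (UINT32_C (1) << 1)

/* Enabling AVX-IFMA pulls in its prerequisite, AVX2 ...  */
#define ENABLE_AVX_IFMA   (F_AVX2 | F_AVX_IFMA)
/* ... so disabling AVX2 must also clear AVX-IFMA.  */
#define DISABLE_AVX2      (F_AVX2 | F_AVX_IFMA)
/* Disabling AVX-IFMA alone leaves AVX2 untouched.  */
#define DISABLE_AVX_IFMA  (F_AVX_IFMA)

uint32_t
enable_features (uint32_t flags, uint32_t mask)
{
  return flags | mask;
}

uint32_t
disable_features (uint32_t flags, uint32_t mask)
{
  return flags & ~mask;
}

/* Usage: the three cases discussed in the review comment.  */
void
check_feature_deps (void)
{
  uint32_t flags = enable_features (0, ENABLE_AVX_IFMA);

  assert ((flags & F_AVX2) != 0);
  assert ((flags & F_AVX_IFMA) != 0);

  /* Turning off only AVX-IFMA keeps AVX2.  */
  assert (disable_features (flags, DISABLE_AVX_IFMA) == F_AVX2);

  /* Turning off AVX2 must not leave AVX-IFMA behind.  */
  assert (disable_features (flags, DISABLE_AVX2) == 0);
}
```

If DISABLE_AVX2 were defined as F_AVX2 alone, the last assertion would fail, which is the inconsistency the comment points out.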

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
>  vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
>  
>  // FP16 (HFNI) instructions end.
> +
> +// AVX_IFMA instructions.

Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples
of the (needless) use of underscores like this.

> +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }

Please use plain VexVVVV (without =1) - we want to have as little clutter as
possible on these usually already overlong lines.

Jan

H.J. Lu Oct. 14, 2022, 6:10 p.m. UTC | #2

On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 14.10.2022 11:12, Haochen Jiang wrote:
> > From: wwwhhhyyy <hongyu.wang@intel.com>
> >
> > x86: Support Intel AVX-IFMA
> >
> > Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> > cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> > are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> > encoding for Intel IFMA instructions.
>
> I firmly object to the proliferation of this mis-feature. As expressed
> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> respective sub-features thereof), there should be no need to use {vex} in
> the source code. There's also no reason at all to make the disassembler
> print {vex} prefixes - we don't do so for any other insns (apart from
> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> (when none of the EVEX-specific features is used).

The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
without a prefix, which are generated by compilers or handwritten, will be
always encoded with EVEX.

> I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
> least on the assembler side (which also drops the PseudoVexPrefix
> attribute).
>

Jan Beulich Oct. 16, 2022, 6:39 a.m. UTC | #3

On 14.10.2022 20:10, H.J. Lu wrote:
> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 14.10.2022 11:12, Haochen Jiang wrote:
>>> From: wwwhhhyyy <hongyu.wang@intel.com>
>>>
>>> x86: Support Intel AVX-IFMA
>>>
>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
>>> encoding for Intel IFMA instructions.
>>
>> I firmly object to the proliferation of this mis-feature. As expressed
>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
>> respective sub-features thereof), there should be no need to use {vex} in
>> the source code. There's also no reason at all to make the disassembler
>> print {vex} prefixes - we don't do so for any other insns (apart from
>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
>> (when none of the EVEX-specific features is used).
> 
> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
> without a prefix, which are generated by compilers or handwritten, will be
> always encoded with EVEX.

So again: Why is this necessary when a programmer disabled AVX512? I fully
agree we need to pick the EVEX encoding by default if available, but I see
no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
unavailable anyway. As you said back at the time for AVX-VNNI - this was a
design decision taken at Intel. Which is fine for a draft implementation.
But decisions for an open source project should be taken in the open, and
opinions of others should not simply be put off.
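
The selection policy argued for here can be written down as a small decision function. This is a sketch under assumptions: the enum, the function name choose_encoding and its boolean parameters are invented for illustration and are not gas data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type; ENC_NONE means "no usable encoding".  */
enum encoding { ENC_NONE, ENC_VEX, ENC_EVEX };

/* Pick an encoding for an insn that exists in both a VEX and an EVEX form.
   evex_ok:       the EVEX-providing extension is enabled.
   vex_ok:        the VEX-providing extension is enabled.
   vex_requested: the source line carried a {vex} pseudo prefix.  */
enum encoding
choose_encoding (bool evex_ok, bool vex_ok, bool vex_requested)
{
  if (vex_requested)
    return vex_ok ? ENC_VEX : ENC_NONE;
  if (evex_ok)
    return ENC_EVEX;                   /* EVEX stays the default.  */
  return vex_ok ? ENC_VEX : ENC_NONE;  /* No {vex} needed here.  */
}

/* Usage: the cases named in the discussion.  */
void
check_choose_encoding (void)
{
  /* Both forms available, no prefix: EVEX by default.  */
  assert (choose_encoding (true, true, false) == ENC_EVEX);
  /* Both forms available, {vex} given: VEX.  */
  assert (choose_encoding (true, true, true) == ENC_VEX);
  /* EVEX form disabled, no prefix: VEX without requiring {vex}.  */
  assert (choose_encoding (false, true, false) == ENC_VEX);
}
```

Only the third case differs from the behavior described in the patch's commit message.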

Jan

H.J. Lu Oct. 17, 2022, 10:23 p.m. UTC | #4

On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 14.10.2022 20:10, H.J. Lu wrote:
> > On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 14.10.2022 11:12, Haochen Jiang wrote:
> >>> From: wwwhhhyyy <hongyu.wang@intel.com>
> >>>
> >>> x86: Support Intel AVX-IFMA
> >>>
> >>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> >>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> >>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> >>> encoding for Intel IFMA instructions.
> >>
> >> I firmly object to the proliferation of this mis-feature. As expressed
> >> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> >> respective sub-features thereof), there should be no need to use {vex} in
> >> the source code. There's also no reason at all to make the disassembler
> >> print {vex} prefixes - we don't do so for any other insns (apart from
> >> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> >> (when none of the EVEX-specific features is used).
> >
> > The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
> > without a prefix, which are generated by compilers or handwritten, will be
> > always encoded with EVEX.
>
> So again: Why is this necessary when a programmer disabled AVX512? I fully
> agree we need to pick the EVEX encoding by default if available, but I see
> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
> design decision taken at Intel. Which is fine for a draft implementation.
> But decisions for an open source project should be taken in the open, and
> opinions of others should not simply be put off.
>

We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
this patch.

Jan Beulich Oct. 18, 2022, 5:33 a.m. UTC | #5

On 18.10.2022 00:23, H.J. Lu wrote:
> On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 14.10.2022 20:10, H.J. Lu wrote:
>>> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 14.10.2022 11:12, Haochen Jiang wrote:
>>>>> From: wwwhhhyyy <hongyu.wang@intel.com>
>>>>>
>>>>> x86: Support Intel AVX-IFMA
>>>>>
>>>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
>>>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
>>>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
>>>>> encoding for Intel IFMA instructions.
>>>>
>>>> I firmly object to the proliferation of this mis-feature. As expressed
>>>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
>>>> respective sub-features thereof), there should be no need to use {vex} in
>>>> the source code. There's also no reason at all to make the disassembler
>>>> print {vex} prefixes - we don't do so for any other insns (apart from
>>>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
>>>> (when none of the EVEX-specific features is used).
>>>
>>> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
>>> without a prefix, which are generated by compilers or handwritten, will be
>>> always encoded with EVEX.
>>
>> So again: Why is this necessary when a programmer disabled AVX512? I fully
>> agree we need to pick the EVEX encoding by default if available, but I see
>> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
>> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
>> design decision taken at Intel. Which is fine for a draft implementation.
>> But decisions for an open source project should be taken in the open, and
>> opinions of others should not simply be put off.
>>
> 
> We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
> this patch.

One can view it as orthogonal, yes, but if we change the model then doing
so before more code and testcases need changing is imo preferable.

Jan

H.J. Lu Oct. 18, 2022, 9:28 p.m. UTC | #6

On Mon, Oct 17, 2022 at 10:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 18.10.2022 00:23, H.J. Lu wrote:
> > On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 14.10.2022 20:10, H.J. Lu wrote:
> >>> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 14.10.2022 11:12, Haochen Jiang wrote:
> >>>>> From: wwwhhhyyy <hongyu.wang@intel.com>
> >>>>>
> >>>>> x86: Support Intel AVX-IFMA
> >>>>>
> >>>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> >>>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> >>>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> >>>>> encoding for Intel IFMA instructions.
> >>>>
> >>>> I firmly object to the proliferation of this mis-feature. As expressed
> >>>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> >>>> respective sub-features thereof), there should be no need to use {vex} in
> >>>> the source code. There's also no reason at all to make the disassembler
> >>>> print {vex} prefixes - we don't do so for any other insns (apart from
> >>>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> >>>> (when none of the EVEX-specific features is used).
> >>>
> >>> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
> >>> without a prefix, which are generated by compilers or handwritten, will be
> >>> always encoded with EVEX.
> >>
> >> So again: Why is this necessary when a programmer disabled AVX512? I fully
> >> agree we need to pick the EVEX encoding by default if available, but I see
> >> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
> >> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
> >> design decision taken at Intel. Which is fine for a draft implementation.
> >> But decisions for an open source project should be taken in the open, and
> >> opinions of others should not simply be put off.
> >>
> >
> > We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
> > this patch.
>
> One can view it as orthogonal, yes, but if we change the model then doing
> so before more code and testcases need changing is imo preferable.
>

We can skip the pseudo VEX prefix check when AVX512F is disabled.
There should be no testcase changes.

Jan Beulich Oct. 19, 2022, 6:01 a.m. UTC | #7

On 18.10.2022 23:28, H.J. Lu wrote:
> On Mon, Oct 17, 2022 at 10:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 18.10.2022 00:23, H.J. Lu wrote:
>>> On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 14.10.2022 20:10, H.J. Lu wrote:
>>>>> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 14.10.2022 11:12, Haochen Jiang wrote:
>>>>>>> From: wwwhhhyyy <hongyu.wang@intel.com>
>>>>>>>
>>>>>>> x86: Support Intel AVX-IFMA
>>>>>>>
>>>>>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
>>>>>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
>>>>>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
>>>>>>> encoding for Intel IFMA instructions.
>>>>>>
>>>>>> I firmly object to the proliferation of this mis-feature. As expressed
>>>>>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
>>>>>> respective sub-features thereof), there should be no need to use {vex} in
>>>>>> the source code. There's also no reason at all to make the disassembler
>>>>>> print {vex} prefixes - we don't do so for any other insns (apart from
>>>>>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
>>>>>> (when none of the EVEX-specific features is used).
>>>>>
>>>>> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
>>>>> without a prefix, which are generated by compilers or handwritten, will be
>>>>> always encoded with EVEX.
>>>>
>>>> So again: Why is this necessary when a programmer disabled AVX512? I fully
>>>> agree we need to pick the EVEX encoding by default if available, but I see
>>>> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
>>>> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
>>>> design decision taken at Intel. Which is fine for a draft implementation.
>>>> But decisions for an open source project should be taken in the open, and
>>>> opinions of others should not simply be put off.
>>>>
>>>
>>> We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
>>> this patch.
>>
>> One can view it as orthogonal, yes, but if we change the model then doing
>> so before more code and testcases need changing is imo preferable.
>>
> 
> We can skip the pseudo VEX prefix check when AVX512F is disabled.

Let me see if I can pull ahead the patch I have (right now it's at the end
of the 3rd series I have pending, when the 1st one continues to be debated),
so the new cases in this series could then come on top.

> There should be no testcase changes.

Well - existing tests ought to continue to work, yes, but the prefix-less
forms then also will want testing.

Jan

H.J. Lu Oct. 19, 2022, 9:27 p.m. UTC | #8

On Tue, Oct 18, 2022 at 11:01 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 18.10.2022 23:28, H.J. Lu wrote:
> > On Mon, Oct 17, 2022 at 10:33 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 18.10.2022 00:23, H.J. Lu wrote:
> >>> On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>
> >>>> On 14.10.2022 20:10, H.J. Lu wrote:
> >>>>> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
> >>>>>>
> >>>>>> On 14.10.2022 11:12, Haochen Jiang wrote:
> >>>>>>> From: wwwhhhyyy <hongyu.wang@intel.com>
> >>>>>>>
> >>>>>>> x86: Support Intel AVX-IFMA
> >>>>>>>
> >>>>>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> >>>>>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> >>>>>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> >>>>>>> encoding for Intel IFMA instructions.
> >>>>>>
> >>>>>> I firmly object to the proliferation of this mis-feature. As expressed
> >>>>>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> >>>>>> respective sub-features thereof), there should be no need to use {vex} in
> >>>>>> the source code. There's also no reason at all to make the disassembler
> >>>>>> print {vex} prefixes - we don't do so for any other insns (apart from
> >>>>>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> >>>>>> (when none of the EVEX-specific features is used).
> >>>>>
> >>>>> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
> >>>>> without a prefix, which are generated by compilers or handwritten, will be
> >>>>> always encoded with EVEX.
> >>>>
> >>>> So again: Why is this necessary when a programmer disabled AVX512? I fully
> >>>> agree we need to pick the EVEX encoding by default if available, but I see
> >>>> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
> >>>> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
> >>>> design decision taken at Intel. Which is fine for a draft implementation.
> >>>> But decisions for an open source project should be taken in the open, and
> >>>> opinions of others should not simply be put off.
> >>>>
> >>>
> >>> We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
> >>> this patch.
> >>
> >> One can view it as orthogonal, yes, but if we change the model then doing
> >> so before more code and testcases need changing is imo preferable.
> >>
> >
> > We can skip the pseudo VEX prefix check when AVX512F is disabled.
>
> Let me see if I can pull ahead the patch I have (right now it's at the end
> of the 3rd series I have pending, when the 1st one continues to be debated),
> so the new cases in this series could then come on top.

Something like this:

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 01f84cb9a36..a9fd3115659 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -6458,8 +6458,9 @@ match_template (char mnem_suffix)

       /* Check Pseudo Prefix.  */
       if (t->opcode_modifier.pseudovexprefix
+    && cpu_arch_flags.bitfield.cpuavx512f
     && !(i.vec_encoding == vex_encoding_vex
-        || i.vec_encoding == vex_encoding_vex3))
+         || i.vec_encoding == vex_encoding_vex3))
   continue;

       /* Check AT&T mnemonic.   */

It works on existing tests.
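
The condition in the hunk above can be modeled in isolation to see which combinations cause a template to be skipped. The names below (skip_template, the enum constants, the boolean parameters) are stand-ins chosen for this sketch; they mirror, but are not, the fields used in match_template.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the requested vector encoding; ENC_DEFAULT means no
   pseudo prefix was given.  Hypothetical names.  */
enum vec_encoding { ENC_DEFAULT, ENC_VEX, ENC_VEX3, ENC_EVEX };

/* Return true when a template would be skipped by the modified check:
   it carries the pseudo-VEX-prefix attribute, AVX512F is enabled, and
   neither {vex} nor {vex3} was requested.  */
bool
skip_template (bool pseudo_vex_prefix, bool avx512f_enabled,
               enum vec_encoding requested)
{
  return pseudo_vex_prefix
         && avx512f_enabled
         && !(requested == ENC_VEX || requested == ENC_VEX3);
}

/* Usage: the four interesting combinations.  */
void
check_skip_template (void)
{
  /* AVX512F on, no prefix: the VEX template is skipped.  */
  assert (skip_template (true, true, ENC_DEFAULT));
  /* AVX512F on, {vex} given: the VEX template matches.  */
  assert (!skip_template (true, true, ENC_VEX));
  /* AVX512F off: the prefix is no longer required.  */
  assert (!skip_template (true, false, ENC_DEFAULT));
  /* Templates without the attribute are unaffected.  */
  assert (!skip_template (false, true, ENC_DEFAULT));
}
```

Note that the model keys on AVX512F only, as the hunk does; it says nothing about finer-grained AVX512 sub-features being disabled individually.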

> > There should be no testcase changes.
>
> Well - existing tests ought to continue to work, yes, but the prefix-less
> forms then also will want testing.
>
> Jan

Jan Beulich Oct. 20, 2022, 6:15 a.m. UTC | #9

On 19.10.2022 23:27, H.J. Lu wrote:
> On Tue, Oct 18, 2022 at 11:01 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 18.10.2022 23:28, H.J. Lu wrote:
>>> On Mon, Oct 17, 2022 at 10:33 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 18.10.2022 00:23, H.J. Lu wrote:
>>>>> On Sat, Oct 15, 2022 at 11:39 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>
>>>>>> On 14.10.2022 20:10, H.J. Lu wrote:
>>>>>>> On Fri, Oct 14, 2022 at 2:52 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>>>>
>>>>>>>> On 14.10.2022 11:12, Haochen Jiang wrote:
>>>>>>>>> From: wwwhhhyyy <hongyu.wang@intel.com>
>>>>>>>>>
>>>>>>>>> x86: Support Intel AVX-IFMA
>>>>>>>>>
>>>>>>>>> Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
>>>>>>>>> cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
>>>>>>>>> are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
>>>>>>>>> encoding for Intel IFMA instructions.
>>>>>>>>
>>>>>>>> I firmly object to the proliferation of this mis-feature. As expressed
>>>>>>>> before for AVX-VNNI, as long as the user has disabled AVX512 (or
>>>>>>>> respective sub-features thereof), there should be no need to use {vex} in
>>>>>>>> the source code. There's also no reason at all to make the disassembler
>>>>>>>> print {vex} prefixes - we don't do so for any other insns (apart from
>>>>>>>> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
>>>>>>>> (when none of the EVEX-specific features is used).
>>>>>>>
>>>>>>> The {vex} prefix is used with AVX-IFMA instructions so that IFMA instructions
>>>>>>> without a prefix, which are generated by compilers or handwritten, will be
>>>>>>> always encoded with EVEX.
>>>>>>
>>>>>> So again: Why is this necessary when a programmer disabled AVX512? I fully
>>>>>> agree we need to pick the EVEX encoding by default if available, but I see
>>>>>> no reason whatsoever to insist on a {vex} prefix when the EVEX variant is
>>>>>> unavailable anyway. As you said back at the time for AVX-VNNI - this was a
>>>>>> design decision taken at Intel. Which is fine for a draft implementation.
>>>>>> But decisions for an open source project should be taken in the open, and
>>>>>> opinions of others should not simply be put off.
>>>>>>
>>>>>
>>>>> We can discuss how to initialize i.vec_encoding.  But it is orthogonal to
>>>>> this patch.
>>>>
>>>> One can view it as orthogonal, yes, but if we change the model then doing
>>>> so before more code and testcases need changing is imo preferable.
>>>>
>>>
>>> We can skip the pseudo VEX prefix check when AVX512F is disabled.
>>
>> Let me see if I can pull ahead the patch I have (right now it's at the end
>> of the 3rd series I have pending, when the 1st one continues to be debated),
>> so the new cases in this series could then come on top.
> 
> Something like this:
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 01f84cb9a36..a9fd3115659 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -6458,8 +6458,9 @@ match_template (char mnem_suffix)
> 
>        /* Check Pseudo Prefix.  */
>        if (t->opcode_modifier.pseudovexprefix
> +    && cpu_arch_flags.bitfield.cpuavx512f
>      && !(i.vec_encoding == vex_encoding_vex
> -        || i.vec_encoding == vex_encoding_vex3))
> +         || i.vec_encoding == vex_encoding_vex3))
>    continue;
> 
>        /* Check AT&T mnemonic.   */
> 
> It works on existing tests.

But adds code rather than removing some (with the same effect). I see
you've approved my patch doing the latter, so I'll put it in. The new
ISA extensions then will want to follow suit. I also have the {evex}
disassembler patch mostly ready, but I'd like to, at the same time,
address testsuite anomalies that I've spotted along the road.

Jan

Frager, Neal via Binutils Oct. 24, 2022, 5:53 a.m. UTC | #10

> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Friday, October 14, 2022 5:53 PM
> To: Jiang, Haochen <haochen.jiang@intel.com>
> Cc: hjl.tools@gmail.com; Wang, Hongyu <hongyu.wang@intel.com>;
> binutils@sourceware.org
> Subject: Re: [PATCH 01/10] Support Intel AVX-IFMA
> 
> On 14.10.2022 11:12, Haochen Jiang wrote:
> > From: wwwhhhyyy <hongyu.wang@intel.com>
> >
> > x86: Support Intel AVX-IFMA
> >
> > Intel AVX IFMA instructions are marked with CpuVEX_PREFIX, which is
> > cleared by default.  Without {vex} pseudo prefix, Intel IFMA instructions
> > are encoded with EVEX prefix.  {vex} pseudo prefix will turn on VEX
> > encoding for Intel IFMA instructions.
> 
> I firmly object to the proliferation of this mis-feature. As expressed
> before for AVX-VNNI, as long as the user has disabled AVX512 (or
> respective sub-features thereof), there should be no need to use {vex} in
> the source code. There's also no reason at all to make the disassembler
> print {vex} prefixes - we don't do so for any other insns (apart from
> AVX-VNNI) where an ambiguity exists between their VEX and EVEX encodings
> (when none of the EVEX-specific features is used).
> 
> I actually have a patch queued to undo the odd behavior for AVX-VNNI, at
> least on the assembler side (which also drops the PseudoVexPrefix
> attribute).

I have rebased the patch onto the latest trunk and removed PseudoVexPrefix
from the table. I also added some testcases, as your patch did.

> 
> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> > @@ -1526,6 +1526,8 @@ enum
> >    VEX_W_0F385E_X86_64_P_3,
> >    VEX_W_0F3878,
> >    VEX_W_0F3879,
> > +  VEX_W_0F38B4,
> > +  VEX_W_0F38B5,
> >    VEX_W_0F38CF,
> >    VEX_W_0F3A00_L_1,
> >    VEX_W_0F3A01_L_1,
> > @@ -6293,8 +6295,8 @@ static const struct dis386 vex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { VEX_W_TABLE (VEX_W_0F38B4) },
> > +    { VEX_W_TABLE (VEX_W_0F38B5) },
> >      { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> >      { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
> >      /* b8 */
> > @@ -7599,6 +7601,16 @@ static const struct dis386 vex_w_table[][2] = {
> >      /* VEX_W_0F3879 */
> >      { "vpbroadcastw",	{ XM, EXw }, PREFIX_DATA },
> >    },
> > +  {
> > +    /* VEX_W_0F38B4 */
> > +    { Bad_Opcode },
> > +    { "%XV vpmadd52luq",	{ XM, Vex, EXx }, PREFIX_DATA },
> > +  },
> > +  {
> > +    /* VEX_W_0F38B5 */
> > +    { Bad_Opcode },
> > +    { "%XV vpmadd52huq",	{ XM, Vex, EXx }, PREFIX_DATA },
> > +  },
> 
> Irrespective of the aspect mentioned at the top I think this is yet
> another case where VEX and EVEX table entries can be shared. This would
> (if the {vex} printing really needs retaining for whatever obscure
> reason) merely require the processing of %XV to do nothing for EVEX-
> encoded insns, plus of course the separating blank would then also need
> to be included in the processing of %XV.
> 
> I guess I'll make a patch to fold the AVX-VNNI and AVX512-VNNI entries,
> which you could then re-base on top of.

Folded the table of AVX512IFMA and AVX-IFMA.
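
To make the folding idea concrete, here is a rough sketch of how a shared table entry could work if %XV expands to "{vex} " (blank included) for VEX encodings and to nothing for EVEX ones. This is an illustration under assumptions, not the actual i386-dis.c logic; the enum and function names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoding tag; i386-dis.c tracks this differently.  */
enum insn_encoding { ENC_VEX, ENC_EVEX };

/* Expand a leading "%XV" in TMPL and write the result to OUT.
   For VEX the macro becomes "{vex} " (including the separating blank);
   for EVEX it becomes the empty string, so one template serves both.  */
static void
expand_mnemonic (const char *tmpl, enum insn_encoding enc,
                 char *out, size_t outsz)
{
  const char *prefix = "";

  if (strncmp (tmpl, "%XV", 3) == 0)
    {
      tmpl += 3;
      if (enc == ENC_VEX)
        prefix = "{vex} ";
    }
  snprintf (out, outsz, "%s%s", prefix, tmpl);
}
```

With a template such as "%XVvpmadd52luq" (no blank after the macro), the same entry would print "{vex} vpmadd52luq" for the VEX form and plain "vpmadd52luq" for the EVEX form.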

> 
> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -245,6 +245,8 @@ static initializer cpu_flag_init[] =
> >      "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
> >    { "CPU_AVX512_FP16_FLAGS",
> >      "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
> > +  { "CPU_AVX_IFMA_FLAGS",
> > +    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
> >    { "CPU_IAMCU_FLAGS",
> >      "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
> >    { "CPU_ADX_FLAGS",
> > @@ -439,6 +441,8 @@ static initializer cpu_flag_init[] =
> >      "CpuHRESET" },
> >    { "CPU_ANY_AVX512_FP16_FLAGS",
> >      "CpuAVX512_FP16" },
> > +  { "CPU_ANY_AVX_IFMA_FLAGS",
> > +    "CpuAVX_IFMA" },
> 
> If AVX2 is taken as a prereq feature, then CPU_ANY_AVX2_FLAGS also needs
> adjustment, such that disabling of AVX2 also results in disabling of
> AVX-IFMA. (The same issue actually exists for AVX-VNNI afaics.)
> 

Added AVX-IFMA to CPU_ANY_AVX2_FLAGS.
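
The dependency being discussed can be sketched with plain bit masks. This is only a simplified model of the CPU_..._FLAGS / CPU_ANY_..._FLAGS pairing; the bit names and helper functions below are invented for the example and do not exist in i386-gen.c.

```c
#include <stdint.h>

/* Hypothetical feature bits, for illustration only.  */
#define F_AVX       (UINT32_C (1) << 0)
#define F_AVX2      (UINT32_C (1) << 1)
#define F_AVX_IFMA  (UINT32_C (1) << 2)

/* "Enable" masks: the feature plus its prerequisites.  */
#define ENABLE_AVX2      (F_AVX | F_AVX2)
#define ENABLE_AVX_IFMA  (ENABLE_AVX2 | F_AVX_IFMA)

/* "Any" masks: the feature plus everything that depends on it,
   i.e. what has to be cleared when the feature is disabled.  */
#define ANY_AVX_IFMA  (F_AVX_IFMA)
#define ANY_AVX2      (F_AVX2 | F_AVX_IFMA)

/* Turn on a feature together with its prerequisites.  */
static uint32_t
enable_feature (uint32_t flags, uint32_t enable_mask)
{
  return flags | enable_mask;
}

/* Turn off a feature together with its dependents.  */
static uint32_t
disable_feature (uint32_t flags, uint32_t any_mask)
{
  return flags & ~any_mask;
}
```

In this model, enabling AVX-IFMA and then disabling AVX2 leaves only the AVX bit set, which is the behaviour the review asks for: without AVX-IFMA in the AVX2 "any" mask, the AVX-IFMA bit would survive the disabling of its prerequisite.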

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -3263,3 +3263,10 @@ vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
> >  vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
> >
> >  // FP16 (HFNI) instructions end.
> > +
> > +// AVX_IFMA instructions.
> 
> Nit: Perhaps better use AVX-IFMA here, but I see we're having many examples
> of the (needless) use of underscores like this.
> 
> > +vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> > +vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
> 
> Please use plain VexVVVV (without =1) - we want to have as little clutter as
> possible on these usually already overlong lines.

Changed to VexVVVV.

Thanks for your review. Please let me know if anything still needs to be changed.

Haochen

> 
> Jan

H.J. Lu Oct. 24, 2022, 7:09 p.m. UTC | #11

On Sun, Oct 23, 2022 at 10:53 PM Jiang, Haochen <haochen.jiang@intel.com> wrote:
>
> Thanks for your review. Please let me know if anything still needs to be changed.

OK.

Thanks.

> Haochen
>
> >
> > Jan

Jan Beulich Oct. 25, 2022, 6:29 a.m. UTC | #12

On 24.10.2022 07:53, Jiang, Haochen wrote:
> Thanks for your review. Please let me know if anything still needs to be changed.

The only minor comment I have is that the insertion in opcodes/i386-opc.tbl would
probably be better placed right after the AVX512-IFMA group, rather than at the
very end of the file.

As a general remark: Please avoid sending patches as (only) attachments, as
it makes commenting more cumbersome. Also please send new versions as new
mails/threads rather than replying to the earlier version. This makes it
easier to track what belongs where.

Jan

Patch

diff --git a/gas/NEWS b/gas/NEWS
index 16cb347e77..7cf65728ba 100644
--- a/gas/NEWS
+++ b/gas/NEWS
@@ -1,5 +1,7 @@ 
 -*- text -*-
 
+* Add support for Intel AVX-IFMA instructions.
+
 * gas now supports --compress-debug-sections=zstd to compress
   debug sections with zstd.
 * Add --enable-default-compressed-debug-sections-algorithm={zlib,zstd}
diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 01f84cb9a3..2fe7674884 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -1094,6 +1094,7 @@  static const arch_entry cpu_arch[] =
   SUBARCH (uintr, UINTR, ANY_UINTR, false),
   SUBARCH (hreset, HRESET, ANY_HRESET, false),
   SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false),
+  SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false),
 };
 
 #undef SUBARCH
diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi
index c4a3967014..9e629605f8 100644
--- a/gas/doc/c-i386.texi
+++ b/gas/doc/c-i386.texi
@@ -213,6 +213,7 @@  accept various extension mnemonics.  For example,
 @code{avx512_bf16},
 @code{avx_vnni},
 @code{avx512_fp16},
+@code{avx_ifma},
 @code{noavx512f},
 @code{noavx512cd},
 @code{noavx512er},
@@ -233,6 +234,7 @@  accept various extension mnemonics.  For example,
 @code{noavx512_bf16},
 @code{noavx_vnni},
 @code{noavx512_fp16},
+@code{noavx_ifma},
 @code{noenqcmd},
 @code{noserialize},
 @code{notsxldtrk},
@@ -873,9 +875,9 @@  prefix which generates REX prefix unconditionally.
 @samp{@{nooptimize@}} -- disable instruction size optimization.
 @end itemize
 
-Mnemonics of Intel VNNI instructions are encoded with the EVEX prefix
+Mnemonics of Intel VNNI/IFMA instructions are encoded with the EVEX prefix
 by default.  The pseudo @samp{@{vex@}} prefix can be used to encode
-mnemonics of Intel VNNI instructions with the VEX prefix.
+mnemonics of Intel VNNI/IFMA instructions with the VEX prefix.
 
 @cindex conversion instructions, i386
 @cindex i386 conversion instructions
@@ -1533,6 +1535,7 @@  supported on the CPU specified.  The choices for @var{cpu_type} are:
 @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect}
 @item @samp{.tdx} @tab @samp{.avx_vnni}  @tab @samp{.avx512_fp16}
 @item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @samp{.ibt}
+@item @samp{.avx_ifma}
 @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote}
 @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq}
 @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk}
diff --git a/gas/testsuite/gas/i386/avx-ifma-intel.d b/gas/testsuite/gas/i386/avx-ifma-intel.d
new file mode 100644
index 0000000000..3b6bcce2a1
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-ifma-intel.d
@@ -0,0 +1,30 @@ 
+#as:
+#objdump: -dw -Mintel
+#name: i386 AVX IFMA insns (Intel disassembly)
+#source: avx-ifma.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 d2[ 	]*\{vex\} vpmadd52huq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 11[ 	]*\{vex\} vpmadd52huq xmm2,xmm4,XMMWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 d2[ 	]*\{vex\} vpmadd52huq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 11[ 	]*\{vex\} vpmadd52huq ymm2,ymm4,YMMWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b4 d2[ 	]*vpmadd52luq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b4 d2[ 	]*vpmadd52luq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 d2[ 	]*\{vex\} vpmadd52luq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 11[ 	]*\{vex\} vpmadd52luq xmm2,xmm4,XMMWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b4 d2[ 	]*vpmadd52luq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b4 d2[ 	]*vpmadd52luq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 d2[ 	]*\{vex\} vpmadd52luq ymm2,ymm4,ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 11[ 	]*\{vex\} vpmadd52luq ymm2,ymm4,YMMWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq xmm2,xmm4,xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq ymm2,ymm4,ymm2
+#pass
diff --git a/gas/testsuite/gas/i386/avx-ifma-inval.l b/gas/testsuite/gas/i386/avx-ifma-inval.l
new file mode 100644
index 0000000000..f706972175
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-ifma-inval.l
@@ -0,0 +1,2 @@ 
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpmadd52huq'
diff --git a/gas/testsuite/gas/i386/avx-ifma-inval.s b/gas/testsuite/gas/i386/avx-ifma-inval.s
new file mode 100644
index 0000000000..0697ab2215
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-ifma-inval.s
@@ -0,0 +1,6 @@ 
+# Check illegal AVX-IFMA instructions
+
+	.text
+	.arch .noavx512ifma
+_start:
+	vpmadd52huq %xmm2,%xmm4,%xmm2
diff --git a/gas/testsuite/gas/i386/avx-ifma.d b/gas/testsuite/gas/i386/avx-ifma.d
new file mode 100644
index 0000000000..50c24947db
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-ifma.d
@@ -0,0 +1,30 @@ 
+#as:
+#objdump: -dw
+#name: i386 AVX IFMA insns
+#source: avx-ifma.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 d2[ 	]*\{vex\} vpmadd52huq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 11[ 	]*\{vex\} vpmadd52huq \(%ecx\),%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 d2[ 	]*\{vex\} vpmadd52huq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 11[ 	]*\{vex\} vpmadd52huq \(%ecx\),%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b4 d2[ 	]*vpmadd52luq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b4 d2[ 	]*vpmadd52luq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 d2[ 	]*\{vex\} vpmadd52luq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 11[ 	]*\{vex\} vpmadd52luq \(%ecx\),%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b4 d2[ 	]*vpmadd52luq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b4 d2[ 	]*vpmadd52luq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 d2[ 	]*\{vex\} vpmadd52luq %ymm2,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 11[ 	]*\{vex\} vpmadd52luq \(%ecx\),%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 08 b5 d2[ 	]*vpmadd52huq %xmm2,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 f2 dd 28 b5 d2[ 	]*vpmadd52huq %ymm2,%ymm4,%ymm2
+#pass
diff --git a/gas/testsuite/gas/i386/avx-ifma.s b/gas/testsuite/gas/i386/avx-ifma.s
new file mode 100644
index 0000000000..983b48ebcb
--- /dev/null
+++ b/gas/testsuite/gas/i386/avx-ifma.s
@@ -0,0 +1,21 @@ 
+       .allow_index_reg
+
+.macro test_insn mnemonic
+       \mnemonic	%xmm2, %xmm4, %xmm2
+       {evex} \mnemonic %xmm2, %xmm4, %xmm2
+       {vex}  \mnemonic %xmm2, %xmm4, %xmm2
+       {vex}  \mnemonic (%ecx), %xmm4, %xmm2
+       \mnemonic	%ymm2, %ymm4, %ymm2
+       {evex} \mnemonic %ymm2, %ymm4, %ymm2
+       {vex}  \mnemonic %ymm2, %ymm4, %ymm2
+       {vex}  \mnemonic (%ecx), %ymm4, %ymm2
+.endm
+
+       .text
+_start:
+       test_insn vpmadd52huq
+       test_insn vpmadd52luq
+
+       .arch .avx_ifma
+        vpmadd52huq       %xmm2, %xmm4, %xmm2
+        vpmadd52huq       %ymm2, %ymm4, %ymm2
diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp
index 0ad2b6a818..3a46807e4f 100644
--- a/gas/testsuite/gas/i386/i386.exp
+++ b/gas/testsuite/gas/i386/i386.exp
@@ -478,6 +478,9 @@  if [gas_32_check] then {
     run_list_test "avx512_bf16_vl-inval"
     run_dump_test "avx-vnni"
     run_list_test "avx-vnni-inval"
+    run_dump_test "avx-ifma"
+    run_dump_test "avx-ifma-intel"
+    run_list_test "avx-ifma-inval"
     run_list_test "sg"
     run_dump_test "clzero"
     run_dump_test "invlpgb"
@@ -1145,6 +1148,9 @@  if [gas_64_check] then {
     run_list_test "x86-64-avx512_bf16_vl-inval"
     run_dump_test "x86-64-avx-vnni"
     run_list_test "x86-64-avx-vnni-inval"
+    run_dump_test "x86-64-avx-ifma"
+    run_dump_test "x86-64-avx-ifma-intel"
+    run_list_test "x86-64-avx-ifma-inval"
     run_dump_test "x86-64-clzero"
     run_dump_test "x86-64-mwaitx-bdver4"
     run_list_test "x86-64-mwaitx-reg"
diff --git a/gas/testsuite/gas/i386/noavx512-1.l b/gas/testsuite/gas/i386/noavx512-1.l
index 15a6fc689b..75c28afafb 100644
--- a/gas/testsuite/gas/i386/noavx512-1.l
+++ b/gas/testsuite/gas/i386/noavx512-1.l
@@ -37,9 +37,9 @@ 
 .*:120: Error: .*not supported.*
 .*:121: Error: .*not supported.*
 .*:122: Error: .*not supported.*
-.*:126: Error: .*not supported.*
-.*:127: Error: .*not supported.*
-.*:128: Error: .*not supported.*
+.*:126: Error: .*unsupported instruction.*
+.*:127: Error: .*unsupported instruction.*
+.*:128: Error: .*unsupported instruction.*
 .*:135: Error: .*operand size mismatch.*
 .*:136: Error: .*unsupported masking.*
 .*:137: Error: .*unsupported masking.*
@@ -50,9 +50,9 @@ 
 .*:142: Error: .*not supported.*
 .*:143: Error: .*not supported.*
 .*:144: Error: .*not supported.*
-.*:148: Error: .*not supported.*
-.*:149: Error: .*not supported.*
-.*:150: Error: .*not supported.*
+.*:148: Error: .*unsupported instruction.*
+.*:149: Error: .*unsupported instruction.*
+.*:150: Error: .*unsupported instruction.*
 .*:151: Error: .*not supported.*
 .*:157: Error: .*operand size mismatch.*
 .*:158: Error: .*unsupported masking.*
@@ -64,9 +64,9 @@ 
 .*:164: Error: .*not supported.*
 .*:165: Error: .*not supported.*
 .*:166: Error: .*not supported.*
-.*:170: Error: .*not supported.*
-.*:171: Error: .*not supported.*
-.*:172: Error: .*not supported.*
+.*:170: Error: .*unsupported instruction.*
+.*:171: Error: .*unsupported instruction.*
+.*:172: Error: .*unsupported instruction.*
 .*:173: Error: .*not supported.*
 .*:174: Error: .*not supported.*
 .*:175: Error: .*not supported.*
@@ -84,9 +84,9 @@ 
 .*:189: Error: .*bad register name.*
 .*:190: Error: .*unknown vector operation.*
 .*:191: Error: .*unknown vector operation.*
-.*:192: Error: .*not supported.*
-.*:193: Error: .*not supported.*
-.*:194: Error: .*not supported.*
+.*:192: Error: .*bad register name.*
+.*:193: Error: .*unknown vector operation.*
+.*:194: Error: .*unknown vector operation.*
 .*:195: Error: .*not supported.*
 .*:196: Error: .*not supported.*
 .*:197: Error: .*not supported.*
diff --git a/gas/testsuite/gas/i386/x86-64-avx-ifma-intel.d b/gas/testsuite/gas/i386/x86-64-avx-ifma-intel.d
new file mode 100644
index 0000000000..0b3b053e5d
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-ifma-intel.d
@@ -0,0 +1,34 @@ 
+#as:
+#objdump: -dw -Mintel
+#name: x86-64 AVX IFMA insns (Intel disassembly)
+#source: x86-64-avx-ifma.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 d9 b5 d4[ 	]*\{vex\} vpmadd52huq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 11[ 	]*\{vex\} vpmadd52huq xmm2,xmm4,XMMWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 08 b5 d6[ 	]*vpmadd52huq xmm2,xmm4,xmm22
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 dd b5 d4[ 	]*\{vex\} vpmadd52huq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 11[ 	]*\{vex\} vpmadd52huq ymm2,ymm4,YMMWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 28 b5 d6[ 	]*vpmadd52huq ymm2,ymm4,ymm22
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b4 d4[ 	]*vpmadd52luq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b4 d4[ 	]*vpmadd52luq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 d9 b4 d4[ 	]*\{vex\} vpmadd52luq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 11[ 	]*\{vex\} vpmadd52luq xmm2,xmm4,XMMWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 08 b4 d6[ 	]*vpmadd52luq xmm2,xmm4,xmm22
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b4 d4[ 	]*vpmadd52luq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b4 d4[ 	]*vpmadd52luq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 dd b4 d4[ 	]*\{vex\} vpmadd52luq ymm2,ymm4,ymm12
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 11[ 	]*\{vex\} vpmadd52luq ymm2,ymm4,YMMWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 28 b4 d6[ 	]*vpmadd52luq ymm2,ymm4,ymm22
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq xmm2,xmm4,xmm12
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq ymm2,ymm4,ymm12
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.l b/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.l
new file mode 100644
index 0000000000..57a7f16807
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.l
@@ -0,0 +1,3 @@ 
+.* Assembler messages:
+.*:6: Error: unsupported instruction `vpmadd52huq'
+.*:7: Error: unsupported instruction `vpmadd52huq'
diff --git a/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.s b/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.s
new file mode 100644
index 0000000000..0e37bf2361
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-ifma-inval.s
@@ -0,0 +1,7 @@ 
+# Check illegal AVX-IFMA instructions
+
+	.text
+	.arch .noavx512ifma
+_start:
+	vpmadd52huq %xmm2, %xmm4, %xmm2
+	vpmadd52huq %xmm22, %xmm4, %xmm2
diff --git a/gas/testsuite/gas/i386/x86-64-avx-ifma.d b/gas/testsuite/gas/i386/x86-64-avx-ifma.d
new file mode 100644
index 0000000000..b1670b68b6
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-ifma.d
@@ -0,0 +1,34 @@ 
+#as:
+#objdump: -dw
+#name: x86-64 AVX IFMA insns
+#source: x86-64-avx-ifma.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 d9 b5 d4[ 	]*\{vex\} vpmadd52huq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b5 11[ 	]*\{vex\} vpmadd52huq \(%rcx\),%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 08 b5 d6[ 	]*vpmadd52huq %xmm22,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 dd b5 d4[ 	]*\{vex\} vpmadd52huq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b5 11[ 	]*\{vex\} vpmadd52huq \(%rcx\),%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 28 b5 d6[ 	]*vpmadd52huq %ymm22,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b4 d4[ 	]*vpmadd52luq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b4 d4[ 	]*vpmadd52luq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 d9 b4 d4[ 	]*\{vex\} vpmadd52luq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 d9 b4 11[ 	]*\{vex\} vpmadd52luq \(%rcx\),%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 08 b4 d6[ 	]*vpmadd52luq %xmm22,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b4 d4[ 	]*vpmadd52luq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b4 d4[ 	]*vpmadd52luq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 c2 dd b4 d4[ 	]*\{vex\} vpmadd52luq %ymm12,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*c4 e2 dd b4 11[ 	]*\{vex\} vpmadd52luq \(%rcx\),%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 b2 dd 28 b4 d6[ 	]*vpmadd52luq %ymm22,%ymm4,%ymm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 08 b5 d4[ 	]*vpmadd52huq %xmm12,%xmm4,%xmm2
+[ 	]*[a-f0-9]+:[ 	]*62 d2 dd 28 b5 d4[ 	]*vpmadd52huq %ymm12,%ymm4,%ymm2
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-avx-ifma.s b/gas/testsuite/gas/i386/x86-64-avx-ifma.s
new file mode 100644
index 0000000000..bfc524a103
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-avx-ifma.s
@@ -0,0 +1,23 @@ 
+       .allow_index_reg
+
+.macro test_insn mnemonic
+       \mnemonic	%xmm12, %xmm4, %xmm2
+       {evex} \mnemonic %xmm12, %xmm4, %xmm2
+       {vex}  \mnemonic %xmm12, %xmm4, %xmm2
+       {vex}  \mnemonic (%rcx), %xmm4, %xmm2
+       \mnemonic	%xmm22, %xmm4, %xmm2
+       \mnemonic	%ymm12, %ymm4, %ymm2
+       {evex} \mnemonic %ymm12, %ymm4, %ymm2
+       {vex}  \mnemonic %ymm12, %ymm4, %ymm2
+       {vex}  \mnemonic (%rcx), %ymm4, %ymm2
+       \mnemonic	%ymm22, %ymm4, %ymm2
+.endm
+
+       .text
+_start:
+       test_insn vpmadd52huq
+       test_insn vpmadd52luq
+
+       .arch .avx_ifma
+        vpmadd52huq       %xmm12, %xmm4, %xmm2
+        vpmadd52huq       %ymm12, %ymm4, %ymm2
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index 98d3ecd9f0..74f71467c5 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -1526,6 +1526,8 @@  enum
   VEX_W_0F385E_X86_64_P_3,
   VEX_W_0F3878,
   VEX_W_0F3879,
+  VEX_W_0F38B4,
+  VEX_W_0F38B5,
   VEX_W_0F38CF,
   VEX_W_0F3A00_L_1,
   VEX_W_0F3A01_L_1,
@@ -6293,8 +6295,8 @@  static const struct dis386 vex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { VEX_W_TABLE (VEX_W_0F38B4) },
+    { VEX_W_TABLE (VEX_W_0F38B5) },
     { "vfmaddsub231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
     { "vfmsubadd231p%XW", { XM, Vex, EXx }, PREFIX_DATA },
     /* b8 */
@@ -7599,6 +7601,16 @@  static const struct dis386 vex_w_table[][2] = {
     /* VEX_W_0F3879 */
     { "vpbroadcastw",	{ XM, EXw }, PREFIX_DATA },
   },
+  {
+    /* VEX_W_0F38B4 */
+    { Bad_Opcode },
+    { "%XV vpmadd52luq",	{ XM, Vex, EXx }, PREFIX_DATA },
+  },
+  {
+    /* VEX_W_0F38B5 */
+    { Bad_Opcode },
+    { "%XV vpmadd52huq",	{ XM, Vex, EXx }, PREFIX_DATA },
+  },
   {
     /* VEX_W_0F38CF */
     { "vgf2p8mulb", { XM, Vex, EXx }, PREFIX_DATA },
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 6d54b472ab..cf7b5ece2a 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -245,6 +245,8 @@  static initializer cpu_flag_init[] =
     "CPU_AVX512F_FLAGS|CpuAVX512_BF16" },
   { "CPU_AVX512_FP16_FLAGS",
     "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" },
+  { "CPU_AVX_IFMA_FLAGS",
+    "CPU_AVX2_FLAGS|CpuAVX_IFMA" },
   { "CPU_IAMCU_FLAGS",
     "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" },
   { "CPU_ADX_FLAGS",
@@ -439,6 +441,8 @@  static initializer cpu_flag_init[] =
     "CpuHRESET" },
   { "CPU_ANY_AVX512_FP16_FLAGS",
     "CpuAVX512_FP16" },
+  { "CPU_ANY_AVX_IFMA_FLAGS",
+    "CpuAVX_IFMA" },
 };
 
 static initializer operand_type_init[] =
@@ -640,6 +644,7 @@  static bitfield cpu_flags[] =
   BITFIELD (CpuTDX),
   BITFIELD (CpuAVX_VNNI),
   BITFIELD (CpuAVX512_FP16),
+  BITFIELD (CpuAVX_IFMA),
   BITFIELD (CpuMWAITX),
   BITFIELD (CpuCLZERO),
   BITFIELD (CpuOSPKE),
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index c033aeb8e0..8ff66d42cc 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -209,6 +209,8 @@  enum
   CpuAVX_VNNI,
   /* Intel AVX-512 FP16 Instructions support required.  */
   CpuAVX512_FP16,
+  /* Intel AVX IFMA Instructions support required.  */
+  CpuAVX_IFMA,
   /* mwaitx instruction required */
   CpuMWAITX,
   /* Clzero instruction required */
@@ -388,6 +390,7 @@  typedef union i386_cpu_flags
       unsigned int cputdx:1;
       unsigned int cpuavx_vnni:1;
       unsigned int cpuavx512_fp16:1;
+      unsigned int cpuavx_ifma:1;
       unsigned int cpumwaitx:1;
       unsigned int cpuclzero:1;
       unsigned int cpuospke:1;
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index fbd48c203a..d39c676abf 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -3263,3 +3263,10 @@  vrsqrtph, 0x664e, None, CpuAVX512_FP16, Modrm|Masking=3|EVexMap6|VexW0|Broadcast
 vrsqrtsh, 0x664f, None, CpuAVX512_FP16, Modrm|EVexLIG|Masking=3|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // FP16 (HFNI) instructions end.
+
+// AVX_IFMA instructions.
+
+vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+
+// AVX_IFMA instructions end.