[5/8] Support APX NDD

Message ID 20231102112911.2372810-6-lili.cui@intel.com
State Unresolved
Headers
Series Support Intel APX EGPR |

Checks

Context Check Description
snail/binutils-gdb-check warning Git am fail log

Commit Message

Cui, Lili Nov. 2, 2023, 11:29 a.m. UTC
  From: konglin1 <lingling.kong@intel.com>

opcodes/ChangeLog:

	* opcodes/i386-dis-evex-prefix.h: Add NDD decode for adox/adcx.
	* opcodes/i386-dis-evex-reg.h: Handle for REG_EVEX_MAP4_80,
	REG_EVEX_MAP4_81, REG_EVEX_MAP4_83,  REG_EVEX_MAP4_F6,
	REG_EVEX_MAP4_F7, REG_EVEX_MAP4_FE, REG_EVEX_MAP4_FF.
	* opcodes/i386-dis-evex.h: Add NDD insn.
	* opcodes/i386-dis.c (VexGb): Add new define.
	(VexGv): Ditto.
	(get_valid_dis386): Change for NDD decode.
	(print_insn): Ditto.
	(print_register): Ditto.
	(intel_operand_size): Ditto.
	(OP_E_memory): Ditto.
	(OP_VEX): Ditto.
	* opcodes/i386-opc.h (VexVVVV_SRC): New.
	VexVVVV_DST):  Ditto.
	* opcodes/i386-opc.tbl: Add APX NDD instructions and adjust VexVVVV.
	* opcodes/i386-tbl.h: Regenerated.

gas/ChangeLog:

	* gas/config/tc-i386.c (is_any_apx_evex_encoding): Add legacy insn
	promote to SPACE_EVEXMAP4.
	(md_assemble): Change for ndd encode.
	(process_operands): Ditto.
	(build_modrm_byte): Ditto.
	(operand_size_match):
	Support APX NDD that the number of operands is 3.
	(match_template): Support swap the first two operands for
	APX NDD.
	reg_table
	* testsuite/gas/i386/x86-64.exp: Add x86-64-apx-ndd.
	* testsuite/gas/i386/x86-64-apx-ndd.d: New test.
	* testsuite/gas/i386/x86-64-apx-ndd.s: Ditto.
	* testsuite/gas/i386/x86-64-pseudos.d: Add test.
	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d : Ditto.
	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s : Ditto.
---
 gas/config/tc-i386.c                          |  111 +-
 .../gas/i386/x86-64-apx-evex-promoted-bad.d   |    4 +
 .../gas/i386/x86-64-apx-evex-promoted-bad.s   |    5 +-
 gas/testsuite/gas/i386/x86-64-apx-ndd.d       |  161 +++
 gas/testsuite/gas/i386/x86-64-apx-ndd.s       |  154 +++
 gas/testsuite/gas/i386/x86-64-pseudos.d       |   42 +
 gas/testsuite/gas/i386/x86-64-pseudos.s       |   43 +
 gas/testsuite/gas/i386/x86-64.exp             |    1 +
 opcodes/i386-dis-evex-prefix.h                |    4 -
 opcodes/i386-dis-evex-reg.h                   |   54 +
 opcodes/i386-dis-evex.h                       |  126 +-
 opcodes/i386-dis.c                            |  145 +-
 opcodes/i386-gen.c                            |    1 +
 opcodes/i386-opc.h                            |    9 +-
 opcodes/i386-opc.tbl                          | 1231 +++++++++--------
 15 files changed, 1354 insertions(+), 737 deletions(-)
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
 create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
  

Comments

Jan Beulich Nov. 8, 2023, 10:39 a.m. UTC | #1
On 02.11.2023 12:29, Cui, Lili wrote:
> From: konglin1 <lingling.kong@intel.com>
> 
> opcodes/ChangeLog:
> 
> 	* opcodes/i386-dis-evex-prefix.h: Add NDD decode for adox/adcx.
> 	* opcodes/i386-dis-evex-reg.h: Handle for REG_EVEX_MAP4_80,
> 	REG_EVEX_MAP4_81, REG_EVEX_MAP4_83,  REG_EVEX_MAP4_F6,
> 	REG_EVEX_MAP4_F7, REG_EVEX_MAP4_FE, REG_EVEX_MAP4_FF.
> 	* opcodes/i386-dis-evex.h: Add NDD insn.
> 	* opcodes/i386-dis.c (VexGb): Add new define.
> 	(VexGv): Ditto.
> 	(get_valid_dis386): Change for NDD decode.
> 	(print_insn): Ditto.
> 	(print_register): Ditto.
> 	(intel_operand_size): Ditto.
> 	(OP_E_memory): Ditto.
> 	(OP_VEX): Ditto.
> 	* opcodes/i386-opc.h (VexVVVV_SRC): New.
> 	VexVVVV_DST):  Ditto.
> 	* opcodes/i386-opc.tbl: Add APX NDD instructions and adjust VexVVVV.
> 	* opcodes/i386-tbl.h: Regenerated.
> 
> gas/ChangeLog:
> 
> 	* gas/config/tc-i386.c (is_any_apx_evex_encoding): Add legacy insn
> 	promote to SPACE_EVEXMAP4.
> 	(md_assemble): Change for ndd encode.
> 	(process_operands): Ditto.
> 	(build_modrm_byte): Ditto.
> 	(operand_size_match):
> 	Support APX NDD that the number of operands is 3.
> 	(match_template): Support swap the first two operands for
> 	APX NDD.
> 	reg_table
> 	* testsuite/gas/i386/x86-64.exp: Add x86-64-apx-ndd.
> 	* testsuite/gas/i386/x86-64-apx-ndd.d: New test.
> 	* testsuite/gas/i386/x86-64-apx-ndd.s: Ditto.
> 	* testsuite/gas/i386/x86-64-pseudos.d: Add test.
> 	* testsuite/gas/i386/x86-64-pseudos.s: Ditto.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d : Ditto.
> 	* testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s : Ditto.
> ---
>  gas/config/tc-i386.c                          |  111 +-
>  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |    4 +
>  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |    5 +-
>  gas/testsuite/gas/i386/x86-64-apx-ndd.d       |  161 +++
>  gas/testsuite/gas/i386/x86-64-apx-ndd.s       |  154 +++
>  gas/testsuite/gas/i386/x86-64-pseudos.d       |   42 +
>  gas/testsuite/gas/i386/x86-64-pseudos.s       |   43 +
>  gas/testsuite/gas/i386/x86-64.exp             |    1 +
>  opcodes/i386-dis-evex-prefix.h                |    4 -
>  opcodes/i386-dis-evex-reg.h                   |   54 +
>  opcodes/i386-dis-evex.h                       |  126 +-
>  opcodes/i386-dis.c                            |  145 +-
>  opcodes/i386-gen.c                            |    1 +
>  opcodes/i386-opc.h                            |    9 +-
>  opcodes/i386-opc.tbl                          | 1231 +++++++++--------
>  15 files changed, 1354 insertions(+), 737 deletions(-)
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
> 
> diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
> index 398909a6a30..5b925505435 100644
> --- a/gas/config/tc-i386.c
> +++ b/gas/config/tc-i386.c
> @@ -2317,8 +2317,10 @@ operand_size_match (const insn_template *t)
>        unsigned int given = i.operands - j - 1;
>  
>        /* For FMA4 and XOP insns VEX.W controls just the first two
> -	 register operands.  */
> -      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP))
> +	 register operands. And APX_F insns just swap the two source operands,
> +	 with the 3rd one being the destination.  */
> +      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP)
> +	  || is_cpu (t,CpuAPX_F))
>  	given = j < 2 ? 1 - j : j;

Nit: Please retain consistency wrt style (here: missing blank after comma
in the addition). Feels like I said so before.

> @@ -3959,6 +3961,7 @@ static INLINE bool
>  is_any_apx_evex_encoding (void)
>  {
>    return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4 
> +    || i.rex2_encoding
>      || (i.vex.register_specifier
>  	&& i.vex.register_specifier->reg_flags & RegRex2);
>  }

See my comment on an earlier patch regarding the use of i.rex2 here. But
I doubt this is correct in the first place: Why would {rex2} cause EVEX
encoding to be picked?

> @@ -7481,26 +7484,33 @@ match_template (char mnem_suffix)
>  	  overlap1 = operand_type_and (operand_types[0], operand_types[1]);
>  	  if (t->opcode_modifier.d && i.reg_operands == i.operands
>  	      && !operand_type_all_zero (&overlap1))
> -	    switch (i.dir_encoding)
> -	      {
> -	      case dir_encoding_load:
> -		if (operand_type_check (operand_types[i.operands - 1], anymem)
> -		    || t->opcode_modifier.regmem)
> -		  goto check_reverse;
> -		break;
> +	    {
>  
> -	      case dir_encoding_store:
> -		if (!operand_type_check (operand_types[i.operands - 1], anymem)
> -		    && !t->opcode_modifier.regmem)
> -		  goto check_reverse;
> -		break;
> +	      int MemOperand = i.operands - 1 -
> +		(t->opcode_space == SPACE_EVEXMAP4
> +		 && t->opcode_modifier.vexvvvv);

Nit: I don't think local variables should start with a capital letter. I
wonder anyway - can't you just re-use e.g. j here? That would then also
avoid the misleading name: Right here you don't know yet whether that's
the "memory" operand. Finally, if you set the variable ahead if the
enclosing if(), you could (I think) avoid all this re-indentation,
making the change quite a bit easier to review (i.e. to see what really
changes).

> @@ -7530,11 +7540,13 @@ match_template (char mnem_suffix)
>  		continue;
>  	      /* Try reversing direction of operands.  */
>  	      j = is_cpu (t, CpuFMA4)
> -		  || is_cpu (t, CpuXOP) ? 1 : i.operands - 1;
> +		  || is_cpu (t, CpuXOP)
> +		  || is_cpu (t, CpuAPX_F) ? 1 : i.operands - 1;
>  	      overlap0 = operand_type_and (i.types[0], operand_types[j]);
>  	      overlap1 = operand_type_and (i.types[j], operand_types[0]);
>  	      overlap2 = operand_type_and (i.types[1], operand_types[1]);
> -	      gas_assert (t->operands != 3 || !check_register);
> +	      gas_assert (t->operands != 3 || !check_register
> +			  || is_cpu (t,CpuAPX_F));

Nit: Missing blank again. And again in the next hunk. I won't comment
on such any further, expecting you to go through globally.

> @@ -8588,11 +8609,10 @@ process_operands (void)
>    const reg_entry *default_seg = NULL;
>  
>    /* We only need to check those implicit registers for instructions
> -     with 3 operands or less.  */
> -  if (i.operands <= 3)
> -    for (unsigned int j = 0; j < i.operands; j++)
> -      if (i.types[j].bitfield.instance != InstanceNone)
> -	i.reg_operands--;
> +     with 4 operands or less.  */
> +  for (unsigned int j = 0; j < i.operands; j++)
> +    if (i.types[j].bitfield.instance != InstanceNone)
> +      i.reg_operands--;

While you made the requested code adjustment, adjusting the comment
renders it stale now. It needs dropping instead, as it did only
explain the if(), not the for().

> @@ -8946,26 +8966,35 @@ build_modrm_byte (void)
>  				     || i.vec_encoding == vex_encoding_evex));
>      }
>  
> -  for (v = source + 1; v < dest; ++v)
> -    if (v != reg_slot)
> -      break;
> -  if (v >= dest)
> -    v = ~0;
> -  if (i.tm.extension_opcode != None)
> +  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
>      {
> -      if (dest != source)
> -	v = dest;
> -      dest = ~0;
> +      v = dest;
> +      dest-- ;
>      }
> -  gas_assert (source < dest);
> -  if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> -      && source != op)
> +  else if (i.tm.opcode_modifier.vexvvvv == VexVVVV_SRC)
>      {
> -      unsigned int tmp = source;
> +      v = source + 1;
> +      for (v = source + 1; v < dest; ++v)
> +	if (v != reg_slot)
> +	  break;
> +      if (i.tm.extension_opcode != None)
> +	{
> +	  if (dest != source)
> +	    v = dest;
> +	  dest = ~0;
> +	}
> +      gas_assert (source < dest);
> +      if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> +	  && source != op)
> +	{
> +	  unsigned int tmp = source;
>  
> -      source = v;
> -      v = tmp;
> +	  source = v;
> +	  v = tmp;
> +	}
>      }
> +  else
> +    v = ~0;
>  
>    if (v < MAX_OPERANDS)
>      {

I'm having trouble following this change. The VexVVVV-is-source case
shouldn't change at all, I'd expect. This would ideally be easily
visible from the change done. Yet if looking a little more closely I
can e.g. spot a stray "v = source + 1;" which wasn't there before. And
there also look to be things being dropped. Such a change (again) wants
doing such that it is easy to see what changes. If need be by making a
mechanical prereq change doing just re-indentation, but nothing else.
It feels though as if there is too much changing here anyway: The
difference for VexVVVV-is-dest is that you need to consume the
destination early. If you do so, then the rest should be possible to
keep as is: You'll subsequently deal with just the normal ModR/M, as if
without a VexVVVV-encoded operand.

> --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> @@ -10,7 +10,7 @@ _start:
>          .byte 0x62, 0xfc, 0x7f, 0x08, 0x60, 0xc2
>          .byte 0xff, 0xff
>          #VSIB vpgatherqq 0x7b(%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
> -	#(illegal value).
> +        #(illegal value).
>          .byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd, 0x7b, 0x00, 0x00, 0x00
>          .byte 0xff
>          #EVEX_MAP4 movbe %r18w,%ax set EVEX.mm == b01 (illegal value).

???

> @@ -27,3 +27,6 @@ _start:
>          .byte 0xff
>          #EVEX from VEX enqcmd 0x123(%r31,%rax,4),%r31 EVEX.P[23:22] == 1 (illegal value).
>          .byte 0x62, 0x4c, 0x7f, 0x28, 0xf8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00
> +        .byte 0xff

This then similarly looks to belong into the earlier patch, suggesting
that it had #pass too early.

> +        #{evex} inc %rax EVEX.vvvv' > 0 (illegal value).
> +        .byte 0x62, 0xf4, 0xec, 0x08, 0xff, 0xc0

Again I suspect this can be expressed via .insn, thus ending up more clear.
I can only stress again that one of the reasons of introducing .insn was to
reduce the number of such entirely unreadable byte sequences.

> --- /dev/null
> +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
> @@ -0,0 +1,154 @@
> +# Check 64bit APX NDD instructions with evex prefix encoding
> +
> +	.allow_index_reg
> +	.text
> +_start:
> +inc    %rax,%rbx

Please can instructions be indented by a tab?

> +inc    %r31,%r8
> +inc    %r31,%r16
> +add    %r31b,%r8b,%r16b
> +addb    %r31b,%r8b,%r16b
> +add    %r31,%r8,%r16
> +addq    %r31,%r8,%r16
> +add    %r31d,%r8d,%r16d
> +addl    %r31d,%r8d,%r16d
> +add    %r31w,%r8w,%r16w
> +addw    %r31w,%r8w,%r16w
> +{store} add    %r31,%r8,%r16
> +{load}  add    %r31,%r8,%r16
> +add    %r31,(%r8),%r16
> +add    (%r31),%r8,%r16
> +add    0x9090(%r31,%r16,1),%r8,%r16
> +add    %r31,(%r8,%r16,8),%r16
> +add    $0x34,%r13b,%r17b
> +addl   $0x11,(%r19,%rax,4),%r20d
> +add    $0x1234,%ax,%r30w
> +add    $0x12344433,%r15,%r16
> +addq   $0x12344433,(%r15,%rcx,4),%r16
> +add    $0xfffffffff4332211,%rax,%r8
> +dec    %rax,%r17
> +decb   (%r31,%r12,1),%r8b
> +not    %rax,%r17
> +notb   (%r31,%r12,1),%r8b
> +neg    %rax,%r17
> +negb   (%r31,%r12,1),%r8b
> +sub    %r15b,%r17b,%r18b
> +sub    %r15d,(%r8),%r18d
> +sub    (%r15,%rax,1),%r16b,%r8b
> +sub    (%r15,%rax,1),%r16w,%r8w
> +subl   $0x11,(%r19,%rax,4),%r20d
> +sub    $0x1234,%ax,%r30w
> +sbb    %r15b,%r17b,%r18b
> +sbb    %r15d,(%r8),%r18d
> +sbb    (%r15,%rax,1),%r16b,%r8b
> +sbb    (%r15,%rax,1),%r16w,%r8w
> +sbbl   $0x11,(%r19,%rax,4),%r20d
> +sbb    $0x1234,%ax,%r30w
> +adc    %r15b,%r17b,%r18b
> +adc    %r15d,(%r8),%r18d
> +adc    (%r15,%rax,1),%r16b,%r8b
> +adc    (%r15,%rax,1),%r16w,%r8w
> +adcl   $0x11,(%r19,%rax,4),%r20d
> +adc    $0x1234,%ax,%r30w
> +or     %r15b,%r17b,%r18b
> +or     %r15d,(%r8),%r18d
> +or     (%r15,%rax,1),%r16b,%r8b
> +or     (%r15,%rax,1),%r16w,%r8w
> +orl    $0x11,(%r19,%rax,4),%r20d
> +or     $0x1234,%ax,%r30w
> +xor    %r15b,%r17b,%r18b
> +xor    %r15d,(%r8),%r18d
> +xor    (%r15,%rax,1),%r16b,%r8b
> +xor    (%r15,%rax,1),%r16w,%r8w
> +xorl   $0x11,(%r19,%rax,4),%r20d
> +xor    $0x1234,%ax,%r30w
> +and    %r15b,%r17b,%r18b
> +and    %r15d,(%r8),%r18d
> +and    (%r15,%rax,1),%r16b,%r8b
> +and    (%r15,%rax,1),%r16w,%r8w
> +andl   $0x11,(%r19,%rax,4),%r20d
> +and    $0x1234,%ax,%r30w
> +rorb   (%rax),%r31b

While there's a doc problem here as well, the question here and there is
the same: Which form of ROR does this represent, when the shift count
isn't specified explicitly? It could be %cl or $1. I would strongly
advise against introducing further ambiguous instruction patterns like
this, and instead demand that both %cl and $1 be always named explicitly
as operands.

> +ror    $0x2,%r12b,%r31b
> +rorl   $0x2,(%rax),%r31d
> +rorw   (%rax),%r31w
> +ror    %cl,%r16b,%r8b
> +rorw   %cl,(%r19,%rax,4),%r31w
> +rolb   (%rax),%r31b
> +rol    $0x2,%r12b,%r31b
> +roll   $0x2,(%rax),%r31d
> +rolw   (%rax),%r31w
> +rol    %cl,%r16b,%r8b
> +rolw   %cl,(%r19,%rax,4),%r31w
> +rcrb   (%rax),%r31b
> +rcr    $0x2,%r12b,%r31b
> +rcrl   $0x2,(%rax),%r31d
> +rcrw   (%rax),%r31w
> +rcr    %cl,%r16b,%r8b
> +rcrw   %cl,(%r19,%rax,4),%r31w
> +rclb   (%rax),%r31b
> +rcl    $0x2,%r12b,%r31b
> +rcll   $0x2,(%rax),%r31d
> +rclw   (%rax),%r31w
> +rcl    %cl,%r16b,%r8b
> +rclw   %cl,(%r19,%rax,4),%r31w
> +shlb   (%rax),%r31b
> +shl    $0x2,%r12b,%r31b
> +shll   $0x2,(%rax),%r31d
> +shlw   (%rax),%r31w
> +shl    %cl,%r16b,%r8b
> +shlw   %cl,(%r19,%rax,4),%r31w
> +sarb   (%rax),%r31b
> +sar    $0x2,%r12b,%r31b
> +sarl   $0x2,(%rax),%r31d
> +sarw   (%rax),%r31w
> +sar    %cl,%r16b,%r8b
> +sarw   %cl,(%r19,%rax,4),%r31w
> +shlb   (%rax),%r31b
> +shl    $0x2,%r12b,%r31b
> +shll   $0x2,(%rax),%r31d
> +shlw   (%rax),%r31w
> +shl    %cl,%r16b,%r8b
> +shlw   %cl,(%r19,%rax,4),%r31w
> +shrb   (%rax),%r31b
> +shr    $0x2,%r12b,%r31b
> +shrl   $0x2,(%rax),%r31d
> +shrw   (%rax),%r31w
> +shr    %cl,%r16b,%r8b
> +shrw   %cl,(%r19,%rax,4),%r31w
> +shld   $0x1,%r12,(%rax),%r31
> +shld   $0x2,%r8w,%r12w,%r31w
> +shld   $0x2,%r15d,(%rax),%r31d
> +shld   %cl,%r9w,(%rax),%r31w
> +shld   %cl,%r12,%r16,%r8
> +shld   %cl,%r13w,(%r19,%rax,4),%r31w
> +shrd   $0x1,%r12,(%rax),%r31
> +shrd   $0x2,%r8w,%r12w,%r31w
> +shrd   $0x2,%r15d,(%rax),%r31d
> +shrd   %cl,%r9w,(%rax),%r31w
> +shrd   %cl,%r12,%r16,%r8
> +shrd   %cl,%r13w,(%r19,%rax,4),%r31w
> +adcx   %r15d,%r8d,%r18d
> +adcx   (%r15,%r31,1),%r8d,%r18d
> +adcx   (%r15,%r31,1),%r8
> +adox   %r15d,%r8d,%r18d
> +adox   (%r15,%r31,1),%r8d,%r18d
> +adox   (%r15,%r31,1),%r8
> +cmovo  0x90909090(%eax),%edx,%r8d
> +cmovno 0x90909090(%eax),%edx,%r8d
> +cmovb  0x90909090(%eax),%edx,%r8d
> +cmovae 0x90909090(%eax),%edx,%r8d
> +cmove  0x90909090(%eax),%edx,%r8d
> +cmovne 0x90909090(%eax),%edx,%r8d
> +cmovbe 0x90909090(%eax),%edx,%r8d
> +cmova  0x90909090(%eax),%edx,%r8d
> +cmovs  0x90909090(%eax),%edx,%r8d
> +cmovns 0x90909090(%eax),%edx,%r8d
> +cmovp  0x90909090(%eax),%edx,%r8d
> +cmovnp 0x90909090(%eax),%edx,%r8d
> +cmovl  0x90909090(%eax),%edx,%r8d
> +cmovge 0x90909090(%eax),%edx,%r8d
> +cmovle 0x90909090(%eax),%edx,%r8d
> +cmovg  0x90909090(%eax),%edx,%r8d
> +imul   0x90909(%eax),%edx,%r8d
> +imul   0x909(%rax,%r31,8),%rdx,%r25

Overall there's also again the sorting criteria question: Without any
sorting, how is one to (easily) check full coverage?

> --- a/opcodes/i386-gen.c
> +++ b/opcodes/i386-gen.c
> @@ -473,6 +473,7 @@ static bitfield opcode_modifiers[] =
>    BITFIELD (IntelSyntax),
>    BITFIELD (ISA64),
>    BITFIELD (NoEgpr),
> +  BITFIELD (NF),
>  };

This wants introducing earlier, when the BMI / BMI2 templates are touched
anyway. As said before, it would be very nice if within such a series the
same places wouldn't needlessly be touched more than once. You don't
implement parsing of NF here anyway, so how much in advance the attribute
is added to the opcode table is really irrelevant, and should hence be
done as conventiently as possible.

> --- a/opcodes/i386-opc.h
> +++ b/opcodes/i386-opc.h
> @@ -636,7 +636,10 @@ enum
>    /* How to encode VEX.vvvv:
>       0: VEX.vvvv must be 1111b.
>       1: VEX.vvvv encodes one of the register operands.
> +     2: VEX.vvvv encodes as the dest register operands.
>     */
> +#define VexVVVV_SRC   1
> +#define VexVVVV_DST   2
>    VexVVVV,

Nit: Singular in the new comment line, please (plus perhaps drop "as").
And "source" wants adding to the earlier comment line.

> --- a/opcodes/i386-opc.tbl
> +++ b/opcodes/i386-opc.tbl
> @@ -138,6 +138,9 @@
>  #define Vsz512 Vsz=VSZ512
>  
>  #define APX_F APX_F|x64
> +#define VexVVVVSrc  VexVVVV=VexVVVV_SRC
> +#define VexVVVVDest VexVVVV=VexVVVV_DST

I don't this we need the former. Continuing to use just VexVVVV there
is going to be quite fine, and easier to read. Plus, I'm sorry to say
it this bluntly, it is entirely inappropriate to bloat an already
large patch by mechanically replacing VexVVVV by this new alias. If
such a change was really wanted, it would need separating out as an
entirely mechanical one.

For the latter, to help readability, how about DstVVVV?

> +
>  
>  // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
>  // the bit to mark commutative VEX encodings where swapping the source

Please don't add stray (especially double) blank lines. Instead a
blank line would be wanted _ahead_ of the addition.

> @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De
>  mov, 0xf21, x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug, Reg64 }
>  mov, 0xf24, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test, Reg32 }
>  
> +// Move after swapping the bytes
> +movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  // Move after swapping the bytes
>  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  movbe, 0x60, Movbe|APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }

What is this addition about? Please can you look at your own patches before
sending them out?

> @@ -290,22 +295,36 @@ add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg3
>  add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
>  add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>  add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +add, 0x0, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
> +add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}

Earlier review comments were not addressed. I'll stop here, as far as this
file is concerned, expecting a new version to be sent addressing earlier
comments and having been sanity checked and having the bogus VexVVVV
renaming dropped.

Jan
  
Jan Beulich Nov. 8, 2023, 11:13 a.m. UTC | #2
On 02.11.2023 12:29, Cui, Lili wrote:
> --- a/opcodes/i386-dis-evex-prefix.h
> +++ b/opcodes/i386-dis-evex-prefix.h
> @@ -338,10 +338,6 @@
>      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
>      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
>    },
> -  /* PREFIX_EVEX_MAP4_66 */
> -  {
> -    { "wrssK",	{ M, Gdq }, 0 },
> -  },
>    /* PREFIX_EVEX_MAP4_D8 */
>    {
>      { "sha1nexte", { XM, EXxmm }, 0 },

What's going on here?

> --- a/opcodes/i386-dis-evex-reg.h
> +++ b/opcodes/i386-dis-evex-reg.h
> @@ -56,6 +56,36 @@
>      { "blsmskS",	{ VexGdq, Edq }, 0 },
>      { "blsiS",		{ VexGdq, Edq }, 0 },
>    },
> +  /* REG_EVEX_MAP4_80 */
> +  {
> +    { "addA",	{ VexGb, Eb, Ib }, 0 },
> +    { "orA",	{ VexGb, Eb, Ib }, 0 },
> +    { "adcA",	{ VexGb, Eb, Ib }, 0 },
> +    { "sbbA",	{ VexGb, Eb, Ib }, 0 },
> +    { "andA",	{ VexGb, Eb, Ib }, 0 },
> +    { "subA",	{ VexGb, Eb, Ib }, 0 },
> +    { "xorA",	{ VexGb, Eb, Ib }, 0 },
> +  },
> +  /* REG_EVEX_MAP4_81 */
> +  {
> +    { "addQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "orQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "adcQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "sbbQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "andQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "subQ",	{ VexGv, Ev, Iv }, 0 },
> +    { "xorQ",	{ VexGv, Ev, Iv }, 0 },
> +  },
> +  /* REG_EVEX_MAP4_83 */
> +  {
> +    { "addQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "orQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "adcQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "sbbQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "andQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "subQ",	{ VexGv, Ev, sIb }, 0 },
> +    { "xorQ",	{ VexGv, Ev, sIb }, 0 },
> +  },

No sign of prefix decoding, and also no PREFIX_OPCODE present?

> @@ -63,3 +93,27 @@
>      { "aesencwide256kl",	{ M }, 0 },
>      { "aesdecwide256kl",	{ M }, 0 },
>    },
> +  /* REG_EVEX_MAP4_F6 */
> +  {
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { "notA",	{ VexGb, Eb }, 0 },
> +    { "negA",	{ VexGb, Eb }, 0 },
> +  },
> +  /* REG_EVEX_MAP4_F7 */
> +  {
> +    { Bad_Opcode },
> +    { Bad_Opcode },
> +    { "notQ",	{ VexGv, Ev }, 0 },
> +    { "negQ",	{ VexGv, Ev }, 0 },
> +  },
> +  /* REG_EVEX_MAP4_FE */
> +  {
> +    { "incA",   { VexGb ,Eb }, 0 },
> +    { "decA",   { VexGb ,Eb }, 0 },
> +  },
> +  /* REG_EVEX_MAP4_FF */
> +  {
> +    { "incQ",   { VexGv ,Ev }, 0 },
> +    { "decQ",   { VexGv ,Ev }, 0 },
> +  },

Same here, plus for the inc/dec some commas are misplaced. Padding also
looks to be incosnsitent (tab vs blanks).

> --- a/opcodes/i386-dis-evex.h
> +++ b/opcodes/i386-dis-evex.h
>[...]
> @@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* 40 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "cmovoS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovnoS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovbS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovaeS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmoveS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovneS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovbeS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovaS",		{ VexGv, Gv, Ev }, 0 },
>      /* 48 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "cmovsS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovnsS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovpS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovnpS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovlS",		{ VexGv, Gv, Ev }, 0 },
> +    { "cmovgeS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovleS",	{ VexGv, Gv, Ev }, 0 },
> +    { "cmovgS",		{ VexGv, Gv, Ev }, 0 },

Considering CFCMOVcc which sits at the same opcode, doing things like
this sets us up for needing to touch all of these again. Maybe that's
the best that can be done, but I still wonder whether this couldn't
be taken care of right away when introducing these entries.

> @@ -989,7 +989,7 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { "wrussK",	{ M, Gdq }, PREFIX_DATA },
> -    { PREFIX_TABLE (PREFIX_EVEX_MAP4_66) },
> +    { PREFIX_TABLE (PREFIX_0F38F6) },

Perhaps related to the earlier question: What's going on here?

> @@ -1060,7 +1060,7 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "shldS",		{ VexGv, Ev, Gv, CL }, 0 },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* A8 */
> @@ -1069,9 +1069,9 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> +    { "shrdS",		{ VexGv, Ev, Gv, CL }, 0 },
>      { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { "imulS",		{ VexGv, Gv, Ev }, 0 },

PREFIX_OPCODE (or prefix decoding) again missing in all of these?

> @@ -1091,8 +1091,8 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* C0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_C0) },
> +    { REG_TABLE (REG_C1) },
>      { Bad_Opcode },
>      { Bad_Opcode },
>      { Bad_Opcode },
> @@ -1109,10 +1109,10 @@ static const struct dis386 evex_table[][256] = {
>      { Bad_Opcode },
>      { Bad_Opcode },
>      /* D0 */
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> -    { Bad_Opcode },
> +    { REG_TABLE (REG_D0) },
> +    { REG_TABLE (REG_D1) },
> +    { REG_TABLE (REG_D2) },
> +    { REG_TABLE (REG_D3) },

Some form of prefix decoding is going to be needed for these too, despite
the goal of wanting to re-use the legacy table entries. Perhaps adding
PREFIX_OPCODE there would be benign to the legacy insns?

> --- a/opcodes/i386-dis.c
> +++ b/opcodes/i386-dis.c
>[...]
> @@ -2660,47 +2668,47 @@ static const struct dis386 reg_table[][8] = {
>    },
>    /* REG_D0 */
>    {
> -    { "rolA",	{ Eb, I1 }, 0 },
> -    { "rorA",	{ Eb, I1 }, 0 },
> -    { "rclA",	{ Eb, I1 }, 0 },
> -    { "rcrA",	{ Eb, I1 }, 0 },
> -    { "shlA",	{ Eb, I1 }, 0 },
> -    { "shrA",	{ Eb, I1 }, 0 },
> -    { "shlA",	{ Eb, I1 }, 0 },
> -    { "sarA",	{ Eb, I1 }, 0 },
> +    { "rolA",	{ VexGb, Eb, I1 }, 0 },
> +    { "rorA",	{ VexGb, Eb, I1 }, 0 },
> +    { "rclA",	{ VexGb, Eb, I1 }, 0 },
> +    { "rcrA",	{ VexGb, Eb, I1 }, 0 },
> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> +    { "shrA",	{ VexGb, Eb, I1 }, 0 },
> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> +    { "sarA",	{ VexGb, Eb, I1 }, 0 },
>    },
>    /* REG_D1 */
>    {
> -    { "rolQ",	{ Ev, I1 }, 0 },
> -    { "rorQ",	{ Ev, I1 }, 0 },
> -    { "rclQ",	{ Ev, I1 }, 0 },
> -    { "rcrQ",	{ Ev, I1 }, 0 },
> -    { "shlQ",	{ Ev, I1 }, 0 },
> -    { "shrQ",	{ Ev, I1 }, 0 },
> -    { "shlQ",	{ Ev, I1 }, 0 },
> -    { "sarQ",	{ Ev, I1 }, 0 },
> +    { "rolQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "rorQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "rclQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "rcrQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "shrQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> +    { "sarQ",	{ VexGv, Ev, I1 }, 0 },
>    },

As mentioned on the assembler side already, I think we would be better
off making const_1_mode print $1 in AT&T syntax at least for these new
insn forms, to eliminate the ambiguity.

> @@ -9061,6 +9069,15 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  	  ins->rex &= ~REX_B;
>  	  ins->rex2 &= ~REX_R;
>  	}
> +      if (ins->evex_type == evex_from_legacy)
> +	{
> +	  /* EVEX from legacy instructions, when the EVEX.ND bit is 0,
> +	     all bits of EVEX.vvvv and EVEX.V' must be 1.  */
> +	  if (!ins->vex.b && (ins->vex.register_specifier
> +				  || !ins->vex.v))
> +	    return &bad_opcode;
> +	  ins->rex |= REX_OPCODE;

What is this line about?

> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
>  	return &err_opcode;
>  
>        /* Set vector length.  */
> -      if (ins->modrm.mod == 3 && ins->vex.b)
> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type == evex_default)
>  	ins->vex.length = 512;
>        else
>  	{

Is this change really needed for anything?

> @@ -9530,6 +9547,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  		    {
>  		      oappend (&ins, "{bad}");
>  		      continue;
> +
>  		    }
>  
>  		  /* Instructions with a mask register destination allow for

Stray and bogus change.

> @@ -9553,7 +9571,7 @@ print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
>  
>  	  /* Check whether rounding control was enabled for an insn not
>  	     supporting it.  */
> -	  if (ins.modrm.mod == 3 && ins.vex.b
> +	  if (ins.modrm.mod == 3 && ins.vex.b && ins.evex_type == evex_default
>  	      && !(ins.evex_used & EVEX_b_used))
>  	    {

This could do with extending the comment, mentioning the aliasing of
EVEX.brs and EVEX.nd.

> @@ -11013,7 +11031,7 @@ print_displacement (instr_info *ins, bfd_signed_vma val)
>  static void
>  intel_operand_size (instr_info *ins, int bytemode, int sizeflag)
>  {
> -  if (ins->vex.b)
> +  if (ins->vex.b && ins->evex_type == evex_default)
>      {
>        if (!ins->vex.no_broadcast)
>  	switch (bytemode)

This aliasing would also be worthwhile mentioning here and ...

> @@ -11946,7 +11964,7 @@ OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
>  	  print_operand_value (ins, disp & 0xffff, dis_style_text);
>  	}
>      }
> -  if (ins->vex.b)
> +  if (ins->vex.b && ins->evex_type == evex_default)
>      {
>        ins->evex_used |= EVEX_b_used;

... here.

> @@ -13307,6 +13325,14 @@ OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
>    if (!ins->need_vex)
>      return true;
>  
> +  if (ins->evex_type == evex_from_legacy)
> +    {
> +      ins->evex_used |= EVEX_b_used;
> +      /* Here vex.b is treated as "EVEX.ND.  */

Okay, here you have such a helpful comment, just that - nit - there's
an unbalanced double-quote. (The comment would also more logically
come first in this block.)

Jan
  
Jan Beulich Nov. 9, 2023, 9:37 a.m. UTC | #3
On 02.11.2023 12:29, Cui, Lili wrote:
> @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De
>  mov, 0xf21, x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug, Reg64 }
>  mov, 0xf24, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test, Reg32 }
>  
> +// Move after swapping the bytes
> +movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  // Move after swapping the bytes
>  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>  movbe, 0x60, Movbe|APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> @@ -290,22 +295,36 @@ add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg3
>  add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
>  add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>  add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +add, 0x0, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
> +add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}

Why are there (just as an example) only 3 new forms of ADD, but ...

> @@ -338,10 +366,19 @@ adc, 0x10, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg
>  adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
>  adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
>  adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +adc, 0x10, APX_F, D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +adc, 0x83/2, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
> +adc, 0x80/2, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
> +adc, 0x10, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
> +adc, 0x83/2, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> +adc, 0x80/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }

.... 6 new forms of ADC? My guess is that's NF-related, but doesn't that
mean that until NF support is added e.g. "{evex} add %eax, %eax" won't
assemble as intended? IOW in line with NF as an attribute being added right
here, those other templates also will want introducing right here.

While looking at patch 7 I'm also wondering whether same-base-opcode forms
wouldn't better be kept together. I haven't finished yet looking at what's
needed there, but even if it doesn't turn out strictly necessary there it
may still be a good idea anyway (unless of course that would get in the
way of anything).

Jan
  
Cui, Lili Nov. 20, 2023, 1:19 a.m. UTC | #4
> > 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
> >
> > diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index
> > 398909a6a30..5b925505435 100644
> > --- a/gas/config/tc-i386.c
> > +++ b/gas/config/tc-i386.c
> > @@ -2317,8 +2317,10 @@ operand_size_match (const insn_template *t)
> >        unsigned int given = i.operands - j - 1;
> >
> >        /* For FMA4 and XOP insns VEX.W controls just the first two
> > -	 register operands.  */
> > -      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP))
> > +	 register operands. And APX_F insns just swap the two source
> operands,
> > +	 with the 3rd one being the destination.  */
> > +      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP)
> > +	  || is_cpu (t,CpuAPX_F))
> >  	given = j < 2 ? 1 - j : j;
> 
> Nit: Please retain consistency wrt style (here: missing blank after comma in
> the addition). Feels like I said so before.
> 

Done.

> > @@ -3959,6 +3961,7 @@ static INLINE bool  is_any_apx_evex_encoding
> > (void)  {
> >    return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4
> > +    || i.rex2_encoding
> >      || (i.vex.register_specifier
> >  	&& i.vex.register_specifier->reg_flags & RegRex2);  }
> 
> See my comment on an earlier patch regarding the use of i.rex2 here. But I
> doubt this is correct in the first place: Why would {rex2} cause EVEX encoding
> to be picked?
> 

Deleted.

> > @@ -7481,26 +7484,33 @@ match_template (char mnem_suffix)
> >  	  overlap1 = operand_type_and (operand_types[0],
> operand_types[1]);
> >  	  if (t->opcode_modifier.d && i.reg_operands == i.operands
> >  	      && !operand_type_all_zero (&overlap1))
> > -	    switch (i.dir_encoding)
> > -	      {
> > -	      case dir_encoding_load:
> > -		if (operand_type_check (operand_types[i.operands - 1],
> anymem)
> > -		    || t->opcode_modifier.regmem)
> > -		  goto check_reverse;
> > -		break;
> > +	    {
> >
> > -	      case dir_encoding_store:
> > -		if (!operand_type_check (operand_types[i.operands - 1],
> anymem)
> > -		    && !t->opcode_modifier.regmem)
> > -		  goto check_reverse;
> > -		break;
> > +	      int MemOperand = i.operands - 1 -
> > +		(t->opcode_space == SPACE_EVEXMAP4
> > +		 && t->opcode_modifier.vexvvvv);
> 
> Nit: I don't think local variables should start with a capital letter. I wonder
> anyway - can't you just re-use e.g. j here? That would then also avoid the
> misleading name: Right here you don't know yet whether that's the
> "memory" operand. Finally, if you set the variable ahead if the enclosing if(),
> you could (I think) avoid all this re-indentation, making the change quite a bit
> easier to review (i.e. to see what really changes).
> 

Good idea. Changed.

> > @@ -7530,11 +7540,13 @@ match_template (char mnem_suffix)
> >  		continue;
> >  	      /* Try reversing direction of operands.  */
> >  	      j = is_cpu (t, CpuFMA4)
> > -		  || is_cpu (t, CpuXOP) ? 1 : i.operands - 1;
> > +		  || is_cpu (t, CpuXOP)
> > +		  || is_cpu (t, CpuAPX_F) ? 1 : i.operands - 1;
> >  	      overlap0 = operand_type_and (i.types[0], operand_types[j]);
> >  	      overlap1 = operand_type_and (i.types[j], operand_types[0]);
> >  	      overlap2 = operand_type_and (i.types[1], operand_types[1]);
> > -	      gas_assert (t->operands != 3 || !check_register);
> > +	      gas_assert (t->operands != 3 || !check_register
> > +			  || is_cpu (t,CpuAPX_F));
> 
> Nit: Missing blank again. And again in the next hunk. I won't comment on
> such any further, expecting you to go through globally.
>
 
Done.

> > @@ -8588,11 +8609,10 @@ process_operands (void)
> >    const reg_entry *default_seg = NULL;
> >
> >    /* We only need to check those implicit registers for instructions
> > -     with 3 operands or less.  */
> > -  if (i.operands <= 3)
> > -    for (unsigned int j = 0; j < i.operands; j++)
> > -      if (i.types[j].bitfield.instance != InstanceNone)
> > -	i.reg_operands--;
> > +     with 4 operands or less.  */
> > +  for (unsigned int j = 0; j < i.operands; j++)
> > +    if (i.types[j].bitfield.instance != InstanceNone)
> > +      i.reg_operands--;
> 
> While you made the requested code adjustment, adjusting the comment
> renders it stale now. It needs dropping instead, as it did only explain the if(),
> not the for().
> 

Done.

> > @@ -8946,26 +8966,35 @@ build_modrm_byte (void)
> >  				     || i.vec_encoding == vex_encoding_evex));
> >      }
> >
> > -  for (v = source + 1; v < dest; ++v)
> > -    if (v != reg_slot)
> > -      break;
> > -  if (v >= dest)
> > -    v = ~0;
> > -  if (i.tm.extension_opcode != None)
> > +  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
> >      {
> > -      if (dest != source)
> > -	v = dest;
> > -      dest = ~0;
> > +      v = dest;
> > +      dest-- ;
> >      }
> > -  gas_assert (source < dest);
> > -  if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> > -      && source != op)
> > +  else if (i.tm.opcode_modifier.vexvvvv == VexVVVV_SRC)
> >      {
> > -      unsigned int tmp = source;
> > +      v = source + 1;
> > +      for (v = source + 1; v < dest; ++v)
> > +	if (v != reg_slot)
> > +	  break;
> > +      if (i.tm.extension_opcode != None)
> > +	{
> > +	  if (dest != source)
> > +	    v = dest;
> > +	  dest = ~0;
> > +	}
> > +      gas_assert (source < dest);
> > +      if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
> > +	  && source != op)
> > +	{
> > +	  unsigned int tmp = source;
> >
> > -      source = v;
> > -      v = tmp;
> > +	  source = v;
> > +	  v = tmp;
> > +	}
> >      }
> > +  else
> > +    v = ~0;
> >
> >    if (v < MAX_OPERANDS)
> >      {
> 
> I'm having trouble following this change. The VexVVVV-is-source case
> shouldn't change at all, I'd expect. This would ideally be easily visible from
> the change done. Yet if looking a little more closely I can e.g. spot a stray "v =
> source + 1;" which wasn't there before. And there also look to be things being
> dropped. Such a change (again) wants doing such that it is easy to see what
> changes. If need be by making a mechanical prereq change doing just re-
> indentation, but nothing else.
> It feels though as if there is too much changing here anyway: The difference
> for VexVVVV-is-dest is that you need to consume the destination early. If you
> do so, then the rest should be possible to keep as is: You'll subsequently deal
> with just the normal ModR/M, as if without a VexVVVV-encoded operand.
> 

Changed to add these codes before the old code.

  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
    {
      v = dest;
      dest-- ;
    }
  else
    {
...
    }

> > --- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> > @@ -10,7 +10,7 @@ _start:
> >          .byte 0x62, 0xfc, 0x7f, 0x08, 0x60, 0xc2
> >          .byte 0xff, 0xff
> >          #VSIB vpgatherqq 0x7b(%rbp,%zmm17,8),%zmm16{%k1} set
> EVEX.P[10] == 0
> > -	#(illegal value).
> > +        #(illegal value).
> >          .byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd, 0x7b, 0x00, 0x00, 0x00
> >          .byte 0xff
> >          #EVEX_MAP4 movbe %r18w,%ax set EVEX.mm == b01 (illegal value).
> 
> ???
>
> > @@ -27,3 +27,6 @@ _start:
> >          .byte 0xff
> >          #EVEX from VEX enqcmd 0x123(%r31,%rax,4),%r31 EVEX.P[23:22] == 1
> (illegal value).
> >          .byte 0x62, 0x4c, 0x7f, 0x28, 0xf8, 0xbc, 0x87, 0x23, 0x01,
> > 0x00, 0x00
> > +        .byte 0xff
> 
> This then similarly looks to belong into the earlier patch, suggesting that it
> had #pass too early.
> 
Changed.

> > +        #{evex} inc %rax EVEX.vvvv' > 0 (illegal value).
> > +        .byte 0x62, 0xf4, 0xec, 0x08, 0xff, 0xc0
> 
> Again I suspect this can be expressed via .insn, thus ending up more clear.
> I can only stress again that one of the reasons of introducing .insn was to
> reduce the number of such entirely unreadable byte sequences.

Done.

> 
> > --- /dev/null
> > +++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
> > @@ -0,0 +1,154 @@
> > +# Check 64bit APX NDD instructions with evex prefix encoding
> > +
> > +	.allow_index_reg
> > +	.text
> > +_start:
> > +inc    %rax,%rbx
> 
> Please can instructions be indented by a tab?
> 

Done.

> > +inc    %r31,%r8
> > +inc    %r31,%r16
> > +add    %r31b,%r8b,%r16b
> > +addb    %r31b,%r8b,%r16b
> > +add    %r31,%r8,%r16
> > +addq    %r31,%r8,%r16
> > +add    %r31d,%r8d,%r16d
> > +addl    %r31d,%r8d,%r16d
> > +add    %r31w,%r8w,%r16w
> > +addw    %r31w,%r8w,%r16w
> > +{store} add    %r31,%r8,%r16
> > +{load}  add    %r31,%r8,%r16
> > +add    %r31,(%r8),%r16
> > +add    (%r31),%r8,%r16
> > +add    0x9090(%r31,%r16,1),%r8,%r16
> > +add    %r31,(%r8,%r16,8),%r16
> > +add    $0x34,%r13b,%r17b
> > +addl   $0x11,(%r19,%rax,4),%r20d
> > +add    $0x1234,%ax,%r30w
> > +add    $0x12344433,%r15,%r16
> > +addq   $0x12344433,(%r15,%rcx,4),%r16
> > +add    $0xfffffffff4332211,%rax,%r8
> > +dec    %rax,%r17
> > +decb   (%r31,%r12,1),%r8b
> > +not    %rax,%r17
> > +notb   (%r31,%r12,1),%r8b
> > +neg    %rax,%r17
> > +negb   (%r31,%r12,1),%r8b
> > +sub    %r15b,%r17b,%r18b
> > +sub    %r15d,(%r8),%r18d
> > +sub    (%r15,%rax,1),%r16b,%r8b
> > +sub    (%r15,%rax,1),%r16w,%r8w
> > +subl   $0x11,(%r19,%rax,4),%r20d
> > +sub    $0x1234,%ax,%r30w
> > +sbb    %r15b,%r17b,%r18b
> > +sbb    %r15d,(%r8),%r18d
> > +sbb    (%r15,%rax,1),%r16b,%r8b
> > +sbb    (%r15,%rax,1),%r16w,%r8w
> > +sbbl   $0x11,(%r19,%rax,4),%r20d
> > +sbb    $0x1234,%ax,%r30w
> > +adc    %r15b,%r17b,%r18b
> > +adc    %r15d,(%r8),%r18d
> > +adc    (%r15,%rax,1),%r16b,%r8b
> > +adc    (%r15,%rax,1),%r16w,%r8w
> > +adcl   $0x11,(%r19,%rax,4),%r20d
> > +adc    $0x1234,%ax,%r30w
> > +or     %r15b,%r17b,%r18b
> > +or     %r15d,(%r8),%r18d
> > +or     (%r15,%rax,1),%r16b,%r8b
> > +or     (%r15,%rax,1),%r16w,%r8w
> > +orl    $0x11,(%r19,%rax,4),%r20d
> > +or     $0x1234,%ax,%r30w
> > +xor    %r15b,%r17b,%r18b
> > +xor    %r15d,(%r8),%r18d
> > +xor    (%r15,%rax,1),%r16b,%r8b
> > +xor    (%r15,%rax,1),%r16w,%r8w
> > +xorl   $0x11,(%r19,%rax,4),%r20d
> > +xor    $0x1234,%ax,%r30w
> > +and    %r15b,%r17b,%r18b
> > +and    %r15d,(%r8),%r18d
> > +and    (%r15,%rax,1),%r16b,%r8b
> > +and    (%r15,%rax,1),%r16w,%r8w
> > +andl   $0x11,(%r19,%rax,4),%r20d
> > +and    $0x1234,%ax,%r30w
> > +rorb   (%rax),%r31b
> 
> While there's a doc problem here as well, the question here and there is the
> same: Which form of ROR does this represent, when the shift count isn't
> specified explicitly? It could be %cl or $1. I would strongly advise against
> introducing further ambiguous instruction patterns like this, and instead
> demand that both %cl and $1 be always named explicitly as operands.
> 

Done.  GCC also changed.

> > +ror    $0x2,%r12b,%r31b
> > +rorl   $0x2,(%rax),%r31d
> > +rorw   (%rax),%r31w
> > +ror    %cl,%r16b,%r8b
> > +rorw   %cl,(%r19,%rax,4),%r31w
> > +rolb   (%rax),%r31b
> > +rol    $0x2,%r12b,%r31b
> > +roll   $0x2,(%rax),%r31d
> > +rolw   (%rax),%r31w
> > +rol    %cl,%r16b,%r8b
> > +rolw   %cl,(%r19,%rax,4),%r31w
> > +rcrb   (%rax),%r31b
> > +rcr    $0x2,%r12b,%r31b
> > +rcrl   $0x2,(%rax),%r31d
> > +rcrw   (%rax),%r31w
> > +rcr    %cl,%r16b,%r8b
> > +rcrw   %cl,(%r19,%rax,4),%r31w
> > +rclb   (%rax),%r31b
> > +rcl    $0x2,%r12b,%r31b
> > +rcll   $0x2,(%rax),%r31d
> > +rclw   (%rax),%r31w
> > +rcl    %cl,%r16b,%r8b
> > +rclw   %cl,(%r19,%rax,4),%r31w
> > +shlb   (%rax),%r31b
> > +shl    $0x2,%r12b,%r31b
> > +shll   $0x2,(%rax),%r31d
> > +shlw   (%rax),%r31w
> > +shl    %cl,%r16b,%r8b
> > +shlw   %cl,(%r19,%rax,4),%r31w
> > +sarb   (%rax),%r31b
> > +sar    $0x2,%r12b,%r31b
> > +sarl   $0x2,(%rax),%r31d
> > +sarw   (%rax),%r31w
> > +sar    %cl,%r16b,%r8b
> > +sarw   %cl,(%r19,%rax,4),%r31w
> > +shlb   (%rax),%r31b
> > +shl    $0x2,%r12b,%r31b
> > +shll   $0x2,(%rax),%r31d
> > +shlw   (%rax),%r31w
> > +shl    %cl,%r16b,%r8b
> > +shlw   %cl,(%r19,%rax,4),%r31w
> > +shrb   (%rax),%r31b
> > +shr    $0x2,%r12b,%r31b
> > +shrl   $0x2,(%rax),%r31d
> > +shrw   (%rax),%r31w
> > +shr    %cl,%r16b,%r8b
> > +shrw   %cl,(%r19,%rax,4),%r31w
> > +shld   $0x1,%r12,(%rax),%r31
> > +shld   $0x2,%r8w,%r12w,%r31w
> > +shld   $0x2,%r15d,(%rax),%r31d
> > +shld   %cl,%r9w,(%rax),%r31w
> > +shld   %cl,%r12,%r16,%r8
> > +shld   %cl,%r13w,(%r19,%rax,4),%r31w
> > +shrd   $0x1,%r12,(%rax),%r31
> > +shrd   $0x2,%r8w,%r12w,%r31w
> > +shrd   $0x2,%r15d,(%rax),%r31d
> > +shrd   %cl,%r9w,(%rax),%r31w
> > +shrd   %cl,%r12,%r16,%r8
> > +shrd   %cl,%r13w,(%r19,%rax,4),%r31w
> > +adcx   %r15d,%r8d,%r18d
> > +adcx   (%r15,%r31,1),%r8d,%r18d
> > +adcx   (%r15,%r31,1),%r8
> > +adox   %r15d,%r8d,%r18d
> > +adox   (%r15,%r31,1),%r8d,%r18d
> > +adox   (%r15,%r31,1),%r8
> > +cmovo  0x90909090(%eax),%edx,%r8d
> > +cmovno 0x90909090(%eax),%edx,%r8d
> > +cmovb  0x90909090(%eax),%edx,%r8d
> > +cmovae 0x90909090(%eax),%edx,%r8d
> > +cmove  0x90909090(%eax),%edx,%r8d
> > +cmovne 0x90909090(%eax),%edx,%r8d
> > +cmovbe 0x90909090(%eax),%edx,%r8d
> > +cmova  0x90909090(%eax),%edx,%r8d
> > +cmovs  0x90909090(%eax),%edx,%r8d
> > +cmovns 0x90909090(%eax),%edx,%r8d
> > +cmovp  0x90909090(%eax),%edx,%r8d
> > +cmovnp 0x90909090(%eax),%edx,%r8d
> > +cmovl  0x90909090(%eax),%edx,%r8d
> > +cmovge 0x90909090(%eax),%edx,%r8d
> > +cmovle 0x90909090(%eax),%edx,%r8d
> > +cmovg  0x90909090(%eax),%edx,%r8d
> > +imul   0x90909(%eax),%edx,%r8d
> > +imul   0x909(%rax,%r31,8),%rdx,%r25
> 
> Overall there's also again the sorting criteria question: Without any sorting,
> how is one to (easily) check full coverage?
> 

> > --- a/opcodes/i386-gen.c
> > +++ b/opcodes/i386-gen.c
> > @@ -473,6 +473,7 @@ static bitfield opcode_modifiers[] =
> >    BITFIELD (IntelSyntax),
> >    BITFIELD (ISA64),
> >    BITFIELD (NoEgpr),
> > +  BITFIELD (NF),
> >  };
> 
> This wants introducing earlier, when the BMI / BMI2 templates are touched
> anyway. As said before, it would be very nice if within such a series the same
> places wouldn't needlessly be touched more than once. You don't implement
> parsing of NF here anyway, so how much in advance the attribute is added to
> the opcode table is really irrelevant, and should hence be done as
> conventiently as possible.
> 

Done.

> > --- a/opcodes/i386-opc.h
> > +++ b/opcodes/i386-opc.h
> > @@ -636,7 +636,10 @@ enum
> >    /* How to encode VEX.vvvv:
> >       0: VEX.vvvv must be 1111b.
> >       1: VEX.vvvv encodes one of the register operands.
> > +     2: VEX.vvvv encodes as the dest register operands.
> >     */
> > +#define VexVVVV_SRC   1
> > +#define VexVVVV_DST   2
> >    VexVVVV,
> 
> Nit: Singular in the new comment line, please (plus perhaps drop "as").
> And "source" wants adding to the earlier comment line.
> 

Done.

> > --- a/opcodes/i386-opc.tbl
> > +++ b/opcodes/i386-opc.tbl
> > @@ -138,6 +138,9 @@
> >  #define Vsz512 Vsz=VSZ512
> >
> >  #define APX_F APX_F|x64
> > +#define VexVVVVSrc  VexVVVV=VexVVVV_SRC #define VexVVVVDest
> > +VexVVVV=VexVVVV_DST
> 
> I don't this we need the former. Continuing to use just VexVVVV there is going
> to be quite fine, and easier to read. Plus, I'm sorry to say it this bluntly, it is
> entirely inappropriate to bloat an already large patch by mechanically
> replacing VexVVVV by this new alias. If such a change was really wanted, it
> would need separating out as an entirely mechanical one.
> 

Removed VexVVVV_SRC.

> For the latter, to help readability, how about DstVVVV?
> 

Done.

> > +
> >
> >  // The EVEX purpose of StaticRounding appears only together with SAE.
> > Re-use  // the bit to mark commutative VEX encodings where swapping
> > the source
> 
> Please don't add stray (especially double) blank lines. Instead a blank line
> would be wanted _ahead_ of the addition.
> 

Done.

> > @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64,
> > D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De  mov,
> 0xf21,
> > x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug,
> Reg64
> > }  mov, 0xf24, i386|No64,
> > D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test,
> Reg32 }
> >
> > +// Move after swapping the bytes
> > +movbe, 0x0f38f0, Movbe,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, {
> > +Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >  // Move after swapping the bytes
> >  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> {
> > Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> movbe,
> > 0x60, Movbe|APX_F,
> > D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, {
> > Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> 
> What is this addition about? Please can you look at your own patches before
> sending them out?
> 

Done.

> > @@ -290,22 +295,36 @@ add, 0x0, 0,
> > D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
> { Reg8|Reg16|Reg3
> > add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
> > Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }  add,
> 0x4,
> > 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
> Acc|Byte|Word|Dword|Qword }
> > add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
> > Imm8|Imm16|Imm32|Imm32S,
> >
> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
> x }
> > +add, 0x0, APX_F,
> >
> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
> p4|NF, {
> > +Reg8|Reg16|Reg32|Reg64,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64 } add, 0x83/0, APX_F,
> >
> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
> Map4|N
> > +F, { Imm8S,
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
> > +Reg16|Reg32|Reg64 } add, 0x80/0, APX_F,
> >
> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4
> |NF, {
> > +Imm8|Imm16|Imm32|Imm32S,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64}
> 
> Earlier review comments were not addressed. I'll stop here, as far as this file is
> concerned, expecting a new version to be sent addressing earlier comments
> and having been sanity checked and having the bogus VexVVVV renaming
> dropped.
> 

 Ok.

Thanks,
Lili.
  
Cui, Lili Nov. 20, 2023, 1:33 a.m. UTC | #5
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, November 9, 2023 5:37 PM
> To: Cui, Lili <lili.cui@intel.com>; Kong, Lingling <lingling.kong@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org
> Subject: Re: [PATCH 5/8] Support APX NDD
> 
> On 02.11.2023 12:29, Cui, Lili wrote:
> > @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64,
> > D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De  mov,
> 0xf21,
> > x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug,
> Reg64
> > }  mov, 0xf24, i386|No64,
> > D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test,
> Reg32 }
> >
> > +// Move after swapping the bytes
> > +movbe, 0x0f38f0, Movbe,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, {
> > +Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >  // Move after swapping the bytes
> >  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> {
> > Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> movbe,
> > 0x60, Movbe|APX_F,
> > D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, {
> > Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 } @@
> > -290,22 +295,36 @@ add, 0x0, 0,
> > D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
> { Reg8|Reg16|Reg3
> > add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
> > Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }  add,
> 0x4,
> > 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
> Acc|Byte|Word|Dword|Qword }
> > add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
> > Imm8|Imm16|Imm32|Imm32S,
> >
> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
> x }
> > +add, 0x0, APX_F,
> >
> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
> p4|NF, {
> > +Reg8|Reg16|Reg32|Reg64,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64 } add, 0x83/0, APX_F,
> >
> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
> Map4|N
> > +F, { Imm8S,
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
> > +Reg16|Reg32|Reg64 } add, 0x80/0, APX_F,
> >
> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4
> |NF, {
> > +Imm8|Imm16|Imm32|Imm32S,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64}
> 
> Why are there (just as an example) only 3 new forms of ADD, but ...
> 
> > @@ -338,10 +366,19 @@ adc, 0x10, 0,
> > D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
> { Reg8|Reg16|Reg
> > adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
> > Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }  adc,
> 0x14,
> > 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
> Acc|Byte|Word|Dword|Qword }
> > adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
> > Imm8|Imm16|Imm32|Imm32S,
> >
> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
> x }
> > +adc, 0x10, APX_F,
> > +D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, {
> > +Reg8|Reg16|Reg32|Reg64,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex }
> > +adc, 0x83/2, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf,
> { Imm8S,
> > +Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex } adc,
> > +0x80/2, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, {
> > +Imm8|Imm16|Imm32|Imm32S,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex }
> > +adc, 0x10, APX_F,
> >
> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
> p4, {
> > +Reg8|Reg16|Reg32|Reg64,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64 } adc, 0x83/2, APX_F,
> >
> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
> Map4,
> > +{ Imm8S,
> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
> > +Reg16|Reg32|Reg64 } adc, 0x80/2, APX_F,
> >
> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4,
> {
> > +Imm8|Imm16|Imm32|Imm32S,
> >
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> ex,
> > +Reg8|Reg16|Reg32|Reg64 }
> 
> .... 6 new forms of ADC? My guess is that's NF-related, but doesn't that mean
> that until NF support is added e.g. "{evex} add %eax, %eax" won't assemble
> as intended? IOW in line with NF as an attribute being added right here, those
> other templates also will want introducing right here.
> 
> While looking at patch 7 I'm also wondering whether same-base-opcode
> forms wouldn't better be kept together. I haven't finished yet looking at
> what's needed there, but even if it doesn't turn out strictly necessary there it
> may still be a good idea anyway (unless of course that would get in the way of
> anything).
> 

For ADC, there are 3 templates that do not support NDD and NF, I moved them here from patch 3/8 which we discussed before since its decoding is supported together in the NDD patch.

Thanks,
Lili.
  
Jan Beulich Nov. 20, 2023, 8:19 a.m. UTC | #6
On 20.11.2023 02:33, Cui, Lili wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Thursday, November 9, 2023 5:37 PM
>>
>> On 02.11.2023 12:29, Cui, Lili wrote:
>>> @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64,
>>> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De  mov,
>> 0xf21,
>>> x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug,
>> Reg64
>>> }  mov, 0xf24, i386|No64,
>>> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test,
>> Reg32 }
>>>
>>> +// Move after swapping the bytes
>>> +movbe, 0x0f38f0, Movbe,
>> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, {
>>> +Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>>>  // Move after swapping the bytes
>>>  movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
>> {
>>> Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
>> movbe,
>>> 0x60, Movbe|APX_F,
>>> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, {
>>> Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 } @@
>>> -290,22 +295,36 @@ add, 0x0, 0,
>>> D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
>> { Reg8|Reg16|Reg3
>>> add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
>>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }  add,
>> 0x4,
>>> 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
>> Acc|Byte|Word|Dword|Qword }
>>> add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
>>> Imm8|Imm16|Imm32|Imm32S,
>>>
>> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
>> x }
>>> +add, 0x0, APX_F,
>>>
>> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
>> p4|NF, {
>>> +Reg8|Reg16|Reg32|Reg64,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex,
>>> +Reg8|Reg16|Reg32|Reg64 } add, 0x83/0, APX_F,
>>>
>> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
>> Map4|N
>>> +F, { Imm8S,
>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
>>> +Reg16|Reg32|Reg64 } add, 0x80/0, APX_F,
>>>
>> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4
>> |NF, {
>>> +Imm8|Imm16|Imm32|Imm32S,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex,
>>> +Reg8|Reg16|Reg32|Reg64}
>>
>> Why are there (just as an example) only 3 new forms of ADD, but ...
>>
>>> @@ -338,10 +366,19 @@ adc, 0x10, 0,
>>> D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
>> { Reg8|Reg16|Reg
>>> adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
>>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }  adc,
>> 0x14,
>>> 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
>> Acc|Byte|Word|Dword|Qword }
>>> adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
>>> Imm8|Imm16|Imm32|Imm32S,
>>>
>> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
>> x }
>>> +adc, 0x10, APX_F,
>>> +D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, {
>>> +Reg8|Reg16|Reg32|Reg64,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex }
>>> +adc, 0x83/2, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf,
>> { Imm8S,
>>> +Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex } adc,
>>> +0x80/2, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, {
>>> +Imm8|Imm16|Imm32|Imm32S,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex }
>>> +adc, 0x10, APX_F,
>>>
>> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
>> p4, {
>>> +Reg8|Reg16|Reg32|Reg64,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex,
>>> +Reg8|Reg16|Reg32|Reg64 } adc, 0x83/2, APX_F,
>>>
>> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
>> Map4,
>>> +{ Imm8S,
>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
>>> +Reg16|Reg32|Reg64 } adc, 0x80/2, APX_F,
>>>
>> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4,
>> {
>>> +Imm8|Imm16|Imm32|Imm32S,
>>>
>> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
>> ex,
>>> +Reg8|Reg16|Reg32|Reg64 }
>>
>> .... 6 new forms of ADC? My guess is that's NF-related, but doesn't that mean
>> that until NF support is added e.g. "{evex} add %eax, %eax" won't assemble
>> as intended? IOW in line with NF as an attribute being added right here, those
>> other templates also will want introducing right here.
>>
>> While looking at patch 7 I'm also wondering whether same-base-opcode
>> forms wouldn't better be kept together. I haven't finished yet looking at
>> what's needed there, but even if it doesn't turn out strictly necessary there it
>> may still be a good idea anyway (unless of course that would get in the way of
>> anything).
>>
> 
> For ADC, there are 3 templates that do not support NDD and NF, I moved them here from patch 3/8 which we discussed before since its decoding is supported together in the NDD patch.

Feels like my question wasn't really answered: Why are there 6 new ADC templates
here, but only 3 ADD ones? Or in other words: Why are the other 3 ADD ones only
added later?

Also I'm curious: Why do ADC and SBB not allow for NF? They always consume
EFLAGS.CF, but subsequent insns may have no need for their EFLAGS output,
just like for e.g. ADD and SUB. Yet NF is only about EFLAGS output aiui, not
about any bit(s) consumed from EFLAGS.

Jan
  
Cui, Lili Nov. 20, 2023, 12:36 p.m. UTC | #7
> On 02.11.2023 12:29, Cui, Lili wrote:
> > --- a/opcodes/i386-dis-evex-prefix.h
> > +++ b/opcodes/i386-dis-evex-prefix.h
> > @@ -338,10 +338,6 @@
> >      { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
> >      { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
> >    },
> > -  /* PREFIX_EVEX_MAP4_66 */
> > -  {
> > -    { "wrssK",	{ M, Gdq }, 0 },
> > -  },
> >    /* PREFIX_EVEX_MAP4_D8 */
> >    {
> >      { "sha1nexte", { XM, EXxmm }, 0 },
> 
> What's going on here?
>

It should be in patch 3/8. changed.

> > --- a/opcodes/i386-dis-evex-reg.h
> > +++ b/opcodes/i386-dis-evex-reg.h
> > @@ -56,6 +56,36 @@
> >      { "blsmskS",	{ VexGdq, Edq }, 0 },
> >      { "blsiS",		{ VexGdq, Edq }, 0 },
> >    },
> > +  /* REG_EVEX_MAP4_80 */
> > +  {
> > +    { "addA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "orA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "adcA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "sbbA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "andA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "subA",	{ VexGb, Eb, Ib }, 0 },
> > +    { "xorA",	{ VexGb, Eb, Ib }, 0 },
> > +  },
> > +  /* REG_EVEX_MAP4_81 */
> > +  {
> > +    { "addQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "orQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "adcQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "sbbQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "andQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "subQ",	{ VexGv, Ev, Iv }, 0 },
> > +    { "xorQ",	{ VexGv, Ev, Iv }, 0 },
> > +  },
> > +  /* REG_EVEX_MAP4_83 */
> > +  {
> > +    { "addQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "orQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "adcQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "sbbQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "andQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "subQ",	{ VexGv, Ev, sIb }, 0 },
> > +    { "xorQ",	{ VexGv, Ev, sIb }, 0 },
> > +  },
> 
> No sign of prefix decoding, and also no PREFIX_OPCODE present?
> 

Added NO_PREFIX and PREFIX_NP_OR_DATA for them.

> > @@ -63,3 +93,27 @@
> >      { "aesencwide256kl",	{ M }, 0 },
> >      { "aesdecwide256kl",	{ M }, 0 },
> >    },
> > +  /* REG_EVEX_MAP4_F6 */
> > +  {
> > +    { Bad_Opcode },
> > +    { Bad_Opcode },
> > +    { "notA",	{ VexGb, Eb }, 0 },
> > +    { "negA",	{ VexGb, Eb }, 0 },
> > +  },
> > +  /* REG_EVEX_MAP4_F7 */
> > +  {
> > +    { Bad_Opcode },
> > +    { Bad_Opcode },
> > +    { "notQ",	{ VexGv, Ev }, 0 },
> > +    { "negQ",	{ VexGv, Ev }, 0 },
> > +  },
> > +  /* REG_EVEX_MAP4_FE */
> > +  {
> > +    { "incA",   { VexGb ,Eb }, 0 },
> > +    { "decA",   { VexGb ,Eb }, 0 },
> > +  },
> > +  /* REG_EVEX_MAP4_FF */
> > +  {
> > +    { "incQ",   { VexGv ,Ev }, 0 },
> > +    { "decQ",   { VexGv ,Ev }, 0 },
> > +  },
> 
> Same here, plus for the inc/dec some commas are misplaced. Padding also
> looks to be incosnsitent (tab vs blanks).
> 

Done.

> > --- a/opcodes/i386-dis-evex.h
> > +++ b/opcodes/i386-dis-evex.h
> >[...]
> > @@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* 40 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "cmovoS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovnoS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovbS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovaeS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmoveS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovneS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovbeS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovaS",		{ VexGv, Gv, Ev }, 0 },
> >      /* 48 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "cmovsS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovnsS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovpS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovnpS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovlS",		{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovgeS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovleS",	{ VexGv, Gv, Ev }, 0 },
> > +    { "cmovgS",		{ VexGv, Gv, Ev }, 0 },
> 
> Considering CFCMOVcc which sits at the same opcode, doing things like this
> sets us up for needing to touch all of these again. Maybe that's the best that
> can be done, but I still wonder whether this couldn't be taken care of right
> away when introducing these entries.
> 

How about adding a special letter CF% in front of them?

> > @@ -989,7 +989,7 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { "wrussK",	{ M, Gdq }, PREFIX_DATA },
> > -    { PREFIX_TABLE (PREFIX_EVEX_MAP4_66) },
> > +    { PREFIX_TABLE (PREFIX_0F38F6) },
> 
> Perhaps related to the earlier question: What's going on here?
> 

It should be in patch 3/8. changed.

> > @@ -1060,7 +1060,7 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "shldS",		{ VexGv, Ev, Gv, CL }, 0 },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* A8 */
> > @@ -1069,9 +1069,9 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > +    { "shrdS",		{ VexGv, Ev, Gv, CL }, 0 },
> >      { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { "imulS",		{ VexGv, Gv, Ev }, 0 },
> 
> PREFIX_OPCODE (or prefix decoding) again missing in all of these?
> 

Done.

> > @@ -1091,8 +1091,8 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* C0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { REG_TABLE (REG_C0) },
> > +    { REG_TABLE (REG_C1) },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> > @@ -1109,10 +1109,10 @@ static const struct dis386 evex_table[][256] = {
> >      { Bad_Opcode },
> >      { Bad_Opcode },
> >      /* D0 */
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > -    { Bad_Opcode },
> > +    { REG_TABLE (REG_D0) },
> > +    { REG_TABLE (REG_D1) },
> > +    { REG_TABLE (REG_D2) },
> > +    { REG_TABLE (REG_D3) },
> 
> Some form of prefix decoding is going to be needed for these too, despite the
> goal of wanting to re-use the legacy table entries. Perhaps adding
> PREFIX_OPCODE there would be benign to the legacy insns?
> 

Added PREFIX_NP_OR_DATA and NO_PREFIX for them.

> > --- a/opcodes/i386-dis.c
> > +++ b/opcodes/i386-dis.c
> >[...]
> > @@ -2660,47 +2668,47 @@ static const struct dis386 reg_table[][8] = {
> >    },
> >    /* REG_D0 */
> >    {
> > -    { "rolA",	{ Eb, I1 }, 0 },
> > -    { "rorA",	{ Eb, I1 }, 0 },
> > -    { "rclA",	{ Eb, I1 }, 0 },
> > -    { "rcrA",	{ Eb, I1 }, 0 },
> > -    { "shlA",	{ Eb, I1 }, 0 },
> > -    { "shrA",	{ Eb, I1 }, 0 },
> > -    { "shlA",	{ Eb, I1 }, 0 },
> > -    { "sarA",	{ Eb, I1 }, 0 },
> > +    { "rolA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "rorA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "rclA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "rcrA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "shrA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> > +    { "sarA",	{ VexGb, Eb, I1 }, 0 },
> >    },
> >    /* REG_D1 */
> >    {
> > -    { "rolQ",	{ Ev, I1 }, 0 },
> > -    { "rorQ",	{ Ev, I1 }, 0 },
> > -    { "rclQ",	{ Ev, I1 }, 0 },
> > -    { "rcrQ",	{ Ev, I1 }, 0 },
> > -    { "shlQ",	{ Ev, I1 }, 0 },
> > -    { "shrQ",	{ Ev, I1 }, 0 },
> > -    { "shlQ",	{ Ev, I1 }, 0 },
> > -    { "sarQ",	{ Ev, I1 }, 0 },
> > +    { "rolQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "rorQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "rclQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "rcrQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "shrQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> > +    { "sarQ",	{ VexGv, Ev, I1 }, 0 },  
> >    },
> 
> As mentioned on the assembler side already, I think we would be better off
> making const_1_mode print $1 in AT&T syntax at least for these new insn
> forms, to eliminate the ambiguity.
> 

It is related to correctness and should be revised. Since they share the same entries, I will created a new patch to modify the legacy instruction and then extend them to NDD. Do you agree?

> > @@ -9061,6 +9069,15 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >  	  ins->rex &= ~REX_B;
> >  	  ins->rex2 &= ~REX_R;
> >  	}
> > +      if (ins->evex_type == evex_from_legacy)
> > +	{
> > +	  /* EVEX from legacy instructions, when the EVEX.ND bit is 0,
> > +	     all bits of EVEX.vvvv and EVEX.V' must be 1.  */
> > +	  if (!ins->vex.b && (ins->vex.register_specifier
> > +				  || !ins->vex.v))
> > +	    return &bad_opcode;
> > +	  ins->rex |= REX_OPCODE;
> 
> What is this line about?
> 
Changed to :

      /* EVEX from legacy instructions, when the EVEX.ND bit is 0,
         all bits of EVEX.vvvv and EVEX.V' must be 1.  */
      if (ins->evex_type == evex_from_legacy && !ins->vex.b
          && (ins->vex.register_specifier || !ins->vex.v))
        return &bad_opcode;

> > @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
> instr_info *ins)
> >  	return &err_opcode;
> >
> >        /* Set vector length.  */
> > -      if (ins->modrm.mod == 3 && ins->vex.b)
> > +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
> > + evex_default)
> >  	ins->vex.length = 512;
> >        else
> >  	{
> 
> Is this change really needed for anything?
> 

If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong value. 

> > @@ -9530,6 +9547,7 @@ print_insn (bfd_vma pc, disassemble_info *info,
> int intel_syntax)
> >  		    {
> >  		      oappend (&ins, "{bad}");
> >  		      continue;
> > +
> >  		    }
> >
> >  		  /* Instructions with a mask register destination allow for
> 
> Stray and bogus change.
> 

Done.

> > @@ -9553,7 +9571,7 @@ print_insn (bfd_vma pc, disassemble_info *info,
> > int intel_syntax)
> >
> >  	  /* Check whether rounding control was enabled for an insn not
> >  	     supporting it.  */
> > -	  if (ins.modrm.mod == 3 && ins.vex.b
> > +	  if (ins.modrm.mod == 3 && ins.vex.b && ins.evex_type ==
> > +evex_default
> >  	      && !(ins.evex_used & EVEX_b_used))
> >  	    {
> 
> This could do with extending the comment, mentioning the aliasing of
> EVEX.brs and EVEX.nd.
> 

Done.

> > @@ -11013,7 +11031,7 @@ print_displacement (instr_info *ins,
> > bfd_signed_vma val)  static void  intel_operand_size (instr_info *ins,
> > int bytemode, int sizeflag)  {
> > -  if (ins->vex.b)
> > +  if (ins->vex.b && ins->evex_type == evex_default)
> >      {
> >        if (!ins->vex.no_broadcast)
> >  	switch (bytemode)
> 
> This aliasing would also be worthwhile mentioning here and ...
> 
> > @@ -11946,7 +11964,7 @@ OP_E_memory (instr_info *ins, int bytemode,
> int sizeflag)
> >  	  print_operand_value (ins, disp & 0xffff, dis_style_text);
> >  	}
> >      }
> > -  if (ins->vex.b)
> > +  if (ins->vex.b && ins->evex_type == evex_default)
> >      {
> >        ins->evex_used |= EVEX_b_used;
> 
> ... here.
> 

Done.

> > @@ -13307,6 +13325,14 @@ OP_VEX (instr_info *ins, int bytemode, int
> sizeflag ATTRIBUTE_UNUSED)
> >    if (!ins->need_vex)
> >      return true;
> >
> > +  if (ins->evex_type == evex_from_legacy)
> > +    {
> > +      ins->evex_used |= EVEX_b_used;
> > +      /* Here vex.b is treated as "EVEX.ND.  */
> 
> Okay, here you have such a helpful comment, just that - nit - there's an
> unbalanced double-quote. (The comment would also more logically come
> first in this block.)
> 

Done.

Thanks,
Lili.
  
Cui, Lili Nov. 20, 2023, 12:54 p.m. UTC | #8
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, November 20, 2023 4:19 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org; Kong, Lingling <lingling.kong@intel.com>
> Subject: Re: [PATCH 5/8] Support APX NDD
> 
> On 20.11.2023 02:33, Cui, Lili wrote:
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Thursday, November 9, 2023 5:37 PM
> >>
> >> On 02.11.2023 12:29, Cui, Lili wrote:
> >>> @@ -190,6 +193,8 @@ mov, 0xf21, i386|No64,
> >>> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De  mov,
> >> 0xf21,
> >>> x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64,
> { Debug,
> >> Reg64
> >>> }  mov, 0xf24, i386|No64,
> >>> D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test,
> >> Reg32 }
> >>>
> >>> +// Move after swapping the bytes
> >>> +movbe, 0x0f38f0, Movbe,
> >> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, {
> >>> +Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >>>  // Move after swapping the bytes
> >>>  movbe, 0x0f38f0, Movbe,
> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf,
> >> {
> >>> Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
> >> movbe,
> >>> 0x60, Movbe|APX_F,
> >>> D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, {
> >>> Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 } @@
> >>> -290,22 +295,36 @@ add, 0x0, 0,
> >>> D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
> >> { Reg8|Reg16|Reg3
> >>> add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
> >>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
> add,
> >> 0x4,
> >>> 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
> >> Acc|Byte|Word|Dword|Qword }
> >>> add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
> >>> Imm8|Imm16|Imm32|Imm32S,
> >>>
> >>
> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
> >> x }
> >>> +add, 0x0, APX_F,
> >>>
> >>
> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
> >> p4|NF, {
> >>> +Reg8|Reg16|Reg32|Reg64,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex,
> >>> +Reg8|Reg16|Reg32|Reg64 } add, 0x83/0, APX_F,
> >>>
> >>
> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
> >> Map4|N
> >>> +F, { Imm8S,
> >> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
> >>> +Reg16|Reg32|Reg64 } add, 0x80/0, APX_F,
> >>>
> >>
> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4
> >> |NF, {
> >>> +Imm8|Imm16|Imm32|Imm32S,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex,
> >>> +Reg8|Reg16|Reg32|Reg64}
> >>
> >> Why are there (just as an example) only 3 new forms of ADD, but ...
> >>
> >>> @@ -338,10 +366,19 @@ adc, 0x10, 0,
> >>> D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock,
> >> { Reg8|Reg16|Reg
> >>> adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S,
> >>> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
> adc,
> >> 0x14,
> >>> 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S,
> >> Acc|Byte|Word|Dword|Qword }
> >>> adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, {
> >>> Imm8|Imm16|Imm32|Imm32S,
> >>>
> >>
> Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInde
> >> x }
> >>> +adc, 0x10, APX_F,
> >>> +D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, {
> >>> +Reg8|Reg16|Reg32|Reg64,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex }
> >>> +adc, 0x83/2, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf,
> >> { Imm8S,
> >>> +Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
> adc,
> >>> +0x80/2, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, {
> >>> +Imm8|Imm16|Imm32|Imm32S,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex }
> >>> +adc, 0x10, APX_F,
> >>>
> >>
> +D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMa
> >> p4, {
> >>> +Reg8|Reg16|Reg32|Reg64,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex,
> >>> +Reg8|Reg16|Reg32|Reg64 } adc, 0x83/2, APX_F,
> >>>
> >>
> +Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVex
> >> Map4,
> >>> +{ Imm8S,
> >> Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex,
> >>> +Reg16|Reg32|Reg64 } adc, 0x80/2, APX_F,
> >>>
> >>
> +W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4,
> >> {
> >>> +Imm8|Imm16|Imm32|Imm32S,
> >>>
> >>
> +Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseInd
> >> ex,
> >>> +Reg8|Reg16|Reg32|Reg64 }
> >>
> >> .... 6 new forms of ADC? My guess is that's NF-related, but doesn't
> >> that mean that until NF support is added e.g. "{evex} add %eax, %eax"
> >> won't assemble as intended? IOW in line with NF as an attribute being
> >> added right here, those other templates also will want introducing right
> here.
> >>
> >> While looking at patch 7 I'm also wondering whether same-base-opcode
> >> forms wouldn't better be kept together. I haven't finished yet
> >> looking at what's needed there, but even if it doesn't turn out
> >> strictly necessary there it may still be a good idea anyway (unless
> >> of course that would get in the way of anything).
> >>
> >
> > For ADC, there are 3 templates that do not support NDD and NF, I moved
> them here from patch 3/8 which we discussed before since its decoding is
> supported together in the NDD patch.
> 
> Feels like my question wasn't really answered: Why are there 6 new ADC
> templates here, but only 3 ADD ones? Or in other words: Why are the other 3
> ADD ones only added later?
> 

The three extra items of ADC are special, they support neither ND nor NF. ADD does not have this situation. The ND part of add is here, and the remaining part of NF place in NF patch.

> Also I'm curious: Why do ADC and SBB not allow for NF? They always
> consume EFLAGS.CF, but subsequent insns may have no need for their EFLAGS
> output, just like for e.g. ADD and SUB. Yet NF is only about EFLAGS output
> aiui, not about any bit(s) consumed from EFLAGS.
> 

I will get back to you when I know the answer。

Thanks,
Lili.
  
Jan Beulich Nov. 20, 2023, 4:33 p.m. UTC | #9
On 20.11.2023 13:36, Cui, Lili wrote:
>> On 02.11.2023 12:29, Cui, Lili wrote:
>>> --- a/opcodes/i386-dis-evex.h
>>> +++ b/opcodes/i386-dis-evex.h
>>> [...]
>>> @@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
>>>      { Bad_Opcode },
>>>      { Bad_Opcode },
>>>      /* 40 */
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> +    { "cmovoS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovnoS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovbS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovaeS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmoveS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovneS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovbeS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovaS",		{ VexGv, Gv, Ev }, 0 },
>>>      /* 48 */
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> -    { Bad_Opcode },
>>> +    { "cmovsS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovnsS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovpS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovnpS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovlS",		{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovgeS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovleS",	{ VexGv, Gv, Ev }, 0 },
>>> +    { "cmovgS",		{ VexGv, Gv, Ev }, 0 },
>>
>> Considering CFCMOVcc which sits at the same opcode, doing things like this
>> sets us up for needing to touch all of these again. Maybe that's the best that
>> can be done, but I still wonder whether this couldn't be taken care of right
>> away when introducing these entries.
> 
> How about adding a special letter CF% in front of them?

Why not.

>>> --- a/opcodes/i386-dis.c
>>> +++ b/opcodes/i386-dis.c
>>> [...]
>>> @@ -2660,47 +2668,47 @@ static const struct dis386 reg_table[][8] = {
>>>    },
>>>    /* REG_D0 */
>>>    {
>>> -    { "rolA",	{ Eb, I1 }, 0 },
>>> -    { "rorA",	{ Eb, I1 }, 0 },
>>> -    { "rclA",	{ Eb, I1 }, 0 },
>>> -    { "rcrA",	{ Eb, I1 }, 0 },
>>> -    { "shlA",	{ Eb, I1 }, 0 },
>>> -    { "shrA",	{ Eb, I1 }, 0 },
>>> -    { "shlA",	{ Eb, I1 }, 0 },
>>> -    { "sarA",	{ Eb, I1 }, 0 },
>>> +    { "rolA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "rorA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "rclA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "rcrA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "shrA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
>>> +    { "sarA",	{ VexGb, Eb, I1 }, 0 },
>>>    },
>>>    /* REG_D1 */
>>>    {
>>> -    { "rolQ",	{ Ev, I1 }, 0 },
>>> -    { "rorQ",	{ Ev, I1 }, 0 },
>>> -    { "rclQ",	{ Ev, I1 }, 0 },
>>> -    { "rcrQ",	{ Ev, I1 }, 0 },
>>> -    { "shlQ",	{ Ev, I1 }, 0 },
>>> -    { "shrQ",	{ Ev, I1 }, 0 },
>>> -    { "shlQ",	{ Ev, I1 }, 0 },
>>> -    { "sarQ",	{ Ev, I1 }, 0 },
>>> +    { "rolQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "rorQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "rclQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "rcrQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "shrQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
>>> +    { "sarQ",	{ VexGv, Ev, I1 }, 0 },  
>>>    },
>>
>> As mentioned on the assembler side already, I think we would be better off
>> making const_1_mode print $1 in AT&T syntax at least for these new insn
>> forms, to eliminate the ambiguity.
>>
> 
> It is related to correctness and should be revised. Since they share the same entries, I will created a new patch to modify the legacy instruction and then extend them to NDD. Do you agree?

I certainly appreciate any reusing, where it is possible (and it ought to be
possible here, yes).

>>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
>> instr_info *ins)
>>>  	return &err_opcode;
>>>
>>>        /* Set vector length.  */
>>> -      if (ins->modrm.mod == 3 && ins->vex.b)
>>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
>>> + evex_default)
>>>  	ins->vex.length = 512;
>>>        else
>>>  	{
>>
>> Is this change really needed for anything?
> 
> If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong value. 

But this is recording ->vex.length, not anything NDD related (afaics).

Jan
  
Jan Beulich Nov. 20, 2023, 4:43 p.m. UTC | #10
On 20.11.2023 13:54, Cui, Lili wrote:
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Monday, November 20, 2023 4:19 PM
>>
>> On 20.11.2023 02:33, Cui, Lili wrote:
>>> For ADC, there are 3 templates that do not support NDD and NF, I moved
>> them here from patch 3/8 which we discussed before since its decoding is
>> supported together in the NDD patch.
>>
>> Feels like my question wasn't really answered: Why are there 6 new ADC
>> templates here, but only 3 ADD ones? Or in other words: Why are the other 3
>> ADD ones only added later?
>>
> 
> The three extra items of ADC are special, they support neither ND nor NF. ADD does not have this situation. The ND part of add is here, and the remaining part of NF place in NF patch.

Why would ADD not have this situation? Without 2-operand EVEX templates,
how would you deal with e.g.

	{evex} add %eax, %ecx

when at the same time you (correctly) permit

	{evex} adc %eax, %ecx

? The difference between ADD and ADC is solely the presence / absence of
the NF attribute.

Jan
  
Cui, Lili Nov. 22, 2023, 7:46 a.m. UTC | #11
> >>> --- a/opcodes/i386-dis-evex.h
> >>> +++ b/opcodes/i386-dis-evex.h
> >>> [...]
> >>> @@ -947,23 +947,23 @@ static const struct dis386 evex_table[][256] = {
> >>>      { Bad_Opcode },
> >>>      { Bad_Opcode },
> >>>      /* 40 */
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> +    { "cmovoS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovnoS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovbS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovaeS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmoveS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovneS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovbeS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovaS",		{ VexGv, Gv, Ev }, 0 },
> >>>      /* 48 */
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> -    { Bad_Opcode },
> >>> +    { "cmovsS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovnsS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovpS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovnpS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovlS",		{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovgeS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovleS",	{ VexGv, Gv, Ev }, 0 },
> >>> +    { "cmovgS",		{ VexGv, Gv, Ev }, 0 },
> >>
> >> Considering CFCMOVcc which sits at the same opcode, doing things like
> >> this sets us up for needing to touch all of these again. Maybe that's
> >> the best that can be done, but I still wonder whether this couldn't
> >> be taken care of right away when introducing these entries.
> >
> > How about adding a special letter %CF in front of them?
> 
> Why not.
>

Done.
 
> >>> --- a/opcodes/i386-dis.c
> >>> +++ b/opcodes/i386-dis.c
> >>> [...]
> >>> @@ -2660,47 +2668,47 @@ static const struct dis386 reg_table[][8] = {
> >>>    },
> >>>    /* REG_D0 */
> >>>    {
> >>> -    { "rolA",	{ Eb, I1 }, 0 },
> >>> -    { "rorA",	{ Eb, I1 }, 0 },
> >>> -    { "rclA",	{ Eb, I1 }, 0 },
> >>> -    { "rcrA",	{ Eb, I1 }, 0 },
> >>> -    { "shlA",	{ Eb, I1 }, 0 },
> >>> -    { "shrA",	{ Eb, I1 }, 0 },
> >>> -    { "shlA",	{ Eb, I1 }, 0 },
> >>> -    { "sarA",	{ Eb, I1 }, 0 },
> >>> +    { "rolA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "rorA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "rclA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "rcrA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "shrA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "shlA",	{ VexGb, Eb, I1 }, 0 },
> >>> +    { "sarA",	{ VexGb, Eb, I1 }, 0 },
> >>>    },
> >>>    /* REG_D1 */
> >>>    {
> >>> -    { "rolQ",	{ Ev, I1 }, 0 },
> >>> -    { "rorQ",	{ Ev, I1 }, 0 },
> >>> -    { "rclQ",	{ Ev, I1 }, 0 },
> >>> -    { "rcrQ",	{ Ev, I1 }, 0 },
> >>> -    { "shlQ",	{ Ev, I1 }, 0 },
> >>> -    { "shrQ",	{ Ev, I1 }, 0 },
> >>> -    { "shlQ",	{ Ev, I1 }, 0 },
> >>> -    { "sarQ",	{ Ev, I1 }, 0 },
> >>> +    { "rolQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "rorQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "rclQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "rcrQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "shrQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
> >>> +    { "sarQ",	{ VexGv, Ev, I1 }, 0 },
> >>>    },
> >>
> >> As mentioned on the assembler side already, I think we would be
> >> better off making const_1_mode print $1 in AT&T syntax at least for
> >> these new insn forms, to eliminate the ambiguity.
> >>
> >
> > It is related to correctness and should be revised. Since they share the same
> entries, I will created a new patch to modify the legacy instruction and then
> extend them to NDD. Do you agree?
> 
> I certainly appreciate any reusing, where it is possible (and it ought to be
> possible here, yes).
> 

Done.

> >>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
> >> instr_info *ins)
> >>>  	return &err_opcode;
> >>>
> >>>        /* Set vector length.  */
> >>> -      if (ins->modrm.mod == 3 && ins->vex.b)
> >>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
> >>> + evex_default)
> >>>  	ins->vex.length = 512;
> >>>        else
> >>>  	{
> >>
> >> Is this change really needed for anything?
> >
> > If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong value.
> 
> But this is recording ->vex.length, not anything NDD related (afaics).

There are some instructions that use OP_VEX, which will use ->vex.length.

For example:
"addB",             { VexGb, Eb, Gb }
  
Jan Beulich Nov. 22, 2023, 8:47 a.m. UTC | #12
On 22.11.2023 08:46, Cui, Lili wrote:
>>>>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
>>>> instr_info *ins)
>>>>>  	return &err_opcode;
>>>>>
>>>>>        /* Set vector length.  */
>>>>> -      if (ins->modrm.mod == 3 && ins->vex.b)
>>>>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
>>>>> + evex_default)
>>>>>  	ins->vex.length = 512;
>>>>>        else
>>>>>  	{
>>>>
>>>> Is this change really needed for anything?
>>>
>>> If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong value.
>>
>> But this is recording ->vex.length, not anything NDD related (afaics).
> 
> There are some instructions that use OP_VEX, which will use ->vex.length.
> 
> For example:
> "addB",             { VexGb, Eb, Gb }

But that's a GPR, for which ->vex.length is not supposed to have an effect.

Jan
  
Cui, Lili Nov. 22, 2023, 10:45 a.m. UTC | #13
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Wednesday, November 22, 2023 4:48 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org; Kong, Lingling <lingling.kong@intel.com>
> Subject: Re: [PATCH 5/8] Support APX NDD
> 
> On 22.11.2023 08:46, Cui, Lili wrote:
> >>>>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
> >>>> instr_info *ins)
> >>>>>  	return &err_opcode;
> >>>>>
> >>>>>        /* Set vector length.  */
> >>>>> -      if (ins->modrm.mod == 3 && ins->vex.b)
> >>>>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
> >>>>> + evex_default)
> >>>>>  	ins->vex.length = 512;
> >>>>>        else
> >>>>>  	{
> >>>>
> >>>> Is this change really needed for anything?
> >>>
> >>> If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong
> value.
> >>
> >> But this is recording ->vex.length, not anything NDD related (afaics).
> >
> > There are some instructions that use OP_VEX, which will use ->vex.length.
> >
> > For example:
> > "addB",             { VexGb, Eb, Gb }
> 
> But that's a GPR, for which ->vex.length is not supposed to have an effect.
> 

For EVEX-promoted instructions,  evex.ll == 0b00, which has the same encoding as vex.length == 128 and they can share the same processing with ->vex.length, ->vex.length also handles GPR in OP_VEX.


  switch (ins->vex.length)
    {
    case 128:
      switch (bytemode)
        {
        case x_mode:
          names = att_names_xmm;
          ins->evex_used |= EVEX_len_used;
          break;
        case v_mode:
        case dq_mode:
          if (ins->rex & REX_W)
            names = att_names64;
          else if (bytemode == v_mode
                   && !(sizeflag & DFLAG))
            names = att_names16;
          else
            names = att_names32;
          break;
        case b_mode:
          names = att_names8rex;
          break;
        case q_mode:
          names = att_names64;
          break;
        case mask_bd_mode:
        case mask_mode:
          if (reg > 0x7)
            {
              oappend (ins, "(bad)");
              return true;
            }
          names = att_names_mask;
          break;
        default:
          abort ();
          return true;
        }
      break;

Thanks,
Lili.
  
Jan Beulich Nov. 23, 2023, 10:57 a.m. UTC | #14
On 22.11.2023 11:45, Cui, Lili wrote:
> 
> 
>> -----Original Message-----
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Wednesday, November 22, 2023 4:48 PM
>> To: Cui, Lili <lili.cui@intel.com>
>> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
>> binutils@sourceware.org; Kong, Lingling <lingling.kong@intel.com>
>> Subject: Re: [PATCH 5/8] Support APX NDD
>>
>> On 22.11.2023 08:46, Cui, Lili wrote:
>>>>>>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386 *dp,
>>>>>> instr_info *ins)
>>>>>>>  	return &err_opcode;
>>>>>>>
>>>>>>>        /* Set vector length.  */
>>>>>>> -      if (ins->modrm.mod == 3 && ins->vex.b)
>>>>>>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type ==
>>>>>>> + evex_default)
>>>>>>>  	ins->vex.length = 512;
>>>>>>>        else
>>>>>>>  	{
>>>>>>
>>>>>> Is this change really needed for anything?
>>>>>
>>>>> If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a wrong
>> value.
>>>>
>>>> But this is recording ->vex.length, not anything NDD related (afaics).
>>>
>>> There are some instructions that use OP_VEX, which will use ->vex.length.
>>>
>>> For example:
>>> "addB",             { VexGb, Eb, Gb }
>>
>> But that's a GPR, for which ->vex.length is not supposed to have an effect.
>>
> 
> For EVEX-promoted instructions,  evex.ll == 0b00, which has the same encoding as vex.length == 128 and they can share the same processing with ->vex.length, ->vex.length also handles GPR in OP_VEX.
> 
> 
>   switch (ins->vex.length)
>     {
>     case 128:
>       switch (bytemode)
>         {
>         case x_mode:
>           names = att_names_xmm;
>           ins->evex_used |= EVEX_len_used;
>           break;
>         case v_mode:
>         case dq_mode:
>           if (ins->rex & REX_W)
>             names = att_names64;
>           else if (bytemode == v_mode
>                    && !(sizeflag & DFLAG))
>             names = att_names16;
>           else
>             names = att_names32;
>           break;
>         case b_mode:
>           names = att_names8rex;
>           break;
>         case q_mode:
>           names = att_names64;
>           break;
>         case mask_bd_mode:
>         case mask_mode:
>           if (reg > 0x7)
>             {
>               oappend (ins, "(bad)");
>               return true;
>             }
>           names = att_names_mask;
>           break;
>         default:
>           abort ();
>           return true;
>         }
>       break;

Hmm, okay, I see that this is then down to an anomaly in pre-existing code.
I don't think any of what's quoted above should actually depend on
->vex.length; it'll surely need to change once the first VEX/EVEX-encoded
insns appears while has .l / .ll != 0 but still encodes a GPR. Yet then
nothing I can sensibly demand you fix up front. Without which what I'd like
to ask for (re-iterating earlier remarks): For anything not obvious (which
this falls under), please add some explanation to the patch description.
The general issue there is that while ChangeLog entries enumerate what is
being done, they hardly ever say _why_ certain changes are needed.

Jan
  
Cui, Lili Nov. 23, 2023, 12:14 p.m. UTC | #15
> -----Original Message-----
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Thursday, November 23, 2023 6:58 PM
> To: Cui, Lili <lili.cui@intel.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> binutils@sourceware.org; Kong, Lingling <lingling.kong@intel.com>
> Subject: Re: [PATCH 5/8] Support APX NDD
> 
> On 22.11.2023 11:45, Cui, Lili wrote:
> >
> >
> >> -----Original Message-----
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Wednesday, November 22, 2023 4:48 PM
> >> To: Cui, Lili <lili.cui@intel.com>
> >> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; ccoutant@gmail.com;
> >> binutils@sourceware.org; Kong, Lingling <lingling.kong@intel.com>
> >> Subject: Re: [PATCH 5/8] Support APX NDD
> >>
> >> On 22.11.2023 08:46, Cui, Lili wrote:
> >>>>>>> @@ -9087,7 +9104,7 @@ get_valid_dis386 (const struct dis386
> *dp,
> >>>>>> instr_info *ins)
> >>>>>>>  	return &err_opcode;
> >>>>>>>
> >>>>>>>        /* Set vector length.  */
> >>>>>>> -      if (ins->modrm.mod == 3 && ins->vex.b)
> >>>>>>> +      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type
> >>>>>>> + ==
> >>>>>>> + evex_default)
> >>>>>>>  	ins->vex.length = 512;
> >>>>>>>        else
> >>>>>>>  	{
> >>>>>>
> >>>>>> Is this change really needed for anything?
> >>>>>
> >>>>> If it's NDD and ins->vex.b ==1, we need to avoid giving NDD a
> >>>>> wrong
> >> value.
> >>>>
> >>>> But this is recording ->vex.length, not anything NDD related (afaics).
> >>>
> >>> There are some instructions that use OP_VEX, which will use ->vex.length.
> >>>
> >>> For example:
> >>> "addB",             { VexGb, Eb, Gb }
> >>
> >> But that's a GPR, for which ->vex.length is not supposed to have an effect.
> >>
> >
> > For EVEX-promoted instructions,  evex.ll == 0b00, which has the same
> encoding as vex.length == 128 and they can share the same processing with -
> >vex.length, ->vex.length also handles GPR in OP_VEX.
> >
> >
> >   switch (ins->vex.length)
> >     {
> >     case 128:
> >       switch (bytemode)
> >         {
> >         case x_mode:
> >           names = att_names_xmm;
> >           ins->evex_used |= EVEX_len_used;
> >           break;
> >         case v_mode:
> >         case dq_mode:
> >           if (ins->rex & REX_W)
> >             names = att_names64;
> >           else if (bytemode == v_mode
> >                    && !(sizeflag & DFLAG))
> >             names = att_names16;
> >           else
> >             names = att_names32;
> >           break;
> >         case b_mode:
> >           names = att_names8rex;
> >           break;
> >         case q_mode:
> >           names = att_names64;
> >           break;
> >         case mask_bd_mode:
> >         case mask_mode:
> >           if (reg > 0x7)
> >             {
> >               oappend (ins, "(bad)");
> >               return true;
> >             }
> >           names = att_names_mask;
> >           break;
> >         default:
> >           abort ();
> >           return true;
> >         }
> >       break;
> 
> Hmm, okay, I see that this is then down to an anomaly in pre-existing code.
> I don't think any of what's quoted above should actually depend on
> ->vex.length; it'll surely need to change once the first
> ->VEX/EVEX-encoded
> insns appears while has .l / .ll != 0 but still encodes a GPR. Yet then nothing I
> can sensibly demand you fix up front. Without which what I'd like to ask for
> (re-iterating earlier remarks): For anything not obvious (which this falls under),
> please add some explanation to the patch description.
> The general issue there is that while ChangeLog entries enumerate what is
> being done, they hardly ever say _why_ certain changes are needed.
> 
Thank you for your suggestion,  comments are really important. These patches were not written by me, when you asked about many strange places without comments, I actually couldn’t understand them either. I could only remove the code and compile it to extract the information from the wrong testcase.  I also realized the importance of comments.

Regards,
Lili.
  
Cui, Lili Dec. 7, 2023, 8:17 a.m. UTC | #16
Hi H.J,

Could you help review these APX patches? There are 11 patches in total.

[PATCH] Clean reg class and base_reg for input output operand (%dx). (for legacy)
[PATCH 1/9] Make const_1_mode print $1 in AT&T syntax. (for legacy)
[PATCH v3 2/9] Support APX GPR32 with rex2 prefix.
[PATCH v3 3/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
[PATCH v3 4/9] Support APX GPR32 with extend evex prefix.
[PATCH v3 5/9] Add tests for APX GPR32 with extend evex prefix.
[PATCH v3 6/9] Support APX NDD.
[PATCH v3 7/9] Support APX Push2/Pop2.
[PATCH] Support APX PUSHP/POPP.
[PATCH v3 8/9] Support APX NDD optimized encoding.
[PATCH v3 9/9] Support APX JMPABS for disassembler.

There are 3 patches that need to be updated. Lin and I will post later.
 [PATCH v3 2/9] Support APX GPR32 with rex2 prefix
 [PATCH] Support APX PUSHP/POPP
 [PATCH v3 9/9] Support APX JMPABS for disassembler

Thanks,
Lili.

> -----Original Message-----
> From: Cui, Lili <lili.cui@intel.com>
> Sent: Friday, November 24, 2023 2:56 PM
> To: Beulich, Jan <JBeulich@suse.com>
> Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org
> Subject: [PATCH v3 0/9] Support Intel APX EGPR
> 
> 
> This is V3 of all APX patches.
> 1. Created a patch to make const_1_mode print $1 in AT&T syntax.
> 2. How to print the rex2 prefix needs to be discussed later.
> 3. After NF patch, need to add tests for pfx macros emitting {evex} in
> noreg64.s.
> 
> Cui, Lili (5):
>   Make const_1_mode print $1 in AT&T syntax
>   Support APX GPR32 with rex2 prefix
>   Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
>   Support APX GPR32 with extend evex prefix
>   Add tests for APX GPR32 with extend evex prefix
> 
> Hu, Lin1 (2):
>   Support APX NDD optimized encoding.
>   Support APX JMPABS for disassembler
> 
> Mo, Zewei (1):
>   Support APX Push2/Pop2
> 
> konglin1 (1):
>   Support APX NDD
> 
> 
> Thanks,
> Lili.
>  gas/config/tc-i386.c                          | 472 +++++++++++--
>  gas/doc/c-i386.texi                           |   6 +-
>  gas/testsuite/gas/i386/apx-push2pop2-inval.l  |   5 +
>  gas/testsuite/gas/i386/apx-push2pop2-inval.s  |   9 +
>  gas/testsuite/gas/i386/i386.exp               |   1 +
>  .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +-
>  .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +-
>  gas/testsuite/gas/i386/intel.d                |   6 +-
>  gas/testsuite/gas/i386/lfence-load.d          |   2 +-
>  gas/testsuite/gas/i386/noreg16-data32.d       |  32 +-
>  gas/testsuite/gas/i386/noreg16.d              |  32 +-
>  gas/testsuite/gas/i386/noreg32-data16.d       |  32 +-
>  gas/testsuite/gas/i386/noreg32.d              |  32 +-
>  gas/testsuite/gas/i386/noreg64-data16.d       |  32 +-
>  gas/testsuite/gas/i386/noreg64-rex64.d        |  32 +-
>  gas/testsuite/gas/i386/noreg64.d              |  32 +-
>  gas/testsuite/gas/i386/opcode-suffix.d        |   6 +-
>  gas/testsuite/gas/i386/opcode.d               |  10 +-
>  .../gas/i386/x86-64-apx-egpr-inval.l          | 203 ++++++
>  .../gas/i386/x86-64-apx-egpr-inval.s          | 210 ++++++
>  .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 +  .../gas/i386/x86-64-apx-
> egpr-promote-inval.s  |  29 +  gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |
> 20 +  gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 +
>  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  34 +
>  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  36 +
>  .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 +++++++++
>  .../gas/i386/x86-64-apx-evex-promoted.d       | 318 +++++++++
>  .../gas/i386/x86-64-apx-evex-promoted.s       | 314 +++++++++
>  .../gas/i386/x86-64-apx-jmpabs-intel.d        |  11 +
>  .../gas/i386/x86-64-apx-jmpabs-inval.d        |  40 ++
>  .../gas/i386/x86-64-apx-jmpabs-inval.s        |  15 +
>  gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    |  11 +
>  gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |   5 +
>  .../gas/i386/x86-64-apx-ndd-optimize.d        | 130 ++++
>  .../gas/i386/x86-64-apx-ndd-optimize.s        | 123 ++++
>  gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 +++++
>  gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 +++++
>  .../gas/i386/x86-64-apx-push2pop2-intel.d     |  42 ++
>  .../gas/i386/x86-64-apx-push2pop2-inval.l     |  13 +
>  .../gas/i386/x86-64-apx-push2pop2-inval.s     |  17 +
>  gas/testsuite/gas/i386/x86-64-apx-push2pop2.d |  42 ++
> gas/testsuite/gas/i386/x86-64-apx-push2pop2.s |  39 ++
>  gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 +++
>  gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  86 +++
>  gas/testsuite/gas/i386/x86-64-evex.d          |   2 +-
>  gas/testsuite/gas/i386/x86-64-inval-pseudo.l  |   6 +
>  gas/testsuite/gas/i386/x86-64-inval-pseudo.s  |   4 +
>  gas/testsuite/gas/i386/x86-64-lfence-load.d   |   2 +-
>  .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
>  gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
>  gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
>  gas/testsuite/gas/i386/x86-64-opcode.d        |   6 +-
>  gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  59 +-
>  gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  58 ++
>  gas/testsuite/gas/i386/x86-64-pseudos.d       |  63 ++
>  gas/testsuite/gas/i386/x86-64-pseudos.s       |  65 ++
>  gas/testsuite/gas/i386/x86-64.exp             |  17 +-
>  include/opcode/i386.h                         |   2 +
>  opcodes/i386-dis-evex-len.h                   |  10 +
>  opcodes/i386-dis-evex-prefix.h                |  66 ++
>  opcodes/i386-dis-evex-reg.h                   |  71 ++
>  opcodes/i386-dis-evex-w.h                     |  10 +
>  opcodes/i386-dis-evex-x86-64.h                |  60 ++
>  opcodes/i386-dis-evex.h                       | 347 +++++++++-
>  opcodes/i386-dis.c                            | 644 +++++++++++++-----
>  opcodes/i386-gen.c                            |  57 +-
>  opcodes/i386-opc.h                            |  30 +-
>  opcodes/i386-opc.tbl                          | 210 ++++--
>  opcodes/i386-reg.tbl                          |  64 ++
>  70 files changed, 4669 insertions(+), 570 deletions(-)  create mode 100644
> gas/testsuite/gas/i386/apx-push2pop2-inval.l
>  create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-
> bad.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-
> intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
>  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
>  create mode 100644 opcodes/i386-dis-evex-x86-64.h
> 
> --
> 2.25.1
  
Cui, Lili Dec. 7, 2023, 8:33 a.m. UTC | #17
Sorry, adjust the format.

 [PATCH] Clean reg class and base_reg for input output operand (%dx). (for legacy) .
 [PATCH 1/9] Make const_1_mode print $1 in AT&T syntax. (for legacy).
 [PATCH v3 2/9] Support APX GPR32 with rex2 prefix.
 [PATCH v3 3/9] Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
 [PATCH v3 4/9] Support APX GPR32 with extend evex prefix.
 [PATCH v3 5/9] Add tests for APX GPR32 with extend evex prefix.
 [PATCH v3 6/9] Support APX NDD.
 [PATCH v3 7/9] Support APX Push2/Pop2.
 [PATCH] Support APX PUSHP/POPP.
 [PATCH v3 8/9] Support APX NDD optimized encoding.
 [PATCH v3 9/9] Support APX JMPABS for disassembler.
 
 There are 3 patches that need to be updated. Lin and I will post later.
  [PATCH v3 2/9] Support APX GPR32 with rex2 prefix.
  [PATCH] Support APX PUSHP/POPP.
  [PATCH v3 9/9] Support APX JMPABS for disassembler.


Regards,
Lili.

> -----Original Message-----
> From: Cui, Lili
> Sent: Thursday, December 7, 2023 4:17 PM
> To: H.J. Lu <hjl.tools@gmail.com>
> Cc: binutils@sourceware.org
> Subject: RE: [PATCH v3 0/9] Support Intel APX EGPR
> 
> Hi H.J,
> 
> Could you help review these APX patches? There are 11 patches in total.
> 
> [PATCH] Clean reg class and base_reg for input output operand (%dx). (for
> legacy) [PATCH 1/9] Make const_1_mode print $1 in AT&T syntax. (for legacy)
> [PATCH v3 2/9] Support APX GPR32 with rex2 prefix.
> [PATCH v3 3/9] Created an empty EVEX_MAP4_ sub-table for EVEX
> instructions.
> [PATCH v3 4/9] Support APX GPR32 with extend evex prefix.
> [PATCH v3 5/9] Add tests for APX GPR32 with extend evex prefix.
> [PATCH v3 6/9] Support APX NDD.
> [PATCH v3 7/9] Support APX Push2/Pop2.
> [PATCH] Support APX PUSHP/POPP.
> [PATCH v3 8/9] Support APX NDD optimized encoding.
> [PATCH v3 9/9] Support APX JMPABS for disassembler.
> 
> There are 3 patches that need to be updated. Lin and I will post later.
>  [PATCH v3 2/9] Support APX GPR32 with rex2 prefix  [PATCH] Support APX
> PUSHP/POPP  [PATCH v3 9/9] Support APX JMPABS for disassembler
> 
> Thanks,
> Lili.
> 
> > -----Original Message-----
> > From: Cui, Lili <lili.cui@intel.com>
> > Sent: Friday, November 24, 2023 2:56 PM
> > To: Beulich, Jan <JBeulich@suse.com>
> > Cc: Lu, Hongjiu <hongjiu.lu@intel.com>; binutils@sourceware.org
> > Subject: [PATCH v3 0/9] Support Intel APX EGPR
> >
> >
> > This is V3 of all APX patches.
> > 1. Created a patch to make const_1_mode print $1 in AT&T syntax.
> > 2. How to print the rex2 prefix needs to be discussed later.
> > 3. After NF patch, need to add tests for pfx macros emitting {evex} in
> > noreg64.s.
> >
> > Cui, Lili (5):
> >   Make const_1_mode print $1 in AT&T syntax
> >   Support APX GPR32 with rex2 prefix
> >   Created an empty EVEX_MAP4_ sub-table for EVEX instructions.
> >   Support APX GPR32 with extend evex prefix
> >   Add tests for APX GPR32 with extend evex prefix
> >
> > Hu, Lin1 (2):
> >   Support APX NDD optimized encoding.
> >   Support APX JMPABS for disassembler
> >
> > Mo, Zewei (1):
> >   Support APX Push2/Pop2
> >
> > konglin1 (1):
> >   Support APX NDD
> >
> >
> > Thanks,
> > Lili.
> >  gas/config/tc-i386.c                          | 472 +++++++++++--
> >  gas/doc/c-i386.texi                           |   6 +-
> >  gas/testsuite/gas/i386/apx-push2pop2-inval.l  |   5 +
> >  gas/testsuite/gas/i386/apx-push2pop2-inval.s  |   9 +
> >  gas/testsuite/gas/i386/i386.exp               |   1 +
> >  .../i386/ilp32/x86-64-opcode-inval-intel.d    |  47 +-
> >  .../gas/i386/ilp32/x86-64-opcode-inval.d      |  47 +-
> >  gas/testsuite/gas/i386/intel.d                |   6 +-
> >  gas/testsuite/gas/i386/lfence-load.d          |   2 +-
> >  gas/testsuite/gas/i386/noreg16-data32.d       |  32 +-
> >  gas/testsuite/gas/i386/noreg16.d              |  32 +-
> >  gas/testsuite/gas/i386/noreg32-data16.d       |  32 +-
> >  gas/testsuite/gas/i386/noreg32.d              |  32 +-
> >  gas/testsuite/gas/i386/noreg64-data16.d       |  32 +-
> >  gas/testsuite/gas/i386/noreg64-rex64.d        |  32 +-
> >  gas/testsuite/gas/i386/noreg64.d              |  32 +-
> >  gas/testsuite/gas/i386/opcode-suffix.d        |   6 +-
> >  gas/testsuite/gas/i386/opcode.d               |  10 +-
> >  .../gas/i386/x86-64-apx-egpr-inval.l          | 203 ++++++
> >  .../gas/i386/x86-64-apx-egpr-inval.s          | 210 ++++++
> >  .../gas/i386/x86-64-apx-egpr-promote-inval.l  |  20 +
> > .../gas/i386/x86-64-apx- egpr-promote-inval.s  |  29 +
> > gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d |
> > 20 +  gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s |  21 +
> >  .../gas/i386/x86-64-apx-evex-promoted-bad.d   |  34 +
> >  .../gas/i386/x86-64-apx-evex-promoted-bad.s   |  36 +
> >  .../gas/i386/x86-64-apx-evex-promoted-intel.d | 318 +++++++++
> >  .../gas/i386/x86-64-apx-evex-promoted.d       | 318 +++++++++
> >  .../gas/i386/x86-64-apx-evex-promoted.s       | 314 +++++++++
> >  .../gas/i386/x86-64-apx-jmpabs-intel.d        |  11 +
> >  .../gas/i386/x86-64-apx-jmpabs-inval.d        |  40 ++
> >  .../gas/i386/x86-64-apx-jmpabs-inval.s        |  15 +
> >  gas/testsuite/gas/i386/x86-64-apx-jmpabs.d    |  11 +
> >  gas/testsuite/gas/i386/x86-64-apx-jmpabs.s    |   5 +
> >  .../gas/i386/x86-64-apx-ndd-optimize.d        | 130 ++++
> >  .../gas/i386/x86-64-apx-ndd-optimize.s        | 123 ++++
> >  gas/testsuite/gas/i386/x86-64-apx-ndd.d       | 160 +++++
> >  gas/testsuite/gas/i386/x86-64-apx-ndd.s       | 155 +++++
> >  .../gas/i386/x86-64-apx-push2pop2-intel.d     |  42 ++
> >  .../gas/i386/x86-64-apx-push2pop2-inval.l     |  13 +
> >  .../gas/i386/x86-64-apx-push2pop2-inval.s     |  17 +
> >  gas/testsuite/gas/i386/x86-64-apx-push2pop2.d |  42 ++
> > gas/testsuite/gas/i386/x86-64-apx-push2pop2.s |  39 ++
> >  gas/testsuite/gas/i386/x86-64-apx-rex2.d      |  83 +++
> >  gas/testsuite/gas/i386/x86-64-apx-rex2.s      |  86 +++
> >  gas/testsuite/gas/i386/x86-64-evex.d          |   2 +-
> >  gas/testsuite/gas/i386/x86-64-inval-pseudo.l  |   6 +
> >  gas/testsuite/gas/i386/x86-64-inval-pseudo.s  |   4 +
> >  gas/testsuite/gas/i386/x86-64-lfence-load.d   |   2 +-
> >  .../gas/i386/x86-64-opcode-inval-intel.d      |  26 +-
> >  gas/testsuite/gas/i386/x86-64-opcode-inval.d  |  26 +-
> >  gas/testsuite/gas/i386/x86-64-opcode-inval.s  |   4 -
> >  gas/testsuite/gas/i386/x86-64-opcode.d        |   6 +-
> >  gas/testsuite/gas/i386/x86-64-pseudos-bad.l   |  59 +-
> >  gas/testsuite/gas/i386/x86-64-pseudos-bad.s   |  58 ++
> >  gas/testsuite/gas/i386/x86-64-pseudos.d       |  63 ++
> >  gas/testsuite/gas/i386/x86-64-pseudos.s       |  65 ++
> >  gas/testsuite/gas/i386/x86-64.exp             |  17 +-
> >  include/opcode/i386.h                         |   2 +
> >  opcodes/i386-dis-evex-len.h                   |  10 +
> >  opcodes/i386-dis-evex-prefix.h                |  66 ++
> >  opcodes/i386-dis-evex-reg.h                   |  71 ++
> >  opcodes/i386-dis-evex-w.h                     |  10 +
> >  opcodes/i386-dis-evex-x86-64.h                |  60 ++
> >  opcodes/i386-dis-evex.h                       | 347 +++++++++-
> >  opcodes/i386-dis.c                            | 644 +++++++++++++-----
> >  opcodes/i386-gen.c                            |  57 +-
> >  opcodes/i386-opc.h                            |  30 +-
> >  opcodes/i386-opc.tbl                          | 210 ++++--
> >  opcodes/i386-reg.tbl                          |  64 ++
> >  70 files changed, 4669 insertions(+), 570 deletions(-)  create mode
> > 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.l
> >  create mode 100644 gas/testsuite/gas/i386/apx-push2pop2-inval.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.l
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-egpr-inval.s
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-egpr.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-
> > bad.d
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted-
> > intel.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-evex-promoted.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-intel.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs-inval.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-jmpabs.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd-optimize.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-ndd.s
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-push2pop2-intel.d
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.l
> >  create mode 100644
> > gas/testsuite/gas/i386/x86-64-apx-push2pop2-inval.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-push2pop2.s
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.d
> >  create mode 100644 gas/testsuite/gas/i386/x86-64-apx-rex2.s
> >  create mode 100644 opcodes/i386-dis-evex-x86-64.h
> >
> > --
> > 2.25.1
  

Patch

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 398909a6a30..5b925505435 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -2317,8 +2317,10 @@  operand_size_match (const insn_template *t)
       unsigned int given = i.operands - j - 1;
 
       /* For FMA4 and XOP insns VEX.W controls just the first two
-	 register operands.  */
-      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP))
+	 register operands. And APX_F insns just swap the two source operands,
+	 with the 3rd one being the destination.  */
+      if (is_cpu (t, CpuFMA4) || is_cpu (t, CpuXOP)
+	  || is_cpu (t,CpuAPX_F))
 	given = j < 2 ? 1 - j : j;
 
       if (t->operand_types[j].bitfield.class == Reg
@@ -3959,6 +3961,7 @@  static INLINE bool
 is_any_apx_evex_encoding (void)
 {
   return i.rex2 || i.tm.opcode_space == SPACE_EVEXMAP4 
+    || i.rex2_encoding
     || (i.vex.register_specifier
 	&& i.vex.register_specifier->reg_flags & RegRex2);
 }
@@ -7481,26 +7484,33 @@  match_template (char mnem_suffix)
 	  overlap1 = operand_type_and (operand_types[0], operand_types[1]);
 	  if (t->opcode_modifier.d && i.reg_operands == i.operands
 	      && !operand_type_all_zero (&overlap1))
-	    switch (i.dir_encoding)
-	      {
-	      case dir_encoding_load:
-		if (operand_type_check (operand_types[i.operands - 1], anymem)
-		    || t->opcode_modifier.regmem)
-		  goto check_reverse;
-		break;
+	    {
 
-	      case dir_encoding_store:
-		if (!operand_type_check (operand_types[i.operands - 1], anymem)
-		    && !t->opcode_modifier.regmem)
-		  goto check_reverse;
-		break;
+	      int MemOperand = i.operands - 1 -
+		(t->opcode_space == SPACE_EVEXMAP4
+		 && t->opcode_modifier.vexvvvv);
+
+	      switch (i.dir_encoding)
+		{
+		case dir_encoding_load:
+		  if (operand_type_check (operand_types[MemOperand], anymem)
+		      || t->opcode_modifier.regmem)
+		    goto check_reverse;
+		  break;
 
-	      case dir_encoding_swap:
-		goto check_reverse;
+		case dir_encoding_store:
+		  if (!operand_type_check (operand_types[MemOperand], anymem)
+		      && !t->opcode_modifier.regmem)
+		    goto check_reverse;
+		  break;
 
-	      case dir_encoding_default:
-		break;
-	      }
+		case dir_encoding_swap:
+		  goto check_reverse;
+
+		case dir_encoding_default:
+		  break;
+		}
+	    }
 	  /* If we want store form, we skip the current load.  */
 	  if ((i.dir_encoding == dir_encoding_store
 	       || i.dir_encoding == dir_encoding_swap)
@@ -7530,11 +7540,13 @@  match_template (char mnem_suffix)
 		continue;
 	      /* Try reversing direction of operands.  */
 	      j = is_cpu (t, CpuFMA4)
-		  || is_cpu (t, CpuXOP) ? 1 : i.operands - 1;
+		  || is_cpu (t, CpuXOP)
+		  || is_cpu (t, CpuAPX_F) ? 1 : i.operands - 1;
 	      overlap0 = operand_type_and (i.types[0], operand_types[j]);
 	      overlap1 = operand_type_and (i.types[j], operand_types[0]);
 	      overlap2 = operand_type_and (i.types[1], operand_types[1]);
-	      gas_assert (t->operands != 3 || !check_register);
+	      gas_assert (t->operands != 3 || !check_register
+			  || is_cpu (t,CpuAPX_F));
 	      if (!operand_type_match (overlap0, i.types[0])
 		  || !operand_type_match (overlap1, i.types[j])
 		  || (t->operands == 3
@@ -7569,6 +7581,11 @@  match_template (char mnem_suffix)
 		  found_reverse_match = Opcode_VexW;
 		  goto check_operands_345;
 		}
+	      else if (is_cpu (t,CpuAPX_F) && i.operands == 3)
+		{
+		  found_reverse_match = Opcode_D;
+		  goto check_operands_345;
+		}
 	      else if (t->opcode_space != SPACE_BASE
 		       && (t->opcode_space != SPACE_0F
 			   /* MOV to/from CR/DR/TR, as an exception, follow
@@ -7742,6 +7759,9 @@  match_template (char mnem_suffix)
 
       i.tm.base_opcode ^= found_reverse_match;
 
+      if (i.tm.opcode_space == SPACE_EVEXMAP4)
+	goto swap_first_2;
+
       /* Certain SIMD insns have their load forms specified in the opcode
 	 table, and hence we need to _set_ RegMem instead of clearing it.
 	 We need to avoid setting the bit though on insns like KMOVW.  */
@@ -7761,6 +7781,7 @@  match_template (char mnem_suffix)
 	 flipping VEX.W.  */
       i.tm.opcode_modifier.vexw ^= VEXW0 ^ VEXW1;
 
+    swap_first_2:
       j = i.tm.operand_types[0].bitfield.imm8;
       i.tm.operand_types[j] = operand_types[j + 1];
       i.tm.operand_types[j + 1] = operand_types[j];
@@ -8588,11 +8609,10 @@  process_operands (void)
   const reg_entry *default_seg = NULL;
 
   /* We only need to check those implicit registers for instructions
-     with 3 operands or less.  */
-  if (i.operands <= 3)
-    for (unsigned int j = 0; j < i.operands; j++)
-      if (i.types[j].bitfield.instance != InstanceNone)
-	i.reg_operands--;
+     with 4 operands or less.  */
+  for (unsigned int j = 0; j < i.operands; j++)
+    if (i.types[j].bitfield.instance != InstanceNone)
+      i.reg_operands--;
 
   if (i.tm.opcode_modifier.sse2avx)
     {
@@ -8946,26 +8966,35 @@  build_modrm_byte (void)
 				     || i.vec_encoding == vex_encoding_evex));
     }
 
-  for (v = source + 1; v < dest; ++v)
-    if (v != reg_slot)
-      break;
-  if (v >= dest)
-    v = ~0;
-  if (i.tm.extension_opcode != None)
+  if (i.tm.opcode_modifier.vexvvvv == VexVVVV_DST)
     {
-      if (dest != source)
-	v = dest;
-      dest = ~0;
+      v = dest;
+      dest-- ;
     }
-  gas_assert (source < dest);
-  if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
-      && source != op)
+  else if (i.tm.opcode_modifier.vexvvvv == VexVVVV_SRC)
     {
-      unsigned int tmp = source;
+      v = source + 1;
+      for (v = source + 1; v < dest; ++v)
+	if (v != reg_slot)
+	  break;
+      if (i.tm.extension_opcode != None)
+	{
+	  if (dest != source)
+	    v = dest;
+	  dest = ~0;
+	}
+      gas_assert (source < dest);
+      if (i.tm.opcode_modifier.operandconstraint == SWAP_SOURCES
+	  && source != op)
+	{
+	  unsigned int tmp = source;
 
-      source = v;
-      v = tmp;
+	  source = v;
+	  v = tmp;
+	}
     }
+  else
+    v = ~0;
 
   if (v < MAX_OPERANDS)
     {
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
index ad5b2e3cb5c..9060b697c0d 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d
@@ -28,4 +28,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:[ 	]+62 4c 7f[ 	]+\(bad\)
 [ 	]*[a-f0-9]+:[ 	]+28 f8[ 	]+sub    %bh,%al
 [ 	]*[a-f0-9]+:[ 	]+bc 87 23 01 00[ 	]+mov    \$0x12387,%esp
+[ 	]*[a-f0-9]+:[ 	]+00 ff[ 	]+add    %bh,%bh
+[ 	]*[a-f0-9]+:[ 	]+62 f4 ec[ 	]+\(bad\)
+[ 	]*[a-f0-9]+:[ 	]+08 ff[ 	]+or     %bh,%bh
+[ 	]*[a-f0-9]+:[ 	]+c0[ 	]+.byte 0xc0
 #pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
index 9bb06d9f494..d4f4cb72e6e 100644
--- a/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
+++ b/gas/testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s
@@ -10,7 +10,7 @@  _start:
         .byte 0x62, 0xfc, 0x7f, 0x08, 0x60, 0xc2
         .byte 0xff, 0xff
         #VSIB vpgatherqq 0x7b(%rbp,%zmm17,8),%zmm16{%k1} set EVEX.P[10] == 0
-	#(illegal value).
+        #(illegal value).
         .byte 0x62, 0xe2, 0xf9, 0x41, 0x91, 0x84, 0xcd, 0x7b, 0x00, 0x00, 0x00
         .byte 0xff
         #EVEX_MAP4 movbe %r18w,%ax set EVEX.mm == b01 (illegal value).
@@ -27,3 +27,6 @@  _start:
         .byte 0xff
         #EVEX from VEX enqcmd 0x123(%r31,%rax,4),%r31 EVEX.P[23:22] == 1 (illegal value).
         .byte 0x62, 0x4c, 0x7f, 0x28, 0xf8, 0xbc, 0x87, 0x23, 0x01, 0x00, 0x00
+        .byte 0xff
+        #{evex} inc %rax EVEX.vvvv' > 0 (illegal value).
+        .byte 0x62, 0xf4, 0xec, 0x08, 0xff, 0xc0
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.d b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
new file mode 100644
index 00000000000..c2cf0825e11
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.d
@@ -0,0 +1,161 @@ 
+#as:
+#objdump: -dw
+#name: x86-64 APX NDD instructions with evex prefix encoding
+#source: x86-64-apx-ndd.s
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+\s*[a-f0-9]+:\s*62 f4 e4 18 ff c0\s+inc    %rax,%rbx
+\s*[a-f0-9]+:\s*62 dc bc 18 ff c7\s+inc    %r31,%r8
+\s*[a-f0-9]+:\s*62 dc fc 10 ff c7\s+inc    %r31,%r16
+\s*[a-f0-9]+:\s*62 44 7c 10 00 f8\s+add    %r31b,%r8b,%r16b
+\s*[a-f0-9]+:\s*62 44 7c 10 00 f8\s+add    %r31b,%r8b,%r16b
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8\s+add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8\s+add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 44 7c 10 01 f8\s+add    %r31d,%r8d,%r16d
+\s*[a-f0-9]+:\s*62 44 7c 10 01 f8\s+add    %r31d,%r8d,%r16d
+\s*[a-f0-9]+:\s*62 44 7d 10 01 f8\s+add    %r31w,%r8w,%r16w
+\s*[a-f0-9]+:\s*62 44 7d 10 01 f8\s+add    %r31w,%r8w,%r16w
+\s*[a-f0-9]+:\s*62 44 fc 10 01 f8\s+add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 5c fc 10 03 c7\s+add    %r31,%r8,%r16
+\s*[a-f0-9]+:\s*62 44 fc 10 01 38\s+add    %r31,\(%r8\),%r16
+\s*[a-f0-9]+:\s*62 5c fc 10 03 07\s+add    \(%r31\),%r8,%r16
+\s*[a-f0-9]+:\s*62 5c f8 10 03 84 07 90 90 00 00\s+add\s+0x9090\(%r31,%r16,1\),%r8,%r16
+\s*[a-f0-9]+:\s*62 44 f8 10 01 3c c0\s+add    %r31,\(%r8,%r16,8\),%r16
+\s*[a-f0-9]+:\s*62 d4 74 10 80 c5 34\s+add    \$0x34,%r13b,%r17b
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 04 83 11\s+addl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 c0 34 12\s+add    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 d4 fc 10 81 c7 33 44 34 12\s+add    \$0x12344433,%r15,%r16
+\s*[a-f0-9]+:\s*62 d4 fc 10 81 04 8f 33 44 34 12\s+addq   \$0x12344433,\(%r15,%rcx,4\),%r16
+\s*[a-f0-9]+:\s*62 f4 bc 18 81 c0 11 22 33 f4\s+add    \$0xfffffffff4332211,%rax,%r8
+\s*[a-f0-9]+:\s*62 f4 f4 10 ff c8    	dec    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 fe 0c 27 	decb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d0    	not    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 f6 14 27 	notb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 f4 f4 10 f7 d8    	neg    %rax,%r17
+\s*[a-f0-9]+:\s*62 9c 3c 18 f6 1c 27 	negb   \(%r31,%r12,1\),%r8b
+\s*[a-f0-9]+:\s*62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 2a 04 07 	sub    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 2b 04 07 	sub    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 2c 83 11 	subl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 e8 34 12 	sub    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 1a 04 07 	sbb    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 1b 04 07 	sbb    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 1c 83 11 	sbbl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 d8 34 12 	sbb    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 12 04 07 	adc    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 13 04 07 	adc    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 14 83 11 	adcl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 d0 34 12 	adc    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 0a 04 07 	or     \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 0b 04 07 	or     \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 0c 83 11 	orl    \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 c8 34 12 	or     \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 32 04 07 	xor    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 33 04 07 	xor    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 34 83 11 	xorl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 f0 34 12 	xor    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
+\s*[a-f0-9]+:\s*62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+\s*[a-f0-9]+:\s*62 c4 3c 18 22 04 07 	and    \(%r15,%rax,1\),%r16b,%r8b
+\s*[a-f0-9]+:\s*62 c4 3d 18 23 04 07 	and    \(%r15,%rax,1\),%r16w,%r8w
+\s*[a-f0-9]+:\s*62 fc 5c 10 83 24 83 11 	andl   \$0x11,\(%r19,%rax,4\),%r20d
+\s*[a-f0-9]+:\s*62 f4 0d 10 81 e0 34 12 	and    \$0x1234,%ax,%r30w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 08    	rorb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 cc 02 	ror    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 08 02 	rorl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 08    	rorw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c8    	ror    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 0c 83 	rorw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 00    	rolb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 c4 02 	rol    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 00 02 	roll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 00    	rolw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 c0    	rol    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 04 83 	rolw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 18    	rcrb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 dc 02 	rcr    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 18 02 	rcrl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 18    	rcrw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d8    	rcr    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 1c 83 	rcrw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 10    	rclb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 d4 02 	rcl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 10 02 	rcll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 10    	rclw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 d0    	rcl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 14 83 	rclw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 38    	sarb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 fc 02 	sar    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 38 02 	sarl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 38    	sarw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 f8    	sar    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 3c 83 	sarw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 20    	shlb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 e4 02 	shl    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 20 02 	shll   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 20    	shlw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e0    	shl    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 24 83 	shlw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 f4 04 10 d0 28    	shrb   \(%rax\),%r31b
+\s*[a-f0-9]+:\s*62 d4 04 10 c0 ec 02 	shr    \$0x2,%r12b,%r31b
+\s*[a-f0-9]+:\s*62 f4 04 10 c1 28 02 	shrl   \$0x2,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 f4 05 10 d1 28    	shrw   \(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 fc 3c 18 d2 e8    	shr    %cl,%r16b,%r8b
+\s*[a-f0-9]+:\s*62 fc 05 10 d3 2c 83 	shrw   %cl,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 74 84 10 24 20 01 	shld   \$0x1,%r12,\(%rax\),%r31
+\s*[a-f0-9]+:\s*62 54 05 10 24 c4 02 	shld   \$0x2,%r8w,%r12w,%r31w
+\s*[a-f0-9]+:\s*62 74 04 10 24 38 02 	shld   \$0x2,%r15d,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 74 05 10 a5 08    	shld   %cl,%r9w,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 7c bc 18 a5 e0\s+shld   %cl,%r12,%r16,%r8
+\s*[a-f0-9]+:\s*62 7c 05 10 a5 2c 83\s+shld   %cl,%r13w,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 74 84 10 2c 20 01 	shrd   \$0x1,%r12,\(%rax\),%r31
+\s*[a-f0-9]+:\s*62 54 05 10 2c c4 02 	shrd   \$0x2,%r8w,%r12w,%r31w
+\s*[a-f0-9]+:\s*62 74 04 10 2c 38 02 	shrd   \$0x2,%r15d,\(%rax\),%r31d
+\s*[a-f0-9]+:\s*62 74 05 10 ad 08\s+shrd   %cl,%r9w,\(%rax\),%r31w
+\s*[a-f0-9]+:\s*62 7c bc 18 ad e0\s+shrd   %cl,%r12,%r16,%r8
+\s*[a-f0-9]+:\s*62 7c 05 10 ad 2c 83\s+shrd   %cl,%r13w,\(%r19,%rax,4\),%r31w
+\s*[a-f0-9]+:\s*62 54 6d 10 66 c7    	adcx   %r15d,%r8d,%r18d
+\s*[a-f0-9]+:\s*62 14 69 10 66 04 3f 	adcx   \(%r15,%r31,1\),%r8d,%r18d
+\s*[a-f0-9]+:\s*62 14 f9 08 66 04 3f 	adcx   \(%r15,%r31,1\),%r8
+\s*[a-f0-9]+:\s*62 54 6e 10 66 c7    	adox   %r15d,%r8d,%r18d
+\s*[a-f0-9]+:\s*62 14 6a 10 66 04 3f 	adox   \(%r15,%r31,1\),%r8d,%r18d
+\s*[a-f0-9]+:\s*62 14 fa 08 66 04 3f 	adox   \(%r15,%r31,1\),%r8
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 40 90 90 90 90 90 	cmovo  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 41 90 90 90 90 90 	cmovno -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 42 90 90 90 90 90 	cmovb  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 43 90 90 90 90 90 	cmovae -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 44 90 90 90 90 90 	cmove  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 45 90 90 90 90 90 	cmovne -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 46 90 90 90 90 90 	cmovbe -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 47 90 90 90 90 90 	cmova  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 48 90 90 90 90 90 	cmovs  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 49 90 90 90 90 90 	cmovns -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4a 90 90 90 90 90 	cmovp  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4b 90 90 90 90 90 	cmovnp -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4c 90 90 90 90 90 	cmovl  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4d 90 90 90 90 90 	cmovge -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4e 90 90 90 90 90 	cmovle -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 4f 90 90 90 90 90 	cmovg  -0x6f6f6f70\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*67 62 f4 3c 18 af 90 09 09 09 00 	imul   0x90909\(%eax\),%edx,%r8d
+\s*[a-f0-9]+:\s*62 b4 b0 10 af 94 f8 09 09 00 00 	imul   0x909\(%rax,%r31,8\),%rdx,%r25
+#pass
diff --git a/gas/testsuite/gas/i386/x86-64-apx-ndd.s b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
new file mode 100644
index 00000000000..ee04d2f140a
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-apx-ndd.s
@@ -0,0 +1,154 @@ 
+# Check 64bit APX NDD instructions with evex prefix encoding
+
+	.allow_index_reg
+	.text
+_start:
+inc    %rax,%rbx
+inc    %r31,%r8
+inc    %r31,%r16
+add    %r31b,%r8b,%r16b
+addb    %r31b,%r8b,%r16b
+add    %r31,%r8,%r16
+addq    %r31,%r8,%r16
+add    %r31d,%r8d,%r16d
+addl    %r31d,%r8d,%r16d
+add    %r31w,%r8w,%r16w
+addw    %r31w,%r8w,%r16w
+{store} add    %r31,%r8,%r16
+{load}  add    %r31,%r8,%r16
+add    %r31,(%r8),%r16
+add    (%r31),%r8,%r16
+add    0x9090(%r31,%r16,1),%r8,%r16
+add    %r31,(%r8,%r16,8),%r16
+add    $0x34,%r13b,%r17b
+addl   $0x11,(%r19,%rax,4),%r20d
+add    $0x1234,%ax,%r30w
+add    $0x12344433,%r15,%r16
+addq   $0x12344433,(%r15,%rcx,4),%r16
+add    $0xfffffffff4332211,%rax,%r8
+dec    %rax,%r17
+decb   (%r31,%r12,1),%r8b
+not    %rax,%r17
+notb   (%r31,%r12,1),%r8b
+neg    %rax,%r17
+negb   (%r31,%r12,1),%r8b
+sub    %r15b,%r17b,%r18b
+sub    %r15d,(%r8),%r18d
+sub    (%r15,%rax,1),%r16b,%r8b
+sub    (%r15,%rax,1),%r16w,%r8w
+subl   $0x11,(%r19,%rax,4),%r20d
+sub    $0x1234,%ax,%r30w
+sbb    %r15b,%r17b,%r18b
+sbb    %r15d,(%r8),%r18d
+sbb    (%r15,%rax,1),%r16b,%r8b
+sbb    (%r15,%rax,1),%r16w,%r8w
+sbbl   $0x11,(%r19,%rax,4),%r20d
+sbb    $0x1234,%ax,%r30w
+adc    %r15b,%r17b,%r18b
+adc    %r15d,(%r8),%r18d
+adc    (%r15,%rax,1),%r16b,%r8b
+adc    (%r15,%rax,1),%r16w,%r8w
+adcl   $0x11,(%r19,%rax,4),%r20d
+adc    $0x1234,%ax,%r30w
+or     %r15b,%r17b,%r18b
+or     %r15d,(%r8),%r18d
+or     (%r15,%rax,1),%r16b,%r8b
+or     (%r15,%rax,1),%r16w,%r8w
+orl    $0x11,(%r19,%rax,4),%r20d
+or     $0x1234,%ax,%r30w
+xor    %r15b,%r17b,%r18b
+xor    %r15d,(%r8),%r18d
+xor    (%r15,%rax,1),%r16b,%r8b
+xor    (%r15,%rax,1),%r16w,%r8w
+xorl   $0x11,(%r19,%rax,4),%r20d
+xor    $0x1234,%ax,%r30w
+and    %r15b,%r17b,%r18b
+and    %r15d,(%r8),%r18d
+and    (%r15,%rax,1),%r16b,%r8b
+and    (%r15,%rax,1),%r16w,%r8w
+andl   $0x11,(%r19,%rax,4),%r20d
+and    $0x1234,%ax,%r30w
+rorb   (%rax),%r31b
+ror    $0x2,%r12b,%r31b
+rorl   $0x2,(%rax),%r31d
+rorw   (%rax),%r31w
+ror    %cl,%r16b,%r8b
+rorw   %cl,(%r19,%rax,4),%r31w
+rolb   (%rax),%r31b
+rol    $0x2,%r12b,%r31b
+roll   $0x2,(%rax),%r31d
+rolw   (%rax),%r31w
+rol    %cl,%r16b,%r8b
+rolw   %cl,(%r19,%rax,4),%r31w
+rcrb   (%rax),%r31b
+rcr    $0x2,%r12b,%r31b
+rcrl   $0x2,(%rax),%r31d
+rcrw   (%rax),%r31w
+rcr    %cl,%r16b,%r8b
+rcrw   %cl,(%r19,%rax,4),%r31w
+rclb   (%rax),%r31b
+rcl    $0x2,%r12b,%r31b
+rcll   $0x2,(%rax),%r31d
+rclw   (%rax),%r31w
+rcl    %cl,%r16b,%r8b
+rclw   %cl,(%r19,%rax,4),%r31w
+shlb   (%rax),%r31b
+shl    $0x2,%r12b,%r31b
+shll   $0x2,(%rax),%r31d
+shlw   (%rax),%r31w
+shl    %cl,%r16b,%r8b
+shlw   %cl,(%r19,%rax,4),%r31w
+sarb   (%rax),%r31b
+sar    $0x2,%r12b,%r31b
+sarl   $0x2,(%rax),%r31d
+sarw   (%rax),%r31w
+sar    %cl,%r16b,%r8b
+sarw   %cl,(%r19,%rax,4),%r31w
+shlb   (%rax),%r31b
+shl    $0x2,%r12b,%r31b
+shll   $0x2,(%rax),%r31d
+shlw   (%rax),%r31w
+shl    %cl,%r16b,%r8b
+shlw   %cl,(%r19,%rax,4),%r31w
+shrb   (%rax),%r31b
+shr    $0x2,%r12b,%r31b
+shrl   $0x2,(%rax),%r31d
+shrw   (%rax),%r31w
+shr    %cl,%r16b,%r8b
+shrw   %cl,(%r19,%rax,4),%r31w
+shld   $0x1,%r12,(%rax),%r31
+shld   $0x2,%r8w,%r12w,%r31w
+shld   $0x2,%r15d,(%rax),%r31d
+shld   %cl,%r9w,(%rax),%r31w
+shld   %cl,%r12,%r16,%r8
+shld   %cl,%r13w,(%r19,%rax,4),%r31w
+shrd   $0x1,%r12,(%rax),%r31
+shrd   $0x2,%r8w,%r12w,%r31w
+shrd   $0x2,%r15d,(%rax),%r31d
+shrd   %cl,%r9w,(%rax),%r31w
+shrd   %cl,%r12,%r16,%r8
+shrd   %cl,%r13w,(%r19,%rax,4),%r31w
+adcx   %r15d,%r8d,%r18d
+adcx   (%r15,%r31,1),%r8d,%r18d
+adcx   (%r15,%r31,1),%r8
+adox   %r15d,%r8d,%r18d
+adox   (%r15,%r31,1),%r8d,%r18d
+adox   (%r15,%r31,1),%r8
+cmovo  0x90909090(%eax),%edx,%r8d
+cmovno 0x90909090(%eax),%edx,%r8d
+cmovb  0x90909090(%eax),%edx,%r8d
+cmovae 0x90909090(%eax),%edx,%r8d
+cmove  0x90909090(%eax),%edx,%r8d
+cmovne 0x90909090(%eax),%edx,%r8d
+cmovbe 0x90909090(%eax),%edx,%r8d
+cmova  0x90909090(%eax),%edx,%r8d
+cmovs  0x90909090(%eax),%edx,%r8d
+cmovns 0x90909090(%eax),%edx,%r8d
+cmovp  0x90909090(%eax),%edx,%r8d
+cmovnp 0x90909090(%eax),%edx,%r8d
+cmovl  0x90909090(%eax),%edx,%r8d
+cmovge 0x90909090(%eax),%edx,%r8d
+cmovle 0x90909090(%eax),%edx,%r8d
+cmovg  0x90909090(%eax),%edx,%r8d
+imul   0x90909(%eax),%edx,%r8d
+imul   0x909(%rax,%r31,8),%rdx,%r25
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.d b/gas/testsuite/gas/i386/x86-64-pseudos.d
index 708c22b5899..1d399ffa949 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.d
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.d
@@ -137,6 +137,48 @@  Disassembly of section .text:
  +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
  +[a-f0-9]+:	31 07                	xor    %eax,\(%rdi\)
  +[a-f0-9]+:	33 07                	xor    \(%rdi\),%eax
+ +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
+ +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
+ +[a-f0-9]+:	62 44 fc 10 01 38    	add    %r31,\(%r8\),%r16
+ +[a-f0-9]+:	62 44 fc 10 03 38    	add    \(%r8\),%r31,%r16
+ +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 29 38    	sub    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 2b 38    	sub    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 19 38    	sbb    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 1b 38    	sbb    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 21 38    	and    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 23 38    	and    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 09 38    	or     %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 0b 38    	or     \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 31 38    	xor    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 33 38    	xor    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 54 6c 10 11 38    	adc    %r15d,\(%r8\),%r18d
+ +[a-f0-9]+:	62 54 6c 10 13 38    	adc    \(%r8\),%r15d,%r18d
+ +[a-f0-9]+:	62 44 fc 10 01 f8    	add    %r31,%r8,%r16
+ +[a-f0-9]+:	62 5c fc 10 03 c7    	add    %r31,%r8,%r16
+ +[a-f0-9]+:	62 7c 6c 10 28 f9    	sub    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 2a cf    	sub    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 18 f9    	sbb    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 1a cf    	sbb    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 20 f9    	and    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 22 cf    	and    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 08 f9    	or     %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 0a cf    	or     %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 30 f9    	xor    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 32 cf    	xor    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 7c 6c 10 10 f9    	adc    %r15b,%r17b,%r18b
+ +[a-f0-9]+:	62 c4 6c 10 12 cf    	adc    %r15b,%r17b,%r18b
  +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
  +[a-f0-9]+:	b8 45 03 00 00       	mov    \$0x345,%eax
  +[a-f0-9]+:	b0 12                	mov    \$0x12,%al
diff --git a/gas/testsuite/gas/i386/x86-64-pseudos.s b/gas/testsuite/gas/i386/x86-64-pseudos.s
index 29a0c3368fc..e5b3a0d625d 100644
--- a/gas/testsuite/gas/i386/x86-64-pseudos.s
+++ b/gas/testsuite/gas/i386/x86-64-pseudos.s
@@ -134,6 +134,49 @@  _start:
 	{load} xor (%rdi), %eax
 	{store} xor %eax, (%rdi)
 	{store} xor (%rdi), %eax
+	{load}  add    %r31,(%r8),%r16
+	{load}	add    (%r8),%r31,%r16
+	{store} add    %r31,(%r8),%r16
+	{store}	add    (%r8),%r31,%r16
+	{load} 	sub    %r15d,(%r8),%r18d
+	{load}	sub    (%r8),%r15d,%r18d
+	{store} sub    %r15d,(%r8),%r18d
+	{store} sub    (%r8),%r15d,%r18d
+	{load} 	sbb    %r15d,(%r8),%r18d
+	{load}	sbb    (%r8),%r15d,%r18d
+	{store} sbb    %r15d,(%r8),%r18d
+	{store} sbb    (%r8),%r15d,%r18d
+	{load} 	and    %r15d,(%r8),%r18d
+	{load}	and    (%r8),%r15d,%r18d
+	{store} and    %r15d,(%r8),%r18d
+	{store} and    (%r8),%r15d,%r18d
+	{load} 	or     %r15d,(%r8),%r18d
+	{load}	or     (%r8),%r15d,%r18d
+	{store} or     %r15d,(%r8),%r18d
+	{store} or     (%r8),%r15d,%r18d
+	{load} 	xor    %r15d,(%r8),%r18d
+	{load}	xor    (%r8),%r15d,%r18d
+	{store} xor    %r15d,(%r8),%r18d
+	{store} xor    (%r8),%r15d,%r18d
+	{load} 	adc    %r15d,(%r8),%r18d
+	{load}	adc    (%r8),%r15d,%r18d
+	{store} adc    %r15d,(%r8),%r18d
+	{store} adc    (%r8),%r15d,%r18d
+
+	{store} add    %r31,%r8,%r16
+	{load}  add    %r31,%r8,%r16
+	{store} sub    %r15b,%r17b,%r18b
+	{load}	sub    %r15b,%r17b,%r18b
+	{store}	sbb    %r15b,%r17b,%r18b
+	{load}	sbb    %r15b,%r17b,%r18b
+	{store}	and    %r15b,%r17b,%r18b
+	{load}	and    %r15b,%r17b,%r18b
+	{store}	or     %r15b,%r17b,%r18b
+	{load}	or     %r15b,%r17b,%r18b
+	{store}	xor    %r15b,%r17b,%r18b
+	{load}	xor    %r15b,%r17b,%r18b
+	{store}	adc    %r15b,%r17b,%r18b
+	{load}	adc    %r15b,%r17b,%r18b
 
 	.irp m, mov, adc, add, and, cmp, or, sbb, sub, test, xor
 	\m	$0x12, %al
diff --git a/gas/testsuite/gas/i386/x86-64.exp b/gas/testsuite/gas/i386/x86-64.exp
index dc1fa8dddb9..07cb716d2a5 100644
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -367,6 +367,7 @@  run_dump_test "x86-64-apx-rex2"
 run_dump_test "x86-64-apx-evex-promoted"
 run_dump_test "x86-64-apx-evex-promoted-intel"
 run_dump_test "x86-64-apx-evex-egpr"
+run_dump_test "x86-64-apx-ndd"
 run_dump_test "x86-64-avx512f-rcigrz-intel"
 run_dump_test "x86-64-avx512f-rcigrz"
 run_dump_test "x86-64-clwb"
diff --git a/opcodes/i386-dis-evex-prefix.h b/opcodes/i386-dis-evex-prefix.h
index e8f32324ade..09d8c10bdfd 100644
--- a/opcodes/i386-dis-evex-prefix.h
+++ b/opcodes/i386-dis-evex-prefix.h
@@ -338,10 +338,6 @@ 
     { "vcmpp%XH", { MaskG, Vex, EXxh, EXxEVexS, CMP }, 0 },
     { "vcmps%XH", { MaskG, VexScalar, EXw, EXxEVexS, CMP }, 0 },
   },
-  /* PREFIX_EVEX_MAP4_66 */
-  {
-    { "wrssK",	{ M, Gdq }, 0 },
-  },
   /* PREFIX_EVEX_MAP4_D8 */
   {
     { "sha1nexte", { XM, EXxmm }, 0 },
diff --git a/opcodes/i386-dis-evex-reg.h b/opcodes/i386-dis-evex-reg.h
index c3b4f083346..b75558c40ca 100644
--- a/opcodes/i386-dis-evex-reg.h
+++ b/opcodes/i386-dis-evex-reg.h
@@ -56,6 +56,36 @@ 
     { "blsmskS",	{ VexGdq, Edq }, 0 },
     { "blsiS",		{ VexGdq, Edq }, 0 },
   },
+  /* REG_EVEX_MAP4_80 */
+  {
+    { "addA",	{ VexGb, Eb, Ib }, 0 },
+    { "orA",	{ VexGb, Eb, Ib }, 0 },
+    { "adcA",	{ VexGb, Eb, Ib }, 0 },
+    { "sbbA",	{ VexGb, Eb, Ib }, 0 },
+    { "andA",	{ VexGb, Eb, Ib }, 0 },
+    { "subA",	{ VexGb, Eb, Ib }, 0 },
+    { "xorA",	{ VexGb, Eb, Ib }, 0 },
+  },
+  /* REG_EVEX_MAP4_81 */
+  {
+    { "addQ",	{ VexGv, Ev, Iv }, 0 },
+    { "orQ",	{ VexGv, Ev, Iv }, 0 },
+    { "adcQ",	{ VexGv, Ev, Iv }, 0 },
+    { "sbbQ",	{ VexGv, Ev, Iv }, 0 },
+    { "andQ",	{ VexGv, Ev, Iv }, 0 },
+    { "subQ",	{ VexGv, Ev, Iv }, 0 },
+    { "xorQ",	{ VexGv, Ev, Iv }, 0 },
+  },
+  /* REG_EVEX_MAP4_83 */
+  {
+    { "addQ",	{ VexGv, Ev, sIb }, 0 },
+    { "orQ",	{ VexGv, Ev, sIb }, 0 },
+    { "adcQ",	{ VexGv, Ev, sIb }, 0 },
+    { "sbbQ",	{ VexGv, Ev, sIb }, 0 },
+    { "andQ",	{ VexGv, Ev, sIb }, 0 },
+    { "subQ",	{ VexGv, Ev, sIb }, 0 },
+    { "xorQ",	{ VexGv, Ev, sIb }, 0 },
+  },
   /* REG_EVEX_MAP4_D8_PREFIX_1 */
   {
     { "aesencwide128kl",	{ M }, 0 },
@@ -63,3 +93,27 @@ 
     { "aesencwide256kl",	{ M }, 0 },
     { "aesdecwide256kl",	{ M }, 0 },
   },
+  /* REG_EVEX_MAP4_F6 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "notA",	{ VexGb, Eb }, 0 },
+    { "negA",	{ VexGb, Eb }, 0 },
+  },
+  /* REG_EVEX_MAP4_F7 */
+  {
+    { Bad_Opcode },
+    { Bad_Opcode },
+    { "notQ",	{ VexGv, Ev }, 0 },
+    { "negQ",	{ VexGv, Ev }, 0 },
+  },
+  /* REG_EVEX_MAP4_FE */
+  {
+    { "incA",   { VexGb ,Eb }, 0 },
+    { "decA",   { VexGb ,Eb }, 0 },
+  },
+  /* REG_EVEX_MAP4_FF */
+  {
+    { "incQ",   { VexGv ,Ev }, 0 },
+    { "decQ",   { VexGv ,Ev }, 0 },
+  },
diff --git a/opcodes/i386-dis-evex.h b/opcodes/i386-dis-evex.h
index 65a2cbeaeb2..ef752f417c5 100644
--- a/opcodes/i386-dis-evex.h
+++ b/opcodes/i386-dis-evex.h
@@ -875,64 +875,64 @@  static const struct dis386 evex_table[][256] = {
   /* EVEX_MAP4_ */
   {
     /* 00 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "addB",             { VexGb, Eb, Gb }, 0  },
+    { "addS",             { VexGv, Ev, Gv }, 0 },
+    { "addB",             { VexGb, Gb, EbS }, 0 },
+    { "addS",             { VexGv, Gv, EvS }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 08 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "orB",		{ VexGb, Eb, Gb }, 0 },
+    { "orS",		{ VexGv, Ev, Gv }, 0 },
+    { "orB",		{ VexGb, Gb, EbS }, 0 },
+    { "orS",		{ VexGv, Gv, EvS }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 10 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "adcB",		{ VexGb, Eb, Gb }, 0 },
+    { "adcS",		{ VexGv, Ev, Gv }, 0 },
+    { "adcB",		{ VexGb, Gb, EbS }, 0 },
+    { "adcS",		{ VexGv, Gv, EvS }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 18 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "sbbB",		{ VexGb, Eb, Gb }, 0 },
+    { "sbbS",		{ VexGv, Ev, Gv }, 0 },
+    { "sbbB",		{ VexGb, Gb, EbS }, 0 },
+    { "sbbS",		{ VexGv, Gv, EvS }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 20 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "andB",		{ VexGb, Eb, Gb }, 0 },
+    { "andS",		{ VexGv, Ev, Gv }, 0 },
+    { "andB",		{ VexGb, Gb, EbS }, 0 },
+    { "andS",		{ VexGv, Gv, EvS }, 0 },
+    { "shldS",		{ VexGv, Ev, Gv, Ib }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 28 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "subB",		{ VexGb, Eb, Gb }, 0 },
+    { "subS",		{ VexGv, Ev, Gv }, 0 },
+    { "subB",		{ VexGb, Gb, EbS }, 0 },
+    { "subS",		{ VexGv, Gv, EvS }, 0 },
+    { "shrdS",		{ VexGv, Ev, Gv, Ib }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
     /* 30 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "xorB",		{ VexGb, Eb, Gb }, 0 },
+    { "xorS",		{ VexGv, Ev, Gv }, 0 },
+    { "xorB",		{ VexGb, Gb, EbS }, 0 },
+    { "xorS",		{ VexGv, Gv, EvS }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -947,23 +947,23 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 40 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "cmovoS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovnoS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovbS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovaeS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmoveS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovneS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovbeS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovaS",		{ VexGv, Gv, Ev }, 0 },
     /* 48 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "cmovsS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovnsS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovpS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovnpS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovlS",		{ VexGv, Gv, Ev }, 0 },
+    { "cmovgeS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovleS",	{ VexGv, Gv, Ev }, 0 },
+    { "cmovgS",		{ VexGv, Gv, Ev }, 0 },
     /* 50 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -989,7 +989,7 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { "wrussK",	{ M, Gdq }, PREFIX_DATA },
-    { PREFIX_TABLE (PREFIX_EVEX_MAP4_66) },
+    { PREFIX_TABLE (PREFIX_0F38F6) },
     { Bad_Opcode },
     /* 68 */
     { Bad_Opcode },
@@ -1019,10 +1019,10 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* 80 */
+    { REG_TABLE (REG_EVEX_MAP4_80) },
+    { REG_TABLE (REG_EVEX_MAP4_81) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_83) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1060,7 +1060,7 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
+    { "shldS",		{ VexGv, Ev, Gv, CL }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
     /* A8 */
@@ -1069,9 +1069,9 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
+    { "shrdS",		{ VexGv, Ev, Gv, CL }, 0 },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { "imulS",		{ VexGv, Gv, Ev }, 0 },
     /* B0 */
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1091,8 +1091,8 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* C0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_C0) },
+    { REG_TABLE (REG_C1) },
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1109,10 +1109,10 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     /* D0 */
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_D0) },
+    { REG_TABLE (REG_D1) },
+    { REG_TABLE (REG_D2) },
+    { REG_TABLE (REG_D3) },
     { "sha1rnds4", { XM, EXxmm, Ib }, 0 },
     { Bad_Opcode },
     { Bad_Opcode },
@@ -1151,8 +1151,8 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { Bad_Opcode },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_F6) },
+    { REG_TABLE (REG_EVEX_MAP4_F7) },
     /* F8 */
     { PREFIX_TABLE (PREFIX_EVEX_MAP4_F8) },
     { MOD_TABLE (MOD_EVEX_MAP4_F9) },
@@ -1160,8 +1160,8 @@  static const struct dis386 evex_table[][256] = {
     { Bad_Opcode },
     { PREFIX_TABLE (PREFIX_EVEX_MAP4_FC) },
     { Bad_Opcode },
-    { Bad_Opcode },
-    { Bad_Opcode },
+    { REG_TABLE (REG_EVEX_MAP4_FE) },
+    { REG_TABLE (REG_EVEX_MAP4_FF) },
   },
   /* EVEX_MAP5_ */
   {
diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c
index ef431087ba5..0de3959cf80 100644
--- a/opcodes/i386-dis.c
+++ b/opcodes/i386-dis.c
@@ -573,6 +573,8 @@  fetch_error (const instr_info *ins)
 #define VexGatherD { OP_VEX, vex_vsib_d_w_dq_mode }
 #define VexGatherQ { OP_VEX, vex_vsib_q_w_dq_mode }
 #define VexGdq { OP_VEX, dq_mode }
+#define VexGb { OP_VEX, b_mode }
+#define VexGv { OP_VEX, v_mode }
 #define VexTmm { OP_VEX, tmm_mode }
 #define XMVexI4 { OP_REG_VexI4, x_mode }
 #define XMVexScalarI4 { OP_REG_VexI4, scalar_mode }
@@ -885,7 +887,14 @@  enum
   REG_EVEX_0F38C6_L_2,
   REG_EVEX_0F38C7_L_2,
   REG_EVEX_0F38F3_L_0,
-  REG_EVEX_MAP4_D8_PREFIX_1
+  REG_EVEX_MAP4_80,
+  REG_EVEX_MAP4_81,
+  REG_EVEX_MAP4_83,
+  REG_EVEX_MAP4_D8_PREFIX_1,
+  REG_EVEX_MAP4_F6,
+  REG_EVEX_MAP4_F7,
+  REG_EVEX_MAP4_FE,
+  REG_EVEX_MAP4_FF,
 };
 
 enum
@@ -1171,7 +1180,6 @@  enum
   PREFIX_EVEX_0F3A67,
   PREFIX_EVEX_0F3AC2,
 
-  PREFIX_EVEX_MAP4_66,
   PREFIX_EVEX_MAP4_D8,
   PREFIX_EVEX_MAP4_DA,
   PREFIX_EVEX_MAP4_DB,
@@ -2616,25 +2624,25 @@  static const struct dis386 reg_table[][8] = {
   },
   /* REG_C0 */
   {
-    { "rolA",	{ Eb, Ib }, 0 },
-    { "rorA",	{ Eb, Ib }, 0 },
-    { "rclA",	{ Eb, Ib }, 0 },
-    { "rcrA",	{ Eb, Ib }, 0 },
-    { "shlA",	{ Eb, Ib }, 0 },
-    { "shrA",	{ Eb, Ib }, 0 },
-    { "shlA",	{ Eb, Ib }, 0 },
-    { "sarA",	{ Eb, Ib }, 0 },
+    { "rolA",	{ VexGb, Eb, Ib }, 0 },
+    { "rorA",	{ VexGb, Eb, Ib }, 0 },
+    { "rclA",	{ VexGb, Eb, Ib }, 0 },
+    { "rcrA",	{ VexGb, Eb, Ib }, 0 },
+    { "shlA",	{ VexGb, Eb, Ib }, 0 },
+    { "shrA",	{ VexGb, Eb, Ib }, 0 },
+    { "shlA",	{ VexGb, Eb, Ib }, 0 },
+    { "sarA",	{ VexGb, Eb, Ib }, 0 },
   },
   /* REG_C1 */
   {
-    { "rolQ",	{ Ev, Ib }, 0 },
-    { "rorQ",	{ Ev, Ib }, 0 },
-    { "rclQ",	{ Ev, Ib }, 0 },
-    { "rcrQ",	{ Ev, Ib }, 0 },
-    { "shlQ",	{ Ev, Ib }, 0 },
-    { "shrQ",	{ Ev, Ib }, 0 },
-    { "shlQ",	{ Ev, Ib }, 0 },
-    { "sarQ",	{ Ev, Ib }, 0 },
+    { "rolQ",	{ VexGv, Ev, Ib }, 0 },
+    { "rorQ",	{ VexGv, Ev, Ib }, 0 },
+    { "rclQ",	{ VexGv, Ev, Ib }, 0 },
+    { "rcrQ",	{ VexGv, Ev, Ib }, 0 },
+    { "shlQ",	{ VexGv, Ev, Ib }, 0 },
+    { "shrQ",	{ VexGv, Ev, Ib }, 0 },
+    { "shlQ",	{ VexGv, Ev, Ib }, 0 },
+    { "sarQ",	{ VexGv, Ev, Ib }, 0 },
   },
   /* REG_C6 */
   {
@@ -2660,47 +2668,47 @@  static const struct dis386 reg_table[][8] = {
   },
   /* REG_D0 */
   {
-    { "rolA",	{ Eb, I1 }, 0 },
-    { "rorA",	{ Eb, I1 }, 0 },
-    { "rclA",	{ Eb, I1 }, 0 },
-    { "rcrA",	{ Eb, I1 }, 0 },
-    { "shlA",	{ Eb, I1 }, 0 },
-    { "shrA",	{ Eb, I1 }, 0 },
-    { "shlA",	{ Eb, I1 }, 0 },
-    { "sarA",	{ Eb, I1 }, 0 },
+    { "rolA",	{ VexGb, Eb, I1 }, 0 },
+    { "rorA",	{ VexGb, Eb, I1 }, 0 },
+    { "rclA",	{ VexGb, Eb, I1 }, 0 },
+    { "rcrA",	{ VexGb, Eb, I1 }, 0 },
+    { "shlA",	{ VexGb, Eb, I1 }, 0 },
+    { "shrA",	{ VexGb, Eb, I1 }, 0 },
+    { "shlA",	{ VexGb, Eb, I1 }, 0 },
+    { "sarA",	{ VexGb, Eb, I1 }, 0 },
   },
   /* REG_D1 */
   {
-    { "rolQ",	{ Ev, I1 }, 0 },
-    { "rorQ",	{ Ev, I1 }, 0 },
-    { "rclQ",	{ Ev, I1 }, 0 },
-    { "rcrQ",	{ Ev, I1 }, 0 },
-    { "shlQ",	{ Ev, I1 }, 0 },
-    { "shrQ",	{ Ev, I1 }, 0 },
-    { "shlQ",	{ Ev, I1 }, 0 },
-    { "sarQ",	{ Ev, I1 }, 0 },
+    { "rolQ",	{ VexGv, Ev, I1 }, 0 },
+    { "rorQ",	{ VexGv, Ev, I1 }, 0 },
+    { "rclQ",	{ VexGv, Ev, I1 }, 0 },
+    { "rcrQ",	{ VexGv, Ev, I1 }, 0 },
+    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
+    { "shrQ",	{ VexGv, Ev, I1 }, 0 },
+    { "shlQ",	{ VexGv, Ev, I1 }, 0 },
+    { "sarQ",	{ VexGv, Ev, I1 }, 0 },
   },
   /* REG_D2 */
   {
-    { "rolA",	{ Eb, CL }, 0 },
-    { "rorA",	{ Eb, CL }, 0 },
-    { "rclA",	{ Eb, CL }, 0 },
-    { "rcrA",	{ Eb, CL }, 0 },
-    { "shlA",	{ Eb, CL }, 0 },
-    { "shrA",	{ Eb, CL }, 0 },
-    { "shlA",	{ Eb, CL }, 0 },
-    { "sarA",	{ Eb, CL }, 0 },
+    { "rolA",	{ VexGb, Eb, CL }, 0 },
+    { "rorA",	{ VexGb, Eb, CL }, 0 },
+    { "rclA",	{ VexGb, Eb, CL }, 0 },
+    { "rcrA",	{ VexGb, Eb, CL }, 0 },
+    { "shlA",	{ VexGb, Eb, CL }, 0 },
+    { "shrA",	{ VexGb, Eb, CL }, 0 },
+    { "shlA",	{ VexGb, Eb, CL }, 0 },
+    { "sarA",	{ VexGb, Eb, CL }, 0 },
   },
   /* REG_D3 */
   {
-    { "rolQ",	{ Ev, CL }, 0 },
-    { "rorQ",	{ Ev, CL }, 0 },
-    { "rclQ",	{ Ev, CL }, 0 },
-    { "rcrQ",	{ Ev, CL }, 0 },
-    { "shlQ",	{ Ev, CL }, 0 },
-    { "shrQ",	{ Ev, CL }, 0 },
-    { "shlQ",	{ Ev, CL }, 0 },
-    { "sarQ",	{ Ev, CL }, 0 },
+    { "rolQ",	{ VexGv, Ev, CL }, 0 },
+    { "rorQ",	{ VexGv, Ev, CL }, 0 },
+    { "rclQ",	{ VexGv, Ev, CL }, 0 },
+    { "rcrQ",	{ VexGv, Ev, CL }, 0 },
+    { "shlQ",	{ VexGv, Ev, CL }, 0 },
+    { "shrQ",	{ VexGv, Ev, CL }, 0 },
+    { "shlQ",	{ VexGv, Ev, CL }, 0 },
+    { "sarQ",	{ VexGv, Ev, CL }, 0 },
   },
   /* REG_F6 */
   {
@@ -3646,8 +3654,8 @@  static const struct dis386 prefix_table[][4] = {
   /* PREFIX_0F38F6 */
   {
     { "wrssK",	{ M, Gdq }, 0 },
-    { "adoxS",	{ Gdq, Edq}, 0 },
-    { "adcxS",	{ Gdq, Edq}, 0 },
+    { "adoxS",	{ VexGdq, Gdq, Edq}, 0 },
+    { "adcxS",	{ VexGdq, Gdq, Edq}, 0 },
     { Bad_Opcode },
   },
 
@@ -9061,6 +9069,15 @@  get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	  ins->rex &= ~REX_B;
 	  ins->rex2 &= ~REX_R;
 	}
+      if (ins->evex_type == evex_from_legacy)
+	{
+	  /* EVEX from legacy instructions, when the EVEX.ND bit is 0,
+	     all bits of EVEX.vvvv and EVEX.V' must be 1.  */
+	  if (!ins->vex.b && (ins->vex.register_specifier
+				  || !ins->vex.v))
+	    return &bad_opcode;
+	  ins->rex |= REX_OPCODE;
+	}
 
       ins->need_vex = 4;
 
@@ -9087,7 +9104,7 @@  get_valid_dis386 (const struct dis386 *dp, instr_info *ins)
 	return &err_opcode;
 
       /* Set vector length.  */
-      if (ins->modrm.mod == 3 && ins->vex.b)
+      if (ins->modrm.mod == 3 && ins->vex.b && ins->evex_type == evex_default)
 	ins->vex.length = 512;
       else
 	{
@@ -9530,6 +9547,7 @@  print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
 		    {
 		      oappend (&ins, "{bad}");
 		      continue;
+
 		    }
 
 		  /* Instructions with a mask register destination allow for
@@ -9553,7 +9571,7 @@  print_insn (bfd_vma pc, disassemble_info *info, int intel_syntax)
 
 	  /* Check whether rounding control was enabled for an insn not
 	     supporting it.  */
-	  if (ins.modrm.mod == 3 && ins.vex.b
+	  if (ins.modrm.mod == 3 && ins.vex.b && ins.evex_type == evex_default
 	      && !(ins.evex_used & EVEX_b_used))
 	    {
 	      for (i = 0; i < MAX_OPERANDS; ++i)
@@ -11013,7 +11031,7 @@  print_displacement (instr_info *ins, bfd_signed_vma val)
 static void
 intel_operand_size (instr_info *ins, int bytemode, int sizeflag)
 {
-  if (ins->vex.b)
+  if (ins->vex.b && ins->evex_type == evex_default)
     {
       if (!ins->vex.no_broadcast)
 	switch (bytemode)
@@ -11946,7 +11964,7 @@  OP_E_memory (instr_info *ins, int bytemode, int sizeflag)
 	  print_operand_value (ins, disp & 0xffff, dis_style_text);
 	}
     }
-  if (ins->vex.b)
+  if (ins->vex.b && ins->evex_type == evex_default)
     {
       ins->evex_used |= EVEX_b_used;
 
@@ -13307,6 +13325,14 @@  OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
   if (!ins->need_vex)
     return true;
 
+  if (ins->evex_type == evex_from_legacy)
+    {
+      ins->evex_used |= EVEX_b_used;
+      /* Here vex.b is treated as "EVEX.ND.  */
+      if (!ins->vex.b)
+	return true;
+    }
+
   reg = ins->vex.register_specifier;
   ins->vex.register_specifier = 0;
   if (ins->address_mode != mode_64bit)
@@ -13398,12 +13424,19 @@  OP_VEX (instr_info *ins, int bytemode, int sizeflag ATTRIBUTE_UNUSED)
 	  names = att_names_xmm;
 	  ins->evex_used |= EVEX_len_used;
 	  break;
+	case v_mode:
 	case dq_mode:
 	  if (ins->rex & REX_W)
 	    names = att_names64;
+	  else if (bytemode == v_mode
+		   && !(sizeflag & DFLAG))
+	    names = att_names16;
 	  else
 	    names = att_names32;
 	  break;
+	case b_mode:
+	  names = att_names8rex;
+	  break;
 	case mask_bd_mode:
 	case mask_mode:
 	  if (reg > 0x7)
diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c
index 3ab2362a3cc..2e6ae807bbe 100644
--- a/opcodes/i386-gen.c
+++ b/opcodes/i386-gen.c
@@ -473,6 +473,7 @@  static bitfield opcode_modifiers[] =
   BITFIELD (IntelSyntax),
   BITFIELD (ISA64),
   BITFIELD (NoEgpr),
+  BITFIELD (NF),
 };
 
 #define CLASS(n) #n, n
diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h
index d7d28bf3d93..bb826bbdb34 100644
--- a/opcodes/i386-opc.h
+++ b/opcodes/i386-opc.h
@@ -636,7 +636,10 @@  enum
   /* How to encode VEX.vvvv:
      0: VEX.vvvv must be 1111b.
      1: VEX.vvvv encodes one of the register operands.
+     2: VEX.vvvv encodes as the dest register operands.
    */
+#define VexVVVV_SRC   1
+#define VexVVVV_DST   2
   VexVVVV,
   /* How the VEX.W bit is used:
      0: Set by the REX.W bit.
@@ -749,6 +752,9 @@  enum
   /* egprs (r16-r31) on instruction illegal.  */
   NoEgpr,
 
+  /* No CSPAZO flags update indication.  */
+  NF,
+
   /* The last bitfield in i386_opcode_modifier.  */
   Opcode_Modifier_Num
 };
@@ -779,7 +785,7 @@  typedef struct i386_opcode_modifier
   unsigned int immext:1;
   unsigned int norex64:1;
   unsigned int vex:2;
-  unsigned int vexvvvv:1;
+  unsigned int vexvvvv:2;
   unsigned int vexw:2;
   unsigned int opcodeprefix:2;
   unsigned int sib:3;
@@ -797,6 +803,7 @@  typedef struct i386_opcode_modifier
   unsigned int intelsyntax:1;
   unsigned int isa64:2;
   unsigned int noegpr:1;
+  unsigned int nf:1;
 } i386_opcode_modifier;
 
 /* Operand classes.  */
diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl
index bb42270483b..b1f7491e7d7 100644
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -138,6 +138,9 @@ 
 #define Vsz512 Vsz=VSZ512
 
 #define APX_F APX_F|x64
+#define VexVVVVSrc  VexVVVV=VexVVVV_SRC
+#define VexVVVVDest VexVVVV=VexVVVV_DST
+
 
 // The EVEX purpose of StaticRounding appears only together with SAE. Re-use
 // the bit to mark commutative VEX encodings where swapping the source
@@ -190,6 +193,8 @@  mov, 0xf21, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { De
 mov, 0xf21, x64, D|RegMem|No_bSuf|No_wSuf|No_lSuf|No_sSuf|NoRex64, { Debug, Reg64 }
 mov, 0xf24, i386|No64, D|RegMem|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Test, Reg32 }
 
+// Move after swapping the bytes
+movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 // Move after swapping the bytes
 movbe, 0x0f38f0, Movbe, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 movbe, 0x60, Movbe|APX_F, D|Modrm|CheckOperandSize|No_bSuf|No_sSuf|EVex128|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
@@ -290,22 +295,36 @@  add, 0x0, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg3
 add, 0x83/0, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 add, 0x4, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 add, 0x80/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+add, 0x0, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+add, 0x83/0, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+add, 0x80/0, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}
 
 inc, 0x40, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
 inc, 0xfe/0, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+inc, 0xfe/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, {Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64}
 
 sub, 0x28, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sub, 0x83/5, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 sub, 0x2c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 sub, 0x80/5, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sub, 0x28, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64, }
+sub, 0x83/5, APX_F, Modrm|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+sub, 0x80/5, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 dec, 0x48, No64, No_bSuf|No_sSuf|No_qSuf, { Reg16|Reg32 }
 dec, 0xfe/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+dec, 0xfe/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 sbb, 0x18, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sbb, 0x83/3, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 sbb, 0x1c, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 sbb, 0x80/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sbb, 0x18, APX_F, D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sbb, 0x83/3, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
+sbb, 0x80/3, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sbb, 0x18, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sbb, 0x83/3, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+sbb, 0x80/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 cmp, 0x38, 0, D|W|CheckOperandSize|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 cmp, 0x83/7, 0, Modrm|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
@@ -320,16 +339,25 @@  and, 0x20, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|
 and, 0x83/4, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock|Optimize, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 and, 0x24, 0, W|No_sSuf|Optimize, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 and, 0x80/4, 0, W|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+and, 0x20, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+and, 0x83/4, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+and, 0x80/4, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 or, 0x8, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 or, 0x83/1, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 or, 0xc, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 or, 0x80/1, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+or, 0x8, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+or, 0x83/1, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+or, 0x80/1, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 xor, 0x30, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 xor, 0x83/6, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 xor, 0x34, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 xor, 0x80/6, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+xor, 0x30, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF|Optimize, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+xor, 0x83/6, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+xor, 0x80/6, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 // clr with 1 operand is really xor with 2 operands.
 clr, 0x30, 0, W|Modrm|No_sSuf|RegKludge|Optimize, { Reg8|Reg16|Reg32|Reg64 }
@@ -338,10 +366,19 @@  adc, 0x10, 0, D|W|CheckOperandSize|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg
 adc, 0x83/2, 0, Modrm|No_bSuf|No_sSuf|HLEPrefixLock, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 adc, 0x14, 0, W|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Acc|Byte|Word|Dword|Qword }
 adc, 0x80/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+adc, 0x10, APX_F, D|W|CheckOperandSize|Modrm|EVex128|EVexMap4|No_sSuf, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+adc, 0x83/2, APX_F, Modrm|EVex128|EVexMap4|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
+adc, 0x80/2, APX_F, W|Modrm|EVex128|EVexMap4|No_sSuf, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+adc, 0x10, APX_F, D|W|CheckOperandSize|Modrm|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+adc, 0x83/2, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+adc, 0x80/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Imm8|Imm16|Imm32|Imm32S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 neg, 0xf6/3, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+neg, 0xf6/3, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 not, 0xf6/2, 0, W|Modrm|No_sSuf|HLEPrefixLock, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+not, 0xf6/2, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+not, 0xf6/2, APX_F, W|Modrm|CheckOperandSize|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 aaa, 0x37, No64, NoSuf, {}
 aas, 0x3f, No64, NoSuf, {}
@@ -375,6 +412,7 @@  cqto, 0x99, x64, Size64|NoSuf, {}
 mul, 0xf6/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 imul, 0xf6/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 imul, 0xfaf, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|Word|Dword|Qword|BaseIndex, Reg16|Reg32|Reg64 }
+imul, 0xaf, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg16|Reg32|Reg64|Unspecified|Word|Dword|Qword|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
 imul, 0x6b, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 imul, 0x69, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm16|Imm32|Imm32S, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 // imul with 2 operands mimics imul with 3 by putting the register in
@@ -392,49 +430,95 @@  rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|
 rol, 0xc0/0, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rol, 0xd2/0, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rol, 0xd0/0, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rol, 0xd0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|NF|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rol, 0xc0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|NF|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rol, 0xd2/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|NF|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rol, 0xd0/0, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|NF|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ror, 0xc0/1, i186, W|Modrm|No_sSuf, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ror, 0xd2/1, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 ror, 0xd0/1, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+ror, 0xd0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+ror, 0xc0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8|Imm8S, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+ror, 0xd2/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+ror, 0xd0/1, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcl, 0xc0/2, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcl, 0xd2/2, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcl, 0xd0/2, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcl, 0xc0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcl, 0xd2/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcl, 0xd0/2, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcr, 0xc0/3, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcr, 0xd2/3, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 rcr, 0xd0/3, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcr, 0xc0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcr, 0xd2/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+rcr, 0xd0/3, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sal, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sal, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sal, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sal, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sal, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sal, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sal, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shl, 0xc0/4, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shl, 0xd2/4, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shl, 0xd0/4, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+shl, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shl, 0xc0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shl, 0xd2/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shl, 0xd0/4, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shr, 0xc0/5, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shr, 0xd2/5, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 shr, 0xd0/5, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+shr, 0xd0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shr, 0xc0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shr, 0xd2/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+shr, 0xd0/5, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sar, 0xc0/7, i186, W|Modrm|No_sSuf, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sar, 0xd2/7, 0, W|Modrm|No_sSuf, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
 sar, 0xd0/7, 0, W|Modrm|No_sSuf, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex }
+sar, 0xd0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm1, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sar, 0xc0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sar, 0xd2/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
+sar, 0xd0/7, APX_F, W|Modrm|No_sSuf|CheckOperandSize|VexVVVVDest|EVex128|EVexMap4|NF, { Reg8|Reg16|Reg32|Reg64|Byte|Word|Dword|Qword|Unspecified|BaseIndex, Reg8|Reg16|Reg32|Reg64 }
 
 shld, 0xfa4, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 shld, 0xfa5, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
+shld, 0x24, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+shld, 0xa5, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 shrd, 0xfac, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
 shrd, 0xfad, i386, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex }
+shrd, 0x2c, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Imm8, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { ShiftCount, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+shrd, 0xad, APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4|NF, { Reg16|Reg32|Reg64, Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // Control transfer instructions.
 call, 0xe8, No64, JumpDword|DefaultSize|No_bSuf|No_sSuf|No_qSuf|BNDPrefixOk, { Disp16|Disp32 }
@@ -940,6 +1024,7 @@  ud2b, 0xfb9, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|U
 ud0, 0xfff, i186, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 cmov<cc>, 0xf4<cc:opc>, CMOV, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
+cmov<cc>, 0x4<cc:opc>, CMOV|APX_F, Modrm|CheckOperandSize|No_bSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64, Reg16|Reg32|Reg64 }
 
 fcmovb, 0xda/0, i687, Modrm|NoSuf, { FloatReg, FloatAcc }
 fcmovnae, 0xda/0, i687, Modrm|NoSuf, { FloatReg, FloatAcc }
@@ -985,12 +1070,12 @@  pause, 0xf390, i186, NoSuf, {}
 // MMX/SSE2 instructions.
 
 <mmx:cpu:pfx:attr:reg:mem, +
-    $avx:AVX:66:Vex128|VexVVVV|VexW0|SSE2AVX:RegXMM:Xmmword, +
+    $avx:AVX:66:Vex128|VexVVVVSrc|VexW0|SSE2AVX:RegXMM:Xmmword, +
     $sse:SSE2:66::RegXMM:Xmmword, +
     $mmx:MMX:::RegMMX:Qword>
 
 <sse2:cpu:attr:scal:vvvv, +
-    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
+    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVVSrc, +
     $sse:SSE2:::>
 
 <bw:opc:vexw:elem:kcpu:kpfx:cpubmi, +
@@ -1073,7 +1158,7 @@  pxor<mmx>, 0x<mmx:pfx>0fef, <mmx:cpu>, Modrm|<mmx:attr>|C|NoSuf, { <mmx:reg>|<mm
 // SSE instructions.
 
 <sse:cpu:attr:scal:vvvv, +
-    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, +
+    $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVVSrc, +
     $sse:SSE:::>
 <frel:imm:comm, eq:0:C, lt:1:, le:2:, unord:3:C, neq:4:C, nlt:5:, nle:6:, ord:7:C>
 
@@ -1089,8 +1174,8 @@  comiss<sse>, 0x0f2f, <sse:cpu>, Modrm|<sse:scal>|NoSuf, { Dword|Unspecified|Base
 cvtpi2ps, 0xf2a, SSE, Modrm|NoSuf, { Qword|Unspecified|BaseIndex|RegMMX, RegXMM }
 cvtps2pi, 0xf2d, SSE, Modrm|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegMMX }
 cvtsi2ss<sse>, 0xf30f2a, <sse:cpu>|No64, Modrm|<sse:scal>|<sse:vvvv>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
-cvtsi2ss, 0xf32a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVV|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
-cvtsi2ss, 0xf32a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVV|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+cvtsi2ss, 0xf32a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVVSrc|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+cvtsi2ss, 0xf32a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 cvtsi2ss, 0xf30f2a, SSE|x64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 cvtsi2ss, 0xf30f2a, SSE|x64, Modrm|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 cvtss2si, 0xf32d, AVX, Modrm|VexLIG|Space0F|No_bSuf|No_wSuf|No_sSuf|SSE2AVX, { Dword|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
@@ -1108,11 +1193,11 @@  minps<sse>, 0x0f5d, <sse:cpu>, Modrm|<sse:attr>|<sse:vvvv>|NoSuf, { RegXMM|Unspe
 minss<sse>, 0xf30f5d, <sse:cpu>, Modrm|<sse:scal>|<sse:vvvv>|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movaps<sse>, 0x0f28, <sse:cpu>, D|Modrm|<sse:attr>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 movhlps<sse>, 0x0f12, <sse:cpu>, Modrm|<sse:attr>|<sse:vvvv>|NoSuf, { RegXMM, RegXMM }
-movhps, 0x16, AVX, Modrm|Vex|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
+movhps, 0x16, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movhps, 0x17, AVX, Modrm|Vex|Space0F|VexW0|NoSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movhps, 0xf16, SSE, D|Modrm|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 movlhps<sse>, 0x0f16, <sse:cpu>, Modrm|<sse:attr>|<sse:vvvv>|NoSuf, { RegXMM, RegXMM }
-movlps, 0x12, AVX, Modrm|Vex|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
+movlps, 0x12, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movlps, 0x13, AVX, Modrm|Vex|Space0F|VexW0|NoSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movlps, 0xf12, SSE, D|Modrm|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 movmskps<sse>, 0x0f50, <sse:cpu>, Modrm|<sse:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|NoRex64, { RegXMM, Reg32|Reg64 }
@@ -1120,7 +1205,7 @@  movntps<sse>, 0x0f2b, <sse:cpu>, Modrm|<sse:attr>|NoSuf, { RegXMM, Xmmword|Unspe
 movntq, 0xfe7, SSE|3dnowA, Modrm|NoSuf, { RegMMX, Qword|Unspecified|BaseIndex }
 movntdq<sse2>, 0x660fe7, <sse2:cpu>, Modrm|<sse2:attr>|NoSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movss, 0xf310, AVX, D|Modrm|VexLIG|Space0F|VexW0|NoSuf|SSE2AVX, { Dword|Unspecified|BaseIndex, RegXMM }
-movss, 0xf310, AVX, D|Modrm|VexLIG|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { RegXMM, RegXMM }
+movss, 0xf310, AVX, D|Modrm|VexLIG|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { RegXMM, RegXMM }
 movss, 0xf30f10, SSE, D|Modrm|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movups<sse>, 0x0f10, <sse:cpu>, D|Modrm|<sse:attr>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 mulps<sse>, 0x0f59, <sse:cpu>, Modrm|<sse:attr>|<sse:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1184,8 +1269,8 @@  cvtpi2pd, 0x660f2a, SSE2, Modrm|NoSuf, { RegMMX, RegXMM }
 cvtpi2pd, 0xf3e6, AVX, Modrm|Vex|Space0F|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 cvtpi2pd, 0x660f2a, SSE2, Modrm|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 cvtsi2sd<sse2>, 0xf20f2a, <sse2:cpu>|No64, Modrm|IgnoreSize|<sse2:scal>|<sse2:vvvv>|No_bSuf|No_wSuf|No_sSuf|No_qSuf, { Reg32|Unspecified|BaseIndex, RegXMM }
-cvtsi2sd, 0xf22a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVV|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
-cvtsi2sd, 0xf22a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVV|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+cvtsi2sd, 0xf22a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVVSrc|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
+cvtsi2sd, 0xf22a, AVX|x64, Modrm|Vex=3|Space0F|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf|SSE2AVX|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 cvtsi2sd, 0xf20f2a, SSE2|x64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 cvtsi2sd, 0xf20f2a, SSE2|x64, Modrm|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, RegXMM }
 divpd<sse2>, 0x660f5e, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1195,16 +1280,16 @@  maxsd<sse2>, 0xf20f5f, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|NoSuf, { Qword|
 minpd<sse2>, 0x660f5d, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 minsd<sse2>, 0xf20f5d, <sse2:cpu>, Modrm|<sse2:scal>|<sse2:vvvv>|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movapd<sse2>, 0x660f28, <sse2:cpu>, D|Modrm|<sse2:attr>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-movhpd, 0x6616, AVX, Modrm|Vex|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
+movhpd, 0x6616, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movhpd, 0x6617, AVX, Modrm|Vex|Space0F|VexW0|NoSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movhpd, 0x660f16, SSE2, D|Modrm|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
-movlpd, 0x6612, AVX, Modrm|Vex|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
+movlpd, 0x6612, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
 movlpd, 0x6613, AVX, Modrm|Vex|Space0F|VexW0|NoSuf|SSE2AVX, { RegXMM, Qword|Unspecified|BaseIndex }
 movlpd, 0x660f12, SSE2, D|Modrm|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 movmskpd<sse2>, 0x660f50, <sse2:cpu>, Modrm|<sse2:attr>|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|NoRex64, { RegXMM, Reg32|Reg64 }
 movntpd<sse2>, 0x660f2b, <sse2:cpu>, Modrm|<sse2:attr>|NoSuf, { RegXMM, Xmmword|Unspecified|BaseIndex }
 movsd, 0xf210, AVX, D|Modrm|VexLIG|Space0F|VexW0|NoSuf|SSE2AVX, { Qword|Unspecified|BaseIndex, RegXMM }
-movsd, 0xf210, AVX, D|Modrm|VexLIG|Space0F|VexVVVV|VexW0|NoSuf|SSE2AVX, { RegXMM, RegXMM }
+movsd, 0xf210, AVX, D|Modrm|VexLIG|Space0F|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { RegXMM, RegXMM }
 movsd, 0xf20f10, SSE2, D|Modrm|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 movupd<sse2>, 0x660f10, <sse2:cpu>, D|Modrm|<sse2:attr>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 mulpd<sse2>, 0x660f59, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1255,7 +1340,7 @@  punpcklqdq<sse2>, 0x660f6c, <sse2:cpu>, Modrm|<sse2:attr>|<sse2:vvvv>|NoSuf, { R
 
 // SSE3 instructions.
 
-<sse3:cpu:attr:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexVVVV, $sse:SSE3::>
+<sse3:cpu:attr:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexVVVVSrc, $sse:SSE3::>
 
 addsubpd<sse3>, 0x660fd0, <sse3:cpu>, Modrm|<sse3:attr>|<sse3:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 addsubps<sse3>, 0xf20fd0, <sse3:cpu>, Modrm|<sse3:attr>|<sse3:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1333,7 +1418,7 @@  invpcid, 0xf3f2, APX_F|INVPCID, Modrm|NoSuf|EVex128|EVexMap4, { Oword|Unspecifie
 // SSSE3 instructions.
 
 <ssse3:cpu:pfx:attr:vvvv:reg:mem, +
-    $avx:AVX:66:Vex128|VexW0|SSE2AVX:VexVVVV:RegXMM:Xmmword, +
+    $avx:AVX:66:Vex128|VexW0|SSE2AVX:VexVVVVSrc:RegXMM:Xmmword, +
     $sse:SSSE3:66:::RegXMM:Xmmword, +
     $mmx:SSSE3::::RegMMX:Qword>
 
@@ -1354,12 +1439,12 @@  pabsd<ssse3>, 0x<ssse3:pfx>0f381e, <ssse3:cpu>, Modrm|<ssse3:attr>|NoSuf, { <sss
 
 // SSE4.1 instructions.
 
-<sse41:cpu:attr:scal:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVV, $sse:SSE4_1:::>
+<sse41:cpu:attr:scal:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexLIG|VexW0|SSE2AVX:VexVVVVSrc, $sse:SSE4_1:::>
 <sd:ppfx:spfx:opc:vexw:elem, s::f3:0:VexW0:Dword, d:66:f2:1:VexW1:Qword>
 
 blendp<sd><sse41>, 0x660f3a0c | <sd:opc>, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
-blendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex128|Space0F3A|VexVVVV|VexW0|NoSuf|SSE2AVX, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
-blendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex128|Space0F3A|VexVVVV|VexW0|NoSuf|Implicit1stXmm0|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
+blendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex128|Space0F3A|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
+blendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex128|Space0F3A|VexVVVVSrc|VexW0|NoSuf|Implicit1stXmm0|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 blendvp<sd>, 0x660f3814 | <sd:opc>, SSE4_1, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 blendvp<sd>, 0x660f3814 | <sd:opc>, SSE4_1, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 dpp<sd><sse41>, 0x660f3a40 | <sd:opc>, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1371,8 +1456,8 @@  insertps<sse41>, 0x660f3a21, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf,
 movntdqa<sse41>, 0x660f382a, <sse41:cpu>, Modrm|<sse41:attr>|NoSuf, { Xmmword|Unspecified|BaseIndex, RegXMM }
 mpsadbw<sse41>, 0x660f3a42, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
 packusdw<sse41>, 0x660f382b, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
-pblendvb, 0x664c, AVX, Modrm|Vex128|Space0F3A|VexVVVV|VexW0|NoSuf|SSE2AVX, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
-pblendvb, 0x664c, AVX, Modrm|Vex128|Space0F3A|VexVVVV|VexW0|NoSuf|Implicit1stXmm0|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pblendvb, 0x664c, AVX, Modrm|Vex128|Space0F3A|VexVVVVSrc|VexW0|NoSuf|SSE2AVX, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
+pblendvb, 0x664c, AVX, Modrm|Vex128|Space0F3A|VexVVVVSrc|VexW0|NoSuf|Implicit1stXmm0|SSE2AVX, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pblendvb, 0x660f3810, SSE4_1, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 pblendvb, 0x660f3810, SSE4_1, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pblendw<sse41>, 0x660f3a0e, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1386,7 +1471,7 @@  phminposuw<sse41>, 0x660f3841, <sse41:cpu>, Modrm|<sse41:attr>|NoSuf, { RegXMM|U
 pinsrb<sse41>, 0x660f3a20, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf|IgnoreSize|NoRex64, { Imm8, Reg32|Reg64, RegXMM }
 pinsrb<sse41>, 0x660f3a20, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM }
 pinsrd<sse41>, 0x660f3a22, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf|IgnoreSize, { Imm8, Reg32|Unspecified|BaseIndex, RegXMM }
-pinsrq, 0x6622, AVX|x64, Modrm|Vex|Space0F3A|VexVVVV|VexW1|NoSuf|SSE2AVX, { Imm8, Reg64|Unspecified|BaseIndex, RegXMM }
+pinsrq, 0x6622, AVX|x64, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|NoSuf|SSE2AVX, { Imm8, Reg64|Unspecified|BaseIndex, RegXMM }
 pinsrq, 0x660f3a22, SSE4_1|x64, Modrm|Size64|NoSuf, { Imm8, Reg64|Unspecified|BaseIndex, RegXMM }
 pmaxsb<sse41>, 0x660f383c, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pmaxsd<sse41>, 0x660f383d, <sse41:cpu>, Modrm|<sse41:attr>|<sse41:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1416,7 +1501,7 @@  rounds<sd><sse41>, 0x660f3a0a | <sd:opc>, <sse41:cpu>, Modrm|<sse41:scal>|<sse41
 
 // SSE4.2 instructions.
 
-<sse42:cpu:attr:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexVVVV, $sse:SSE4_2::>
+<sse42:cpu:attr:vvvv, $avx:AVX:Vex128|VexW0|SSE2AVX:VexVVVVSrc, $sse:SSE4_2::>
 
 pcmpgtq<sse42>, 0x660f3837, <sse42:cpu>, Modrm|<sse42:attr>|<sse42:vvvv>|NoSuf|Optimize, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pcmpestri<sse42>, 0x660f3a61, <sse42:cpu>|No64, Modrm|<sse42:attr>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1447,7 +1532,7 @@  xsaveopt64, 0xfae/6, Xsaveopt|x64, Modrm|NoSuf|Size64, { Unspecified|BaseIndex }
 
 // AES instructions.
 
-<aes:cpu:attr:vvvv, $avx:AVX|:Vex128|VexW0|SSE2AVX:VexVVVV, $sse:::>
+<aes:cpu:attr:vvvv, $avx:AVX|:Vex128|VexW0|SSE2AVX:VexVVVVSrc, $sse:::>
 
 aesdec<aes>, 0x660f38de, <aes:cpu>AES, Modrm|<aes:attr>|<aes:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 aesdeclast<aes>, 0x660f38df, <aes:cpu>AES, Modrm|<aes:attr>|<aes:vvvv>|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1458,7 +1543,7 @@  aeskeygenassist<aes>, 0x660f3adf, <aes:cpu>AES, Modrm|<aes:attr>|NoSuf, { Imm8,
 
 // PCLMULQDQ
 
-<pclmul:cpu:attr, $avx:AVX|:Vex128|VexW0|SSE2AVX|VexVVVV, $sse::>
+<pclmul:cpu:attr, $avx:AVX|:Vex128|VexW0|SSE2AVX|VexVVVVSrc, $sse::>
 
 pclmulqdq<pclmul>, 0x660f3a44, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
 pclmullqlqdq<pclmul>, 0x660f3a44/0x00, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr>|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1468,7 +1553,7 @@  pclmulhqhqdq<pclmul>, 0x660f3a44/0x11, <pclmul:cpu>PCLMULQDQ, Modrm|<pclmul:attr
 
 // GFNI
 
-<gfni:cpu:w0:w1, $avx:AVX|:Vex128|VexW0|SSE2AVX|VexVVVV:Vex128|VexW1|SSE2AVX|VexVVVV, $sse:::>
+<gfni:cpu:w0:w1, $avx:AVX|:Vex128|VexW0|SSE2AVX|VexVVVVSrc:Vex128|VexW1|SSE2AVX|VexVVVV, $sse:::>
 
 gf2p8affineqb<gfni>, 0x660f3ace, <gfni:cpu>GFNI, Modrm|<gfni:w1>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
 gf2p8affineinvqb<gfni>, 0x660f3acf, <gfni:cpu>GFNI, Modrm|<gfni:w1>|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1493,21 +1578,21 @@  gf2p8mulb<gfni>, 0x660f38cf, <gfni:cpu>GFNI, Modrm|<gfni:w0>|NoSuf, { RegXMM|Uns
     x:Vex128:ATTSyntax:RegXMM|Unspecified|BaseIndex, +
     y:Vex256:ATTSyntax:RegYMM|Unspecified|BaseIndex>
 
-vaddp<sd>, 0x<sd:ppfx>58, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vadds<sd>, 0x<sd:spfx>58, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaddsubpd, 0x66d0, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vaddsubps, 0xf2d0, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vandnp<sd>, 0x<sd:ppfx>55, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vandp<sd>, 0x<sd:ppfx>54, AVX, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vblendp<sd>, 0x660c | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vblendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vaddp<sd>, 0x<sd:ppfx>58, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vadds<sd>, 0x<sd:spfx>58, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaddsubpd, 0x66d0, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vaddsubps, 0xf2d0, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandnp<sd>, 0x<sd:ppfx>55, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vandp<sd>, 0x<sd:ppfx>54, AVX, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendp<sd>, 0x660c | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vblendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vbroadcastf128, 0x661a, AVX, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegYMM }
 vbroadcastss, 0x6618, AVX, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Dword|Unspecified|BaseIndex, RegXMM|RegYMM }
-vcmp<frel>p<sd>, 0x<sd:ppfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vcmp<frel>s<sd>, 0x<sd:spfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf|ImmExt, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcmpp<sd>, 0x<sd:ppfx>c2, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vcmps<sd>, 0x<sd:spfx>c2, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { Imm8, <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vcmp<frel>p<sd>, 0x<sd:ppfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmp<frel>s<sd>, 0x<sd:spfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf|ImmExt, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcmpp<sd>, 0x<sd:ppfx>c2, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vcmps<sd>, 0x<sd:spfx>c2, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Imm8, <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcomis<sd>, 0x<sd:ppfx>2f, AVX, Modrm|VexLIG|Space0F|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM }
 vcvtdq2pd, 0xf3e6, AVX, Modrm|Vex128|Space0F|VexWIG|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtdq2pd, 0xf3e6, AVX, Modrm|Vex256|Space0F|VexWIG|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
@@ -1518,35 +1603,35 @@  vcvtps2dq, 0x665b, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspe
 vcvtps2pd, 0x5a, AVX, Modrm|Vex128|Space0F|VexWIG|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vcvtps2pd, 0x5a, AVX, Modrm|Vex256|Space0F|VexWIG|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
 vcvts<sd>2si, 0x<sd:spfx>2d, AVX, Modrm|VexLIG|Space0F|No_bSuf|No_wSuf|No_sSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
-vcvtsd2ss, 0xf25a, AVX, Modrm|Vex=3|Space0F|VexVVVV|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vcvtsi2s<sd>, 0x<sd:spfx>2a, AVX, Modrm|VexLIG|Space0F|VexVVVV|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2s<sd>, 0x<sd:spfx>2a, AVX, Modrm|VexLIG|Space0F|VexVVVV|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtss2sd, 0xf35a, AVX, Modrm|Vex=3|Space0F|VexVVVV|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vcvtsd2ss, 0xf25a, AVX, Modrm|Vex=3|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vcvtsi2s<sd>, 0x<sd:spfx>2a, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2s<sd>, 0x<sd:spfx>2a, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtss2sd, 0xf35a, AVX, Modrm|Vex=3|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vcvttpd2dq<Vxy>, 0x66e6, AVX, Modrm|<Vxy:vex>|Space0F|VexWIG|NoSuf|<Vxy:syntax>, { <Vxy:src>, RegXMM }
 vcvttps2dq, 0xf35b, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vcvtts<sd>2si, 0x<sd:spfx>2c, AVX, Modrm|VexLIG|Space0F|No_bSuf|No_wSuf|No_sSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, Reg32|Reg64 }
-vdivp<sd>, 0x<sd:ppfx>5e, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vdivs<sd>, 0x<sd:spfx>5e, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdppd, 0x6641, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vdpps, 0x6640, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vdivp<sd>, 0x<sd:ppfx>5e, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vdivs<sd>, 0x<sd:spfx>5e, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vdppd, 0x6641, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vdpps, 0x6640, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vextractf128, 0x6619, AVX, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
 vextractps, 0x6617, AVX|AVX512F, Modrm|Vex128|EVex128|Space0F3A|VexWIG|Disp8MemShift=2|NoSuf, { Imm8, RegXMM, Reg32|Dword|Unspecified|BaseIndex }
 vextractps, 0x6617, AVX|AVX512F|x64, RegMem|Vex128|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg64 }
-vhaddpd, 0x667c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vhaddps, 0xf27c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vhsubpd, 0x667d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vhsubps, 0xf27d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vinsertf128, 0x6618, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
-vinsertps, 0x6621, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vhaddpd, 0x667c, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhaddps, 0xf27c, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhsubpd, 0x667d, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vhsubps, 0xf27d, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vinsertf128, 0x6618, AVX, Modrm|Vex256|Space0F3A|VexVVVVSrc|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
+vinsertps, 0x6621, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8, Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vlddqu, 0xf2f0, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM }
 vldmxcsr, 0xae/2, AVX, Modrm|Vex128|Space0F|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex }
 vmaskmovdqu, 0x66f7, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { RegXMM, RegXMM }
-vmaskmovp<sd>, 0x662e | <sd:opc>, AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
-vmaskmovp<sd>, 0x662c | <sd:opc>, AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vmaxp<sd>, 0x<sd:ppfx>5f, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vmaxs<sd>, 0x<sd:spfx>5f, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vminp<sd>, 0x<sd:ppfx>5d, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vmins<sd>, 0x<sd:spfx>5d, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vmaskmovp<sd>, 0x662e | <sd:opc>, AVX, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vmaskmovp<sd>, 0x662c | <sd:opc>, AVX, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vmaxp<sd>, 0x<sd:ppfx>5f, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmaxs<sd>, 0x<sd:spfx>5f, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vminp<sd>, 0x<sd:ppfx>5d, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmins<sd>, 0x<sd:spfx>5d, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vmovap<sd>, 0x<sd:ppfx>28, AVX, D|Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 // vmovd really shouldn't allow for 64bit operand (vmovq is the right
 // mnemonic for copying between Reg64/Mem64 and RegXMM, as is mandated
@@ -1559,11 +1644,11 @@  vmovddup, 0xf212, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { Qword|Unspecified|BaseI
 vmovddup, 0xf212, AVX, Modrm|Vex=2|Space0F|VexWIG|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM }
 vmovdqa, 0x666f, AVX, D|Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vmovdqu, 0xf36f, AVX, D|Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vmovhlps, 0x12, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
-vmovhp<sd>, 0x<sd:ppfx>16, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmovhlps, 0x12, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovhp<sd>, 0x<sd:ppfx>16, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovhp<sd>, 0x<sd:ppfx>17, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex }
-vmovlhps, 0x16, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
-vmovlp<sd>, 0x<sd:ppfx>12, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmovlhps, 0x16, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovlp<sd>, 0x<sd:ppfx>12, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovlp<sd>, 0x<sd:ppfx>13, AVX, Modrm|Vex|Space0F|VexWIG|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex }
 vmovmskp<sd>, 0x<sd:ppfx>50, AVX, Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { RegXMM|RegYMM, Reg32|Reg64 }
 vmovntdq, 0x66e7, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
@@ -1573,78 +1658,78 @@  vmovq, 0xf37e, AVX, Load|Modrm|Vex=1|Space0F|VexWIG|NoSuf, { Qword|Unspecified|B
 vmovq, 0x66d6, AVX, Modrm|Vex=1|Space0F|VexWIG|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 vmovq, 0x666e, AVX|AVX512F|x64, D|Modrm|Vex128|EVex128|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Reg64|Unspecified|BaseIndex, RegXMM }
 vmovs<sd>, 0x<sd:spfx>10, AVX, D|Modrm|VexLIG|Space0F|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex, RegXMM }
-vmovs<sd>, 0x<sd:spfx>10, AVX, D|Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovs<sd>, 0x<sd:spfx>10, AVX, D|Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { RegXMM, RegXMM, RegXMM }
 vmovshdup, 0xf316, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vmovsldup, 0xf312, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vmovup<sd>, 0x<sd:ppfx>10, AVX, D|Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vmpsadbw, 0x6642, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vmulp<sd>, 0x<sd:ppfx>59, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vmuls<sd>, 0x<sd:spfx>59, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vorp<sd>, 0x<sd:ppfx>56, AVX, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmpsadbw, 0x6642, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmulp<sd>, 0x<sd:ppfx>59, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vmuls<sd>, 0x<sd:spfx>59, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vorp<sd>, 0x<sd:ppfx>56, AVX, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpabs<bw>, 0x661c | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F38|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpabsd, 0x661e, AVX|AVX2, Modrm|Vex|Space0F38|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vpackssdw, 0x666b, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpacksswb, 0x6663, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpackusdw, 0x662b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpackuswb, 0x6667, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpadds<bw>, 0x66ec | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpadd<bw>, 0x66fc | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpaddd, 0x66fe, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpaddq, 0x66d4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpaddus<bw>, 0x66dc | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpalignr, 0x660f, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpand, 0x66db, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpandn, 0x66df, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpavg<bw>, 0x66e0 | (3 * <bw:opc>), AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpblendvb, 0x664c, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpblendw, 0x660e, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpeq<bw>, 0x6674 | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpeqd, 0x6676, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpeqq, 0x6629, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpackssdw, 0x666b, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpacksswb, 0x6663, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpackusdw, 0x662b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpackuswb, 0x6667, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpadds<bw>, 0x66ec | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpadd<bw>, 0x66fc | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpaddd, 0x66fe, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpaddq, 0x66d4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpaddus<bw>, 0x66dc | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpalignr, 0x660f, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpand, 0x66db, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpandn, 0x66df, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpavg<bw>, 0x66e0 | (3 * <bw:opc>), AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpblendvb, 0x664c, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpblendw, 0x660e, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpeq<bw>, 0x6674 | <bw:opc>, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpeqd, 0x6676, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpeqq, 0x6629, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpcmpestri, 0x6661, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpestrm, 0x6660, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpestrm, 0x6660, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
-vpcmpgt<bw>, 0x6664 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpgt<bw>, 0x6664 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpcmpistri, 0x6663, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpistrm, 0x6662, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
-vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermilps, 0x660c, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVVSrc|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vpermilps, 0x660c, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vpermilps, 0x6604, AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F3A|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpermilpd, 0x660d, AVX, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpermilpd, 0x660d, AVX, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpermilpd, 0x6605, AVX, Modrm|Vex|Space0F3A|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpextr<dq>, 0x6616, AVX|<dq:cpu64>, Modrm|Vex|Space0F3A|<dq:vexw64>|NoSuf, { Imm8, RegXMM, <dq:gpr>|Unspecified|BaseIndex }
 vpextrw, 0x66c5, AVX, Load|Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextr<bw>, 0x6614 | <bw:opc>, AVX, RegMem|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextr<bw>, 0x6614 | <bw:opc>, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, <bw:elem>|Unspecified|BaseIndex }
-vphaddd, 0x6602, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vphaddsw, 0x6603, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vphaddw, 0x6601, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphaddd, 0x6602, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphaddsw, 0x6603, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphaddw, 0x6601, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vphminposuw, 0x6641, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM }
-vphsubd, 0x6606, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vphsubsw, 0x6607, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vphsubw, 0x6605, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpinsrb, 0x6620, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
-vpinsrb, 0x6620, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpinsr<dq>, 0x6622, AVX|<dq:cpu64>, Modrm|Vex|Space0F3A|VexVVVV|<dq:vexw64>|NoSuf, { Imm8, <dq:gpr>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpinsrw, 0x66c4, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|No_bSuf|No_wSuf|No_sSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
-vpinsrw, 0x66c4, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|NoSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmaddubsw, 0x6604, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaddwd, 0x66f5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxsb, 0x663c, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxsd, 0x663d, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxsw, 0x66ee, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxub, 0x66de, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxud, 0x663f, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmaxuw, 0x663e, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminsb, 0x6638, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminsd, 0x6639, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminsw, 0x66ea, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminub, 0x66da, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminud, 0x663b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpminuw, 0x663a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphsubd, 0x6606, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphsubsw, 0x6607, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vphsubw, 0x6605, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpinsrb, 0x6620, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrb, 0x6620, AVX, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsr<dq>, 0x6622, AVX|<dq:cpu64>, Modrm|Vex|Space0F3A|VexVVVVSrc|<dq:vexw64>|NoSuf, { Imm8, <dq:gpr>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrw, 0x66c4, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|No_bSuf|No_wSuf|No_sSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrw, 0x66c4, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmaddubsw, 0x6604, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaddwd, 0x66f5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxsb, 0x663c, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxsd, 0x663d, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxsw, 0x66ee, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxub, 0x66de, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxud, 0x663f, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmaxuw, 0x663e, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminsb, 0x6638, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminsd, 0x6639, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminsw, 0x66ea, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminub, 0x66da, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminud, 0x663b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpminuw, 0x663a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpmovmskb, 0x66d7, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|No_bSuf|No_wSuf|No_sSuf, { RegXMM|RegYMM, Reg32|Reg64 }
 vpmovsxbd, 0x6621, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
 vpmovsxbq, 0x6622, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM }
@@ -1658,66 +1743,66 @@  vpmovzxbw, 0x6630, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|Ba
 vpmovzxdq, 0x6635, AVX, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpmovzxwd, 0x6633, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vpmovzxwq, 0x6634, AVX|AVX512F|AVX512VL, Modrm|Vex128|EVex128|Masking|Space0F38|VexWIG|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM }
-vpmuldq, 0x6628, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmulhrsw, 0x660b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmulhuw, 0x66e4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmulhw, 0x66e5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmulld, 0x6640, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmullw, 0x66d5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmuludq, 0x66f4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpor, 0x66eb, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsadbw, 0x66f6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|C|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpshufb, 0x6600, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmuldq, 0x6628, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmulhrsw, 0x660b, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmulhuw, 0x66e4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmulhw, 0x66e5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmulld, 0x6640, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmullw, 0x66d5, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmuludq, 0x66f4, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpor, 0x66eb, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsadbw, 0x66f6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|C|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpshufb, 0x6600, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpshufd, 0x6670, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpshufhw, 0xf370, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vpshuflw, 0xf270, AVX|AVX2, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vpsign<bw>, 0x6608 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsignd, 0x660a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsll<dq>, 0x6672 | <dq:opc>/6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsll<dq>, 0x66f2 | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpslldq, 0x6673/7, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsllw, 0x6671/6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsllw, 0x66f1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrad, 0x6672/4, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrad, 0x66e2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsraw, 0x6671/4, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsraw, 0x66e1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrl<dq>, 0x6672 | <dq:opc>/2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrl<dq>, 0x66d2 | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrldq, 0x6673/3, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrlw, 0x6671/2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsrlw, 0x66d1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsub<bw>, 0x66f8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsub<dq>, 0x66fa | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsubs<bw>, 0x66e8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsubus<bw>, 0x66d8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsign<bw>, 0x6608 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsignd, 0x660a, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsll<dq>, 0x6672 | <dq:opc>/6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsll<dq>, 0x66f2 | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpslldq, 0x6673/7, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllw, 0x6671/6, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllw, 0x66f1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrad, 0x6672/4, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrad, 0x66e2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsraw, 0x6671/4, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsraw, 0x66e1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrl<dq>, 0x6672 | <dq:opc>/2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrl<dq>, 0x66d2 | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrldq, 0x6673/3, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrlw, 0x6671/2, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsrlw, 0x66d1, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsub<bw>, 0x66f8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsub<dq>, 0x66fa | <dq:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsubs<bw>, 0x66e8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsubus<bw>, 0x66d8 | <bw:opc>, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vptest, 0x6617, AVX, Modrm|Vex|Space0F38|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpckhbw, 0x6668, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpckhdq, 0x666a, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpckhqdq, 0x666d, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpckhwd, 0x6669, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpcklbw, 0x6660, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpckldq, 0x6662, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpcklqdq, 0x666c, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpunpcklwd, 0x6661, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpxor, 0x66ef, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpckhbw, 0x6668, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpckhdq, 0x666a, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpckhqdq, 0x666d, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpckhwd, 0x6669, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpcklbw, 0x6660, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpckldq, 0x6662, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpcklqdq, 0x666c, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpunpcklwd, 0x6661, AVX|AVX2, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpxor, 0x66ef, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vrcpps, 0x53, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vrcpss, 0xf353, AVX, Modrm|Vex=3|Space0F|VexVVVV|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vrcpss, 0xf353, AVX, Modrm|Vex=3|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vroundp<sd>, 0x6608 | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexWIG|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vrounds<sd>, 0x660a | <sd:opc>, AVX, Modrm|VexLIG|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8, <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vrounds<sd>, 0x660a | <sd:opc>, AVX, Modrm|VexLIG|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8, <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vrsqrtps, 0x52, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vrsqrtss, 0xf352, AVX, Modrm|Vex=3|Space0F|VexVVVV|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vshufp<sd>, 0x<sd:ppfx>c6, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vrsqrtss, 0xf352, AVX, Modrm|Vex=3|Space0F|VexVVVVSrc|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vshufp<sd>, 0x<sd:ppfx>c6, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vsqrtp<sd>, 0x<sd:ppfx>51, AVX, Modrm|Vex|Space0F|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
-vsqrts<sd>, 0x<sd:spfx>51, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vsqrts<sd>, 0x<sd:spfx>51, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vstmxcsr, 0xae/3, AVX, Modrm|Vex128|Space0F|VexWIG|NoSuf, { Dword|Unspecified|BaseIndex }
-vsubp<sd>, 0x<sd:ppfx>5c, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vsubs<sd>, 0x<sd:spfx>5c, AVX, Modrm|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vsubp<sd>, 0x<sd:ppfx>5c, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vsubs<sd>, 0x<sd:spfx>5c, AVX, Modrm|VexLIG|Space0F|VexVVVVSrc|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vtestp<sd>, 0x660e | <sd:opc>, AVX, Modrm|Vex|Space0F38|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vucomis<sd>, 0x<sd:ppfx>2e, AVX, Modrm|VexLIG|Space0F|VexWIG|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM }
-vunpckhp<sd>, 0x<sd:ppfx>15, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vunpcklp<sd>, 0x<sd:ppfx>14, AVX, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vxorp<sd>, 0x<sd:ppfx>57, AVX, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vunpckhp<sd>, 0x<sd:ppfx>15, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vunpcklp<sd>, 0x<sd:ppfx>14, AVX, Modrm|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vxorp<sd>, 0x<sd:ppfx>57, AVX, Modrm|C|Vex|Space0F|VexVVVVSrc|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vzeroall, 0x77, AVX, Vex=2|Space0F|VexWIG|NoSuf, {}
 vzeroupper, 0x77, AVX, Vex|Space0F|VexWIG|NoSuf, {}
 
@@ -1742,59 +1827,59 @@  vpmovzxwq, 0x6634, AVX2|AVX512F|AVX512VL, Modrm|Vex256|EVex256|Masking|Space0F38
 vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM }
 vbroadcastss, 0x6618, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpbroadcast<bw>, 0x6678 | <bw:opc>, AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { <bw:elem>|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
 vpbroadcastd, 0x6658, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexW0|Disp8MemShift|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcastq, 0x6659, AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
-vperm2i128, 0x6646, AVX2, Modrm|Vex=2|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
-vpermd, 0x6636, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vperm2i128, 0x6646, AVX2, Modrm|Vex=2|Space0F3A|VexVVVVSrc|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
+vpermd, 0x6636, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 vpermpd, 0x6601, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
-vpermps, 0x6616, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vpermps, 0x6616, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 vpermq, 0x6600, AVX2|AVX512F, Modrm|Vex256|EVexDYN|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
 vextracti128, 0x6639, AVX2, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Unspecified|BaseIndex|RegXMM }
-vinserti128, 0x6638, AVX2, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
-vpmaskmov<dq>, 0x668e, AVX2, Modrm|Vex|Space0F38|VexVVVV|<dq:vexw>|CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
-vpmaskmov<dq>, 0x668c, AVX2, Modrm|Vex|Space0F38|VexVVVV|<dq:vexw>|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpsllv<dq>, 0x6647, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsravd, 0x6646, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrlv<dq>, 0x6645, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vinserti128, 0x6638, AVX2, Modrm|Vex256|Space0F3A|VexVVVVSrc|VexW0|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegYMM, RegYMM }
+vpmaskmov<dq>, 0x668e, AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|<dq:vexw>|CheckOperandSize|NoSuf, { RegXMM|RegYMM, RegXMM|RegYMM, Xmmword|Ymmword|Unspecified|BaseIndex }
+vpmaskmov<dq>, 0x668c, AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|<dq:vexw>|CheckOperandSize|NoSuf, { Xmmword|Ymmword|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpsllv<dq>, 0x6647, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsravd, 0x6646, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsrlv<dq>, 0x6645, AVX2|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX gather instructions
-vgatherdpd, 0x6692, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
-vgatherdps, 0x6692, AVX2, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB128, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
-vgatherdps, 0x6692, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256, { RegYMM, Dword|Unspecified|BaseIndex, RegYMM }
-vgatherqp<sd>, 0x6693, AVX2, Modrm|Vex|Space0F38|VexVVVV|<sd:vexw>|SwapSources|NoSuf|VecSIB128, { RegXMM, <sd:elem>|Unspecified|BaseIndex, RegXMM }
-vgatherqpd, 0x6693, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW1|SwapSources|NoSuf|VecSIB256, { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
-vgatherqps, 0x6693, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
-vpgatherdd, 0x6690, AVX2, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB128, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
-vpgatherdd, 0x6690, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256, { RegYMM, Dword|Unspecified|BaseIndex, RegYMM }
-vpgatherdq, 0x6690, AVX2, Modrm|Vex|Space0F38|VexVVVV|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
-vpgatherq<dq>, 0x6691, AVX2, Modrm|Vex128|Space0F38|VexVVVV|<dq:vexw>|SwapSources|NoSuf|VecSIB128, { RegXMM, <dq:elem>|Unspecified|BaseIndex, RegXMM }
-vpgatherqd, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf|VecSIB256, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
-vpgatherqq, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW1|SwapSources|NoSuf|VecSIB256, { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
+vgatherdpd, 0x6692, AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
+vgatherdps, 0x6692, AVX2, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB128, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
+vgatherdps, 0x6692, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB256, { RegYMM, Dword|Unspecified|BaseIndex, RegYMM }
+vgatherqp<sd>, 0x6693, AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|<sd:vexw>|SwapSources|NoSuf|VecSIB128, { RegXMM, <sd:elem>|Unspecified|BaseIndex, RegXMM }
+vgatherqpd, 0x6693, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW1|SwapSources|NoSuf|VecSIB256, { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
+vgatherqps, 0x6693, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB256, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
+vpgatherdd, 0x6690, AVX2, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB128, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
+vpgatherdd, 0x6690, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB256, { RegYMM, Dword|Unspecified|BaseIndex, RegYMM }
+vpgatherdq, 0x6690, AVX2, Modrm|Vex|Space0F38|VexVVVVSrc|VexW1|SwapSources|CheckOperandSize|NoSuf|VecSIB128, { RegXMM|RegYMM, Qword|Unspecified|BaseIndex, RegXMM|RegYMM }
+vpgatherq<dq>, 0x6691, AVX2, Modrm|Vex128|Space0F38|VexVVVVSrc|<dq:vexw>|SwapSources|NoSuf|VecSIB128, { RegXMM, <dq:elem>|Unspecified|BaseIndex, RegXMM }
+vpgatherqd, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf|VecSIB256, { RegXMM, Dword|Unspecified|BaseIndex, RegXMM }
+vpgatherqq, 0x6691, AVX2, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW1|SwapSources|NoSuf|VecSIB256, { RegYMM, Qword|Unspecified|BaseIndex, RegYMM }
 
 // AES + AVX
 
-vaesdec, 0x66de, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesdeclast, 0x66df, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesenc, 0x66dc, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vaesenclast, 0x66dd, AVX|AES, Modrm|Vex|Space0F38|VexVVVV|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesdec, 0x66de, AVX|AES, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesdeclast, 0x66df, AVX|AES, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesenc, 0x66dc, AVX|AES, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vaesenclast, 0x66dd, AVX|AES, Modrm|Vex|Space0F38|VexVVVVSrc|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 vaesimc, 0x66db, AVX|AES, Modrm|Vex|Space0F38|VexWIG|NoSuf, { Unspecified|BaseIndex|RegXMM, RegXMM }
 vaeskeygenassist, 0x66df, AVX|AES, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 
 // PCLMULQDQ + AVX
 
-vpclmulqdq, 0x6644, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqlqdq, 0x6644/0x00, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqlqdq, 0x6644/0x01, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmullqhqdq, 0x6644/0x10, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
-vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulqdq, 0x6644, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqlqdq, 0x6644/0x00, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqlqdq, 0x6644/0x01, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmullqhqdq, 0x6644/0x10, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
+vpclmulhqhqdq, 0x6644/0x11, AVX|PCLMULQDQ, Modrm|Vex|Space0F3A|VexVVVVSrc|VexWIG|NoSuf|ImmExt, { Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM }
 
 // GFNI + AVX
 
-vgf2p8affineinvqb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vgf2p8affineqb, 0x66ce, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vgf2p8mulb, 0x66cf, GFNI|AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vgf2p8affineinvqb, 0x66cf, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vgf2p8affineqb, 0x66ce, AVX|GFNI, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vgf2p8mulb, 0x66cf, GFNI|AVX|AVX512F, Modrm|Vex|EVexDYN|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // FSGSBASE, RDRND and F16C
 
@@ -1817,16 +1902,16 @@  vcvtps2ph, 0x661d, F16C, Modrm|Vex=2|Space0F3A|VexW=1|NoSuf, { Imm8, RegYMM, Uns
     d:AVX512F:AVX512DQ:FMA|AVX|AVX512F:66:f2:66:Space0F:Space0F38:1:Vex|EVexDYN:VexLIG|EVexLIG:VexW1:Qword, +
     h:AVX512_FP16:AVX512_FP16:AVX512_FP16::f3::EVexMap5:EVexMap6:0::EVexLIG:VexW0:Word>
 
-vfmadd<fma>p<sdh>, 0x6688 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfmadd<fma>s<sdh>, 0x6689 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vfmaddsub<fma>p<sdh>, 0x6686 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfmsub<fma>p<sdh>, 0x668a | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfmsub<fma>s<sdh>, 0x668b | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vfmsubadd<fma>p<sdh>, 0x6687 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfnmadd<fma>p<sdh>, 0x668c | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfnmadd<fma>s<sdh>, 0x668d | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vfnmsub<fma>p<sdh>, 0x668e | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfnmsub<fma>s<sdh>, 0x668f | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfmadd<fma>p<sdh>, 0x6688 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfmadd<fma>s<sdh>, 0x6689 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfmaddsub<fma>p<sdh>, 0x6686 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfmsub<fma>p<sdh>, 0x668a | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfmsub<fma>s<sdh>, 0x668b | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfmsubadd<fma>p<sdh>, 0x6687 | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfnmadd<fma>p<sdh>, 0x668c | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfnmadd<fma>s<sdh>, 0x668d | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfnmsub<fma>p<sdh>, 0x668e | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vex>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfnmsub<fma>s<sdh>, 0x668f | 0x<fma:opc>, <sdh:fma>, Modrm|<sdh:vexlig>|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // HLE prefixes
 
@@ -1841,35 +1926,35 @@  xtest, 0xf01d6, HLE|RTM, NoSuf, {}
 
 // BMI2 instructions.
 
-bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-bzhi, 0xf5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-mulx, 0xf2f6, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pdep, 0xf2f5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-pext, 0xf3f5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+bzhi, 0xf5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+bzhi, 0xf5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+mulx, 0xf2f6, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+mulx, 0xf2f6, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVDest|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pdep, 0xf2f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pdep, 0xf2f5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pext, 0xf3f5, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+pext, 0xf3f5, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
 rorx, 0xf2f0, BMI2, Modrm|CheckOperandSize|Vex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, Reg32|Reg64 }
 rorx, 0xf2f0, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F3A|No_bSuf|No_wSuf|No_sSuf, { Imm8|Imm8S, Reg32|Reg64|Dword|Qword|Unspecified|BaseIndex, Reg32|Reg64 }
-sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-sarx, 0xf3f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shlx, 0x66f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-shrx, 0xf2f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+sarx, 0xf3f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+sarx, 0xf3f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVDest|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shlx, 0x66f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shlx, 0x66f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shrx, 0xf2f7, BMI2, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+shrx, 0xf2f7, BMI2|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // FMA4 instructions
 
-vfmaddp<sd>, 0x6668 | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfmadds<sd>, 0x666a | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVV|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
-vfmaddsubp<sd>, 0x665c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfmsubaddp<sd>, 0x665e | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfmsubp<sd>, 0x666c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfmsubs<sd>, 0x666e | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVV|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
-vfnmaddp<sd>, 0x6678 | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfnmadds<sd>, 0x667a | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVV|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
-vfnmsubp<sd>, 0x667c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vfnmsubs<sd>, 0x667e | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVV|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
+vfmaddp<sd>, 0x6668 | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfmadds<sd>, 0x666a | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVVSrc|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
+vfmaddsubp<sd>, 0x665c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfmsubaddp<sd>, 0x665e | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfmsubp<sd>, 0x666c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfmsubs<sd>, 0x666e | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVVSrc|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
+vfnmaddp<sd>, 0x6678 | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfnmadds<sd>, 0x667a | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVVSrc|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
+vfnmsubp<sd>, 0x667c | <sd:opc>, FMA4, D|Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vfnmsubs<sd>, 0x667e | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVVSrc|VexW1|NoSuf, { <sd:elem>|Unspecified|BaseIndex|RegXMM, RegXMM, RegXMM, RegXMM }
 
 // XOP instructions
 
@@ -1879,11 +1964,11 @@  vfnmsubs<sd>, 0x667e | <sd:opc>, FMA4, D|Modrm|VexLIG|Space0F3A|VexVVVV|VexW1|No
 
 vfrczp<sd>, 0x80 | <sd:opc>, XOP, Modrm|SpaceXOP09|VexW0|CheckOperandSize|NoSuf|Vex, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM }
 vfrczs<sd>, 0x82 | <sd:opc>, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { <sd:elem>|RegXMM|Unspecified|BaseIndex, RegXMM }
-vpcmov, 0xa2, XOP, D|Modrm|Vex|SpaceXOP08|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcom<sign><xop>, 0xcc | 0x<sign:opc> | <xop:opc>, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpcom<irel><sign><xop>, 0xcc | 0x<sign:opc> | <xop:opc>/<irel:imm>, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpermil2p<sd>, 0x6648 | <sd:opc>, XOP, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpermil2p<sd>, 0x6648 | <sd:opc>, XOP, Modrm|Vex|Space0F3A|VexVVVV|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmov, 0xa2, XOP, D|Modrm|Vex|SpaceXOP08|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcom<sign><xop>, 0xcc | 0x<sign:opc> | <xop:opc>, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpcom<irel><sign><xop>, 0xcc | 0x<sign:opc> | <xop:opc>/<irel:imm>, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf|ImmExt, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpermil2p<sd>, 0x6648 | <sd:opc>, XOP, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpermil2p<sd>, 0x6648 | <sd:opc>, XOP, Modrm|Vex|Space0F3A|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vphaddb<dq>, 0xc2 | <dq:opc>, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
 vphaddbw, 0xc1, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
 vphadddq, 0xcb, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
@@ -1895,23 +1980,23 @@  vphaddw<dq>, 0xc6 | <dq:opc>, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Un
 vphsubbw, 0xe1, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
 vphsubdq, 0xe3, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
 vphsubwd, 0xe2, XOP, Modrm|SpaceXOP09|VexW0|NoSuf|Vex, { RegXMM|Unspecified|BaseIndex, RegXMM }
-vpmacsdd, 0x9e, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacsdqh, 0x9f, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacsdql, 0x97, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacssdd, 0x8e, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacssdqh, 0x8f, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacssdql, 0x87, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacsswd, 0x86, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacssww, 0x85, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacswd, 0x96, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmacsww, 0x95, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmadcsswd, 0xa6, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpmadcswd, 0xb6, XOP, Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpperm, 0xa3, XOP, D|Modrm|Vex128|SpaceXOP08|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vprot<xop>, 0x90 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVV|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
+vpmacsdd, 0x9e, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacsdqh, 0x9f, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacsdql, 0x97, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacssdd, 0x8e, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacssdqh, 0x8f, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacssdql, 0x87, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacsswd, 0x86, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacssww, 0x85, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacswd, 0x96, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmacsww, 0x95, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmadcsswd, 0xa6, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpmadcswd, 0xb6, XOP, Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpperm, 0xa3, XOP, D|Modrm|Vex128|SpaceXOP08|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vprot<xop>, 0x90 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVVSrc|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
 vprot<xop>, 0xc0 | <xop:opc>, XOP, Modrm|Vex128|SpaceXOP08|VexW0|NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
-vpsha<xop>, 0x98 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVV|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
-vpshl<xop>, 0x94 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVV|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
+vpsha<xop>, 0x98 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVVSrc|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
+vpshl<xop>, 0x94 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVVSrc|SwapSources|VexW0|NoSuf, { RegXMM, RegXMM|Unspecified|BaseIndex, RegXMM }
 
 <xop>
 <irel>
@@ -1921,35 +2006,35 @@  vpshl<xop>, 0x94 | <xop:opc>, XOP, D|Modrm|Vex128|SpaceXOP09|VexVVVV|SwapSources
 
 llwpcb, 0x12/0, LWP, Modrm|SpaceXOP09|NoSuf|Vex, { Reg32|Reg64 }
 slwpcb, 0x12/1, LWP, Modrm|SpaceXOP09|NoSuf|Vex, { Reg32|Reg64 }
-lwpval, 0x12/1, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
-lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVV|Vex, { Imm32|Imm32S, Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
+lwpval, 0x12/1, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVVSrc|Vex, { Imm32|Imm32S, Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
+lwpins, 0x12/0, LWP, Modrm|SpaceXOP0A|NoSuf|VexVVVVSrc|Vex, { Imm32|Imm32S, Reg32|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // BMI instructions
 
-andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-andn, 0xf2, BMI|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
-bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-bextr, 0xf7, BMI|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsi, 0xf3/3, BMI|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsmsk, 0xf3/2, BMI|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsr, 0xf3/1, BMI|APX_F, Modrm|CheckOperandSize|EVex128|Space0F38|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+andn, 0xf2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+andn, 0xf2, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
+bextr, 0xf7, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+bextr, 0xf7, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVSrc|SwapSources|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsi, 0xf3/3, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsi, 0xf3/3, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVDest|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsmsk, 0xf3/2, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsmsk, 0xf3/2, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVDest|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsr, 0xf3/1, BMI, Modrm|CheckOperandSize|Vex128|Space0F38|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsr, 0xf3/1, BMI|APX_F, Modrm|CheckOperandSize|EVex128|NF|Space0F38|VexVVVVDest|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 tzcnt, 0xf30fbc, BMI, Modrm|CheckOperandSize|No_bSuf|No_sSuf, { Reg16|Reg32|Reg64|Word|Dword|Qword|Unspecified|BaseIndex, Reg16|Reg32|Reg64 }
 
 // TBM instructions
 
 bextr, 0x10, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP0A|No_bSuf|No_wSuf|No_sSuf, { Imm32|Imm32S, Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blcfill, 0x01/1, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blci, 0x02/6, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blcic, 0x01/5, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blcmsk, 0x02/1, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blcs, 0x01/3, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsfill, 0x01/2, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-blsic, 0x01/6, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-t1mskc, 0x01/7, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
-tzmsk, 0x01/4, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVV|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blcfill, 0x01/1, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blci, 0x02/6, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blcic, 0x01/5, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blcmsk, 0x02/1, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blcs, 0x01/3, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsfill, 0x01/2, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+blsic, 0x01/6, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+t1mskc, 0x01/7, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+tzmsk, 0x01/4, TBM, Modrm|CheckOperandSize|Vex128|SpaceXOP09|VexVVVVSrc|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
 
 // AMD 3DNow! instructions.
 
@@ -2040,7 +2125,11 @@  xstore, 0xfa7c0, PadLock, NoSuf|RepPrefixOk, {}
 
 // Multy-precision Add Carry, rdseed instructions.
 adcx, 0x660f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adcx, 0x6666, ADX|APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVex128|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adcx, 0x6666, ADX|APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
 adox, 0xf30f38f6, ADX, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adox, 0xf366, ADX|APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|EVex128|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64 }
+adox, 0xf366, ADX|APX_F, Modrm|CheckOperandSize|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|VexVVVVDest|EVex128|EVexMap4, { Reg32|Reg64|Unspecified|BaseIndex, Reg32|Reg64, Reg32|Reg64 }
 rdseed, 0xfc7/7, RdSeed, Modrm|NoSuf, { Reg16|Reg32|Reg64 }
 
 // SMAP instructions.
@@ -2081,42 +2170,42 @@  sha256msg2, 0xdd, SHA|APX_F, Modrm|NoSuf|EVex128|EVexMap4, { RegXMM|Unspecified|
 
 // SHA512 instructions.
 
-vsha512rnds2, 0xf2cb, SHA512, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { RegXMM, RegYMM, RegYMM }
+vsha512rnds2, 0xf2cb, SHA512, Modrm|Vex256|Space0F38|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegYMM, RegYMM }
 vsha512msg1, 0xf2cc, SHA512, Modrm|Vex256|Space0F38|VexW0|NoSuf, { RegXMM, RegYMM }
 vsha512msg2, 0xf2cd, SHA512, Modrm|Vex256|Space0F38|VexW0|NoSuf, { RegYMM, RegYMM }
 
 // SHA512 instructions end.
 
 // SM3 instructions.
-vsm3rnds2, 0x66de, SM3, Modrm|Space0F3A|Vex128|VexVVVV|VexW0|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vsm3msg1, 0xda, SM3, Modrm|Space0F38|Vex128|VexVVVV|VexW0|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
-vsm3msg2, 0x66da, SM3, Modrm|Space0F38|Vex128|VexVVVV|VexW0|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vsm3rnds2, 0x66de, SM3, Modrm|Space0F3A|Vex128|VexVVVVSrc|VexW0|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vsm3msg1, 0xda, SM3, Modrm|Space0F38|Vex128|VexVVVVSrc|VexW0|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
+vsm3msg2, 0x66da, SM3, Modrm|Space0F38|Vex128|VexVVVVSrc|VexW0|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // SM3 instructions end.
 
 // SM4 instructions.
 
-vsm4key4, 0xf3da, SM4, Modrm|Space0F38|Vex|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vsm4rnds4, 0xf2da, SM4, Modrm|Space0F38|Vex|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vsm4key4, 0xf3da, SM4, Modrm|Space0F38|Vex|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vsm4rnds4, 0xf2da, SM4, Modrm|Space0F38|Vex|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // SM4 instructions end.
 
 // VAES
 
-vaesdec, 0x66de, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesdeclast, 0x66df, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesenc, 0x66dc, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vaesenclast, 0x66dd, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesdec, 0x66de, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesdeclast, 0x66df, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesenc, 0x66dc, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaesenclast, 0x66dd, VAES|AVX|AVX512F, Modrm|Vex|EVexDYN|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // VAES instructions end
 
 // VPCLMULQDQ instructions
 
-vpclmulqdq, 0x6644, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmullqlqdq, 0x6644/0x00, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmulhqlqdq, 0x6644/0x01, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmullqhqdq, 0x6644/0x10, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmulqdq, 0x6644, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmullqlqdq, 0x6644/0x00, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmulhqlqdq, 0x6644/0x01, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmullqhqdq, 0x6644/0x10, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // VPCLMULQDQ instructions end
 
@@ -2130,11 +2219,11 @@  vpclmulhqhqdq, 0x6644/0x11, VPCLMULQDQ|AVX|AVX512F, Modrm|Space0F3A|Vex|EVexDYN|
     x:AVX512VL:EVex128|Disp8MemShift=4|ATTSyntax:::RegXMM|Unspecified|BaseIndex:RegXMM, +
     y:AVX512VL:EVex256|Disp8MemShift=5|ATTSyntax:::RegYMM|Unspecified|BaseIndex:RegXMM>
 
-kand<bw>, 0x<bw:kpfx>41, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
-kandn<bw>, 0x<bw:kpfx>42, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
-kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
-kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
-kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kand<bw>, 0x<bw:kpfx>41, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kandn<bw>, 0x<bw:kpfx>42, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kor<bw>, 0x<bw:kpfx>45, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kxnor<bw>, 0x<bw:kpfx>46, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kxor<bw>, 0x<bw:kpfx>47, <bw:kcpu>, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 
 kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
 kmov<bw>, 0x<bw:kpfx>90, <bw:kcpu>|APX_F, Modrm|EVex128|Space0F|VexW0|NoSuf, { RegMask|<bw:elem>|Unspecified|BaseIndex, RegMask }
@@ -2149,37 +2238,37 @@  kortest<bw>, 0x<bw:kpfx>98, <bw:kcpu>, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMa
 kshiftl<bw>, 0x6632, <bw:kcpu>, Modrm|Vex128|Space0F3A|<bw:vexw>|NoSuf, { Imm8, RegMask, RegMask }
 kshiftr<bw>, 0x6630, <bw:kcpu>, Modrm|Vex128|Space0F3A|<bw:vexw>|NoSuf, { Imm8, RegMask, RegMask }
 
-kunpckbw, 0x664B, AVX512F, Modrm|Vex=2|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kunpckbw, 0x664B, AVX512F, Modrm|Vex=2|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 
-vaddp<sdh>, 0x<sdh:ppfx>58, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vdivp<sdh>, 0x<sdh:ppfx>5e, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vmulp<sdh>, 0x<sdh:ppfx>59, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vaddp<sdh>, 0x<sdh:ppfx>58, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vdivp<sdh>, 0x<sdh:ppfx>5e, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vmulp<sdh>, 0x<sdh:ppfx>59, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vsqrtp<sdh>, 0x<sdh:ppfx>51, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vsubp<sdh>, 0x<sdh:ppfx>5c, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vadds<sdh>, 0x<sdh:spfx>58, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vdivs<sdh>, 0x<sdh:spfx>5e, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vmuls<sdh>, 0x<sdh:spfx>59, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vsqrts<sdh>, 0x<sdh:spfx>51, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-vsubs<sdh>, 0x<sdh:spfx>5C, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
-
-valign<dq>, 0x6603, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vblendmp<sd>, 0x6665, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpblendm<dq>, 0x6664, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpermi2<dq>, 0x6676, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpermi2p<sd>, 0x6677, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpermt2<dq>, 0x667E, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpermt2p<sd>, 0x667F, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmaxs<dq>, 0x663D, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmaxu<dq>, 0x663F, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmins<dq>, 0x6639, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpminu<dq>, 0x663B, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmuldq, 0x6628, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmulld, 0x6640, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW=1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vprolv<dq>, 0x6615, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vprorv<dq>, 0x6614, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsravq, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpternlog<dq>, 0x6625, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vsubp<sdh>, 0x<sdh:ppfx>5c, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vadds<sdh>, 0x<sdh:spfx>58, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vdivs<sdh>, 0x<sdh:spfx>5e, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmuls<sdh>, 0x<sdh:spfx>59, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vsqrts<sdh>, 0x<sdh:spfx>51, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vsubs<sdh>, 0x<sdh:spfx>5C, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+
+valign<dq>, 0x6603, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vblendmp<sd>, 0x6665, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpblendm<dq>, 0x6664, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermi2<dq>, 0x6676, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermi2p<sd>, 0x6677, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermt2<dq>, 0x667E, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermt2p<sd>, 0x667F, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaxs<dq>, 0x663D, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaxu<dq>, 0x663F, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmins<dq>, 0x6639, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpminu<dq>, 0x663B, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmuldq, 0x6628, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW=2|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmulld, 0x6640, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW=1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vprolv<dq>, 0x6615, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vprorv<dq>, 0x6614, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsravq, 0x6646, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpternlog<dq>, 0x6625, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vbroadcastf32x4, 0x661A, AVX512F, Modrm|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { XMMword|Unspecified|BaseIndex, RegYMM|RegZMM }
 vbroadcasti32x4, 0x665A, AVX512F, Modrm|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { XMMword|Unspecified|BaseIndex, RegYMM|RegZMM }
@@ -2192,11 +2281,11 @@  vbroadcastsd, 0x6619, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift=3|NoS
 vpbroadcastq, 0x6659, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcast<dq>, 0x667c, AVX512F, Modrm|Masking|Space0F38|<dq:vexw64>|NoSuf, { <dq:gpr>, RegXMM|RegYMM|RegZMM }
 
-vcmp<frel>p<sd>, 0x<sd:ppfx>C2/0x<frel:imm>, AVX512F, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vcmpp<sd>, 0x<sd:ppfx>C2, AVX512F, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vcmp<frel>p<sd>, 0x<sd:ppfx>C2/0x<frel:imm>, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vcmpp<sd>, 0x<sd:ppfx>C2, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
-vcmp<frel>s<sd>, 0x<sd:spfx>C2/0x<frel:imm>, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE|ImmExt, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegMask }
-vcmps<sd>, 0x<sd:spfx>C2, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegMask }
+vcmp<frel>s<sd>, 0x<sd:spfx>C2/0x<frel:imm>, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE|ImmExt, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegMask }
+vcmps<sd>, 0x<sd:spfx>C2, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegMask }
 
 vcomis<sdh>, 0x<sdh:ppfx>2f, <sdh:cpu>, Modrm|EVexLIG|<sdh:spc1>|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM }
 vucomis<sdh>, 0x<sdh:ppfx>2e, <sdh:cpu>, Modrm|EVexLIG|<sdh:spc1>|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM }
@@ -2238,23 +2327,23 @@  vcvtps2ph, 0x661D, AVX512F, Modrm|EVex512|Masking|Space0F3A|VexW0|Disp8MemShift=
 vcvts<sd>2si, 0x<sd:spfx>2d, AVX512F, Modrm|EVexLIG|Space0F|Disp8MemShift|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE, { RegXMM|<sd:elem>|Unspecified|BaseIndex, Reg32|Reg64 }
 vcvts<sdh>2usi, 0x<sdh:spfx>79, <sdh:cpu>, Modrm|EVexLIG|<sdh:spc1>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, Reg32|Reg64 }
 
-vcvtsd2ss, 0xF25A, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVV|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsd2ss, 0xF25A, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVVSrc|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sd, 0xF22A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|ATTSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|IntelSyntax, { Reg32|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sd, 0xF27B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtsi2ss, 0xF32A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2ss, 0xF32A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2ss, 0xF37B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2ss, 0xF37B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2ss, 0xF32A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2ss, 0xF32A, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2ss, 0xF37B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2ss, 0xF37B, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtss2sd, 0xF35A, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVV|VexW0|Disp8MemShift=2|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtss2sd, 0xF35A, AVX512F, Modrm|EVexLIG|Masking|Space0F|VexVVVVSrc|VexW0|Disp8MemShift=2|NoSuf|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vcvttpd2dq<Exy>, 0x66e6, AVX512F|<Exy:vl>, Modrm|<Exy:attr>|Masking|Space0F|VexW1|Broadcast|NoSuf|<Exy:sae>, { <Exy:src>|Qword, <Exy:dst> }
 vcvttpd2udq<Exy>, 0x78, AVX512F|<Exy:vl>, Modrm|<Exy:attr>|Masking|Space0F|VexW1|Broadcast|NoSuf|<Exy:sae>, { <Exy:src>|Qword, <Exy:dst> }
@@ -2279,17 +2368,17 @@  vextracti32x4, 0x6639, AVX512F, Modrm|Masking|Space0F3A|VexW=1|Disp8MemShift=4|N
 vextractf64x4, 0x661B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
 vextracti64x4, 0x663B, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexW=2|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
 
-vfixupimmp<sd>, 0x6654, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfixupimms<sd>, 0x6655, AVX512F, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8|Imm8S, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfixupimmp<sd>, 0x6654, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfixupimms<sd>, 0x6655, AVX512F, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8|Imm8S, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vgetmantp<sdh>, 0x<sdh:pfx>26, <sdh:cpu>, Modrm|Masking|Space0F3A|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vgetmants<sdh>, 0x<sdh:pfx>27, <sdh:cpu>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vgetmants<sdh>, 0x<sdh:pfx>27, <sdh:cpu>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vrndscalep<sdh>, 0x<sdh:pfx>08 | <sdh:opc>, <sdh:cpu>, Modrm|Masking|Space0F3A|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vrndscales<sdh>, 0x<sdh:pfx>0a | <sdh:opc>, <sdh:cpu>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrndscales<sdh>, 0x<sdh:pfx>0a | <sdh:opc>, <sdh:cpu>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vscalefp<sdh>, 0x662c, <sdh:cpu>, Modrm|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vscalefs<sdh>, 0x662d, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vscalefp<sdh>, 0x662c, <sdh:cpu>, Modrm|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vscalefs<sdh>, 0x662d, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|StaticRounding|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vgatherdpd, 0x6692, AVX512F, Modrm|EVex=1|Masking|NoDefMask|Space0F38|VexW1|Disp8MemShift=3|VecSIB256|NoSuf, { Qword|Unspecified|BaseIndex, RegZMM }
 vgatherdps, 0x6692, AVX512F, Modrm|EVex512|Masking|NoDefMask|Space0F38|VexW0|Disp8MemShift=2|VecSIB512|NoSuf, { Dword|Unspecified|BaseIndex, RegZMM }
@@ -2303,21 +2392,21 @@  vpgatherqq, 0x6691, AVX512F, Modrm|EVex=1|Masking|NoDefMask|Space0F38|VexW1|Disp
 vmovntdqa, 0x662A, AVX512F, Modrm|Space0F38|VexW=1|Disp8ShiftVL|CheckOperandSize|NoSuf, { XMMword|YMMword|ZMMword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vgetexpp<sdh>, 0x6642, <sdh:cpu>, Modrm|Masking|<sdh:spc2>|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vgetexps<sdh>, 0x6643, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc2>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vgetexps<sdh>, 0x6643, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc2>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vinsertf32x4, 0x6618, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|XMMword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
-vinserti32x4, 0x6638, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|XMMword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vinsertf32x4, 0x6618, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|XMMword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vinserti32x4, 0x6638, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|XMMword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 
-vinsertf64x4, 0x661A, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexVVVV|VexW1|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
-vinserti64x4, 0x663A, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexVVVV|VexW1|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
+vinsertf64x4, 0x661A, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
+vinserti64x4, 0x663A, AVX512F, Modrm|EVex=1|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
 
-vinsertps, 0x6621, AVX512F, Modrm|EVex128|Space0F3A|VexVVVV|VexW0|Disp8MemShift=2|NoSuf, { Imm8, RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vinsertps, 0x6621, AVX512F, Modrm|EVex128|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=2|NoSuf, { Imm8, RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vmaxp<sdh>, 0x<sdh:ppfx>5f, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vmaxs<sdh>, 0x<sdh:spfx>5f, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmaxp<sdh>, 0x<sdh:ppfx>5f, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vmaxs<sdh>, 0x<sdh:spfx>5f, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vminp<sdh>, 0x<sdh:ppfx>5d, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vmins<sdh>, 0x<sdh:spfx>5d, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vminp<sdh>, 0x<sdh:ppfx>5d, <sdh:cpu>, Modrm|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vmins<sdh>, 0x<sdh:spfx>5d, <sdh:cpu>, Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vmovap<sd>, 0x<sd:ppfx>28, AVX512F, D|Modrm|Masking|Space0F|<sd:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vmovntp<sd>, 0x<sd:ppfx>2B, AVX512F, Modrm|Space0F|<sd:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM, XMMword|YMMword|ZMMword|Unspecified|BaseIndex }
@@ -2331,56 +2420,56 @@  vmovntdq, 0x66E7, AVX512F, Modrm|Space0F|VexW=1|Disp8ShiftVL|CheckOperandSize|No
 vmovdqu32, 0xF36F, AVX512F, D|Modrm|Masking|Space0F|VexW=1|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vmovdqu64, 0xF36F, AVX512F, D|Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vmovhlps, 0x12, AVX512F, Modrm|EVex=4|Space0F|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM, RegXMM }
-vmovlhps, 0x16, AVX512F, Modrm|EVex=4|Space0F|VexVVVV|VexW0|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovhlps, 0x12, AVX512F, Modrm|EVex=4|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovlhps, 0x16, AVX512F, Modrm|EVex=4|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegXMM, RegXMM, RegXMM }
 
-vmovhp<sd>, 0x<sd:ppfx>16, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|<sd:vexw>|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmovhp<sd>, 0x<sd:ppfx>16, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|<sd:vexw>|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovhp<sd>, 0x<sd:ppfx>17, AVX512F, Modrm|EVexLIG|Space0F|<sd:vexw>|Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex }
-vmovlp<sd>, 0x<sd:ppfx>12, AVX512F, Modrm|EVexLIG|Space0F|VexVVVV|<sd:vexw>|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vmovlp<sd>, 0x<sd:ppfx>12, AVX512F, Modrm|EVexLIG|Space0F|VexVVVVSrc|<sd:vexw>|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
 vmovlp<sd>, 0x<sd:ppfx>13, AVX512F, Modrm|EVexLIG|Space0F|<sd:vexw>|Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex }
 
 vmovq, 0xF37E, AVX512F, Load|Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { Qword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vmovq, 0x66D6, AVX512F, Modrm|EVex=2|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM, Qword|Unspecified|BaseIndex|RegXMM }
 
 vmovs<sdh>, 0x<sdh:spfx>10, <sdh:cpu>, D|Modrm|EVexLIG|Masking|<sdh:spc1>|<sdh:vexw>|Disp8MemShift|NoSuf, { <sdh:elem>|Unspecified|BaseIndex, RegXMM }
-vmovs<sdh>, 0x<sdh:spfx>10, <sdh:cpu>, D|Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVV|<sdh:vexw>|NoSuf, { RegXMM, RegXMM, RegXMM }
+vmovs<sdh>, 0x<sdh:spfx>10, <sdh:cpu>, D|Modrm|EVexLIG|Masking|<sdh:spc1>|VexVVVVSrc|<sdh:vexw>|NoSuf, { RegXMM, RegXMM, RegXMM }
 
 vmovshdup, 0xF316, AVX512F, Modrm|Masking|Space0F|VexW=1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vmovsldup, 0xF312, AVX512F, Modrm|Masking|Space0F|VexW=1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vpabs<dq>, 0x661e | <dq:opc>, AVX512F, Modrm|Masking|Space0F38|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpaddd, 0x66FE, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpaddq, 0x66d4, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpand<dq>, 0x66db, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpandn<dq>, 0x66df, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmuludq, 0x66f4, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpor<dq>, 0x66eb, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsub<dq>, 0x66fa | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpckhdq, 0x666A, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpckhqdq, 0x666d, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpckldq, 0x6662, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpcklqdq, 0x666c, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpxor<dq>, 0x66ef, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpaddd, 0x66FE, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpaddq, 0x66d4, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpand<dq>, 0x66db, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpandn<dq>, 0x66df, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmuludq, 0x66f4, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpor<dq>, 0x66eb, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsub<dq>, 0x66fa | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpckhdq, 0x666A, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpckhqdq, 0x666d, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpckldq, 0x6662, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpcklqdq, 0x666c, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpxor<dq>, 0x66ef, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 <irel:imm, eq:0, lt:1, le:2, neq:4, nlt:5, nle:6>
 
-vpcmpeqd, 0x6676, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpeqq, 0x6629, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpgtd, 0x6666, AVX512F, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpgtq, 0x6637, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<dq>, 0x661f, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpu<dq>, 0x661e, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<irel><dq>, 0x661f/<irel:imm>, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<irel>u<dq>, 0x661e/<irel:imm>, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpeqd, 0x6676, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpeqq, 0x6629, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpgtd, 0x6666, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpgtq, 0x6637, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<dq>, 0x661f, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpu<dq>, 0x661e, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<irel><dq>, 0x661f/<irel:imm>, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<irel>u<dq>, 0x661e/<irel:imm>, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
-vptestm<dq>, 0x6627, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vptestnm<dq>, 0xf327, AVX512F, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vptestm<dq>, 0x6627, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vptestnm<dq>, 0xf327, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
 vpermilpd, 0x6605, AVX512F, Modrm|Masking|Space0F3A|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpermilpd, 0x660d, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermilpd, 0x660d, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpermpd, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
-vpermq, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vpermpd, 0x6616, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vpermq, 0x6636, AVX512F, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 
 vpmovdb, 0xF331, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegZMM, RegXMM|Unspecified|BaseIndex }
 vpmovsdb, 0xF321, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexW=1|Disp8MemShift=4|NoSuf, { RegZMM, RegXMM|Unspecified|BaseIndex }
@@ -2417,34 +2506,34 @@  vpmovzxwd, 0x6633, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexWIG|Disp8MemShift=
 vpmovsxwq, 0x6624, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegZMM }
 vpmovzxwq, 0x6634, AVX512F, Modrm|EVex=1|Masking|Space0F38|VexWIG|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegZMM }
 
-vprol<dq>, 0x6672/1, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpror<dq>, 0x6672/0, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vprol<dq>, 0x6672/1, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpror<dq>, 0x6672/0, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vpshufd, 0x6670, AVX512F, Modrm|Masking|Space0F|VexW=1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vpsll<dq>, 0x66f2 | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsll<dq>, 0x6672 | <dq:opc>/6, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsra<dq>, 0x66e2, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsra<dq>, 0x6672/4, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsrl<dq>, 0x66d2 | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrl<dq>, 0x6672 | <dq:opc>/2, AVX512F, Modrm|Masking|Space0F|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsll<dq>, 0x66f2 | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsll<dq>, 0x6672 | <dq:opc>/6, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsra<dq>, 0x66e2, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsra<dq>, 0x6672/4, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsrl<dq>, 0x66d2 | <dq:opc>, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsrl<dq>, 0x6672 | <dq:opc>/2, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vrcp14p<sd>, 0x664C, AVX512F, Modrm|Masking|Space0F38|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vrcp14s<sd>, 0x664D, AVX512F, Modrm|EVexLIG|Masking|Space0F38|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrcp14s<sd>, 0x664D, AVX512F, Modrm|EVexLIG|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vrsqrt14p<sd>, 0x664E, AVX512F, Modrm|Masking|Space0F38|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vrsqrt14s<sd>, 0x664F, AVX512F, Modrm|EVexLIG|Masking|Space0F38|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrsqrt14s<sd>, 0x664F, AVX512F, Modrm|EVexLIG|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vshuff32x4, 0x6623, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
-vshufi32x4, 0x6643, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vshuff32x4, 0x6623, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vshufi32x4, 0x6643, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 
-vshuff64x2, 0x6623, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
-vshufi64x2, 0x6643, AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vshuff64x2, 0x6623, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vshufi64x2, 0x6643, AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 
-vshufp<sd>, 0x<sd:ppfx>C6, AVX512F, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vshufp<sd>, 0x<sd:ppfx>C6, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vunpckhp<sd>, 0x<sd:ppfx>15, AVX512F, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vunpcklp<sd>, 0x<sd:ppfx>14, AVX512F, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vunpckhp<sd>, 0x<sd:ppfx>15, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vunpcklp<sd>, 0x<sd:ppfx>14, AVX512F, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512F instructions end.
 
@@ -2464,10 +2553,10 @@  vplzcnt<dq>, 0x6644, AVX512CD, Modrm|Masking|Space0F38|<dq:vexw>|Broadcast|Disp8
 vexp2p<sd>, 0x66C8, AVX512ER, Modrm|EVex512|Masking|Space0F38|<sd:vexw>|Broadcast|Disp8MemShift=6|NoSuf|SAE, { RegZMM|<sd:elem>|Unspecified|BaseIndex, RegZMM }
 
 vrcp28p<sd>, 0x66CA, AVX512ER, Modrm|EVex512|Masking|Space0F38|<sd:vexw>|Broadcast|Disp8MemShift=6|NoSuf|SAE, { RegZMM|<sd:elem>|Unspecified|BaseIndex, RegZMM }
-vrcp28s<sd>, 0x66CB, AVX512ER, Modrm|EVexLIG|Masking|Space0F38|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrcp28s<sd>, 0x66CB, AVX512ER, Modrm|EVexLIG|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vrsqrt28p<sd>, 0x66CC, AVX512ER, Modrm|EVex512|Masking|Space0F38|<sd:vexw>|Broadcast|Disp8MemShift=6|NoSuf|SAE, { RegZMM|<sd:elem>|Unspecified|BaseIndex, RegZMM }
-vrsqrt28s<sd>, 0x66CD, AVX512ER, Modrm|EVexLIG|Masking|Space0F38|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrsqrt28s<sd>, 0x66CD, AVX512ER, Modrm|EVexLIG|Masking|Space0F38|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // AVX512ER instructions end.
 
@@ -2613,9 +2702,9 @@  vpmovzxdq, 0x6635, AVX512F|AVX512VL, Modrm|EVex=3|Masking|Space0F38|VexW=1|Disp8
 
 // AVX512BW instructions.
 
-kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
-kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
-kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
+kadd<dq>, 0x<dq:kpfx>4a, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kand<dq>, 0x<dq:kpfx>41, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kandn<dq>, 0x<dq:kpfx>42, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
 kmov<dq>, 0x<dq:kpfx>90, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
 kmov<dq>, 0x<dq:kpfx>90, AVX512BW|APX_F, Modrm|EVex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask|<dq:elem>|Unspecified|BaseIndex, RegMask }
 kmov<dq>, 0x<dq:kpfx>91, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, <dq:elem>|Unspecified|BaseIndex }
@@ -2623,95 +2712,95 @@  kmov<dq>, 0x<dq:kpfx>91, AVX512BW|APX_F, Modrm|EVex128|Space0F|VexW1|<dq:kvsz>|N
 kmov<dq>, 0xf292, AVX512BW, D|Modrm|Vex128|Space0F|<dq:vexw64>|<dq:kvsz>|NoSuf, { <dq:gpr>, RegMask }
 kmov<dq>, 0xf292, AVX512BW|APX_F, D|Modrm|EVex128|Space0F|<dq:vexw64>|<dq:kvsz>|NoSuf, { <dq:gpr>, RegMask }
 knot<dq>, 0x<dq:kpfx>44, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
-kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kor<dq>, 0x<dq:kpfx>45, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
 kortest<dq>, 0x<dq:kpfx>98, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
 ktest<dq>, 0x<dq:kpfx>99, AVX512BW, Modrm|Vex128|Space0F|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask }
-kxnor<dq>, 0x<dq:kpfx>46, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
-kxor<dq>, 0x<dq:kpfx>47, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
-kunpckdq, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW1|Vsz512|NoSuf, { RegMask, RegMask, RegMask }
-kunpckwd, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVV|VexW0|Vsz256|NoSuf, { RegMask, RegMask, RegMask }
+kxnor<dq>, 0x<dq:kpfx>46, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf, { RegMask, RegMask, RegMask }
+kxor<dq>, 0x<dq:kpfx>47, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|<dq:kvsz>|NoSuf|Optimize, { RegMask, RegMask, RegMask }
+kunpckdq, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW1|Vsz512|NoSuf, { RegMask, RegMask, RegMask }
+kunpckwd, 0x4B, AVX512BW, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|Vsz256|NoSuf, { RegMask, RegMask, RegMask }
 kshiftl<dq>, 0x6633, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|<dq:kvsz>|NoSuf, { Imm8, RegMask, RegMask }
 kshiftr<dq>, 0x6631, AVX512BW, Modrm|Vex128|Space0F3A|<dq:vexw>|<dq:kvsz>|NoSuf, { Imm8, RegMask, RegMask }
 
-vdbpsadbw, 0x6642, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vdbpsadbw, 0x6642, AVX512BW, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vmovdqu8, 0xF26F, AVX512BW, D|Modrm|Masking|Space0F|VexW=1|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vmovdqu16, 0xF26F, AVX512BW, D|Modrm|Masking|Space0F|VexW=2|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vpabs<bw>, 0x661c | <bw:opc>, AVX512BW, Modrm|Masking|Space0F38|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpmaxsb, 0x663C, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpminsb, 0x6638, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpshufb, 0x6600, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpmaddubsw, 0x6604, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmaxuw, 0x663E, AVX512BW, Modrm|Masking|VexWIG|Space0F38|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpminuw, 0x663A, AVX512BW, Modrm|Masking|VexWIG|Space0F38|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmulhrsw, 0x660B, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpackssdw, 0x666B, AVX512BW, Modrm|Masking|Space0F|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpacksswb, 0x6663, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpackuswb, 0x6667, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpackusdw, 0x662B, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpadd<bw>, 0x66fc | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpadds<bw>, 0x66ec | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpaddus<bw>, 0x66dc | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpavg<bw>, 0x66e0 | (3 * <bw:opc>), AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmaxub, 0x66DE, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpminub, 0x66DA, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsub<bw>, 0x66f8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsubs<bw>, 0x66e8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsubus<bw>, 0x66d8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpckhbw, 0x6668, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpcklbw, 0x6660, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpmaxsw, 0x66EE, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpminsw, 0x66EA, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmulhuw, 0x66E4, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmulhw, 0x66E5, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmullw, 0x66D5, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsllw, 0x6671/6, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsllw, 0x66F1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsraw, 0x6671/4, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsraw, 0x66E1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrlw, 0x6671/2, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsrlw, 0x66D1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpckhwd, 0x6669, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpunpcklwd, 0x6661, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpalignr, 0x660F, AVX512BW, Modrm|Masking|Space0F3A|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-
-vpblendm<bw>, 0x6666, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaxsb, 0x663C, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpminsb, 0x6638, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshufb, 0x6600, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpmaddubsw, 0x6604, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaxuw, 0x663E, AVX512BW, Modrm|Masking|VexWIG|Space0F38|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpminuw, 0x663A, AVX512BW, Modrm|Masking|VexWIG|Space0F38|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmulhrsw, 0x660B, AVX512BW, Modrm|Masking|Space0F38|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpackssdw, 0x666B, AVX512BW, Modrm|Masking|Space0F|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpacksswb, 0x6663, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpackuswb, 0x6667, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpackusdw, 0x662B, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpadd<bw>, 0x66fc | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpadds<bw>, 0x66ec | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpaddus<bw>, 0x66dc | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpavg<bw>, 0x66e0 | (3 * <bw:opc>), AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaxub, 0x66DE, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpminub, 0x66DA, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsub<bw>, 0x66f8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsubs<bw>, 0x66e8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsubus<bw>, 0x66d8 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpckhbw, 0x6668, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpcklbw, 0x6660, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpmaxsw, 0x66EE, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpminsw, 0x66EA, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmulhuw, 0x66E4, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmulhw, 0x66E5, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmullw, 0x66D5, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsllw, 0x6671/6, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsllw, 0x66F1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsraw, 0x6671/4, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsraw, 0x66E1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsrlw, 0x6671/2, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsrlw, 0x66D1, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8MemShift=4|CheckOperandSize|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpckhwd, 0x6669, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpunpcklwd, 0x6661, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpalignr, 0x660F, AVX512BW, Modrm|Masking|Space0F3A|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+
+vpblendm<bw>, 0x6666, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 vpbroadcast<bw>, 0x6678 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F38|VexW0|Disp8MemShift|NoSuf, { RegXMM|<bw:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcast<bw>, 0x667a | <bw:opc>, AVX512BW, Modrm|Masking|Space0F38|VexW0|NoSuf, { Reg32, RegXMM|RegYMM|RegZMM }
 
-vpermi2<bw>, 0x6675, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpermt2<bw>, 0x667d, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vperm<bw>, 0x668d, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsllvw, 0x6612, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsravw, 0x6611, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpsrlvw, 0x6610, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermi2<bw>, 0x6675, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpermt2<bw>, 0x667d, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vperm<bw>, 0x668d, <bw:cpubmi>, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsllvw, 0x6612, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsravw, 0x6611, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsrlvw, 0x6610, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpcmpeq<bw>, 0x6674 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpgt<bw>, 0x6664 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<bw>, 0x663f, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmpu<bw>, 0x663e, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<irel><bw>, 0x663f/<irel:imm>, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vpcmp<irel>u<bw>, 0x663e/<irel:imm>, AVX512BW, Modrm|Masking|Space0F3A|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpeq<bw>, 0x6674 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpgt<bw>, 0x6664 | <bw:opc>, AVX512BW, Modrm|Masking|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<bw>, 0x663f, AVX512BW, Modrm|Masking|Space0F3A|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmpu<bw>, 0x663e, AVX512BW, Modrm|Masking|Space0F3A|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<irel><bw>, 0x663f/<irel:imm>, AVX512BW, Modrm|Masking|Space0F3A|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpcmp<irel>u<bw>, 0x663e/<irel:imm>, AVX512BW, Modrm|Masking|Space0F3A|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
-vpslldq, 0x6673/7, AVX512BW, Modrm|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vpsrldq, 0x6673/3, AVX512BW, Modrm|Space0F|VexWIG|VexVVVV|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpslldq, 0x6673/7, AVX512BW, Modrm|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
+vpsrldq, 0x6673/3, AVX512BW, Modrm|Space0F|VexWIG|VexVVVVSrc|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
 vpextrw, 0x66C5, AVX512BW, Load|Modrm|EVex128|Space0F|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextr<bw>, 0x6614 | <bw:opc>, AVX512BW, RegMem|EVex128|Space0F3A|VexWIG|NoSuf, { Imm8, RegXMM, Reg32|Reg64 }
 vpextr<bw>, 0x6614 | <bw:opc>, AVX512BW, Modrm|EVex128|Space0F3A|VexWIG|Disp8MemShift|NoSuf, { Imm8, RegXMM, <bw:elem>|Unspecified|BaseIndex }
 
-vpinsrw, 0x66C4, AVX512BW, Modrm|EVex128|Space0F|VexWIG|VexVVVV|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
-vpinsrw, 0x66C4, AVX512BW, Modrm|EVex128|Space0F|VexWIG|VexVVVV|Disp8MemShift=1|NoSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
-vpinsrb, 0x6620, AVX512BW, Modrm|EVex128|Space0F3A|VexWIG|VexVVVV|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
-vpinsrb, 0x6620, AVX512BW, Modrm|EVex128|Space0F3A|VexWIG|VexVVVV|NoSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrw, 0x66C4, AVX512BW, Modrm|EVex128|Space0F|VexWIG|VexVVVVSrc|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrw, 0x66C4, AVX512BW, Modrm|EVex128|Space0F|VexWIG|VexVVVVSrc|Disp8MemShift=1|NoSuf, { Imm8, Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsrb, 0x6620, AVX512BW, Modrm|EVex128|Space0F3A|VexWIG|VexVVVVSrc|NoSuf, { Imm8, Reg32|Reg64, RegXMM, RegXMM }
+vpinsrb, 0x6620, AVX512BW, Modrm|EVex128|Space0F3A|VexWIG|VexVVVVSrc|NoSuf, { Imm8, Byte|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vpmaddwd, 0x66F5, AVX512BW, Modrm|Masking|Space0F|VexVVVV|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmaddwd, 0x66F5, AVX512BW, Modrm|Masking|Space0F|VexVVVVSrc|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vpmov<bw>2m, 0xf329, AVX512BW, Modrm|EVexDYN|Space0F38|<bw:vexw>|NoSuf, { RegXMM|RegYMM|RegZMM, RegMask }
 vpmovm2<bw>, 0xf328, AVX512BW, Modrm|EVexDYN|Space0F38|<bw:vexw>|NoSuf, { RegMask, RegXMM|RegYMM|RegZMM }
@@ -2735,13 +2824,13 @@  vpmovzxbw, 0x6630, AVX512BW, Modrm|EVex=1|Masking|Space0F38|VexWIG|Disp8MemShift
 vpmovzxbw, 0x6630, AVX512BW|AVX512VL, Modrm|EVex=2|Masking|VexWIG|Space0F38|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vpmovzxbw, 0x6630, AVX512BW|AVX512VL, Modrm|EVex=3|Masking|VexWIG|Space0F38|Disp8MemShift=4|NoSuf, { RegXMM|Unspecified|BaseIndex, RegYMM }
 
-vpsadbw, 0x66F6, AVX512BW, Modrm|Space0F|VexVVVV|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpsadbw, 0x66F6, AVX512BW, Modrm|Space0F|VexVVVVSrc|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vpshufhw, 0xF370, AVX512BW, Modrm|Masking|Space0F|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpshuflw, 0xF270, AVX512BW, Modrm|Masking|Space0F|VexWIG|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vptestm<bw>, 0x6626, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vptestnm<bw>, 0xf326, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vptestm<bw>, 0x6626, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vptestnm<bw>, 0xf326, AVX512BW, Modrm|Masking|Space0F38|VexVVVVSrc|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
 // AVX512BW instructions end.
 
@@ -2754,13 +2843,13 @@  vptestnm<bw>, 0xf326, AVX512BW, Modrm|Masking|Space0F38|VexVVVV|<bw:vexw>|Disp8S
     x:AVX512VL:EVex128|Disp8MemShift=4::ATTSyntax:RegXMM|Unspecified|BaseIndex, +
     y:AVX512VL:EVex256|Disp8MemShift=5::ATTSyntax:RegYMM|Unspecified|BaseIndex>
 
-kadd<bw>, 0x<bw:kpfx>4A, AVX512DQ, Modrm|Vex256|Space0F|VexVVVV|VexW0|NoSuf, { RegMask, RegMask, RegMask }
+kadd<bw>, 0x<bw:kpfx>4A, AVX512DQ, Modrm|Vex256|Space0F|VexVVVVSrc|VexW0|NoSuf, { RegMask, RegMask, RegMask }
 ktest<bw>, 0x<bw:kpfx>99, AVX512DQ, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegMask, RegMask }
 
-vandnp<sd>, 0x<sd:ppfx>55, AVX512DQ, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vandp<sd>, 0x<sd:ppfx>54, AVX512DQ, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vorp<sd>, 0x<sd:ppfx>56, AVX512DQ, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vxorp<sd>, 0x<sd:ppfx>57, AVX512DQ, Modrm|Masking|Space0F|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vandnp<sd>, 0x<sd:ppfx>55, AVX512DQ, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vandp<sd>, 0x<sd:ppfx>54, AVX512DQ, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vorp<sd>, 0x<sd:ppfx>56, AVX512DQ, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vxorp<sd>, 0x<sd:ppfx>57, AVX512DQ, Modrm|Masking|Space0F|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|Optimize, { RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vbroadcastf32x2, 0x6619, AVX512DQ, Modrm|Masking|Space0F38|VexW0|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
 vbroadcastf32x8, 0x661B, AVX512DQ, Modrm|EVex=1|Masking|Space0F38|VexW=1|Disp8MemShift=5|NoSuf, { YMMword|Unspecified|BaseIndex, RegZMM }
@@ -2799,16 +2888,16 @@  vcvtuqq2ps<Exy>, 0xf27a, AVX512DQ|<Exy:vl>, Modrm|<Exy:attr>|Masking|Space0F|Vex
 
 vextractf32x8, 0x661B, AVX512DQ, Modrm|EVex=1|Masking|Space0F3A|VexW=1|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
 vextracti32x8, 0x663B, AVX512DQ, Modrm|EVex=1|Masking|Space0F3A|VexW=1|Disp8MemShift=5|NoSuf, { Imm8, RegZMM, RegYMM|Unspecified|BaseIndex }
-vinsertf32x8, 0x661A, AVX512DQ, Modrm|EVex512|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
-vinserti32x8, 0x663A, AVX512DQ, Modrm|EVex512|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
+vinsertf32x8, 0x661A, AVX512DQ, Modrm|EVex512|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
+vinserti32x8, 0x663A, AVX512DQ, Modrm|EVex512|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=5|NoSuf, { Imm8, RegYMM|Unspecified|BaseIndex, RegZMM, RegZMM }
 
 vpextr<dq>, 0x6616, AVX512DQ|<dq:cpu64>, Modrm|EVex128|Space0F3A|<dq:vexw64>|Disp8MemShift|NoSuf, { Imm8, RegXMM, <dq:gpr>|Unspecified|BaseIndex }
-vpinsr<dq>, 0x6622, AVX512DQ|<dq:cpu64>, Modrm|EVex128|Space0F3A|VexVVVV|<dq:vexw64>|Disp8MemShift|NoSuf, { Imm8, <dq:gpr>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vpinsr<dq>, 0x6622, AVX512DQ|<dq:cpu64>, Modrm|EVex128|Space0F3A|VexVVVVSrc|<dq:vexw64>|Disp8MemShift|NoSuf, { Imm8, <dq:gpr>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vextractf64x2, 0x6619, AVX512DQ, Modrm|Masking|Space0F3A|VexW=2|Disp8MemShift=4|NoSuf, { Imm8, RegYMM|RegZMM, RegXMM|Unspecified|BaseIndex }
 vextracti64x2, 0x6639, AVX512DQ, Modrm|Masking|Space0F3A|VexW=2|Disp8MemShift=4|NoSuf, { Imm8, RegYMM|RegZMM, RegXMM|Unspecified|BaseIndex }
-vinsertf64x2, 0x6618, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
-vinserti64x2, 0x6638, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vinsertf64x2, 0x6618, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
+vinserti64x2, 0x6638, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8MemShift=4|CheckOperandSize|NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegYMM|RegZMM, RegYMM|RegZMM }
 
 vfpclassp<sd>, 0x6666, AVX512DQ, Modrm|Masking|Space0F3A|<sd:vexw>|Broadcast|Disp8ShiftVL|NoSuf|IntelSyntax, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegMask }
 vfpclassp<sd>, 0x6666, AVX512DQ, Modrm|Masking|Space0F3A|<sd:vexw>|Broadcast|Disp8ShiftVL|NoSuf|ATTSyntax, { Imm8|Imm8S, RegXMM|RegYMM|RegZMM|<sd:elem>|BaseIndex, RegMask }
@@ -2820,13 +2909,13 @@  vfpclasss<sdh>, 0x<sdh:pfx>67, <sdh:cpudq>, Modrm|EVexLIG|Masking|Space0F3A|<sdh
 vpmov<dq>2m, 0xf339, AVX512DQ, Modrm|EVexDYN|Space0F38|<dq:vexw>|NoSuf, { RegXMM|RegYMM|RegZMM, RegMask }
 vpmovm2<dq>, 0xf338, AVX512DQ, Modrm|EVexDYN|Space0F38|<dq:vexw>|NoSuf, { RegMask, RegXMM|RegYMM|RegZMM }
 
-vpmullq, 0x6640, AVX512DQ, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmullq, 0x6640, AVX512DQ, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vrangep<sd>, 0x6650, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVV|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vranges<sd>, 0x6651, AVX512DQ, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrangep<sd>, 0x6650, AVX512DQ, Modrm|Masking|Space0F3A|VexVVVVSrc|<sd:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sd:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vranges<sd>, 0x6651, AVX512DQ, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|<sd:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vreducep<sdh>, 0x<sdh:pfx>56, <sdh:cpudq>, Modrm|Masking|Space0F3A|<sdh:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
-vreduces<sdh>, 0x<sdh:pfx>57, <sdh:cpudq>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
+vreduces<sdh>, 0x<sdh:pfx>57, <sdh:cpudq>, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|<sdh:vexw>|Disp8MemShift|NoSuf|SAE, { Imm8, RegXMM|<sdh:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // AVX512DQ instructions end.
 
@@ -2838,37 +2927,37 @@  clwb, 0x660fae/6, CLWB, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex }
 
 // AVX512IFMA instructions
 
-vpmadd52huq, 0x66B5, AVX512IFMA, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpmadd52luq, 0x66B4, AVX512IFMA, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmadd52huq, 0x66B5, AVX512IFMA, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmadd52luq, 0x66B4, AVX512IFMA, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512IFMA instructions end
 
 // AVX-IFMA instructions.
 
-vpmadd52huq, 0x66B5, AVX_IFMA, Modrm|Vex|Space0F38|VexVVVV|VexW1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpmadd52luq, 0x66B4, AVX_IFMA, Modrm|Vex|Space0F38|VexVVVV|VexW1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmadd52huq, 0x66B5, AVX_IFMA, Modrm|Vex|Space0F38|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpmadd52luq, 0x66B4, AVX_IFMA, Modrm|Vex|Space0F38|VexVVVVSrc|VexW1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX-IFMA instructions end.
 
 // AVX512VBMI instructions
 
-vpmultishiftqb, 0x6683, AVX512VBMI, Modrm|Masking|Space0F38|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpmultishiftqb, 0x6683, AVX512VBMI, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512VBMI instructions end
 
 // AVX512_4FMAPS instructions
 
-v4fmaddps, 0xf29a, AVX512_4FMAPS, Modrm|EVex=1|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-v4fnmaddps, 0xf2aa, AVX512_4FMAPS, Modrm|EVex=1|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-v4fmaddss, 0xf29b, AVX512_4FMAPS, Modrm|EVex=4|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
-v4fnmaddss, 0xf2ab, AVX512_4FMAPS, Modrm|EVex=4|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
+v4fmaddps, 0xf29a, AVX512_4FMAPS, Modrm|EVex=1|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+v4fnmaddps, 0xf2aa, AVX512_4FMAPS, Modrm|EVex=1|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+v4fmaddss, 0xf29b, AVX512_4FMAPS, Modrm|EVex=4|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
+v4fnmaddss, 0xf2ab, AVX512_4FMAPS, Modrm|EVex=4|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // AVX512_4FMAPS instructions end
 
 // AVX512_4VNNIW instructions
 
-vp4dpwssd, 0xf252, AVX512_4VNNIW, Modrm|EVex=1|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
-vp4dpwssds, 0xf253, AVX512_4VNNIW, Modrm|EVex=1|Masking|Space0F38|VexVVVV|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vp4dpwssd, 0xf252, AVX512_4VNNIW, Modrm|EVex=1|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
+vp4dpwssds, 0xf253, AVX512_4VNNIW, Modrm|EVex=1|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8MemShift=4|NoSuf|ImplicitQuadGroup, { XMMword|Unspecified|BaseIndex, RegZMM, RegZMM }
 
 // AVX512_4VNNIW instructions end
 
@@ -2886,59 +2975,59 @@  vpcompressw, 0x6663, AVX512_VBMI2, Modrm|Masking|Space0F38|VexW=2|Disp8MemShift=
 vpexpandb, 0x6662, AVX512_VBMI2, Modrm|Masking|Space0F38|VexW=1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpexpandw, 0x6662, AVX512_VBMI2, Modrm|Masking|Space0F38|VexW=2|Disp8MemShift=1|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vpshldv<dq>, 0x6671, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpshldvw, 0x6670, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshldv<dq>, 0x6671, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshldvw, 0x6670, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpshrdv<dq>, 0x6673, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpshrdvw, 0x6672, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshrdv<dq>, 0x6673, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshrdvw, 0x6672, AVX512_VBMI2, Modrm|Masking|Space0F38|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpshld<dq>, 0x6671, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpshldw, 0x6670, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshld<dq>, 0x6671, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshldw, 0x6670, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpshrd<dq>, 0x6673, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpshrdw, 0x6672, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshrd<dq>, 0x6673, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpshrdw, 0x6672, AVX512_VBMI2, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512_VBMI2 instructions end
 
 // AVX512_VNNI instructions
 
-vpdpbusd, 0x6650, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpdpwssd, 0x6652, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpdpbusd, 0x6650, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpdpwssd, 0x6652, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
-vpdpbusds, 0x6651, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vpdpwssds, 0x6653, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpdpbusds, 0x6651, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vpdpwssds, 0x6653, AVX512_VNNI, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512_VNNI instructions end
 
 // AVX_VNNI instructions
 
-vpdpbusd, 0x6650, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssd, 0x6652, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusd, 0x6650, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssd, 0x6652, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
-vpdpbusds, 0x6651, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwssds, 0x6653, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbusds, 0x6651, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwssds, 0x6653, AVX_VNNI, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX_VNNI instructions end
 
 // AVX-VNNI-INT8 instructions.
 
-vpdpbuud, 0x50, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpbuuds, 0x51, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpbssd, 0xf250, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpbssds, 0xf251, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpbsud, 0xf350, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpbsuds, 0xf351, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbuud, 0x50, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbuuds, 0x51, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbssd, 0xf250, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbssds, 0xf251, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbsud, 0xf350, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpbsuds, 0xf351, AVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX-VNNI-INT8 instructions end.
 
 // AVX-VNNI-INT16 instructions.
 
-vpdpwuud, 0xd2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwuuds, 0xd3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwusd, 0x66d2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwusds, 0x66d3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwsud, 0xf3d2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
-vpdpwsuds, 0xf3d3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwuud, 0xd2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwuuds, 0xd3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwusd, 0x66d2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwusds, 0x66d3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwsud, 0xf3d2, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
+vpdpwsuds, 0xf3d3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVVSrc|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 
 // AVX-VNNI-INT16 instructions end.
 
@@ -2946,14 +3035,14 @@  vpdpwsuds, 0xf3d3, AVX_VNNI_INT16, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckOperan
 
 vpopcnt<bw>, 0x6654, AVX512_BITALG, Modrm|Masking|Space0F38|<bw:vexw>|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vpshufbitqmb, 0x668f, AVX512_BITALG, Modrm|Masking|Space0F38|VexVVVV|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vpshufbitqmb, 0x668f, AVX512_BITALG, Modrm|Masking|Space0F38|VexVVVVSrc|VexW0|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
 // AVX512_BITALG instructions end
 
 // AVX512 + GFNI instructions
 
-vgf2p8affineinvqb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vgf2p8affineqb, 0x66ce, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVV|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vgf2p8affineinvqb, 0x66cf, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vgf2p8affineqb, 0x66ce, GFNI|AVX512F, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW1|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { Imm8, RegXMM|RegYMM|RegZMM|Qword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512 + GFNI instructions end
 
@@ -3082,11 +3171,11 @@  movdir64b, 0x66f8, MOVDIR64B|APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVex128|EVexMap4
 
 // AVX512_BF16 instructions.
 
-vcvtne2ps2bf16, 0xf272, AVX512_BF16, Modrm|Space0F38|VexVVVV|Masking|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vcvtne2ps2bf16, 0xf272, AVX512_BF16, Modrm|Space0F38|VexVVVVSrc|Masking|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 vcvtneps2bf16<Exy>, 0xf372, AVX512_BF16|<Exy:vl>, Modrm|Space0F38|<Exy:attr>|Masking|VexW0|Broadcast|NoSuf, { <Exy:src>|Dword, <Exy:dst> }
 
-vdpbf16ps, 0xf352, AVX512_BF16, Modrm|Space0F38|VexVVVV|Masking|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vdpbf16ps, 0xf352, AVX512_BF16, Modrm|Space0F38|VexVVVVSrc|Masking|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
 
 // AVX512_BF16 instructions end.
 
@@ -3113,7 +3202,7 @@  enqcmds, 0xf3f8, ENQCMD|APX_F, Modrm|AddrPrefixOpReg|NoSuf|EVex128|EVexMap4, { U
 
 // VP2INTERSECT instructions.
 
-vp2intersect<dq>, 0xf268, AVX512_VP2INTERSECT, Modrm|Space0F38|VexVVVV|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vp2intersect<dq>, 0xf268, AVX512_VP2INTERSECT, Modrm|Space0F38|VexVVVVSrc|<dq:vexw>|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
 // VP2INTERSECT instructions end.
 
@@ -3171,15 +3260,15 @@  xresldtrk, 0xf20f01e9, TSXLDTRK, NoSuf, {}
 ldtilecfg, 0x49/0, AMX_TILE|APX_F, Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 sttilecfg, 0x6649/0, AMX_TILE|APX_F, Modrm|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex }
 
-tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tcmmimfp16ps, 0x666c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tcmmrlfp16ps, 0x6c, AMX_COMPLEX|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 
-tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tdpbuud, 0x5e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tdpbusd, 0x665e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
-tdpbsud, 0xf35e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVV|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpbf16ps, 0xf35c, AMX_BF16|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpfp16ps, 0xf25c, AMX_FP16|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpbssd, 0xf25e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpbuud, 0x5e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpbusd, 0x665e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
+tdpbsud, 0xf35e, AMX_INT8|x64, Modrm|Vex128|Space0F38|VexVVVVSrc|VexW0|SwapSources|NoSuf, { RegTMM, RegTMM, RegTMM }
 
 tileloadd, 0xf24b, AMX_TILE|APX_F, Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
 tileloaddt1, 0x664b, AMX_TILE|APX_F, Sibmem|Vex128|EVex128|Space0F38|VexW0|NoSuf, { Unspecified|BaseIndex, RegTMM }
@@ -3244,23 +3333,23 @@  hreset, 0xf30f3af0c0, HRESET, NoSuf, { Imm8 }
 
 // FP16 (HFNI) instructions.
 
-vfcmaddcph, 0xf256, AVX512_FP16, Modrm|VexVVVV|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfcmaddcsh, 0xf257, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfcmaddcph, 0xf256, AVX512_FP16, Modrm|VexVVVVSrc|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfcmaddcsh, 0xf257, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vfmaddcph, 0xf356, AVX512_FP16, Modrm|VexVVVV|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfmaddcsh, 0xf357, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfmaddcph, 0xf356, AVX512_FP16, Modrm|VexVVVVSrc|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfmaddcsh, 0xf357, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vfcmulcph, 0xf2d6, AVX512_FP16, Modrm|VexVVVV|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfcmulcsh, 0xf2d7, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfcmulcph, 0xf2d6, AVX512_FP16, Modrm|VexVVVVSrc|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfcmulcsh, 0xf2d7, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vfmulcph, 0xf3d6, AVX512_FP16, Modrm|VexVVVV|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
-vfmulcsh, 0xf3d7, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vfmulcph, 0xf3d6, AVX512_FP16, Modrm|VexVVVVSrc|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegXMM|RegYMM|RegZMM }
+vfmulcsh, 0xf3d7, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=2|DistinctDest|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcmp<frel>ph, 0xc2/0x<frel:imm>, AVX512_FP16, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
-vcmpph, 0xc2, AVX512_FP16, Modrm|Masking|Space0F3A|VexVVVV|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vcmp<frel>ph, 0xc2/0x<frel:imm>, AVX512_FP16, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|ImmExt|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
+vcmpph, 0xc2, AVX512_FP16, Modrm|Masking|Space0F3A|VexVVVVSrc|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|SAE, { Imm8, RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM, RegMask }
 
-vcmp<frel>sh, 0xf3c2/0x<frel:imm>, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=1|NoSuf|ImmExt|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask }
-vcmpsh, 0xf3c2, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|VexVVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { Imm8, RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask }
+vcmp<frel>sh, 0xf3c2/0x<frel:imm>, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf|ImmExt|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask }
+vcmpsh, 0xf3c2, AVX512_FP16, Modrm|EVexLIG|Masking|Space0F3A|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf|SAE, { Imm8, RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegMask }
 
 vcvtdq2ph<Exy>, 0x5b, AVX512_FP16|<Exy:vl>, Modrm|<Exy:attr>|Masking|EVexMap5|VexW0|Broadcast|NoSuf|<Exy:sr>, { <Exy:src>|Dword, <Exy:dst> }
 vcvtudq2ph<Exy>, 0xf27a, AVX512_FP16|<Exy:vl>, Modrm|<Exy:attr>|Masking|EVexMap5|VexW0|Broadcast|NoSuf|<Exy:sr>, { <Exy:src>|Dword, <Exy:dst> }
@@ -3298,17 +3387,17 @@  vcvtph2pd, 0x5a, AVX512_FP16, Modrm|EVex512|Masking|EVexMap5|VexW0|Broadcast|Dis
 vcvtph2w, 0x667d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vcvtph2uw, 0x7d, AVX512_FP16, Modrm|Masking|EVexMap5|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf|StaticRounding|SAE, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vcvtsd2sh, 0xf25a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVV|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtss2sh, 0x1d, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVV|VexW0|Disp8MemShift=2|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsd2sh, 0xf25a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVVSrc|VexW1|Disp8MemShift=3|NoSuf|StaticRounding|SAE, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtss2sh, 0x1d, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVVSrc|VexW0|Disp8MemShift=2|NoSuf|StaticRounding|SAE, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsi2sh, 0xf32a, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVV|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVV|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVVSrc|Disp8ShiftVL|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|ATTSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtusi2sh, 0xf37b, AVX512_FP16, Modrm|EVexLIG|EVexMap5|VexVVVVSrc|Disp8ShiftVL|No_bSuf|No_wSuf|No_sSuf|StaticRounding|SAE|IntelSyntax, { Reg32|Reg64|Unspecified|BaseIndex, RegXMM, RegXMM }
 
-vcvtsh2sd, 0xf35a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
-vcvtsh2ss, 0x13, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsh2sd, 0xf35a, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap5|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vcvtsh2ss, 0x13, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf|SAE, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vcvtsh2si, 0xf32d, AVX512_FP16, Modrm|EVexLIG|EVexMap5|Disp8MemShift=1|NoSuf|StaticRounding|SAE, { RegXMM|Word|Unspecified|BaseIndex, Reg32|Reg64 }
 
@@ -3344,11 +3433,11 @@  vmovw, 0x667e, AVX512_FP16, D|RegMem|EVex128|VexWIG|EVexMap5|NoSuf, { RegXMM, Re
 
 vrcpph, 0x664c, AVX512_FP16, Modrm|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vrcpsh, 0x664d, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrcpsh, 0x664d, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 vrsqrtph, 0x664e, AVX512_FP16, Modrm|Masking|EVexMap6|VexW0|Broadcast|Disp8ShiftVL|CheckOperandSize|NoSuf, { RegXMM|RegYMM|RegZMM|Word|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 
-vrsqrtsh, 0x664f, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVV|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
+vrsqrtsh, 0x664f, AVX512_FP16, Modrm|EVexLIG|Masking|EVexMap6|VexVVVVSrc|VexW0|Disp8MemShift=1|NoSuf, { RegXMM|Word|Unspecified|BaseIndex, RegXMM, RegXMM }
 
 // FP16 (HFNI) instructions end.
 
@@ -3361,8 +3450,8 @@  prefetchit1, 0xf18/6, PREFETCHI|x64, Modrm|Anysize|IgnoreSize|NoSuf, { BaseIndex
 
 // CMPCCXADD instructions.
 
-cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD|x64, Modrm|Vex|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
-cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD|x64|APX_F, Modrm|EVex128|Space0F38|VexVVVV|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD|x64, Modrm|Vex|Space0F38|VexVVVVSrc|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
+cmp<cc>xadd, 0x66e<cc:opc>, CMPCCXADD|x64|APX_F, Modrm|EVex128|Space0F38|VexVVVVSrc|SwapSources|CheckOperandSize|NoSuf, { Reg32|Reg64, Reg32|Reg64, Dword|Qword|Unspecified|BaseIndex }
 
 // CMPCCXADD instructions end.