[4/4] x86: provide a 128-bit VBROADCASTSD pseudo

Message ID 08bf9dc9-5616-7dce-a094-d2ea799c92bf@suse.com
State Unresolved
Series x86: some more optimization plus a new pseudo insn form

Checks

Context                   Check    Description
snail/binutils-gdb-check  warning  Git am fail log

Commit Message

Jan Beulich June 16, 2023, 7:32 a.m. UTC
VBROADCASTSD not supporting 128-bit destinations in any of its AVX,
AVX2, or AVX512F incarnations is presumably because of VMOVDDUP
precisely supporting this very operation. (It is therefore different
from e.g. VPBROADCASTQ, which has no exact equivalent.) Still, its
absence has led to people using VPBROADCASTQ as a substitute; this could
have been avoided if such a pseudo had been supported from the very
beginning.

Note that the pseudos try to match what the real instructions would have
used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX
and AVX2 forms as well as AVX2 in the first place for the register
source form.
---
Since this is the first example of us supplying such a pseudo, this is
partly an RFC. A further question is whether to indeed have split
AVX/AVX2 templates when in principle a single one (allowing for both
memory and register sources) would do.
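
As a rough illustration of the mapping described above (not part of the
patch), the following Python sketch builds the 2-byte VEX encoding of
VMOVDDUP with an XMM destination, which is what the proposed
VBROADCASTSD pseudo is expected to assemble to in the AVX and AVX2
tests. The helper name vex2_vmovddup_xmm is hypothetical, and the sketch
assumes registers 0-7 only, with no SIB byte or displacement.

```python
def vex2_vmovddup_xmm(dest_xmm, modrm_mod, rm):
    """Encode VMOVDDUP (VEX.128.F2.0F 12 /r) using the 2-byte VEX prefix.

    Illustrative only: handles registers 0-7 and the plain (reg) or
    register-direct ModRM forms, without SIB or displacement bytes.
    """
    assert 0 <= dest_xmm < 8 and 0 <= rm < 8
    assert modrm_mod in (0b00, 0b11)
    # With mod=00, rm=4 selects a SIB byte and rm=5 a disp32; not handled.
    assert modrm_mod == 0b11 or rm not in (4, 5)

    r_inv = 1          # inverted REX.R; 1 means no register extension
    vvvv_inv = 0b1111  # no second source operand, field stored inverted
    vl = 0             # vector length 0 = 128-bit (XMM destination)
    pp = 0b11          # implied F2 prefix
    vex_byte = (r_inv << 7) | (vvvv_inv << 3) | (vl << 2) | pp

    modrm = (modrm_mod << 6) | (dest_xmm << 3) | rm
    return bytes([0xC5, vex_byte, 0x12, modrm])


ECX = 1  # register number of %ecx in the ModRM r/m field

# vbroadcastsd (%ecx),%xmm4  ->  vmovddup (%ecx),%xmm4
mem_form = vex2_vmovddup_xmm(4, 0b00, ECX)
# vbroadcastsd %xmm4,%xmm6   ->  vmovddup %xmm4,%xmm6
reg_form = vex2_vmovddup_xmm(6, 0b11, 4)

print(mem_form.hex(" "))  # c5 fb 12 21
print(reg_form.hex(" "))  # c5 fb 12 f4
```

The two byte sequences agree with the vmovddup lines added to avx.d and
avx2.d in the patch.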
  

Comments

H.J. Lu June 16, 2023, 4:59 p.m. UTC | #1
On Fri, Jun 16, 2023 at 12:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> VBROADCASTSD not supporting 128-bit destinations in any of their AVX,
> AVX2, or AVX512F incarnations is presumably because of VMOVDDUP
> precisely supporting this very operation. (It is therefore different
> from e.g. VPBROADCASTQ, which has no exact equivalent.) Still its
> absence has led to people using VPBROADCASTQ as substitution; this could
> have been avoided if such a pseudo had been supported from the very
> beginning.
>
> Note that the pseudos try to match what the real instructions would have
> used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX
> and AVX2 forms as well as AVX2 in the first place for the register
> source form.
> ---
> For being the first example of us supplying such, this is partly RFC. On
> top of that a question is also whether to indeed have split AVX/AVX2
> templates, when in principle one (allowing for both memory and register
> source) could do.
>

I don't think assembler should invent such instructions.
  
Jan Beulich June 19, 2023, 7:20 a.m. UTC | #2
On 16.06.2023 18:59, H.J. Lu wrote:
> On Fri, Jun 16, 2023 at 12:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>> VBROADCASTSD not supporting 128-bit destinations in any of their AVX,
>> AVX2, or AVX512F incarnations is presumably because of VMOVDDUP
>> precisely supporting this very operation. (It is therefore different
>> from e.g. VPBROADCASTQ, which has no exact equivalent.) Still its
>> absence has led to people using VPBROADCASTQ as substitution; this could
>> have been avoided if such a pseudo had been supported from the very
>> beginning.
>>
>> Note that the pseudos try to match what the real instructions would have
>> used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX
>> and AVX2 forms as well as AVX2 in the first place for the register
>> source form.
>> ---
>> For being the first example of us supplying such, this is partly RFC. On
>> top of that a question is also whether to indeed have split AVX/AVX2
>> templates, when in principle one (allowing for both memory and register
>> source) could do.
>>
> 
> I don't think assembler should invent such instructions.

May I ask about the "why" behind this? If such a pseudo had been there
from the beginning, an admittedly minor mistake like that corrected by
gcc commit a4df0ce78d6f likely wouldn't have been made, because no
special casing of V2DFmode would have been necessary in the first place.

Jan
  
H.J. Lu June 20, 2023, 4:07 p.m. UTC | #3
On Mon, Jun 19, 2023 at 12:20 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 16.06.2023 18:59, H.J. Lu wrote:
> > On Fri, Jun 16, 2023 at 12:32 AM Jan Beulich <jbeulich@suse.com> wrote:
> >> VBROADCASTSD not supporting 128-bit destinations in any of their AVX,
> >> AVX2, or AVX512F incarnations is presumably because of VMOVDDUP
> >> precisely supporting this very operation. (It is therefore different
> >> from e.g. VPBROADCASTQ, which has no exact equivalent.) Still its
> >> absence has led to people using VPBROADCASTQ as substitution; this could
> >> have been avoided if such a pseudo had been supported from the very
> >> beginning.
> >>
> >> Note that the pseudos try to match what the real instructions would have
> >> used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX
> >> and AVX2 forms as well as AVX2 in the first place for the register
> >> source form.
> >> ---
> >> For being the first example of us supplying such, this is partly RFC. On
> >> top of that a question is also whether to indeed have split AVX/AVX2
> >> templates, when in principle one (allowing for both memory and register
> >> source) could do.
> >>
> >
> > I don't think assembler should invent such instructions.
>
> May I ask about the "why" behind this? If such a pseudo had been there
> from the beginning, an admittedly minor mistake like that corrected by
> gcc commit a4df0ce78d6f likely wouldn't have been made, because no
> special casing of V2DFmode would have been necessary in the first place.
>
> Jan

All x86 instructions should come from the x86 SDM.
  
Jan Beulich June 21, 2023, 9:01 a.m. UTC | #4
On 20.06.2023 18:07, H.J. Lu wrote:
> On Mon, Jun 19, 2023 at 12:20 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 16.06.2023 18:59, H.J. Lu wrote:
>>> On Fri, Jun 16, 2023 at 12:32 AM Jan Beulich <jbeulich@suse.com> wrote:
>>>> VBROADCASTSD not supporting 128-bit destinations in any of their AVX,
>>>> AVX2, or AVX512F incarnations is presumably because of VMOVDDUP
>>>> precisely supporting this very operation. (It is therefore different
>>>> from e.g. VPBROADCASTQ, which has no exact equivalent.) Still its
>>>> absence has led to people using VPBROADCASTQ as substitution; this could
>>>> have been avoided if such a pseudo had been supported from the very
>>>> beginning.
>>>>
>>>> Note that the pseudos try to match what the real instructions would have
>>>> used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX
>>>> and AVX2 forms as well as AVX2 in the first place for the register
>>>> source form.
>>>> ---
>>>> For being the first example of us supplying such, this is partly RFC. On
>>>> top of that a question is also whether to indeed have split AVX/AVX2
>>>> templates, when in principle one (allowing for both memory and register
>>>> source) could do.
>>>>
>>>
>>> I don't think assembler should invent such instructions.
>>
>> May I ask about the "why" behind this? If such a pseudo had been there
>> from the beginning, an admittedly minor mistake like that corrected by
>> gcc commit a4df0ce78d6f likely wouldn't have been made, because no
>> special casing of V2DFmode would have been necessary in the first place.
> 
> All x86 instructions should come from the x86 SDM.

Ehem. See "clr" for an example where syntax doesn't matter (IOW I wasn't
really right in saying this is the first example). There are also various
AT&T-invented mnemonics we support (and - wrongly - even in Intel syntax).
There are further insn forms (number and/or kind of operands) which aren't
backed by the SDM.

I'm afraid I can't take this single sentence as an answer to my question
of "Why?", even less so as it doesn't address at all the reason I gave
for thinking we should have had such a pseudo from the beginning.

Jan
  

Patch

--- a/gas/testsuite/gas/i386/avx.d
+++ b/gas/testsuite/gas/i386/avx.d
@@ -927,6 +927,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 35 21       	vpmovzxdq \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd %xmm4,\(%ecx\)
 [ 	]*[a-f0-9]+:	c5 f8 13 21          	vmovlps %xmm4,\(%ecx\)
@@ -2768,6 +2769,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd %xmm4,\(%ecx\)
--- a/gas/testsuite/gas/i386/avx.s
+++ b/gas/testsuite/gas/i386/avx.s
@@ -982,6 +982,7 @@  _start:
 	vucomisd (%ecx),%xmm4
 
 # Tests for op mem64, xmm
+	vbroadcastsd (%ecx),%xmm4
 	vmovsd (%ecx),%xmm4
 
 # Tests for op xmm, mem64
@@ -2953,6 +2954,8 @@  _start:
 	vucomisd xmm4,[ecx]
 
 # Tests for op mem64, xmm
+	vbroadcastsd xmm4,QWORD PTR [ecx]
+	vbroadcastsd xmm4,[ecx]
 	vmovsd xmm4,QWORD PTR [ecx]
 	vmovsd xmm4,[ecx]
 
--- a/gas/testsuite/gas/i386/avx-16bit.d
+++ b/gas/testsuite/gas/i386/avx-16bit.d
@@ -928,6 +928,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	67 c4 e2 79 35 21    	vpmovzxdq \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	67 c5 f9 2e 21       	vucomisd \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	67 c5 fb 12 21       	vmovddup \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 fb 10 21       	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 f9 13 21       	vmovlpd %xmm4,\(%ecx\)
 [ 	]*[a-f0-9]+:	67 c5 f8 13 21       	vmovlps %xmm4,\(%ecx\)
@@ -2769,6 +2770,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	67 c5 f9 2e 21       	vucomisd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 f9 2e 21       	vucomisd \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	67 c5 fb 12 21       	vmovddup \(%ecx\),%xmm4
+[ 	]*[a-f0-9]+:	67 c5 fb 12 21       	vmovddup \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 fb 10 21       	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 fb 10 21       	vmovsd \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	67 c5 f9 13 21       	vmovlpd %xmm4,\(%ecx\)
--- a/gas/testsuite/gas/i386/avx-intel.d
+++ b/gas/testsuite/gas/i386/avx-intel.d
@@ -928,6 +928,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 35 21       	vpmovzxdq xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd xmm6,xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd QWORD PTR \[ecx\],xmm4
 [ 	]*[a-f0-9]+:	c5 f8 13 21          	vmovlps QWORD PTR \[ecx\],xmm4
@@ -2769,6 +2770,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd xmm6,xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd QWORD PTR \[ecx\],xmm4
--- a/gas/testsuite/gas/i386/avx2.d
+++ b/gas/testsuite/gas/i386/avx2.d
@@ -73,6 +73,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 78 21       	vpbroadcastb \(%ecx\),%xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb %xmm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%ecx\),%ymm4
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 5d 8c 31       	vpmaskmovd \(%ecx\),%ymm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 4d 8e 21       	vpmaskmovd %ymm4,%ymm6,\(%ecx\)
@@ -177,5 +178,6 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb %xmm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%ecx\),%ymm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%ecx\),%ymm4
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss %xmm4,%xmm6
 #pass
--- a/gas/testsuite/gas/i386/avx2.s
+++ b/gas/testsuite/gas/i386/avx2.s
@@ -114,6 +114,7 @@  _start:
 	vpbroadcastb (%ecx),%ymm4
 
 # Tests for op xmm, xmm
+	vbroadcastsd %xmm4,%xmm6
 	vbroadcastss %xmm4,%xmm6
 
 	.intel_syntax noprefix
@@ -265,4 +266,5 @@  _start:
 	vpbroadcastb ymm4,[ecx]
 
 # Tests for op xmm, xmm
+	vbroadcastsd xmm6,xmm4
 	vbroadcastss xmm6,xmm4
--- a/gas/testsuite/gas/i386/avx2-intel.d
+++ b/gas/testsuite/gas/i386/avx2-intel.d
@@ -74,6 +74,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 78 21       	vpbroadcastb xmm4,BYTE PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb ymm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[ecx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 5d 8c 31       	vpmaskmovd ymm6,ymm4,YMMWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c4 e2 4d 8e 21       	vpmaskmovd YMMWORD PTR \[ecx\],ymm6,ymm4
@@ -178,5 +179,6 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb ymm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[ecx\]
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[ecx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss xmm6,xmm4
 #pass
--- a/gas/testsuite/gas/i386/avx512f_vl.d
+++ b/gas/testsuite/gas/i386/avx512f_vl.d
@@ -155,6 +155,15 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 0x800\(%edx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a 72 80[ 	]*vbroadcasti32x4 -0x800\(%edx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 -0x810\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 31[ 	]*vmovddup \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 31[ 	]*vmovddup \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ 	]*vmovddup -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 7f[ 	]*vmovddup 0x3f8\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 00 04 00 00[ 	]*vmovddup 0x400\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 80[ 	]*vmovddup -0x400\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 f8 fb ff ff[ 	]*vmovddup -0x408\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 f5[ 	]*vmovddup %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 f5[ 	]*vmovddup %xmm5,%xmm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 31[ 	]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 19 31[ 	]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ 	]*vbroadcastsd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
@@ -5850,6 +5859,15 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 0x800\(%edx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a 72 80[ 	]*vbroadcasti32x4 -0x800\(%edx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 -0x810\(%edx\),%ymm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 31[ 	]*vmovddup \(%ecx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 31[ 	]*vmovddup \(%ecx\),%xmm6\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ 	]*vmovddup -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 7f[ 	]*vmovddup 0x3f8\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 00 04 00 00[ 	]*vmovddup 0x400\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 80[ 	]*vmovddup -0x400\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 f8 fb ff ff[ 	]*vmovddup -0x408\(%edx\),%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 f5[ 	]*vmovddup %xmm5,%xmm6\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 f5[ 	]*vmovddup %xmm5,%xmm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 31[ 	]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 19 31[ 	]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ 	]*vbroadcastsd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\}
--- a/gas/testsuite/gas/i386/avx512f_vl.s
+++ b/gas/testsuite/gas/i386/avx512f_vl.s
@@ -149,6 +149,15 @@  _start:
 	vbroadcasti32x4	2048(%edx), %ymm6{%k7}	 # AVX512{F,VL}
 	vbroadcasti32x4	-2048(%edx), %ymm6{%k7}	 # AVX512{F,VL} Disp8
 	vbroadcasti32x4	-2064(%edx), %ymm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	(%ecx), %xmm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	(%ecx), %xmm6{%k7}{z}	 # AVX512{F,VL}
+	vbroadcastsd	-123456(%esp,%esi,8), %xmm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	1016(%edx), %xmm6{%k7}	 # AVX512{F,VL} Disp8
+	vbroadcastsd	1024(%edx), %xmm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	-1024(%edx), %xmm6{%k7}	 # AVX512{F,VL} Disp8
+	vbroadcastsd	-1032(%edx), %xmm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	%xmm5, %xmm6{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	%xmm5, %xmm6{%k7}{z}	 # AVX512{F,VL}
 	vbroadcastsd	(%ecx), %ymm6{%k7}	 # AVX512{F,VL}
 	vbroadcastsd	(%ecx), %ymm6{%k7}{z}	 # AVX512{F,VL}
 	vbroadcastsd	-123456(%esp,%esi,8), %ymm6{%k7}	 # AVX512{F,VL}
@@ -5846,6 +5855,15 @@  _start:
 	vbroadcasti32x4	ymm6{k7}, XMMWORD PTR [edx+2048]	 # AVX512{F,VL}
 	vbroadcasti32x4	ymm6{k7}, XMMWORD PTR [edx-2048]	 # AVX512{F,VL} Disp8
 	vbroadcasti32x4	ymm6{k7}, XMMWORD PTR [edx-2064]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}, QWORD PTR [ecx]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}{z}, QWORD PTR [ecx]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}, QWORD PTR [esp+esi*8-123456]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}, QWORD PTR [edx+1016]	 # AVX512{F,VL} Disp8
+	vbroadcastsd	xmm6{k7}, QWORD PTR [edx+1024]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}, QWORD PTR [edx-1024]	 # AVX512{F,VL} Disp8
+	vbroadcastsd	xmm6{k7}, QWORD PTR [edx-1032]	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}, xmm5	 # AVX512{F,VL}
+	vbroadcastsd	xmm6{k7}{z}, xmm5	 # AVX512{F,VL}
 	vbroadcastsd	ymm6{k7}, QWORD PTR [ecx]	 # AVX512{F,VL}
 	vbroadcastsd	ymm6{k7}{z}, QWORD PTR [ecx]	 # AVX512{F,VL}
 	vbroadcastsd	ymm6{k7}, QWORD PTR [esp+esi*8-123456]	 # AVX512{F,VL}
--- a/gas/testsuite/gas/i386/avx512f_vl-intel.d
+++ b/gas/testsuite/gas/i386/avx512f_vl-intel.d
@@ -155,6 +155,15 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx\+0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a 72 80[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x810\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 31[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 31[ 	]*vmovddup xmm6\{k7\}\{z\},QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 7f[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x3f8\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 00 04 00 00[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 80[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 f8 fb ff ff[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x408\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 f5[ 	]*vmovddup xmm6\{k7\},xmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 f5[ 	]*vmovddup xmm6\{k7\}\{z\},xmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 31[ 	]*vbroadcastsd ymm6\{k7\},QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 19 31[ 	]*vbroadcastsd ymm6\{k7\}\{z\},QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ 	]*vbroadcastsd ymm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\]
@@ -5850,6 +5859,15 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx\+0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a 72 80[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x810\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 31[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 31[ 	]*vmovddup xmm6\{k7\}\{z\},QWORD PTR \[ecx\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 7f[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x3f8\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 00 04 00 00[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 72 80[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 b2 f8 fb ff ff[ 	]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x408\]
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 0f 12 f5[ 	]*vmovddup xmm6\{k7\},xmm5
+[ 	]*[a-f0-9]+:[ 	]*62 f1 ff 8f 12 f5[ 	]*vmovddup xmm6\{k7\}\{z\},xmm5
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 31[ 	]*vbroadcastsd ymm6\{k7\},QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd af 19 31[ 	]*vbroadcastsd ymm6\{k7\}\{z\},QWORD PTR \[ecx\]
 [ 	]*[a-f0-9]+:[ 	]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ 	]*vbroadcastsd ymm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\]
--- a/gas/testsuite/gas/i386/x86-64-avx.d
+++ b/gas/testsuite/gas/i386/x86-64-avx.d
@@ -875,6 +875,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 35 21       	vpmovzxdq \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%rcx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd %xmm4,\(%rcx\)
 [ 	]*[a-f0-9]+:	c5 f8 13 21          	vmovlps %xmm4,\(%rcx\)
@@ -2818,6 +2819,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd \(%rcx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%rcx\),%xmm4
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd %xmm4,\(%rcx\)
--- a/gas/testsuite/gas/i386/x86-64-avx.s
+++ b/gas/testsuite/gas/i386/x86-64-avx.s
@@ -930,6 +930,7 @@  _start:
 	vucomisd (%rcx),%xmm4
 
 # Tests for op mem64, xmm
+	vbroadcastsd (%rcx),%xmm4
 	vmovsd (%rcx),%xmm4
 
 # Tests for op xmm, mem64
@@ -3024,6 +3025,8 @@  _start:
 	vucomisd xmm4,[rcx]
 
 # Tests for op mem64, xmm
+	vbroadcastsd xmm4,QWORD PTR [rcx]
+	vbroadcastsd xmm4,[rcx]
 	vmovsd xmm4,QWORD PTR [rcx]
 	vmovsd xmm4,[rcx]
 
--- a/gas/testsuite/gas/i386/x86-64-avx-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx-intel.d
@@ -876,6 +876,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 35 21       	vpmovzxdq xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd xmm6,xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd QWORD PTR \[rcx\],xmm4
 [ 	]*[a-f0-9]+:	c5 f8 13 21          	vmovlps QWORD PTR \[rcx\],xmm4
@@ -2819,6 +2820,8 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c5 f9 2e f4          	vucomisd xmm6,xmm4
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 f9 2e 21          	vucomisd xmm4,QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 21          	vmovddup xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 fb 10 21          	vmovsd xmm4,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c5 f9 13 21          	vmovlpd QWORD PTR \[rcx\],xmm4
--- a/gas/testsuite/gas/i386/x86-64-avx2.d
+++ b/gas/testsuite/gas/i386/x86-64-avx2.d
@@ -73,6 +73,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 78 21       	vpbroadcastb \(%rcx\),%xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb %xmm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%rcx\),%ymm4
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 5d 8c 31       	vpmaskmovd \(%rcx\),%ymm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 4d 8e 21       	vpmaskmovd %ymm4,%ymm6,\(%rcx\)
@@ -177,5 +178,6 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb %xmm4,%ymm6
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%rcx\),%ymm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb \(%rcx\),%ymm4
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup %xmm4,%xmm6
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss %xmm4,%xmm6
 #pass
--- a/gas/testsuite/gas/i386/x86-64-avx2.s
+++ b/gas/testsuite/gas/i386/x86-64-avx2.s
@@ -114,6 +114,7 @@  _start:
 	vpbroadcastb (%rcx),%ymm4
 
 # Tests for op xmm, xmm
+	vbroadcastsd %xmm4,%xmm6
 	vbroadcastss %xmm4,%xmm6
 
 	.intel_syntax noprefix
@@ -265,4 +266,5 @@  _start:
 	vpbroadcastb ymm4,[rcx]
 
 # Tests for op xmm, xmm
+	vbroadcastsd xmm6,xmm4
 	vbroadcastss xmm6,xmm4
--- a/gas/testsuite/gas/i386/x86-64-avx2-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx2-intel.d
@@ -74,6 +74,7 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 79 78 21       	vpbroadcastb xmm4,BYTE PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb ymm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[rcx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 5d 8c 31       	vpmaskmovd ymm6,ymm4,YMMWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c4 e2 4d 8e 21       	vpmaskmovd YMMWORD PTR \[rcx\],ymm6,ymm4
@@ -178,5 +179,6 @@  Disassembly of section .text:
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 f4       	vpbroadcastb ymm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[rcx\]
 [ 	]*[a-f0-9]+:	c4 e2 7d 78 21       	vpbroadcastb ymm4,BYTE PTR \[rcx\]
+[ 	]*[a-f0-9]+:	c5 fb 12 f4          	vmovddup xmm6,xmm4
 [ 	]*[a-f0-9]+:	c4 e2 79 18 f4       	vbroadcastss xmm6,xmm4
 #pass
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl.d
@@ -167,6 +167,17 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 0x800\(%rdx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a 72 80[ 	]*vbroadcasti32x4 -0x800\(%rdx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 -0x810\(%rdx\),%ymm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 31[ 	]*vmovddup \(%rcx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 0f 12 31[ 	]*vmovddup \(%rcx\),%xmm30\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 8f 12 31[ 	]*vmovddup \(%rcx\),%xmm30\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 21 ff 08 12 b4 f0 23 01 00 00[ 	]*vmovddup 0x123\(%rax,%r14,8\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 7f[ 	]*vmovddup 0x3f8\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 00 04 00 00[ 	]*vmovddup 0x400\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 80[ 	]*vmovddup -0x400\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 f8 fb ff ff[ 	]*vmovddup -0x408\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 08 12 f5[ 	]*vmovddup %xmm29,%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 0f 12 f5[ 	]*vmovddup %xmm29,%xmm30\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 8f 12 f5[ 	]*vmovddup %xmm29,%xmm30\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 28 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 2f 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd af 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}\{z\}
@@ -6474,6 +6485,17 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 0x800\(%rdx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a 72 80[ 	]*vbroadcasti32x4 -0x800\(%rdx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 -0x810\(%rdx\),%ymm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 31[ 	]*vmovddup \(%rcx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 0f 12 31[ 	]*vmovddup \(%rcx\),%xmm30\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 8f 12 31[ 	]*vmovddup \(%rcx\),%xmm30\{%k7\}\{z\}
+[ 	]*[a-f0-9]+:[ 	]*62 21 ff 08 12 b4 f0 34 12 00 00[ 	]*vmovddup 0x1234\(%rax,%r14,8\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 7f[ 	]*vmovddup 0x3f8\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 00 04 00 00[ 	]*vmovddup 0x400\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 80[ 	]*vmovddup -0x400\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 f8 fb ff ff[ 	]*vmovddup -0x408\(%rdx\),%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 08 12 f5[ 	]*vmovddup %xmm29,%xmm30
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 0f 12 f5[ 	]*vmovddup %xmm29,%xmm30\{%k7\}
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 8f 12 f5[ 	]*vmovddup %xmm29,%xmm30\{%k7\}\{z\}
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 28 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 2f 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd af 19 31[ 	]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl.s
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl.s
@@ -161,6 +161,17 @@  _start:
 	vbroadcasti32x4	2048(%rdx), %ymm30	 # AVX512{F,VL}
 	vbroadcasti32x4	-2048(%rdx), %ymm30	 # AVX512{F,VL} Disp8
 	vbroadcasti32x4	-2064(%rdx), %ymm30	 # AVX512{F,VL}
+	vbroadcastsd	(%rcx), %xmm30	 # AVX512{F,VL}
+	vbroadcastsd	(%rcx), %xmm30{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	(%rcx), %xmm30{%k7}{z}	 # AVX512{F,VL}
+	vbroadcastsd	0x123(%rax,%r14,8), %xmm30	 # AVX512{F,VL}
+	vbroadcastsd	1016(%rdx), %xmm30	 # AVX512{F,VL} Disp8
+	vbroadcastsd	1024(%rdx), %xmm30	 # AVX512{F,VL}
+	vbroadcastsd	-1024(%rdx), %xmm30	 # AVX512{F,VL} Disp8
+	vbroadcastsd	-1032(%rdx), %xmm30	 # AVX512{F,VL}
+	vbroadcastsd	%xmm29, %xmm30	 # AVX512{F,VL}
+	vbroadcastsd	%xmm29, %xmm30{%k7}	 # AVX512{F,VL}
+	vbroadcastsd	%xmm29, %xmm30{%k7}{z}	 # AVX512{F,VL}
 	vbroadcastsd	(%rcx), %ymm30	 # AVX512{F,VL}
 	vbroadcastsd	(%rcx), %ymm30{%k7}	 # AVX512{F,VL}
 	vbroadcastsd	(%rcx), %ymm30{%k7}{z}	 # AVX512{F,VL}
@@ -6470,6 +6481,17 @@  _start:
 	vbroadcasti32x4	ymm30, XMMWORD PTR [rdx+2048]	 # AVX512{F,VL}
 	vbroadcasti32x4	ymm30, XMMWORD PTR [rdx-2048]	 # AVX512{F,VL} Disp8
 	vbroadcasti32x4	ymm30, XMMWORD PTR [rdx-2064]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30, QWORD PTR [rcx]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30{k7}, QWORD PTR [rcx]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30{k7}{z}, QWORD PTR [rcx]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30, QWORD PTR [rax+r14*8+0x1234]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30, QWORD PTR [rdx+1016]	 # AVX512{F,VL} Disp8
+	vbroadcastsd	xmm30, QWORD PTR [rdx+1024]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30, QWORD PTR [rdx-1024]	 # AVX512{F,VL} Disp8
+	vbroadcastsd	xmm30, QWORD PTR [rdx-1032]	 # AVX512{F,VL}
+	vbroadcastsd	xmm30, xmm29	 # AVX512{F,VL}
+	vbroadcastsd	xmm30{k7}, xmm29	 # AVX512{F,VL}
+	vbroadcastsd	xmm30{k7}{z}, xmm29	 # AVX512{F,VL}
 	vbroadcastsd	ymm30, QWORD PTR [rcx]	 # AVX512{F,VL}
 	vbroadcastsd	ymm30{k7}, QWORD PTR [rcx]	 # AVX512{F,VL}
 	vbroadcastsd	ymm30{k7}{z}, QWORD PTR [rcx]	 # AVX512{F,VL}
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl-intel.d
@@ -167,6 +167,17 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx\+0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a 72 80[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x810\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 31[ 	]*vmovddup xmm30,QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 0f 12 31[ 	]*vmovddup xmm30\{k7\},QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 8f 12 31[ 	]*vmovddup xmm30\{k7\}\{z\},QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 21 ff 08 12 b4 f0 23 01 00 00[ 	]*vmovddup xmm30,QWORD PTR \[rax\+r14\*8\+0x123\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 7f[ 	]*vmovddup xmm30,QWORD PTR \[rdx\+0x3f8\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 00 04 00 00[ 	]*vmovddup xmm30,QWORD PTR \[rdx\+0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 80[ 	]*vmovddup xmm30,QWORD PTR \[rdx-0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 f8 fb ff ff[ 	]*vmovddup xmm30,QWORD PTR \[rdx-0x408\]
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 08 12 f5[ 	]*vmovddup xmm30,xmm29
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 0f 12 f5[ 	]*vmovddup xmm30\{k7\},xmm29
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 8f 12 f5[ 	]*vmovddup xmm30\{k7\}\{z\},xmm29
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 28 19 31[ 	]*vbroadcastsd ymm30,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 2f 19 31[ 	]*vbroadcastsd ymm30\{k7\},QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd af 19 31[ 	]*vbroadcastsd ymm30\{k7\}\{z\},QWORD PTR \[rcx\]
@@ -6474,6 +6485,17 @@  Disassembly of section \.text:
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 00 08 00 00[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx\+0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a 72 80[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x800\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 7d 28 5a b2 f0 f7 ff ff[ 	]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x810\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 31[ 	]*vmovddup xmm30,QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 0f 12 31[ 	]*vmovddup xmm30\{k7\},QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 8f 12 31[ 	]*vmovddup xmm30\{k7\}\{z\},QWORD PTR \[rcx\]
+[ 	]*[a-f0-9]+:[ 	]*62 21 ff 08 12 b4 f0 34 12 00 00[ 	]*vmovddup xmm30,QWORD PTR \[rax\+r14\*8\+0x1234\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 7f[ 	]*vmovddup xmm30,QWORD PTR \[rdx\+0x3f8\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 00 04 00 00[ 	]*vmovddup xmm30,QWORD PTR \[rdx\+0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 72 80[ 	]*vmovddup xmm30,QWORD PTR \[rdx-0x400\]
+[ 	]*[a-f0-9]+:[ 	]*62 61 ff 08 12 b2 f8 fb ff ff[ 	]*vmovddup xmm30,QWORD PTR \[rdx-0x408\]
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 08 12 f5[ 	]*vmovddup xmm30,xmm29
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 0f 12 f5[ 	]*vmovddup xmm30\{k7\},xmm29
+[ 	]*[a-f0-9]+:[ 	]*62 01 ff 8f 12 f5[ 	]*vmovddup xmm30\{k7\}\{z\},xmm29
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 28 19 31[ 	]*vbroadcastsd ymm30,QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd 2f 19 31[ 	]*vbroadcastsd ymm30\{k7\},QWORD PTR \[rcx\]
 [ 	]*[a-f0-9]+:[ 	]*62 62 fd af 19 31[ 	]*vbroadcastsd ymm30\{k7\}\{z\},QWORD PTR \[rcx\]
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1495,6 +1495,8 @@  vblendp<sd>, 0x660c | <sd:opc>, AVX, Mod
 vblendvp<sd>, 0x664a | <sd:opc>, AVX, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vbroadcastf128, 0x661a, AVX, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegYMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX, Modrm|Vex128|Space0F|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 vbroadcastss, 0x6618, AVX, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Dword|Unspecified|BaseIndex, RegXMM|RegYMM }
 vcmp<frel>p<sd>, 0x<sd:ppfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmp<frel>s<sd>, 0x<sd:spfx>c2/0x<frel:imm>, AVX, Modrm|<frel:comm>|VexLIG|Space0F|VexVVVV|VexWIG|NoSuf|ImmExt, { RegXMM|<sd:elem>|Unspecified|BaseIndex, RegXMM, RegXMM }
@@ -1731,6 +1733,8 @@  vpmovzxwq, 0x6634, AVX2, Modrm|Vex=2|Spa
 
 vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX2, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegXMM, RegXMM }
 vbroadcastss, 0x6618, AVX2, Modrm|Vex|Space0F38|VexW=1|NoSuf, { RegXMM, RegXMM|RegYMM }
 vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpbroadcast<bw>, 0x6678 | <bw:opc>, AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { <bw:elem>|Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
@@ -2128,6 +2132,8 @@  vbroadcasti64x4, 0x665B, AVX512F, Modrm|
 
 vbroadcastss, 0x6618, AVX512F, Modrm|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vbroadcastsd, 0x6619, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 
 vpbroadcast<dq>, 0x6658 | <dq:opc>, AVX512F, Modrm|Masking|Space0F38|<dq:vexw>|Disp8MemShift|NoSuf, { RegXMM|<dq:elem>|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcast<dq>, 0x667c, AVX512F, Modrm|Masking|Space0F38|<dq:vexw64>|NoSuf, { <dq:gpr>, RegXMM|RegYMM|RegZMM }