From patchwork Fri Oct 14 09:12:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 2608 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp80546wrs; Fri, 14 Oct 2022 02:19:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7VQh6Evo2Mi9rThxuQRQqy23U5SxIzyIFsJQo67fEaicGgkXlswnb4cPwSNoD5qS/JzCem X-Received: by 2002:a17:906:d550:b0:78d:a6d4:c18f with SMTP id cr16-20020a170906d55000b0078da6d4c18fmr2882208ejc.113.1665739168849; Fri, 14 Oct 2022 02:19:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1665739168; cv=none; d=google.com; s=arc-20160816; b=Bfg01KVWAPm79gKW/Gh9Dl4NpMb9IvCwx3sXsLHByXZ8OtVyvefebOfsZGSYErloUx 3QJO84SgUGTn3jLKbmRsFkjSNtTgeliBAhC2u3p22tH9kWsoR9yVTIfAmv1yiPcTpV+q ds2UMIF3toT2P97DHAcy4OH8TzbLoJAvkztvSf9oMaeVWvewKyIShOrlF3zG3cDPlVrr ewUjEZxAwmOCJ0eAtFkasXTF4WCYAVMYRJBqV7U3CJJJzAkb80BYwkSYQgon+8h2bq46 +Q9EoZetUmdJQ5zWM/siNw/Z3lUVTTiNi+B2DaLfQpN+qrF5RP4XOoF67z4hsRQlGAzL XAcA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:references :in-reply-to:message-id:date:subject:to:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=oJH4BR/pnZGKIHOBmQWYsSi7s6E5G0yms6a6z5tFfxg=; b=ZjOL1bzeKb3t7AWkxmMevW1lt0WTLiLBLwuQuZ3PxSq4exS0RK9p6msxJGcnACHixe 5tZtCptMLeXMTS4xXT59Ec8/MZCC2SEFq8paEYXxi7VUGqZ8Mylcf28BntSmZNn8IXDx BdflI6azYrfrUiT6jLp3fOMqx7GbMW80ONP4AZIzitZ5viw5wD7E6OxxVznK4dCs82IE pMAK6oaNo24gLLztS+K6hhZpkpdGMfu7mL07NmUpI5BOTlewCwvztm9phefsmH5HNKtt jsEwd3wnRu5+7yv+dz7So+sVuaRGJ4QYGX6AaFX7+skqNIE9XDL4vNBeViVK6KoFJvJ7 7S/w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=d5tBhDyG; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id o16-20020aa7c510000000b0045c2a0de3ddsi1675289edq.150.2022.10.14.02.19.28 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 14 Oct 2022 02:19:28 -0700 (PDT) Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=d5tBhDyG; spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 94B62385087A for ; Fri, 14 Oct 2022 09:17:18 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 94B62385087A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1665739038; bh=oJH4BR/pnZGKIHOBmQWYsSi7s6E5G0yms6a6z5tFfxg=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=d5tBhDyGZul6h9DCPZGmgVhr/03GrW6hiU5tDlX4v2RkjNKanGfOzS3ixrf+lwyj1 b0OVzYDdTtYyJ8EdJU3QHIZp81vVz7skkVTBxHnsGzm0JL2CZna4ZoTyXPuz2y8J55 myxJqp1E8NzGVsVGswvu2WoQBwD2OkC8rnF+OBpI= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by sourceware.org (Postfix) with ESMTPS id 06816385AC1F for ; Fri, 14 Oct 2022 09:15:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 06816385AC1F X-IronPort-AV: E=McAfee;i="6500,9779,10499"; a="292682633" X-IronPort-AV: E=Sophos;i="5.95,182,1661842800"; d="scan'208";a="292682633" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Oct 2022 02:15:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10499"; a="629873112" X-IronPort-AV: E=Sophos;i="5.95,182,1661842800"; d="scan'208";a="629873112" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga007.fm.intel.com with ESMTP; 14 Oct 2022 02:14:50 -0700 Received: from shliclel314.sh.intel.com (shliclel314.sh.intel.com [10.239.240.214]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 43C4F1009C95; Fri, 14 Oct 2022 17:14:50 +0800 (CST) To: binutils@sourceware.org Subject: [PATCH 02/10] Support Intel AVX-VNNI-INT8 Date: Fri, 14 Oct 2022 17:12:40 +0800 Message-Id: <20221014091248.4920-3-haochen.jiang@intel.com> X-Mailer: git-send-email 2.18.2 In-Reply-To: <20221014091248.4920-1-haochen.jiang@intel.com> References: <20221014091248.4920-1-haochen.jiang@intel.com> X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Binutils From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org Sender: "Binutils" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1746654114903541772?= X-GMAIL-MSGID: =?utf-8?q?1746654114903541772?= From: "Cui,Lili" gas/ * NEWS: Support Intel AVX-VNNI-INT8. * config/tc-i386.c: Add avx_vnni_int8. * doc/c-i386.texi: Document avx_vnni_int8 and noavx_vnni_int8. * testsuite/gas/i386/avx-vnni-int8-intel.d: New file. * testsuite/gas/i386/avx-vnni-int8.d: Likewise. * testsuite/gas/i386/avx-vnni-int8.s: Likewise. * testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d: Likewise. * testsuite/gas/i386/x86-64-avx-vnni-int8.d: Likewise. * testsuite/gas/i386/x86-64-avx-vnni-int8.s: Likewise. * testsuite/gas/i386/i386.exp: Run AVX VNNI INT8 tests. opcodes/ * i386-dis.c: (PREFIX_VEX_0F3850) New. (PREFIX_VEX_0F3851): Likewise. (VEX_W_0F3850_P_0): Likewise. (VEX_W_0F3850_P_1): Likewise. (VEX_W_0F3850_P_2): Likewise. (VEX_W_0F3850_P_3): Likewise. (VEX_W_0F3851_P_0): Likewise. (VEX_W_0F3851_P_1): Likewise. (VEX_W_0F3851_P_2): Likewise. (VEX_W_0F3851_P_3): Likewise. (VEX_W_0F3850): Delete. (VEX_W_0F3851): Likewise. (prefix_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851. (vex_table): Add PREFIX_VEX_0F3850 and PREFIX_VEX_0F3851, delete VEX_W_0F3850 and VEX_W_0F3851. (vex_w_table): Add VEX_W_0F3850_P_0, VEX_W_0F3850_P_1, VEX_W_0F3850_P_2 VEX_W_0F3850_P_3, VEX_W_0F3851_P_0, VEX_W_0F3851_P_1, VEX_W_0F3851_P_2 and VEX_W_0F3851_P_3, delete VEX_W_0F3850 and VEX_W_0F3851. * i386-gen.c: (cpu_flag_init): Add CPU_AVX_VNNI_INT8_FLAGS and CPU_ANY_AVX_VNNI_INT8_FLAGS. (cpu_flags): Add CpuAVX_VNNI_INT8. * i386-opc.h (CpuAVX_VNNI_INT8): New. * i386-opc.tbl: Add Intel AVX_VNNI_INT8 instructions. * i386-init.h: Regenerated. * i386-tbl.h: Likewise. --- gas/NEWS | 2 + gas/config/tc-i386.c | 1 + gas/doc/c-i386.texi | 4 +- gas/testsuite/gas/i386/avx-vnni-int8-intel.d | 71 + gas/testsuite/gas/i386/avx-vnni-int8.d | 71 + gas/testsuite/gas/i386/avx-vnni-int8.s | 127 + gas/testsuite/gas/i386/i386.exp | 4 + .../gas/i386/x86-64-avx-vnni-int8-intel.d | 71 + gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d | 71 + gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s | 127 + opcodes/i386-dis.c | 23 +- opcodes/i386-gen.c | 5 + opcodes/i386-init.h | 516 +- opcodes/i386-opc.h | 3 + opcodes/i386-opc.tbl | 11 + opcodes/i386-tbl.h | 7876 +++++++++-------- 16 files changed, 4843 insertions(+), 4140 deletions(-) create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8-intel.d create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8.d create mode 100644 gas/testsuite/gas/i386/avx-vnni-int8.s create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d create mode 100644 gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s diff --git a/gas/NEWS b/gas/NEWS index 7cf65728ba..86996be2f5 100644 --- a/gas/NEWS +++ b/gas/NEWS @@ -1,5 +1,7 @@ -*- text -*- +* Add support for Intel AVX-VNNI-INT8 instructions. + * Add support for Intel AVX-IFMA instructions. * gas now supports --compress-debug-sections=zstd to compress diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c index 2fe7674884..8329529d38 100644 --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -1095,6 +1095,7 @@ static const arch_entry cpu_arch[] = SUBARCH (hreset, HRESET, ANY_HRESET, false), SUBARCH (avx512_fp16, AVX512_FP16, ANY_AVX512_FP16, false), SUBARCH (avx_ifma, AVX_IFMA, ANY_AVX_IFMA, false), + SUBARCH (avx_vnni_int8, AVX_VNNI_INT8, ANY_AVX_VNNI_INT8, false), }; #undef SUBARCH diff --git a/gas/doc/c-i386.texi b/gas/doc/c-i386.texi index 9e629605f8..18fa30072b 100644 --- a/gas/doc/c-i386.texi +++ b/gas/doc/c-i386.texi @@ -214,6 +214,7 @@ accept various extension mnemonics. For example, @code{avx_vnni}, @code{avx512_fp16}, @code{avx_ifma}, +@code{avx_vnni_int8}, @code{noavx512f}, @code{noavx512cd}, @code{noavx512er}, @@ -235,6 +236,7 @@ accept various extension mnemonics. For example, @code{noavx_vnni}, @code{noavx512_fp16}, @code{noavx_ifma}, +@code{noavx_vnni_int8}, @code{noenqcmd}, @code{noserialize}, @code{notsxldtrk}, @@ -1535,7 +1537,7 @@ supported on the CPU specified. The choices for @var{cpu_type} are: @item @samp{.avx512_bitalg} @tab @samp{.avx512_bf16} @tab @samp{.avx512_vp2intersect} @item @samp{.tdx} @tab @samp{.avx_vnni} @tab @samp{.avx512_fp16} @item @samp{.clwb} @tab @samp{.rdpid} @tab @samp{.ptwrite} @tab @samp{.ibt} -@item @samp{.avx_ifma} +@item @samp{.avx_ifma} @tab @samp{.avx_vnni_int8} @item @samp{.wbnoinvd} @tab @samp{.pconfig} @tab @samp{.waitpkg} @tab @samp{.cldemote} @item @samp{.shstk} @tab @samp{.gfni} @tab @samp{.vaes} @tab @samp{.vpclmulqdq} @item @samp{.movdiri} @tab @samp{.movdir64b} @tab @samp{.enqcmd} @tab @samp{.tsxldtrk} diff --git a/gas/testsuite/gas/i386/avx-vnni-int8-intel.d b/gas/testsuite/gas/i386/avx-vnni-int8-intel.d new file mode 100644 index 0000000000..1d7d162f20 --- /dev/null +++ b/gas/testsuite/gas/i386/avx-vnni-int8-intel.d @@ -0,0 +1,71 @@ +#as: +#objdump: -dw -Mintel +#name: i386 AVX-VNNI-INT8 insns (Intel disassembly) +#source: avx-vnni-int8.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 57 50 f4\s+vpdpbssd ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 53 50 f4\s+vpdpbssd xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 57 50 b4 f4 00 00 00 10\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 57 50 31\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 57 50 b1 e0 0f 00 00\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 57 50 b2 00 f0 ff ff\s+vpdpbssd ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 53 50 b4 f4 00 00 00 10\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 53 50 31\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 53 50 b1 f0 07 00 00\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 53 50 b2 00 f8 ff ff\s+vpdpbssd xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +\s*[a-f0-9]+:\s*c4 e2 57 51 f4\s+vpdpbssds ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 53 51 f4\s+vpdpbssds xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 57 51 b4 f4 00 00 00 10\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 57 51 31\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 57 51 b1 e0 0f 00 00\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 57 51 b2 00 f0 ff ff\s+vpdpbssds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 53 51 b4 f4 00 00 00 10\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 53 51 31\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 53 51 b1 f0 07 00 00\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 53 51 b2 00 f8 ff ff\s+vpdpbssds xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +\s*[a-f0-9]+:\s*c4 e2 56 50 f4\s+vpdpbsud ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 52 50 f4\s+vpdpbsud xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 56 50 b4 f4 00 00 00 10\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 56 50 31\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 56 50 b1 e0 0f 00 00\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 56 50 b2 00 f0 ff ff\s+vpdpbsud ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 52 50 b4 f4 00 00 00 10\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 52 50 31\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 52 50 b1 f0 07 00 00\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 52 50 b2 00 f8 ff ff\s+vpdpbsud xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +\s*[a-f0-9]+:\s*c4 e2 56 51 f4\s+vpdpbsuds ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 52 51 f4\s+vpdpbsuds xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 56 51 b4 f4 00 00 00 10\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 56 51 31\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 56 51 b1 e0 0f 00 00\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 56 51 b2 00 f0 ff ff\s+vpdpbsuds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 52 51 b4 f4 00 00 00 10\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 52 51 31\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 52 51 b1 f0 07 00 00\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 52 51 b2 00 f8 ff ff\s+vpdpbsuds xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +\s*[a-f0-9]+:\s*c4 e2 54 50 f4\s+vpdpbuud ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 50 50 f4\s+vpdpbuud xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 54 50 b4 f4 00 00 00 10\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 54 50 31\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 54 50 b1 e0 0f 00 00\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 54 50 b2 00 f0 ff ff\s+vpdpbuud ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 50 50 b4 f4 00 00 00 10\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 50 50 31\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 50 50 b1 f0 07 00 00\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 50 50 b2 00 f8 ff ff\s+vpdpbuud xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +\s*[a-f0-9]+:\s*c4 e2 54 51 f4\s+vpdpbuuds ymm6,ymm5,ymm4 +\s*[a-f0-9]+:\s*c4 e2 50 51 f4\s+vpdpbuuds xmm6,xmm5,xmm4 +\s*[a-f0-9]+:\s*c4 e2 54 51 b4 f4 00 00 00 10\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 54 51 31\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 54 51 b1 e0 0f 00 00\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[ecx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 e2 54 51 b2 00 f0 ff ff\s+vpdpbuuds ymm6,ymm5,YMMWORD PTR \[edx-0x1000\] +\s*[a-f0-9]+:\s*c4 e2 50 51 b4 f4 00 00 00 10\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[esp\+esi\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 e2 50 51 31\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[ecx\] +\s*[a-f0-9]+:\s*c4 e2 50 51 b1 f0 07 00 00\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[ecx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 e2 50 51 b2 00 f8 ff ff\s+vpdpbuuds xmm6,xmm5,XMMWORD PTR \[edx-0x800\] +#pass diff --git a/gas/testsuite/gas/i386/avx-vnni-int8.d b/gas/testsuite/gas/i386/avx-vnni-int8.d new file mode 100644 index 0000000000..cd4499e59f --- /dev/null +++ b/gas/testsuite/gas/i386/avx-vnni-int8.d @@ -0,0 +1,71 @@ +#as: +#objdump: -dw +#name: i386 AVX-VNNI-INT8 insns +#source: avx-vnni-int8.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 e2 57 50 f4\s+vpdpbssd %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 53 50 f4\s+vpdpbssd %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 57 50 b4 f4 00 00 00 10\s+vpdpbssd 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 50 31\s+vpdpbssd \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 50 b1 e0 0f 00 00\s+vpdpbssd 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 50 b2 00 f0 ff ff\s+vpdpbssd -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 53 50 b4 f4 00 00 00 10\s+vpdpbssd 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 50 31\s+vpdpbssd \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 50 b1 f0 07 00 00\s+vpdpbssd 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 50 b2 00 f8 ff ff\s+vpdpbssd -0x800\(%edx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 57 51 f4\s+vpdpbssds %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 53 51 f4\s+vpdpbssds %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 57 51 b4 f4 00 00 00 10\s+vpdpbssds 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 51 31\s+vpdpbssds \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 51 b1 e0 0f 00 00\s+vpdpbssds 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 57 51 b2 00 f0 ff ff\s+vpdpbssds -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 53 51 b4 f4 00 00 00 10\s+vpdpbssds 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 51 31\s+vpdpbssds \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 51 b1 f0 07 00 00\s+vpdpbssds 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 53 51 b2 00 f8 ff ff\s+vpdpbssds -0x800\(%edx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 56 50 f4\s+vpdpbsud %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 52 50 f4\s+vpdpbsud %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 56 50 b4 f4 00 00 00 10\s+vpdpbsud 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 50 31\s+vpdpbsud \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 50 b1 e0 0f 00 00\s+vpdpbsud 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 50 b2 00 f0 ff ff\s+vpdpbsud -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 52 50 b4 f4 00 00 00 10\s+vpdpbsud 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 50 31\s+vpdpbsud \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 50 b1 f0 07 00 00\s+vpdpbsud 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 50 b2 00 f8 ff ff\s+vpdpbsud -0x800\(%edx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 56 51 f4\s+vpdpbsuds %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 52 51 f4\s+vpdpbsuds %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 56 51 b4 f4 00 00 00 10\s+vpdpbsuds 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 51 31\s+vpdpbsuds \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 51 b1 e0 0f 00 00\s+vpdpbsuds 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 56 51 b2 00 f0 ff ff\s+vpdpbsuds -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 52 51 b4 f4 00 00 00 10\s+vpdpbsuds 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 51 31\s+vpdpbsuds \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 51 b1 f0 07 00 00\s+vpdpbsuds 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 52 51 b2 00 f8 ff ff\s+vpdpbsuds -0x800\(%edx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 54 50 f4\s+vpdpbuud %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 50 50 f4\s+vpdpbuud %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 54 50 b4 f4 00 00 00 10\s+vpdpbuud 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 50 31\s+vpdpbuud \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 50 b1 e0 0f 00 00\s+vpdpbuud 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 50 b2 00 f0 ff ff\s+vpdpbuud -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 50 50 b4 f4 00 00 00 10\s+vpdpbuud 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 50 31\s+vpdpbuud \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 50 b1 f0 07 00 00\s+vpdpbuud 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 50 b2 00 f8 ff ff\s+vpdpbuud -0x800\(%edx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 54 51 f4\s+vpdpbuuds %ymm4,%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 50 51 f4\s+vpdpbuuds %xmm4,%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 54 51 b4 f4 00 00 00 10\s+vpdpbuuds 0x10000000\(%esp,%esi,8\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 51 31\s+vpdpbuuds \(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 51 b1 e0 0f 00 00\s+vpdpbuuds 0xfe0\(%ecx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 54 51 b2 00 f0 ff ff\s+vpdpbuuds -0x1000\(%edx\),%ymm5,%ymm6 +\s*[a-f0-9]+:\s*c4 e2 50 51 b4 f4 00 00 00 10\s+vpdpbuuds 0x10000000\(%esp,%esi,8\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 51 31\s+vpdpbuuds \(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 51 b1 f0 07 00 00\s+vpdpbuuds 0x7f0\(%ecx\),%xmm5,%xmm6 +\s*[a-f0-9]+:\s*c4 e2 50 51 b2 00 f8 ff ff\s+vpdpbuuds -0x800\(%edx\),%xmm5,%xmm6 +#pass diff --git a/gas/testsuite/gas/i386/avx-vnni-int8.s b/gas/testsuite/gas/i386/avx-vnni-int8.s new file mode 100644 index 0000000000..e3cfeb6680 --- /dev/null +++ b/gas/testsuite/gas/i386/avx-vnni-int8.s @@ -0,0 +1,127 @@ +# Check 32bit AVX-VNNI-INT8 instructions + + .allow_index_reg + .text +_start: + vpdpbssd %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssd %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssd 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssd (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssd 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssd -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssd 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssd (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssd 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssd -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbssds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssds %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbssds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbssds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsud %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsud %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsud 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsud (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsud 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsud -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsud 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsud (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsud 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsud -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsuds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsuds %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsuds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsuds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbsuds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsuds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsuds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsuds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbsuds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsuds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuud %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuud %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuud 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuud (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuud 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuud -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuud 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuud (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuud 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuud -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuuds %ymm4, %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuuds %xmm4, %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuuds 0x10000000(%esp, %esi, 8), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuuds (%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 + vpdpbuuds 4064(%ecx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuuds -4096(%edx), %ymm5, %ymm6 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuuds 0x10000000(%esp, %esi, 8), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuuds (%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 + vpdpbuuds 2032(%ecx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuuds -2048(%edx), %xmm5, %xmm6 #AVX-VNNI-INT8 Disp32(00f8ffff) + +.intel_syntax noprefix + vpdpbssd ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbssd xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbssd ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssd ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbssd ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssd ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssd xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssd xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbssd xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssd xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbssds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbssds xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbssds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbssds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbssds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsud ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbsud xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbsud ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsud ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbsud ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsud ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsud xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsud xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbsud xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsud xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsuds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbsuds xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbsuds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsuds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbsuds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsuds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsuds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsuds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbsuds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsuds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuud ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbuud xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbuud ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuud ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbuud ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuud ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuud xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuud xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbuud xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuud xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuuds ymm6, ymm5, ymm4 #AVX-VNNI-INT8 + vpdpbuuds xmm6, xmm5, xmm4 #AVX-VNNI-INT8 + vpdpbuuds ymm6, ymm5, YMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuuds ymm6, ymm5, YMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbuuds ymm6, ymm5, YMMWORD PTR [ecx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuuds ymm6, ymm5, YMMWORD PTR [edx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuuds xmm6, xmm5, XMMWORD PTR [esp+esi*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuuds xmm6, xmm5, XMMWORD PTR [ecx] #AVX-VNNI-INT8 + vpdpbuuds xmm6, xmm5, XMMWORD PTR [ecx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuuds xmm6, xmm5, XMMWORD PTR [edx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) diff --git a/gas/testsuite/gas/i386/i386.exp b/gas/testsuite/gas/i386/i386.exp index 3a46807e4f..08774c38d9 100644 --- a/gas/testsuite/gas/i386/i386.exp +++ b/gas/testsuite/gas/i386/i386.exp @@ -481,6 +481,8 @@ if [gas_32_check] then { run_dump_test "avx-ifma" run_dump_test "avx-ifma-intel" run_list_test "avx-ifma-inval" + run_dump_test "avx-vnni-int8" + run_dump_test "avx-vnni-int8-intel" run_list_test "sg" run_dump_test "clzero" run_dump_test "invlpgb" @@ -1151,6 +1153,8 @@ if [gas_64_check] then { run_dump_test "x86-64-avx-ifma" run_dump_test "x86-64-avx-ifma-intel" run_list_test "x86-64-avx-ifma-inval" + run_dump_test "x86-64-avx-vnni-int8" + run_dump_test "x86-64-avx-vnni-int8-intel" run_dump_test "x86-64-clzero" run_dump_test "x86-64-mwaitx-bdver4" run_list_test "x86-64-mwaitx-reg" diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d new file mode 100644 index 0000000000..61c01124ef --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8-intel.d @@ -0,0 +1,71 @@ +#as: +#objdump: -dw -Mintel +#name: x86_64 AVX-VNNI-INT8 insns (Intel disassembly) +#source: x86-64-avx-vnni-int8.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 42 37 50 d0\s+vpdpbssd ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 33 50 d0\s+vpdpbssd xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 37 50 94 f5 00 00 00 10\s+vpdpbssd ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 37 50 11\s+vpdpbssd ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 37 50 91 e0 0f 00 00\s+vpdpbssd ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 37 50 92 00 f0 ff ff\s+vpdpbssd ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 33 50 94 f5 00 00 00 10\s+vpdpbssd xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 33 50 11\s+vpdpbssd xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 33 50 91 f0 07 00 00\s+vpdpbssd xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 33 50 92 00 f8 ff ff\s+vpdpbssd xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +\s*[a-f0-9]+:\s*c4 42 37 51 d0\s+vpdpbssds ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 33 51 d0\s+vpdpbssds xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 37 51 94 f5 00 00 00 10\s+vpdpbssds ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 37 51 11\s+vpdpbssds ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 37 51 91 e0 0f 00 00\s+vpdpbssds ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 37 51 92 00 f0 ff ff\s+vpdpbssds ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 33 51 94 f5 00 00 00 10\s+vpdpbssds xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 33 51 11\s+vpdpbssds xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 33 51 91 f0 07 00 00\s+vpdpbssds xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 33 51 92 00 f8 ff ff\s+vpdpbssds xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +\s*[a-f0-9]+:\s*c4 42 36 50 d0\s+vpdpbsud ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 32 50 d0\s+vpdpbsud xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 36 50 94 f5 00 00 00 10\s+vpdpbsud ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 36 50 11\s+vpdpbsud ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 36 50 91 e0 0f 00 00\s+vpdpbsud ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 36 50 92 00 f0 ff ff\s+vpdpbsud ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 32 50 94 f5 00 00 00 10\s+vpdpbsud xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 32 50 11\s+vpdpbsud xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 32 50 91 f0 07 00 00\s+vpdpbsud xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 32 50 92 00 f8 ff ff\s+vpdpbsud xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +\s*[a-f0-9]+:\s*c4 42 36 51 d0\s+vpdpbsuds ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 32 51 d0\s+vpdpbsuds xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 36 51 94 f5 00 00 00 10\s+vpdpbsuds ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 36 51 11\s+vpdpbsuds ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 36 51 91 e0 0f 00 00\s+vpdpbsuds ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 36 51 92 00 f0 ff ff\s+vpdpbsuds ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 32 51 94 f5 00 00 00 10\s+vpdpbsuds xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 32 51 11\s+vpdpbsuds xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 32 51 91 f0 07 00 00\s+vpdpbsuds xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 32 51 92 00 f8 ff ff\s+vpdpbsuds xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +\s*[a-f0-9]+:\s*c4 42 34 50 d0\s+vpdpbuud ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 30 50 d0\s+vpdpbuud xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 34 50 94 f5 00 00 00 10\s+vpdpbuud ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 34 50 11\s+vpdpbuud ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 34 50 91 e0 0f 00 00\s+vpdpbuud ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 34 50 92 00 f0 ff ff\s+vpdpbuud ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 30 50 94 f5 00 00 00 10\s+vpdpbuud xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 30 50 11\s+vpdpbuud xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 30 50 91 f0 07 00 00\s+vpdpbuud xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 30 50 92 00 f8 ff ff\s+vpdpbuud xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +\s*[a-f0-9]+:\s*c4 42 34 51 d0\s+vpdpbuuds ymm10,ymm9,ymm8 +\s*[a-f0-9]+:\s*c4 42 30 51 d0\s+vpdpbuuds xmm10,xmm9,xmm8 +\s*[a-f0-9]+:\s*c4 22 34 51 94 f5 00 00 00 10\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 34 51 11\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 34 51 91 e0 0f 00 00\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[rcx\+0xfe0\] +\s*[a-f0-9]+:\s*c4 62 34 51 92 00 f0 ff ff\s+vpdpbuuds ymm10,ymm9,YMMWORD PTR \[rdx-0x1000\] +\s*[a-f0-9]+:\s*c4 22 30 51 94 f5 00 00 00 10\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rbp\+r14\*8\+0x10000000\] +\s*[a-f0-9]+:\s*c4 42 30 51 11\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[r9\] +\s*[a-f0-9]+:\s*c4 62 30 51 91 f0 07 00 00\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rcx\+0x7f0\] +\s*[a-f0-9]+:\s*c4 62 30 51 92 00 f8 ff ff\s+vpdpbuuds xmm10,xmm9,XMMWORD PTR \[rdx-0x800\] +#pass diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d new file mode 100644 index 0000000000..90faed581b --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.d @@ -0,0 +1,71 @@ +#as: +#objdump: -dw +#name: x86_64 AVX-VNNI-INT8 insns +#source: x86-64-avx-vnni-int8.s + +.*: +file format .* + +Disassembly of section \.text: + +0+ <_start>: +\s*[a-f0-9]+:\s*c4 42 37 50 d0\s+vpdpbssd %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 33 50 d0\s+vpdpbssd %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 37 50 94 f5 00 00 00 10\s+vpdpbssd 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 37 50 11\s+vpdpbssd \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 37 50 91 e0 0f 00 00\s+vpdpbssd 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 37 50 92 00 f0 ff ff\s+vpdpbssd -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 33 50 94 f5 00 00 00 10\s+vpdpbssd 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 33 50 11\s+vpdpbssd \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 33 50 91 f0 07 00 00\s+vpdpbssd 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 33 50 92 00 f8 ff ff\s+vpdpbssd -0x800\(%rdx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 37 51 d0\s+vpdpbssds %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 33 51 d0\s+vpdpbssds %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 37 51 94 f5 00 00 00 10\s+vpdpbssds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 37 51 11\s+vpdpbssds \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 37 51 91 e0 0f 00 00\s+vpdpbssds 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 37 51 92 00 f0 ff ff\s+vpdpbssds -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 33 51 94 f5 00 00 00 10\s+vpdpbssds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 33 51 11\s+vpdpbssds \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 33 51 91 f0 07 00 00\s+vpdpbssds 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 33 51 92 00 f8 ff ff\s+vpdpbssds -0x800\(%rdx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 36 50 d0\s+vpdpbsud %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 32 50 d0\s+vpdpbsud %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 36 50 94 f5 00 00 00 10\s+vpdpbsud 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 36 50 11\s+vpdpbsud \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 36 50 91 e0 0f 00 00\s+vpdpbsud 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 36 50 92 00 f0 ff ff\s+vpdpbsud -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 32 50 94 f5 00 00 00 10\s+vpdpbsud 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 32 50 11\s+vpdpbsud \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 32 50 91 f0 07 00 00\s+vpdpbsud 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 32 50 92 00 f8 ff ff\s+vpdpbsud -0x800\(%rdx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 36 51 d0\s+vpdpbsuds %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 32 51 d0\s+vpdpbsuds %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 36 51 94 f5 00 00 00 10\s+vpdpbsuds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 36 51 11\s+vpdpbsuds \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 36 51 91 e0 0f 00 00\s+vpdpbsuds 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 36 51 92 00 f0 ff ff\s+vpdpbsuds -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 32 51 94 f5 00 00 00 10\s+vpdpbsuds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 32 51 11\s+vpdpbsuds \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 32 51 91 f0 07 00 00\s+vpdpbsuds 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 32 51 92 00 f8 ff ff\s+vpdpbsuds -0x800\(%rdx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 34 50 d0\s+vpdpbuud %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 30 50 d0\s+vpdpbuud %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 34 50 94 f5 00 00 00 10\s+vpdpbuud 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 34 50 11\s+vpdpbuud \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 34 50 91 e0 0f 00 00\s+vpdpbuud 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 34 50 92 00 f0 ff ff\s+vpdpbuud -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 30 50 94 f5 00 00 00 10\s+vpdpbuud 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 30 50 11\s+vpdpbuud \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 30 50 91 f0 07 00 00\s+vpdpbuud 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 30 50 92 00 f8 ff ff\s+vpdpbuud -0x800\(%rdx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 34 51 d0\s+vpdpbuuds %ymm8,%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 30 51 d0\s+vpdpbuuds %xmm8,%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 22 34 51 94 f5 00 00 00 10\s+vpdpbuuds 0x10000000\(%rbp,%r14,8\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 42 34 51 11\s+vpdpbuuds \(%r9\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 34 51 91 e0 0f 00 00\s+vpdpbuuds 0xfe0\(%rcx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 62 34 51 92 00 f0 ff ff\s+vpdpbuuds -0x1000\(%rdx\),%ymm9,%ymm10 +\s*[a-f0-9]+:\s*c4 22 30 51 94 f5 00 00 00 10\s+vpdpbuuds 0x10000000\(%rbp,%r14,8\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 42 30 51 11\s+vpdpbuuds \(%r9\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 30 51 91 f0 07 00 00\s+vpdpbuuds 0x7f0\(%rcx\),%xmm9,%xmm10 +\s*[a-f0-9]+:\s*c4 62 30 51 92 00 f8 ff ff\s+vpdpbuuds -0x800\(%rdx\),%xmm9,%xmm10 +#pass diff --git a/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s new file mode 100644 index 0000000000..bc9145b26f --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-avx-vnni-int8.s @@ -0,0 +1,127 @@ +# Check 64bit AVX-VNNI-INT8 instructions + + .allow_index_reg + .text +_start: + vpdpbssd %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssd %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssd 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssd (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssd 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssd -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssd 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssd (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssd 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssd -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbssds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbssds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssds 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbssds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsud %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsud %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsud 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsud (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsud 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsud -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsud 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsud (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsud 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsud -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsuds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsuds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsuds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsuds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbsuds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsuds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsuds 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsuds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbsuds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsuds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuud %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuud %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuud 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuud (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuud 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuud -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuud 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuud (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuud 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuud -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuuds %ymm8, %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuuds %xmm8, %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuuds 0x10000000(%rbp, %r14, 8), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuuds (%r9), %ymm9, %ymm10 #AVX-VNNI-INT8 + vpdpbuuds 4064(%rcx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuuds -4096(%rdx), %ymm9, %ymm10 #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuuds 0x10000000(%rbp, %r14, 8), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuuds (%r9), %xmm9, %xmm10 #AVX-VNNI-INT8 + vpdpbuuds 2032(%rcx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuuds -2048(%rdx), %xmm9, %xmm10 #AVX-VNNI-INT8 Disp32(00f8ffff) + +.intel_syntax noprefix + vpdpbssd ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbssd xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbssd ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssd ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbssd ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssd ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssd xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssd xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbssd xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssd xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbssds ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbssds xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbssds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbssds ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbssds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbssds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbssds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbssds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbssds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsud ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbsud xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbsud ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsud ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbsud ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsud ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsud xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsud xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbsud xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsud xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbsuds ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbsuds xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsuds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbsuds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbsuds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbsuds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuud ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbuud xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbuud ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuud ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbuud ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuud ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuud xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuud xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbuud xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuud xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) + vpdpbuuds ymm10, ymm9, ymm8 #AVX-VNNI-INT8 + vpdpbuuds xmm10, xmm9, xmm8 #AVX-VNNI-INT8 + vpdpbuuds ymm10, ymm9, YMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuuds ymm10, ymm9, YMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbuuds ymm10, ymm9, YMMWORD PTR [rcx+4064] #AVX-VNNI-INT8 Disp32(e00f0000) + vpdpbuuds ymm10, ymm9, YMMWORD PTR [rdx-4096] #AVX-VNNI-INT8 Disp32(00f0ffff) + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rbp+r14*8+0x10000000] #AVX-VNNI-INT8 + vpdpbuuds xmm10, xmm9, XMMWORD PTR [r9] #AVX-VNNI-INT8 + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rcx+2032] #AVX-VNNI-INT8 Disp32(f0070000) + vpdpbuuds xmm10, xmm9, XMMWORD PTR [rdx-2048] #AVX-VNNI-INT8 Disp32(00f8ffff) diff --git a/opcodes/i386-dis.c b/opcodes/i386-dis.c index 74f71467c5..d04cc3779b 100644 --- a/opcodes/i386-dis.c +++ b/opcodes/i386-dis.c @@ -1128,6 +1128,8 @@ enum PREFIX_VEX_0FF0, PREFIX_VEX_0F3849_X86_64, PREFIX_VEX_0F384B_X86_64, + PREFIX_VEX_0F3850_W_0, + PREFIX_VEX_0F3851_W_0, PREFIX_VEX_0F385C_X86_64, PREFIX_VEX_0F385E_X86_64, PREFIX_VEX_0F38F5_L_0, @@ -4004,6 +4006,21 @@ static const struct dis386 prefix_table[][4] = { { VEX_W_TABLE (VEX_W_0F384B_X86_64_P_3) }, }, + /* PREFIX_VEX_0F3850_W_0 */ + { + { "vpdpbuud", { XM, Vex, EXx }, 0 }, + { "vpdpbsud", { XM, Vex, EXx }, 0 }, + { "%XV vpdpbusd", { XM, Vex, EXx }, 0 }, + { "vpdpbssd", { XM, Vex, EXx }, 0 }, + }, + + /* PREFIX_VEX_0F3851_W_0 */ + { + { "vpdpbuuds", { XM, Vex, EXx }, 0 }, + { "vpdpbsuds", { XM, Vex, EXx }, 0 }, + { "%XV vpdpbusds", { XM, Vex, EXx }, 0 }, + { "vpdpbssds", { XM, Vex, EXx }, 0 }, + }, /* PREFIX_VEX_0F385C_X86_64 */ { { Bad_Opcode }, @@ -7547,11 +7564,11 @@ static const struct dis386 vex_w_table[][2] = { }, { /* VEX_W_0F3850 */ - { "%XV vpdpbusd", { XM, Vex, EXx }, 0 }, + { PREFIX_TABLE (PREFIX_VEX_0F3850_W_0) }, }, { - /* VEX_W_0F3851 */ - { "%XV vpdpbusds", { XM, Vex, EXx }, 0 }, + /* VEX_W_0F3851_P_0 */ + { PREFIX_TABLE (PREFIX_VEX_0F3851_W_0) }, }, { /* VEX_W_0F3852 */ diff --git a/opcodes/i386-gen.c b/opcodes/i386-gen.c index cf7b5ece2a..8a80c4c64b 100644 --- a/opcodes/i386-gen.c +++ b/opcodes/i386-gen.c @@ -247,6 +247,8 @@ static initializer cpu_flag_init[] = "CPU_AVX512BW_FLAGS|CpuAVX512_FP16" }, { "CPU_AVX_IFMA_FLAGS", "CPU_AVX2_FLAGS|CpuAVX_IFMA" }, + { "CPU_AVX_VNNI_INT8_FLAGS", + "CPU_AVX2_FLAGS|CpuAVX_VNNI_INT8" }, { "CPU_IAMCU_FLAGS", "Cpu186|Cpu286|Cpu386|Cpu486|Cpu586|CpuIAMCU" }, { "CPU_ADX_FLAGS", @@ -443,6 +445,8 @@ static initializer cpu_flag_init[] = "CpuAVX512_FP16" }, { "CPU_ANY_AVX_IFMA_FLAGS", "CpuAVX_IFMA" }, + { "CPU_ANY_AVX_VNNI_INT8_FLAGS", + "CpuAVX_VNNI_INT8" }, }; static initializer operand_type_init[] = @@ -645,6 +649,7 @@ static bitfield cpu_flags[] = BITFIELD (CpuAVX_VNNI), BITFIELD (CpuAVX512_FP16), BITFIELD (CpuAVX_IFMA), + BITFIELD (CpuAVX_VNNI_INT8), BITFIELD (CpuMWAITX), BITFIELD (CpuCLZERO), BITFIELD (CpuOSPKE), diff --git a/opcodes/i386-opc.h b/opcodes/i386-opc.h index 8ff66d42cc..e9c6785898 100644 --- a/opcodes/i386-opc.h +++ b/opcodes/i386-opc.h @@ -211,6 +211,8 @@ enum CpuAVX512_FP16, /* Intel AVX IFMA Instructions support required. */ CpuAVX_IFMA, + /* Intel AVX VNNI-INT8 Instructions support required. */ + CpuAVX_VNNI_INT8, /* mwaitx instruction required */ CpuMWAITX, /* Clzero instruction required */ @@ -391,6 +393,7 @@ typedef union i386_cpu_flags unsigned int cpuavx_vnni:1; unsigned int cpuavx512_fp16:1; unsigned int cpuavx_ifma:1; + unsigned int cpuavx_vnni_int8:1; unsigned int cpumwaitx:1; unsigned int cpuclzero:1; unsigned int cpuospke:1; diff --git a/opcodes/i386-opc.tbl b/opcodes/i386-opc.tbl index d39c676abf..458a9897da 100644 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -3270,3 +3270,14 @@ vpmadd52huq, 0x66B5, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexV vpmadd52luq, 0x66B4, None, CpuAVX_IFMA, Modrm|Vex|PseudoVexPrefix|Space0F38|VexVVVV=1|VexW1|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } // AVX_IFMA instructions end. + +// AVX_VNNI_INT8 instructions. + +vpdpbuud, 0x50, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } +vpdpbuuds, 0x51, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } +vpdpbssd, 0xf250, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } +vpdpbssds, 0xf251, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } +vpdpbsud, 0xf350, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } +vpdpbsuds, 0xf351, None, CpuAVX_VNNI_INT8, Modrm|Vex|Space0F38|VexVVVV|VexW0|CheckRegSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM } + +// AVX_VNNI_INT8 instructions end.