From patchwork Fri Jun 16 07:30:41 2023
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 108895
Message-ID: <503caac8-8824-823a-81c2-762cba207cb6@suse.com>
Date: Fri, 16 Jun 2023 09:30:41 +0200
Subject: [PATCH 1/4] x86: optimize pre-AVX512 {,V}PCMPEQQ with identical sources
To: Binutils
Cc: "H.J. Lu"
From: Jan Beulich

The {,V}PCMPEQD alternative is 1 byte shorter in many cases.
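For illustration only (not part of the patch): a rough sketch of where the
saved byte comes from. {,V}PCMPEQQ lives in opcode map 0F38, {,V}PCMPEQD in
map 0F. The legacy form drops one escape byte, and the VEX form can use the
2-byte prefix as long as VEX.B is not needed, i.e. the source register is
below 8. The names insn_length, bytes_saved and the MAP_* constants are
hypothetical and do not exist in gas.

```c
#include <assert.h>
#include <stdbool.h>

/* Opcode maps relevant here; illustrative names, not gas identifiers.  */
enum opcode_map { MAP_0F, MAP_0F38 };

/* Byte length of a register-register packed compare.
   RM is the register encoded in ModRM.rm (the first AT&T operand),
   REG the one encoded in ModRM.reg (the destination).  */
int
insn_length (bool vex, enum opcode_map map, int rm, int reg)
{
  const int opcode = 1, modrm = 1;

  if (vex)
    {
      /* The 2-byte VEX prefix (C5) implies map 0F and carries only VEX.R,
         so map 0F38 or a need for VEX.B forces the 3-byte form (C4).  */
      bool three_byte = map != MAP_0F || rm >= 8;
      return (three_byte ? 3 : 2) + opcode + modrm;
    }

  /* Legacy SSE: 66 prefix, optional REX, 0F escape, optional 38 escape.  */
  int rex = (rm >= 8 || reg >= 8) ? 1 : 0;
  int escape = map == MAP_0F38 ? 2 : 1;
  return 1 + rex + escape + opcode + modrm;
}

/* Bytes saved by replacing {,v}pcmpeqq (map 0F38) with {,v}pcmpeqd (map 0F).  */
int
bytes_saved (bool vex, int rm, int reg)
{
  return insn_length (vex, MAP_0F38, rm, reg)
         - insn_length (vex, MAP_0F, rm, reg);
}

void
self_check (void)
{
  assert (bytes_saved (false, 2, 2) == 1);   /* pcmpeqq %xmm2, %xmm2 */
  assert (bytes_saved (false, 12, 12) == 1); /* pcmpeqq %xmm12, %xmm12 */
  assert (bytes_saved (true, 2, 0) == 1);    /* vpcmpeqq %xmm2, %xmm2, %xmm0 */
  assert (bytes_saved (true, 12, 0) == 0);   /* vpcmpeqq %xmm12, %xmm12, %xmm0 */
}
```

Calling self_check () from a main () exercises the four cases. The last one
is why the patch leaves the VEX forms alone when the source register needs
RegRex: the 3-byte prefix is required either way, so nothing is gained.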
---
It's not really clear whether the same would be worthwhile for AVX512
forms: Some could be expressed via KXNOR* (when no masking is in effect)
or KOR* (when masking is in effect), but others cannot. And while in
pre-AVX512 code these patterns are likely to be used to produce all-ones
idioms, this looks less likely in AVX512.

--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4563,6 +4563,23 @@ optimize_encoding (void)
             i.types[j].bitfield.disp8
               = fits_in_disp8 (i.op[j].disps->X_add_number);
         }
+  else if (optimize_for_space
+           && i.tm.base_opcode == 0x29
+           && i.tm.opcode_space == SPACE_0F38
+           && i.operands == i.reg_operands
+           && i.op[0].regs == i.op[1].regs
+           && (!i.tm.opcode_modifier.vex
+               || !(i.op[0].regs->reg_flags & RegRex))
+           && !is_evex_encoding (&i.tm))
+    {
+      /* Optimize: -Os:
+         pcmpeqq %xmmN, %xmmN          -> pcmpeqd %xmmN, %xmmN
+         vpcmpeqq %xmmN, %xmmN, %xmmM  -> vpcmpeqd %xmmN, %xmmN, %xmmM (N < 8)
+         vpcmpeqq %ymmN, %ymmN, %ymmM  -> vpcmpeqd %ymmN, %ymmN, %ymmM (N < 8)
+       */
+      i.tm.opcode_space = SPACE_0F;
+      i.tm.base_opcode = 0x76;
+    }
 }
 
 /* Return non-zero for load instruction.  */
--- a/gas/testsuite/gas/i386/optimize-2.d
+++ b/gas/testsuite/gas/i386/optimize-2.d
@@ -161,4 +161,7 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq \(%eax\)\{1to2\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxord \(%eax\)\{1to4\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxorq \(%eax\)\{1to4\},%ymm2,%ymm3
+ +[a-f0-9]+: 66 .* pcmpeqd %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpcmpeqd %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpcmpeqd %ymm2,%ymm2,%ymm0
 #pass
--- a/gas/testsuite/gas/i386/optimize-2.s
+++ b/gas/testsuite/gas/i386/optimize-2.s
@@ -180,3 +180,7 @@ _start:
 	vporq (%eax){1to2}, %xmm2, %xmm3
 	vpxord (%eax){1to4}, %xmm2, %xmm3
 	vpxorq (%eax){1to4}, %ymm2, %ymm3
+
+	pcmpeqq %xmm2, %xmm2
+	vpcmpeqq %xmm2, %xmm2, %xmm0
+	vpcmpeqq %ymm2, %ymm2, %ymm0
--- a/gas/testsuite/gas/i386/optimize-2b.d
+++ b/gas/testsuite/gas/i386/optimize-2b.d
@@ -162,4 +162,7 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq \(%eax\)\{1to2\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxord \(%eax\)\{1to4\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxorq \(%eax\)\{1to4\},%ymm2,%ymm3
+ +[a-f0-9]+: 66 .* pcmpeqq %xmm2,%xmm2
+ +[a-f0-9]+: c4 .* vpcmpeqq %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c4 .* vpcmpeqq %ymm2,%ymm2,%ymm0
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-3.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.d
@@ -199,4 +199,10 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq \(%rax\)\{1to2\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxord \(%rax\)\{1to4\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxorq \(%rax\)\{1to4\},%ymm2,%ymm3
+ +[a-f0-9]+: 66 .* pcmpeqd %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpcmpeqd %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpcmpeqd %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pcmpeqd %xmm12,%xmm12
+ +[a-f0-9]+: c4 .* vpcmpeqq %xmm12,%xmm12,%xmm0
+ +[a-f0-9]+: c4 .* vpcmpeqq %ymm12,%ymm12,%ymm0
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-3.s
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.s
@@ -221,3 +221,11 @@ _start:
 	vporq (%rax){1to2}, %xmm2, %xmm3
 	vpxord (%rax){1to4}, %xmm2, %xmm3
 	vpxorq (%rax){1to4}, %ymm2, %ymm3
+
+	pcmpeqq %xmm2, %xmm2
+	vpcmpeqq %xmm2, %xmm2, %xmm0
+	vpcmpeqq %ymm2, %ymm2, %ymm0
+
+	pcmpeqq %xmm12, %xmm12
+	vpcmpeqq %xmm12, %xmm12, %xmm0
+	vpcmpeqq %ymm12, %ymm12, %ymm0
--- a/gas/testsuite/gas/i386/x86-64-optimize-3b.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3b.d
@@ -200,4 +200,10 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq \(%rax\)\{1to2\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxord \(%rax\)\{1to4\},%xmm2,%xmm3
  +[a-f0-9]+: 62 .* vpxorq \(%rax\)\{1to4\},%ymm2,%ymm3
+ +[a-f0-9]+: 66 .* pcmpeqq %xmm2,%xmm2
+ +[a-f0-9]+: c4 .* vpcmpeqq %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c4 .* vpcmpeqq %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pcmpeqq %xmm12,%xmm12
+ +[a-f0-9]+: c4 .* vpcmpeqq %xmm12,%xmm12,%xmm0
+ +[a-f0-9]+: c4 .* vpcmpeqq %ymm12,%ymm12,%ymm0
 #pass
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1363,7 +1363,7 @@ pblendvb, 0x664c, AVX, Modrm|Vex128|Spac
 pblendvb, 0x660f3810, SSE4_1, Modrm|NoSuf, { Acc|Xmmword, RegXMM|Unspecified|BaseIndex, RegXMM }
 pblendvb, 0x660f3810, SSE4_1, Modrm|NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pblendw, 0x660f3a0e, , Modrm|||NoSuf, { Imm8|Imm8S, RegXMM|Unspecified|BaseIndex, RegXMM }
-pcmpeqq, 0x660f3829, , Modrm|||NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM }
+pcmpeqq, 0x660f3829, , Modrm|||NoSuf|Optimize, { RegXMM|Unspecified|BaseIndex, RegXMM }
 pextr, 0x660f3a14 | , , RegMem||NoSuf|IgnoreSize|NoRex64, { Imm8, RegXMM, Reg32|Reg64 }
 pextr, 0x660f3a14 | , , Modrm||NoSuf, { Imm8, RegXMM, |Unspecified|BaseIndex }
 pextrd, 0x660f3a16, , Modrm||NoSuf|IgnoreSize, { Imm8, RegXMM, Reg32|Unspecified|BaseIndex }
@@ -1592,7 +1592,7 @@ vpblendvb, 0x664c, AVX|AVX2, Modrm|Vex|S
 vpblendw, 0x660e, AVX|AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpcmpeq, 0x6674 | , AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpcmpeqd, 0x6676, AVX|AVX2, Modrm|C|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
-vpcmpeqq, 0x6629, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
+vpcmpeqq, 0x6629, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpcmpestri, 0x6661, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpestrm, 0x6660, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }

From patchwork Fri Jun 16 07:31:12 2023
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 108896
Message-ID: <7141b586-9711-aef0-7f28-5d4489478f1b@suse.com>
Date: Fri, 16 Jun 2023 09:31:12 +0200
Subject: [PATCH 2/4] x86: optimize pre-AVX512 {,V}PCMPGT* with identical sources
To: Binutils
Cc: "H.J. Lu"
From: Jan Beulich

These are better expressed by the zeroing idiom {,V}PXOR. In some cases
this also results in a shorter encoding.
---
Thoughts towards doing the same for {,V}PSUB{,U}S{B,W,D,Q}, anyone?

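For illustration only (not part of the patch): a small sketch of the two
facts the transformation relies on. First, a signed "greater than" compare
of a value with itself is never true, so every lane of the result is zero,
which is the same result XOR-ing a register with itself gives. Second, since
the result no longer depends on the source, a source register in the 8..15
range can be swapped for register 0, which permits the 2-byte VEX prefix.
The struct layout, the REG_REX flag and the table below are simplified
assumptions modelled on how the patch uses reg_num and RegRex; they are not
the real gas definitions.

```c
#include <assert.h>
#include <stdint.h>

#define REG_REX 1u /* stand-in for gas's RegRex flag (assumption) */

/* Simplified register entry: reg_num holds the low three bits of the
   register number, REG_REX marks registers 8..15.  */
struct reg_entry
{
  const char *name;
  unsigned int reg_flags;
  unsigned int reg_num;
};

/* Assumed contiguous and ordered, as the pointer arithmetic requires.  */
static const struct reg_entry xmm_regs[16] = {
  { "xmm0", 0, 0 },        { "xmm1", 0, 1 },
  { "xmm2", 0, 2 },        { "xmm3", 0, 3 },
  { "xmm4", 0, 4 },        { "xmm5", 0, 5 },
  { "xmm6", 0, 6 },        { "xmm7", 0, 7 },
  { "xmm8", REG_REX, 0 },  { "xmm9", REG_REX, 1 },
  { "xmm10", REG_REX, 2 }, { "xmm11", REG_REX, 3 },
  { "xmm12", REG_REX, 4 }, { "xmm13", REG_REX, 5 },
  { "xmm14", REG_REX, 6 }, { "xmm15", REG_REX, 7 },
};

/* One byte lane of PCMPGTB: all-ones if a > b (signed), else zero.  */
int8_t
pcmpgtb_lane (int8_t a, int8_t b)
{
  return a > b ? -1 : 0;
}

/* Mirrors the patch's "i.op[0].regs -= i.op[0].regs->reg_num + 8":
   step back from a register in 8..15 to register 0.  */
const struct reg_entry *
low_source (const struct reg_entry *r)
{
  if (r->reg_flags & REG_REX)
    r -= r->reg_num + 8;
  return r;
}

void
self_check (void)
{
  /* x > x is false for every byte value, and x ^ x is zero as well.  */
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    {
      int8_t x = (int8_t) v;
      assert (pcmpgtb_lane (x, x) == 0);
      assert ((int8_t) (x ^ x) == 0);
    }

  /* %xmm12 is replaced by %xmm0, %xmm2 is left alone.  */
  assert (low_source (&xmm_regs[12]) == &xmm_regs[0]);
  assert (low_source (&xmm_regs[2]) == &xmm_regs[2]);
}
```

This matches what the x86-64 tests expect: vpcmpgtb %ymm12, %ymm12, %ymm1
becomes vpxor %ymm0, %ymm0, %ymm1, while the legacy pcmpgtb %xmm12, %xmm12
keeps its register and becomes pxor %xmm12, %xmm12.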
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4580,6 +4580,46 @@ optimize_encoding (void)
       i.tm.opcode_space = SPACE_0F;
       i.tm.base_opcode = 0x76;
     }
+  else if (((i.tm.base_opcode >= 0x64
+             && i.tm.base_opcode <= 0x66
+             && i.tm.opcode_space == SPACE_0F)
+            || (i.tm.base_opcode == 0x37
+                && i.tm.opcode_space == SPACE_0F38))
+           && i.operands == i.reg_operands
+           && i.op[0].regs == i.op[1].regs
+           && !is_evex_encoding (&i.tm))
+    {
+      /* Optimize: -O:
+         pcmpgt[bwd] %mmN, %mmN             -> pxor %mmN, %mmN
+         pcmpgt[bwdq] %xmmN, %xmmN          -> pxor %xmmN, %xmmN
+         vpcmpgt[bwdq] %xmmN, %xmmN, %xmmM  -> vpxor %xmmN, %xmmN, %xmmM (N < 8)
+         vpcmpgt[bwdq] %xmmN, %xmmN, %xmmM  -> vpxor %xmm0, %xmm0, %xmmM (N > 7)
+         vpcmpgt[bwdq] %ymmN, %ymmN, %ymmM  -> vpxor %ymmN, %ymmN, %ymmM (N < 8)
+         vpcmpgt[bwdq] %ymmN, %ymmN, %ymmM  -> vpxor %ymm0, %ymm0, %ymmM (N > 7)
+       */
+      i.tm.opcode_space = SPACE_0F;
+      i.tm.base_opcode = 0xef;
+      if (i.tm.opcode_modifier.vex && (i.op[0].regs->reg_flags & RegRex))
+        {
+          if (i.operands == 2)
+            {
+              gas_assert (i.tm.opcode_modifier.sse2avx);
+
+              i.operands = 3;
+              i.reg_operands = 3;
+              i.tm.operands = 3;
+
+              i.op[2].regs = i.op[0].regs;
+              i.types[2] = i.types[0];
+              i.flags[2] = i.flags[0];
+              i.tm.operand_types[2] = i.tm.operand_types[0];
+
+              i.tm.opcode_modifier.sse2avx = 0;
+            }
+          i.op[0].regs -= i.op[0].regs->reg_num + 8;
+          i.op[1].regs = i.op[0].regs;
+        }
+    }
 }
 
 /* Return non-zero for load instruction.  */
--- a/gas/testsuite/gas/i386/optimize-1.d
+++ b/gas/testsuite/gas/i386/optimize-1.d
@@ -147,6 +147,21 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
  +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax
  +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax
  +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax
--- a/gas/testsuite/gas/i386/optimize-1.s
+++ b/gas/testsuite/gas/i386/optimize-1.s
@@ -171,6 +171,25 @@ _start:
 	vpxord 128(%eax), %ymm2, %ymm3
 	vpxorq 128(%eax), %ymm2, %ymm3
 
+	pcmpgtb %mm2, %mm2
+	pcmpgtb %xmm2, %xmm2
+	vpcmpgtb %xmm2, %xmm2, %xmm0
+	vpcmpgtb %ymm2, %ymm2, %ymm0
+
+	pcmpgtw %mm2, %mm2
+	pcmpgtw %xmm2, %xmm2
+	vpcmpgtw %xmm2, %xmm2, %xmm0
+	vpcmpgtw %ymm2, %ymm2, %ymm0
+
+	pcmpgtd %mm2, %mm2
+	pcmpgtd %xmm2, %xmm2
+	vpcmpgtd %xmm2, %xmm2, %xmm0
+	vpcmpgtd %ymm2, %ymm2, %ymm0
+
+	pcmpgtq %xmm2, %xmm2
+	vpcmpgtq %xmm2, %xmm2, %xmm0
+	vpcmpgtq %ymm2, %ymm2, %ymm0
+
 	bt $15, %ax
 	bt $16, %ax
 	btc $15, %ax
--- a/gas/testsuite/gas/i386/optimize-1a.d
+++ b/gas/testsuite/gas/i386/optimize-1a.d
@@ -148,6 +148,21 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
  +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax
  +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax
  +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax
--- a/gas/testsuite/gas/i386/optimize-4.d
+++ b/gas/testsuite/gas/i386/optimize-4.d
@@ -147,6 +147,21 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
  +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax
  +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax
  +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax
--- a/gas/testsuite/gas/i386/optimize-5.d
+++ b/gas/testsuite/gas/i386/optimize-5.d
@@ -147,6 +147,21 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0
  +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax
  +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax
  +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax
--- a/gas/testsuite/gas/i386/x86-64.exp
+++ b/gas/testsuite/gas/i386/x86-64.exp
@@ -520,6 +520,7 @@ run_dump_test "x86-64-optimize-1"
 run_dump_test "x86-64-optimize-2"
 run_dump_test "x86-64-optimize-2a"
 run_dump_test "x86-64-optimize-2b"
+run_dump_test "x86-64-optimize-2c"
 run_dump_test "x86-64-optimize-3"
 run_dump_test "x86-64-optimize-3b"
 run_dump_test "x86-64-optimize-4"
--- a/gas/testsuite/gas/i386/x86-64-optimize-2.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2.d
@@ -203,4 +203,23 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-2.s
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2.s
@@ -226,3 +226,26 @@ _start:
 	vporq 128(%rax), %ymm2, %ymm3
 	vpxord 128(%rax), %ymm2, %ymm3
 	vpxorq 128(%rax), %ymm2, %ymm3
+
+	pcmpgtb %mm2, %mm2
+	pcmpgtb %xmm2, %xmm2
+	pcmpgtb %xmm12, %xmm12
+	vpcmpgtb %xmm2, %xmm2, %xmm8
+	vpcmpgtb %ymm12, %ymm12, %ymm1
+
+	pcmpgtw %mm2, %mm2
+	pcmpgtw %xmm2, %xmm2
+	pcmpgtw %xmm12, %xmm12
+	vpcmpgtw %xmm2, %xmm2, %xmm8
+	vpcmpgtw %ymm12, %ymm12, %ymm1
+
+	pcmpgtd %mm2, %mm2
+	pcmpgtd %xmm2, %xmm2
+	pcmpgtd %xmm12, %xmm12
+	vpcmpgtd %xmm2, %xmm2, %xmm8
+	vpcmpgtd %ymm12, %ymm12, %ymm1
+
+	pcmpgtq %xmm2, %xmm2
+	pcmpgtq %xmm12, %xmm12
+	vpcmpgtq %xmm2, %xmm2, %xmm8
+	vpcmpgtq %ymm12, %ymm12, %ymm1
--- a/gas/testsuite/gas/i386/x86-64-optimize-2a.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2a.d
@@ -204,4 +204,23 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-2b.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2b.d
@@ -203,4 +203,23 @@ Disassembly of section .text:
  +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3
  +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 0f .* pxor %mm2,%mm2
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
+ +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2
+ +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12
+ +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8
+ +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1
 #pass
--- /dev/null
+++ b/gas/testsuite/gas/i386/x86-64-optimize-2c.d
@@ -0,0 +1,226 @@
+#source: x86-64-optimize-2.s
+#as: -O -msse2avx
+#objdump: -drw
+#name: x86-64 optimized encoding 2c with -O and SSE2AVX
+
+.*: +file format .*
+
+
+Disassembly of section .text:
+
+0+ <_start>:
+ +[a-f0-9]+: 62 71 f5 4f 55 f9 vandnpd %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: 62 e1 f5 48 55 c1 vandnpd %zmm1,%zmm1,%zmm16
+ +[a-f0-9]+: 62 e1 f5 28 55 c1 vandnpd %ymm1,%ymm1,%ymm16
+ +[a-f0-9]+: 62 b1 f5 40 55 c9 vandnpd %zmm17,%zmm17,%zmm1
+ +[a-f0-9]+: 62 b1 f5 20 55 c9 vandnpd %ymm17,%ymm17,%ymm1
+ +[a-f0-9]+: 62 71 74 4f 55 f9 vandnps %zmm1,%zmm1,%zmm15\{%k7\}
+ +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15
+ +[a-f0-9]+: 62 e1
74 48 55 c1 vandnps %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 74 28 55 c1 vandnps %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 74 40 55 c9 vandnps %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 74 20 55 c9 vandnps %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 71 75 4f df f9 vpandnd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 df c1 vpandnd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 df c1 vpandnd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 df c9 vpandnd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 df c9 vpandnd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f df f9 vpandnq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 df c1 vpandnq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 df c1 vpandnq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 df c9 vpandnq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 df c9 vpandnq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f 57 f9 vxorpd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 57 c1 vxorpd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 57 c1 vxorpd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 57 c9 vxorpd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 57 c9 vxorpd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 74 4f 57 f9 vxorps %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 74 48 57 c1 vxorps %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 74 28 57 c1 vxorps %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 
b1 74 40 57 c9 vxorps %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 74 20 57 c9 vxorps %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 71 75 4f ef f9 vpxord %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 ef c1 vpxord %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 ef c1 vpxord %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 ef c9 vpxord %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 ef c9 vpxord %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f ef f9 vpxorq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 ef c1 vpxorq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 ef c1 vpxorq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 ef c9 vpxorq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 ef c9 vpxorq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f f8 f9 vpsubb %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 f8 c1 vpsubb %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 f8 c1 vpsubb %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 f8 c9 vpsubb %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 f8 c9 vpsubb %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f f9 f9 vpsubw %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 f9 c1 vpsubw %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 f9 c1 vpsubw %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 f9 c9 vpsubw %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 f9 c9 vpsubw %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f fa f9 
vpsubd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 fa c1 vpsubd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 fa c1 vpsubd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 fa c9 vpsubd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 fa c9 vpsubd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f fb f9 vpsubq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 fb c1 vpsubq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 fb c1 vpsubq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 fb c9 vpsubq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 fb c9 vpsubq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: 62 f1 7d 08 7f 48 08 vmovdqa32 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fd 08 7f 48 08 vmovdqa64 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7f 08 7f 48 08 vmovdqu8 %xmm1,0x80\(%rax\) + 
+[a-f0-9]+: 62 f1 ff 08 7f 48 08 vmovdqu16 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7e 08 7f 48 08 vmovdqu32 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fe 08 7f 48 08 vmovdqu64 %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: 62 f1 7d 28 7f 48 04 vmovdqa32 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fd 28 7f 48 04 vmovdqa64 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7f 28 7f 48 04 vmovdqu8 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 ff 28 7f 48 04 vmovdqu16 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7e 28 7f 48 04 vmovdqu32 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fe 28 7f 48 04 vmovdqu64 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7d 48 6f 10 vmovdqa32 \(%rax\),%zmm2 + +[a-f0-9]+: c5 .* vpand %xmm2,%xmm3,%xmm4 + +[a-f0-9]+: c4 .* vpand %xmm12,%xmm3,%xmm4 + +[a-f0-9]+: c5 .* vpandn %xmm2,%xmm13,%xmm4 + +[a-f0-9]+: c5 .* vpandn %xmm2,%xmm3,%xmm14 + +[a-f0-9]+: c5 .* vpor %xmm2,%xmm3,%xmm4 + +[a-f0-9]+: c4 .* vpor %xmm12,%xmm3,%xmm4 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm13,%xmm4 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm3,%xmm14 + +[a-f0-9]+: c5 .* vpand %ymm2,%ymm3,%ymm4 + +[a-f0-9]+: c4 .* vpand %ymm12,%ymm3,%ymm4 + 
+[a-f0-9]+: c5 .* vpandn %ymm2,%ymm13,%ymm4 + +[a-f0-9]+: c5 .* vpandn %ymm2,%ymm3,%ymm14 + +[a-f0-9]+: c5 .* vpor %ymm2,%ymm3,%ymm4 + +[a-f0-9]+: c4 .* vpor %ymm12,%ymm3,%ymm4 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm13,%ymm4 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm3,%ymm14 + +[a-f0-9]+: c5 .* vpand 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpand 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpandn 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpandn 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpxor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpxor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandd 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandnd 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandnq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpord 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpand 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpand 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpandn 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpandn 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpxor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpxor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandd 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandnd 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandnq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpord 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + 
+[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +#pass --- a/gas/testsuite/gas/i386/x86-64-optimize-5.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-5.d @@ -203,6 +203,25 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 --- 
a/gas/testsuite/gas/i386/x86-64-optimize-6.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-6.d @@ -203,6 +203,25 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -1025,8 +1025,8 @@ pand, 0x0fdb, , M pandn, 0x0fdf, , Modrm||NoSuf, { ||Unspecified|BaseIndex, } pcmpeq, 0x0f74 | , , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pcmpeqd, 0x0f76, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } -pcmpgt, 0x0f64 | , , Modrm||NoSuf, { ||Unspecified|BaseIndex, } -pcmpgtd, 0x0f66, , Modrm||NoSuf, { ||Unspecified|BaseIndex, } +pcmpgt, 0x0f64 | , , Modrm||NoSuf|Optimize, { ||Unspecified|BaseIndex, } +pcmpgtd, 0x0f66, , Modrm||NoSuf|Optimize, { ||Unspecified|BaseIndex, } pmaddwd, 0x0ff5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pmulhw, 0x0fe5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pmullw, 0x0fd5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } @@ -1405,7 
+1405,7 @@ rounds, 0x660f3a0a | -pcmpgtq, 0x660f3837, , Modrm|||NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM } +pcmpgtq, 0x660f3837, , Modrm|||NoSuf|Optimize, { RegXMM|Unspecified|BaseIndex, RegXMM } pcmpestri, 0x660f3a61, |No64, Modrm||NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM } pcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } pcmpestri, 0x660f3a61, SSE4_2|x64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } @@ -1597,9 +1597,9 @@ vpcmpestri, 0x6661, AVX|No64, Modrm|Vex| vpcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpestrm, 0x6660, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpestrm, 0x6660, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpcmpgt, 0x6664 | , AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgt, 0x6664 | , AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } 
 vpcmpistri, 0x6663, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vpcmpistrm, 0x6662, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM }
 vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }

From patchwork Fri Jun 16 07:31:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 108899
Message-ID: <3e1b884e-7312-8546-ebdc-ac513a199858@suse.com>
Date: Fri, 16 Jun 2023 09:31:41 +0200
Subject: [PATCH 3/4] x86: optimize 128-bit VPBROADCASTQ to VPUNPCKLQDQ
To: Binutils
Cc: "H.J. Lu"
From: Jan Beulich
List-Id: Binutils mailing list

The alternative is 1 byte shorter when the source is %xmm0-7, as a 2-byte
VEX prefix can then be used.
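For illustration only, the size argument above can be sketched in C. This
is not code from gas: vex_prefix_len(), vex_reg_insn_len() and the two
wrappers are hypothetical helpers, and the sketch assumes the usual VEX
rule that the 2-byte (0xc5) form is available only for opcode map 0F with
VEX.W clear and neither the X nor the B register extension needed.

```c
#include <stdbool.h>

/* Opcode maps selectable through a VEX prefix.  */
enum vex_map { MAP_0F = 1, MAP_0F38 = 2, MAP_0F3A = 3 };

/* Hypothetical helper, not part of gas: size in bytes of the VEX prefix.
   The 2-byte form (0xc5) carries only R, vvvv, L and pp, so it requires
   map 0F, W == 0 and no X/B extension.  Everything else needs the
   3-byte form (0xc4).  */
static unsigned int
vex_prefix_len (enum vex_map map, bool w, bool ext_x, bool ext_b)
{
  return (map == MAP_0F && !w && !ext_x && !ext_b) ? 2 : 3;
}

/* Length of a register-only VEX insn: prefix + opcode byte + ModRM byte.
   RM_REG is the number of the register encoded in ModRM.rm; registers
   8-15 there need the B extension.  A register-only form has no index
   register, so X is never needed.  */
static unsigned int
vex_reg_insn_len (enum vex_map map, bool w, unsigned int rm_reg)
{
  return vex_prefix_len (map, w, false, rm_reg >= 8) + 1 + 1;
}

/* vpbroadcastq %xmmN, %xmmM lives in map 0F38 (opcode 0x59).  */
static unsigned int
vpbroadcastq_len (unsigned int src)
{
  return vex_reg_insn_len (MAP_0F38, false, src);
}

/* vpunpcklqdq %xmmN, %xmmN, %xmmM lives in map 0F (opcode 0x6c);
   %xmmN ends up in both ModRM.rm and VEX.vvvv.  */
static unsigned int
vpunpcklqdq_len (unsigned int src)
{
  return vex_reg_insn_len (MAP_0F, false, src);
}
```

Under these assumptions vpbroadcastq_len (2) is 5 while vpunpcklqdq_len (2)
is 4, and for %xmm12 both come out at 5, which is why the transformation is
only worthwhile for N < 8.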
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -4620,6 +4620,33 @@ optimize_encoding (void)
 	  i.op[1].regs = i.op[0].regs;
 	}
     }
+  else if (optimize_for_space
+	   && i.tm.base_opcode == 0x59
+	   && i.tm.opcode_space == SPACE_0F38
+	   && i.operands == i.reg_operands
+	   && i.tm.opcode_modifier.vex
+	   && !(i.op[0].regs->reg_flags & RegRex)
+	   && i.op[0].regs->reg_type.bitfield.xmmword
+	   && i.vec_encoding != vex_encoding_vex3)
+    {
+      /* Optimize: -Os:
+         vpbroadcastq %xmmN, %xmmM  -> vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8)
+       */
+      i.tm.opcode_space = SPACE_0F;
+      i.tm.base_opcode = 0x6c;
+      i.tm.opcode_modifier.vexvvvv = 1;
+
+      ++i.operands;
+      ++i.reg_operands;
+      ++i.tm.operands;
+
+      i.op[2].regs = i.op[0].regs;
+      i.types[2] = i.types[0];
+      i.flags[2] = i.flags[0];
+      i.tm.operand_types[2] = i.tm.operand_types[0];
+
+      swap_2_operands (1, 2);
+    }
 }
 
 /* Return non-zero for load instruction.  */
--- a/gas/testsuite/gas/i386/optimize-2.d
+++ b/gas/testsuite/gas/i386/optimize-2.d
@@ -164,4 +164,5 @@ Disassembly of section .text:
  +[a-f0-9]+: 66 .* pcmpeqd %xmm2,%xmm2
  +[a-f0-9]+: c5 .* vpcmpeqd %xmm2,%xmm2,%xmm0
  +[a-f0-9]+: c5 .* vpcmpeqd %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: c5 .* vpunpcklqdq %xmm2,%xmm2,%xmm0
 #pass
--- a/gas/testsuite/gas/i386/optimize-2.s
+++ b/gas/testsuite/gas/i386/optimize-2.s
@@ -184,3 +184,5 @@ _start:
 	pcmpeqq %xmm2, %xmm2
 	vpcmpeqq %xmm2, %xmm2, %xmm0
 	vpcmpeqq %ymm2, %ymm2, %ymm0
+
+	vpbroadcastq %xmm2, %xmm0
--- a/gas/testsuite/gas/i386/optimize-2b.d
+++ b/gas/testsuite/gas/i386/optimize-2b.d
@@ -165,4 +165,5 @@ Disassembly of section .text:
  +[a-f0-9]+: 66 .* pcmpeqq %xmm2,%xmm2
  +[a-f0-9]+: c4 .* vpcmpeqq %xmm2,%xmm2,%xmm0
  +[a-f0-9]+: c4 .* vpcmpeqq %ymm2,%ymm2,%ymm0
+ +[a-f0-9]+: c4 .* vpbroadcastq %xmm2,%xmm0
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-3.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.d
@@ -205,4 +205,6 @@ Disassembly of section .text:
  +[a-f0-9]+: 66 .* pcmpeqd %xmm12,%xmm12
  +[a-f0-9]+: c4 .* vpcmpeqq %xmm12,%xmm12,%xmm0
  +[a-f0-9]+: c4 .* vpcmpeqq %ymm12,%ymm12,%ymm0
+ +[a-f0-9]+: c5 .* vpunpcklqdq %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+: c4 .* vpbroadcastq %xmm12,%xmm0
 #pass
--- a/gas/testsuite/gas/i386/x86-64-optimize-3.s
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3.s
@@ -229,3 +229,6 @@ _start:
 	pcmpeqq %xmm12, %xmm12
 	vpcmpeqq %xmm12, %xmm12, %xmm0
 	vpcmpeqq %ymm12, %ymm12, %ymm0
+
+	vpbroadcastq %xmm2, %xmm0
+	vpbroadcastq %xmm12, %xmm0
--- a/gas/testsuite/gas/i386/x86-64-optimize-3b.d
+++ b/gas/testsuite/gas/i386/x86-64-optimize-3b.d
@@ -206,4 +206,6 @@ Disassembly of section .text:
  +[a-f0-9]+: 66 .* pcmpeqq %xmm12,%xmm12
  +[a-f0-9]+: c4 .* vpcmpeqq %xmm12,%xmm12,%xmm0
  +[a-f0-9]+: c4 .* vpcmpeqq %ymm12,%ymm12,%ymm0
+ +[a-f0-9]+: c4 .* vpbroadcastq %xmm2,%xmm0
+ +[a-f0-9]+: c4 .* vpbroadcastq %xmm12,%xmm0
 #pass
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1734,7 +1734,7 @@ vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|
 vbroadcastss, 0x6618, AVX2, Modrm|Vex|Space0F38|VexW=1|NoSuf, { RegXMM, RegXMM|RegYMM }
 vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpbroadcast, 0x6678 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
-vpbroadcast, 0x6658 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
+vpbroadcast, 0x6658 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf|Optimize, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
 vperm2i128, 0x6646, AVX2, Modrm|Vex=2|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpermd, 0x6636, AVX2, Modrm|Vex256|Space0F38|VexVVVV|VexW0|NoSuf, { Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }
 vpermpd, 0x6601, AVX2, Modrm|Vex=2|Space0F3A|VexW1|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM }

From patchwork Fri Jun 16 07:32:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 108900 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1149765vqr; Fri, 16 Jun 2023 00:33:50 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ48g0uf8sQgWTmWx5jXbc5OA+KFlRSnm3GC9gnIYBg5XgBP7++efdyeAKYhvqQ0uXUkl8zd X-Received: by 2002:a17:907:2da4:b0:974:62d7:1467 with SMTP id gt36-20020a1709072da400b0097462d71467mr1102329ejc.5.1686900830176; Fri, 16 Jun 2023 00:33:50 -0700 (PDT) Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id gg23-20020a170906e29700b009821ad6ad90si6045004ejb.555.2023.06.16.00.33.49 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 16 Jun 2023 00:33:50 -0700 (PDT) Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=oOQfeATM; arc=fail (signature failed); spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AC37E3853D1F for ; Fri, 16 Jun 2023 07:32:57 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AC37E3853D1F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1686900777; bh=9+Xx/2l1HF538RHagGizVVDbokFaQnqDRHkJAfCC33Y=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=oOQfeATMyhYaklg7AfEg7x5pFBmJ63C/KKkQuKzGpUOJJVZ3X2FaegEeKQldE3fXT UUYAQUL5WknM1z4km7R2KLVYTSvf2MHxykR1LmggfkK2Sh7MJwpce8aw5ggJ3VRcH0 
1CeEKZRq2TJ1Uymbsf/ClXXUFyupHXZYOtwwHowA= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from EUR03-DBA-obe.outbound.protection.outlook.com (mail-dbaeur03on2058.outbound.protection.outlook.com [40.107.104.58]) by sourceware.org (Postfix) with ESMTPS id 65AD638313A6 for ; Fri, 16 Jun 2023 07:32:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 65AD638313A6 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=K7LH0zO57G+6YjbVrqufKNSoCx1ETR+BtGhfdqoiSZERlRk/wJ7aQk+25yAa4sVVTses8LHr3WioqJ4wcxxh9csMbjy2KNZkKBNhHg+YJw/MugnwX4U7Pj2EkNIMs1EG5xttoUUsOsEyCL7hy2RmvjR7VrKMnG3yNTnORiOXPrEitFGqEnbnDNMObukVm7rQqtyV68U+CIrzdKXWYuN1n2gBaw1DC60HJO3h2kJZB8Nsl7BGj6mdp3DN+WbjznBVwiH6mTgUbV8DmtPaWleKUdEINt+3/XplFt+ayth51xdq7OhXbReKu8Ts65vfWjrG0vFui6f/LAi0nCyOXWM0rQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=9+Xx/2l1HF538RHagGizVVDbokFaQnqDRHkJAfCC33Y=; b=Zolb8RNCeayPRlX67+Q8fxKO+YwBD89YzkvE/v5sZU82Q+cKJHO4LW1+mCxKCsGfOuHsXXntCrkTAC3SLl1MUqexxm2uKVo5gmlXJGIOuEUYIhqD1g3vfigK9uCyZ3942MmVjRNTRoFgSiDmqrhlvmBiiUZCJavjOUugUUqyWg1Ikqxjb+CpaDUhFh/YZoBXPnzzrQhQnNOcrm+vC31qlnXDsK925ed03467PV1LcHrV29zVA/5mW9XwfsrTPWH3ttzTEmpxgg+wusyFg0W+/xqClG3olvpG+8X2li0IgqXeCnpLLw8WA7gWGy3c1voC+PcfcWBRRaHePvznE/auQA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by AM0PR04MB6913.eurprd04.prod.outlook.com (2603:10a6:208:184::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6500.29; Fri, 16 Jun 2023 07:32:42 +0000 Received: from 
VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c%5]) with mapi id 15.20.6455.039; Fri, 16 Jun 2023 07:32:42 +0000 Message-ID: <08bf9dc9-5616-7dce-a094-d2ea799c92bf@suse.com> Date: Fri, 16 Jun 2023 09:32:40 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: [PATCH 4/4] x86: provide a 128-bit VBROADCASTSD pseudo Content-Language: en-US To: Binutils Cc: "H.J. Lu" References: In-Reply-To: X-ClientProxiedBy: FR0P281CA0127.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:97::8) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: VE1PR04MB6560:EE_|AM0PR04MB6913:EE_ X-MS-Office365-Filtering-Correlation-Id: ca9e4487-803f-4cc1-e7f2-08db6e3bde66 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: qpDxDtzMBZDKMm5h5dR9nzUBzW+UiqtODKX9vCUZ5B1WvSLLnTGe6IPl8Orq7Cs0N5W2DzhnVk06SRWCtI5PScGlyAOSf5izFgTSVRc5kQObU7YNlmxdTsmoAfwvQs1PeJrA03WlYvzXLdVaQ7e5UtIAm1LlONp5aIquzk73D21seTtzxmxXTTr31yrzUNvZJf58Trc1YgJVGHi5HU8zfcz2TS/4AlcZKNjF62vY51vIqvptngFlB8cK+r4YLfQqfWyRNasvNy/41+kZgU/VzRVFglDC13x3qDazmB/dNM3Bcgqlu6s+cXMcaHteRTk2L2MKhOeyfL6McDCmBiD3igRMQCMGIzFMoSAELVLnbaNvGKoPakm+e4yL/TW+GtdWj1Move2CpQAY4Lj2+hXm3ztmbm2X/CSYqQA8Afyme7zR7fYWEha8H3hLBzea2U36yyaQfc3AcstvkS4dKTd+jwtcnC5IrMd0lBCO6orJVjO9ffKgHr6mbhdvSvbcTUsWCYzX0WPRYZr57PGKofmwKhyYjpc/5+VDIbB/8VTV8ph7HGSxyTzMb5mR02MhS/RCC6iT/SJONgGA/HzsMP23ftIgGVZEOkF+dOvABz/4ibuFf5Ky5z5LgQhM41O8ZlvNn4mBlvBShvNkzBkjOoCy0g== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VE1PR04MB6560.eurprd04.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(13230028)(376002)(396003)(366004)(136003)(346002)(39860400002)(451199021)(478600001)(6486002)(5660300002)(41300700001)(8936002)(8676002)(2906002)(30864003)(36756003)(86362001)(31696002)(38100700002)(66476007)(66556008)(66946007)(316002)(6916009)(4326008)(26005)(6512007)(6506007)(186003)(31686004)(83380400001)(2616005)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?q?gIpz98dlYnM85923ZdEPJsan5OKb?= =?utf-8?q?AQC2FLYtlqAFglZqmUg0ThwNeG4AJMFQLE0h0CQIjekkYEQu7wq4riq+v8Kt7ZuVT?= =?utf-8?q?Cutt7AMWNTiYJUjkOoPid5eSPccpDpFFtIIAx6DtCzOUV2dAYnPuqSCIFq8DjAbhG?= =?utf-8?q?d3I5kJAA+K6mRIyA/SE+mby7oYrMNEQ4/nSlPYgyHn3JSlvKqs+3iL05L4tZqq2qq?= =?utf-8?q?1CPOydlHkHngAvL5+7qZ61EO13rXziql0te95PbN5AIUrF68e98Z8AvMEsumjiR3h?= =?utf-8?q?eexa0qkqqnEbf2YTht8uZR0LVbCwtpVqssZzYqTv34B52ufBKTY7obJL4ks5o35XV?= =?utf-8?q?HEdv5Bq+85nXEr5FqXG/zMXURCcJXaCGOlXtEkZGAF+E2N1/ezSBkOIrKwDSgUIut?= =?utf-8?q?3tUdJB0vcJfz7w5ZZZJp5Aug/C19YjIrAXWit70GI9qEdiyyVj42jSbTTv0f6PGxc?= =?utf-8?q?p6sX9NGM0I1NXImV8a1xTBY7gCI5awE5ZuS4RZKnJ6pHCZGHZbVUH0cx47FUu7WBf?= =?utf-8?q?ydOoIQTu7JJuXi0fB7Mjvp5cZz/8om9D1qPLG2TdZ/qPXvCyX3JKW19QWx/8bv//9?= =?utf-8?q?Y8w81q3ZnN+YAM/HKZkrqZkOrbkuz0bm/8grnkHWSpTYON95kT67Ai49fP4vZIdNw?= =?utf-8?q?GK6oE1L3Lrx/nYhq5jjmj92KR2TFSONbUpmumHGcjd+aIm+ouSrXnLS/MPZkaZeEF?= =?utf-8?q?FSXyQV4OMCu0Ia0HGsZ6jch//WTBB1gViff/LvwogMANssaEkIydrsisYobWVJI4e?= =?utf-8?q?AxXnNLOKusl5Pw29ImWBnzULOnYZlmDjELTmyR6IEbgQOSz8iZSmBXQ242ISoWCz2?= =?utf-8?q?g5MBr4HF3sVGDn6D0zJzRj8XZsxtPOeQPXsNYKn5QU2BAReMlauzrhpTLw5Dfq1LH?= =?utf-8?q?8TG09EzHadAzlYZwEopdMp3eYJujdfgTRgOTXf+FuiGishwFM9yz/5L9dyBCDPTSp?= =?utf-8?q?eWFRU1HU8Ru3WLihPvC8qTeJZuKoxKCcA5PEKCmdcXAUYJ5AwdIwqwbtKvTHXTgwl?= =?utf-8?q?ecSH9VpUlAZHKr6/g53JF5dYTgxRgR6J/Nl/J6iXlfbER+gTKdc3nG3nHz06CnrsC?= =?utf-8?q?fgeq0nbINJC1z5PcWLWiLHV425cJvNzD5s2+pXS2pTq0OLGiuSStKo+fcWvMJEDXk?= =?utf-8?q?Y1bz0YQxO2thN+sqnC96nkHZc9thp80FcJAyWnV31YrPv2+VKbXdW6DqwQjy0dOp2?= 
=?utf-8?q?HtBS038fJVMDfxHXAz1mL2p005zJvZEvOzgyino+cgRn2U7hAeVwWZEfzvn2IYcLF?= =?utf-8?q?wW+Iyct8V72Znpr1fUmbDX/upSSsT+hZvE5gd3+h+pNu9bqvtElm4cgIedchouaMP?= =?utf-8?q?XCtrNXtsB/wYbIbG/CiowqNYNJ2Vxm27bXT+GtmbFwzJRt7rUilNdpiC3j+qliv/e?= =?utf-8?q?JX0PID9TJgB+Rfmz3jdgNP4b2Q6kqjQlvCa3XTIKLOLzbPU9g+f9Q+SDcEJD5HEFb?= =?utf-8?q?jFX1tfBmCOLT6cjq74t3kM+/7FioDdTsMjFlarbST2sskxRVBcFQI7YVmuTX0VSwo?= =?utf-8?q?I59OKDy0EYKH?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: ca9e4487-803f-4cc1-e7f2-08db6e3bde66 X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Jun 2023 07:32:42.3573 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: tKZsDJrzjDz9lGu0SZWF3V6pM4ZpuSjstxItVLYWjtF9P8JlpVEalMGQ5dOpgfEhfJcd6tY4Es3e8ALRd46b/w== X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR04MB6913 X-Spam-Status: No, score=-3027.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Beulich via Binutils From: Jan Beulich Reply-To: Jan Beulich Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org Sender: "Binutils" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768843724728924657?= X-GMAIL-MSGID: =?utf-8?q?1768843724728924657?= VBROADCASTSD not supporting 128-bit destinations in any of their 
AVX, AVX2, or AVX512F incarnations is presumably because VMOVDDUP supports precisely this operation. (It is therefore different from e.g. VPBROADCASTQ, which has no exact equivalent.) Still, its absence has led people to use VPBROADCASTQ as a substitute; this could have been avoided if such a pseudo had been supported from the very beginning. Note that the pseudos try to match what the real instructions would have used as closely as possible, i.e. VexW0 instead of VexWIG for the AVX and AVX2 forms, as well as AVX2 in the first place for the register source form. --- As this is the first example of us supplying such a pseudo, this is partly an RFC. On top of that, a further question is whether to indeed have split AVX/AVX2 templates, when in principle one (allowing for both memory and register source) could do. --- a/gas/testsuite/gas/i386/avx.d +++ b/gas/testsuite/gas/i386/avx.d @@ -927,6 +927,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 35 21 vpmovzxdq \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd %xmm4,\(%ecx\) [ ]*[a-f0-9]+: c5 f8 13 21 vmovlps %xmm4,\(%ecx\) @@ -2768,6 +2769,8 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd %xmm4,\(%ecx\) --- a/gas/testsuite/gas/i386/avx.s +++ b/gas/testsuite/gas/i386/avx.s @@ -982,6 +982,7 @@ _start: vucomisd (%ecx),%xmm4 # Tests for op mem64, xmm + vbroadcastsd (%ecx),%xmm4 vmovsd (%ecx),%xmm4 # Tests for op xmm, mem64 @@ -2953,6 +2954,8 @@ _start:
vucomisd xmm4,[ecx] # Tests for op mem64, xmm + vbroadcastsd xmm4,QWORD PTR [ecx] + vbroadcastsd xmm4,[ecx] vmovsd xmm4,QWORD PTR [ecx] vmovsd xmm4,[ecx] --- a/gas/testsuite/gas/i386/avx-16bit.d +++ b/gas/testsuite/gas/i386/avx-16bit.d @@ -928,6 +928,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: 67 c4 e2 79 35 21 vpmovzxdq \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: 67 c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: 67 c5 fb 12 21 vmovddup \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 f9 13 21 vmovlpd %xmm4,\(%ecx\) [ ]*[a-f0-9]+: 67 c5 f8 13 21 vmovlps %xmm4,\(%ecx\) @@ -2769,6 +2770,8 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: 67 c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 f9 2e 21 vucomisd \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: 67 c5 fb 12 21 vmovddup \(%ecx\),%xmm4 +[ ]*[a-f0-9]+: 67 c5 fb 12 21 vmovddup \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 fb 10 21 vmovsd \(%ecx\),%xmm4 [ ]*[a-f0-9]+: 67 c5 f9 13 21 vmovlpd %xmm4,\(%ecx\) --- a/gas/testsuite/gas/i386/avx-intel.d +++ b/gas/testsuite/gas/i386/avx-intel.d @@ -928,6 +928,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 35 21 vpmovzxdq xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd xmm6,xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[ecx\] +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd QWORD PTR \[ecx\],xmm4 [ ]*[a-f0-9]+: c5 f8 13 21 vmovlps QWORD PTR \[ecx\],xmm4 @@ -2769,6 +2770,8 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd xmm6,xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[ecx\] +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR \[ecx\] +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR 
\[ecx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[ecx\] [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd QWORD PTR \[ecx\],xmm4 --- a/gas/testsuite/gas/i386/avx2.d +++ b/gas/testsuite/gas/i386/avx2.d @@ -73,6 +73,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 78 21 vpbroadcastb \(%ecx\),%xmm4 [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb %xmm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%ecx\),%ymm4 +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 5d 8c 31 vpmaskmovd \(%ecx\),%ymm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 4d 8e 21 vpmaskmovd %ymm4,%ymm6,\(%ecx\) @@ -177,5 +178,6 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb %xmm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%ecx\),%ymm4 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%ecx\),%ymm4 +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss %xmm4,%xmm6 #pass --- a/gas/testsuite/gas/i386/avx2.s +++ b/gas/testsuite/gas/i386/avx2.s @@ -114,6 +114,7 @@ _start: vpbroadcastb (%ecx),%ymm4 # Tests for op xmm, xmm + vbroadcastsd %xmm4,%xmm6 vbroadcastss %xmm4,%xmm6 .intel_syntax noprefix @@ -265,4 +266,5 @@ _start: vpbroadcastb ymm4,[ecx] # Tests for op xmm, xmm + vbroadcastsd xmm6,xmm4 vbroadcastss xmm6,xmm4 --- a/gas/testsuite/gas/i386/avx2-intel.d +++ b/gas/testsuite/gas/i386/avx2-intel.d @@ -74,6 +74,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 78 21 vpbroadcastb xmm4,BYTE PTR \[ecx\] [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb ymm6,xmm4 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[ecx\] +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup xmm6,xmm4 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss xmm6,xmm4 [ ]*[a-f0-9]+: c4 e2 5d 8c 31 vpmaskmovd ymm6,ymm4,YMMWORD PTR \[ecx\] [ ]*[a-f0-9]+: c4 e2 4d 8e 21 vpmaskmovd YMMWORD PTR \[ecx\],ymm6,ymm4 @@ -178,5 +179,6 @@ Disassembly of section .text: [ 
]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb ymm6,xmm4 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[ecx\] [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[ecx\] +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup xmm6,xmm4 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss xmm6,xmm4 #pass --- a/gas/testsuite/gas/i386/avx512f_vl.d +++ b/gas/testsuite/gas/i386/avx512f_vl.d @@ -155,6 +155,15 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 00 08 00 00[ ]*vbroadcasti32x4 0x800\(%edx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a 72 80[ ]*vbroadcasti32x4 -0x800\(%edx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 -0x810\(%edx\),%ymm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 31[ ]*vmovddup \(%ecx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 31[ ]*vmovddup \(%ecx\),%xmm6\{%k7\}\{z\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ ]*vmovddup -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 7f[ ]*vmovddup 0x3f8\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 00 04 00 00[ ]*vmovddup 0x400\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 80[ ]*vmovddup -0x400\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 f8 fb ff ff[ ]*vmovddup -0x408\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 f5[ ]*vmovddup %xmm5,%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 f5[ ]*vmovddup %xmm5,%xmm6\{%k7\}\{z\} [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 31[ ]*vbroadcastsd \(%ecx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 fd af 19 31[ ]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}\{z\} [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ ]*vbroadcastsd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\} @@ -5850,6 +5859,15 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 00 08 00 00[ ]*vbroadcasti32x4 0x800\(%edx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a 72 80[ ]*vbroadcasti32x4 -0x800\(%edx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 
-0x810\(%edx\),%ymm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 31[ ]*vmovddup \(%ecx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 31[ ]*vmovddup \(%ecx\),%xmm6\{%k7\}\{z\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ ]*vmovddup -0x1e240\(%esp,%esi,8\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 7f[ ]*vmovddup 0x3f8\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 00 04 00 00[ ]*vmovddup 0x400\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 80[ ]*vmovddup -0x400\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 f8 fb ff ff[ ]*vmovddup -0x408\(%edx\),%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 f5[ ]*vmovddup %xmm5,%xmm6\{%k7\} +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 f5[ ]*vmovddup %xmm5,%xmm6\{%k7\}\{z\} [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 31[ ]*vbroadcastsd \(%ecx\),%ymm6\{%k7\} [ ]*[a-f0-9]+:[ ]*62 f2 fd af 19 31[ ]*vbroadcastsd \(%ecx\),%ymm6\{%k7\}\{z\} [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ ]*vbroadcastsd -0x1e240\(%esp,%esi,8\),%ymm6\{%k7\} --- a/gas/testsuite/gas/i386/avx512f_vl.s +++ b/gas/testsuite/gas/i386/avx512f_vl.s @@ -149,6 +149,15 @@ _start: vbroadcasti32x4 2048(%edx), %ymm6{%k7} # AVX512{F,VL} vbroadcasti32x4 -2048(%edx), %ymm6{%k7} # AVX512{F,VL} Disp8 vbroadcasti32x4 -2064(%edx), %ymm6{%k7} # AVX512{F,VL} + vbroadcastsd (%ecx), %xmm6{%k7} # AVX512{F,VL} + vbroadcastsd (%ecx), %xmm6{%k7}{z} # AVX512{F,VL} + vbroadcastsd -123456(%esp,%esi,8), %xmm6{%k7} # AVX512{F,VL} + vbroadcastsd 1016(%edx), %xmm6{%k7} # AVX512{F,VL} Disp8 + vbroadcastsd 1024(%edx), %xmm6{%k7} # AVX512{F,VL} + vbroadcastsd -1024(%edx), %xmm6{%k7} # AVX512{F,VL} Disp8 + vbroadcastsd -1032(%edx), %xmm6{%k7} # AVX512{F,VL} + vbroadcastsd %xmm5, %xmm6{%k7} # AVX512{F,VL} + vbroadcastsd %xmm5, %xmm6{%k7}{z} # AVX512{F,VL} vbroadcastsd (%ecx), %ymm6{%k7} # AVX512{F,VL} vbroadcastsd (%ecx), %ymm6{%k7}{z} # AVX512{F,VL} vbroadcastsd -123456(%esp,%esi,8), %ymm6{%k7} # AVX512{F,VL} @@ -5846,6 +5855,15 @@ _start: vbroadcasti32x4 ymm6{k7}, 
XMMWORD PTR [edx+2048] # AVX512{F,VL} vbroadcasti32x4 ymm6{k7}, XMMWORD PTR [edx-2048] # AVX512{F,VL} Disp8 vbroadcasti32x4 ymm6{k7}, XMMWORD PTR [edx-2064] # AVX512{F,VL} + vbroadcastsd xmm6{k7}, QWORD PTR [ecx] # AVX512{F,VL} + vbroadcastsd xmm6{k7}{z}, QWORD PTR [ecx] # AVX512{F,VL} + vbroadcastsd xmm6{k7}, QWORD PTR [esp+esi*8-123456] # AVX512{F,VL} + vbroadcastsd xmm6{k7}, QWORD PTR [edx+1016] # AVX512{F,VL} Disp8 + vbroadcastsd xmm6{k7}, QWORD PTR [edx+1024] # AVX512{F,VL} + vbroadcastsd xmm6{k7}, QWORD PTR [edx-1024] # AVX512{F,VL} Disp8 + vbroadcastsd xmm6{k7}, QWORD PTR [edx-1032] # AVX512{F,VL} + vbroadcastsd xmm6{k7}, xmm5 # AVX512{F,VL} + vbroadcastsd xmm6{k7}{z}, xmm5 # AVX512{F,VL} vbroadcastsd ymm6{k7}, QWORD PTR [ecx] # AVX512{F,VL} vbroadcastsd ymm6{k7}{z}, QWORD PTR [ecx] # AVX512{F,VL} vbroadcastsd ymm6{k7}, QWORD PTR [esp+esi*8-123456] # AVX512{F,VL} --- a/gas/testsuite/gas/i386/avx512f_vl-intel.d +++ b/gas/testsuite/gas/i386/avx512f_vl-intel.d @@ -155,6 +155,15 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 00 08 00 00[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx\+0x800\] [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a 72 80[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x800\] [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x810\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 31[ ]*vmovddup xmm6\{k7\},QWORD PTR \[ecx\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 31[ ]*vmovddup xmm6\{k7\}\{z\},QWORD PTR \[ecx\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ ]*vmovddup xmm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 7f[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x3f8\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 00 04 00 00[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x400\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 80[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x400\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 f8 fb ff ff[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x408\] +[ ]*[a-f0-9]+:[ ]*62 f1 
ff 0f 12 f5[ ]*vmovddup xmm6\{k7\},xmm5 +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 f5[ ]*vmovddup xmm6\{k7\}\{z\},xmm5 [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 31[ ]*vbroadcastsd ymm6\{k7\},QWORD PTR \[ecx\] [ ]*[a-f0-9]+:[ ]*62 f2 fd af 19 31[ ]*vbroadcastsd ymm6\{k7\}\{z\},QWORD PTR \[ecx\] [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ ]*vbroadcastsd ymm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\] @@ -5850,6 +5859,15 @@ Disassembly of section \.text: [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 00 08 00 00[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx\+0x800\] [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a 72 80[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x800\] [ ]*[a-f0-9]+:[ ]*62 f2 7d 2f 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 ymm6\{k7\},XMMWORD PTR \[edx-0x810\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 31[ ]*vmovddup xmm6\{k7\},QWORD PTR \[ecx\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 31[ ]*vmovddup xmm6\{k7\}\{z\},QWORD PTR \[ecx\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b4 f4 c0 1d fe ff[ ]*vmovddup xmm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 7f[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x3f8\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 00 04 00 00[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx\+0x400\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 72 80[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x400\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 b2 f8 fb ff ff[ ]*vmovddup xmm6\{k7\},QWORD PTR \[edx-0x408\] +[ ]*[a-f0-9]+:[ ]*62 f1 ff 0f 12 f5[ ]*vmovddup xmm6\{k7\},xmm5 +[ ]*[a-f0-9]+:[ ]*62 f1 ff 8f 12 f5[ ]*vmovddup xmm6\{k7\}\{z\},xmm5 [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 31[ ]*vbroadcastsd ymm6\{k7\},QWORD PTR \[ecx\] [ ]*[a-f0-9]+:[ ]*62 f2 fd af 19 31[ ]*vbroadcastsd ymm6\{k7\}\{z\},QWORD PTR \[ecx\] [ ]*[a-f0-9]+:[ ]*62 f2 fd 2f 19 b4 f4 c0 1d fe ff[ ]*vbroadcastsd ymm6\{k7\},QWORD PTR \[esp\+esi\*8-0x1e240\] --- a/gas/testsuite/gas/i386/x86-64-avx.d +++ b/gas/testsuite/gas/i386/x86-64-avx.d @@ -875,6 +875,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 35 21 vpmovzxdq \(%rcx\),%xmm4 [ 
]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%rcx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd %xmm4,\(%rcx\) [ ]*[a-f0-9]+: c5 f8 13 21 vmovlps %xmm4,\(%rcx\) @@ -2818,6 +2819,8 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd %xmm4,%xmm6 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd \(%rcx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%rcx\),%xmm4 +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd %xmm4,\(%rcx\) --- a/gas/testsuite/gas/i386/x86-64-avx.s +++ b/gas/testsuite/gas/i386/x86-64-avx.s @@ -930,6 +930,7 @@ _start: vucomisd (%rcx),%xmm4 # Tests for op mem64, xmm + vbroadcastsd (%rcx),%xmm4 vmovsd (%rcx),%xmm4 # Tests for op xmm, mem64 @@ -3024,6 +3025,8 @@ _start: vucomisd xmm4,[rcx] # Tests for op mem64, xmm + vbroadcastsd xmm4,QWORD PTR [rcx] + vbroadcastsd xmm4,[rcx] vmovsd xmm4,QWORD PTR [rcx] vmovsd xmm4,[rcx] --- a/gas/testsuite/gas/i386/x86-64-avx-intel.d +++ b/gas/testsuite/gas/i386/x86-64-avx-intel.d @@ -876,6 +876,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 35 21 vpmovzxdq xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd xmm6,xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[rcx\] +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd QWORD PTR \[rcx\],xmm4 [ ]*[a-f0-9]+: c5 f8 13 21 vmovlps QWORD PTR \[rcx\],xmm4 @@ -2819,6 +2820,8 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c5 f9 2e f4 vucomisd xmm6,xmm4 [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 f9 2e 21 vucomisd xmm4,QWORD PTR \[rcx\] +[ ]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR \[rcx\] +[ 
]*[a-f0-9]+: c5 fb 12 21 vmovddup xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 fb 10 21 vmovsd xmm4,QWORD PTR \[rcx\] [ ]*[a-f0-9]+: c5 f9 13 21 vmovlpd QWORD PTR \[rcx\],xmm4 --- a/gas/testsuite/gas/i386/x86-64-avx2.d +++ b/gas/testsuite/gas/i386/x86-64-avx2.d @@ -73,6 +73,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 78 21 vpbroadcastb \(%rcx\),%xmm4 [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb %xmm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%rcx\),%ymm4 +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 5d 8c 31 vpmaskmovd \(%rcx\),%ymm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 4d 8e 21 vpmaskmovd %ymm4,%ymm6,\(%rcx\) @@ -177,5 +178,6 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb %xmm4,%ymm6 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%rcx\),%ymm4 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb \(%rcx\),%ymm4 +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup %xmm4,%xmm6 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss %xmm4,%xmm6 #pass --- a/gas/testsuite/gas/i386/x86-64-avx2.s +++ b/gas/testsuite/gas/i386/x86-64-avx2.s @@ -114,6 +114,7 @@ _start: vpbroadcastb (%rcx),%ymm4 # Tests for op xmm, xmm + vbroadcastsd %xmm4,%xmm6 vbroadcastss %xmm4,%xmm6 .intel_syntax noprefix @@ -265,4 +266,5 @@ _start: vpbroadcastb ymm4,[rcx] # Tests for op xmm, xmm + vbroadcastsd xmm6,xmm4 vbroadcastss xmm6,xmm4 --- a/gas/testsuite/gas/i386/x86-64-avx2-intel.d +++ b/gas/testsuite/gas/i386/x86-64-avx2-intel.d @@ -74,6 +74,7 @@ Disassembly of section .text: [ ]*[a-f0-9]+: c4 e2 79 78 21 vpbroadcastb xmm4,BYTE PTR \[rcx\] [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb ymm6,xmm4 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[rcx\] +[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup xmm6,xmm4 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss xmm6,xmm4 [ ]*[a-f0-9]+: c4 e2 5d 8c 31 vpmaskmovd ymm6,ymm4,YMMWORD PTR \[rcx\] [ ]*[a-f0-9]+: c4 e2 4d 8e 21 vpmaskmovd 
YMMWORD PTR \[rcx\],ymm6,ymm4
@@ -178,5 +179,6 @@ Disassembly of section .text:
 [ ]*[a-f0-9]+: c4 e2 7d 78 f4 vpbroadcastb ymm6,xmm4
 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[rcx\]
 [ ]*[a-f0-9]+: c4 e2 7d 78 21 vpbroadcastb ymm4,BYTE PTR \[rcx\]
+[ ]*[a-f0-9]+: c5 fb 12 f4 vmovddup xmm6,xmm4
 [ ]*[a-f0-9]+: c4 e2 79 18 f4 vbroadcastss xmm6,xmm4
 #pass
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl.d
@@ -167,6 +167,17 @@ Disassembly of section \.text:
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 00 08 00 00[ ]*vbroadcasti32x4 0x800\(%rdx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a 72 80[ ]*vbroadcasti32x4 -0x800\(%rdx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 -0x810\(%rdx\),%ymm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 31[ ]*vmovddup \(%rcx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 0f 12 31[ ]*vmovddup \(%rcx\),%xmm30\{%k7\}
+[ ]*[a-f0-9]+:[ ]*62 61 ff 8f 12 31[ ]*vmovddup \(%rcx\),%xmm30\{%k7\}\{z\}
+[ ]*[a-f0-9]+:[ ]*62 21 ff 08 12 b4 f0 23 01 00 00[ ]*vmovddup 0x123\(%rax,%r14,8\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 7f[ ]*vmovddup 0x3f8\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 00 04 00 00[ ]*vmovddup 0x400\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 80[ ]*vmovddup -0x400\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 f8 fb ff ff[ ]*vmovddup -0x408\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 01 ff 08 12 f5[ ]*vmovddup %xmm29,%xmm30
+[ ]*[a-f0-9]+:[ ]*62 01 ff 0f 12 f5[ ]*vmovddup %xmm29,%xmm30\{%k7\}
+[ ]*[a-f0-9]+:[ ]*62 01 ff 8f 12 f5[ ]*vmovddup %xmm29,%xmm30\{%k7\}\{z\}
 [ ]*[a-f0-9]+:[ ]*62 62 fd 28 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 fd 2f 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}
 [ ]*[a-f0-9]+:[ ]*62 62 fd af 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}\{z\}
@@ -6474,6 +6485,17 @@ Disassembly of section \.text:
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 00 08 00 00[ ]*vbroadcasti32x4 0x800\(%rdx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a 72 80[ ]*vbroadcasti32x4 -0x800\(%rdx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 -0x810\(%rdx\),%ymm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 31[ ]*vmovddup \(%rcx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 0f 12 31[ ]*vmovddup \(%rcx\),%xmm30\{%k7\}
+[ ]*[a-f0-9]+:[ ]*62 61 ff 8f 12 31[ ]*vmovddup \(%rcx\),%xmm30\{%k7\}\{z\}
+[ ]*[a-f0-9]+:[ ]*62 21 ff 08 12 b4 f0 34 12 00 00[ ]*vmovddup 0x1234\(%rax,%r14,8\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 7f[ ]*vmovddup 0x3f8\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 00 04 00 00[ ]*vmovddup 0x400\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 80[ ]*vmovddup -0x400\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 f8 fb ff ff[ ]*vmovddup -0x408\(%rdx\),%xmm30
+[ ]*[a-f0-9]+:[ ]*62 01 ff 08 12 f5[ ]*vmovddup %xmm29,%xmm30
+[ ]*[a-f0-9]+:[ ]*62 01 ff 0f 12 f5[ ]*vmovddup %xmm29,%xmm30\{%k7\}
+[ ]*[a-f0-9]+:[ ]*62 01 ff 8f 12 f5[ ]*vmovddup %xmm29,%xmm30\{%k7\}\{z\}
 [ ]*[a-f0-9]+:[ ]*62 62 fd 28 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30
 [ ]*[a-f0-9]+:[ ]*62 62 fd 2f 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}
 [ ]*[a-f0-9]+:[ ]*62 62 fd af 19 31[ ]*vbroadcastsd \(%rcx\),%ymm30\{%k7\}\{z\}
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl.s
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl.s
@@ -161,6 +161,17 @@ _start:
 	vbroadcasti32x4 2048(%rdx), %ymm30 # AVX512{F,VL}
 	vbroadcasti32x4 -2048(%rdx), %ymm30 # AVX512{F,VL} Disp8
 	vbroadcasti32x4 -2064(%rdx), %ymm30 # AVX512{F,VL}
+	vbroadcastsd (%rcx), %xmm30 # AVX512{F,VL}
+	vbroadcastsd (%rcx), %xmm30{%k7} # AVX512{F,VL}
+	vbroadcastsd (%rcx), %xmm30{%k7}{z} # AVX512{F,VL}
+	vbroadcastsd 0x123(%rax,%r14,8), %xmm30 # AVX512{F,VL}
+	vbroadcastsd 1016(%rdx), %xmm30 # AVX512{F,VL} Disp8
+	vbroadcastsd 1024(%rdx), %xmm30 # AVX512{F,VL}
+	vbroadcastsd -1024(%rdx), %xmm30 # AVX512{F,VL} Disp8
+	vbroadcastsd -1032(%rdx), %xmm30 # AVX512{F,VL}
+	vbroadcastsd %xmm29, %xmm30 # AVX512{F,VL}
+	vbroadcastsd %xmm29, %xmm30{%k7} # AVX512{F,VL}
+	vbroadcastsd %xmm29, %xmm30{%k7}{z} # AVX512{F,VL}
 	vbroadcastsd (%rcx), %ymm30 # AVX512{F,VL}
 	vbroadcastsd (%rcx), %ymm30{%k7} # AVX512{F,VL}
 	vbroadcastsd (%rcx), %ymm30{%k7}{z} # AVX512{F,VL}
@@ -6470,6 +6481,17 @@ _start:
 	vbroadcasti32x4 ymm30, XMMWORD PTR [rdx+2048] # AVX512{F,VL}
 	vbroadcasti32x4 ymm30, XMMWORD PTR [rdx-2048] # AVX512{F,VL} Disp8
 	vbroadcasti32x4 ymm30, XMMWORD PTR [rdx-2064] # AVX512{F,VL}
+	vbroadcastsd xmm30, QWORD PTR [rcx] # AVX512{F,VL}
+	vbroadcastsd xmm30{k7}, QWORD PTR [rcx] # AVX512{F,VL}
+	vbroadcastsd xmm30{k7}{z}, QWORD PTR [rcx] # AVX512{F,VL}
+	vbroadcastsd xmm30, QWORD PTR [rax+r14*8+0x1234] # AVX512{F,VL}
+	vbroadcastsd xmm30, QWORD PTR [rdx+1016] # AVX512{F,VL} Disp8
+	vbroadcastsd xmm30, QWORD PTR [rdx+1024] # AVX512{F,VL}
+	vbroadcastsd xmm30, QWORD PTR [rdx-1024] # AVX512{F,VL} Disp8
+	vbroadcastsd xmm30, QWORD PTR [rdx-1032] # AVX512{F,VL}
+	vbroadcastsd xmm30, xmm29 # AVX512{F,VL}
+	vbroadcastsd xmm30{k7}, xmm29 # AVX512{F,VL}
+	vbroadcastsd xmm30{k7}{z}, xmm29 # AVX512{F,VL}
 	vbroadcastsd ymm30, QWORD PTR [rcx] # AVX512{F,VL}
 	vbroadcastsd ymm30{k7}, QWORD PTR [rcx] # AVX512{F,VL}
 	vbroadcastsd ymm30{k7}{z}, QWORD PTR [rcx] # AVX512{F,VL}
--- a/gas/testsuite/gas/i386/x86-64-avx512f_vl-intel.d
+++ b/gas/testsuite/gas/i386/x86-64-avx512f_vl-intel.d
@@ -167,6 +167,17 @@ Disassembly of section \.text:
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 00 08 00 00[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx\+0x800\]
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a 72 80[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x800\]
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x810\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 31[ ]*vmovddup xmm30,QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 0f 12 31[ ]*vmovddup xmm30\{k7\},QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 8f 12 31[ ]*vmovddup xmm30\{k7\}\{z\},QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 21 ff 08 12 b4 f0 23 01 00 00[ ]*vmovddup xmm30,QWORD PTR \[rax\+r14\*8\+0x123\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 7f[ ]*vmovddup xmm30,QWORD PTR \[rdx\+0x3f8\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 00 04 00 00[ ]*vmovddup xmm30,QWORD PTR \[rdx\+0x400\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 80[ ]*vmovddup xmm30,QWORD PTR \[rdx-0x400\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 f8 fb ff ff[ ]*vmovddup xmm30,QWORD PTR \[rdx-0x408\]
+[ ]*[a-f0-9]+:[ ]*62 01 ff 08 12 f5[ ]*vmovddup xmm30,xmm29
+[ ]*[a-f0-9]+:[ ]*62 01 ff 0f 12 f5[ ]*vmovddup xmm30\{k7\},xmm29
+[ ]*[a-f0-9]+:[ ]*62 01 ff 8f 12 f5[ ]*vmovddup xmm30\{k7\}\{z\},xmm29
 [ ]*[a-f0-9]+:[ ]*62 62 fd 28 19 31[ ]*vbroadcastsd ymm30,QWORD PTR \[rcx\]
 [ ]*[a-f0-9]+:[ ]*62 62 fd 2f 19 31[ ]*vbroadcastsd ymm30\{k7\},QWORD PTR \[rcx\]
 [ ]*[a-f0-9]+:[ ]*62 62 fd af 19 31[ ]*vbroadcastsd ymm30\{k7\}\{z\},QWORD PTR \[rcx\]
@@ -6474,6 +6485,17 @@ Disassembly of section \.text:
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 00 08 00 00[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx\+0x800\]
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a 72 80[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x800\]
 [ ]*[a-f0-9]+:[ ]*62 62 7d 28 5a b2 f0 f7 ff ff[ ]*vbroadcasti32x4 ymm30,XMMWORD PTR \[rdx-0x810\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 31[ ]*vmovddup xmm30,QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 0f 12 31[ ]*vmovddup xmm30\{k7\},QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 8f 12 31[ ]*vmovddup xmm30\{k7\}\{z\},QWORD PTR \[rcx\]
+[ ]*[a-f0-9]+:[ ]*62 21 ff 08 12 b4 f0 34 12 00 00[ ]*vmovddup xmm30,QWORD PTR \[rax\+r14\*8\+0x1234\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 7f[ ]*vmovddup xmm30,QWORD PTR \[rdx\+0x3f8\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 00 04 00 00[ ]*vmovddup xmm30,QWORD PTR \[rdx\+0x400\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 72 80[ ]*vmovddup xmm30,QWORD PTR \[rdx-0x400\]
+[ ]*[a-f0-9]+:[ ]*62 61 ff 08 12 b2 f8 fb ff ff[ ]*vmovddup xmm30,QWORD PTR \[rdx-0x408\]
+[ ]*[a-f0-9]+:[ ]*62 01 ff 08 12 f5[ ]*vmovddup xmm30,xmm29
+[ ]*[a-f0-9]+:[ ]*62 01 ff 0f 12 f5[ ]*vmovddup xmm30\{k7\},xmm29
+[ ]*[a-f0-9]+:[ ]*62 01 ff 8f 12 f5[ ]*vmovddup xmm30\{k7\}\{z\},xmm29
 [ ]*[a-f0-9]+:[ ]*62 62 fd 28 19 31[ ]*vbroadcastsd ymm30,QWORD PTR \[rcx\]
 [ ]*[a-f0-9]+:[ ]*62 62 fd 2f 19 31[ ]*vbroadcastsd ymm30\{k7\},QWORD PTR \[rcx\]
 [ ]*[a-f0-9]+:[ ]*62 62 fd af 19 31[ ]*vbroadcastsd ymm30\{k7\}\{z\},QWORD PTR \[rcx\]
--- a/opcodes/i386-opc.tbl
+++ b/opcodes/i386-opc.tbl
@@ -1495,6 +1495,8 @@ vblendp, 0x660c | , AVX, Mod
 vblendvp, 0x664a | , AVX, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { RegXMM|RegYMM, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vbroadcastf128, 0x661a, AVX, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX, Modrm|Vex256|Space0F38|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegYMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX, Modrm|Vex128|Space0F|VexW0|NoSuf, { Qword|Unspecified|BaseIndex, RegXMM }
 vbroadcastss, 0x6618, AVX, Modrm|Vex128|Space0F38|VexW0|NoSuf, { Dword|Unspecified|BaseIndex, RegXMM|RegYMM }
 vcmpp, 0xc2/0x, AVX, Modrm||Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|ImmExt, { RegXMM|RegYMM|Unspecified|BaseIndex, RegXMM|RegYMM, RegXMM|RegYMM }
 vcmps, 0xc2/0x, AVX, Modrm||VexLIG|Space0F|VexVVVV|VexWIG|NoSuf|ImmExt, { RegXMM||Unspecified|BaseIndex, RegXMM, RegXMM }
@@ -1731,6 +1733,8 @@ vpmovzxwq, 0x6634, AVX2, Modrm|Vex=2|Spa
 vbroadcasti128, 0x665A, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { Xmmword|Unspecified|BaseIndex, RegYMM }
 vbroadcastsd, 0x6619, AVX2, Modrm|Vex=2|Space0F38|VexW=1|NoSuf, { RegXMM, RegYMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX2, Modrm|Vex128|Space0F|VexW0|NoSuf, { RegXMM, RegXMM }
 vbroadcastss, 0x6618, AVX2, Modrm|Vex|Space0F38|VexW=1|NoSuf, { RegXMM, RegXMM|RegYMM }
 vpblendd, 0x6602, AVX2, Modrm|Vex|Space0F3A|VexVVVV|VexW0|CheckOperandSize|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM }
 vpbroadcast, 0x6678 | , AVX2, Modrm|Vex|Space0F38|VexW0|NoSuf, { |Unspecified|BaseIndex|RegXMM, RegXMM|RegYMM }
@@ -2128,6 +2132,8 @@ vbroadcasti64x4, 0x665B, AVX512F, Modrm|
 vbroadcastss, 0x6618, AVX512F, Modrm|Masking|Space0F38|VexW0|Disp8MemShift=2|NoSuf, { RegXMM|Dword|Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vbroadcastsd, 0x6619, AVX512F, Modrm|Masking|Space0F38|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegYMM|RegZMM }
+// As an extension, provide a 128-bit form as well, utilizing vmovddup.
+vbroadcastsd, 0xf212, AVX512F|AVX512VL, Modrm|EVex128|Masking|Space0F|VexW1|Disp8MemShift=3|NoSuf, { RegXMM|Qword|Unspecified|BaseIndex, RegXMM }
 vpbroadcast, 0x6658 | , AVX512F, Modrm|Masking|Space0F38||Disp8MemShift|NoSuf, { RegXMM||Unspecified|BaseIndex, RegXMM|RegYMM|RegZMM }
 vpbroadcast, 0x667c, AVX512F, Modrm|Masking|Space0F38||NoSuf, { , RegXMM|RegYMM|RegZMM }