From patchwork Fri Jun 16 07:31:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 108896 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1148807vqr; Fri, 16 Jun 2023 00:31:31 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6rhssswpvy3cPGZDRMhTOddD/byLAdMiXOJzJPaQpjlSHTNozkA463Yzw9N04F9FIlYc0x X-Received: by 2002:a17:906:ef0a:b0:978:337e:c41a with SMTP id f10-20020a170906ef0a00b00978337ec41amr1042777ejs.14.1686900690973; Fri, 16 Jun 2023 00:31:30 -0700 (PDT) Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id o27-20020a1709062e9b00b00985b6153e29si580962eji.830.2023.06.16.00.31.30 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 16 Jun 2023 00:31:30 -0700 (PDT) Received-SPF: pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@sourceware.org header.s=default header.b=iJgfyZDG; arc=fail (signature failed); spf=pass (google.com: domain of binutils-bounces+ouuuleilei=gmail.com@sourceware.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="binutils-bounces+ouuuleilei=gmail.com@sourceware.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BFEAE3854E7E for ; Fri, 16 Jun 2023 07:31:28 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BFEAE3854E7E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1686900688; bh=udPL3gPvAckEiUa15KP6Cd4610Dh1Kdt8N/uJU7bqQc=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=iJgfyZDGldt0FaCoebXBKFu81Quur8jVsMdwzomJFWOlBmcBBsUmZeuPGB0OlKqgS cwLMSn287Lp8JX2FN8WZVOx8nwo88oFpstKkcNuDzwSf6ZBZPr6BZsQqEjcu60H2N1 m8QW6QPP4ARSK88hWfK7oW/8ydGlVtJbf3nz0wgA= X-Original-To: binutils@sourceware.org Delivered-To: binutils@sourceware.org Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2052.outbound.protection.outlook.com [40.107.22.52]) by sourceware.org (Postfix) with ESMTPS id E8D603854E76 for ; Fri, 16 Jun 2023 07:31:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E8D603854E76 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=bqTPgNkLJYjomoWSxFCtlYbjOEWKbJBrqARweuTn2VMUBzs1RSBOdo5CZ2uLTFOrL7tOxW6jml8iGS2EvBtx3mDUVrQJ5e0mmGESIwyXZ+Kqu+gTr+EcMd5XvpuMfbQNa9Qxv8ytOxbwoO8tMHPerbAC0FmDkwLqCUFtZEIiWGIAvt4+AgXQkZxBNWFO1rzKAZgjShg3pjqKC94DsLDEj2bscNyDtYTVB1WhQM5tgPdOEvk/ybT4YBemlkPeK1Uq/WCXvCNAPWbmZpt+ijEqwQyOnBg2RmF2SfbDjq3DD2t8Ind+h/8+VJvXWHVvGoVYmDTKRbUQsmAb2noESV/R4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=udPL3gPvAckEiUa15KP6Cd4610Dh1Kdt8N/uJU7bqQc=; b=CulWgTumTyIppNcZarDrJeRhYFQbG7iVPmCA1v8wZ1ynn998Dc2AvwP9eZnqBMwUAgLWCEUXNJ1jdxmcTUkN1zNjv81ru2jfwYGzdBpFMmgq0XmLyfWTOIuQ+Vd0uTiyHeqgygf2E/jsaRXlfYGaqPCcJAWaf6kYnPO1/dnmp9jgDq4OkezjqKUkolI01Uct1f8ab2JYrBJFuPybEcfHNv2mlTtM7yrtHPbPm01kR8R6GZbTHaa5M+tnTf4uqdBV60Idn1bu9d+bNV8/DqLxh2xBcyLi3djCptcUXHiujsWGgnY+VPKC+AMAOe9lacpzHrHNA4Xp4XIX10MsRq/Mvg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by AM0PR04MB6913.eurprd04.prod.outlook.com (2603:10a6:208:184::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6500.29; Fri, 16 Jun 2023 07:31:14 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c%5]) with mapi id 15.20.6455.039; Fri, 16 Jun 2023 07:31:14 +0000 Message-ID: <7141b586-9711-aef0-7f28-5d4489478f1b@suse.com> Date: Fri, 16 Jun 2023 09:31:12 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: [PATCH 2/4] x86: optimize pre-AVX512 {,V}PCMPGT* with identical sources Content-Language: en-US To: Binutils Cc: "H.J. Lu" References: In-Reply-To: X-ClientProxiedBy: FR0P281CA0009.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:15::14) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: VE1PR04MB6560:EE_|AM0PR04MB6913:EE_ X-MS-Office365-Filtering-Correlation-Id: 3917e5bd-ad33-4a77-3ee0-08db6e3baa0e X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: NDVRT665kGx4EGoDf/YepKmEZWozqlBekCqdZCzroFE4ALeuD5+SZRb/Q0ycjlFSb2IYiyhMHCHcTxTKVDStcvM4bMQT+YyJyfuiSuxXMoaIrqcvB6ACIDk9+sKzuqNuqLaKYsnyyN21PKxFqdtDPeUFfaopkFB1TbiPONknOVxcE0w5Gr/+dpBXq4eO4rcXK4xYiHJlzhOZeYcX/FgbAr9UnvCuL1Cf17M29Q7hnKu8nBbvSNRB3OK9WkdqzYTTMY/XTJ1oWNPPZrv1E2sUKdwWP5jYopJp7y0Jypq8Wffyl5MxLdIgfXm1gGxNgCiPPB+48ibCMItBo/E8HfIsowizxijJbyivF8DD/QXoCEjnl0Mxk/msnmf5M2WlSDDeA3i3NCT1FEOtNU6QtYnD+/awZidVPIBjKm51lgjpLsqrZX604eOKPazX/sLIOtUlWWsC1FJxUSqQLmp0gtYJzBavh/hmu5y1FyvbL/WnwKRfPXjbVxMmxRbSEgkc8oXV9KnJjE7mzkG1R2RB5yxpjLr3v6OnphqIG3yqNNInbO0UH8dFiSr0LmcqH/nSF9w2WsSUFe/Vb9x4mCSSiZZ8f7FlAWqAFN1rBoY3aBGRuszPl7fvYhgyyyVPIJ/0/p4Hkn+NkWQlQgtHQM0ar/Y3Sw== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VE1PR04MB6560.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230028)(376002)(396003)(366004)(136003)(346002)(39860400002)(451199021)(478600001)(6486002)(5660300002)(41300700001)(8936002)(8676002)(2906002)(30864003)(36756003)(86362001)(31696002)(38100700002)(66476007)(66556008)(66946007)(316002)(6916009)(4326008)(26005)(6512007)(6506007)(186003)(31686004)(83380400001)(66574015)(2616005)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?q?mAxowNrErtSiApRFCO6YS/6C/OQl?= =?utf-8?q?M6t5Feg+zskSHuLJkk2tVIViv+r7jex5bl2/BeoCQP73zB9oUy4ZhsIcVBHKGIVUG?= =?utf-8?q?q9xWgpgo78mnFfzrLIglKj7nsSt6KqLLk56x69oFmKyYGp0b0lDu/X0++vW/xrgKK?= =?utf-8?q?dGqwx2UEl+6gTTlSptW+LSQo+xyDoX5DpCv6e/4zWa5YKyTm0e2p3z8j/718HK60m?= =?utf-8?q?bcz5jQ1NKZXQOqs803mnirk/S3vAEUVEHMq5PwAQptDjJ9jnngFVeLUwI6AL2errJ?= =?utf-8?q?xxXYWKKVSpAfdiKYB9wDo1e/TACmOohIgyQm146kRu+d8mSKp33xGhMKBebrppsG6?= =?utf-8?q?QXQFT/AD2N1x8QPpSStoX/yn/guGso5hDOX+BkgjZtG6UWeKneeyf16pWVqa7WBlt?= =?utf-8?q?YPiAuk+DydktEaz8xByJfNyfZKHmSRfmZ/0l5rxtsxGpQEXnBmHeekDbEknqmxWKJ?= =?utf-8?q?vYWzDyOZX+W0sY8/k7XvuG3Yt7w6wH6d24hL4jOhBp4Xg/LGl+Cb9c94HbI4bCRTh?= =?utf-8?q?NbUHwCIa7TYfcHUVJ7mVzwy6eHzq7IyjMtwkZKNRPuyNwp+0vAYa7/ymCodegXrIK?= =?utf-8?q?khbXQwS/wj/FCsxaDt1Z/eUy2qWeELI7i4dr6uNE3zp1o99bwZHYdRwpxh/mPXxTn?= =?utf-8?q?l5MVRshEdRW1PNzgY0pz1g4+/A2vKt7CfNRTLTdKSa9KsUUlDiisVG7487Uoww/EN?= =?utf-8?q?WtYOBSjfqKSnC1/GK9J2qki/QloUf1L13knCapYr+KOFi8HRXHbX/8J4cTMUqe7Y+?= =?utf-8?q?nJkEzXHFUW56g5mTZa3lpSDih6FoZUc+G0rFGvkIiDhWQzjCIa1T2idgNGbNPGoEk?= =?utf-8?q?lmTCUEW8j7GQ6nLvNJyFzyIwr/oLMIg2tTDhLmFPPfL9TT5wSf8wLkTX0XlhA3zi8?= =?utf-8?q?m16lO3mlj/VogQFQaz7KW1yXophxs/ShEYabzETCH1ghabwwseL1+Z8XD4KJiLEKZ?= =?utf-8?q?9zDsDcyXKl/XDx0eDsHMEO7zrr+Ua+0NipiD4YbQ/JsmaPkvkRY7w5WIUbyrjUR61?= =?utf-8?q?X4edpbkdsJiHqkyFOFmvFrPi+gNvLC0Ppe5QLssSUfDfGY+V+WrRXVpTgstxMzb1V?= =?utf-8?q?iEdHpGEOQR1vE59Nqsca9RSzmknUJJvmMKYLa7r5s95oTi0EFICZaDjQBdxB5gqMB?= =?utf-8?q?AO6jHmD4O5g1tlBehpWanJDZBp9vQ9zmQoFqMnLVY9XshSeUmwp7peb/xBvqsjoeq?= =?utf-8?q?lHcN2t/CgSLi2ul17YL1172Om/kSfzFmcU6GL+eATn1yMlqiV1DGqrI9/jXt2lMPx?= =?utf-8?q?m5/X/vLs+S5JAhnp/0wGExK2yjN0SqaLFmP7bMHi7rv1jrDGY1hKg+3XiRGaskAeo?= =?utf-8?q?k6wY111lzlLlZA4PrnrdvodnAPpvbYbVa6nFffCGKHXDCxks3UCogVNC5E6aCr2hE?= =?utf-8?q?aTtflg7i4wuGP+5OsPmd/saPflbiqWEYXNnp0n8yzm9sNkIyCaYNXS6J4O4rs5quv?= =?utf-8?q?wXReLrfr4yMhS5H2ka9pM264cchJFXiIDnnW8+FT70xHTYYAf5vu19V8HJzueVZd9?= =?utf-8?q?ejTlB4DEzmet?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: 3917e5bd-ad33-4a77-3ee0-08db6e3baa0e X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 16 Jun 2023 07:31:13.9500 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: NWHK30DnDJA8HyHMJ0NUD12xGfoj0boUfuhIyxPYFqPGFwHADZXNf/7oER5tnC4Iw5qAvAFN68PlnsZJHS40JQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM0PR04MB6913 X-Spam-Status: No, score=-3027.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: binutils@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Binutils mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Beulich via Binutils From: Jan Beulich Reply-To: Jan Beulich Errors-To: binutils-bounces+ouuuleilei=gmail.com@sourceware.org Sender: "Binutils" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768843579166157710?= X-GMAIL-MSGID: =?utf-8?q?1768843579166157710?= These are better expressed by the zeroing idiom {,V}PXOR. In some cases this also results in a shorter encoding. --- Thoughts towards doing the same for {,V}PSUB{,U}S{B,W,D,Q}, anyone? --- a/gas/config/tc-i386.c +++ b/gas/config/tc-i386.c @@ -4580,6 +4580,46 @@ optimize_encoding (void) i.tm.opcode_space = SPACE_0F; i.tm.base_opcode = 0x76; } + else if (((i.tm.base_opcode >= 0x64 + && i.tm.base_opcode <= 0x66 + && i.tm.opcode_space == SPACE_0F) + || (i.tm.base_opcode == 0x37 + && i.tm.opcode_space == SPACE_0F38)) + && i.operands == i.reg_operands + && i.op[0].regs == i.op[1].regs + && !is_evex_encoding (&i.tm)) + { + /* Optimize: -O: + pcmpgt[bwd] %mmN, %mmN -> pxor %mmN, %mmN + pcmpgt[bwdq] %xmmN, %xmmN -> pxor %xmmN, %xmmN + vpcmpgt[bwdq] %xmmN, %xmmN, %xmmM -> vpxor %xmmN, %xmmN, %xmmM (N < 8) + vpcmpgt[bwdq] %xmmN, %xmmN, %xmmM -> vpxor %xmm0, %xmm0, %xmmM (N > 7) + vpcmpgt[bwdq] %ymmN, %ymmN, %ymmM -> vpxor %ymmN, %ymmN, %ymmM (N < 8) + vpcmpgt[bwdq] %ymmN, %ymmN, %ymmM -> vpxor %ymm0, %ymm0, %ymmM (N > 7) + */ + i.tm.opcode_space = SPACE_0F; + i.tm.base_opcode = 0xef; + if (i.tm.opcode_modifier.vex && (i.op[0].regs->reg_flags & RegRex)) + { + if (i.operands == 2) + { + gas_assert (i.tm.opcode_modifier.sse2avx); + + i.operands = 3; + i.reg_operands = 3; + i.tm.operands = 3; + + i.op[2].regs = i.op[0].regs; + i.types[2] = i.types[0]; + i.flags[2] = i.flags[0]; + i.tm.operand_types[2] = i.tm.operand_types[0]; + + i.tm.opcode_modifier.sse2avx = 0; + } + i.op[0].regs -= i.op[0].regs->reg_num + 8; + i.op[1].regs = i.op[0].regs; + } + } } /* Return non-zero for load instruction. */ --- a/gas/testsuite/gas/i386/optimize-1.d +++ b/gas/testsuite/gas/i386/optimize-1.d @@ -147,6 +147,21 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax --- a/gas/testsuite/gas/i386/optimize-1.s +++ b/gas/testsuite/gas/i386/optimize-1.s @@ -171,6 +171,25 @@ _start: vpxord 128(%eax), %ymm2, %ymm3 vpxorq 128(%eax), %ymm2, %ymm3 + pcmpgtb %mm2, %mm2 + pcmpgtb %xmm2, %xmm2 + vpcmpgtb %xmm2, %xmm2, %xmm0 + vpcmpgtb %ymm2, %ymm2, %ymm0 + + pcmpgtw %mm2, %mm2 + pcmpgtw %xmm2, %xmm2 + vpcmpgtw %xmm2, %xmm2, %xmm0 + vpcmpgtw %ymm2, %ymm2, %ymm0 + + pcmpgtd %mm2, %mm2 + pcmpgtd %xmm2, %xmm2 + vpcmpgtd %xmm2, %xmm2, %xmm0 + vpcmpgtd %ymm2, %ymm2, %ymm0 + + pcmpgtq %xmm2, %xmm2 + vpcmpgtq %xmm2, %xmm2, %xmm0 + vpcmpgtq %ymm2, %ymm2, %ymm0 + bt $15, %ax bt $16, %ax btc $15, %ax --- a/gas/testsuite/gas/i386/optimize-1a.d +++ b/gas/testsuite/gas/i386/optimize-1a.d @@ -148,6 +148,21 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax --- a/gas/testsuite/gas/i386/optimize-4.d +++ b/gas/testsuite/gas/i386/optimize-4.d @@ -147,6 +147,21 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax --- a/gas/testsuite/gas/i386/optimize-5.d +++ b/gas/testsuite/gas/i386/optimize-5.d @@ -147,6 +147,21 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%eax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%eax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm0 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm2,%ymm0 +[a-f0-9]+: 0f ba e0 0f bt \$0xf,%eax +[a-f0-9]+: 66 0f ba e0 10 bt \$0x10,%ax +[a-f0-9]+: 0f ba f8 0f btc \$0xf,%eax --- a/gas/testsuite/gas/i386/x86-64.exp +++ b/gas/testsuite/gas/i386/x86-64.exp @@ -520,6 +520,7 @@ run_dump_test "x86-64-optimize-1" run_dump_test "x86-64-optimize-2" run_dump_test "x86-64-optimize-2a" run_dump_test "x86-64-optimize-2b" +run_dump_test "x86-64-optimize-2c" run_dump_test "x86-64-optimize-3" run_dump_test "x86-64-optimize-3b" run_dump_test "x86-64-optimize-4" --- a/gas/testsuite/gas/i386/x86-64-optimize-2.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-2.d @@ -203,4 +203,23 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 #pass --- a/gas/testsuite/gas/i386/x86-64-optimize-2.s +++ b/gas/testsuite/gas/i386/x86-64-optimize-2.s @@ -226,3 +226,26 @@ _start: vporq 128(%rax), %ymm2, %ymm3 vpxord 128(%rax), %ymm2, %ymm3 vpxorq 128(%rax), %ymm2, %ymm3 + + pcmpgtb %mm2, %mm2 + pcmpgtb %xmm2, %xmm2 + pcmpgtb %xmm12, %xmm12 + vpcmpgtb %xmm2, %xmm2, %xmm8 + vpcmpgtb %ymm12, %ymm12, %ymm1 + + pcmpgtw %mm2, %mm2 + pcmpgtw %xmm2, %xmm2 + pcmpgtw %xmm12, %xmm12 + vpcmpgtw %xmm2, %xmm2, %xmm8 + vpcmpgtw %ymm12, %ymm12, %ymm1 + + pcmpgtd %mm2, %mm2 + pcmpgtd %xmm2, %xmm2 + pcmpgtd %xmm12, %xmm12 + vpcmpgtd %xmm2, %xmm2, %xmm8 + vpcmpgtd %ymm12, %ymm12, %ymm1 + + pcmpgtq %xmm2, %xmm2 + pcmpgtq %xmm12, %xmm12 + vpcmpgtq %xmm2, %xmm2, %xmm8 + vpcmpgtq %ymm12, %ymm12, %ymm1 --- a/gas/testsuite/gas/i386/x86-64-optimize-2a.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-2a.d @@ -204,4 +204,23 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 #pass --- a/gas/testsuite/gas/i386/x86-64-optimize-2b.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-2b.d @@ -203,4 +203,23 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 #pass --- /dev/null +++ b/gas/testsuite/gas/i386/x86-64-optimize-2c.d @@ -0,0 +1,226 @@ +#source: x86-64-optimize-2.s +#as: -O -msse2avx +#objdump: -drw +#name: x86-64 optimized encoding 2c with -O and SSE2AVX + +.*: +file format .* + + +Disassembly of section .text: + +0+ <_start>: + +[a-f0-9]+: 62 71 f5 4f 55 f9 vandnpd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 55 f9 vandnpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 55 c1 vandnpd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 55 c1 vandnpd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 55 c9 vandnpd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 55 c9 vandnpd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 74 4f 55 f9 vandnps %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 55 f9 vandnps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 74 48 55 c1 vandnps %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 74 28 55 c1 vandnps %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 74 40 55 c9 vandnps %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 74 20 55 c9 vandnps %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 71 75 4f df f9 vpandnd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 df c1 vpandnd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 df c1 vpandnd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 df c9 vpandnd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 df c9 vpandnd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f df f9 vpandnq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 df f9 vpandn %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 df c1 vpandnq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 df c1 vpandnq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 df c9 vpandnq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 df c9 vpandnq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f 57 f9 vxorpd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 57 f9 vxorpd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 57 c1 vxorpd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 57 c1 vxorpd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 57 c9 vxorpd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 57 c9 vxorpd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 74 4f 57 f9 vxorps %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 70 57 f9 vxorps %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 74 48 57 c1 vxorps %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 74 28 57 c1 vxorps %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 74 40 57 c9 vxorps %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 74 20 57 c9 vxorps %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 71 75 4f ef f9 vpxord %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 ef c1 vpxord %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 ef c1 vpxord %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 ef c9 vpxord %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 ef c9 vpxord %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f ef f9 vpxorq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 ef f9 vpxor %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 ef c1 vpxorq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 ef c1 vpxorq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 ef c9 vpxorq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 ef c9 vpxorq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f f8 f9 vpsubb %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f8 f9 vpsubb %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 f8 c1 vpsubb %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 f8 c1 vpsubb %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 f8 c9 vpsubb %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 f8 c9 vpsubb %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f f9 f9 vpsubw %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 f9 f9 vpsubw %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 f9 c1 vpsubw %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 f9 c1 vpsubw %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 f9 c9 vpsubw %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 f9 c9 vpsubw %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 75 4f fa f9 vpsubd %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fa f9 vpsubd %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 75 48 fa c1 vpsubd %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 75 28 fa c1 vpsubd %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 75 40 fa c9 vpsubd %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 75 20 fa c9 vpsubd %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: 62 71 f5 4f fb f9 vpsubq %zmm1,%zmm1,%zmm15\{%k7\} + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: c5 71 fb f9 vpsubq %xmm1,%xmm1,%xmm15 + +[a-f0-9]+: 62 e1 f5 48 fb c1 vpsubq %zmm1,%zmm1,%zmm16 + +[a-f0-9]+: 62 e1 f5 28 fb c1 vpsubq %ymm1,%ymm1,%ymm16 + +[a-f0-9]+: 62 b1 f5 40 fb c9 vpsubq %zmm17,%zmm17,%zmm1 + +[a-f0-9]+: 62 b1 f5 20 fb c9 vpsubq %ymm17,%ymm17,%ymm1 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 f9 6f d1 vmovdqa %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c5 fa 6f d1 vmovdqu %xmm1,%xmm2 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 79 6f e3 vmovdqa %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c4 41 7a 6f e3 vmovdqu %xmm11,%xmm12 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 f9 6f 50 7f vmovdqa 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: c5 fa 6f 50 7f vmovdqu 0x7f\(%rax\),%xmm2 + +[a-f0-9]+: 62 f1 7d 08 7f 48 08 vmovdqa32 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fd 08 7f 48 08 vmovdqa64 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7f 08 7f 48 08 vmovdqu8 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 ff 08 7f 48 08 vmovdqu16 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7e 08 7f 48 08 vmovdqu32 %xmm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fe 08 7f 48 08 vmovdqu64 %xmm1,0x80\(%rax\) + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fd 6f d1 vmovdqa %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c5 fe 6f d1 vmovdqu %ymm1,%ymm2 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7d 6f e3 vmovdqa %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c4 41 7e 6f e3 vmovdqu %ymm11,%ymm12 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fd 6f 50 7f vmovdqa 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: c5 fe 6f 50 7f vmovdqu 0x7f\(%rax\),%ymm2 + +[a-f0-9]+: 62 f1 7d 28 7f 48 04 vmovdqa32 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fd 28 7f 48 04 vmovdqa64 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7f 28 7f 48 04 vmovdqu8 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 ff 28 7f 48 04 vmovdqu16 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7e 28 7f 48 04 vmovdqu32 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 fe 28 7f 48 04 vmovdqu64 %ymm1,0x80\(%rax\) + +[a-f0-9]+: 62 f1 7d 48 6f 10 vmovdqa32 \(%rax\),%zmm2 + +[a-f0-9]+: c5 .* vpand %xmm2,%xmm3,%xmm4 + +[a-f0-9]+: c4 .* vpand %xmm12,%xmm3,%xmm4 + +[a-f0-9]+: c5 .* vpandn %xmm2,%xmm13,%xmm4 + +[a-f0-9]+: c5 .* vpandn %xmm2,%xmm3,%xmm14 + +[a-f0-9]+: c5 .* vpor %xmm2,%xmm3,%xmm4 + +[a-f0-9]+: c4 .* vpor %xmm12,%xmm3,%xmm4 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm13,%xmm4 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm3,%xmm14 + +[a-f0-9]+: c5 .* vpand %ymm2,%ymm3,%ymm4 + +[a-f0-9]+: c4 .* vpand %ymm12,%ymm3,%ymm4 + +[a-f0-9]+: c5 .* vpandn %ymm2,%ymm13,%ymm4 + +[a-f0-9]+: c5 .* vpandn %ymm2,%ymm3,%ymm14 + +[a-f0-9]+: c5 .* vpor %ymm2,%ymm3,%ymm4 + +[a-f0-9]+: c4 .* vpor %ymm12,%ymm3,%ymm4 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm13,%ymm4 + +[a-f0-9]+: c5 .* vpxor %ymm2,%ymm3,%ymm14 + +[a-f0-9]+: c5 .* vpand 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpand 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpandn 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpandn 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpxor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpxor 0x70\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandd 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandnd 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpandnq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpord 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%xmm2,%xmm3 + +[a-f0-9]+: c5 .* vpand 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpand 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpandn 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpandn 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpxor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: c5 .* vpxor 0x60\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandd 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandnd 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpandnq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpord 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm2 + +[a-f0-9]+: c5 .* vpxor %xmm0,%xmm0,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +#pass --- a/gas/testsuite/gas/i386/x86-64-optimize-5.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-5.d @@ -203,6 +203,25 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 --- a/gas/testsuite/gas/i386/x86-64-optimize-6.d +++ b/gas/testsuite/gas/i386/x86-64-optimize-6.d @@ -203,6 +203,25 @@ Disassembly of section .text: +[a-f0-9]+: 62 .* vporq 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxord 0x80\(%rax\),%ymm2,%ymm3 +[a-f0-9]+: 62 .* vpxorq 0x80\(%rax\),%ymm2,%ymm3 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 0f .* pxor %mm2,%mm2 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 + +[a-f0-9]+: 66 .* pxor %xmm2,%xmm2 + +[a-f0-9]+: 66 .* pxor %xmm12,%xmm12 + +[a-f0-9]+: c5 .* vpxor %xmm2,%xmm2,%xmm8 + +[a-f0-9]+: c5 .* vpxor %ymm0,%ymm0,%ymm1 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 f5 08 55 e9 \{evex\} vandnpd %xmm1,%xmm1,%xmm5 +[a-f0-9]+: 62 f1 7d 28 6f d1 vmovdqa32 %ymm1,%ymm2 --- a/opcodes/i386-opc.tbl +++ b/opcodes/i386-opc.tbl @@ -1025,8 +1025,8 @@ pand, 0x0fdb, , M pandn, 0x0fdf, , Modrm||NoSuf, { ||Unspecified|BaseIndex, } pcmpeq, 0x0f74 | , , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pcmpeqd, 0x0f76, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } -pcmpgt, 0x0f64 | , , Modrm||NoSuf, { ||Unspecified|BaseIndex, } -pcmpgtd, 0x0f66, , Modrm||NoSuf, { ||Unspecified|BaseIndex, } +pcmpgt, 0x0f64 | , , Modrm||NoSuf|Optimize, { ||Unspecified|BaseIndex, } +pcmpgtd, 0x0f66, , Modrm||NoSuf|Optimize, { ||Unspecified|BaseIndex, } pmaddwd, 0x0ff5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pmulhw, 0x0fe5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } pmullw, 0x0fd5, , Modrm||C|NoSuf, { ||Unspecified|BaseIndex, } @@ -1405,7 +1405,7 @@ rounds, 0x660f3a0a | -pcmpgtq, 0x660f3837, , Modrm|||NoSuf, { RegXMM|Unspecified|BaseIndex, RegXMM } +pcmpgtq, 0x660f3837, , Modrm|||NoSuf|Optimize, { RegXMM|Unspecified|BaseIndex, RegXMM } pcmpestri, 0x660f3a61, |No64, Modrm||NoSuf, { Imm8, RegXMM|Unspecified|BaseIndex, RegXMM } pcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf|SSE2AVX, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } pcmpestri, 0x660f3a61, SSE4_2|x64, Modrm|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } @@ -1597,9 +1597,9 @@ vpcmpestri, 0x6661, AVX|No64, Modrm|Vex| vpcmpestri, 0x6661, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpestrm, 0x6660, AVX|No64, Modrm|Vex|Space0F3A|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpestrm, 0x6660, AVX|x64, Modrm|Vex|Space0F3A|IgnoreSize|No_bSuf|No_wSuf|No_sSuf, { Imm8, Xmmword|Unspecified|BaseIndex|RegXMM, RegXMM } -vpcmpgt, 0x6664 | , AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } -vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgt, 0x6664 | , AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgtd, 0x6666, AVX|AVX2, Modrm|Vex|Space0F|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } +vpcmpgtq, 0x6637, AVX|AVX2, Modrm|Vex|Space0F38|VexVVVV|VexWIG|CheckOperandSize|NoSuf|Optimize, { Unspecified|BaseIndex|RegXMM|RegYMM, RegXMM|RegYMM, RegXMM|RegYMM } vpcmpistri, 0x6663, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vpcmpistrm, 0x6662, AVX, Modrm|Vex|Space0F3A|VexWIG|NoSuf, { Imm8, Unspecified|BaseIndex|RegXMM, RegXMM } vperm2f128, 0x6606, AVX, Modrm|Vex256|Space0F3A|VexVVVV|VexW0|NoSuf, { Imm8|Imm8S, Unspecified|BaseIndex|RegYMM, RegYMM, RegYMM }