Message ID | aa84243c-860b-ddf8-bfde-7e080a197cd1@suse.com |
---|---|
State | Accepted |
Headers |
From: Jan Beulich via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Jan Beulich <jbeulich@suse.com> To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org> Cc: Hongtao Liu <hongtao.liu@intel.com>, Kirill Yukhin <kirill.yukhin@gmail.com> Subject: [PATCH] x86: correct and improve "*vec_dupv2di" Date: Thu, 15 Jun 2023 08:03:11 +0200 Message-ID: <aa84243c-860b-ddf8-bfde-7e080a197cd1@suse.com> |
Series |
x86: correct and improve "*vec_dupv2di"
|
|
Checks
Context | Check | Description |
---|---|---|
snail/gcc-patch-check | success | Github commit url |
Commit Message
Jan Beulich
June 15, 2023, 6:03 a.m. UTC
The input constraint for the %vmovddup alternative was wrong, as the
upper 16 XMM registers require AVX512VL to be used with this insn. To
compensate, introduce a new alternative permitting all 32 registers, by
broadcasting to the full 512 bits in that case if AVX512VL is not
available.

gcc/

	* config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
	constraint. Add new AVX512F alternative.
---
Strictly speaking the new alternative could be enabled from AVX2
onwards, but vmovddup can frequently be a shorter encoding (VEX2
vs VEX3).
Comments
On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> The input constraint for the %vmovddup alternative was wrong, as the
> upper 16 XMM registers require AVX512VL to be used with this insn. To
> compensate, introduce a new alternative permitting all 32 registers, by
> broadcasting to the full 512 bits in that case if AVX512VL is not
> available.
>
> gcc/
>
>         * config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
>         constraint. Add new AVX512F alternative.
> ---
> Strictly speaking the new alternative could be enabled from AVX2
> onwards, but vmovddup can frequently be a shorter encoding (VEX2
> vs VEX3).
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -25851,19 +25851,39 @@
>            (symbol_ref "true")))])
>
>  (define_insn "*vec_dupv2di"
> -  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
> +  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,v,x")
>         (vec_duplicate:V2DI
> -         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
> +         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
>    "TARGET_SSE"
> -  "@
> -   punpcklqdq\t%0, %0
> -   vpunpcklqdq\t{%d1, %0|%0, %d1}
> -   %vmovddup\t{%1, %0|%0, %1}
> -   movlhps\t%0, %0"
> -  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
> -   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
> -   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
> -   (set_attr "mode" "TI,TI,DF,V4SF")])
> +{
> +  switch (which_alternative)
> +    {
> +    case 0:
> +      return "punpcklqdq\t%0, %0";
> +    case 1:
> +      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
> +    case 2:
> +      if (TARGET_AVX512VL)
> +        return "vpbroadcastq\t{%1, %0|%0, %1}";
> +      return "vpbroadcastq\t{%1, %g0|%g0, %1}";

You can use

* return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" : \"vpbroadcastq\t{%1, %g0|%g0, %1}\";

directly in a multi-output insn template to avoid the above C code.
See e.g. sse2_cvtpd2pi for an example.

Uros.

> +    case 3:
> +      return "%vmovddup\t{%1, %0|%0, %1}";
> +    case 4:
> +      return "movlhps\t%0, %0";
> +    default:
> +      gcc_unreachable ();
> +    }
> +}
> +  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
> +   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
> +   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
> +   (set_attr "mode" "TI,TI,TI,DF,V4SF")
> +   (set (attr "enabled")
> +     (if_then_else
> +       (eq_attr "alternative" "2")
> +       (symbol_ref "TARGET_AVX512VL
> +                    || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
> +       (const_string "*")))])
>
>  (define_insn "avx2_vbroadcasti128_<mode>"
>    [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")
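Applied to this pattern, the suggestion above might look roughly like the fragment below. It is only a sketch of the output template, assuming the five alternatives and their order from the patch; the rest of the define_insn is omitted, and this is not necessarily the form that was eventually committed. In a multi-alternative ("@") template, a line starting with `*` is treated as C code that returns the assembler string, and its inner quotes are escaped because the whole template is itself a string.

```
  ;; Sketch only: one template line per alternative, in constraint order.
  "@
   punpcklqdq\t%0, %0
   vpunpcklqdq\t{%d1, %0|%0, %d1}
   * return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" : \"vpbroadcastq\t{%1, %g0|%g0, %1}\";
   %vmovddup\t{%1, %0|%0, %1}
   movlhps\t%0, %0"
```

Only alternative 2 needs C code here, because it is the only one whose operand modifier (`%0` versus `%g0`) depends on a target flag.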
On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > The input constraint for the %vmovddup alternative was wrong, as the
> > upper 16 XMM registers require AVX512VL to be used with this insn. To
> > compensate, introduce a new alternative permitting all 32 registers, by
> > broadcasting to the full 512 bits in that case if AVX512VL is not
> > available.
> >
> > gcc/
> >
> >         * config/i386/sse.md (vec_dupv2di): Correct %vmovddup input
> >         constraint. Add new AVX512F alternative.
> > ---
> > Strictly speaking the new alternative could be enabled from AVX2
> > onwards, but vmovddup can frequently be a shorter encoding (VEX2
> > vs VEX3).
> >
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -25851,19 +25851,39 @@
> >            (symbol_ref "true")))])
> >
> >  (define_insn "*vec_dupv2di"
> > -  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
> > +  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,v,x")
> >         (vec_duplicate:V2DI
> > -         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
> > +         (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
> >    "TARGET_SSE"
> > -  "@
> > -   punpcklqdq\t%0, %0
> > -   vpunpcklqdq\t{%d1, %0|%0, %d1}
> > -   %vmovddup\t{%1, %0|%0, %1}
> > -   movlhps\t%0, %0"
> > -  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
> > -   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
> > -   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
> > -   (set_attr "mode" "TI,TI,DF,V4SF")])
> > +{
> > +  switch (which_alternative)
> > +    {
> > +    case 0:
> > +      return "punpcklqdq\t%0, %0";
> > +    case 1:
> > +      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
> > +    case 2:
> > +      if (TARGET_AVX512VL)
> > +        return "vpbroadcastq\t{%1, %0|%0, %1}";
> > +      return "vpbroadcastq\t{%1, %g0|%g0, %1}";
>
> You can use
>
> * return TARGET_AVX512VL ? \"vpbroadcastq\t{%1, %0|%0, %1}\" :
> \"vpbroadcastq\t{%1, %g0|%g0, %1}\";
>
> directly in a multi-output insn template to avoid the above C code.
> See e.g. sse2_cvtpd2pi for an example.
>
> Uros.
>
> > +    case 3:
> > +      return "%vmovddup\t{%1, %0|%0, %1}";
> > +    case 4:
> > +      return "movlhps\t%0, %0";
> > +    default:
> > +      gcc_unreachable ();
> > +    }
> > +}
> > +  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
> > +   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
> > +   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
> > +   (set_attr "mode" "TI,TI,TI,DF,V4SF")

alternative 2 should be XImode when !TARGET_AVX512VL.

> > +   (set (attr "enabled")
> > +     (if_then_else
> > +       (eq_attr "alternative" "2")
> > +       (symbol_ref "TARGET_AVX512VL
> > +                    || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
> > +       (const_string "*")))])
> >
> >  (define_insn "avx2_vbroadcasti128_<mode>"
> >    [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")
On 15.06.2023 09:45, Hongtao Liu wrote:
> On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
>> <gcc-patches@gcc.gnu.org> wrote:
>>> +    case 3:
>>> +      return "%vmovddup\t{%1, %0|%0, %1}";
>>> +    case 4:
>>> +      return "movlhps\t%0, %0";
>>> +    default:
>>> +      gcc_unreachable ();
>>> +    }
>>> +}
>>> +  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
>>> +   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
>>> +   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
>>> +   (set_attr "mode" "TI,TI,TI,DF,V4SF")
> alternative 2 should be XImode when !TARGET_AVX512VL.

This gives me a chance to actually raise a related question I stumbled
across several times: Which operand does the mode attribute actually
describe? I've seen places where it's the source, but I've also seen
places where it's the destination. Because of this mix I wasn't really
sure that getting this attribute entirely correct is actually
necessary, and hence I hoped it would be okay to not further complicate
the attribute here.

Jan
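For reference, the "further complication" in question could be written along the following lines. This is a sketch, assuming the alternative numbering from the patch (2 is the new vpbroadcastq alternative, 3 is %vmovddup, 4 is movlhps); it replaces the flat `set_attr "mode"` list with a conditional attribute and is not presented as the committed form.

```
   ;; Sketch only: alternative 2 is TImode with AVX512VL, XImode otherwise
   ;; (the broadcast then writes the full 512-bit register).
   (set (attr "mode")
	(cond [(eq_attr "alternative" "2")
		 (if_then_else (match_test "TARGET_AVX512VL")
		   (const_string "TI")
		   (const_string "XI"))
	       (eq_attr "alternative" "3")
		 (const_string "DF")
	       (eq_attr "alternative" "4")
		 (const_string "V4SF")
	      ]
	      (const_string "TI")))
```

The default arm covers alternatives 0 and 1, which stay TImode as in the original list.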
On Thu, Jun 15, 2023 at 10:15 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.06.2023 09:45, Hongtao Liu wrote:
> > On Thu, Jun 15, 2023 at 3:07 PM Uros Bizjak via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >> On Thu, Jun 15, 2023 at 8:03 AM Jan Beulich via Gcc-patches
> >> <gcc-patches@gcc.gnu.org> wrote:
> >>> +    case 3:
> >>> +      return "%vmovddup\t{%1, %0|%0, %1}";
> >>> +    case 4:
> >>> +      return "movlhps\t%0, %0";
> >>> +    default:
> >>> +      gcc_unreachable ();
> >>> +    }
> >>> +}
> >>> +  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
> >>> +   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
> >>> +   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
> >>> +   (set_attr "mode" "TI,TI,TI,DF,V4SF")
> > alternative 2 should be XImode when !TARGET_AVX512VL.
>
> This gives me a chance to actually raise a related question I stumbled
> across several times: Which operand does the mode attribute actually
> describe? I've seen places where it's the source, but I've also seen
> places where it's the destination. Because of this mix I wasn't really
> sure that getting this attribute entirely correct is actually
> necessary, and hence I hoped it would be okay to not further complicate
> the attribute here.

It should be the mode the insn is operating in. So, zero-extended
SImode add is still operating in SImode, even if its output is DImode,
and TARGET_MMX_WITH_SSE are V4SFmode, even if their operands are all
V2SFmode.

Uros.
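The zero-extended add case can be illustrated with a deliberately simplified pattern. The name `*example_addsi_zext`, the single alternative and the bare "TARGET_64BIT" condition are assumptions made for this sketch; the real i386.md pattern has more alternatives and a stricter condition. The point is only that the destination is DImode while the `mode` attribute says SI, because the instruction performs a 32-bit add.

```
;; Hypothetical, simplified pattern for illustration only.
(define_insn "*example_addsi_zext"
  [(set (match_operand:DI 0 "register_operand" "=r")
	(zero_extend:DI
	  (plus:SI (match_operand:SI 1 "register_operand" "%0")
		   (match_operand:SI 2 "register_operand" "r"))))
   (clobber (reg:CC FLAGS_REG))]
  "TARGET_64BIT"
  ;; %k0 prints the 32-bit name of the DImode destination register.
  "add{l}\t{%2, %k0|%k0, %2}"
  [(set_attr "type" "alu")
   (set_attr "mode" "SI")])
```

By the same reasoning, alternative 2 of "*vec_dupv2di" would be XImode without AVX512VL: the V2DI destination is only the low part of a 512-bit broadcast.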
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -25851,19 +25851,39 @@
 	   (symbol_ref "true")))])
 
 (define_insn "*vec_dupv2di"
-  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,x")
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v,v,v,x")
 	(vec_duplicate:V2DI
-	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,0")))]
+	  (match_operand:DI 1 "nonimmediate_operand" " 0,Yv,vm,Yvm,0")))]
   "TARGET_SSE"
-  "@
-   punpcklqdq\t%0, %0
-   vpunpcklqdq\t{%d1, %0|%0, %d1}
-   %vmovddup\t{%1, %0|%0, %1}
-   movlhps\t%0, %0"
-  [(set_attr "isa" "sse2_noavx,avx,sse3,noavx")
-   (set_attr "type" "sselog1,sselog1,sselog1,ssemov")
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig")
-   (set_attr "mode" "TI,TI,DF,V4SF")])
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "punpcklqdq\t%0, %0";
+    case 1:
+      return "vpunpcklqdq\t{%d1, %0|%0, %d1}";
+    case 2:
+      if (TARGET_AVX512VL)
+	return "vpbroadcastq\t{%1, %0|%0, %1}";
+      return "vpbroadcastq\t{%1, %g0|%g0, %1}";
+    case 3:
+      return "%vmovddup\t{%1, %0|%0, %1}";
+    case 4:
+      return "movlhps\t%0, %0";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "isa" "sse2_noavx,avx,avx512f,sse3,noavx")
+   (set_attr "type" "sselog1,sselog1,ssemov,sselog1,ssemov")
+   (set_attr "prefix" "orig,maybe_evex,evex,maybe_vex,orig")
+   (set_attr "mode" "TI,TI,TI,DF,V4SF")
+   (set (attr "enabled")
+     (if_then_else
+       (eq_attr "alternative" "2")
+       (symbol_ref "TARGET_AVX512VL
+		    || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
+       (const_string "*")))])
 
 (define_insn "avx2_vbroadcasti128_<mode>"
   [(set (match_operand:VI_256 0 "register_operand" "=x,v,v")