From patchwork Wed Jun 21 06:25:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 110791 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp4156625vqr; Tue, 20 Jun 2023 23:26:44 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5cuRSWgSw6e/3y4sDCOgJEOYGDPWGF82bxbP8T8erSoY/buR9WxdhlLiJdFzNjoD0qUhb6 X-Received: by 2002:a17:907:802:b0:974:fb94:8067 with SMTP id wv2-20020a170907080200b00974fb948067mr19971344ejb.23.1687328804011; Tue, 20 Jun 2023 23:26:44 -0700 (PDT) Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bf20-20020a0564021a5400b0051a327d1940si1910306edb.89.2023.06.20.23.26.43 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jun 2023 23:26:43 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=kGRBiYLg; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C6B833858D28 for ; Wed, 21 Jun 2023 06:26:42 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C6B833858D28 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1687328802; bh=tBTQyBHDfdKurtv2xLjhbYT6yt6F4IC9iBA2co03Dec=; h=Date:Subject:To:Cc:References:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=kGRBiYLgX/Vxz4gFOM2+47LFDYs1ewWPhE0PsXAMfpaqz4JrnOSS3wPt1pkGhyPEV JnDHkS8m4oUYnNgzYdbvAB9wD3bm5NC+udQ4FjSztGPnnusNLXbw7xz4Y1G6Ze3bkV rQqos5oGZrodbCQqbBllD7FsamhFicgTwxJKsrkU= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on2087.outbound.protection.outlook.com [40.107.105.87]) by sourceware.org (Postfix) with ESMTPS id 7E5673858D1E for ; Wed, 21 Jun 2023 06:25:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7E5673858D1E ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=apDheHSJm1tgsmyZ+7qxz0QiXHLAkhwUvlYW3V2mYIaDAE7GqO6K1aBvXoMlKcoJHFdD3OMpS0z/7FneIT5P9dHUGKf3VwR7XDjrMnKp6Xu9iT9wFixUOnI+puskmL2hXfhViJgiaVHLPsSKHRffwBZNqtlY4jj5aZVG4DAvfIyDRREWObk9H1QZJNRnKMscbwJwvaEPQdT8H7YOMPg+KUOqt6w/EwsRGxLKLtEEXsva3kOmf/TgOV7DD89b7Y8PBQNRiuDK9jgFQjIqXX/w1tB0e16mIXLTKS7S8UwG6HkYumlV3eK2tQBinh0eRgt5feKa1NAgUGMqOSuc8J4aDg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=tBTQyBHDfdKurtv2xLjhbYT6yt6F4IC9iBA2co03Dec=; b=h1kta/2zehVmV4/uU0Zd6LZt+XRBRtz3tpW/81DGZEHNF+LAK6VrRMkgfCx8f1OJdJPwrUl7+sApjxuIgugk7aD0twIR2kOHu/1asuB0VM22PdkuJnskRVs7EndWDzluAGPT+0iRHRZPi/ApAS6cdb4OV+lJTFTS38Jyt5QuTQAZ+nMYF4fLKor+FUABhztMHNN/PSb86WSKS5Rc++UJ/vqzUw4jCD46U1XhumTCM4sV6dDX/5/ktjmE0ab6sb0p/xrTJIW07zLoHpFInzDHkp8XVDCYtAD9jPYkLqOE1kPIYEXBb6dSQ3YZVrxWQFJIBD9oOakIVjgUkZBuRsKdxQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none Received: from VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) by DUZPR04MB9982.eurprd04.prod.outlook.com (2603:10a6:10:4db::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6521.23; Wed, 21 Jun 2023 06:25:55 +0000 Received: from VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c]) by VE1PR04MB6560.eurprd04.prod.outlook.com ([fe80::e442:306f:7711:e24c%5]) with mapi id 15.20.6500.036; Wed, 21 Jun 2023 06:25:54 +0000 Message-ID: <457ffad0-9ecd-3e19-f5ab-6153ce4b8bad@suse.com> Date: Wed, 21 Jun 2023 08:25:52 +0200 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0 Subject: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations Content-Language: en-US To: "gcc-patches@gcc.gnu.org" Cc: Hongtao Liu , Kirill Yukhin References: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com> In-Reply-To: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com> X-ClientProxiedBy: FR3P281CA0058.DEUP281.PROD.OUTLOOK.COM (2603:10a6:d10:4b::19) To VE1PR04MB6560.eurprd04.prod.outlook.com (2603:10a6:803:122::25) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: VE1PR04MB6560:EE_|DUZPR04MB9982:EE_ X-MS-Office365-Filtering-Correlation-Id: 097b78ef-a884-4336-5d27-08db72205e28 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: ez6m3mzG+ruWG9rhSKA4E2K8wvAR67hxGF2vGaQE/ywVeOz40NhDtZpnuXTFyyG9rEIjfhZVZkrNcqm0QrFgjZmbhiZ0uk5DHXCXeMq3RP9XoWcT6TvAFDzp4t0IwfeYnvv44rgM/7hDoIHgJjse8kLaZG3uB8YaoiXYOiD3yP46obnvLuGWxnZRRtZmtILjxexnZqxJoalTPSkHJa9nDcmMNjXaCmUGwuI67q3BoPZzDIlwNfFCVKVgP4C5lSjRSV/OsnTLBL3ANvRfAjRuWY54Co9FmxmsFma14yL6nXMk8ekh9cwYhSDSwYfa8LNtNhCOYXK9T6djCy9Nmi+urvznEOblOro39E84K9e3/u780NcI/4vW3czr6BmHq/MglY8UKjWYaQeTjqood8eYrl2OlUbRYkSNkcQToY3EbxVBWF050KeQNkpAy63r8rh1f4pmMdYWXZ6u2zITpgWAOwTYiDzJBGZzRD5Pl8Zez6LU5U7YzLdBVDFRuX8OEQ3o44guyxVb5fCFLKbxrjJBN4mdRLOvmDN5an1E4+Raf25Ycyh0nlVV/c3XinU85+0C9t55xuSV06AvnmZzgWc7E/bfiFfYGneDw4Texk4aeHxIVyyp1hgklknjV/04pKzNgyKtS6c5yQ8GMNvKL8FfKg== X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VE1PR04MB6560.eurprd04.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230028)(366004)(39860400002)(376002)(136003)(346002)(396003)(451199021)(2616005)(84970400001)(6506007)(26005)(186003)(6512007)(38100700002)(6486002)(478600001)(54906003)(31686004)(36756003)(31696002)(86362001)(66556008)(66946007)(66476007)(8936002)(8676002)(4326008)(316002)(6916009)(2906002)(5660300002)(41300700001)(45980500001)(43740500002); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: =?utf-8?q?M1XbY6BsExRtWwE2rZWxVN2c/Jhm?= =?utf-8?q?/MT0ECFL1pRrZwz0xc7Ef90qK39gXrBWZ3dZlzxzZSap4NsCRPsqfzbnSqqs1bt2k?= =?utf-8?q?Wvl2hGUTGCkfQuXml4p0/sAD/qA4SGwAC0iTmIwEpwGAOWfltoqfRxJ1OtZPZJisv?= =?utf-8?q?hh9DAZfBjdkBwATRqhZtkQu2vCdAv6hO3ABoK5A7TtfKv3ahXcRwd4sejLtGvGCzx?= =?utf-8?q?JuVqW7JV9Hxtlzx7npvKajyP2/ZDzNXDmWZ3DJt58af+jUqjKM6hDaQEGVPtf0PJU?= =?utf-8?q?hpkKMh3olMR2tMNKve56yHDeh/PtPqJNXp1hRkhW3Oa7natd6xRYDaZaak3nVQzfY?= =?utf-8?q?eCUDfMqCkbQoUiro3LBx/h/rhOcwDkfEYDZnqgwKXhz/XQI+thZf2jyXTEz4t6niT?= =?utf-8?q?n5VOZzp6xYOOVJKyNTx09RJ/E6GII5/g6XnOiglXINkGuT88SfpIK/6BFcBoQAZkL?= =?utf-8?q?8jPHUki+mItu1KoCOZ+iV9YiQkW8XUzhlDfXD7A0Zj24QYE+rMwG/qEjnmq2xofLh?= =?utf-8?q?DgzB61C8BtMAF5m66WhlN1S3ZbAMSuZ/LhrOywwis+RU+OOXEmkmgySqFwclfyEX5?= =?utf-8?q?UY4M/5C1r+5Ttv2N/ewIUa2n6DgBuueOO5v86misSbVFsjsFU0LXcPZgy4HhLciZ6?= =?utf-8?q?4n9bVLUBCvCpShkYQk0iWY6bKS/W3e5NpAMmfW5GZgNVcPrtT/q9nHJFUsJ9mxHPs?= =?utf-8?q?MGVGkU4b182vHFtuy21JH1D+aW2hJGgtYVaqUEDy2xh4P75FOOpw1Hi+3xrqnuhMg?= =?utf-8?q?5Om9hFGOwksYGHiIobnKOhwBzVrCZ3sCMYKdJUqrXyN5wXeh6/YCjz9B4YCRdFHJk?= =?utf-8?q?4uhhHcM3Rss/4AZUy4Gq9VwftTV74Gp02vp3wZnU68ZlzPibiPQZNFSDNVmBR6JOR?= =?utf-8?q?RlbhF7gXNDe1wcCqbznFnbQP4wx7nGvr3HTpKE9NiKOsGUuOaIsjNX37umuz9BKdP?= =?utf-8?q?fJtVUKQz3UVWk0nDdfXMd692+gQk0DXDhe3png/fXDi4GRGY9+HQXixHapVhFvSBg?= =?utf-8?q?+5pzvW/OwGe9xk7kobeL4Iiw3BOEO4/6VPxAHZ5EiVlbbPopRZ+LR/Dv6YaAFkJcy?= =?utf-8?q?mypt0/I3EuJkFr+Yc1BbXl1oL3rXBGtGXcUECJuagfuqqWpAbeFF1N/gupj/DqvWd?= =?utf-8?q?/dV+n5CEwoRfpt+ODC9atwx8RpStdQdCoYRbXCwFYssbKAt3XbtGy/fUd8EqM57nh?= =?utf-8?q?cXufIOpSjCBxvvOcTkk+cok6clHaf+nBEYQD5mJqHdwtXeXDy/ohgYN/bkcPhDkNH?= =?utf-8?q?bAsV9wv1DWM38mdoVxIEuAPvw/6XJQAsjh6/efXJ4pKlXLk3EMz97PPO833CuLHzP?= =?utf-8?q?uA8M+P79lrrcaEXZPeWRxhcOkGiWsnkkEU8UoLurQ74OGdFXXu56w+VMtI5JZt7MJ?= =?utf-8?q?k3Z8ZbAmuAUW1ClnSxU+E0F/MpzcsMsJ+5087ebBQi6TUlJFicWI4iVth1cuUL8Io?= =?utf-8?q?Ob6PPWjJGM5WgFOUzpprtRXiBtZFrKiThqMuqR2+YKEdHJ2ThK603MBeNZR15Kyl8?= =?utf-8?q?U5uxtJP8n9ju?= X-OriginatorOrg: suse.com X-MS-Exchange-CrossTenant-Network-Message-Id: 097b78ef-a884-4336-5d27-08db72205e28 X-MS-Exchange-CrossTenant-AuthSource: VE1PR04MB6560.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Jun 2023 06:25:54.9101 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: f7a17af6-1c5c-4a36-aa8b-f5be247aa4ba X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: eFOeJO+/bHthDTTDPhIefHWTB0Ul65eI8mVBa9nmHK7nwRMEm5olu8mFqXiNRRAo9iUlnhgMp8ga21ZGXsdi4w== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DUZPR04MB9982 X-Spam-Status: No, score=-3027.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jan Beulich via Gcc-patches From: Jan Beulich Reply-To: Jan Beulich Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769292488263427890?= X-GMAIL-MSGID: =?utf-8?q?1769292488263427890?= All combinations of and, ior, xor, and not involving two operands can be expressed that way in a single insn. gcc/ PR target/93768 * config/i386/i386.cc (ix86_rtx_costs): Further special-case bitwise vector operations. * config/i386/sse.md (*iornot3): New insn. (*xnor3): Likewise. (*3): Likewise. (andor): New code iterator. (nlogic): New code attribute. (ternlog_nlogic): Likewise. gcc/testsuite/ PR target/93768 gcc.target/i386/avx512-binop-not-1.h: New. gcc.target/i386/avx512-binop-not-2.h: New. gcc.target/i386/avx512f-orn-si-zmm-1.c: New test. gcc.target/i386/avx512f-orn-si-zmm-2.c: New test. --- The use of VI matches that in e.g. one_cmpl2 / one_cmpl2 and *andnot3, despite (here and there) - V64QI and V32HI being needlessly excluded when AVX512BW isn't enabled, - VTI not being covered, - vector modes more narrow than 16 bytes not being covered. --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -21178,6 +21178,32 @@ ix86_rtx_costs (rtx x, machine_mode mode return false; case IOR: + if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) + { + /* (ior (not ...) ...) can be a single insn in AVX512. */ + if (GET_CODE (XEXP (x, 0)) == NOT && TARGET_AVX512F + && (GET_MODE_SIZE (mode) == 64 + || (TARGET_AVX512VL + && (GET_MODE_SIZE (mode) == 32 + || GET_MODE_SIZE (mode) == 16)))) + { + rtx right = GET_CODE (XEXP (x, 1)) != NOT + ? XEXP (x, 1) : XEXP (XEXP (x, 1), 0); + + *total = ix86_vec_cost (mode, cost->sse_op) + + rtx_cost (XEXP (XEXP (x, 0), 0), mode, + outer_code, opno, speed) + + rtx_cost (right, mode, outer_code, opno, speed); + return true; + } + *total = ix86_vec_cost (mode, cost->sse_op); + } + else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) + *total = cost->add * 2; + else + *total = cost->add; + return false; + case XOR: if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) *total = ix86_vec_cost (mode, cost->sse_op); @@ -21198,11 +21224,20 @@ ix86_rtx_costs (rtx x, machine_mode mode /* pandn is a single instruction. */ if (GET_CODE (XEXP (x, 0)) == NOT) { + rtx right = XEXP (x, 1); + + /* (and (not ...) (not ...)) can be a single insn in AVX512. */ + if (GET_CODE (right) == NOT && TARGET_AVX512F + && (GET_MODE_SIZE (mode) == 64 + || (TARGET_AVX512VL + && (GET_MODE_SIZE (mode) == 32 + || GET_MODE_SIZE (mode) == 16)))) + right = XEXP (right, 0); + *total = ix86_vec_cost (mode, cost->sse_op) + rtx_cost (XEXP (XEXP (x, 0), 0), mode, outer_code, opno, speed) - + rtx_cost (XEXP (x, 1), mode, - outer_code, opno, speed); + + rtx_cost (right, mode, outer_code, opno, speed); return true; } else if (GET_CODE (XEXP (x, 1)) == NOT) @@ -21260,8 +21295,25 @@ ix86_rtx_costs (rtx x, machine_mode mode case NOT: if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT) - // vnot is pxor -1. - *total = ix86_vec_cost (mode, cost->sse_op) + 1; + { + /* (not (xor ...)) can be a single insn in AVX512. */ + if (GET_CODE (XEXP (x, 0)) == XOR && TARGET_AVX512F + && (GET_MODE_SIZE (mode) == 64 + || (TARGET_AVX512VL + && (GET_MODE_SIZE (mode) == 32 + || GET_MODE_SIZE (mode) == 16)))) + { + *total = ix86_vec_cost (mode, cost->sse_op) + + rtx_cost (XEXP (XEXP (x, 0), 0), mode, + outer_code, opno, speed) + + rtx_cost (XEXP (XEXP (x, 0), 1), mode, + outer_code, opno, speed); + return true; + } + + // vnot is pxor -1. + *total = ix86_vec_cost (mode, cost->sse_op) + 1; + } else if (GET_MODE_SIZE (mode) > UNITS_PER_WORD) *total = cost->add * 2; else --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -17616,6 +17616,98 @@ operands[2] = force_reg (V1TImode, CONSTM1_RTX (V1TImode)); }) +(define_insn "*iornot3" + [(set (match_operand:VI 0 "register_operand" "=v,v,v,v") + (ior:VI + (not:VI + (match_operand:VI 1 "bcst_vector_operand" "v,Br,v,m")) + (match_operand:VI 2 "bcst_vector_operand" "vBr,v,m,v")))] + "( == 64 || TARGET_AVX512VL + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)) + && (register_operand (operands[1], mode) + || register_operand (operands[2], mode))" +{ + if (!register_operand (operands[1], mode)) + { + if (TARGET_AVX512VL) + return "vpternlog\t{$0xdd, %1, %2, %0|%0, %2, %1, 0xdd}"; + return "vpternlog\t{$0xdd, %g1, %g2, %g0|%g0, %g2, %g1, 0xdd}"; + } + if (TARGET_AVX512VL) + return "vpternlog\t{$0xbb, %2, %1, %0|%0, %1, %2, 0xbb}"; + return "vpternlog\t{$0xbb, %g2, %g1, %g0|%g0, %g1, %g2, 0xbb}"; +} + [(set_attr "type" "sselog") + (set_attr "length_immediate" "1") + (set_attr "prefix" "evex") + (set (attr "mode") + (if_then_else (match_test "TARGET_AVX512VL") + (const_string "") + (const_string "XI"))) + (set (attr "enabled") + (if_then_else (eq_attr "alternative" "2,3") + (symbol_ref " == 64 || TARGET_AVX512VL") + (const_string "*")))]) + +(define_insn "*xnor3" + [(set (match_operand:VI 0 "register_operand" "=v,v") + (not:VI + (xor:VI + (match_operand:VI 1 "bcst_vector_operand" "%v,v") + (match_operand:VI 2 "bcst_vector_operand" "vBr,m"))))] + "( == 64 || TARGET_AVX512VL + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)) + && (register_operand (operands[1], mode) + || register_operand (operands[2], mode))" +{ + if (TARGET_AVX512VL) + return "vpternlog\t{$0x99, %2, %1, %0|%0, %1, %2, 0x99}"; + else + return "vpternlog\t{$0x99, %g2, %g1, %g0|%g0, %g1, %g2, 0x99}"; +} + [(set_attr "type" "sselog") + (set_attr "length_immediate" "1") + (set_attr "prefix" "evex") + (set (attr "mode") + (if_then_else (match_test "TARGET_AVX512VL") + (const_string "") + (const_string "XI"))) + (set (attr "enabled") + (if_then_else (eq_attr "alternative" "1") + (symbol_ref " == 64 || TARGET_AVX512VL") + (const_string "*")))]) + +(define_code_iterator andor [and ior]) +(define_code_attr nlogic [(and "nor") (ior "nand")]) +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")]) + +(define_insn "*3" + [(set (match_operand:VI 0 "register_operand" "=v,v") + (andor:VI + (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v")) + (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"))))] + "( == 64 || TARGET_AVX512VL + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)) + && (register_operand (operands[1], mode) + || register_operand (operands[2], mode))" +{ + if (TARGET_AVX512VL) + return "vpternlog\t{$, %2, %1, %0|%0, %1, %2, }"; + else + return "vpternlog\t{$, %g2, %g1, %g0|%g0, %g1, %g2, }"; +} + [(set_attr "type" "sselog") + (set_attr "length_immediate" "1") + (set_attr "prefix" "evex") + (set (attr "mode") + (if_then_else (match_test "TARGET_AVX512VL") + (const_string "") + (const_string "XI"))) + (set (attr "enabled") + (if_then_else (eq_attr "alternative" "1") + (symbol_ref " == 64 || TARGET_AVX512VL") + (const_string "*")))]) + (define_mode_iterator AVX512ZEXTMASK [(DI "TARGET_AVX512BW") (SI "TARGET_AVX512BW") HI]) --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512-binop-not-1.h @@ -0,0 +1,13 @@ +#include + +#define PASTER2(x,y) x##y +#define PASTER3(x,y,z) _mm##x##_##y##_##z +#define OP(vec, op, suffix) PASTER3 (vec, op, suffix) +#define DUP(vec, suffix, val) PASTER3 (vec, set1, suffix) (val) + +type +foo (type x, SCALAR *f) +{ + return OP (vec, op, suffix) (x, OP (vec, xor, suffix) (DUP (vec, suffix, *f), + DUP (vec, suffix, ~0))); +} --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512-binop-not-2.h @@ -0,0 +1,13 @@ +#include + +#define PASTER2(x,y) x##y +#define PASTER3(x,y,z) _mm##x##_##y##_##z +#define OP(vec, op, suffix) PASTER3 (vec, op, suffix) +#define DUP(vec, suffix, val) PASTER3 (vec, set1, suffix) (val) + +type +foo (type x, SCALAR *f) +{ + return OP (vec, op, suffix) (OP (vec, xor, suffix) (x, DUP (vec, suffix, ~0)), + DUP (vec, suffix, *f)); +} --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */ +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0xdd, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ + +#define type __m512i +#define vec 512 +#define op or +#define suffix epi32 +#define SCALAR int + +#include "avx512-binop-not-1.h" --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512f-orn-si-zmm-2.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */ +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0xbb, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ + +#define type __m512i +#define vec 512 +#define op or +#define suffix epi32 +#define SCALAR int + +#include "avx512-binop-not-2.h"