From patchwork Sun Jul 9 20:35:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 117530
From: "Roger Sayle"
To:
Cc: "'Uros Bizjak'"
Subject: [x86 PATCH] Add AVX512 support for STV of SI/DImode rotation by constant.
Date: Sun, 9 Jul 2023 21:35:53 +0100
Message-ID: <035a01d9b2a4$f6078950$e2169bf0$@nextmovesoftware.com>
X-Mailer: Microsoft Outlook 16.0
Content-Language: en-gb
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches"
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1770976693412146376
X-GMAIL-MSGID: 1770976693412146376

Following Uros' suggestion, this patch adds support for AVX512VL's
vpro[lr][dq] instructions to the recently added scalar-to-vector (STV)
enhancements to handle DImode and SImode rotations by a constant.

For the test cases:

unsigned long long rot1(unsigned long long x) {
  return (x>>1) | (x<<63);
}

void mem1(unsigned long long *p) {
  *p = rot1(*p);
}

With -m32 -O2 -mavx512vl, we currently generate:

rot1:   movl    4(%esp), %eax
        movl    8(%esp), %edx
        movl    %eax, %ecx
        shrdl   $1, %edx, %eax
        shrdl   $1, %ecx, %edx
        ret

mem1:   movl    4(%esp), %eax
        vmovq   (%eax), %xmm0
        vpshufd $20, %xmm0, %xmm0
        vpsrlq  $1, %xmm0, %xmm0
        vpshufd $136, %xmm0, %xmm0
        vmovq   %xmm0, (%eax)
        ret

With this patch, we now generate:

rot1:   vmovq   4(%esp), %xmm0
        vprorq  $1, %xmm0, %xmm0
        vmovd   %xmm0, %eax
        vpextrd $1, %xmm0, %edx
        ret

mem1:   movl    4(%esp), %eax
        vmovq   (%eax), %xmm0
        vprorq  $1, %xmm0, %xmm0
        vmovq   %xmm0, (%eax)
        ret

This patch has been tested on x86_64-pc-linux-gnu (cascadelake, which
has avx512) with make bootstrap and make -k check, both with and without
--target_board=unix{-m32}, with no new failures. Ok for mainline?


2023-07-09  Roger Sayle

gcc/ChangeLog
        * config/i386/i386-features.cc (compute_convert_gain): Tweak
        gains/costs for ROTATE/ROTATERT by integer constant on AVX512VL.
        (general_scalar_chain::convert_rotate): On TARGET_AVX512F generate
        avx512vl_rolv2di or avx512vl_rolv4si when appropriate.

gcc/testsuite/ChangeLog
        * gcc.target/i386/avx512vl-stv-rotatedi-1.c: New test case.

Cheers,
Roger

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 2e751d1..4d69251 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -585,7 +585,9 @@ general_scalar_chain::compute_convert_gain ()
           case ROTATE:
           case ROTATERT:
             igain += m * ix86_cost->shift_const;
-            if (smode == DImode)
+            if (TARGET_AVX512F)
+              igain -= ix86_cost->sse_op;
+            else if (smode == DImode)
               {
                 int bits = INTVAL (XEXP (src, 1));
                 if ((bits & 0x0f) == 0)
@@ -1225,6 +1227,8 @@ general_scalar_chain::convert_rotate (enum rtx_code code, rtx op0, rtx op1,
           emit_insn_before (pat, insn);
           result = gen_lowpart (V2DImode, tmp1);
         }
+      else if (TARGET_AVX512F)
+        result = simplify_gen_binary (code, V2DImode, op0, op1);
       else if (bits == 16 || bits == 48)
         {
           rtx tmp1 = gen_reg_rtx (V8HImode);
@@ -1269,6 +1273,8 @@ general_scalar_chain::convert_rotate (enum rtx_code code, rtx op0, rtx op1,
           emit_insn_before (pat, insn);
           result = gen_lowpart (V4SImode, tmp1);
         }
+      else if (TARGET_AVX512F)
+        result = simplify_gen_binary (code, V4SImode, op0, op1);
       else
         {
           if (code == ROTATE)
diff --git a/gcc/testsuite/gcc.target/i386/avx512vl-stv-rotatedi-1.c b/gcc/testsuite/gcc.target/i386/avx512vl-stv-rotatedi-1.c
new file mode 100644
index 0000000..2f0ead8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512vl-stv-rotatedi-1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mavx512vl" } */
+
+unsigned long long rot1(unsigned long long x) { return (x>>1) | (x<<63); }
+unsigned long long rot2(unsigned long long x) { return (x>>2) | (x<<62); }
+unsigned long long rot3(unsigned long long x) { return (x>>3) | (x<<61); }
+unsigned long long rot4(unsigned long long x) { return (x>>4) | (x<<60); }
+unsigned long long rot5(unsigned long long x) { return (x>>5) | (x<<59); }
+unsigned long long rot6(unsigned long long x) { return (x>>6) | (x<<58); }
+unsigned long long rot7(unsigned long long x) { return (x>>7) | (x<<57); }
+unsigned long long rot8(unsigned long long x) { return (x>>8) | (x<<56); }
+unsigned long long rot9(unsigned long long x) { return (x>>9) | (x<<55); }
+unsigned long long rot10(unsigned long long x) { return (x>>10) | (x<<54); }
+unsigned long long rot15(unsigned long long x) { return (x>>15) | (x<<49); }
+unsigned long long rot16(unsigned long long x) { return (x>>16) | (x<<48); }
+unsigned long long rot17(unsigned long long x) { return (x>>17) | (x<<47); }
+unsigned long long rot20(unsigned long long x) { return (x>>20) | (x<<44); }
+unsigned long long rot24(unsigned long long x) { return (x>>24) | (x<<40); }
+unsigned long long rot30(unsigned long long x) { return (x>>30) | (x<<34); }
+unsigned long long rot31(unsigned long long x) { return (x>>31) | (x<<33); }
+unsigned long long rot32(unsigned long long x) { return (x>>32) | (x<<32); }
+unsigned long long rot33(unsigned long long x) { return (x>>33) | (x<<31); }
+unsigned long long rot34(unsigned long long x) { return (x>>34) | (x<<30); }
+unsigned long long rot40(unsigned long long x) { return (x>>40) | (x<<24); }
+unsigned long long rot42(unsigned long long x) { return (x>>42) | (x<<22); }
+unsigned long long rot48(unsigned long long x) { return (x>>48) | (x<<16); }
+unsigned long long rot50(unsigned long long x) { return (x>>50) | (x<<14); }
+unsigned long long rot56(unsigned long long x) { return (x>>56) | (x<<8); }
+unsigned long long rot58(unsigned long long x) { return (x>>58) | (x<<6); }
+unsigned long long rot60(unsigned long long x) { return (x>>60) | (x<<4); }
+unsigned long long rot61(unsigned long long x) { return (x>>61) | (x<<3); }
+unsigned long long rot62(unsigned long long x) { return (x>>62) | (x<<2); }
+unsigned long long rot63(unsigned long long x) { return (x>>63) | (x<<1); }
+
+/* { dg-final { scan-assembler-times "vpro\[lr\]q" 29 } } */
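
Note on the rotation idiom used by the test case: each rotN function
writes a 64-bit rotate right by N as a pair of shifts joined with an OR,
and a rotate right by N gives the same result as a rotate left by 64-N.
That is presumably why the dg-final pattern accepts either vprolq or
vprorq. The standalone sketch below illustrates the equivalence in
portable C. The helper names rotr64, rotl64 and check_rotations are
hypothetical and are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits, in the same shift-or form as the rotN
   test functions.  Only valid for 0 < n < 64: a shift by 64 would
   be undefined behaviour in C.  */
static inline uint64_t rotr64(uint64_t x, unsigned n)
{
  return (x >> n) | (x << (64 - n));
}

/* Rotate left by n bits, again only for 0 < n < 64.  */
static inline uint64_t rotl64(uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Hypothetical self-check: aborts via assert if a rotate right by n
   ever differs from a rotate left by 64 - n for the given value.  */
static void check_rotations(uint64_t x)
{
  for (unsigned n = 1; n < 64; n++)
    assert(rotr64(x, n) == rotl64(x, 64 - n));
}
```

Calling check_rotations on any 64-bit value should return without
aborting; rotr64(x, 32) simply swaps the two 32-bit halves of x.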