From patchwork Fri Dec 23 16:46:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 36310
From: "Roger Sayle"
To: "'GCC Patches'"
Cc: "'Uros Bizjak'"
Subject: [x86 PATCH] Use movss/movsd to implement V4SI/V2DI VEC_PERM.
Date: Fri, 23 Dec 2022 16:46:06 -0000
Message-ID: <00e501d916ee$0ef7d210$2ce77630$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch tweaks the x86 backend to use the movss and movsd
instructions to perform some vector permutations on integer vectors
(V4SI and V2DI) in the same way they are used for floating point
vectors (V4SF and V2DF).

As a motivating example, consider:

typedef unsigned int v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));
v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }

which is currently compiled with -O2 to:

foo:    movdqa  %xmm0, %xmm2
        shufps  $80, %xmm0, %xmm1
        movdqa  %xmm1, %xmm0
        shufps  $232, %xmm2, %xmm0
        ret

bar:    movss   %xmm1, %xmm0
        ret

With this patch, both functions compile to the same form.
Likewise for the V2DI case:

typedef unsigned long v2di __attribute__((vector_size(16)));
typedef double v2df __attribute__((vector_size(16)));

v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }

which currently generates:

foo:    shufpd  $2, %xmm0, %xmm1
        movdqa  %xmm1, %xmm0
        ret

bar:    movsd   %xmm1, %xmm0
        ret

There are two possible approaches to adding integer vector forms of the
sse_movss and sse2_movsd instructions. One is to use a mode iterator
(VI4F_128 or VI8F_128) on the existing define_insn patterns, but this
requires renaming the patterns to sse_movss_<mode>, which then requires
changes to i386-builtins.def and throughout the backend to reflect the
new naming of gen_sse_movss_v4sf. The alternate approach (taken here)
is to simply clone and specialize the existing patterns. Uros, if you'd
prefer the first approach, I'm happy to make/test/commit those changes.

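An illustrative aside, not part of the patch: movss copies the low 32 bits of one register into another without interpreting them, so the same instruction can serve integer lanes as well as float lanes. The sketch below expresses the V4SI permutation with the standard SSE/SSE2 intrinsics, assuming an x86 target with SSE2 enabled. The helper names movss_v4si and movss_v4si_check are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE: _mm_move_ss */
#include <emmintrin.h>  /* SSE2: __m128i, the cast intrinsics, _mm_set_epi32 */

/* Hypothetical helper (not from the patch): returns {y[0], x[1], x[2], x[3]}
   for 32-bit integer lanes by going through the float intrinsic.  The casts
   only reinterpret the register contents; they do not convert values.  */
static inline __m128i movss_v4si(__m128i x, __m128i y)
{
    return _mm_castps_si128(_mm_move_ss(_mm_castsi128_ps(x),
                                        _mm_castsi128_ps(y)));
}

/* Checks the lanes for one sample input and returns 1 on success.  */
static int movss_v4si_check(void)
{
    __m128i x = _mm_set_epi32(4, 3, 2, 1);      /* lanes 0..3 = 1, 2, 3, 4 */
    __m128i y = _mm_set_epi32(40, 30, 20, 10);  /* lanes 0..3 = 10, 20, 30, 40 */
    int out[4];

    _mm_storeu_si128((__m128i *)out, movss_v4si(x, y));
    assert(out[0] == 10 && out[1] == 2 && out[2] == 3 && out[3] == 4);
    return 1;
}
```

This only demonstrates the lane semantics; which instructions a given compiler version actually emits for it is a separate question and is not claimed here.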
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. Ok for mainline?


2022-12-23  Roger Sayle

gcc/ChangeLog
	* config/i386/i386-expand.cc (expand_vec_perm_movs): Also allow
	V4SImode with TARGET_SSE and V2DImode with TARGET_SSE2.
	* config/i386/sse.md (sse_movss_v4si): New define_insn, a V4SI
	specialization of sse_movss.
	(sse2_movsd_v2di): Likewise, a V2DI specialization of sse2_movsd.

gcc/testsuite/ChangeLog
	* gcc.target/i386/sse-movss-4.c: New test case.
	* gcc.target/i386/sse2-movsd-3.c: New test case.


Thanks in advance,
Roger

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index a45640f..ad7745a 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -18903,8 +18903,10 @@ expand_vec_perm_movs (struct expand_vec_perm_d *d)
     return false;
 
   if (!(TARGET_SSE && vmode == V4SFmode)
+      && !(TARGET_SSE && vmode == V4SImode)
       && !(TARGET_MMX_WITH_SSE && vmode == V2SFmode)
-      && !(TARGET_SSE2 && vmode == V2DFmode))
+      && !(TARGET_SSE2 && vmode == V2DFmode)
+      && !(TARGET_SSE2 && vmode == V2DImode))
     return false;
 
   /* Only the first element is changed.  */
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index de632b2..f5860f2c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10513,6 +10513,21 @@
    (set_attr "prefix" "orig,maybe_evex")
    (set_attr "mode" "SF")])
 
+(define_insn "sse_movss_v4si"
+  [(set (match_operand:V4SI 0 "register_operand" "=x,v")
+        (vec_merge:V4SI
+          (match_operand:V4SI 2 "register_operand" " x,v")
+          (match_operand:V4SI 1 "register_operand" " 0,v")
+          (const_int 1)))]
+  "TARGET_SSE"
+  "@
+   movss\t{%2, %0|%0, %2}
+   vmovss\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "orig,maybe_evex")
+   (set_attr "mode" "SF")])
+
 (define_insn "avx2_vec_dup<mode>"
   [(set (match_operand:VF1_128_256 0 "register_operand" "=v")
         (vec_duplicate:VF1_128_256
@@ -13523,6 +13538,21 @@
            (const_string "orig")))
    (set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])
 
+(define_insn "sse2_movsd_v2di"
+  [(set (match_operand:V2DI 0 "register_operand" "=x,v")
+        (vec_merge:V2DI
+          (match_operand:V2DI 2 "register_operand" " x,v")
+          (match_operand:V2DI 1 "register_operand" " 0,v")
+          (const_int 1)))]
+  "TARGET_SSE2"
+  "@
+   movsd\t{%2, %0|%0, %2}
+   vmovsd\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "orig,maybe_evex")
+   (set_attr "mode" "DF")])
+
 (define_insn "vec_dupv2df"
   [(set (match_operand:V2DF 0 "register_operand" "=x,x,v")
         (vec_duplicate:V2DF
diff --git a/gcc/testsuite/gcc.target/i386/sse-movss-4.c b/gcc/testsuite/gcc.target/i386/sse-movss-4.c
new file mode 100644
index 0000000..ec3019c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse-movss-4.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse" } */
+
+typedef unsigned int v4si __attribute__((vector_size(16)));
+typedef float v4sf __attribute__((vector_size(16)));
+
+v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
+v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }
+
+/* { dg-final { scan-assembler-times "\tv?movss\t" 2 } } */
+/* { dg-final { scan-assembler-not "movaps" } } */
+/* { dg-final { scan-assembler-not "shufps" } } */
+/* { dg-final { scan-assembler-not "vpblendw" } } */
diff --git a/gcc/testsuite/gcc.target/i386/sse2-movsd-3.c b/gcc/testsuite/gcc.target/i386/sse2-movsd-3.c
new file mode 100644
index 0000000..db120b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse2-movsd-3.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -msse2" } */
+
+typedef unsigned long v2di __attribute__((vector_size(16)));
+typedef double v2df __attribute__((vector_size(16)));
+
+v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
+v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }
+
+/* { dg-final { scan-assembler-times "\tv?movsd\t" 2 } } */
+/* { dg-final { scan-assembler-not "v?shufpd" } } */
+/* { dg-final { scan-assembler-not "movdqa" } } */
+/* { dg-final { scan-assembler-not "pshufd" } } */
+/* { dg-final { scan-assembler-not "v?punpckldq" } } */
+/* { dg-final { scan-assembler-not "v?movq" } } */
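
The two new tests are compile-only and inspect the generated assembly. A possible runtime complement is sketched below; it is not part of the submitted patch. It relies only on the GCC vector extensions already used by the tests. The names foo4, foo2 and check_perms are made up for illustration, and unsigned long long is assumed for the 64-bit lanes so the check does not depend on the width of long (the patch's own test uses unsigned long and skips ia32).

```c
#include <assert.h>

typedef unsigned int v4si __attribute__((vector_size(16)));
typedef unsigned long long v2di __attribute__((vector_size(16)));

/* Same permutations as in the tests above: take lane 0 from y and the
   remaining lanes from x.  noinline keeps them from being folded away
   at compile time.  */
__attribute__((noinline)) v4si foo4(v4si x, v4si y)
{
    return (v4si){y[0], x[1], x[2], x[3]};
}

__attribute__((noinline)) v2di foo2(v2di x, v2di y)
{
    return (v2di){y[0], x[1]};
}

/* Checks both permutations on sample inputs and returns 1 on success.  */
int check_perms(void)
{
    v4si a = {1, 2, 3, 4}, b = {10, 20, 30, 40};
    v2di c = {5, 6}, d = {50, 60};
    v4si r4 = foo4(a, b);
    v2di r2 = foo2(c, d);

    assert(r4[0] == 10 && r4[1] == 2 && r4[2] == 3 && r4[3] == 4);
    assert(r2[0] == 50 && r2[1] == 6);
    return 1;
}
```

A check of this kind verifies only that the selected lanes are correct whichever instruction is chosen; it says nothing about whether movss or movsd is actually emitted, which is what the scan-assembler directives cover.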