From patchwork Thu Aug 3 07:10:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 130341
X-Original-To: gcc-patches@gcc.gnu.org
From: "Roger Sayle"
Cc: "'Uros Bizjak'"
Subject: [x86 PATCH] Split SUBREGs of SSE vector registers into vec_select insns.
Date: Thu, 3 Aug 2023 08:10:17 +0100
Message-ID: <00c601d9c5d9$8f5ad4d0$ae107e70$@nextmovesoftware.com>

This patch is the final piece in the series to address the ABI issues
affecting PR 88873.  The previous patches tackled inserting DFmode
values into V2DFmode registers, by introducing insvti_{low,high}part
patterns.  This patch improves the extraction of DFmode values from
V2DFmode registers via TImode intermediates.

I'd initially thought this would require new extvti_{low,high}part
patterns to be defined, but all that's required is to recognize that
the SUBREG idioms produced by combine are equivalent to (forms of)
vec_select patterns.  The target-independent middle-end can't be sure
that the appropriate vec_select instruction exists on the target, and
hence doesn't canonicalize a SUBREG of a vector mode as a vec_select,
but the backend can provide a define_split stating where and when this
is useful, for example, considering whether the operand is in memory,
or whether !TARGET_SSE_MATH and the destination is i387.
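
To make the equivalence concrete, here is a minimal sketch in C (not
part of the patch) using the GCC vector extension.  The type v2df and
the helpers lane_select and lane_subreg are hypothetical names chosen
for illustration only.  Reading the double at byte offset 0 or 8 of a
16-byte vector, which is roughly what the SUBREG expresses, should give
the same value as selecting element 0 or 1, which is what the
vec_select expresses.

```c
#include <string.h>

/* Illustrative only: two doubles in 16 bytes, analogous to V2DFmode.
   Relies on the GCC/Clang vector_size extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Element view, analogous to
   (vec_select:DF x (parallel [(const_int i)])).  */
static double
lane_select (v2df x, int i)
{
  return x[i];
}

/* Byte-offset view, analogous to (subreg:DF x off) with off 0 or 8.  */
static double
lane_subreg (v2df x, size_t off)
{
  double d;
  memcpy (&d, (const char *) &x + off, sizeof d);
  return d;
}
```

With x = { 1.5, -2.25 }, lane_subreg (x, 8) and lane_select (x, 1)
both return the high element, which is the case the first define_split
below handles.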

For pr88873.c, gcc -O2 -march=cascadelake currently generates:

foo:    vpunpcklqdq     %xmm3, %xmm2, %xmm7
        vpunpcklqdq     %xmm1, %xmm0, %xmm6
        vpunpcklqdq     %xmm5, %xmm4, %xmm2
        vmovdqa         %xmm7, -24(%rsp)
        vmovdqa         %xmm6, %xmm1
        movq            -16(%rsp), %rax
        vpinsrq         $1, %rax, %xmm7, %xmm4
        vmovapd         %xmm4, %xmm6
        vfmadd132pd     %xmm1, %xmm2, %xmm6
        vmovapd         %xmm6, -24(%rsp)
        vmovsd          -16(%rsp), %xmm1
        vmovsd          -24(%rsp), %xmm0
        ret

With this patch, we now generate:

foo:    vpunpcklqdq     %xmm1, %xmm0, %xmm6
        vpunpcklqdq     %xmm3, %xmm2, %xmm7
        vpunpcklqdq     %xmm5, %xmm4, %xmm2
        vmovdqa         %xmm6, %xmm1
        vfmadd132pd     %xmm7, %xmm2, %xmm1
        vmovsd          %xmm1, %xmm1, %xmm0
        vunpckhpd       %xmm1, %xmm1, %xmm1
        ret

The improvement is even more dramatic when compared to the original
29 instructions shown in comment #8.  GCC 13, for example, required
12 transfers to/from memory.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-08-03  Roger Sayle

gcc/ChangeLog
        * config/i386/sse.md (define_split): Convert highpart:DF extract
        from V2DFmode register into a sse2_storehpd instruction.
        (define_split): Likewise, convert lowpart:DF extract from V2DF
        register into a sse2_storelpd instruction.

gcc/testsuite/ChangeLog
        * gcc.target/i386/pr88873.c: Tweak to check for improved code.


Thanks in advance,
Roger

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 35fd66e..bc419ff 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -13554,6 +13554,14 @@
   [(set_attr "type" "ssemov")
    (set_attr "mode" "V2SF,V4SF,V2SF")])
 
+;; Convert highpart SUBREG into sse2_storehpd or *vec_extractv2df_1_sse.
+(define_split
+  [(set (match_operand:DF 0 "register_operand")
+        (subreg:DF (match_operand:V2DF 1 "register_operand") 8))]
+  "TARGET_SSE"
+  [(set (match_dup 0)
+        (vec_select:DF (match_dup 1) (parallel [(const_int 1)])))])
+
 ;; Avoid combining registers from different units in a single alternative,
 ;; see comment above inline_secondary_memory_needed function in i386.cc
 (define_insn "sse2_storelpd"
@@ -13599,6 +13607,14 @@
   [(set_attr "type" "ssemov")
    (set_attr "mode" "V2SF,V4SF,V2SF")])
 
+;; Convert lowpart SUBREG into sse2_storelpd or *vec_extractv2df_0_sse.
+(define_split
+  [(set (match_operand:DF 0 "register_operand")
+        (subreg:DF (match_operand:V2DF 1 "register_operand") 0))]
+  "TARGET_SSE"
+  [(set (match_dup 0)
+        (vec_select:DF (match_dup 1) (parallel [(const_int 0)])))])
+
 (define_expand "sse2_loadhpd_exp"
   [(set (match_operand:V2DF 0 "nonimmediate_operand")
         (vec_concat:V2DF
diff --git a/gcc/testsuite/gcc.target/i386/pr88873.c b/gcc/testsuite/gcc.target/i386/pr88873.c
index d893aac..a3a7ef2 100644
--- a/gcc/testsuite/gcc.target/i386/pr88873.c
+++ b/gcc/testsuite/gcc.target/i386/pr88873.c
@@ -9,3 +9,5 @@ s_t foo (s_t a, s_t b, s_t c)
 }
 
 /* { dg-final { scan-assembler-times "vpunpcklqdq" 3 } } */
+/* { dg-final { scan-assembler "vunpckhpd" } } */
+/* { dg-final { scan-assembler-not "rsp" } } */