From patchwork Sat Jun 24 17:13:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 112476
X-Original-To: gcc-patches@gcc.gnu.org
From: "Roger Sayle"
Cc: "'Uros Bizjak'"
Subject: [x86_64 PATCH] Handle SUBREG conversions in TImode STV (for ptest).
Date: Sat, 24 Jun 2023 18:13:55 +0100
Message-ID: <00d401d9a6bf$42b20e70$c8162b50$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch teaches i386's STV pass how to handle SUBREG conversions,
i.e. that a TImode SUBREG can be transformed into a V1TImode SUBREG,
without worrying about other DEFs and USEs.

A motivating example where this is useful is

typedef long long __m128i __attribute__ ((__vector_size__ (16)));
int foo (__m128i x, __m128i y) {
  return (__int128)x == (__int128)y;
}

where with -O2 -msse4 we can now scalar-to-vector transform:

(insn 7 4 8 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (subreg:TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:TI (reg/v:V2DI 87 [ y ]) 0))) {*cmpti_doubleword}

into

(insn 17 4 7 2 (set (reg:V1TI 91)
        (xor:V1TI (subreg:V1TI (reg/v:V2DI 86 [ x ]) 0)
            (subreg:V1TI (reg/v:V2DI 87 [ y ]) 0)))
     (nil))
(insn 7 17 8 2 (set (reg:CCZ 17 flags)
        (unspec:CCZ [
                (reg:V1TI 91) repeated x2
            ] UNSPEC_PTEST)) {*sse4_1_ptestv1ti}
     (expr_list:REG_DEAD (reg/v:V2DI 87 [ y ])
        (expr_list:REG_DEAD (reg/v:V2DI 86 [ x ])
            (nil))))

with the dramatic effect that the assembly output before:

foo:    movaps  %xmm0, -40(%rsp)
        movq    -32(%rsp), %rdx
        movq    %xmm0, %rax
        movq    %xmm1, %rsi
        movaps  %xmm1, -24(%rsp)
        movq    -16(%rsp), %rcx
        xorq    %rsi, %rax
        xorq    %rcx, %rdx
        orq     %rdx, %rax
        sete    %al
        movzbl  %al, %eax
        ret

now becomes

foo:    pxor    %xmm1, %xmm0
        xorl    %eax, %eax
        ptest   %xmm0, %xmm0
        sete    %al
        ret

i.e. a 128-bit vector doesn't need to be transferred to the scalar
unit to be tested for equality.  The new test case includes additional
related examples that show similar improvements.

Previously we explicitly checked *cmpti_doubleword operands to be
either immediate constants, or a TImode REG or a TImode MEM.  By
enhancing this to allow a TImode SUBREG, we now handle everything
that would match the general_operand predicate, making this part of
STV more like other RTL passes (lra/reload).  The big change is that
unlike a regular DF USE, a SUBREG USE doesn't require us to analyze
and convert the rest of the DEF-USE chain.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-24  Roger Sayle

gcc/ChangeLog
        * config/i386/i386-features.cc (scalar_chain::add_insn): Don't
        call analyze_register_chain if the USE is a SUBREG.
        (timode_scalar_chain::convert_op): Call gen_lowpart to convert
        TImode SUBREGs to V1TImode SUBREGs.
        (convertible_comparison_p): We can now handle all general_operands
        of *cmp_doubleword.
        (timode_remove_non_convertible_regs): We only need to check TImode
        uses that aren't TImode SUBREGs of registers in other modes.

gcc/testsuite/ChangeLog
        * gcc.target/i386/sse4_1-ptest-7.c: New test case.


Thanks in advance,
Roger

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..6e9ba54 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -449,7 +449,8 @@ scalar_chain::add_insn (bitmap candidates, unsigned int insn_uid,
     return true;
 
   for (ref = DF_INSN_UID_USES (insn_uid); ref; ref = DF_REF_NEXT_LOC (ref))
-    if (!DF_REF_REG_MEM_P (ref))
+    if (DF_REF_TYPE (ref) == DF_REF_REG_USE
+        && !SUBREG_P (DF_REF_REG (ref)))
       if (!analyze_register_chain (candidates, ref, disallowed))
         return false;
 
@@ -1621,7 +1622,8 @@ timode_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
   else
     {
       gcc_assert (SUBREG_P (*op));
-      gcc_assert (GET_MODE (*op) == vmode);
+      if (GET_MODE (*op) != V1TImode)
+        *op = gen_lowpart (V1TImode, *op);
     }
 }
 
@@ -1912,12 +1914,8 @@ convertible_comparison_p (rtx_insn *insn, enum machine_mode mode)
   rtx op2 = XEXP (src, 1);
 
   /* *cmp_doubleword.  */
-  if ((CONST_SCALAR_INT_P (op1)
-       || ((REG_P (op1) || MEM_P (op1))
-           && GET_MODE (op1) == mode))
-      && (CONST_SCALAR_INT_P (op2)
-          || ((REG_P (op2) || MEM_P (op2))
-              && GET_MODE (op2) == mode)))
+  if (general_operand (op1, mode)
+      && general_operand (op2, mode))
     return true;
 
   /* *testti_doubleword.  */
@@ -2244,8 +2242,9 @@ timode_remove_non_convertible_regs (bitmap candidates)
                                              DF_REF_REGNO (ref));
 
         FOR_EACH_INSN_USE (ref, insn)
-          if (!DF_REF_REG_MEM_P (ref)
-              && GET_MODE (DF_REF_REG (ref)) == TImode)
+          if (DF_REF_TYPE (ref) == DF_REF_REG_USE
+              && GET_MODE (DF_REF_REG (ref)) == TImode
+              && !SUBREG_P (DF_REF_REG (ref)))
             timode_check_non_convertible_regs (candidates, regs,
                                                DF_REF_REGNO (ref));
       }
diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-ptest-7.c b/gcc/testsuite/gcc.target/i386/sse4_1-ptest-7.c
new file mode 100644
index 0000000..bb52d3b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse4_1-ptest-7.c
@@ -0,0 +1,22 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -msse4.1" } */
+
+typedef long long __m128i __attribute__ ((__vector_size__ (16)));
+
+int foo (__m128i x, __m128i y)
+{
+  return (__int128)x == (__int128)y;
+}
+
+int bar (__m128i x, __m128i y)
+{
+  return (__int128)(x^y) == 0;
+}
+
+int baz (__m128i x, __m128i y)
+{
+  return (__int128)(x==y) == ~0;
+}
+
+/* { dg-final { scan-assembler-times "ptest\[ \\t\]+%" 3 } } */
+/* { dg-final { scan-assembler-not "%\[er\]sp" } } */
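
As a reader's aid, and not part of the patch itself, the identity behind the pxor/ptest sequence can be sanity-checked at the source level: two 128-bit values are equal exactly when their XOR is zero. The sketch below assumes a GCC-compatible compiler on a target with __int128 support; the names v2di, eq_direct, eq_xor and self_check are hypothetical and do not appear in GCC or in the test case.

```c
#include <assert.h>

/* Hypothetical stand-in for the __m128i typedef used in the test case.  */
typedef long long v2di __attribute__ ((__vector_size__ (16)));

/* Direct 128-bit comparison, the same shape as foo() in the test case.  */
int eq_direct (v2di x, v2di y)
{
  return (__int128) x == (__int128) y;
}

/* XOR, then compare against zero: the same shape as bar() in the test
   case, and the source-level analogue of pxor followed by ptest.  */
int eq_xor (v2di x, v2di y)
{
  return (__int128) (x ^ y) == 0;
}

/* Both forms must agree on equal and on unequal inputs.  */
void self_check (void)
{
  v2di a = { 1, 2 };
  v2di b = { 1, 2 };
  v2di c = { 1, 3 };

  assert (eq_direct (a, b) && eq_xor (a, b));
  assert (!eq_direct (a, c) && !eq_xor (a, c));
}
```

Calling self_check () from a small driver exercises both forms. Whether a given compiler actually emits ptest for them depends on its version and options, which this sketch does not verify.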