From patchwork Thu Jul 6 12:04:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 116647
From: "Roger Sayle"
Cc: "'Uros Bizjak'"
Subject: [x86_64 PATCH] Improve __int128 argument passing (in ix86_expand_move).
Date: Thu, 6 Jul 2023 13:04:35 +0100
Message-ID: <014901d9b002$094f5ec0$1bee1c40$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

Passing 128-bit integer (TImode) parameters on x86_64 can sometimes
result in surprising code.  Consider the example below (from PR 43644):

unsigned __int128 foo(unsigned __int128 x, unsigned long long y)
{
  return x+y;
}

which currently results in 6 consecutive movq instructions:

foo:
        movq    %rsi, %rax
        movq    %rdi, %rsi
        movq    %rdx, %rcx
        movq    %rax, %rdi
        movq    %rsi, %rax
        movq    %rdi, %rdx
        addq    %rcx, %rax
        adcq    $0, %rdx
        ret

The underlying issue is that during RTL expansion, we generate the
following initial RTL for the x argument:

(insn 4 3 5 2 (set (reg:TI 85)
        (subreg:TI (reg:DI 86) 0)) "pr43644-2.c":5:1 -1
     (nil))
(insn 5 4 6 2 (set (subreg:DI (reg:TI 85) 8)
        (reg:DI 87)) "pr43644-2.c":5:1 -1
     (nil))
(insn 6 5 7 2 (set (reg/v:TI 84 [ x ])
        (reg:TI 85)) "pr43644-2.c":5:1 -1
     (nil))

which by combine/reload becomes

(insn 25 3 22 2 (set (reg/v:TI 84 [ x ])
        (const_int 0 [0])) "pr43644-2.c":5:1 -1
     (nil))
(insn 22 25 23 2 (set (subreg:DI (reg/v:TI 84 [ x ]) 0)
        (reg:DI 93)) "pr43644-2.c":5:1 90 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 93)
        (nil)))
(insn 23 22 28 2 (set (subreg:DI (reg/v:TI 84 [ x ]) 8)
        (reg:DI 94)) "pr43644-2.c":5:1 90 {*movdi_internal}
     (expr_list:REG_DEAD (reg:DI 94)
        (nil)))

where the heavy use of SUBREG SET_DESTs creates challenges for both
combine and register allocation.

The improvement proposed here is to avoid these problematic SUBREGs
by adding (two) special cases to ix86_expand_move.
For insn 4, which sets a TImode destination from a paradoxical SUBREG
to assign the lowpart, we can use an explicit zero extension
(zero_extendditi2 was added in July 2022), and for insn 5, which sets
the highpart of a TImode register, we can use the *insvti_highpart_1
instruction (which was added in May 2023, after being approved for
stage1 in January).  This allows combine to work its magic, merging
these insns into a *concatditi3 and from there into other optimized
forms.

So for the test case above, we now generate only a single movq:

foo:
        movq    %rdx, %rax
        xorl    %edx, %edx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret

But there is a little bad news.  This patch causes two (minor) missed
optimization regressions on x86_64: gcc.target/i386/pr82580.c and
gcc.target/i386/pr91681-1.c.  As shown in the test case above, we're
no longer generating adcq $0, but instead using xorl.  For the other
FAIL, register allocation now has more freedom and is (arbitrarily)
choosing a register assignment that doesn't match what the test is
expecting.  These issues are easier to explain and fix once this patch
is in the tree.

The good news is that this approach fixes a number of long-standing
issues that need to be checked in bugzilla, including PR target/110533,
which was opened earlier this week.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with only the two new FAILs described above.  Ok for mainline?


2023-07-06  Roger Sayle

gcc/ChangeLog
        PR target/43644
        PR target/110533
        * config/i386/i386-expand.cc (ix86_expand_move): Convert SETs of
        TImode destinations from paradoxical SUBREGs (setting the lowpart)
        into explicit zero extensions.  Use *insvti_highpart_1 instruction
        to set the highpart of a TImode destination.

gcc/testsuite/ChangeLog
        PR target/43644
        PR target/110533
        * gcc.target/i386/pr110533.c: New test case.
        * gcc.target/i386/pr43644-2.c: Likewise.


Thanks in advance,
Roger

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 567248d..92ffa4b 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -429,6 +429,16 @@ ix86_expand_move (machine_mode mode, rtx operands[])
 
     default:
       break;
+
+    case SUBREG:
+      /* Transform TImode paradoxical SUBREG into zero_extendditi2.  */
+      if (TARGET_64BIT
+          && mode == TImode
+          && SUBREG_P (op1)
+          && GET_MODE (SUBREG_REG (op1)) == DImode
+          && SUBREG_BYTE (op1) == 0)
+        op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
+      break;
     }
 
   if ((flag_pic || MACHOPIC_INDIRECT)
@@ -532,6 +542,24 @@ ix86_expand_move (machine_mode mode, rtx operands[])
         }
     }
 
+  /* Use *insvti_highpart_1 to set highpart of TImode register.  */
+  if (TARGET_64BIT
+      && mode == DImode
+      && SUBREG_P (op0)
+      && SUBREG_BYTE (op0) == 8
+      && GET_MODE (SUBREG_REG (op0)) == TImode
+      && REG_P (SUBREG_REG (op0))
+      && REG_P (op1))
+    {
+      wide_int mask = wi::mask (64, false, 128);
+      rtx tmp = immed_wide_int_const (mask, TImode);
+      op0 = SUBREG_REG (op0);
+      tmp = gen_rtx_AND (TImode, copy_rtx (op0), tmp);
+      op1 = gen_rtx_ZERO_EXTEND (TImode, op1);
+      op1 = gen_rtx_ASHIFT (TImode, op1, GEN_INT (64));
+      op1 = gen_rtx_IOR (TImode, tmp, op1);
+    }
+
   emit_insn (gen_rtx_SET (op0, op1));
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr110533.c b/gcc/testsuite/gcc.target/i386/pr110533.c
new file mode 100644
index 0000000..513bcd4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110533.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O0" } */
+
+__attribute__((naked))
+void fn(__int128 a) {
+  asm("ret");
+}
+
+/* { dg-final { scan-assembler-not "mov" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr43644-2.c b/gcc/testsuite/gcc.target/i386/pr43644-2.c
new file mode 100644
index 0000000..d470b0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr43644-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+unsigned __int128 foo(unsigned __int128 x, unsigned long long y)
+{
+  return x+y;
+}
+
+/* { dg-final { scan-assembler-times "movq" 1 } } */
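
For readers less familiar with the RTL, the arithmetic that the two
special cases in ix86_expand_move express can be sketched in plain C.
This is only an illustrative model, not code from the patch: the names
u128, zext_lowpart, set_highpart, concat_di and check_model are
hypothetical, and it assumes a compiler that provides unsigned __int128.
The first helper mirrors the explicit zero extension of the DImode
lowpart; the second mirrors the AND/ZERO_EXTEND/ASHIFT/IOR expression
built for the highpart, with the mask keeping the low 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shorthand for a TImode-sized value; not a GCC name.  */
typedef unsigned __int128 u128;

/* Model of the first special case: the TImode value is formed from a
   DImode lowpart by an explicit zero extension.  */
static u128 zext_lowpart(uint64_t lo)
{
  return (u128)lo;
}

/* Model of the second special case: keep the low 64 bits of X and
   insert HI into the upper 64 bits, i.e.
   (x & low_mask) | (zero_extend (hi) << 64).  */
static u128 set_highpart(u128 x, uint64_t hi)
{
  const u128 low_mask = (u128)UINT64_MAX;  /* low 64 bits set */
  return (x & low_mask) | ((u128)hi << 64);
}

/* The two steps together concatenate two 64-bit halves.  */
static u128 concat_di(uint64_t lo, uint64_t hi)
{
  return set_highpart(zext_lowpart(lo), hi);
}

/* Same function as in the pr43644-2.c test case.  */
static u128 foo(u128 x, unsigned long long y)
{
  return x + y;
}

/* Small self-check; returns 0 when every property holds.  */
static int check_model(void)
{
  uint64_t lo = 0xfffffffffffffff0ULL, hi = 1;
  u128 x = concat_di(lo, hi);
  assert((uint64_t)x == lo);
  assert((uint64_t)(x >> 64) == hi);

  /* Adding 0x20 overflows the low half and carries into the high half.  */
  u128 r = foo(x, 0x20);
  assert((uint64_t)r == 0x10);
  assert((uint64_t)(r >> 64) == 2);
  return 0;
}
```

Calling check_model() from a main function exercises both helpers.  The
carry case corresponds to the addq/adcq pair in the generated assembly.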