From patchwork Mon Mar 6 12:47:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 64630 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp1818274wrd; Mon, 6 Mar 2023 04:47:54 -0800 (PST) X-Google-Smtp-Source: AK7set8NRkAxt32L8a8eJc5irt9I4CTykx6mvKVMoAhyjBpxtFUIW3xRRjE8y6RwK4PE522qKKeU X-Received: by 2002:a17:906:784:b0:8ae:ea17:bae4 with SMTP id l4-20020a170906078400b008aeea17bae4mr13799635ejc.30.1678106874467; Mon, 06 Mar 2023 04:47:54 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1678106874; cv=none; d=google.com; s=arc-20160816; b=truXkPcVZIVurOkVPR1Y7T5WNjWeU8jqqPiC8RuAsTcJJ7fKv1uu95wUNa55rxITgD 4+zYfzzDfQQHA22fU6rXzBhfNhBUQoi3v65WD78Q+bh150mcBgzUQ245qym9xMUuestG f0Yw2pcp0Wbgz1hix00IFFQOEx3rEiQ329nVniFO1Nr/GpO/+k1PdgkO/mXZK6Sto4Gt P6AAGLvyYPHNFlFwMKrPNafYDfkBAq5Bip8CmphHZHkA1sfJIEut6O9U9wT/KOK29Gv2 hbHIBQ4kBvDB9zDXUPCfaeVHi/AFL42xGM6KDSCvemmBuW7OKwpEszffgNH6ACHNzctZ Tmcw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:mime-version :user-agent:message-id:in-reply-to:date:references:subject:cc :mail-followup-to:to:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=xqo+cCHY5nBk0pJK2WZn++/BRwPtKioXvV/80JJrRgU=; b=w4hi9kdvcd1MLxtHzvyDLXTCj3AuuFwfo4NWNyPSIHxa6o3Ywx84v+nQngL2q1EgZC 2SgXRNd3CBoI7tOB6rWjAekzXHq82b44eE4AgCIktSSabrcByMGK86Q3fGW7w/Sxona3 KIjaXo5cWo8Jh5YEEntoO5W1fcfzCP2oJQ+69ugrQTCy/yV3I2XapiPeFr9DQuuIcr5h 7XIVNZKwZn7H2kOA+P2QxBRBP/DnviT1OAwOm6hAgPSkGMUCSkmjZ3I8J/cShNeKCWxh c1HVKS+AU6YB90QqKqc2g+gHrfoV2RN56426tPR+MXjm/rXXmv8sHQ0nmj4CuZrPfNMq y7Zg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=aAm2zl1C; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id dn7-20020a17090794c700b008eb56f9a75asi7894420ejc.58.2023.03.06.04.47.54 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Mar 2023 04:47:54 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=aAm2zl1C; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 46D843857400 for ; Mon, 6 Mar 2023 12:47:53 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 46D843857400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1678106873; bh=xqo+cCHY5nBk0pJK2WZn++/BRwPtKioXvV/80JJrRgU=; h=To:Cc:Subject:References:Date:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=aAm2zl1CmmyQzrmF6F2/qfD6lzZb0IHumKLKjc9evmuvE747w83v4U9jESrzRWAYA MaWvg5yVfe/BSNR/wa4K4MtOZXEjIa6wcN2OBFUvYInlg0blUuCagPRYP1LIDhcBkQ /DbVM7iu5dRdVi7bEdSDmcbjfeGb+2M+KoDQdiQg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 5EECC3858D39 for ; Mon, 6 Mar 2023 12:47:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5EECC3858D39 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 6122F13D5; Mon, 6 Mar 2023 04:47:52 -0800 (PST) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 337493F885; Mon, 6 Mar 2023 04:47:08 -0800 (PST) To: Jeff Law via Gcc-patches Mail-Followup-To: Jeff Law via Gcc-patches , Tamar Christina , Segher Boessenkool , Roger Sayle , Jeff Law , richard.sandiford@arm.com Cc: Tamar Christina , Segher Boessenkool , Roger Sayle , Jeff Law Subject: [PATCH] combine: Try harder to form zero_extends [PR106594] References: <000c01d94ec7$a6921430$f3b63c90$@nextmovesoftware.com> <20230304221749.GK25951@gate.crashing.org> <3b1ed616-5d90-7a66-63b5-bdb5e320eebf@gmail.com> Date: Mon, 06 Mar 2023 12:47:06 +0000 In-Reply-To: <3b1ed616-5d90-7a66-63b5-bdb5e320eebf@gmail.com> (Jeff Law via Gcc-patches's message of "Sun, 5 Mar 2023 12:56:02 -0700") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-34.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759622594223386134?= X-GMAIL-MSGID: =?utf-8?q?1759622594223386134?= Jeff Law via Gcc-patches writes: > On 3/5/23 12:28, Tamar Christina via Gcc-patches wrote: >> >> The regression was reported during stage-1. A patch was provided during stage 1 and the discussions around combine stalled. >> >> The regression for AArch64 needs to be fixed in GCC 13. The hit is too big just to "take". >> >> So we need a way forward, even if it's stage-4. > Then it needs to be in a way that works within the design constraints of > combine. > > As Segher has indicated, using a magic constant to say "this is always > cheap enough" isn't acceptable. Furthermore, what this patch changes is > combine's internal canonicalization of extensions into shift pairs. > > So I think another path forward needs to be found. I don't see hacking > up expand_compound_operation is viable. > > Jeff How about the patch below? Tested on aarch64-linux-gnu, but I'll test on x86_64-linux-gnu and powerpc64le-linux-gnu before committing. ----------------------------- The PR contains a case where we want GCC to combine a sign_extend into an address (which is something that GCC 12 could do). After substitution, the sign_extend goes through the usual expand_compound_operation wrangler and, after taking nonzero_bits into account, make_compound_operation is presented with: X1: (and (mult (subreg x) (const_int N2)) (const_int N1)) where: (a) the subreg is paradoxical (b) N2 is a power of 2 (c) all bits outside N1/N2 are known to be zero in x This is equivalent to: X2: (mult (and (subreg x) (const_int N1/N2)) (const_int N2)) Given in this form, existing code would already use (c) to convert the inner "and" to a zero_extend: (mult (zero_extend x) (const_int N2)) This patch makes the code handle X1 as well as X2. Logically, it would make sense to do the same for ASHIFT, which would be the canonical form outside memory addresses. However, it seemed better to do the minimum possible, given the late stage in the release cycle. gcc/ PR rtl-optimization/106594 * combine.cc (make_compound_operation_int): Extend the AND to ZERO_EXTEND transformation so that it can handle an intervening multiplication by a power of two. gcc/testsuite/ * gcc.target/aarch64/pr106594.c: New test. --- gcc/combine.cc | 60 ++++++++++++++------- gcc/testsuite/gcc.target/aarch64/pr106594.c | 21 ++++++++ 2 files changed, 63 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr106594.c diff --git a/gcc/combine.cc b/gcc/combine.cc index 053879500b7..b45042bbafd 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -8188,28 +8188,52 @@ make_compound_operation_int (scalar_int_mode mode, rtx *x_ptr, /* If the one operand is a paradoxical subreg of a register or memory and the constant (limited to the smaller mode) has only zero bits where the sub expression has known zero bits, this can be expressed as - a zero_extend. */ - else if (GET_CODE (XEXP (x, 0)) == SUBREG) - { - rtx sub; + a zero_extend. + + Look also for the case where the operand is such a subreg that + is multiplied by 2**N: - sub = XEXP (XEXP (x, 0), 0); - machine_mode sub_mode = GET_MODE (sub); - int sub_width; - if ((REG_P (sub) || MEM_P (sub)) - && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width) - && sub_width < mode_width) + (and (mult ... 2**N) M) --> (mult (and ... M>>N) 2**N) -> ... */ + else + { + rtx y = XEXP (x, 0); + rtx top = y; + int shift = 0; + if (GET_CODE (y) == MULT + && CONST_INT_P (XEXP (y, 1)) + && pow2p_hwi (INTVAL (XEXP (y, 1)))) { - unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode); - unsigned HOST_WIDE_INT mask; + shift = exact_log2 (INTVAL (XEXP (y, 1))); + y = XEXP (y, 0); + } + if (GET_CODE (y) == SUBREG) + { + rtx sub; - /* original AND constant with all the known zero bits set */ - mask = UINTVAL (XEXP (x, 1)) | (~nonzero_bits (sub, sub_mode)); - if ((mask & mode_mask) == mode_mask) + sub = XEXP (y, 0); + machine_mode sub_mode = GET_MODE (sub); + int sub_width; + if ((REG_P (sub) || MEM_P (sub)) + && GET_MODE_PRECISION (sub_mode).is_constant (&sub_width) + && sub_width < mode_width) { - new_rtx = make_compound_operation (sub, next_code); - new_rtx = make_extraction (mode, new_rtx, 0, 0, sub_width, - 1, 0, in_code == COMPARE); + unsigned HOST_WIDE_INT mode_mask = GET_MODE_MASK (sub_mode); + unsigned HOST_WIDE_INT mask; + + /* The shifted AND constant with all the known zero + bits set. */ + mask = ((UINTVAL (XEXP (x, 1)) >> shift) + | ~nonzero_bits (sub, sub_mode)); + if ((mask & mode_mask) == mode_mask) + { + new_rtx = make_compound_operation (sub, next_code); + new_rtx = make_extraction (mode, new_rtx, + 0, 0, sub_width, + 1, 0, in_code == COMPARE); + if (shift) + new_rtx = gen_rtx_fmt_ee (GET_CODE (top), mode, + new_rtx, XEXP (top, 1)); + } } } } diff --git a/gcc/testsuite/gcc.target/aarch64/pr106594.c b/gcc/testsuite/gcc.target/aarch64/pr106594.c new file mode 100644 index 00000000000..f10d72665e0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr106594.c @@ -0,0 +1,21 @@ +/* { dg-options "-O2" } */ + +extern const int constellation_64qam[64]; + +void foo(int nbits, + const char *p_src, + int *p_dst) { + + while (nbits > 0U) { + char first = *p_src++; + + char index1 = ((first & 0x3) << 4) | (first >> 4); + + *p_dst++ = constellation_64qam[index1]; + + nbits--; + } +} + +/* { dg-final { scan-assembler {ldr\tw[0-9]+, \[x[0-9]+, w[0-9]+, [su]xtw #?2\]} } } */ +