From patchwork Thu Dec 28 14:59:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 183754
From: "Roger Sayle"
To:
Cc: "'Jeff Law'" , "'YunQiang Su'"
Subject: [PATCH] Improved RTL expansion of field assignments into promoted registers.
Date: Thu, 28 Dec 2023 14:59:40 -0000
Message-ID: <005901da399e$7d13b330$773b1990$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch fixes PR rtl-optimization/104914 by tweaking/improving
the way that fields are written into a pseudo register that needs
to be kept sign extended.

The motivating example from the bugzilla PR is:

extern void ext(int);
void foo(const unsigned char *buf) {
  int val;
  ((unsigned char*)&val)[0] = *buf++;
  ((unsigned char*)&val)[1] = *buf++;
  ((unsigned char*)&val)[2] = *buf++;
  ((unsigned char*)&val)[3] = *buf++;
  if(val > 0)
    ext(1);
  else
    ext(0);
}

which at the end of the tree optimization passes looks like:

void foo (const unsigned char * buf)
{
  int val;
  unsigned char _1;
  unsigned char _2;
  unsigned char _3;
  unsigned char _4;
  int val.5_5;

  [local count: 1073741824]:
  _1 = *buf_7(D);
  MEM[(unsigned char *)&val] = _1;
  _2 = MEM[(const unsigned char *)buf_7(D) + 1B];
  MEM[(unsigned char *)&val + 1B] = _2;
  _3 = MEM[(const unsigned char *)buf_7(D) + 2B];
  MEM[(unsigned char *)&val + 2B] = _3;
  _4 = MEM[(const unsigned char *)buf_7(D) + 3B];
  MEM[(unsigned char *)&val + 3B] = _4;
  val.5_5 = val;
  if (val.5_5 > 0)
    goto ; [59.00%]
  else
    goto ; [41.00%]

  [local count: 633507681]:
  ext (1);
  goto ; [100.00%]

  [local count: 440234144]:
  ext (0);

  [local count: 1073741824]:
  val ={v} {CLOBBER(eol)};
  return;

}

Here four bytes are being sequentially written into the SImode value
val.  On some platforms, such as MIPS64, this SImode value is kept in
a 64-bit register, suitably sign-extended.  The function
expand_assignment contains logic to handle this via
SUBREG_PROMOTED_VAR_P (around line 6264 in expr.cc) which outputs an
explicit extension operation after each store_field (typically insv)
to such promoted/extended pseudos.

The first observation is that there's no need to perform sign
extension after each byte in the example above; the extension is
only required after changes to the most significant byte (i.e. to a
field that overlaps the most significant bit).

The bug fix is actually a bit more subtle, but at this point during
code expansion it's not safe to use a SUBREG when sign-extending this
field.
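
The first observation can be checked with a small model outside GCC.
The sketch below is plain C, not GCC code: it assumes a little-endian
byte layout, models the promoted pseudo as a uint64_t, and the names
reg64, sign_extend32, insert_field, store_promoted and assemble_le are
all hypothetical.  Only the test "bitpos + bitsize == 32" is meant to
mirror the known_eq condition used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit register holding a 32-bit value that
   the target wants kept sign-extended (as on MIPS64).  */
typedef uint64_t reg64;

/* Sign-extend the low 32 bits of R into all 64 bits.  */
static reg64 sign_extend32(reg64 r)
{
    uint64_t low = r & UINT64_C(0xffffffff);
    return (low ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* True if bits 63..32 of R are copies of bit 31.  */
static bool is_canonical(reg64 r)
{
    return r == sign_extend32(r);
}

/* Overwrite BITSIZE bits at BITPOS with V, leaving every other bit
   (including bits 63..32) untouched.  Assumes BITSIZE < 64.  */
static reg64 insert_field(reg64 r, unsigned bitpos, unsigned bitsize,
                          uint64_t v)
{
    uint64_t mask = ((UINT64_C(1) << bitsize) - 1) << bitpos;
    return (r & ~mask) | ((v << bitpos) & mask);
}

/* Store a field and re-extend only when the field reaches bit 31.
   *EXTENSIONS counts how many extensions were actually performed.  */
static reg64 store_promoted(reg64 r, unsigned bitpos, unsigned bitsize,
                            uint64_t v, unsigned *extensions)
{
    r = insert_field(r, bitpos, bitsize, v);
    if (bitpos + bitsize == 32) {   /* field overlaps the MSB */
        r = sign_extend32(r);
        ++*extensions;
    }
    return r;
}

/* Assemble four bytes the way foo() does, starting from a register
   that already satisfies the invariant.  */
static reg64 assemble_le(const unsigned char buf[4], unsigned *extensions)
{
    reg64 r = 0;
    *extensions = 0;
    for (unsigned i = 0; i < 4; i++) {
        r = store_promoted(r, 8 * i, 8, buf[i], extensions);
        /* Stores below bit 31 cannot disturb the invariant, and the
           store that reaches bit 31 restores it.  */
        assert(is_canonical(r));
    }
    return r;
}
```

Calling assemble_le on the bytes 0x78, 0x56, 0x34, 0x92 performs a
single extension, after the fourth store, and the register stays
canonical after every step.
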
Currently, GCC generates (sign_extend:DI (subreg:SI (reg:DI) 0)) but
combine (and other RTL optimizers) later realize that, because SImode
values are always sign-extended in their 64-bit hard registers, this
is a no-op and eliminate it.  The trouble is that it's unsafe to refer
to the SImode lowpart of a 64-bit register using SUBREG at those
critical points when temporarily the value isn't correctly
sign-extended, and the usual backend invariants don't hold.  At these
critical points, the middle-end needs to use an explicit TRUNCATE rtx
(as this isn't a TRULY_NOOP_TRUNCATION), so that the explicit
sign-extension looks like (sign_extend:DI (truncate:SI (reg:DI))),
which avoids the problem.

Note that MODE_REP_EXTENDED (NARROW, WIDE) != UNKNOWN implies (or
should imply) !TRULY_NOOP_TRUNCATION (NARROW, WIDE).  I've another
(independent) patch that I'll post in a few minutes.

This middle-end patch has been tested on x86_64-pc-linux-gnu with
make bootstrap and make -k check, both with and without
--target_board=unix{-m32} with no new failures.  The cc1 from a
cross-compiler to mips64 appears to generate much better code for
the above test case.  Ok for mainline?


2023-12-28  Roger Sayle

gcc/ChangeLog
        PR rtl-optimization/104914
        * expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P
        a sign or zero extension is only required if the modified field
        overlaps the SUBREG's most significant bit.  On MODE_REP_EXTENDED
        targets, don't refer to the temporarily incorrectly extended value
        using a SUBREG, but instead generate an explicit TRUNCATE rtx.


Thanks in advance,
Roger

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9fef2bf6585..1a34b48e38f 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6272,19 +6272,32 @@ expand_assignment (tree to, tree from, bool nontemporal)
                   && known_eq (bitpos, 0)
                   && known_eq (bitsize, GET_MODE_BITSIZE (GET_MODE (to_rtx))))
                 result = store_expr (from, to_rtx, 0, nontemporal, false);
-              else
+              /* Check if the field overlaps the MSB, requiring extension.  */
+              else if (known_eq (bitpos + bitsize,
+                                 GET_MODE_BITSIZE (GET_MODE (to_rtx))))
                 {
-                  rtx to_rtx1
-                    = lowpart_subreg (subreg_unpromoted_mode (to_rtx),
-                                      SUBREG_REG (to_rtx),
-                                      subreg_promoted_mode (to_rtx));
+                  scalar_int_mode imode = subreg_unpromoted_mode (to_rtx);
+                  scalar_int_mode omode = subreg_promoted_mode (to_rtx);
+                  rtx to_rtx1 = lowpart_subreg (imode, SUBREG_REG (to_rtx),
+                                                omode);
                   result = store_field (to_rtx1, bitsize, bitpos,
                                         bitregion_start, bitregion_end,
                                         mode1, from, get_alias_set (to),
                                         nontemporal, reversep);
+                  /* If the target usually keeps IMODE appropriately
+                     extended in OMODE it's unsafe to refer to it using
+                     a SUBREG whilst this invariant doesn't hold.  */
+                  if (targetm.mode_rep_extended (imode, omode) != UNKNOWN)
+                    to_rtx1 = simplify_gen_unary (TRUNCATE, imode,
+                                                  SUBREG_REG (to_rtx), omode);
                   convert_move (SUBREG_REG (to_rtx), to_rtx1,
                                 SUBREG_PROMOTED_SIGN (to_rtx));
                 }
+              else
+                result = store_field (to_rtx, bitsize, bitpos,
+                                      bitregion_start, bitregion_end,
+                                      mode1, from, get_alias_set (to),
+                                      nontemporal, reversep);
             }
           else
             result = store_field (to_rtx, bitsize, bitpos,
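
The SUBREG versus TRUNCATE point from the description can also be
modelled in a few lines of C.  This is only an illustration under
stated assumptions, not a statement of RTL semantics: the register is
a uint64_t, and the names truncate_si, sign_extend_di,
extend_via_subreg_folded and extend_via_truncate are hypothetical.
The "folded" variant stands for what an optimizer is entitled to
produce once it assumes SImode values are always sign-extended in
their DImode registers.

```c
#include <stdint.h>

typedef uint64_t reg64;

/* Models (truncate:SI (reg:DI)): always discards bits 63..32.  */
static uint32_t truncate_si(reg64 r)
{
    return (uint32_t)r;
}

/* Models (sign_extend:DI x) for an SImode x.  */
static reg64 sign_extend_di(uint32_t x)
{
    return ((uint64_t)x ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
}

/* Models (sign_extend:DI (subreg:SI (reg:DI) 0)) after an optimizer
   has assumed the register is already sign-extended: the whole
   expression folds to the register itself.  */
static reg64 extend_via_subreg_folded(reg64 r)
{
    return r;
}

/* Models (sign_extend:DI (truncate:SI (reg:DI))): the truncation is a
   real operation, so the extension is recomputed from bit 31.  */
static reg64 extend_via_truncate(reg64 r)
{
    return sign_extend_di(truncate_si(r));
}
```

For a register that already satisfies the invariant the two variants
agree.  For the intermediate state right after a store into the top
byte, e.g. 0x0000000092345678, the folded variant returns the value
unchanged while the truncate variant yields 0xffffffff92345678.
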