From patchwork Mon Jun 12 18:55:49 2023
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 106764
Message-ID: <52aefdbf-130d-dcb9-63f1-1b451153ccba@gmail.com>
Date: Mon, 12 Jun 2023 12:55:49 -0600
Subject: [committed] [PR rtl-optimization/101188] Fix reload_cse_move2add ignoring clobbers
To: gcc-patches@gcc.gnu.org
From: Jeff Law

So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
incorrect code when optimizing code with clobbers.  Specifically, in the
case where we try to optimize a sequence of 4 operations down to 3
operations, we can reset INSN to the next instruction and continue the
loop.  That skips the code to invalidate objects based on things like
REG_INC notes, stack pushes and, most importantly, clobbers attached to
the current insn.

This patch factors all of the invalidation code used by
reload_cse_move2add into a new function and calls it at the appropriate
time.  (A small standalone sketch of the skipped-invalidation control
flow appears after the patch below.)

Georg-Johann has confirmed this patch fixes his avr bug and I've had it
in my tester over the weekend.  It's bootstrapped and regression tested
on aarch64, m68k, sh4, alpha and hppa.  It's also regression tested
successfully on a wide variety of other targets.

Pushing to the trunk momentarily.

Jeff

commit ae193f9008e02683e27f3c87f3b06f38e103b1d0
Author: Jeff Law
Date:   Mon Jun 12 12:52:10 2023 -0600

    [committed] [PR rtl-optimization/101188] Fix reload_cse_move2add ignoring clobbers

    So as Georg-Johann discusses in the BZ, reload_cse_move2add can generate
    incorrect code when optimizing code with clobbers.  Specifically, in the
    case where we try to optimize a sequence of 4 operations down to 3
    operations, we can reset INSN to the next instruction and continue the
    loop.  That skips the code to invalidate objects based on things like
    REG_INC notes, stack pushes and, most importantly, clobbers attached to
    the current insn.

    This patch factors all of the invalidation code used by
    reload_cse_move2add into a new function and calls it at the appropriate
    time.

    Georg-Johann has confirmed this patch fixes his avr bug and I've had it
    in my tester over the weekend.  It's bootstrapped and regression tested
    on aarch64, m68k, sh4, alpha and hppa.  It's also regression tested
    successfully on a wide variety of other targets.

    gcc/
        PR rtl-optimization/101188
        * postreload.cc (reload_cse_move2add_invalidate): New function,
        extracted from...
        (reload_cse_move2add): Call reload_cse_move2add_invalidate.

    gcc/testsuite
        PR rtl-optimization/101188
        * gcc.c-torture/execute/pr101188.c: New test.

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 20e138b4fa8..d5f367071bb 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -1899,6 +1899,79 @@ move2add_use_add3_insn (scalar_int_mode mode, rtx reg, rtx sym, rtx off,
   return changed;
 }
 
+/* Perform any invalidations necessary for INSN.  */
+
+static void
+reload_cse_move2add_invalidate (rtx_insn *insn)
+{
+  for (rtx note = REG_NOTES (insn); note; note = XEXP (note, 1))
+    {
+      if (REG_NOTE_KIND (note) == REG_INC
+          && REG_P (XEXP (note, 0)))
+        {
+          /* Reset the information about this register.  */
+          int regno = REGNO (XEXP (note, 0));
+          if (regno < FIRST_PSEUDO_REGISTER)
+            {
+              move2add_record_mode (XEXP (note, 0));
+              reg_mode[regno] = VOIDmode;
+            }
+        }
+    }
+
+  /* There are no REG_INC notes for SP autoinc.  */
+  subrtx_var_iterator::array_type array;
+  FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST)
+    {
+      rtx mem = *iter;
+      if (mem
+          && MEM_P (mem)
+          && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC)
+        {
+          if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx)
+            reg_mode[STACK_POINTER_REGNUM] = VOIDmode;
+        }
+    }
+
+  note_stores (insn, move2add_note_store, insn);
+
+  /* If INSN is a conditional branch, we try to extract an
+     implicit set out of it.  */
+  if (any_condjump_p (insn))
+    {
+      rtx cnd = fis_get_condition (insn);
+
+      if (cnd != NULL_RTX
+          && GET_CODE (cnd) == NE
+          && REG_P (XEXP (cnd, 0))
+          && !reg_set_p (XEXP (cnd, 0), insn)
+          /* The following two checks, which are also in
+             move2add_note_store, are intended to reduce the
+             number of calls to gen_rtx_SET to avoid memory
+             allocation if possible.  */
+          && SCALAR_INT_MODE_P (GET_MODE (XEXP (cnd, 0)))
+          && REG_NREGS (XEXP (cnd, 0)) == 1
+          && CONST_INT_P (XEXP (cnd, 1)))
+        {
+          rtx implicit_set = gen_rtx_SET (XEXP (cnd, 0), XEXP (cnd, 1));
+          move2add_note_store (SET_DEST (implicit_set), implicit_set, insn);
+        }
+    }
+
+  /* If this is a CALL_INSN, all call used registers are stored with
+     unknown values.  */
+  if (CALL_P (insn))
+    {
+      function_abi callee_abi = insn_callee_abi (insn);
+      for (int i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
+        if (reg_mode[i] != VOIDmode
+            && reg_mode[i] != BLKmode
+            && callee_abi.clobbers_reg_p (reg_mode[i], i))
+          /* Reset the information about this register.  */
+          reg_mode[i] = VOIDmode;
+    }
+}
+
 /* Convert move insns with constant inputs to additions if they are cheaper.
    Return true if any changes were made.  */
 static bool
@@ -1921,7 +1994,7 @@ reload_cse_move2add (rtx_insn *first)
   move2add_luid = 2;
   for (insn = first; insn; insn = NEXT_INSN (insn), move2add_luid++)
     {
-      rtx set, note;
+      rtx set;
 
       if (LABEL_P (insn))
         {
@@ -2041,6 +2114,12 @@ reload_cse_move2add (rtx_insn *first)
               delete_insn (insn);
               changed |= success;
               insn = next;
+              /* Make sure to perform any invalidations related to
+                 NEXT/INSN since we're going to bypass the normal
+                 flow with the continue below.
+
+                 Do this before recording the new mode/offset.  */
+              reload_cse_move2add_invalidate (insn);
               move2add_record_mode (reg);
               reg_offset[regno]
                 = trunc_int_for_mode (added_offset + base_offset,
@@ -2094,74 +2173,7 @@ reload_cse_move2add (rtx_insn *first)
                   continue;
                 }
             }
-
-      for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
-        {
-          if (REG_NOTE_KIND (note) == REG_INC
-              && REG_P (XEXP (note, 0)))
-            {
-              /* Reset the information about this register.  */
-              int regno = REGNO (XEXP (note, 0));
-              if (regno < FIRST_PSEUDO_REGISTER)
-                {
-                  move2add_record_mode (XEXP (note, 0));
-                  reg_mode[regno] = VOIDmode;
-                }
-            }
-        }
-
-      /* There are no REG_INC notes for SP autoinc.  */
-      subrtx_var_iterator::array_type array;
-      FOR_EACH_SUBRTX_VAR (iter, array, PATTERN (insn), NONCONST)
-        {
-          rtx mem = *iter;
-          if (mem
-              && MEM_P (mem)
-              && GET_RTX_CLASS (GET_CODE (XEXP (mem, 0))) == RTX_AUTOINC)
-            {
-              if (XEXP (XEXP (mem, 0), 0) == stack_pointer_rtx)
-                reg_mode[STACK_POINTER_REGNUM] = VOIDmode;
-            }
-        }
-
-      note_stores (insn, move2add_note_store, insn);
-
-      /* If INSN is a conditional branch, we try to extract an
-         implicit set out of it.  */
-      if (any_condjump_p (insn))
-        {
-          rtx cnd = fis_get_condition (insn);
-
-          if (cnd != NULL_RTX
-              && GET_CODE (cnd) == NE
-              && REG_P (XEXP (cnd, 0))
-              && !reg_set_p (XEXP (cnd, 0), insn)
-              /* The following two checks, which are also in
-                 move2add_note_store, are intended to reduce the
-                 number of calls to gen_rtx_SET to avoid memory
-                 allocation if possible.  */
-              && SCALAR_INT_MODE_P (GET_MODE (XEXP (cnd, 0)))
-              && REG_NREGS (XEXP (cnd, 0)) == 1
-              && CONST_INT_P (XEXP (cnd, 1)))
-            {
-              rtx implicit_set =
-                gen_rtx_SET (XEXP (cnd, 0), XEXP (cnd, 1));
-              move2add_note_store (SET_DEST (implicit_set), implicit_set, insn);
-            }
-        }
-
-      /* If this is a CALL_INSN, all call used registers are stored with
-         unknown values.  */
-      if (CALL_P (insn))
-        {
-          function_abi callee_abi = insn_callee_abi (insn);
-          for (i = FIRST_PSEUDO_REGISTER - 1; i >= 0; i--)
-            if (reg_mode[i] != VOIDmode
-                && reg_mode[i] != BLKmode
-                && callee_abi.clobbers_reg_p (reg_mode[i], i))
-              /* Reset the information about this register.  */
-              reg_mode[i] = VOIDmode;
-        }
+      reload_cse_move2add_invalidate (insn);
     }
   return changed;
 }
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr101188.c b/gcc/testsuite/gcc.c-torture/execute/pr101188.c
new file mode 100644
index 00000000000..1bdb50b3de5
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr101188.c
@@ -0,0 +1,61 @@
+/* { dg-require-effective-target indirect_calls } */
+
+typedef __UINT8_TYPE__ uint8_t;
+typedef __UINT16_TYPE__ uint16_t;
+
+typedef uint8_t (*fn1)(void *a);
+typedef void (*fn2)(void *a, int *arg);
+
+struct S
+{
+  uint8_t buffer[64];
+  uint16_t n;
+  fn2 f2;
+  void *a;
+  fn1 f1;
+};
+
+volatile uint16_t x;
+
+void __attribute__((__noinline__,__noclone__))
+foo (uint16_t n)
+{
+  x = n;
+}
+
+void __attribute__((__noinline__,__noclone__))
+testfn (struct S *self)
+{
+  int arg;
+
+  foo (self->n);
+  self->n++;
+  self->f2 (self->a, &arg);
+  self->buffer[0] = self->f1 (self->a);
+}
+
+static unsigned char myfn2_called = 0;
+
+static void
+myfn2 (void *a, int *arg)
+{
+  myfn2_called = 1;
+}
+
+static uint8_t
+myfn1 (void *a)
+{
+  return 0;
+}
+
+int main (void)
+{
+  struct S s;
+  s.n = 0;
+  s.f2 = myfn2;
+  s.f1 = myfn1;
+  testfn (&s);
+  if (myfn2_called != 1)
+    __builtin_abort ();
+  return 0;
+}
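
For readers following along, here is the standalone sketch promised
above: a minimal, compilable C model of the control-flow hole the patch
closes.  This is not GCC code and every name in it is hypothetical; it
only models the shape of the bug.  A scan loop tracks which registers
hold known constants and must forget ("invalidate") a register whenever
an insn clobbers it.  The buggy variant consumes the following insn
during a combine and continues the loop, bypassing the invalidation at
the bottom, so a clobber on the consumed insn is silently ignored.

#include <stdbool.h>
#include <stdio.h>

enum kind { SET_CONST, CLOBBER };

struct insn
{
  enum kind kind;
  int reg;        /* register set or clobbered */
  int value;      /* constant, for SET_CONST only */
  bool combine;   /* "optimization" merges this insn with the next */
};

#define NREGS 4
static bool reg_known[NREGS];   /* do we trust reg_value[r]?  */
static int reg_value[NREGS];

/* Analogous role to reload_cse_move2add_invalidate: forget any state
   that INSN's side effects make stale.  */
static void
invalidate (const struct insn *insn)
{
  if (insn->kind == CLOBBER)
    reg_known[insn->reg] = false;
}

static void
scan (const struct insn *insns, int n, bool buggy)
{
  for (int i = 0; i < n; i++)
    {
      const struct insn *insn = &insns[i];
      if (insn->kind == SET_CONST)
        {
          reg_known[insn->reg] = true;
          reg_value[insn->reg] = insn->value;
          if (insn->combine && i + 1 < n)
            {
              i++;                       /* consume the next insn ...  */
              if (!buggy)
                invalidate (&insns[i]);  /* the fix: invalidate it too  */
              continue;                  /* ... bypassing the code below  */
            }
        }
      invalidate (insn);
    }
}

int
main (void)
{
  /* A combinable set of r0 swallows the insn that clobbers r1, so
     after the scan nothing about r1 may be trusted.  */
  const struct insn stream[] = {
    { SET_CONST, 1, 5, false },   /* r1 = 5  */
    { SET_CONST, 0, 7, true  },   /* r0 = 7, combined with ...  */
    { CLOBBER,   1, 0, false },   /* ... a clobber of r1  */
  };

  scan (stream, 3, true);
  printf ("buggy scan still trusts r1: %s\n",
          reg_known[1] ? "yes (wrong)" : "no");

  scan (stream, 3, false);
  printf ("fixed scan still trusts r1: %s\n",
          reg_known[1] ? "yes (wrong)" : "no");

  return 0;
}

The actual patch has the same shape: the invalidation logic is factored
into reload_cse_move2add_invalidate and called on the early-continue
path as well as at the bottom of the loop, so clobbers on the consumed
insn are no longer skipped.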