From patchwork Thu Sep 14 13:23:13 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 139582
Date: Thu, 14 Sep 2023 13:23:13 +0000 (UTC)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: Jakub Jelinek
Subject: [PATCH] tree-optimization/111294 - backwards threader PHI costing
Message-Id: <20230914132401.E8C483858C3A@sourceware.org>

This revives an earlier patch, since the problematic code applying
extra costs to PHIs in copied blocks, which we couldn't make any sense
of, prevents a required threading in this case.  Instead of coming up
with some other artificial costing, the following simply removes those
bits.

As with all threading changes this requires a plethora of testsuite
adjustments, but only the last three are unfortunate, as is the
libgomp team.c adjustment, which is required to avoid a bogus -Werror
diagnostic during bootstrap.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Any objections?

Thanks,
Richard.

	PR tree-optimization/111294
gcc/
	* tree-ssa-threadbackward.cc (back_threader::m_name): Remove.
	(back_threader::find_paths_to_names): Adjust.
	(back_threader::maybe_thread_block): Likewise.
	(back_threader_profitability::possibly_profitable_path_p): Remove
	code applying extra costs to copied PHIs.

libgomp/
	* team.c (gomp_team_start): Guard gomp_alloca to avoid false
	positive alloc-size diagnostic.

gcc/testsuite/
	* gcc.dg/tree-ssa/pr111294.c: New test.
	* gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
	* gcc.dg/tree-ssa/pr59597.c: Likewise.
	* gcc.dg/tree-ssa/pr61839_2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-18.c: Likewise.
	* g++.dg/warn/Wstringop-overflow-4.C: XFAIL subtest on ilp32.
	* gcc.dg/uninit-pred-9_b.c: XFAIL subtest everywhere.
	* gcc.dg/vect/vect-117.c: Make scan for not Invalid sum
	conditional on lp64.
---
 .../g++.dg/warn/Wstringop-overflow-4.C        |  4 +-
 .../gcc.dg/tree-ssa/phi_on_compare-4.c        |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr111294.c      | 32 ++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr59597.c       |  8 +--
 gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c     |  4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c   |  6 +-
 gcc/testsuite/gcc.dg/uninit-pred-9_b.c        |  2 +-
 gcc/testsuite/gcc.dg/vect/vect-117.c          |  2 +-
 gcc/tree-ssa-threadbackward.cc                | 60 ++-----------------
 libgomp/team.c                                |  5 +-
 10 files changed, 57 insertions(+), 70 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr111294.c

diff --git a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
index faad5bed074..275ecac01b5 100644
--- a/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
+++ b/gcc/testsuite/g++.dg/warn/Wstringop-overflow-4.C
@@ -151,7 +151,9 @@ void test_strcpy_new_int16_t (size_t n, const size_t vals[])
      as size_t as a result of threading.  See PR 101688 comment #2.  */
   T (S (1), new int16_t[r_0_imax]);
 
-  T (S (2), new int16_t[r_0_imax + 1]);
+  /* Similar to PR 101688 the following can result in a bogus warning because
+     of threading.  */
+  T (S (2), new int16_t[r_0_imax + 1]); // { dg-bogus "into a region of size" "" { xfail { ilp32 } } }
   T (S (9), new int16_t[r_0_imax * 2 + 1]);
 
   int r_1_imax = SR (1, INT_MAX);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
index 1e09f89af9f..6240d1cdd6d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdump-tree-dom2" } */
+/* { dg-options "-Ofast -fdump-tree-threadfull1-stats" } */
 
 void g (int);
 void g1 (int);
@@ -37,4 +37,4 @@ f (long a, long b, long c, long d, int x)
   g (c + d);
 }
 
-/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "dom2" } } */
+/* { dg-final { scan-tree-dump "Jumps threaded: 2" "threadfull1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr111294.c b/gcc/testsuite/gcc.dg/tree-ssa/pr111294.c
new file mode 100644
index 00000000000..9ad912bad0b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr111294.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -fdump-tree-optimized" } */
+
+void foo(void);
+static short a;
+static int b, c, d;
+static int *e, *f = &d;
+static int **g = &e;
+static unsigned char h;
+static short(i)(short j, int k) { return j > k ?: j; }
+static char l() {
+  if (a) return b;
+  return c;
+}
+int main() {
+  b = 0;
+  for (; b < 5; ++b)
+    ;
+  h = l();
+  if (a ^ 3 >= i(h, 11))
+    a = 0;
+  else {
+    *g = f;
+    if (e == &d & b) {
+      __builtin_unreachable();
+    } else
+      foo();
+    ;
+  }
+}
+
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c b/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
index 0f66aae87bb..26c81d9dbb7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr59597.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Ofast -fdisable-tree-cunrolli -fdump-tree-threadfull1-details" } */
+/* { dg-options "-Ofast -fdump-tree-ethread-details" } */
 
 typedef unsigned short u16;
 typedef unsigned char u8;
@@ -56,8 +56,4 @@ main (int argc, char argv[])
   return crc;
 }
 
-/* We used to have no threads in vrp-thread1 because all the attempted
-   ones would cross loops.  Now we get 30+ threads before VRP because
-   of loop unrolling.  A better option is to disable unrolling and
-   test for the original 4 threads that this test was testing.  */
-/* { dg-final { scan-tree-dump-times "Registering jump thread" 4 "threadfull1" } } */
+/* { dg-final { scan-tree-dump-times "Registering jump thread" 2 "ethread" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c
index 0e0f4c02113..a78e444038a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr61839_2.c
@@ -1,6 +1,8 @@
 /* PR tree-optimization/61839.  */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-evrp" } */
+/* Disable jump threading, we want to avoid separating the division/modulo
+   by zero paths - we'd isolate those only later.  */
+/* { dg-options "-O2 -fno-thread-jumps -fdump-tree-evrp" } */
 /* { dg-require-effective-target int32plus } */
 
 __attribute__ ((noinline))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c
index fd6c8677212..13b9ba4f70f 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-18.c
@@ -198,7 +198,9 @@ compute_on_bytes (uint8_t *in_data, int in_len, uint8_t *out_data, int out_len)
    exits after gimple loop optimizations, which generates instructions executed
    each iteration in loop, but the results are used outside of loop:
    With -m64,
-   "Sinking _367 = (uint8_t *) _320;
+   "Sinking op_230 = op_244 + 2;
+    from bb 63 to bb 94
+    Sinking _367 = (uint8_t *) _320;
     from bb 31 to bb 90
     Sinking _320 = _321 + ivtmp.25_326;
     from bb 31 to bb 90
@@ -213,4 +215,4 @@ compute_on_bytes (uint8_t *in_data, int in_len, uint8_t *out_data, int out_len)
    base+index addressing modes, so the ip[len] address computation can't be made
    from the IV computation above.  */
 
- /* { dg-final { scan-tree-dump-times "Sunk statements: 4" 1 "sink2" { target lp64 xfail { riscv64-*-* } } } } */
+ /* { dg-final { scan-tree-dump-times "Sunk statements: 5" 1 "sink2" { target lp64 xfail { riscv64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
index 0f508fa56e1..3c83d505ec0 100644
--- a/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
+++ b/gcc/testsuite/gcc.dg/uninit-pred-9_b.c
@@ -17,7 +17,7 @@ int foo (int n, int l, int m, int r)
 
   if (l > 100)
     if ( (n <= 9) && (m < 100) && (r < 19) )
-      blah(v); /* { dg-bogus "uninitialized" "bogus warning" { xfail powerpc*-*-* cris-*-* riscv*-*-* } } */
+      blah(v); /* { dg-bogus "uninitialized" "bogus warning" { xfail *-*-* } } */
 
   if ( (n <= 8) && (m < 99) && (r < 19) )
       blah(v); /* { dg-bogus "uninitialized" "pr101674" { xfail mmix-*-* } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-117.c b/gcc/testsuite/gcc.dg/vect/vect-117.c
index 010d63c9ad8..4755e39f951 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-117.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-117.c
@@ -61,4 +61,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "possible dependence between data-refs" 0 "vect" } } */
-/* { dg-final { scan-tree-dump-not "Invalid sum" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "Invalid sum" "optimized" { target { lp64 } } } } */
diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index d5da4b0c1b1..c45f4b261ad 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -62,7 +62,7 @@ class back_threader_profitability
 {
 public:
   back_threader_profitability (bool speed_p, gimple *stmt);
-  bool possibly_profitable_path_p (const vec<basic_block> &, tree, bool *);
+  bool possibly_profitable_path_p (const vec<basic_block> &, bool *);
   bool profitable_path_p (const vec<basic_block> &,
                           edge taken, bool *irreducible_loop);
 private:
@@ -126,9 +126,6 @@ private:
   auto_bitmap m_imports;
   // The last statement in the path.
   gimple *m_last_stmt;
-  // This is a bit of a wart.  It's used to pass the LHS SSA name to
-  // the profitability engine.
-  tree m_name;
   // Marker to differentiate unreachable edges.
   static const edge UNREACHABLE_EDGE;
   // Set to TRUE if unknown SSA names along a path should be resolved
@@ -366,7 +363,7 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting,
   // on the way to the backedge could be worthwhile.
   bool large_non_fsm;
   if (m_path.length () > 1
-      && (!profit.possibly_profitable_path_p (m_path, m_name, &large_non_fsm)
+      && (!profit.possibly_profitable_path_p (m_path, &large_non_fsm)
           || (!large_non_fsm
               && maybe_register_path (profit))))
     ;
@@ -517,23 +514,19 @@ back_threader::maybe_thread_block (basic_block bb)
     return;
 
   enum gimple_code code = gimple_code (stmt);
-  tree name;
-  if (code == GIMPLE_SWITCH)
-    name = gimple_switch_index (as_a <gswitch *> (stmt));
-  else if (code == GIMPLE_COND)
-    name = gimple_cond_lhs (stmt);
-  else
+  if (code != GIMPLE_SWITCH
+      && code != GIMPLE_COND)
     return;
 
   m_last_stmt = stmt;
   m_visited_bbs.empty ();
   m_path.truncate (0);
-  m_name = name;
 
   // We compute imports of the path during discovery starting
   // just with names used in the conditional.
   bitmap_clear (m_imports);
   ssa_op_iter iter;
+  tree name;
   FOR_EACH_SSA_TREE_OPERAND (name, stmt, iter, SSA_OP_USE)
     {
       if (!gimple_range_ssa_p (name))
@@ -588,15 +581,12 @@ back_threader::debug ()
    *LARGE_NON_FSM whether the thread is too large for a non-FSM thread
    but would be OK if we extend the path to cover the loop backedge.
 
-   NAME is the SSA_NAME of the variable we found to have a constant
-   value on PATH.  If unknown, SSA_NAME is NULL.
-
    ?? It seems we should be able to loosen some of the restrictions in
    this function after loop optimizations have run.  */
 
 bool
 back_threader_profitability::possibly_profitable_path_p
-                                  (const vec<basic_block> &m_path, tree name,
+                                  (const vec<basic_block> &m_path,
                                    bool *large_non_fsm)
 {
   gcc_checking_assert (!m_path.is_empty ());
@@ -645,44 +635,6 @@ back_threader_profitability::possibly_profitable_path_p
       if (j < m_path.length () - 1)
         {
           int orig_n_insns = m_n_insns;
-          /* PHIs in the path will create degenerate PHIS in the
-             copied path which will then get propagated away, so
-             looking at just the duplicate path the PHIs would
-             seem unimportant.
-
-             But those PHIs, because they're assignments to objects
-             typically with lives that exist outside the thread path,
-             will tend to generate PHIs (or at least new PHI arguments)
-             at points where we leave the thread path and rejoin
-             the original blocks.  So we do want to account for them.
-
-             We ignore virtual PHIs.  We also ignore cases where BB
-             has a single incoming edge.  That's the most common
-             degenerate PHI we'll see here.  Finally we ignore PHIs
-             that are associated with the value we're tracking as
-             that object likely dies.  */
-          if (EDGE_COUNT (bb->succs) > 1 && EDGE_COUNT (bb->preds) > 1)
-            {
-              for (gphi_iterator gsip = gsi_start_phis (bb);
-                   !gsi_end_p (gsip);
-                   gsi_next (&gsip))
-                {
-                  gphi *phi = gsip.phi ();
-                  tree dst = gimple_phi_result (phi);
-
-                  /* Note that if both NAME and DST are anonymous
-                     SSA_NAMEs, then we do not have enough information
-                     to consider them associated.  */
-                  if (dst != name
-                      && name
-                      && TREE_CODE (name) == SSA_NAME
-                      && (SSA_NAME_VAR (dst) != SSA_NAME_VAR (name)
-                          || !SSA_NAME_VAR (dst))
-                      && !virtual_operand_p (dst))
-                    ++m_n_insns;
-                }
-            }
-
           if (!m_contains_hot_bb && m_speed_p)
             m_contains_hot_bb |= optimize_bb_for_speed_p (bb);
           for (gsi = gsi_after_labels (bb);
diff --git a/libgomp/team.c b/libgomp/team.c
index 54dfca8080a..e5a86de1dd0 100644
--- a/libgomp/team.c
+++ b/libgomp/team.c
@@ -756,8 +756,9 @@ gomp_team_start (void (*fn) (void *), void *data, unsigned nthreads,
       attr = &thread_attr;
     }
 
-  start_data = gomp_alloca (sizeof (struct gomp_thread_start_data)
-                            * (nthreads - i));
+  if (i < nthreads)
+    start_data = gomp_alloca (sizeof (struct gomp_thread_start_data)
+                              * (nthreads - i));
 
   /* Launch new threads.  */
   for (; i < nthreads; ++i)
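
The libgomp/team.c guard can be illustrated in isolation. The following is only a minimal sketch, not libgomp code: struct start_data and launch_count are hypothetical names, and __builtin_alloca stands in for gomp_alloca. It assumes the false positive comes from a path on which the compiler cannot rule out i >= nthreads, where the unsigned difference nthreads - i would be zero or wrap to a very large allocation size.

```c
#include <stddef.h>

/* Hypothetical stand-in for libgomp's struct gomp_thread_start_data.  */
struct start_data
{
  unsigned id;
};

/* Hypothetical helper: fill one scratch entry per thread in
   [first, nthreads) and return how many entries were filled.
   The scratch array is only allocated when the range is non-empty,
   so nthreads - first cannot wrap around at the allocation site.  */
unsigned
launch_count (unsigned first, unsigned nthreads)
{
  struct start_data *start_data = NULL;
  unsigned launched = 0;

  /* Same shape as the guard in the patch: allocate only if the loop
     below will execute at least once.  */
  if (first < nthreads)
    start_data = __builtin_alloca (sizeof (struct start_data)
                                   * (nthreads - first));

  /* The loop condition repeats the guard, so start_data is never
     dereferenced while it is still NULL.  */
  for (unsigned i = first; i < nthreads; ++i)
    {
      start_data[launched].id = i;
      ++launched;
    }
  return launched;
}
```

Because the loop condition is the same as the guard, the change is behavior-preserving: when the range is empty nothing is allocated and nothing is accessed.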