From patchwork Wed Jan 11 11:58:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 41948 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp3276144wrt; Wed, 11 Jan 2023 03:59:40 -0800 (PST) X-Google-Smtp-Source: AMrXdXsT1T2SI5nzrpaiBd4tJpluWwqg2oPtzNMEyOiotxuCZ8iW2J1Lavikz+fzo2JN4FR1g4Ti X-Received: by 2002:a17:906:9d8e:b0:84d:3921:11e2 with SMTP id fq14-20020a1709069d8e00b0084d392111e2mr10939402ejc.58.1673438380033; Wed, 11 Jan 2023 03:59:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673438380; cv=none; d=google.com; s=arc-20160816; b=cLELZur588KfZ/5qY+CCGyS/IFp+5i49K/q/VjvNkkHIjViK1J8L+lVP34u4mTJesl MlWhfTqB3pM+q+Lsl152Fm5H511y2YJENgdbwoYGYFC2IwDK9c3kIwl5jYU3mtHCh8E7 lF8ZQleEN6MgHgW3YbjK9Y78hLKHVm6RK2rvPP1H8G6Va9RnjrbDITd46R3/dS+KJk/B mW+PalFoubdMX7nDI5PDuH/emI/tH1zlfp7/1gs++cMvgd2IVfN3QZWMmcpNoM+IArpZ eWQu9VMOS+eXY9yfgMt+8WMu9rG/vHZCt2Il1pBbBtp/9l1oUo+vlJHz3oD7flqYRlaD rUxQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:message-id :mime-version:subject:to:date:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=RSd5kdpf5ciot79HD2QakinTGD0AnHpuWQUv83kNIvc=; b=w2iJyihzn1nyf1Luk/xxur/IMzdFH9I225egyzd2hGVd28+NgUn1YpLpdKGiFM9rVp YZFLb7mb9hGbyU6/AotUcVm4F6Hr23Rx+AiAI++JatxYd48qhxY+tpEsJZZpAxiOehZQ 31xtpweAI9ZookMwQ59A+qefFkw61JmB5+yLYK+se0Vx/tSHPix0mlj90cFJamp/U5Eo jsDtFhX41APybx/qL+ITCRTu1M7SXz5mEXl23OzLEyaH0fTeCvEA47qfvu117tq1H1Q5 fnn4bWmgd+Qb4xFlyVfmMv1Xt9jWRpni5Udmi+VMXfLkzCEtKdADe4rntR2FjhZvO63h oiBA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=WdbIXUy9; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id xe9-20020a170907318900b007c101c9237asi3012596ejb.668.2023.01.11.03.59.39 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 11 Jan 2023 03:59:40 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=WdbIXUy9; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DAF3238582A4 for ; Wed, 11 Jan 2023 11:59:38 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DAF3238582A4 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1673438378; bh=RSd5kdpf5ciot79HD2QakinTGD0AnHpuWQUv83kNIvc=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=WdbIXUy94gjTGmVXWfA2V1X/Fs7GHv1224+3LQC9vxNrHbfMTbU1HuIevJ1ZXQIG9 chLwXSUl8F3ucodJ50tRmzLWVBHCNvf9OcYx99njGhyywb2bDRIg0wa7BfXt0uqqjc RpjQE6Vr9gSIBeaki7gXRJQSUfq7YzrzpE0s8rak= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by sourceware.org (Postfix) with ESMTPS id 855C83858D38 for ; Wed, 11 Jan 2023 11:58:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 855C83858D38 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 8C9EE176F5 for ; Wed, 11 Jan 2023 11:58:54 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 781A01358A for ; Wed, 11 Jan 2023 11:58:54 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id 0tUkHH6kvmO7OAAAMHmgww (envelope-from ) for ; Wed, 11 Jan 2023 11:58:54 +0000 Date: Wed, 11 Jan 2023 12:58:54 +0100 (CET) To: gcc-patches@gcc.gnu.org Subject: [PATCH] tree-optimization/108352 - FSM threads creating irreducible loops MIME-Version: 1.0 Message-Id: <20230111115854.781A01358A@imap2.suse-dmz.suse.de> X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Biener via Gcc-patches From: Richard Biener Reply-To: Richard Biener Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1754727322530546943?= X-GMAIL-MSGID: =?utf-8?q?1754727322530546943?= The following relaxes a heuristic that prevents creating irreducible loops from FSM threads not covering multi-way branches. Instead of allowing threads that adhere to && (n_insns * (unsigned) param_fsm_scale_path_stmts > (m_path.length () * (unsigned) param_fsm_scale_path_blocks)) with reasoning "We also consider it worth creating an irreducible inner loop if the number of copied statement is low relative to the length of the path -- in that case there's little the traditional loop optimizer would have done anyway, so an irreducible loop is not so bad." that I cannot make much sense of the following patch changes that to only allow those after loop optimization and when they are (scaled) short: && (!(cfun->curr_properties & PROP_loop_opts_done) || (m_n_insns * param_fsm_scale_path_stmts >= param_max_jump_thread_duplication_stmts))) This allows us to get rid of --param fsm-scale-path-blocks which previous to the bisected revision allowed an enlarged path covering the original allowance (but we do not consider that enlarged path now because enlarging it doesn't add any information). Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/108352 * tree-ssa-threadbackward.cc (back_threader_profitability::profitable_path_p): Adjust heuristic that allows non-multi-way branch threads creating irreducible loops. * doc/invoke.texi (--param fsm-scale-path-blocks): Remove. (--param fsm-scale-path-stmts): Adjust. * params.opt (--param=fsm-scale-path-blocks=): Remove. (-param=fsm-scale-path-stmts=): Adjust description. * gcc.dg/tree-ssa/ssa-thread-21.c: New testcase. * gcc.dg/tree-ssa/vrp46.c: Remove --param fsm-scale-path-blocks=1. --- gcc/doc/invoke.texi | 7 ++--- gcc/params.opt | 6 +---- gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c | 26 +++++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/vrp46.c | 2 +- gcc/tree-ssa-threadbackward.cc | 18 +++++-------- 5 files changed, 37 insertions(+), 22 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 80d942917bd..701c228bd0a 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -15981,16 +15981,13 @@ Max. size of loc list for which reverse ops should be added. @item fsm-scale-path-stmts Scale factor to apply to the number of statements in a threading path -when comparing to the number of (scaled) blocks. +crossing a loop backedge when comparing to +@option{--param=max-jump-thread-duplication-stmts}. @item uninit-control-dep-attempts Maximum number of nested calls to search for control dependencies during uninitialized variable analysis. -@item fsm-scale-path-blocks -Scale factor to apply to the number of blocks in a threading path -when comparing to the number of (scaled) statements. - @item sched-autopref-queue-depth Hardware autoprefetcher scheduler model control flag. Number of lookahead cycles the model looks into; at ' diff --git a/gcc/params.opt b/gcc/params.opt index e178dec1600..929131254d2 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -134,13 +134,9 @@ Maximum number of basic blocks before EVRP uses a sparse cache. Common Joined UInteger Var(param_evrp_switch_limit) Init(50) Optimization Param Maximum number of outgoing edges in a switch before EVRP will not process it. --param=fsm-scale-path-blocks= -Common Joined UInteger Var(param_fsm_scale_path_blocks) Init(3) IntegerRange(1, 10) Param Optimization -Scale factor to apply to the number of blocks in a threading path when comparing to the number of (scaled) statements. - -param=fsm-scale-path-stmts= Common Joined UInteger Var(param_fsm_scale_path_stmts) Init(2) IntegerRange(1, 10) Param Optimization -Scale factor to apply to the number of statements in a threading path when comparing to the number of (scaled) blocks. +Scale factor to apply to the number of statements in a threading path crossing a loop backedge when comparing to max-jump-thread-duplication-stmts. -param=gcse-after-reload-critical-fraction= Common Joined UInteger Var(param_gcse_after_reload_critical_fraction) Init(10) Param Optimization diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c new file mode 100644 index 00000000000..16537ccfb61 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-21.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-thread2-stats -fdump-tree-optimized" } */ + +long a; +int b; +void bar64_(void); +void foo(); +int main() { + char c = 0; + unsigned d = 10; + int e = 2; + for (; d; d--) { + bar64_(); + b = d; + e && (c = (e = 0) != 4) > 1; + } + if (c < 1) + foo(); + a = b; +} + +/* We need to perform a non-multi-way branch FSM thread creating an + irreducible loop in thread2 to allow followup threading to + remove the call to foo (). */ +/* { dg-final { scan-tree-dump "Jumps threaded: 1" "thread2" } } */ +/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c index ebdc2e3eedb..6d3f75209d1 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp46.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-vrp1 --param fsm-scale-path-blocks=1" } */ +/* { dg-options "-O2 -fdump-tree-vrp1" } */ int func_81 (int); int func_98 (int); diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc index 8a64535d195..fcbb95b08be 100644 --- a/gcc/tree-ssa-threadbackward.cc +++ b/gcc/tree-ssa-threadbackward.cc @@ -868,22 +868,18 @@ back_threader_profitability::profitable_path_p (const vec &m_path, a multiway branch, in which case we have deemed it worth losing other loop optimizations later. - We also consider it worth creating an irreducible inner loop if - the number of copied statement is low relative to the length of - the path -- in that case there's little the traditional loop - optimizer would have done anyway, so an irreducible loop is not - so bad. */ + We also consider it worth creating an irreducible inner loop after + loop optimizations if the number of copied statement is low. */ if (!m_threaded_multiway_branch && *creates_irreducible_loop - && (m_n_insns * (unsigned) param_fsm_scale_path_stmts - > (m_path.length () * - (unsigned) param_fsm_scale_path_blocks))) - + && (!(cfun->curr_properties & PROP_loop_opts_done) + || (m_n_insns * param_fsm_scale_path_stmts + >= param_max_jump_thread_duplication_stmts))) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, - " FAIL: Would create irreducible loop without threading " - "multiway branch.\n"); + " FAIL: Would create irreducible loop early without " + "threading multiway branch.\n"); /* We compute creates_irreducible_loop only late. */ return false; }