From patchwork Tue Aug 9 13:01:21 2022
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 449
Date: Tue, 9 Aug 2022 13:01:21 +0000 (UTC)
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/106514 - revisit m_import compute in backward threading
From: Richard Biener
Message-Id: <20220809130205.AC043385AC23@sourceware.org>

This revisits how we compute the imports later used for the ranger path
query during backwards threading.  The compute_imports function of the
path solver ends up pulling the SSA def chain of regular stmts without
limit, and since it starts with just the gori imports of the path exit
it misses some interesting names to translate during path discovery.
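For illustration only, the unbounded def-chain walk that compute_imports performs can be sketched with a toy model.  All names here are hypothetical: a std::map stands in for the SSA_NAME_DEF_STMT lookup, and names absent from the map play the role of names defined outside the block (imports); this is not the GCC API.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: each SSA name maps to the operands of its defining stmt.
// Names absent from the map are "defined elsewhere" and thus become
// imports.  The walk recurses through local defs without any limit,
// mirroring the unbounded def-chain pull described above.
using DefMap = std::map<std::string, std::vector<std::string>>;

std::set<std::string>
collect_imports (const DefMap &defs, const std::vector<std::string> &seeds)
{
  std::set<std::string> imports, visited;
  std::vector<std::string> worklist (seeds.begin (), seeds.end ());
  while (!worklist.empty ())
    {
      std::string name = worklist.back ();
      worklist.pop_back ();
      if (!visited.insert (name).second)
	continue;
      auto it = defs.find (name);
      if (it == defs.end ())
	{
	  // Defined outside the block: an import; the walk stops here.
	  imports.insert (name);
	  continue;
	}
      // Local def: follow all of its operands, with no depth limit.
      for (const std::string &op : it->second)
	worklist.push_back (op);
    }
  return imports;
}
```

With `x_3 = a_1 + b_2` and `a_1 = c_0 * 2` local to the block, seeding from `x_3` yields the imports `b_2` and `c_0` while the local intermediate `a_1` is pruned.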
In fact, with a still-empty path, compute_imports does not look like the
correct tool.  The following instead implements what it does during the
path discovery itself: since we add the exit block first, we seed the
initial imports and interesting names from just the exit conditional.
When we then process interesting names (that is, imports whose
definition we have not yet seen) we prune local defs but add their uses,
in a similar way to what compute_imports would have done.

The patch also properly unwinds m_imports during the path discovery
backtracking, and from a debugging session I have verified that the two
sets now evolve as expected, where previously they behaved slightly
erratically.  Fortunately the m_imports set also shrinks significantly
for the PR69592 testcase (aka PR106514), so there is an overall speedup
when increasing --param max-jump-thread-duplication-stmts: for values
of 15 -> 30 -> 60 -> 120, compile time goes from 1s -> 2s -> 13s -> 27s
before the patch to 1s -> 2s -> 4s -> 8s with it.

This runs into a latent issue in X, which does not seem to expect any
PHI nodes with a constant argument on an edge inside the path.  But we
now have those as interesting; for example the ICEing
g++.dg/torture/pr100925.C just has something like

  if (i)
    x = 1;
  else
    x = 5;
  if (x == 1)
    ...

where we now have the path from if (i) to if (x) and the PHI for x in
the set of imports to consider when resolving x == 1, which IMHO looks
exactly like what we want.  path_range_query::ssa_range_in_phi papers
over the issue and drops the range to varying instead of crashing.  I
did not want to mess with this any further in this patch (but I could
not resist replacing the loop over the PHI args with
PHI_ARG_DEF_FROM_EDGE, so mind the re-indenting).

Bootstrapped and tested on x86_64-unknown-linux-gnu without the
path_range_query::ssa_range_in_phi fix, now re-running with it.

OK?

Thanks,
Richard.
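The unwind discipline described above (on backtracking, restore m_imports to exactly its pre-descent state) can be sketched as follows.  This is a simplified stand-in, not the patch itself: std::set<int> replaces GCC's sparse bitmaps, and the ImportTracker type and its method names are invented for the example.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Sketch of the add-then-unwind pattern: when path discovery descends
// an edge it may add SSA versions to the import set; on backtracking
// exactly the versions that were newly added must be cleared again,
// leaving pre-existing imports untouched.
struct ImportTracker
{
  std::set<int> imports;

  // Insert the given versions; return only those that were newly
  // inserted so the caller can unwind them later (mirrors relying on
  // bitmap_set_bit's return value to build the unwind list).
  std::vector<int> add (const std::vector<int> &versions)
  {
    std::vector<int> unwind;
    for (int v : versions)
      if (imports.insert (v).second)
	unwind.push_back (v);
    return unwind;
  }

  // Remove exactly the recorded additions, restoring the previous set.
  void unwind (const std::vector<int> &added)
  {
    for (int v : added)
      imports.erase (v);
  }
};
```

Recording only the newly set bits is what makes the unwind cheap and exact: a version that was already an import before the descent is never removed by mistake.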
	PR tree-optimization/106514
	* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
	Compute and unwind both m_imports and interesting on the fly during
	path discovery.
	(back_threader::find_paths): Compute the original m_imports
	from just the SSA uses of the exit conditional.  Drop
	handling single_succ_to_potentially_threadable_block.
	* gimple-range-path.cc (path_range_query::ssa_range_in_phi): Handle
	constant PHI arguments without crashing.  Use PHI_ARG_DEF_FROM_EDGE.
---
 gcc/gimple-range-path.cc       |  52 ++++++++---------
 gcc/tree-ssa-threadbackward.cc | 104 ++++++++++++++++++++++++++-------
 2 files changed, 106 insertions(+), 50 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 43e7526b6fc..b4376011ea8 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -276,8 +276,6 @@ void
 path_range_query::ssa_range_in_phi (vrange &r, gphi *phi)
 {
   tree name = gimple_phi_result (phi);
-  basic_block bb = gimple_bb (phi);
-  unsigned nargs = gimple_phi_num_args (phi);
 
   if (at_entry ())
     {
@@ -287,6 +285,7 @@ path_range_query::ssa_range_in_phi (vrange &r, gphi *phi)
       // Try to fold the phi exclusively with global or cached values.
       // This will get things like PHI <5(99), 6(88)>.  We do this by
       // calling range_of_expr with no context.
+      unsigned nargs = gimple_phi_num_args (phi);
       Value_Range arg_range (TREE_TYPE (name));
       r.set_undefined ();
       for (size_t i = 0; i < nargs; ++i)
@@ -303,36 +302,31 @@ path_range_query::ssa_range_in_phi (vrange &r, gphi *phi)
       return;
     }
 
+  basic_block bb = gimple_bb (phi);
   basic_block prev = prev_bb ();
   edge e_in = find_edge (prev, bb);
-
-  for (size_t i = 0; i < nargs; ++i)
-    if (e_in == gimple_phi_arg_edge (phi, i))
-      {
-	tree arg = gimple_phi_arg_def (phi, i);
-	// Avoid using the cache for ARGs defined in this block, as
-	// that could create an ordering problem.
-	if (ssa_defined_in_bb (arg, bb) || !get_cache (r, arg))
-	  {
-	    if (m_resolve)
-	      {
-		Value_Range tmp (TREE_TYPE (name));
-		// Using both the range on entry to the path, and the
-		// range on this edge yields significantly better
-		// results.
-		if (defined_outside_path (arg))
-		  range_on_path_entry (r, arg);
-		else
-		  r.set_varying (TREE_TYPE (name));
-		m_ranger->range_on_edge (tmp, e_in, arg);
-		r.intersect (tmp);
-		return;
-	      }
+  tree arg = PHI_ARG_DEF_FROM_EDGE (phi, e_in);
+  // Avoid using the cache for ARGs defined in this block, as
+  // that could create an ordering problem.
+  if (ssa_defined_in_bb (arg, bb) || !get_cache (r, arg))
+    {
+      if (m_resolve)
+	{
+	  Value_Range tmp (TREE_TYPE (name));
+	  // Using both the range on entry to the path, and the
+	  // range on this edge yields significantly better
+	  // results.
+	  if (TREE_CODE (arg) == SSA_NAME
+	      && defined_outside_path (arg))
+	    range_on_path_entry (r, arg);
+	  else
 	    r.set_varying (TREE_TYPE (name));
-	  }
-	return;
-      }
-  gcc_unreachable ();
+	  m_ranger->range_on_edge (tmp, e_in, arg);
+	  r.intersect (tmp);
+	  return;
+	}
+      r.set_varying (TREE_TYPE (name));
+    }
 }
 
 // If NAME is defined in BB, set R to the range of NAME, and return
diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index 4cd4d21c712..a544e97b2af 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -482,32 +482,76 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting,
     {
       // For further greedy searching we want to remove interesting
       // names defined in BB but add ones on the PHI edges for the
-      // respective edges.  We do this by starting with all names
+      // respective edges and adding imports from those stmts.
+      // We do this by starting with all names
       // not defined in BB as interesting, collecting a list of
      // interesting PHIs in BB on the fly.  Then we iterate over
       // predecessor edges, adding interesting PHI edge defs to
       // the set of interesting names to consider when processing it.
       auto_bitmap new_interesting;
+      auto_vec<int> new_imports;
       auto_vec<gphi *> interesting_phis;
       bitmap_iterator bi;
       unsigned i;
+      auto_vec<tree> worklist;
       EXECUTE_IF_SET_IN_BITMAP (interesting, 0, i, bi)
 	{
 	  tree name = ssa_name (i);
 	  gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+	  /* Imports remain interesting.  */
 	  if (gimple_bb (def_stmt) != bb)
-	    bitmap_set_bit (new_interesting, i);
-	  else if (gphi *phi = dyn_cast<gphi *> (def_stmt))
 	    {
-	      tree res = gimple_phi_result (phi);
-	      if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (res))
-		interesting_phis.safe_push (phi);
+	      bitmap_set_bit (new_interesting, i);
+	      continue;
+	    }
+	  worklist.quick_push (name);
+	  while (!worklist.is_empty ())
+	    {
+	      tree name = worklist.pop ();
+	      gimple *def_stmt = SSA_NAME_DEF_STMT (name);
+	      /* Newly discovered imports are interesting.  */
+	      if (gimple_bb (def_stmt) != bb)
+		{
+		  bitmap_set_bit (new_interesting, SSA_NAME_VERSION (name));
+		  continue;
+		}
+	      /* Local PHIs participate in renaming below.  */
+	      if (gphi *phi = dyn_cast<gphi *> (def_stmt))
+		{
+		  tree res = gimple_phi_result (phi);
+		  if (!SSA_NAME_OCCURS_IN_ABNORMAL_PHI (res))
+		    interesting_phis.safe_push (phi);
+		}
+	      /* For other local defs process their uses, amending
+		 imports on the way.  */
+	      else if (is_gimple_assign (def_stmt))
+		{
+		  for (unsigned j = 1; j < gimple_num_ops (def_stmt); ++j)
+		    {
+		      tree rhs = gimple_op (def_stmt, j);
+		      if (TREE_CODE (rhs) == SSA_NAME
+			  && bitmap_set_bit (m_imports,
+					     SSA_NAME_VERSION (rhs)))
+			{
+			  /* ??? We probably want to track a 'visited'
+			     flag separately and add to imports only
+			     when the def is handled by ranger.  The
+			     current way adds us defs that are neither
+			     PHIs nor "interesting" assigns, like for
+			     example loads.  */
+			  new_imports.safe_push (SSA_NAME_VERSION (rhs));
+			  worklist.safe_push (rhs);
+			}
+		    }
+		}
 	    }
 	}
+      worklist.release ();
       if (!bitmap_empty_p (new_interesting)
	  || !interesting_phis.is_empty ())
 	{
-	  auto_vec<tree> unwind (interesting_phis.length ());
+	  auto_vec<int> unwind (interesting_phis.length ());
+	  auto_vec<int> imports_unwind (interesting_phis.length ());
 	  edge_iterator iter;
 	  edge e;
 	  FOR_EACH_EDGE (e, iter, bb->preds)
@@ -525,22 +569,31 @@ back_threader::find_paths_to_names (basic_block bb, bitmap interesting,
 		{
 		  tree def = PHI_ARG_DEF_FROM_EDGE (phi, e);
 		  if (TREE_CODE (def) == SSA_NAME)
-		    if (bitmap_set_bit (new_interesting,
-					SSA_NAME_VERSION (def)))
-		      {
-			bitmap_set_bit (m_imports, SSA_NAME_VERSION (def));
-			unwind.quick_push (def);
-		      }
+		    {
+		      int ver = SSA_NAME_VERSION (def);
+		      if (bitmap_set_bit (new_interesting, ver))
+			{
+			  if (bitmap_set_bit (m_imports, ver))
+			    imports_unwind.quick_push (ver);
+			  unwind.quick_push (ver);
+			}
+		    }
 		}
 	      find_paths_to_names (e->src, new_interesting, overall_paths);
-	      // Restore new_interesting.  We leave m_imports alone since
-	      // we do not prune defs in BB from it and separately keeping
-	      // track of which bits to unwind isn't worth the trouble.
-	      for (tree def : unwind)
-		bitmap_clear_bit (new_interesting, SSA_NAME_VERSION (def));
+	      // Restore new_interesting.
+	      for (int def : unwind)
+		bitmap_clear_bit (new_interesting, def);
 	      unwind.truncate (0);
+	      // Restore m_imports.
+	      for (int def : imports_unwind)
+		bitmap_clear_bit (m_imports, def);
+	      imports_unwind.truncate (0);
 	    }
 	}
+      /* m_imports tracks all interesting names on the path, so when
+	 backtracking we have to restore it.  */
+      for (int j : new_imports)
+	bitmap_clear_bit (m_imports, j);
     }
   else if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "  FAIL: Search space limit %d reached.\n",
@@ -566,15 +619,24 @@ back_threader::find_paths (basic_block bb, tree name)
 	  && gimple_code (stmt) != GIMPLE_SWITCH))
     return;
 
-  if (EDGE_COUNT (bb->succs) > 1
-      || single_succ_to_potentially_threadable_block (bb))
+  if (EDGE_COUNT (bb->succs) > 1)
     {
       m_last_stmt = stmt;
       m_visited_bbs.empty ();
      m_path.truncate (0);
       m_name = name;
-      m_solver->compute_imports (m_imports, bb);
+      // We compute imports of the path during discovery starting
+      // just with names used in the conditional.
+      bitmap_clear (m_imports);
+      ssa_op_iter iter;
+      FOR_EACH_SSA_TREE_OPERAND (name, stmt, iter, SSA_OP_USE)
+	bitmap_set_bit (m_imports, SSA_NAME_VERSION (name));
+
+      // Interesting is the set of imports we have not yet seen
+      // the definition of.  So while imports only grow, the
+      // set of interesting defs dwindles and once empty we can
+      // stop searching.
       auto_bitmap interesting;
       bitmap_copy (interesting, m_imports);
       find_paths_to_names (bb, interesting, 1);