From patchwork Tue Aug 2 08:41:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 344 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6a10:b5d6:b0:2b9:3548:2db5 with SMTP id v22csp2839004pxt; Tue, 2 Aug 2022 01:41:56 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tRASM9bGsKR6fb+oOnK11KmDjqWCKrC6z0mZ18wGwvBpn/kCKtznzh7pc9znepQlHw6hLk X-Received: by 2002:aa7:c1d0:0:b0:43c:f89f:a134 with SMTP id d16-20020aa7c1d0000000b0043cf89fa134mr19010179edp.4.1659429716004; Tue, 02 Aug 2022 01:41:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1659429716; cv=none; d=google.com; s=arc-20160816; b=1K6zvlHMUFzqvDzCGiUTeHu0/PODcD8yf8zn29tP2pJ9QIjh+5ajaWCDl06UEvo816 NWngNdxXd/qk2+24z818TMjDdWEIupcI2JvYoB74O967dnUZ2Dz3yhPSMiJbWbQ14XEt xwY6MTEI80s1EILt2YqKEi+vJ5ptGutkU6g8te/hT1ROput6KbjVsVMYeQIR3UZcUuei xR0S8CtG+CmE2A9Ky9GkeAQyb6qvEvf7KbY98zF0V1m2cRMOiPdnw8g9TAMwqJXVFvUM v1IDmgOMg4eSJN6Gkt9Nq4v3ITZXdxr2JGXNYgxbDulUmKpYAoHynPL47UQrWObqfVTI /rzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=message-id:sender:errors-to:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :mime-version:user-agent:subject:to:date:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=R4KVX6b/lGPDM/FQbcJbqbSpc2OcS9yz5u/KEQqVVIo=; b=B9rI4Ew22EgjBW6HUJqk+xBZh9LD5f4fxRkpkmP6IDnmkYfmXY+TN/aoB8dl6fNYDP 9ZfqGA1gQvXmWV4tzxBEiLOnjoQIMVHSOgSMjWkCyV+ULgEom0luGFsRniY1TRD6bNBm pJjrslypQyzITcbxQpAzVNSGZpKzYQu2tXCrG/W0e7Gl8Mpnj1V2gjzd9sJjRAVdn2pV 34fD+VErC77BC//uUb4nIDrCpoRq6wNISCr/an26WfcUYDonPyxdmCEGlqiV4OkY7hCU fpBxW6+ZX1aaINDxp1XTAsZ/O8M80l9R+E04qGbvUD2YuGDzlpML05LMvQETlokkzHRa Ju6Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=tv2LzPgd; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id b2-20020aa7c902000000b0043bb6a3ab4asi11753928edt.397.2022.08.02.01.41.55 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 02 Aug 2022 01:41:55 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=tv2LzPgd; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D8002385800D for ; Tue, 2 Aug 2022 08:41:54 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D8002385800D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1659429714; bh=R4KVX6b/lGPDM/FQbcJbqbSpc2OcS9yz5u/KEQqVVIo=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=tv2LzPgd9RO4Gd6CsrKyaOfAOsQGZgrkTnHQTQrnzKRM0oNxpnnl1ZCrNPKfoHlX0 G1WljWRbWyAAQA86bStbk3CjOY7OAuhv/CCV6fPaVpAyvVWp+dg7GJfx6qdatc6c9E S6fyCBvh4sp0MSJEDlatb6r5NrOqXHgcpu4mbziU= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id E19F93857C49 for ; Tue, 2 Aug 2022 08:41:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org E19F93857C49 Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 702082056D; Tue, 2 Aug 2022 08:41:06 +0000 (UTC) Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 5C2112C141; Tue, 2 Aug 2022 08:41:06 +0000 (UTC) Date: Tue, 2 Aug 2022 08:41:06 +0000 (UTC) To: gcc-patches@gcc.gnu.org Subject: [PATCH] Properly honor param_max_fsm_thread_path_insns in backwards threader User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 X-Spam-Status: No, score=-10.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, MISSING_MID, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Biener via Gcc-patches From: Richard Biener Reply-To: Richard Biener Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" Message-Id: <20220802084154.D8002385800D@sourceware.org> X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1740038173999556394?= X-GMAIL-MSGID: =?utf-8?q?1740038173999556394?= I am trying to make sense of back_threader_profitability::profitable_path_p and the first thing I notice is that we do /* Threading is profitable if the path duplicated is hot but also in a case we separate cold path from hot path and permit optimization of the hot path later. Be on the agressive side here. In some testcases, as in PR 78407 this leads to noticeable improvements. */ if (m_speed_p && ((taken_edge && optimize_edge_for_speed_p (taken_edge)) || contains_hot_bb)) { if (n_insns >= param_max_fsm_thread_path_insns) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " FAIL: Jump-thread path not considered: " "the number of instructions on the path " "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n"); return false; } ... } else if (!m_speed_p && n_insns > 1) { if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " FAIL: Jump-thread path not considered: " "duplication of %i insns is needed and optimizing for size.\n", n_insns); return false; } ... return true; thus we apply the n_insns >= param_max_fsm_thread_path_insns only to "hot paths". The comment above this isn't entirely clear whether this is by design ("Be on the aggressive side here ...") but I think this is a mistake. In fact the "hot path" check seems entirely useless since if the path is not hot we simply continue threading it. I have my reservations about how we compute hot (contains_hot_bb in particular), but the following first refactors the above to apply the size constraints always and then _not_ threading if the path is not considered hot (but allow threading if n_insns <= 1 as with the !m_speed_p case). As for contains_hot_bb - it might be that this consciously captures the case where we separate a cold from a hot path even though the threaded path itself is cold. Consider A / \ (unlikely) B C \ / D / \ .. abort() when we want to thread A -> B -> D -> abort () and A (or D) has a hot BB count then we have contains_hot_bb even though the counts on the path itself are small. In fact when we thread the only relevant count for the resulting threaded path is the count of A with the A->C probability applied (that should also the count to subtract from the blocks we copied - sth missing for the backwards threader as well). So I'm wondering how the logic computing contains_hot_bb relates to the above comment before the costing block. Anyone remembers? Bootstrap & regtest running on x86_64-unknown-linux-gnu. * tree-ssa-threadbackwards.cc (back_threader_profitability::profitable_path_p): Apply size constraints to all paths. Do not thread cold paths. --- gcc/tree-ssa-threadbackward.cc | 53 +++++++++++++++++++++------------- 1 file changed, 33 insertions(+), 20 deletions(-) diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc index 0519f2a8c4b..a887568032b 100644 --- a/gcc/tree-ssa-threadbackward.cc +++ b/gcc/tree-ssa-threadbackward.cc @@ -761,22 +761,43 @@ back_threader_profitability::profitable_path_p (const vec &m_path, *creates_irreducible_loop = true; } - /* Threading is profitable if the path duplicated is hot but also + const int max_cold_insns = 1; + if (!m_speed_p && n_insns > max_cold_insns) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " FAIL: Jump-thread path not considered: " + "duplication of %i insns is needed and optimizing for size.\n", + n_insns); + return false; + } + else if (n_insns >= param_max_fsm_thread_path_insns) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " FAIL: Jump-thread path not considered: " + "the number of instructions on the path " + "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n"); + return false; + } + + /* Threading is profitable if the path duplicated is small or hot but also in a case we separate cold path from hot path and permit optimization of the hot path later. Be on the agressive side here. In some testcases, as in PR 78407 this leads to noticeable improvements. */ - if (m_speed_p - && ((taken_edge && optimize_edge_for_speed_p (taken_edge)) - || contains_hot_bb)) + if (!(n_insns <= max_cold_insns + || contains_hot_bb + || (taken_edge && optimize_edge_for_speed_p (taken_edge)))) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " FAIL: Jump-thread path not considered: " + "path is not profitable to thread.\n"); + return false; + } + + /* If the path is not small to duplicate and either the entry or + the final destination is probably never executed avoid separating + the cold path since that can lead to spurious diagnostics. */ + if (n_insns > max_cold_insns) { - if (n_insns >= param_max_fsm_thread_path_insns) - { - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, " FAIL: Jump-thread path not considered: " - "the number of instructions on the path " - "exceeds PARAM_MAX_FSM_THREAD_PATH_INSNS.\n"); - return false; - } if (taken_edge && probably_never_executed_edge_p (cfun, taken_edge)) { if (dump_file && (dump_flags & TDF_DETAILS)) @@ -794,14 +815,6 @@ back_threader_profitability::profitable_path_p (const vec &m_path, return false; } } - else if (!m_speed_p && n_insns > 1) - { - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, " FAIL: Jump-thread path not considered: " - "duplication of %i insns is needed and optimizing for size.\n", - n_insns); - return false; - } /* We avoid creating irreducible inner loops unless we thread through a multiway branch, in which case we have deemed it worth losing