From patchwork Wed Jul 13 12:58:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 231 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a98:d5ce:0:b0:178:cc93:bf7d with SMTP id g14csp240468eik; Wed, 13 Jul 2022 05:59:45 -0700 (PDT) X-Google-Smtp-Source: AGRyM1uV5lYr9VuNwhrrEh5n64kUeC4N7H8/tpIn6ysaZ3gu+BXmWEX10gxJ/vRE+SGkL0wBSFAy X-Received: by 2002:a05:6402:f10:b0:43a:b4cb:4d04 with SMTP id i16-20020a0564020f1000b0043ab4cb4d04mr4893639eda.305.1657717185490; Wed, 13 Jul 2022 05:59:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1657717185; cv=none; d=google.com; s=arc-20160816; b=wgG4CGzvHN2PGTKUBLsBCnkPqJX06ekCDmTGTmqJKqy1KXpJg7yrONpK3yaaLGq+ps K3yL10wS2mLeS2avZkq/8HaALkSpeVbaL++z1vOLpwEw4tFH2Jd7V1n8N2PG9I56i4pz TfAhKH0GNMqwxGWbzQyOUgpOZFfuTqbXG5UlTFDwbTj7D2e2rNpX9RixoPl/Mvko4Sfo EaRD0NPZ8FNnbQW6s3Fbr46z3B3+r19L+f883UbZ5rxrOH1Q0mcH14WvREzsQfyGrxG/ BZvoLq48LG1OMxemoV+Yirkq5WfFEzy2OHeKVoCtUN//CAFwpk/Q5JMdU7vP8NjI9cmQ VLFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:message-id :mime-version:subject:to:date:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=rYyFzzFX77aDYrb1bFHBXcqjGjbNg7s2oAwZQg3QuUw=; b=n9+entkXFSQosLHF5JV2lYzwZ4Ud5n5MqOeflFrLwb4Op0WpVN+SCAXi2nqP6YkGUl vynDMmVl0zTqa2L0yDi8PQy83aBf33B945mHX9CwxiCKLfLZl+r8LH0hmmfRQJtlsGh7 5LxHNfXk3rwhSQAi4Mv7wec3QZh0k9L1vdfIkqn8aQoZ+ofsiNNolbpP+7HeyCVw7U6a 1/O52V0cX2ejf9fZkfz2Xj9LMoxpNHNgJpE4oOoT8D1XHPcpcwt8gx3GpqWtqvJ95orT KIPSjkbOJFWsmFJHNTbsxgh84X/cmXmMczMwYODB6yw8iq8DsBio3NOzKc/YVrBgzHaA 7glQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=T4IoRAYj; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id oz7-20020a1709077d8700b007269ef1872esi3665964ejc.897.2022.07.13.05.59.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 13 Jul 2022 05:59:45 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=T4IoRAYj; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0DA433856DCD for ; Wed, 13 Jul 2022 12:59:10 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0DA433856DCD DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1657717150; bh=rYyFzzFX77aDYrb1bFHBXcqjGjbNg7s2oAwZQg3QuUw=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=T4IoRAYjkHYqdui6psyfjeMWiRl0T2srDBNP7XNxcmmevF420r1pvT+85c9FJ5WC0 wmpLeniQHrDfc1RsXV5Wofvrc0BcDkCjEm+vh0jHkoN4VS350oz+NiOD79XRzTNsCm 7vohIUeqvCPXEhLAvIPkTU/JWroyciHBZNBKD6Ns= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [IPv6:2001:67c:2178:6::1c]) by sourceware.org (Postfix) with ESMTPS id 9F9C23838F0A for ; Wed, 13 Jul 2022 12:58:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9F9C23838F0A Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 972AB33D02 for ; Wed, 13 Jul 2022 12:58:24 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 8315113754 for ; Wed, 13 Jul 2022 12:58:24 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id SbCYHnDBzmIpNwAAMHmgww (envelope-from ) for ; Wed, 13 Jul 2022 12:58:24 +0000 Date: Wed, 13 Jul 2022 14:58:24 +0200 (CEST) To: gcc-patches@gcc.gnu.org Subject: [PATCH] Speed up DOM record_temporary_equivalences MIME-Version: 1.0 Message-Id: <20220713125824.8315113754@imap2.suse-dmz.suse.de> X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Biener via Gcc-patches From: Richard Biener Reply-To: Richard Biener Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1738242455100242070?= X-GMAIL-MSGID: =?utf-8?q?1738242455100242070?= The following gets away computing a dominance bitmap when fast queries are not available and we are doing back_propagate_equivalences. The comuted bitmap can be cheaply kept up-to-date during the domwalk since it is simply the set of blocks on the domwalk stack. Abstraction of the threading makes this somewhat awkward but it also fulfills the fixme comment in only considering equivalences in already (domwalk) visited blocks, even when querying from the outgoing block of a forward thread. Maybe that's not what is intended but at least we have no testsuite coverage of such missed equivalences. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-ssa-dom.h (record_temporary_equivalences): Remove. * tree-ssa-dom.cc (dom_jt_state::m_blocks_on_stack): New. (dom_jt_state::get_blocks_on_stack): Likewise. (dom_opt_dom_walker::dom_opt_dom_walker): Take dom_jt_state. (back_propagate_equivalences): Remove dominator bitmap compute and instead use passed in m_blocks_on_stack. (record_temporary_equivalences): Likewise. (record_equivalences_from_incoming_edge): Likewise. (dom_opt_dom_walker::before_dom_children): Maintain and pass down blocks on stack. (dom_opt_dom_walker::after_dom_children): Likewise. --- gcc/tree-ssa-dom.cc | 67 +++++++++++++++++++++------------------------ gcc/tree-ssa-dom.h | 3 -- 2 files changed, 31 insertions(+), 39 deletions(-) diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc index 43acc756c96..f5e8f574997 100644 --- a/gcc/tree-ssa-dom.cc +++ b/gcc/tree-ssa-dom.cc @@ -112,7 +112,8 @@ static void record_equality (tree, tree, class const_and_copies *); static void record_equivalences_from_phis (basic_block); static void record_equivalences_from_incoming_edge (basic_block, class const_and_copies *, - class avail_exprs_stack *); + class avail_exprs_stack *, + bitmap blocks_on_stack); static void eliminate_redundant_computations (gimple_stmt_iterator *, class const_and_copies *, class avail_exprs_stack *); @@ -120,6 +121,8 @@ static void record_equivalences_from_stmt (gimple *, int, class avail_exprs_stack *); static void dump_dominator_optimization_stats (FILE *file, hash_table *); +static void record_temporary_equivalences (edge, class const_and_copies *, + class avail_exprs_stack *, bitmap); /* Constructor for EDGE_INFO. An EDGE_INFO instance is always associated with an edge E. */ @@ -591,6 +594,7 @@ public: dom_jt_state (const_and_copies *copies, avail_exprs_stack *avails) : m_copies (copies), m_avails (avails) { + bitmap_tree_view (m_blocks_on_stack); } void push (edge e) override { @@ -606,12 +610,16 @@ public: } void register_equivs_edge (edge e) override { - record_temporary_equivalences (e, m_copies, m_avails); + record_temporary_equivalences (e, m_copies, m_avails, m_blocks_on_stack); } void register_equiv (tree dest, tree src, bool update) override; + bitmap get_blocks_on_stack () { return m_blocks_on_stack; } private: const_and_copies *m_copies; avail_exprs_stack *m_avails; + /* Set of blocks on the stack, to be used for medium-fast + dominance queries in back_propagate_equivalences. */ + auto_bitmap m_blocks_on_stack; }; void @@ -653,7 +661,7 @@ class dom_opt_dom_walker : public dom_walker public: dom_opt_dom_walker (cdi_direction direction, jump_threader *threader, - jt_state *state, + dom_jt_state *state, gimple_ranger *ranger, const_and_copies *const_and_copies, avail_exprs_stack *avail_exprs_stack) @@ -693,7 +701,7 @@ private: jump_threader *m_threader; gimple_ranger *m_ranger; - jt_state *m_state; + dom_jt_state *m_state; }; /* Jump threading, redundancy elimination and const/copy propagation. @@ -962,7 +970,7 @@ dom_valueize (tree t) static void back_propagate_equivalences (tree lhs, edge e, class const_and_copies *const_and_copies, - bitmap *domby) + bitmap domby) { use_operand_p use_p; imm_use_iterator iter; @@ -997,29 +1005,12 @@ back_propagate_equivalences (tree lhs, edge e, } else { - /* Profiling has shown the domination tests here can be fairly - expensive when the fast indexes are not computed. - We get significant improvements by building the - set of blocks that dominate BB. We can then just test - for set membership below. - - We also initialize the set lazily since often the only uses - are going to be in the same block as DEST. */ - - if (!*domby) - { - *domby = BITMAP_ALLOC (NULL); - bitmap_tree_view (*domby); - basic_block bb = get_immediate_dominator (CDI_DOMINATORS, dest); - while (bb) - { - bitmap_set_bit (*domby, bb->index); - bb = get_immediate_dominator (CDI_DOMINATORS, bb); - } - } - + /* We can use the set of BBs on the stack from a domwalk + for a medium fast way to query dominance. Profiling + has shown non-fast query dominance tests here can be fairly + expensive. */ /* This tests if USE_STMT does not dominate DEST. */ - if (!bitmap_bit_p (*domby, gimple_bb (use_stmt)->index)) + if (!bitmap_bit_p (domby, gimple_bb (use_stmt)->index)) continue; } @@ -1037,10 +1028,11 @@ back_propagate_equivalences (tree lhs, edge e, by traversing edge E (which are cached in E->aux). Callers are responsible for managing the unwinding markers. */ -void +static void record_temporary_equivalences (edge e, class const_and_copies *const_and_copies, - class avail_exprs_stack *avail_exprs_stack) + class avail_exprs_stack *avail_exprs_stack, + bitmap blocks_on_stack) { int i; class edge_info *edge_info = (class edge_info *) e->aux; @@ -1055,7 +1047,6 @@ record_temporary_equivalences (edge e, for (i = 0; edge_info->cond_equivalences.iterate (i, &eq); ++i) avail_exprs_stack->record_cond (eq); - bitmap domby = NULL; edge_info::equiv_pair *seq; for (i = 0; edge_info->simple_equivalences.iterate (i, &seq); ++i) { @@ -1092,10 +1083,9 @@ record_temporary_equivalences (edge e, /* Any equivalence found for LHS may result in additional equivalences for other uses of LHS that we have already processed. */ - back_propagate_equivalences (lhs, e, const_and_copies, &domby); + back_propagate_equivalences (lhs, e, const_and_copies, + blocks_on_stack); } - if (domby) - BITMAP_FREE (domby); } } @@ -1267,7 +1257,8 @@ dom_opt_dom_walker::set_global_ranges_from_unreachable_edges (basic_block bb) static void record_equivalences_from_incoming_edge (basic_block bb, class const_and_copies *const_and_copies, - class avail_exprs_stack *avail_exprs_stack) + class avail_exprs_stack *avail_exprs_stack, + bitmap blocks_on_stack) { edge e; basic_block parent; @@ -1282,7 +1273,8 @@ record_equivalences_from_incoming_edge (basic_block bb, /* If we had a single incoming edge from our parent block, then enter any data associated with the edge into our tables. */ if (e && e->src == parent) - record_temporary_equivalences (e, const_and_copies, avail_exprs_stack); + record_temporary_equivalences (e, const_and_copies, avail_exprs_stack, + blocks_on_stack); } /* Dump statistics for the hash table HTAB. */ @@ -1517,9 +1509,11 @@ dom_opt_dom_walker::before_dom_children (basic_block bb) far to unwind when we finalize this block. */ m_avail_exprs_stack->push_marker (); m_const_and_copies->push_marker (); + bitmap_set_bit (m_state->get_blocks_on_stack (), bb->index); record_equivalences_from_incoming_edge (bb, m_const_and_copies, - m_avail_exprs_stack); + m_avail_exprs_stack, + m_state->get_blocks_on_stack ()); set_global_ranges_from_unreachable_edges (bb); /* PHI nodes can create equivalences too. */ @@ -1594,6 +1588,7 @@ void dom_opt_dom_walker::after_dom_children (basic_block bb) { m_threader->thread_outgoing_edges (bb); + bitmap_clear_bit (m_state->get_blocks_on_stack (), bb->index); m_avail_exprs_stack->pop_to_marker (); m_const_and_copies->pop_to_marker (); } diff --git a/gcc/tree-ssa-dom.h b/gcc/tree-ssa-dom.h index 9df830798bb..98154c5313f 100644 --- a/gcc/tree-ssa-dom.h +++ b/gcc/tree-ssa-dom.h @@ -21,8 +21,5 @@ along with GCC; see the file COPYING3. If not see #define GCC_TREE_SSA_DOM_H extern bool simple_iv_increment_p (gimple *); -extern void record_temporary_equivalences (edge, - class const_and_copies *, - class avail_exprs_stack *); #endif /* GCC_TREE_SSA_DOM_H */