From patchwork Tue Oct 17 11:34:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 154103
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, rdapp.gcc@gmail.com,
 palmer@rivosinc.com, jeffreyalaw@gmail.com, lehua.ding@rivai.ai
Subject: [PATCH V2 09/14] RISC-V: P9: Cleanup post optimize phase
Date: Tue, 17 Oct 2023 19:34:55 +0800
Message-Id: <20231017113500.1160997-10-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
In-Reply-To: <20231017113500.1160997-1-lehua.ding@rivai.ai>
References: <20231017113500.1160997-1-lehua.ding@rivai.ai>
X-BeenThere:
 gcc-patches@gcc.gnu.org

This sub-patch deletes the parts of the post-optimization code that are now
implemented in the main phase, and moves the remaining cleanup code into the
pre_vsetvl class.

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (pre_vsetvl::cleaup): New.
	(pre_vsetvl::remove_avl_operand): New.
	(pre_vsetvl::remove_unused_dest_operand): New.
	(pass_vsetvl::get_vsetvl_at_end): Removed.
	(local_avl_compatible_p): Removed.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
	(get_first_vsetvl_before_rvv_insns): Removed.
	(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
	(pass_vsetvl::ssa_post_optimization): Removed.
	(has_no_uses): Removed.
	(pass_vsetvl::df_post_optimization): Removed.
	(pass_vsetvl::init): Removed.
	(pass_vsetvl::done): Removed.
	(pass_vsetvl::lazy_vsetvl): Removed.
---
 gcc/config/riscv/riscv-vsetvl.cc | 675 ++++---------------------------
 1 file changed, 76 insertions(+), 599 deletions(-)

--
2.36.3

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 5d84d290e9e..ac636623b3f 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3791,6 +3791,82 @@ pre_vsetvl::emit_vsetvl ()
   commit_edge_insertions ();
 }
 
+void
+pre_vsetvl::cleaup ()
+{
+  remove_avl_operand ();
+  remove_unused_dest_operand ();
+}
+
+void
+pre_vsetvl::remove_avl_operand ()
+{
+  for (const bb_info *bb : crtl->ssa->bbs ())
+    for (insn_info *insn : bb->real_nondebug_insns ())
+      {
+        rtx_insn *rinsn = insn->rtl ();
+        /* Erase the AVL operand from the instruction.  */
+        if (!has_vl_op (rinsn) || !REG_P (get_vl (rinsn)))
+          continue;
+        rtx avl = get_vl (rinsn);
+        if (count_regno_occurrences (rinsn, REGNO (avl)) == 1)
+          {
+            /* Get the list of uses for the new instruction.  */
+            auto attempt = crtl->ssa->new_change_attempt ();
+            insn_change change (insn);
+            /* Remove the use of the substituted value.  */
+            access_array_builder uses_builder (attempt);
+            uses_builder.reserve (insn->num_uses () - 1);
+            for (use_info *use : insn->uses ())
+              if (use != find_access (insn->uses (), REGNO (avl)))
+                uses_builder.quick_push (use);
+            use_array new_uses = use_array (uses_builder.finish ());
+            change.new_uses = new_uses;
+            change.move_range = insn->ebb ()->insn_range ();
+            rtx pat;
+            if (fault_first_load_p (rinsn))
+              pat = simplify_replace_rtx (PATTERN (rinsn), avl, const0_rtx);
+            else
+              {
+                rtx set = single_set (rinsn);
+                rtx src = simplify_replace_rtx (SET_SRC (set), avl, const0_rtx);
+                pat = gen_rtx_SET (SET_DEST (set), src);
+              }
+            bool ok = change_insn (crtl->ssa, change, insn, pat);
+            gcc_assert (ok);
+          }
+      }
+}
+
+void
+pre_vsetvl::remove_unused_dest_operand ()
+{
+  df_analyze ();
+  hash_set<rtx_insn *> to_delete;
+  basic_block cfg_bb;
+  rtx_insn *rinsn;
+  FOR_ALL_BB_FN (cfg_bb, cfun)
+    {
+      FOR_BB_INSNS (cfg_bb, rinsn)
+        {
+          if (NONDEBUG_INSN_P (rinsn) && vsetvl_insn_p (rinsn))
+            {
+              rtx vl = get_vl (rinsn);
+              vsetvl_info info = vsetvl_info (rinsn);
+              if (has_no_uses (cfg_bb, rinsn, REGNO (vl)))
+                {
+                  if (!info.has_vlmax_avl ())
+                    {
+                      rtx new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, info,
+                                                    NULL_RTX);
+                      validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
+                                               false);
+                    }
+                }
+            }
+        }
+    }
+}
 
 const pass_data pass_data_vsetvl = {
   RTL_PASS, /* type */
@@ -3923,602 +3999,3 @@ make_pass_vsetvl (gcc::context *ctxt)
 {
   return new pass_vsetvl (ctxt);
 }
-
-/* Some instruction can not be accessed in RTL_SSA when we don't re-init
-   the new RTL_SSA framework but it is definetely at the END of the block.
-
-   Here we optimize the VSETVL is hoisted by LCM:
-
-   Before LCM:
-     bb 1:
-       vsetvli a5,a2,e32,m1,ta,mu
-     bb 2:
-       vsetvli zero,a5,e32,m1,ta,mu
-       ...
-
-   After LCM:
-     bb 1:
-       vsetvli a5,a2,e32,m1,ta,mu
-       LCM INSERTED: vsetvli zero,a5,e32,m1,ta,mu --> eliminate
-     bb 2:
-       ...
-   */
-rtx_insn *
-pass_vsetvl::get_vsetvl_at_end (const bb_info *bb, vector_insn_info *dem) const
-{
-  rtx_insn *end_vsetvl = BB_END (bb->cfg_bb ());
-  if (end_vsetvl && NONDEBUG_INSN_P (end_vsetvl))
-    {
-      if (JUMP_P (end_vsetvl))
-        end_vsetvl = PREV_INSN (end_vsetvl);
-
-      if (NONDEBUG_INSN_P (end_vsetvl)
-          && vsetvl_discard_result_insn_p (end_vsetvl))
-        {
-          /* Only handle single succ. here, multiple succ. is much
-             more complicated.  */
-          if (single_succ_p (bb->cfg_bb ()))
-            {
-              edge e = single_succ_edge (bb->cfg_bb ());
-              *dem = get_block_info (e->dest).local_dem;
-              return end_vsetvl;
-            }
-        }
-    }
-  return nullptr;
-}
-
-/* This predicator should only used within same basic block.  */
-static bool
-local_avl_compatible_p (rtx avl1, rtx avl2)
-{
-  if (!REG_P (avl1) || !REG_P (avl2))
-    return false;
-
-  return REGNO (avl1) == REGNO (avl2);
-}
-
-/* Local user vsetvl optimizaiton:
-
-     Case 1:
-       vsetvl a5,a4,e8,mf8
-       ...
-       vsetvl zero,a5,e8,mf8  --> Eliminate directly.
-
-     Case 2:
-       vsetvl a5,a4,e8,mf8    --> vsetvl a5,a4,e32,mf2
-       ...
-       vsetvl zero,a5,e32,mf2 --> Eliminate directly.  */
-void
-pass_vsetvl::local_eliminate_vsetvl_insn (const bb_info *bb) const
-{
-  rtx_insn *prev_vsetvl = nullptr;
-  rtx_insn *curr_vsetvl = nullptr;
-  rtx vl_placeholder = RVV_VLMAX;
-  rtx prev_avl = vl_placeholder;
-  rtx curr_avl = vl_placeholder;
-  vector_insn_info prev_dem;
-
-  /* Instruction inserted by LCM is not appeared in RTL-SSA yet, try to
-     found those instruciton.  */
-  if (rtx_insn *end_vsetvl = get_vsetvl_at_end (bb, &prev_dem))
-    {
-      prev_avl = get_avl (end_vsetvl);
-      prev_vsetvl = end_vsetvl;
-    }
-
-  bool skip_one = false;
-  /* Backward propgate vsetvl info, drop the later one (prev_vsetvl) if it's
-     compatible with current vsetvl (curr_avl), and merge the vtype and avl
-     info. into current vsetvl.  */
-  for (insn_info *insn : bb->reverse_real_nondebug_insns ())
-    {
-      rtx_insn *rinsn = insn->rtl ();
-      const auto &curr_dem = get_vector_info (insn);
-      bool need_invalidate = false;
-
-      /* Skip if this insn already handled in last iteration.  */
-      if (skip_one)
-        {
-          skip_one = false;
-          continue;
-        }
-
-      if (vsetvl_insn_p (rinsn))
-        {
-          curr_vsetvl = rinsn;
-          /* vsetvl are using vl rather than avl since it will try to merge
-             with other vsetvl_discard_result.
-
-                       v--- avl
-             vsetvl a5,a4,e8,mf8   # vsetvl
-             ...    ^--- vl
-             vsetvl zero,a5,e8,mf8 # vsetvl_discard_result
-                         ^--- avl
-           */
-          curr_avl = get_vl (rinsn);
-          /* vsetvl is a cut point of local backward vsetvl elimination.  */
-          need_invalidate = true;
-        }
-      else if (has_vtype_op (rinsn) && NONDEBUG_INSN_P (PREV_INSN (rinsn))
-               && (vsetvl_discard_result_insn_p (PREV_INSN (rinsn))
-                   || vsetvl_insn_p (PREV_INSN (rinsn))))
-        {
-          curr_vsetvl = PREV_INSN (rinsn);
-
-          if (vsetvl_insn_p (PREV_INSN (rinsn)))
-            {
-              /* Need invalidate and skip if it's vsetvl.  */
-              need_invalidate = true;
-              /* vsetvl_discard_result_insn_p won't appeared in RTL-SSA,
-               * so only need to skip for vsetvl.  */
-              skip_one = true;
-            }
-
-          curr_avl = curr_dem.get_avl ();
-
-          /* Some instrucion like pred_extract_first don't reqruie avl, so
-             the avl is null, use vl_placeholder for unify the handling
-             logic.  */
-          if (!curr_avl)
-            curr_avl = vl_placeholder;
-        }
-      else if (insn->is_call () || insn->is_asm ()
-               || find_access (insn->defs (), VL_REGNUM)
-               || find_access (insn->defs (), VTYPE_REGNUM)
-               || (REG_P (prev_avl)
-                   && find_access (insn->defs (), REGNO (prev_avl))))
-        {
-          /* Invalidate if this insn can't propagate vl, vtype or avl.  */
-          need_invalidate = true;
-          prev_dem = vector_insn_info ();
-        }
-      else
-        /* Not interested instruction.  */
-        continue;
-
-      /* Local AVL compatibility checking is simpler than global, we only
-         need to check the REGNO is same.  */
-      if (prev_dem.valid_or_dirty_p ()
-          && prev_dem.skip_avl_compatible_p (curr_dem)
-          && local_avl_compatible_p (prev_avl, curr_avl))
-        {
-          /* curr_dem and prev_dem is compatible!  */
-          /* Update avl info since we need to make sure they are fully
-             compatible before merge.  */
-          prev_dem.set_avl_info (curr_dem.get_avl_info ());
-          /* Merge both and update into curr_vsetvl.  */
-          prev_dem = curr_dem.local_merge (prev_dem);
-          change_vsetvl_insn (curr_dem.get_insn (), prev_dem);
-          /* Then we can drop prev_vsetvl.  */
-          eliminate_insn (prev_vsetvl);
-        }
-
-      if (need_invalidate)
-        {
-          prev_vsetvl = nullptr;
-          curr_vsetvl = nullptr;
-          prev_avl = vl_placeholder;
-          curr_avl = vl_placeholder;
-          prev_dem = vector_insn_info ();
-        }
-      else
-        {
-          prev_vsetvl = curr_vsetvl;
-          prev_avl = curr_avl;
-          prev_dem = curr_dem;
-        }
-    }
-}
-
-/* Return the first vsetvl instruction in CFG_BB or NULL if
-   none exists or if a user RVV instruction is enountered
-   prior to any vsetvl.  */
-static rtx_insn *
-get_first_vsetvl_before_rvv_insns (basic_block cfg_bb,
-                                   enum vsetvl_type insn_type)
-{
-  gcc_assert (insn_type == VSETVL_DISCARD_RESULT
-              || insn_type == VSETVL_VTYPE_CHANGE_ONLY);
-  rtx_insn *rinsn;
-  FOR_BB_INSNS (cfg_bb, rinsn)
-    {
-      if (!NONDEBUG_INSN_P (rinsn))
-        continue;
-      /* If we don't find any inserted vsetvli before user RVV instructions,
-         we don't need to optimize the vsetvls in this block.  */
-      if (has_vtype_op (rinsn) || vsetvl_insn_p (rinsn))
-        return nullptr;
-
-      if (insn_type == VSETVL_DISCARD_RESULT
-          && vsetvl_discard_result_insn_p (rinsn))
-        return rinsn;
-      if (insn_type == VSETVL_VTYPE_CHANGE_ONLY
-          && vsetvl_vtype_change_only_p (rinsn))
-        return rinsn;
-    }
-  return nullptr;
-}
-
-/* Global user vsetvl optimizaiton:
-
-     Case 1:
-     bb 1:
-       vsetvl a5,a4,e8,mf8
-       ...
-     bb 2:
-       ...
-       vsetvl zero,a5,e8,mf8  --> Eliminate directly.
-
-     Case 2:
-     bb 1:
-       vsetvl a5,a4,e8,mf8    --> vsetvl a5,a4,e32,mf2
-       ...
-     bb 2:
-       ...
-       vsetvl zero,a5,e32,mf2 --> Eliminate directly.
-
-     Case 3:
-     bb 1:
-       vsetvl a5,a4,e8,mf8    --> vsetvl a5,a4,e32,mf2
-       ...
-     bb 2:
-       ...
-       vsetvl a5,a4,e8,mf8    --> vsetvl a5,a4,e32,mf2
-       goto bb 3
-     bb 3:
-       ...
-       vsetvl zero,a5,e32,mf2 --> Eliminate directly.
-*/
-bool
-pass_vsetvl::global_eliminate_vsetvl_insn (const bb_info *bb) const
-{
-  rtx_insn *vsetvl_rinsn = NULL;
-  vector_insn_info dem = vector_insn_info ();
-  const auto &block_info = get_block_info (bb);
-  basic_block cfg_bb = bb->cfg_bb ();
-
-  if (block_info.local_dem.valid_or_dirty_p ())
-    {
-      /* Optimize the local vsetvl.  */
-      dem = block_info.local_dem;
-      vsetvl_rinsn
-        = get_first_vsetvl_before_rvv_insns (cfg_bb, VSETVL_DISCARD_RESULT);
-    }
-  if (!vsetvl_rinsn)
-    /* Optimize the global vsetvl inserted by LCM.  */
-    vsetvl_rinsn = get_vsetvl_at_end (bb, &dem);
-
-  /* No need to optimize if block doesn't have vsetvl instructions.  */
-  if (!dem.valid_or_dirty_p () || !vsetvl_rinsn || !dem.get_avl_source ()
-      || !dem.has_avl_reg ())
-    return false;
-
-  /* Condition 1: Check it has preds.  */
-  if (EDGE_COUNT (cfg_bb->preds) == 0)
-    return false;
-
-  /* If all preds has VL/VTYPE status setted by user vsetvls, and these
-     user vsetvls are all skip_avl_compatible_p with the vsetvl in this
-     block, we can eliminate this vsetvl instruction.  */
-  sbitmap avin = m_vector_manager->vector_avin[cfg_bb->index];
-
-  unsigned int bb_index;
-  sbitmap_iterator sbi;
-  rtx avl = dem.get_avl ();
-  hash_set<set_info *> sets
-    = get_all_sets (dem.get_avl_source (), true, false, false);
-  /* Condition 2: All VL/VTYPE available in are all compatible.  */
-  EXECUTE_IF_SET_IN_BITMAP (avin, 0, bb_index, sbi)
-    {
-      const auto &expr = m_vector_manager->vector_exprs[bb_index];
-      const auto &insn = expr->get_insn ();
-      def_info *def = find_access (insn->defs (), REGNO (avl));
-      set_info *set = safe_dyn_cast<set_info *> (def);
-      if (!vsetvl_insn_p (insn->rtl ()) || insn->bb () == bb
-          || !sets.contains (set))
-        return false;
-    }
-
-  /* Condition 3: We don't do the global optimization for the block
-     has a pred is entry block or exit block.  */
-  /* Condition 4: All preds have available VL/VTYPE out.  */
-  edge e;
-  edge_iterator ei;
-  FOR_EACH_EDGE (e, ei, cfg_bb->preds)
-    {
-      sbitmap avout = m_vector_manager->vector_avout[e->src->index];
-      if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun)
-          || e->src == EXIT_BLOCK_PTR_FOR_FN (cfun)
-          || (unsigned int) e->src->index
-               >= m_vector_manager->vector_block_infos.length ()
-          || bitmap_empty_p (avout))
-        return false;
-
-      EXECUTE_IF_SET_IN_BITMAP (avout, 0, bb_index, sbi)
-        {
-          const auto &expr = m_vector_manager->vector_exprs[bb_index];
-          const auto &insn = expr->get_insn ();
-          def_info *def = find_access (insn->defs (), REGNO (avl));
-          set_info *set = safe_dyn_cast<set_info *> (def);
-          if (!vsetvl_insn_p (insn->rtl ()) || insn->bb () == bb
-              || !sets.contains (set) || !expr->skip_avl_compatible_p (dem))
-            return false;
-        }
-    }
-
-  /* Step1: Reshape the VL/VTYPE status to make sure everything compatible.  */
-  auto_vec<basic_block> pred_cfg_bbs
-    = get_dominated_by (CDI_POST_DOMINATORS, cfg_bb);
-  FOR_EACH_EDGE (e, ei, cfg_bb->preds)
-    {
-      sbitmap avout = m_vector_manager->vector_avout[e->src->index];
-      EXECUTE_IF_SET_IN_BITMAP (avout, 0, bb_index, sbi)
-        {
-          vector_insn_info prev_dem = *m_vector_manager->vector_exprs[bb_index];
-          vector_insn_info curr_dem = dem;
-          insn_info *insn = prev_dem.get_insn ();
-          if (!pred_cfg_bbs.contains (insn->bb ()->cfg_bb ()))
-            continue;
-          /* Update avl info since we need to make sure they are fully
-             compatible before merge.  */
-          curr_dem.set_avl_info (prev_dem.get_avl_info ());
-          /* Merge both and update into curr_vsetvl.  */
-          prev_dem = curr_dem.local_merge (prev_dem);
-          change_vsetvl_insn (insn, prev_dem);
-        }
-    }
-
-  /* Step2: eliminate the vsetvl instruction.  */
-  eliminate_insn (vsetvl_rinsn);
-  return true;
-}
-
-/* This function does the following post optimization base on RTL_SSA:
-
-   1. Local user vsetvl optimizations.
-   2. Global user vsetvl optimizations.
-   3. AVL dependencies removal:
-      Before VSETVL PASS, RVV instructions pattern is depending on AVL operand
-      implicitly.  Since we will emit VSETVL instruction and make RVV
-      instructions depending on VL/VTYPE global status registers, we remove the
-      such AVL operand in the RVV instructions pattern here in order to remove
-      AVL dependencies when AVL operand is a register operand.
-
-      Before the VSETVL PASS:
-        li a5,32
-        ...
-        vadd.vv (..., a5)
-      After the VSETVL PASS:
-        li a5,32
-        vsetvli zero, a5, ...
-        ...
-        vadd.vv (..., const_int 0).  */
-void
-pass_vsetvl::ssa_post_optimization (void) const
-{
-  for (const bb_info *bb : crtl->ssa->bbs ())
-    {
-      local_eliminate_vsetvl_insn (bb);
-      bool changed_p = true;
-      while (changed_p)
-        {
-          changed_p = false;
-          changed_p |= global_eliminate_vsetvl_insn (bb);
-        }
-      for (insn_info *insn : bb->real_nondebug_insns ())
-        {
-          rtx_insn *rinsn = insn->rtl ();
-          if (vlmax_avl_insn_p (rinsn))
-            {
-              eliminate_insn (rinsn);
-              continue;
-            }
-
-          /* Erase the AVL operand from the instruction.  */
-          if (!has_vl_op (rinsn) || !REG_P (get_vl (rinsn)))
-            continue;
-          rtx avl = get_vl (rinsn);
-          if (count_regno_occurrences (rinsn, REGNO (avl)) == 1)
-            {
-              /* Get the list of uses for the new instruction.  */
-              auto attempt = crtl->ssa->new_change_attempt ();
-              insn_change change (insn);
-              /* Remove the use of the substituted value.  */
-              access_array_builder uses_builder (attempt);
-              uses_builder.reserve (insn->num_uses () - 1);
-              for (use_info *use : insn->uses ())
-                if (use != find_access (insn->uses (), REGNO (avl)))
-                  uses_builder.quick_push (use);
-              use_array new_uses = use_array (uses_builder.finish ());
-              change.new_uses = new_uses;
-              change.move_range = insn->ebb ()->insn_range ();
-              rtx pat;
-              if (fault_first_load_p (rinsn))
-                pat = simplify_replace_rtx (PATTERN (rinsn), avl, const0_rtx);
-              else
-                {
-                  rtx set = single_set (rinsn);
-                  rtx src
-                    = simplify_replace_rtx (SET_SRC (set), avl, const0_rtx);
-                  pat = gen_rtx_SET (SET_DEST (set), src);
-                }
-              bool ok = change_insn (crtl->ssa, change, insn, pat);
-              gcc_assert (ok);
-            }
-        }
-    }
-}
-
-/* Return true if the SET result is not used by any instructions.  */
-static bool
-has_no_uses (basic_block cfg_bb, rtx_insn *rinsn, int regno)
-{
-  /* Handle the following case that can not be detected in RTL_SSA.  */
-  /* E.g.
-          li a5, 100
-          vsetvli a6, a5...
-          ...
-          vadd (use a6)
-
-    The use of "a6" is removed from "vadd" but the information is
-    not updated in RTL_SSA framework.  We don't want to re-new
-    a new RTL_SSA which is expensive, instead, we use data-flow
-    analysis to check whether "a6" has no uses.  */
-  if (bitmap_bit_p (df_get_live_out (cfg_bb), regno))
-    return false;
-
-  rtx_insn *iter;
-  for (iter = NEXT_INSN (rinsn); iter && iter != NEXT_INSN (BB_END (cfg_bb));
-       iter = NEXT_INSN (iter))
-    if (df_find_use (iter, regno_reg_rtx[regno]))
-      return false;
-
-  return true;
-}
-
-/* This function does the following post optimization base on dataflow
-   analysis:
-
-   1. Change vsetvl rd, rs1 --> vsevl zero, rs1, if rd is not used by any
-   nondebug instructions.  Even though this PASS runs after RA and it doesn't
-   help for reduce register pressure, it can help instructions scheduling since
-   we remove the dependencies.
-
-   2. Remove redundant user vsetvls base on outcome of Phase 4 (LCM) && Phase 5
-   (AVL dependencies removal).  */
-void
-pass_vsetvl::df_post_optimization (void) const
-{
-  df_analyze ();
-  hash_set<rtx_insn *> to_delete;
-  basic_block cfg_bb;
-  rtx_insn *rinsn;
-  FOR_ALL_BB_FN (cfg_bb, cfun)
-    {
-      FOR_BB_INSNS (cfg_bb, rinsn)
-        {
-          if (NONDEBUG_INSN_P (rinsn) && vsetvl_insn_p (rinsn))
-            {
-              rtx vl = get_vl (rinsn);
-              vector_insn_info info;
-              info.parse_insn (rinsn);
-              bool to_delete_p = m_vector_manager->to_delete_p (rinsn);
-              bool to_refine_p = m_vector_manager->to_refine_p (rinsn);
-              if (has_no_uses (cfg_bb, rinsn, REGNO (vl)))
-                {
-                  if (to_delete_p)
-                    to_delete.add (rinsn);
-                  else if (to_refine_p)
-                    {
-                      rtx new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY,
-                                                    info, NULL_RTX);
-                      validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
-                                               false);
-                    }
-                  else if (!vlmax_avl_p (info.get_avl ()))
-                    {
-                      rtx new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, info,
-                                                    NULL_RTX);
-                      validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
-                                               false);
-                    }
-                }
-            }
-        }
-    }
-  for (rtx_insn *rinsn : to_delete)
-    eliminate_insn (rinsn);
-}
-
-void
-pass_vsetvl::init (void)
-{
-  if (optimize > 0)
-    {
-      /* Initialization of RTL_SSA.  */
-      calculate_dominance_info (CDI_DOMINATORS);
-      calculate_dominance_info (CDI_POST_DOMINATORS);
-      df_analyze ();
-      crtl->ssa = new function_info (cfun);
-    }
-
-  m_vector_manager = new vector_infos_manager ();
-  compute_probabilities ();
-
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    {
-      fprintf (dump_file, "\nPrologue: Initialize vector infos\n");
-      m_vector_manager->dump (dump_file);
-    }
-}
-
-void
-pass_vsetvl::done (void)
-{
-  if (optimize > 0)
-    {
-      /* Finalization of RTL_SSA.  */
-      free_dominance_info (CDI_DOMINATORS);
-      free_dominance_info (CDI_POST_DOMINATORS);
-      if (crtl->ssa->perform_pending_updates ())
-        cleanup_cfg (0);
-      delete crtl->ssa;
-      crtl->ssa = nullptr;
-    }
-  m_vector_manager->release ();
-  delete m_vector_manager;
-  m_vector_manager = nullptr;
-}
-
-/* Lazy vsetvl insertion for optimize > 0.  */
-void
-pass_vsetvl::lazy_vsetvl (void)
-{
-  if (dump_file)
-    fprintf (dump_file,
-             "\nEntering Lazy VSETVL PASS and Handling %d basic blocks for "
-             "function:%s\n",
-             n_basic_blocks_for_fn (cfun), function_name (cfun));
-
-  /* Phase 1 - Compute the local dems within each block.
-     The data-flow analysis within each block is backward analysis.  */
-  if (dump_file)
-    fprintf (dump_file, "\nPhase 1: Compute local backward vector infos\n");
-  for (const bb_info *bb : crtl->ssa->bbs ())
-    compute_local_backward_infos (bb);
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    m_vector_manager->dump (dump_file);
-
-  /* Phase 2 - Emit vsetvl instructions within each basic block according to
-     demand, compute and save ANTLOC && AVLOC of each block.  */
-  if (dump_file)
-    fprintf (dump_file,
-             "\nPhase 2: Emit vsetvl instruction within each block\n");
-  for (const bb_info *bb : crtl->ssa->bbs ())
-    emit_local_forward_vsetvls (bb);
-  if (dump_file && (dump_flags & TDF_DETAILS))
-    m_vector_manager->dump (dump_file);
-
-  /* Phase 3 - Propagate demanded info across blocks.  */
-  if (dump_file)
-    fprintf (dump_file, "\nPhase 3: Demands propagation across blocks\n");
-  vsetvl_fusion ();
-
-  /* Phase 4 - Lazy code motion.  */
-  if (dump_file)
-    fprintf (dump_file, "\nPhase 4: PRE vsetvl by Lazy code motion (LCM)\n");
-  pre_vsetvl ();
-
-  /* Phase 5 - Post optimization base on RTL_SSA.  */
-  if (dump_file)
-    fprintf (dump_file, "\nPhase 5: Post optimization base on RTL_SSA\n");
-  ssa_post_optimization ();
-
-  /* Phase 6 - Post optimization base on data-flow analysis.  */
-  if (dump_file)
-    fprintf (dump_file,
-             "\nPhase 6: Post optimization base on data-flow analysis\n");
-  df_post_optimization ();
-}
-
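
Illustration (not part of the patch): the rewrite that remove_unused_dest_operand performs, turning "vsetvl rd, rs1" into "vsetvl zero, rs1" when rd is never read afterwards, can be sketched outside GCC as below. This is a toy model under stated assumptions, not the pass itself: ToyInsn, ToyBlock, has_no_uses_toy, discard_unused_vl_results and run_example are hypothetical names, registers are plain integers, and the live-out set is written by hand instead of coming from df_analyze.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these exist in GCC.
struct ToyInsn {
  std::string opcode;     // e.g. "vsetvl", "vadd", "li"
  int dest;               // register written, or -1 for "zero"/none
  std::vector<int> uses;  // registers read
  bool avl_is_vlmax;      // true when a vsetvl requests VLMAX as its AVL
};

struct ToyBlock {
  std::vector<ToyInsn> insns;
  std::set<int> live_out;  // registers live on block exit (hand-written)
};

// Conservative check in the spirit of has_no_uses: REG counts as unused
// after insns[idx] only if it is not live out of the block and no later
// instruction in the block reads it.
bool has_no_uses_toy(const ToyBlock &bb, std::size_t idx, int reg) {
  if (bb.live_out.count(reg))
    return false;
  for (std::size_t i = idx + 1; i < bb.insns.size(); ++i)
    for (int use : bb.insns[i].uses)
      if (use == reg)
        return false;
  return true;
}

// Turn "vsetvl rd, rs1" into "vsetvl zero, rs1" when rd is never read.
// VLMAX requests are skipped, mirroring the !info.has_vlmax_avl () guard.
// Returns the number of instructions rewritten.
int discard_unused_vl_results(ToyBlock &bb) {
  int changed = 0;
  for (std::size_t i = 0; i < bb.insns.size(); ++i) {
    ToyInsn &insn = bb.insns[i];
    if (insn.opcode != "vsetvl" || insn.dest < 0 || insn.avl_is_vlmax)
      continue;
    if (has_no_uses_toy(bb, i, insn.dest)) {
      insn.dest = -1;
      ++changed;
    }
  }
  return changed;
}

// Small worked example: register 16 (a6) is never read, 14 (a4) is.
int run_example() {
  ToyBlock bb;
  bb.insns = {
      {"li", 15, {}, false},
      {"vsetvl", 16, {15}, false},  // result unused: dest is dropped
      {"vadd", -1, {}, false},      // its AVL use was already removed
      {"vsetvl", 14, {15}, false},  // result read below: dest is kept
      {"add", 13, {14}, false},
  };
  int changed = discard_unused_vl_results(bb);
  assert(bb.insns[1].dest == -1);
  assert(bb.insns[3].dest == 14);
  return changed;
}
```

The check is deliberately conservative, as in the patch: a register that is live out of the block is treated as used even if it is redefined first. The sketch keeps the VLMAX guard only because the patch has one; it does not model the reason for it.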