From patchwork Thu Oct 19 08:33:31 2023
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 155378
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, rdapp.gcc@gmail.com,
 palmer@rivosinc.com, jeffreyalaw@gmail.com, lehua.ding@rivai.ai
Subject: [PATCH V3 09/11] RISC-V: P9: Cleanup and reorganize helper functions
Date: Thu, 19 Oct 2023 16:33:31 +0800
Message-Id: <20231019083333.2052340-10-lehua.ding@rivai.ai>
In-Reply-To: <20231019083333.2052340-1-lehua.ding@rivai.ai>
References: <20231019083333.2052340-1-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (debug): Removed.
        (bitmap_union_of_preds_with_entry): New.
        (compute_reaching_defintion): New.
        (vlmax_avl_p): New.
        (enum vsetvl_type): Moved.
        (enum emit_type): Moved.
        (vlmul_to_str): Moved.
        (vlmax_avl_insn_p): Moved.
        (policy_to_str): Moved.
        (loop_basic_block_p): Removed.
        (valid_sew_p): Removed.
        (vsetvl_insn_p): Moved.
        (vsetvl_vtype_change_only_p): Removed.
        (after_or_same_p): Removed.
        (before_p): Removed.
        (anticipatable_occurrence_p): Removed.
        (available_occurrence_p): Removed.
        (insn_should_be_added_p): Moved.
        (get_all_sets): Moved.
        (get_same_bb_set): Moved.
        (gen_vsetvl_pat): Removed.
        (emit_vsetvl_insn): Removed.
        (eliminate_insn): Removed.
        (calculate_vlmul): Moved.
        (insert_vsetvl): Removed.
        (get_max_int_sew): New.
        (get_vl_vtype_info): Removed.
        (get_max_float_sew): New.
        (count_regno_occurrences): Moved.
        (enum def_type): Moved.
        (validate_change_or_fail): Moved.
        (change_insn): Removed.
        (get_all_real_uses): New.
        (get_forward_read_vl_insn): Removed.
        (get_backward_fault_first_load_insn): Removed.
        (change_vsetvl_insn): Removed.
        (avl_source_has_vsetvl_p): Moved.
        (source_equal_p): Moved.
        (same_equiv_note_p): Moved.
        (calculate_sew): Moved.
        (get_expr_id): New.
        (get_regno): New.
        (get_bb_index): New.
        (has_no_uses): Moved.

---
 gcc/config/riscv/riscv-vsetvl.cc | 1153 ++++++++++--------------------
 1 file changed, 383 insertions(+), 770 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 06d02d25cb3..e136351aee5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -18,60 +18,47 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.
 If not see <http://www.gnu.org/licenses/>.  */
 
-/* This pass is to Set VL/VTYPE global status for RVV instructions
-   that depend on VL and VTYPE registers by Lazy code motion (LCM).
-
-   Strategy:
-
-   -  Backward demanded info fusion within block.
-
-   -  Lazy code motion (LCM) based demanded info backward propagation.
-
-   -  RTL_SSA framework for def-use, PHI analysis.
-
-   -  Lazy code motion (LCM) for global VL/VTYPE optimization.
-
-   Assumption:
-
-   -  Each avl operand is either an immediate (must be in range 0 ~ 31) or reg.
-
-   This pass consists of 5 phases:
-
-   -  Phase 1 - compute VL/VTYPE demanded information within each block
-      by backward data-flow analysis.
-
-   -  Phase 2 - Emit vsetvl instructions within each basic block according to
-      demand, compute and save ANTLOC && AVLOC of each block.
-
-   -  Phase 3 - LCM Earliest-edge baseed VSETVL demand fusion.
-
-   -  Phase 4 - Lazy code motion including: compute local properties,
-      pre_edge_lcm and vsetvl insertion && delete edges for LCM results.
-
-   -  Phase 5 - Cleanup AVL operand of RVV instruction since it will not be
-      used any more and VL operand of VSETVL instruction if it is not used by
-      any non-debug instructions.
-
-   -  Phase 6 - DF based post VSETVL optimizations.
-
-   Implementation:
-
-   -  The subroutine of optimize == 0 is simple_vsetvl.
-      This function simplily vsetvl insertion for each RVV
-      instruction. No optimization.
-
-   -  The subroutine of optimize > 0 is lazy_vsetvl.
-      This function optimize vsetvl insertion process by
-      lazy code motion (LCM) layering on RTL_SSA.
-
-   -  get_avl (), get_insn (), get_avl_source ():
-
-        1. get_insn () is the current instruction, find_access (get_insn
-   ())->def is the same as get_avl_source () if get_insn () demand VL.
-        2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source
-   ()->regno ().
-        3. get_avl_source ()->regno () is the REGNO that we backward propagate.
- */
+/* The values of the vl and vtype registers will affect the behavior of RVV
+   insns.  That is, when we need to execute an RVV instruction, we need to set
+   the correct vl and vtype values by executing the vsetvl instruction before.
+   Executing the fewest number of vsetvl instructions while keeping the behavior
+   the same is the problem this pass is trying to solve.  This vsetvl pass is
+   divided into 5 phases:
+
+     - Phase 1 (fuse local vsetvl infos): traverses each Basic Block, parses
+       each instruction in it that affects vl and vtype state and generates an
+       array of vsetvl_info objects.  Then traverse the vsetvl_info array from
+       front to back and perform fusion according to the fusion rules.  The fused
+       vsetvl infos are stored in the vsetvl_block_info object's `infos` field.
+
+     - Phase 2 (earliest fuse global vsetvl infos): The header_info and
+       footer_info of vsetvl_block_info are used as expressions, and the
+       earliest of each expression is computed.  Based on the earliest
+       information, try to lift up the corresponding vsetvl info to the src
+       basic block of the edge (mainly to reduce the total number of vsetvl
+       instructions, this uplift will cause some execution paths to execute
+       vsetvl instructions that shouldn't be there).
+
+     - Phase 3 (pre global vsetvl info): The header_info and footer_info of
+       vsetvl_block_info are used as expressions, and the LCM algorithm is used
+       to compute the header_info that needs to be deleted and the one that
+       needs to be inserted in some edges.
+
+     - Phase 4 (emit vsetvl insns) : Based on the fusion result of Phase 1 and
+       the deletion and insertion information of Phase 3, the mandatory vsetvl
+       instruction insertion, modification and deletion are performed.
+
+     - Phase 5 (cleanup): Clean up the avl operand in the RVV operator
+       instruction and cleanup the unused dest operand of the vsetvl insn.
+
+     After the Phase 1 a virtual CFG of vsetvl_info is generated.  The virtual
+     basic block is represented by vsetvl_block_info, and the virtual vsetvl
+     statements inside are represented by vsetvl_info.  The later phases 2 and 3
+     are constantly modifying and adjusting this virtual CFG.  Phase 4 performs
+     insertion, modification and deletion of vsetvl instructions based on the
+     optimized virtual CFG.  The Phase 1, 2 and 3 do not involve modifications to
+     the RTL.
+*/
 
 #define IN_TARGET_CODE 1
 #define INCLUDE_ALGORITHM
@@ -98,61 +85,180 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "gcse.h"
-#include "riscv-vsetvl.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
 
-static CONSTEXPR const unsigned ALL_SEW[] = {8, 16, 32, 64};
-static CONSTEXPR const vlmul_type ALL_LMUL[]
-  = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2};
-
-DEBUG_FUNCTION void
-debug (const vector_insn_info *info)
+/* Set the bitmap DST to the union of SRC of predecessors of
+   basic block B.
+   It's a bit different from bitmap_union_of_preds in cfganal.cc. This function
+   takes into account the case where pred is ENTRY basic block. The main reason
+   for this difference is to make it easier to insert some special value into
+   the ENTRY base block. For example, vsetvl_info with a status of UNKNOW.  */
+static void
+bitmap_union_of_preds_with_entry (sbitmap dst, sbitmap *src, basic_block b)
 {
-  info->dump (stderr);
+  unsigned int set_size = dst->size;
+  edge e;
+  unsigned ix;
+
+  for (ix = 0; ix < EDGE_COUNT (b->preds); ix++)
+    {
+      e = EDGE_PRED (b, ix);
+      bitmap_copy (dst, src[e->src->index]);
+      break;
+    }
+
+  if (ix == EDGE_COUNT (b->preds))
+    bitmap_clear (dst);
+  else
+    for (ix++; ix < EDGE_COUNT (b->preds); ix++)
+      {
+        unsigned int i;
+        SBITMAP_ELT_TYPE *p, *r;
+
+        e = EDGE_PRED (b, ix);
+        p = src[e->src->index]->elms;
+        r = dst->elms;
+        for (i = 0; i < set_size; i++)
+          *r++ |= *p++;
+      }
 }
 
-DEBUG_FUNCTION void
-debug (const vector_infos_manager *info)
-{
-  info->dump (stderr);
+/* Compute the reaching defintion in and out based on the gen and KILL
+   informations in each Base Blocks.
+   This function references the compute_avaiable implementation in lcm.cc  */
+static void
+compute_reaching_defintion (sbitmap *gen, sbitmap *kill, sbitmap *in,
+                            sbitmap *out)
+{
+  edge e;
+  basic_block *worklist, *qin, *qout, *qend, bb;
+  unsigned int qlen;
+  edge_iterator ei;
+
+  /* Allocate a worklist array/queue.  Entries are only added to the
+     list if they were not already on the list.  So the size is
+     bounded by the number of basic blocks.  */
+  qin = qout = worklist
+    = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
+
+  /* Put every block on the worklist; this is necessary because of the
+     optimistic initialization of AVOUT above.  Use reverse postorder
+     to make the forward dataflow problem require less iterations.  */
+  int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS);
+  int n = pre_and_rev_post_order_compute_fn (cfun, NULL, rpo, false);
+  for (int i = 0; i < n; ++i)
+    {
+      bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]);
+      *qin++ = bb;
+      bb->aux = bb;
+    }
+  free (rpo);
+
+  qin = worklist;
+  qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS];
+  qlen = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS;
+
+  /* Mark blocks which are successors of the entry block so that we
+     can easily identify them below.  */
+  FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs)
+    e->dest->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+
+  /* Iterate until the worklist is empty.  */
+  while (qlen)
+    {
+      /* Take the first entry off the worklist.  */
+      bb = *qout++;
+      qlen--;
+
+      if (qout >= qend)
+        qout = worklist;
+
+      /* Do not clear the aux field for blocks which are successors of the
+         ENTRY block.  That way we never add then to the worklist again.  */
+      if (bb->aux != ENTRY_BLOCK_PTR_FOR_FN (cfun))
+        bb->aux = NULL;
+
+      bitmap_union_of_preds_with_entry (in[bb->index], out, bb);
+
+      if (bitmap_ior_and_compl (out[bb->index], gen[bb->index], in[bb->index],
+                                kill[bb->index]))
+        /* If the out state of this block changed, then we need
+           to add the successors of this block to the worklist
+           if they are not already on the worklist.  */
+        FOR_EACH_EDGE (e, ei, bb->succs)
+          if (!e->dest->aux && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
+            {
+              *qin++ = e->dest;
+              e->dest->aux = e;
+              qlen++;
+
+              if (qin >= qend)
+                qin = worklist;
+            }
+    }
+
+  clear_aux_for_edges ();
+  clear_aux_for_blocks ();
+  free (worklist);
 }
 
-static bool
-vlmax_avl_p (rtx x)
+/* Classification of vsetvl instruction.  */
+enum vsetvl_type
 {
-  return x && rtx_equal_p (x, RVV_VLMAX);
+  VSETVL_NORMAL,
+  VSETVL_VTYPE_CHANGE_ONLY,
+  VSETVL_DISCARD_RESULT,
+  NUM_VSETVL_TYPE
+};
+
+enum emit_type
+{
+  /* emit_insn directly.  */
+  EMIT_DIRECT,
+  EMIT_BEFORE,
+  EMIT_AFTER,
+};
+
+/* dump helper functions */
+static const char *
+vlmul_to_str (vlmul_type vlmul)
+{
+  switch (vlmul)
+    {
+    case LMUL_1:
+      return "m1";
+    case LMUL_2:
+      return "m2";
+    case LMUL_4:
+      return "m4";
+    case LMUL_8:
+      return "m8";
+    case LMUL_RESERVED:
+      return "INVALID LMUL";
+    case LMUL_F8:
+      return "mf8";
+    case LMUL_F4:
+      return "mf4";
+    case LMUL_F2:
+      return "mf2";
+
+    default:
+      gcc_unreachable ();
+    }
 }
 
-static bool
-vlmax_avl_insn_p (rtx_insn *rinsn)
+static const char *
+policy_to_str (bool agnostic_p)
 {
-  return (INSN_CODE (rinsn) == CODE_FOR_vlmax_avlsi
-          || INSN_CODE (rinsn) == CODE_FOR_vlmax_avldi);
+  return agnostic_p ? "agnostic" : "undisturbed";
 }
 
-/* Return true if the block is a loop itself:
-          local_dem
-             __________
-         ____|____     |
-        |        |     |
-        |________|     |
-             |_________|
-          reaching_out
-*/
 static bool
-loop_basic_block_p (const basic_block cfg_bb)
+vlmax_avl_p (rtx x)
 {
-  if (JUMP_P (BB_END (cfg_bb)) && any_condjump_p (BB_END (cfg_bb)))
-    {
-      edge e;
-      edge_iterator ei;
-      FOR_EACH_EDGE (e, ei, cfg_bb->succs)
-        if (e->dest->index == cfg_bb->index)
-          return true;
-    }
-  return false;
+  return x && rtx_equal_p (x, RVV_VLMAX);
 }
 
 /* Return true if it is an RVV instruction depends on VTYPE global
@@ -171,13 +277,6 @@ has_vl_op (rtx_insn *rinsn)
   return recog_memoized (rinsn) >= 0 && get_attr_has_vl_op (rinsn);
 }
 
-/* Is this a SEW value that can be encoded into the VTYPE format.  */
-static bool
-valid_sew_p (size_t sew)
-{
-  return exact_log2 (sew) && sew >= 8 && sew <= 64;
-}
-
 /* Return true if the instruction ignores VLMUL field of VTYPE.  */
 static bool
 ignore_vlmul_insn_p (rtx_insn *rinsn)
@@ -223,7 +322,7 @@ vector_config_insn_p (rtx_insn *rinsn)
 static bool
 vsetvl_insn_p (rtx_insn *rinsn)
 {
-  if (!vector_config_insn_p (rinsn))
+  if (!rinsn || !vector_config_insn_p (rinsn))
     return false;
   return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi
           || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
@@ -239,34 +338,13 @@ vsetvl_discard_result_insn_p (rtx_insn *rinsn)
           || INSN_CODE (rinsn) == CODE_FOR_vsetvl_discard_resultsi);
 }
 
-/* Return true if it is vsetvl zero, zero.  */
-static bool
-vsetvl_vtype_change_only_p (rtx_insn *rinsn)
-{
-  if (!vector_config_insn_p (rinsn))
-    return false;
-  return (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only);
-}
-
-static bool
-after_or_same_p (const insn_info *insn1, const insn_info *insn2)
-{
-  return insn1->compare_with (insn2) >= 0;
-}
-
 static bool
 real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb)
 {
   return insn != nullptr && insn->is_real () && insn->bb () == bb;
 }
 
-static bool
-before_p (const insn_info *insn1, const insn_info *insn2)
-{
-  return insn1->compare_with (insn2) < 0;
-}
-
-/* Helper function to get VL operand.  */
+/* Helper function to get VL operand for VLMAX insn.  */
 static rtx
 get_vl (rtx_insn *rinsn)
 {
@@ -278,224 +356,6 @@ get_vl (rtx_insn *rinsn)
   return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0));
 }
 
-/* An "anticipatable occurrence" is one that is the first occurrence in the
-   basic block, the operands are not modified in the basic block prior
-   to the occurrence and the output is not used between the start of
-   the block and the occurrence.
-
-   For VSETVL instruction, we have these following formats:
-     1. vsetvl zero, rs1.
-     2. vsetvl zero, imm.
-     3. vsetvl rd, rs1.
-
-   So base on these circumstances, a DEM is considered as a local anticipatable
-   occurrence should satisfy these following conditions:
-
-     1). rs1 (avl) are not modified in the basic block prior to the VSETVL.
-     2). rd (vl) are not modified in the basic block prior to the VSETVL.
-     3). rd (vl) is not used between the start of the block and the occurrence.
-
-   Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE
-   is modified prior to the occurrence. This case is already considered as
-   a non-local anticipatable occurrence.
-*/
-static bool
-anticipatable_occurrence_p (const bb_info *bb, const vector_insn_info dem)
-{
-  insn_info *insn = dem.get_insn ();
-  /* The only possible operand we care of VSETVL is AVL.  */
-  if (dem.has_avl_reg ())
-    {
-      /* rs1 (avl) are not modified in the basic block prior to the VSETVL.  */
-      rtx avl = dem.get_avl_or_vl_reg ();
-      if (dem.dirty_p ())
-        {
-          gcc_assert (!vsetvl_insn_p (insn->rtl ()));
-
-          /* Earliest VSETVL will be inserted at the end of the block.  */
-          for (const insn_info *i : bb->real_nondebug_insns ())
-            {
-              /* rs1 (avl) are not modified in the basic block prior to the
-                 VSETVL.  */
-              if (find_access (i->defs (), REGNO (avl)))
-                return false;
-              if (vlmax_avl_p (dem.get_avl ()))
-                {
-                  /* rd (avl) is not used between the start of the block and
-                     the occurrence.  Note: Only for Dirty and VLMAX-avl.  */
-                  if (find_access (i->uses (), REGNO (avl)))
-                    return false;
-                }
-            }
-
-          return true;
-        }
-      else if (!vlmax_avl_p (avl))
-        {
-          set_info *set = dem.get_avl_source ();
-          /* If it's undefined, it's not anticipatable conservatively.  */
-          if (!set)
-            return false;
-          if (real_insn_and_same_bb_p (set->insn (), bb)
-              && before_p (set->insn (), insn))
-            return false;
-          for (insn_info *i = insn->prev_nondebug_insn ();
-               real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ())
-            {
-              /* rs1 (avl) are not modified in the basic block prior to the
-                 VSETVL.  */
-              if (find_access (i->defs (), REGNO (avl)))
-                return false;
-            }
-        }
-    }
-
-  /* rd (vl) is not used between the start of the block and the occurrence.  */
-  if (vsetvl_insn_p (insn->rtl ()))
-    {
-      rtx dest = get_vl (insn->rtl ());
-      for (insn_info *i = insn->prev_nondebug_insn ();
-           real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ())
-        {
-          /* rd (vl) is not used between the start of the block and the
-           * occurrence.  */
-          if (find_access (i->uses (), REGNO (dest)))
-            return false;
-          /* rd (vl) are not modified in the basic block prior to the VSETVL.  */
-          if (find_access (i->defs (), REGNO (dest)))
-            return false;
-        }
-    }
-
-  return true;
-}
-
-/* An "available occurrence" is one that is the last occurrence in the
-   basic block and the operands are not modified by following statements in
-   the basic block [including this insn].
-
-   For VSETVL instruction, we have these following formats:
-     1. vsetvl zero, rs1.
-     2. vsetvl zero, imm.
-     3. vsetvl rd, rs1.
-
-   So base on these circumstances, a DEM is considered as a local available
-   occurrence should satisfy these following conditions:
-
-     1). rs1 (avl) are not modified by following statements in
-         the basic block.
-     2). rd (vl) are not modified by following statements in
-         the basic block.
-
-   Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE
-   is modified prior to the occurrence. This case is already considered as
-   a non-local available occurrence.
-*/
-static bool
-available_occurrence_p (const bb_info *bb, const vector_insn_info dem)
-{
-  insn_info *insn = dem.get_insn ();
-  /* The only possible operand we care of VSETVL is AVL.  */
-  if (dem.has_avl_reg ())
-    {
-      if (!vlmax_avl_p (dem.get_avl ()))
-        {
-          rtx dest = NULL_RTX;
-          insn_info *i = insn;
-          if (vsetvl_insn_p (insn->rtl ()))
-            {
-              dest = get_vl (insn->rtl ());
-              /* For user vsetvl a2, a2 instruction, we consider it as
-                 available even though it modifies "a2".  */
-              i = i->next_nondebug_insn ();
-            }
-          for (; real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ())
-            {
-              if (read_vl_insn_p (i->rtl ()))
-                continue;
-              /* rs1 (avl) are not modified by following statements in
-                 the basic block.  */
-              if (find_access (i->defs (), REGNO (dem.get_avl ())))
-                return false;
-              /* rd (vl) are not modified by following statements in
-                 the basic block.  */
-              if (dest && find_access (i->defs (), REGNO (dest)))
-                return false;
-            }
-        }
-    }
-  return true;
-}
-
-static bool
-insn_should_be_added_p (const insn_info *insn, unsigned int types)
-{
-  if (insn->is_real () && (types & REAL_SET))
-    return true;
-  if (insn->is_phi () && (types & PHI_SET))
-    return true;
-  if (insn->is_bb_head () && (types & BB_HEAD_SET))
-    return true;
-  if (insn->is_bb_end () && (types & BB_END_SET))
-    return true;
-  return false;
-}
-
-/* Recursively find all define instructions. The kind of instruction is
-   specified by the DEF_TYPE.  */
-static hash_set<set_info *>
-get_all_sets (phi_info *phi, unsigned int types)
-{
-  hash_set<set_info *> insns;
-  auto_vec<phi_info *> work_list;
-  hash_set<phi_info *> visited_list;
-  if (!phi)
-    return hash_set<set_info *> ();
-  work_list.safe_push (phi);
-
-  while (!work_list.is_empty ())
-    {
-      phi_info *phi = work_list.pop ();
-      visited_list.add (phi);
-      for (use_info *use : phi->inputs ())
-        {
-          def_info *def = use->def ();
-          set_info *set = safe_dyn_cast<set_info *> (def);
-          if (!set)
-            return hash_set<set_info *> ();
-
-          gcc_assert (!set->insn ()->is_debug_insn ());
-
-          if (insn_should_be_added_p (set->insn (), types))
-            insns.add (set);
-          if (set->insn ()->is_phi ())
-            {
-              phi_info *new_phi = as_a<phi_info *> (set);
-              if (!visited_list.contains (new_phi))
-                work_list.safe_push (new_phi);
-            }
-        }
-    }
-  return insns;
-}
-
-static hash_set<set_info *>
-get_all_sets (set_info *set, bool /* get_real_inst */ real_p,
-              bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p)
-{
-  if (real_p && phi_p && param_p)
-    return get_all_sets (safe_dyn_cast<phi_info *> (set),
-                         REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET);
-
-  else if (real_p && param_p)
-    return get_all_sets (safe_dyn_cast<phi_info *> (set),
-                         REAL_SET | BB_HEAD_SET | BB_END_SET);
-
-  else if (real_p)
-    return get_all_sets (safe_dyn_cast<phi_info *> (set), REAL_SET);
-  return hash_set<set_info *> ();
-}
-
 /* Helper function to get AVL operand.  */
 static rtx
 get_avl (rtx_insn *rinsn)
@@ -511,15 +371,6 @@ get_avl (rtx_insn *rinsn)
   return recog_data.operand[get_attr_vl_op_idx (rinsn)];
 }
 
-static set_info *
-get_same_bb_set (hash_set<set_info *> &sets, const basic_block cfg_bb)
-{
-  for (set_info *set : sets)
-    if (set->bb ()->cfg_bb () == cfg_bb)
-      return set;
-  return nullptr;
-}
-
 /* Helper function to get SEW operand. We always have SEW value for
    all RVV instructions that have VTYPE OP.  */
 static uint8_t
@@ -589,365 +440,174 @@ has_vector_insn (function *fn)
   return false;
 }
 
-/* Emit vsetvl instruction.  */
-static rtx
-gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
-{
-  rtx avl = info.get_avl ();
-  /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s,
-     set the value of avl to (const_int 0) so that VSETVL PASS will
-     insert vsetvl correctly.*/
-  if (info.has_avl_no_reg ())
-    avl = GEN_INT (0);
-  rtx sew = gen_int_mode (info.get_sew (), Pmode);
-  rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode);
-  rtx ta = gen_int_mode (info.get_ta (), Pmode);
-  rtx ma = gen_int_mode (info.get_ma (), Pmode);
-
-  if (insn_type == VSETVL_NORMAL)
-    {
-      gcc_assert (vl != NULL_RTX);
-      return gen_vsetvl (Pmode, vl, avl, sew, vlmul, ta, ma);
-    }
-  else if (insn_type == VSETVL_VTYPE_CHANGE_ONLY)
-    return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma);
-  else
-    return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma);
-}
-
-static rtx
-gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info,
-                rtx vl = NULL_RTX)
-{
-  rtx new_pat;
-  vl_vtype_info new_info = info;
-  if (info.get_insn () && info.get_insn ()->rtl ()
-      && fault_first_load_p (info.get_insn ()->rtl ()))
-    new_info.set_avl_info (
-      avl_info (get_avl (info.get_insn ()->rtl ()), nullptr));
-  if (vl)
-    new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl);
-  else
-    {
-      if (vsetvl_insn_p (rinsn))
-        new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, get_vl (rinsn));
-      else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only)
-        new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX);
-      else
-        new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX);
-    }
-  return new_pat;
-}
-
-static void
-emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type,
-                  const vl_vtype_info &info, rtx vl, rtx_insn *rinsn)
-{
-  rtx pat = gen_vsetvl_pat (insn_type, info, vl);
-  if (dump_file)
-    {
-      fprintf (dump_file, "\nInsert vsetvl insn PATTERN:\n");
-      print_rtl_single (dump_file, pat);
-      fprintf (dump_file, "\nfor insn:\n");
-      print_rtl_single (dump_file, rinsn);
-    }
-
-  if (emit_type == EMIT_DIRECT)
-    emit_insn (pat);
-  else if (emit_type == EMIT_BEFORE)
-    emit_insn_before (pat, rinsn);
-  else
-    emit_insn_after (pat, rinsn);
-}
-
-static void
-eliminate_insn (rtx_insn *rinsn)
+static vlmul_type
+calculate_vlmul (unsigned int sew, unsigned int ratio)
 {
-  if (dump_file)
-    {
-      fprintf (dump_file, "\nEliminate insn %d:\n", INSN_UID (rinsn));
-      print_rtl_single (dump_file, rinsn);
-    }
-  if (in_sequence_p ())
-    remove_insn (rinsn);
-  else
-    delete_insn (rinsn);
+  const vlmul_type ALL_LMUL[]
+    = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2};
+  for (const vlmul_type vlmul : ALL_LMUL)
+    if (calculate_ratio (sew, vlmul) == ratio)
+      return vlmul;
+  return LMUL_RESERVED;
 }
 
-static vsetvl_type
-insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn,
-               const vector_insn_info &info, const vector_insn_info &prev_info)
+/* Get the currently supported maximum sew used in the int rvv instructions.  */
+static uint8_t
+get_max_int_sew ()
 {
-  /* Use X0, X0 form if the AVL is the same and the SEW+LMUL gives the same
-     VLMAX.  */
-  if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p ()
-      && info.compatible_avl_p (prev_info) && info.same_vlmax_p (prev_info))
-    {
-      emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX,
-                        rinsn);
-      return VSETVL_VTYPE_CHANGE_ONLY;
-    }
-
-  if (info.has_avl_imm ())
-    {
-      emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX,
-                        rinsn);
-      return VSETVL_DISCARD_RESULT;
-    }
-
-  if (info.has_avl_no_reg ())
-    {
-      /* We can only use x0, x0 if there's no chance of the vtype change causing
-         the previous vl to become invalid.  */
-      if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p ()
-          && info.same_vlmax_p (prev_info))
-        {
-          emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX,
-                            rinsn);
-          return VSETVL_VTYPE_CHANGE_ONLY;
-        }
-      /* Otherwise use an AVL of 0 to avoid depending on previous vl.  */
-      vl_vtype_info new_info = info;
-      new_info.set_avl_info (avl_info (const0_rtx, nullptr));
-      emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, new_info, NULL_RTX,
-                        rinsn);
-      return VSETVL_DISCARD_RESULT;
-    }
-
-  /* Use X0 as the DestReg unless AVLReg is X0. We also need to change the
-     opcode if the AVLReg is X0 as they have different register classes for
-     the AVL operand.  */
-  if (vlmax_avl_p (info.get_avl ()))
-    {
-      gcc_assert (has_vtype_op (rinsn) || vsetvl_insn_p (rinsn));
-      /* For user vsetvli a5, zero, we should use get_vl to get the VL
-         operand "a5".  */
-      rtx vl_op = info.get_avl_or_vl_reg ();
-      gcc_assert (!vlmax_avl_p (vl_op));
-      emit_vsetvl_insn (VSETVL_NORMAL, emit_type, info, vl_op, rinsn);
-      return VSETVL_NORMAL;
-    }
-
-  emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, rinsn);
-
-  if (dump_file)
-    {
-      fprintf (dump_file, "Update VL/VTYPE info, previous info=");
-      prev_info.dump (dump_file);
-    }
-  return VSETVL_DISCARD_RESULT;
+  if (TARGET_VECTOR_ELEN_64)
+    return 64;
+  else if (TARGET_VECTOR_ELEN_32)
+    return 32;
+  gcc_unreachable ();
 }
 
-/* Get VL/VTYPE information for INSN.  */
-static vl_vtype_info
-get_vl_vtype_info (const insn_info *insn)
-{
-  set_info *set = nullptr;
-  rtx avl = ::get_avl (insn->rtl ());
-  if (avl && REG_P (avl))
-    {
-      if (vlmax_avl_p (avl) && has_vl_op (insn->rtl ()))
-        set
-          = find_access (insn->uses (), REGNO (get_vl (insn->rtl ())))->def ();
-      else if (!vlmax_avl_p (avl))
-        set = find_access (insn->uses (), REGNO (avl))->def ();
-      else
-        set = nullptr;
-    }
-
-  uint8_t sew = get_sew (insn->rtl ());
-  enum vlmul_type vlmul = get_vlmul (insn->rtl ());
-  uint8_t ratio = get_attr_ratio (insn->rtl ());
-  /* when get_attr_ratio is invalid, this kind of instructions
-     doesn't care about ratio. However, we still need this value
-     in demand info backward analysis.  */
-  if (ratio == INVALID_ATTRIBUTE)
-    ratio = calculate_ratio (sew, vlmul);
-  bool ta = tail_agnostic_p (insn->rtl ());
-  bool ma = mask_agnostic_p (insn->rtl ());
-
-  /* If merge operand is undef value, we prefer agnostic.  */
-  int merge_op_idx = get_attr_merge_op_idx (insn->rtl ());
-  if (merge_op_idx != INVALID_ATTRIBUTE
-      && satisfies_constraint_vu (recog_data.operand[merge_op_idx]))
-    {
-      ta = true;
-      ma = true;
-    }
-
-  vl_vtype_info info (avl_info (avl, set), sew, vlmul, ratio, ta, ma);
-  return info;
-}
+/* Get the currently supported maximum sew used in the float rvv instructions.
+ */
+static uint8_t
+get_max_float_sew ()
+{
+  if (TARGET_VECTOR_ELEN_FP_64)
+    return 64;
+  else if (TARGET_VECTOR_ELEN_FP_32)
+    return 32;
+  else if (TARGET_VECTOR_ELEN_FP_16)
+    return 16;
+  gcc_unreachable ();
+}
+
+/* Count the number of REGNO in RINSN.  */
+static int
+count_regno_occurrences (rtx_insn *rinsn, unsigned int regno)
+{
+  int count = 0;
+  extract_insn (rinsn);
+  for (int i = 0; i < recog_data.n_operands; i++)
+    if (refers_to_regno_p (regno, recog_data.operand[i]))
+      count++;
+  return count;
+}
+
+enum def_type
+{
+  REAL_SET = 1 << 0,
+  PHI_SET = 1 << 1,
+  BB_HEAD_SET = 1 << 2,
+  BB_END_SET = 1 << 3,
+  /* ??? TODO: In RTL_SSA framework, we have REAL_SET,
+     PHI_SET, BB_HEAD_SET, BB_END_SET and
+     CLOBBER_DEF def_info types. Currently,
+     we conservatively do not optimize clobber
+     def since we don't see the case that we
+     need to optimize it.  */
+  CLOBBER_DEF = 1 << 4
+};
 
-/* Change insn and Assert the change always happens.  */
-static void
-validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group)
+static bool
+insn_should_be_added_p (const insn_info *insn, unsigned int types)
 {
-  bool change_p = validate_change (object, loc, new_rtx, in_group);
-  gcc_assert (change_p);
+  if (insn->is_real () && (types & REAL_SET))
+    return true;
+  if (insn->is_phi () && (types & PHI_SET))
+    return true;
+  if (insn->is_bb_head () && (types & BB_HEAD_SET))
+    return true;
+  if (insn->is_bb_end () && (types & BB_END_SET))
+    return true;
+  return false;
 }
 
-static void
-change_insn (rtx_insn *rinsn, rtx new_pat)
+static const hash_set<use_info *>
+get_all_real_uses (insn_info *insn, unsigned regno)
 {
-  /* We don't apply change on RTL_SSA here since it's possible a
-     new INSN we add in the PASS before which doesn't have RTL_SSA
-     info yet.*/
-  if (dump_file)
-    {
-      fprintf (dump_file, "\nChange PATTERN of insn %d from:\n",
-               INSN_UID (rinsn));
-      print_rtl_single (dump_file, PATTERN (rinsn));
-    }
+  gcc_assert (insn->is_real ());
 
-  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
+  hash_set<use_info *> uses;
+  auto_vec<phi_info *> work_list;
+  hash_set<phi_info *> visited_list;
 
-  if (dump_file)
+  for (def_info *def : insn->defs ())
     {
-      fprintf (dump_file, "\nto:\n");
-      print_rtl_single (dump_file, PATTERN (rinsn));
+      if (!def->is_reg () || def->regno () != regno)
+        continue;
+      set_info *set = safe_dyn_cast<set_info *> (def);
+      if (!set)
+        continue;
+      for (use_info *use : set->nondebug_insn_uses ())
+        if (use->insn ()->is_real ())
+          uses.add (use);
+      for (use_info *use : set->phi_uses ())
+        work_list.safe_push (use->phi ());
     }
-}
 
-static const insn_info *
-get_forward_read_vl_insn (const insn_info *insn)
-{
-  const bb_info
*bb = insn->bb (); - for (const insn_info *i = insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) + while (!work_list.is_empty ()) { - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; - if (read_vl_insn_p (i->rtl ())) - return i; - } - return nullptr; -} + phi_info *phi = work_list.pop (); + visited_list.add (phi); -static const insn_info * -get_backward_fault_first_load_insn (const insn_info *insn) -{ - const bb_info *bb = insn->bb (); - for (const insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - if (fault_first_load_p (i->rtl ())) - return i; - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; + for (use_info *use : phi->nondebug_insn_uses ()) + if (use->insn ()->is_real ()) + uses.add (use); + for (use_info *use : phi->phi_uses ()) + if (!visited_list.contains (use->phi ())) + work_list.safe_push (use->phi ()); } - return nullptr; + return uses; } -static bool -change_insn (function_info *ssa, insn_change change, insn_info *insn, - rtx new_pat) +/* Recursively find all define instructions. The kind of instruction is + specified by the DEF_TYPE. */ +static hash_set +get_all_sets (phi_info *phi, unsigned int types) { - rtx_insn *rinsn = insn->rtl (); - auto attempt = ssa->new_change_attempt (); - if (!restrict_movement (change)) - return false; + hash_set insns; + auto_vec work_list; + hash_set visited_list; + if (!phi) + return hash_set (); + work_list.safe_push (phi); - if (dump_file) + while (!work_list.is_empty ()) { - fprintf (dump_file, "\nChange PATTERN of insn %d from:\n", - INSN_UID (rinsn)); - print_rtl_single (dump_file, PATTERN (rinsn)); - } - - insn_change_watermark watermark; - validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, true); - - /* These routines report failures themselves. 
*/ - if (!recog (attempt, change) || !change_is_worthwhile (change, false)) - return false; + phi_info *phi = work_list.pop (); + visited_list.add (phi); + for (use_info *use : phi->inputs ()) + { + def_info *def = use->def (); + set_info *set = safe_dyn_cast (def); + if (!set) + return hash_set (); - /* Fix bug: - (insn 12 34 13 2 (set (reg:RVVM4DI 120 v24 [orig:134 _1 ] [134]) - (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (const_int 0 [0]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (sign_extend:RVVM4DI (vec_duplicate:RVVM4SI (reg:SI 15 a5 - [140])))) (unspec:RVVM4DI [ (const_int 0 [0]) ] UNSPEC_VUNDEF))) - "rvv.c":8:12 2784 {pred_single_widen_addsvnx8di_scalar} (expr_list:REG_EQUIV - (mem/c:RVVM4DI (reg:DI 10 a0 [142]) [1 +0 S[64, 64] A128]) - (expr_list:REG_EQUAL (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (reg/v:DI 13 a3 [orig:139 vl ] [139]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (const_vector:RVVM4DI repeat [ - (const_int 2730 [0xaaa]) - ])) - (unspec:RVVM4DI [ - (const_int 0 [0]) - ] UNSPEC_VUNDEF)) - (nil)))) - Here we want to remove use "a3". However, the REG_EQUAL/REG_EQUIV note use - "a3" which made us fail in change_insn. We reference to the - 'aarch64-cc-fusion.cc' and add this method. 
*/ - remove_reg_equal_equiv_notes (rinsn); - confirm_change_group (); - ssa->change_insn (change); + gcc_assert (!set->insn ()->is_debug_insn ()); - if (dump_file) - { - fprintf (dump_file, "\nto:\n"); - print_rtl_single (dump_file, PATTERN (rinsn)); + if (insn_should_be_added_p (set->insn (), types)) + insns.add (set); + if (set->insn ()->is_phi ()) + { + phi_info *new_phi = as_a (set); + if (!visited_list.contains (new_phi)) + work_list.safe_push (new_phi); + } + } } - return true; + return insns; } -static void -change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info, - rtx vl = NULL_RTX) +static hash_set +get_all_sets (set_info *set, bool /* get_real_inst */ real_p, + bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p) { - rtx_insn *rinsn; - if (vector_config_insn_p (insn->rtl ())) - { - rinsn = insn->rtl (); - gcc_assert (vsetvl_insn_p (rinsn) && "Can't handle X0, rs1 vsetvli yet"); - } - else - { - gcc_assert (has_vtype_op (insn->rtl ())); - rinsn = PREV_INSN (insn->rtl ()); - gcc_assert (vector_config_insn_p (rinsn)); - } - rtx new_pat = gen_vsetvl_pat (rinsn, info, vl); - change_insn (rinsn, new_pat); -} + if (real_p && phi_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET); -static bool -avl_source_has_vsetvl_p (set_info *avl_source) -{ - if (!avl_source) - return false; - if (!avl_source->insn ()) - return false; - if (avl_source->insn ()->is_real ()) - return vsetvl_insn_p (avl_source->insn ()->rtl ()); - hash_set sets = get_all_sets (avl_source, true, false, true); - for (const auto set : sets) - { - if (set->insn ()->is_real () && vsetvl_insn_p (set->insn ()->rtl ())) - return true; - } - return false; + else if (real_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | BB_HEAD_SET | BB_END_SET); + + else if (real_p) + return get_all_sets (safe_dyn_cast (set), REAL_SET); + return hash_set (); } static bool @@ -959,93 +619,14 @@ source_equal_p 
(insn_info *insn1, insn_info *insn2) rtx_insn *rinsn2 = insn2->rtl (); if (!rinsn1 || !rinsn2) return false; + rtx note1 = find_reg_equal_equiv_note (rinsn1); rtx note2 = find_reg_equal_equiv_note (rinsn2); - rtx single_set1 = single_set (rinsn1); - rtx single_set2 = single_set (rinsn2); - if (read_vl_insn_p (rinsn1) && read_vl_insn_p (rinsn2)) - { - const insn_info *load1 = get_backward_fault_first_load_insn (insn1); - const insn_info *load2 = get_backward_fault_first_load_insn (insn2); - return load1 && load2 && load1 == load2; - } - if (note1 && note2 && rtx_equal_p (note1, note2)) return true; - - /* Since vsetvl instruction is not single SET. - We handle this case specially here. */ - if (vsetvl_insn_p (insn1->rtl ()) && vsetvl_insn_p (insn2->rtl ())) - { - /* For example: - vsetvl1 a6,a5,e32m1 - RVV 1 (use a6 as AVL) - vsetvl2 a5,a5,e8mf4 - RVV 2 (use a5 as AVL) - We consider AVL of RVV 1 and RVV 2 are same so that we can - gain more optimization opportunities. - - Note: insn1_info.compatible_avl_p (insn2_info) - will make sure there is no instruction between vsetvl1 and vsetvl2 - modify a5 since their def will be different if there is instruction - modify a5 and compatible_avl_p will return false. */ - vector_insn_info insn1_info, insn2_info; - insn1_info.parse_insn (insn1); - insn2_info.parse_insn (insn2); - - /* To avoid dead loop, we don't optimize a vsetvli def has vsetvli - instructions which will complicate the situation. */ - if (avl_source_has_vsetvl_p (insn1_info.get_avl_source ()) - || avl_source_has_vsetvl_p (insn2_info.get_avl_source ())) - return false; - - if (insn1_info.same_vlmax_p (insn2_info) - && insn1_info.compatible_avl_p (insn2_info)) - return true; - } - - /* We only handle AVL is set by instructions with no side effects. */ - if (!single_set1 || !single_set2) - return false; - if (!rtx_equal_p (SET_SRC (single_set1), SET_SRC (single_set2))) - return false; - /* RTL_SSA uses include REG_NOTE. 
Consider this following case: - - insn1 RTL: - (insn 41 39 42 4 (set (reg:DI 26 s10 [orig:159 loop_len_46 ] [159]) - (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 2 uses: - 1. (reg:DI 15 a5 [orig:201 _149 ] [201]) - twice. - 2. (reg:DI 14 a4 [276]) - once. - - insn2 RTL: - (insn 38 353 351 4 (set (reg:DI 27 s11 [orig:160 loop_len_47 ] [160]) - (umin:DI (reg:DI 15 a5 [orig:199 _146 ] [199]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 3 uses: - 1. (reg:DI 15 a5 [orig:199 _146 ] [199]) - once - 2. (reg:DI 14 a4 [276]) - once - 3. (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - once - - Return false when insn1->uses ().size () != insn2->uses ().size () - */ - if (insn1->uses ().size () != insn2->uses ().size ()) - return false; - for (size_t i = 0; i < insn1->uses ().size (); i++) - if (insn1->uses ()[i] != insn2->uses ()[i]) - return false; - return true; + return false; } -/* Helper function to get single same real RTL source. - return NULL if it is not a single real RTL source. */ static insn_info * extract_single_source (set_info *set) { @@ -1066,29 +647,61 @@ extract_single_source (set_info *set) NULL so that VSETVL PASS will insert vsetvl directly. 
*/ if (set->insn ()->is_artificial ()) return nullptr; - if (!source_equal_p (set->insn (), first_insn)) + if (set != *sets.begin () && !source_equal_p (set->insn (), first_insn)) return nullptr; } return first_insn; } +static bool +same_equiv_note_p (set_info *set1, set_info *set2) +{ + insn_info *insn1 = extract_single_source (set1); + insn_info *insn2 = extract_single_source (set2); + if (!insn1 || !insn2) + return false; + return source_equal_p (insn1, insn2); +} + static unsigned -calculate_sew (vlmul_type vlmul, unsigned int ratio) +get_expr_id (unsigned bb_index, unsigned regno, unsigned num_bbs) { - for (const unsigned sew : ALL_SEW) - if (calculate_ratio (sew, vlmul) == ratio) - return sew; - return 0; + return regno * num_bbs + bb_index; +} +static unsigned +get_regno (unsigned expr_id, unsigned num_bb) +{ + return expr_id / num_bb; +} +static unsigned +get_bb_index (unsigned expr_id, unsigned num_bb) +{ + return expr_id % num_bb; } -static vlmul_type -calculate_vlmul (unsigned int sew, unsigned int ratio) +/* Return true if the SET result is not used by any instructions. */ +static bool +has_no_uses (basic_block cfg_bb, rtx_insn *rinsn, int regno) { - for (const vlmul_type vlmul : ALL_LMUL) - if (calculate_ratio (sew, vlmul) == ratio) - return vlmul; - return LMUL_RESERVED; + if (bitmap_bit_p (df_get_live_out (cfg_bb), regno)) + return false; + + rtx_insn *iter; + for (iter = NEXT_INSN (rinsn); iter && iter != NEXT_INSN (BB_END (cfg_bb)); + iter = NEXT_INSN (iter)) + if (df_find_use (iter, regno_reg_rtx[regno])) + return false; + + return true; +} + +/* Change insn and Assert the change always happens. */ +static void +validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group) +{ + bool change_p = validate_change (object, loc, new_rtx, in_group); + gcc_assert (change_p); } /* This flags indicates the minimum demand of the vl and vtype values by the