From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, rdapp.gcc@gmail.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, lehua.ding@rivai.ai
Subject: [PATCH] RISC-V: Refactor and cleanup vsetvl pass
Date: Mon, 16 Oct 2023 22:52:50 +0800
Message-Id: <20231016145250.139806-1-lehua.ding@rivai.ai>
X-Patchwork-Id: 153493

This patch refactors and cleans up the vsetvl pass in order to make the
code easier to modify and understand. It does several things:

1. Introduce a virtual CFG for vsetvl infos. Phases 1, 2 and 3 only
   maintain and modify this virtual CFG. Phase 4 performs insertion,
   modification and deletion of vsetvl insns based on the virtual CFG.
   A basic block in the virtual CFG is called vsetvl_block_info and the
   vsetvl information inside it is called vsetvl_info.
2. Combine the former Phases 1 and 2 into a single Phase 1 and unify the
   demand system. This phase only fuses local vsetvl infos in the forward
   direction.
3. Refactor Phase 3. The decision whether to uplift a vsetvl info to a
   pred basic block now uses one unified criterion: the reaching vsetvl
   definitions contain a vsetvl info that is compatible with it.
4. Place all modification operations on the RTL in Phase 4 and Phase 5.
   Phase 4 is responsible for inserting, modifying and deleting vsetvl
   instructions based on the fully optimized vsetvl infos. Phase 5 removes
   the avl operand from the RVV instructions and removes the unused dest
   operand register from the vsetvl insns.

These modifications required some testcases to be updated. The reasons
for the updates are summarized below:

1. more optimized
   vlmax_back_prop-25.c/vlmax_back_prop-26.c/vlmax_conflict-3.c/
   vsetvl-13.c/vsetvl-23.c/avl_single-21.c/avl_single-23.c/
   avl_single-67.c/avl_single-68.c/avl_single-71.c/avl_single-89.c/
   avl_single-93.c/avl_single-95.c/avl_single-96.c
2. less unnecessary fusion
   avl_single-46.c/imm_bb_prop-1.c/pr109743-2.c/pr109773-1.c
3. local fuse direction (backward -> forward)
   scalar_move-1.c
4. add some bugfix testcases.
   pr111037-3.c/pr111037-4.c/avl_single-89.c

	PR target/111037
	PR target/111234
	PR target/111725

gcc/ChangeLog:

	* config/riscv/riscv-vsetvl.cc (bitmap_union_of_preds_with_entry): New helper function.
	(debug): Removed.
	(compute_reaching_defintion): Compute reaching definition data.
	(enum vsetvl_type): Unchanged.
	(vlmax_avl_p): Unchanged.
	(enum emit_type): Unchanged.
	(vlmul_to_str): Unchanged.
	(vlmax_avl_insn_p): Removed.
	(policy_to_str): Unchanged.
	(loop_basic_block_p): Removed.
	(valid_sew_p): Removed.
	(vsetvl_insn_p): Unchanged.
	(vsetvl_vtype_change_only_p): Removed.
	(after_or_same_p): Removed.
	(before_p): Removed.
	(anticipatable_occurrence_p): Removed.
	(available_occurrence_p): Removed.
	(insn_should_be_added_p): Unchanged.
	(get_all_sets): Unchanged.
	(get_same_bb_set): Unchanged.
	(gen_vsetvl_pat): Unchanged.
	(calculate_vlmul): Unchanged.
	(get_max_int_sew): New.
	(emit_vsetvl_insn): Unchanged.
	(get_max_float_sew): New.
	(eliminate_insn): Unchanged.
	(get_vl2): New.
	(insert_vsetvl): Unchanged.
	(count_regno_occurrences): Unchanged.
	(get_vl_vtype_info): Removed.
	(enum def_type): Unchanged.
	(validate_change_or_fail): Unchanged.
	(change_insn): Unchanged.
	(get_all_real_uses): New.
	(get_forward_read_vl_insn): Removed.
	(get_backward_fault_first_load_insn): Removed.
	(change_vsetvl_insn): Adjust.
	(avl_source_has_vsetvl_p): Removed.
	(source_equal_p): Adjust.
	(extract_single_source): Adjust.
	(calculate_sew): Unchanged.
	(get_expr_id): New.
	(get_regno): New.
	(get_bb_index): New.
	(enum demand_flags): New enum.
	(incompatible_avl_p): Removed.
	(different_sew_p): Removed.
	(enum class):
	(different_lmul_p): Removed.
	(different_ratio_p): Removed.
	(different_tail_policy_p): Removed.
	(different_mask_policy_p): Removed.
	(class vsetvl_info): New class.
	(possible_zero_avl_p): Removed.
	(second_ratio_invalid_for_first_sew_p): Removed.
	(second_ratio_invalid_for_first_lmul_p): Removed.
	(float_insn_valid_sew_p): Removed.
	(second_sew_less_than_first_sew_p): Removed.
	(first_sew_less_than_second_sew_p): Removed.
(compare_lmul): Removed. (second_lmul_less_than_first_lmul_p): Removed. (second_ratio_less_than_first_ratio_p): Removed. (DEF_INCOMPATIBLE_COND): Removed. (greatest_sew): Removed. (first_sew): Removed. (second_sew): Removed. (first_vlmul): Removed. (second_vlmul): Removed. (first_ratio): Removed. (second_ratio): Removed. (vlmul_for_first_sew_second_ratio): Removed. (vlmul_for_greatest_sew_second_ratio): Removed. (ratio_for_second_sew_first_vlmul): Removed. (DEF_SEW_LMUL_FUSE_RULE): Removed. (always_unavailable): Removed. (avl_unavailable_p): Removed. (sew_unavailable_p): Removed. (lmul_unavailable_p): Removed. (ge_sew_unavailable_p): Removed. (ge_sew_lmul_unavailable_p): Removed. (ge_sew_ratio_unavailable_p): Removed. (DEF_UNAVAILABLE_COND): Removed. (same_sew_lmul_demand_p): Removed. (propagate_avl_across_demands_p): Removed. (reg_available_p): Removed. (support_relaxed_compatible_p): Removed. (demands_can_be_fused_p): Removed. (earliest_pred_can_be_fused_p): Removed. (vsetvl_dominated_by_p): Removed. (avl_info::avl_info): Removed. (avl_info::single_source_equal_p): Removed. (avl_info::multiple_source_equal_p): Removed. (avl_info::operator=): Removed. (avl_info::operator==): Removed. (avl_info::operator!=): Removed. (avl_info::has_non_zero_avl): Removed. (vl_vtype_info::vl_vtype_info): Removed. (vl_vtype_info::operator==): Removed. (vl_vtype_info::operator!=): Removed. (vl_vtype_info::same_avl_p): Removed. (vl_vtype_info::same_vtype_p): Removed. (vl_vtype_info::same_vlmax_p): Removed. (vector_insn_info::operator>=): Removed. (vector_insn_info::operator==): Removed. (vector_insn_info::parse_insn): Removed. (vector_insn_info::compatible_p): Removed. (same_equiv_note_p): Unchange. (vector_insn_info::skip_avl_compatible_p): Removed. (class demand_system): New class. (vector_insn_info::compatible_avl_p): Removed. (vector_insn_info::compatible_vtype_p): Removed. (vector_insn_info::available_p): Removed. (vector_insn_info::fuse_avl): Removed. 
(vector_insn_info::fuse_sew_lmul): Removed. (vector_insn_info::fuse_tail_policy): Removed. (vector_insn_info::fuse_mask_policy): Removed. (vector_insn_info::local_merge): Removed. (vector_insn_info::global_merge): Removed. (vector_insn_info::get_avl_or_vl_reg): Removed. (vector_insn_info::update_fault_first_load_avl): Removed. (vector_insn_info::dump): Removed. (vector_infos_manager::vector_infos_manager): Removed. (vector_infos_manager::create_expr): Removed. (vector_infos_manager::get_expr_id): Removed. (vector_infos_manager::all_same_ratio_p): Removed. (vector_infos_manager::all_avail_in_compatible_p): Removed. (vector_infos_manager::all_same_avl_p): Removed. (DEF_SEW_LMUL_RULE): Removed. (vector_infos_manager::expr_set_num): Removed. (vector_infos_manager::release): Removed. (vector_infos_manager::create_bitmap_vectors): Removed. (DEF_POLICY_RULE): Removed. (vector_infos_manager::free_bitmap_vectors): Removed. (vector_infos_manager::dump): Removed. (DEF_AVL_RULE): Removed. (class pass_vsetvl): Adjust. (class vsetvl_block_info): New class. (pass_vsetvl::get_vector_info): Removed. (pass_vsetvl::get_block_info): Removed. (pass_vsetvl::update_vector_info): Removed. (pass_vsetvl::update_block_info): Removed. (pass_vsetvl::simple_vsetvl): Adjust. (pass_vsetvl::compute_local_backward_infos): Removed. (pass_vsetvl::need_vsetvl): Removed. (pass_vsetvl::transfer_before): Removed. (pass_vsetvl::transfer_after): Removed. (class pre_vsetvl): New class. (pass_vsetvl::emit_local_forward_vsetvls): Removed. (pass_vsetvl::prune_expressions): Removed. (pass_vsetvl::compute_local_properties): Removed. (pre_vsetvl::fuse_local_vsetvl_info): New. (pass_vsetvl::earliest_fusion): Removed. (pass_vsetvl::vsetvl_fusion): Removed. (pass_vsetvl::can_refine_vsetvl_p): Removed. (pre_vsetvl::earliest_fuse_vsetvl_info): New. (pass_vsetvl::refine_vsetvls): Removed. (pass_vsetvl::cleanup_vsetvls): Removed. (pass_vsetvl::commit_vsetvls): Removed. (pre_vsetvl::compute_vsetvl_def_data): New. 
	(pass_vsetvl::pre_vsetvl): New.
	(pass_vsetvl::get_vsetvl_at_end): Removed.
	(local_avl_compatible_p): Removed.
	(pre_vsetvl::preds_has_same_avl_p): Removed.
	(pass_vsetvl::local_eliminate_vsetvl_insn): Removed.
	(pre_vsetvl::pre_global_vsetvl_info): New.
	(get_first_vsetvl_before_rvv_insns): Removed.
	(pass_vsetvl::global_eliminate_vsetvl_insn): Removed.
	(pre_vsetvl::emit_vsetvl): Removed.
	(pass_vsetvl::ssa_post_optimization): Removed.
	(pre_vsetvl::cleaup): New.
	(pre_vsetvl::remove_avl_operand): Unchanged.
	(has_no_uses): Unchanged.
	(pass_vsetvl::df_post_optimization): Removed.
	(pre_vsetvl::remove_unused_dest_operand): Unchanged.
	(pass_vsetvl::init): Removed.
	(pass_vsetvl::done): Removed.
	(pass_vsetvl::compute_probabilities): Removed.
	(pass_vsetvl::lazy_vsetvl): New.
	(pass_vsetvl::execute): Adjust.
	* config/riscv/riscv-vsetvl.def (DEF_INCOMPATIBLE_COND): Removed.
	(DEF_SEW_LMUL_RULE): New.
	(DEF_SEW_LMUL_FUSE_RULE): Removed.
	(DEF_POLICY_RULE): New.
	(DEF_UNAVAILABLE_COND): Removed.
	(DEF_AVL_RULE): New.
	(sew_lmul): New demand type.
	(ratio_only): New demand type.
	(sew_only): New demand type.
	(ge_sew): New demand type.
	(ratio_and_ge_sew): New demand type.
	(tail_mask_policy): New demand type.
	(tail_policy_only): New demand type.
	(mask_policy_only): New demand type.
	(ignore_policy): New demand type.
	(avl): New demand type.
	(non_zero_avl): New demand type.
	(ignore_avl): New demand type.
	* config/riscv/riscv-vsetvl.h (enum vsetvl_type): Removed.
	(enum emit_type): Removed.
	(enum demand_type): Removed.
	(enum demand_status): Removed.
	(enum fusion_type): Removed.
	(enum def_type): Removed.
	(class avl_info): Removed.
	(struct vl_vtype_info): Removed.
	(class vector_insn_info): Removed.
	(struct vector_block_info): Removed.
	(class vector_infos_manager): Removed.
	(struct demands_pair): Removed.
	(struct demands_cond): Removed.
	(struct demands_fuse_rule): Removed.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/scalar_move-1.c: Adjust.
	* gcc.target/riscv/rvv/vsetvl/avl_single-21.c: Adjust.
* gcc.target/riscv/rvv/vsetvl/avl_single-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-46.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-67.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-68.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-71.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-89.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-93.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-95.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-96.c: Adjust. * gcc.target/riscv/rvv/vsetvl/imm_bb_prop-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109743-2.c: Adjust. * gcc.target/riscv/rvv/vsetvl/pr109773-1.c: Adjust. * gcc.target/riscv/rvv/base/pr111037-1.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-1.c: ...here. * gcc.target/riscv/rvv/base/pr111037-2.c: Moved to... * gcc.target/riscv/rvv/vsetvl/pr111037-2.c: ...here. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-1.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-13.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-18.c: Adjust. * gcc.target/riscv/rvv/vsetvl/vsetvl-23.c: Adjust. * gcc.target/riscv/rvv/vsetvl/avl_single-104.c: New test. * gcc.target/riscv/rvv/vsetvl/avl_single-105.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: New test. * gcc.target/riscv/rvv/vsetvl/pr111037-4.c: New test. * gcc.target/riscv/rvv/vsetvl/vsetvl-1-1.c: New test. 
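
Note on the reaching-definition computation listed above
(compute_reaching_defintion): the following is a minimal standalone
sketch of the forward dataflow it performs, not the code in this patch.
It assumes at most 64 definitions, so that a 64-bit mask can stand in for
GCC's sbitmap. ToyCfg, ReachingDefs and compute_reaching_defs are
hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical stand-in for a CFG: preds[b] and succs[b] list the
// neighbours of block b.  Not a GCC data structure.
struct ToyCfg {
  std::vector<std::vector<std::size_t>> preds;
  std::vector<std::vector<std::size_t>> succs;
};

// Per-block dataflow result; bit i set means definition i reaches.
struct ReachingDefs {
  std::vector<std::uint64_t> in;
  std::vector<std::uint64_t> out;
};

// Forward worklist iteration to a fixed point:
//   in[b]  = union of out[p] over all predecessors p of b
//   out[b] = gen[b] | (in[b] & ~kill[b])
ReachingDefs compute_reaching_defs(const ToyCfg &cfg,
                                   const std::vector<std::uint64_t> &gen,
                                   const std::vector<std::uint64_t> &kill) {
  const std::size_t n = cfg.preds.size();
  assert(cfg.succs.size() == n && gen.size() == n && kill.size() == n);

  ReachingDefs rd{std::vector<std::uint64_t>(n, 0),
                  std::vector<std::uint64_t>(n, 0)};

  // Start with every block queued so each one is visited at least once.
  std::deque<std::size_t> worklist;
  std::vector<bool> queued(n, true);
  for (std::size_t b = 0; b < n; ++b)
    worklist.push_back(b);

  while (!worklist.empty()) {
    const std::size_t b = worklist.front();
    worklist.pop_front();
    queued[b] = false;

    std::uint64_t in = 0;
    for (std::size_t p : cfg.preds[b])
      in |= rd.out[p];
    rd.in[b] = in;

    const std::uint64_t out = gen[b] | (in & ~kill[b]);
    if (out != rd.out[b]) {
      // The out set changed, so successors must be recomputed.
      rd.out[b] = out;
      for (std::size_t s : cfg.succs[b])
        if (!queued[s]) {
          queued[s] = true;
          worklist.push_back(s);
        }
    }
  }
  return rd;
}
```

The patch's own routine works on basic_block and sbitmap, seeds the
worklist in reverse postorder and treats the ENTRY block specially via
bitmap_union_of_preds_with_entry. The sketch omits all of that and keeps
only the in/out equations.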
--- gcc/config/riscv/riscv-vsetvl.cc | 6621 ++++++++--------- gcc/config/riscv/riscv-vsetvl.def | 634 +- gcc/config/riscv/riscv-vsetvl.h | 459 -- .../gcc.target/riscv/rvv/base/scalar_move-1.c | 2 +- .../riscv/rvv/vsetvl/avl_single-104.c | 35 + .../riscv/rvv/vsetvl/avl_single-105.c | 23 + .../riscv/rvv/vsetvl/avl_single-21.c | 5 +- .../riscv/rvv/vsetvl/avl_single-23.c | 6 +- .../riscv/rvv/vsetvl/avl_single-46.c | 3 +- .../riscv/rvv/vsetvl/avl_single-67.c | 8 +- .../riscv/rvv/vsetvl/avl_single-68.c | 9 +- .../riscv/rvv/vsetvl/avl_single-71.c | 11 +- .../riscv/rvv/vsetvl/avl_single-89.c | 8 +- .../riscv/rvv/vsetvl/avl_single-93.c | 2 +- .../riscv/rvv/vsetvl/avl_single-95.c | 2 +- .../riscv/rvv/vsetvl/avl_single-96.c | 2 +- .../riscv/rvv/vsetvl/imm_bb_prop-1.c | 7 +- .../gcc.target/riscv/rvv/vsetvl/pr109743-2.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/pr109773-1.c | 2 +- .../riscv/rvv/{base => vsetvl}/pr111037-1.c | 0 .../riscv/rvv/{base => vsetvl}/pr111037-2.c | 0 .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c | 16 + .../gcc.target/riscv/rvv/vsetvl/pr111037-4.c | 16 + .../riscv/rvv/vsetvl/vlmax_back_prop-25.c | 10 +- .../riscv/rvv/vsetvl/vlmax_back_prop-26.c | 10 +- .../riscv/rvv/vsetvl/vlmax_conflict-12.c | 1 - .../riscv/rvv/vsetvl/vlmax_conflict-3.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-1-1.c | 18 + .../gcc.target/riscv/rvv/vsetvl/vsetvl-1.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-13.c | 4 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-18.c | 4 +- .../gcc.target/riscv/rvv/vsetvl/vsetvl-23.c | 2 +- 32 files changed, 3247 insertions(+), 4679 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-104.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-105.c rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-1.c (100%) rename gcc/testsuite/gcc.target/riscv/rvv/{base => vsetvl}/pr111037-2.c (100%) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-4.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-1-1.c -- 2.36.3 diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index 4b06d93e7f9..f636b8e68e3 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -18,60 +18,47 @@ You should have received a copy of the GNU General Public License along with GCC; see the file COPYING3. If not see . */ -/* This pass is to Set VL/VTYPE global status for RVV instructions - that depend on VL and VTYPE registers by Lazy code motion (LCM). - - Strategy: - - - Backward demanded info fusion within block. - - - Lazy code motion (LCM) based demanded info backward propagation. - - - RTL_SSA framework for def-use, PHI analysis. - - - Lazy code motion (LCM) for global VL/VTYPE optimization. - - Assumption: - - - Each avl operand is either an immediate (must be in range 0 ~ 31) or reg. - - This pass consists of 5 phases: - - - Phase 1 - compute VL/VTYPE demanded information within each block - by backward data-flow analysis. - - - Phase 2 - Emit vsetvl instructions within each basic block according to - demand, compute and save ANTLOC && AVLOC of each block. - - - Phase 3 - LCM Earliest-edge baseed VSETVL demand fusion. - - - Phase 4 - Lazy code motion including: compute local properties, - pre_edge_lcm and vsetvl insertion && delete edges for LCM results. - - - Phase 5 - Cleanup AVL operand of RVV instruction since it will not be - used any more and VL operand of VSETVL instruction if it is not used by - any non-debug instructions. - - - Phase 6 - DF based post VSETVL optimizations. - - Implementation: - - - The subroutine of optimize == 0 is simple_vsetvl. - This function simplily vsetvl insertion for each RVV - instruction. No optimization. - - - The subroutine of optimize > 0 is lazy_vsetvl. - This function optimize vsetvl insertion process by - lazy code motion (LCM) layering on RTL_SSA. 
-
-    - get_avl (), get_insn (), get_avl_source ():
-
-     1. get_insn () is the current instruction, find_access (get_insn
-   ())->def is the same as get_avl_source () if get_insn () demand VL.
-     2. If get_avl () is non-VLMAX REG, get_avl () == get_avl_source
-   ()->regno ().
-     3. get_avl_source ()->regno () is the REGNO that we backward propagate.
- */
+/* The values of the vl and vtype registers affect the behavior of RVV
+   insns. That is, before executing an RVV instruction, we need to set
+   the correct vl and vtype values by executing a vsetvl instruction.
+   Executing the fewest number of vsetvl instructions while keeping the behavior
+   the same is the problem this pass is trying to solve. This vsetvl pass is
+   divided into 5 phases:
+
+     - Phase 1 (fuse local vsetvl infos): traverses each basic block, parses
+       each instruction in it that affects vl and vtype state and generates an
+       array of vsetvl_info objects. Then traverses the vsetvl_info array from
+       front to back and performs fusion according to the fusion rules. The fused
+       vsetvl infos are stored in the vsetvl_block_info object's `infos` field.
+
+     - Phase 2 (earliest fuse global vsetvl infos): The header_info and
+       footer_info of vsetvl_block_info are used as expressions, and the
+       earliest of each expression is computed. Based on the earliest
+       information, tries to lift the corresponding vsetvl info up to the src
+       basic block of the edge (mainly to reduce the total number of vsetvl
+       instructions; this uplift will cause some execution paths to execute
+       vsetvl instructions that they otherwise would not).
+
+     - Phase 3 (pre global vsetvl info): The header_info and footer_info of
+       vsetvl_block_info are used as expressions, and the LCM algorithm is used
+       to compute the header_info that needs to be deleted and the one that
+       needs to be inserted on some edges.
+
+     - Phase 4 (emit vsetvl insns): Based on the fusion result of Phase 1 and
+       the deletion and insertion information of Phase 3, the mandatory vsetvl
+       instruction insertion, modification and deletion are performed.
+
+     - Phase 5 (cleanup): Clean up the avl operand in the RVV operator
+       instruction and clean up the unused dest operand of the vsetvl insn.
+
+     After Phase 1, a virtual CFG of vsetvl_info is generated. The virtual
+     basic block is represented by vsetvl_block_info, and the virtual vsetvl
+     statements inside are represented by vsetvl_info. Phases 2 and 3 then
+     repeatedly modify and adjust this virtual CFG. Phase 4 performs
+     insertion, modification and deletion of vsetvl instructions based on the
+     optimized virtual CFG. Phases 1, 2 and 3 do not involve modifications to
+     the RTL.
+*/
 
 #define IN_TARGET_CODE 1
 #define INCLUDE_ALGORITHM
@@ -98,61 +85,180 @@ along with GCC; see the file COPYING3. If not see
 #include "predict.h"
 #include "profile-count.h"
 #include "gcse.h"
-#include "riscv-vsetvl.h"
 
 using namespace rtl_ssa;
 using namespace riscv_vector;
 
-static CONSTEXPR const unsigned ALL_SEW[] = {8, 16, 32, 64};
-static CONSTEXPR const vlmul_type ALL_LMUL[]
-  = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2};
+/* Set the bitmap DST to the union of SRC of predecessors of
+   basic block B.
+   It's a bit different from bitmap_union_of_preds in cfganal.cc. This function
+   takes into account the case where a pred is the ENTRY basic block. The reason
+   for this difference is to make it easier to insert some special value into
+   the ENTRY basic block. For example, vsetvl_info with a status of UNKNOWN.
*/ +static void +bitmap_union_of_preds_with_entry (sbitmap dst, sbitmap *src, basic_block b) +{ + unsigned int set_size = dst->size; + edge e; + unsigned ix; + + for (ix = 0; ix < EDGE_COUNT (b->preds); ix++) + { + e = EDGE_PRED (b, ix); + bitmap_copy (dst, src[e->src->index]); + break; + } -DEBUG_FUNCTION void -debug (const vector_insn_info *info) + if (ix == EDGE_COUNT (b->preds)) + bitmap_clear (dst); + else + for (ix++; ix < EDGE_COUNT (b->preds); ix++) + { + unsigned int i; + SBITMAP_ELT_TYPE *p, *r; + + e = EDGE_PRED (b, ix); + p = src[e->src->index]->elms; + r = dst->elms; + for (i = 0; i < set_size; i++) + *r++ |= *p++; + } +} + +/* Compute the reaching defintion in and out based on the gen and KILL + informations in each Base Blocks. + This function references the compute_avaiable implementation in lcm.cc */ +static void +compute_reaching_defintion (sbitmap *gen, sbitmap *kill, sbitmap *in, + sbitmap *out) { - info->dump (stderr); + edge e; + basic_block *worklist, *qin, *qout, *qend, bb; + unsigned int qlen; + edge_iterator ei; + + /* Allocate a worklist array/queue. Entries are only added to the + list if they were not already on the list. So the size is + bounded by the number of basic blocks. */ + qin = qout = worklist + = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS); + + /* Put every block on the worklist; this is necessary because of the + optimistic initialization of AVOUT above. Use reverse postorder + to make the forward dataflow problem require less iterations. 
*/ + int *rpo = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS); + int n = pre_and_rev_post_order_compute_fn (cfun, NULL, rpo, false); + for (int i = 0; i < n; ++i) + { + bb = BASIC_BLOCK_FOR_FN (cfun, rpo[i]); + *qin++ = bb; + bb->aux = bb; + } + free (rpo); + + qin = worklist; + qend = &worklist[n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS]; + qlen = n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS; + + /* Mark blocks which are successors of the entry block so that we + can easily identify them below. */ + FOR_EACH_EDGE (e, ei, ENTRY_BLOCK_PTR_FOR_FN (cfun)->succs) + e->dest->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun); + + /* Iterate until the worklist is empty. */ + while (qlen) + { + /* Take the first entry off the worklist. */ + bb = *qout++; + qlen--; + + if (qout >= qend) + qout = worklist; + + /* Do not clear the aux field for blocks which are successors of the + ENTRY block. That way we never add then to the worklist again. */ + if (bb->aux != ENTRY_BLOCK_PTR_FOR_FN (cfun)) + bb->aux = NULL; + + bitmap_union_of_preds_with_entry (in[bb->index], out, bb); + + if (bitmap_ior_and_compl (out[bb->index], gen[bb->index], in[bb->index], + kill[bb->index])) + /* If the out state of this block changed, then we need + to add the successors of this block to the worklist + if they are not already on the worklist. */ + FOR_EACH_EDGE (e, ei, bb->succs) + if (!e->dest->aux && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)) + { + *qin++ = e->dest; + e->dest->aux = e; + qlen++; + + if (qin >= qend) + qin = worklist; + } + } + + clear_aux_for_edges (); + clear_aux_for_blocks (); + free (worklist); } -DEBUG_FUNCTION void -debug (const vector_infos_manager *info) +/* Classification of vsetvl instruction. */ +enum vsetvl_type { - info->dump (stderr); -} + VSETVL_NORMAL, + VSETVL_VTYPE_CHANGE_ONLY, + VSETVL_DISCARD_RESULT, + NUM_VSETVL_TYPE +}; -static bool -vlmax_avl_p (rtx x) +enum emit_type { - return x && rtx_equal_p (x, RVV_VLMAX); + /* emit_insn directly. 
*/ + EMIT_DIRECT, + EMIT_BEFORE, + EMIT_AFTER, +}; + +/* dump helper functions */ +static const char * +vlmul_to_str (vlmul_type vlmul) +{ + switch (vlmul) + { + case LMUL_1: + return "m1"; + case LMUL_2: + return "m2"; + case LMUL_4: + return "m4"; + case LMUL_8: + return "m8"; + case LMUL_RESERVED: + return "INVALID LMUL"; + case LMUL_F8: + return "mf8"; + case LMUL_F4: + return "mf4"; + case LMUL_F2: + return "mf2"; + + default: + gcc_unreachable (); + } } -static bool -vlmax_avl_insn_p (rtx_insn *rinsn) +static const char * +policy_to_str (bool agnostic_p) { - return (INSN_CODE (rinsn) == CODE_FOR_vlmax_avlsi - || INSN_CODE (rinsn) == CODE_FOR_vlmax_avldi); + return agnostic_p ? "agnostic" : "undisturbed"; } -/* Return true if the block is a loop itself: - local_dem - __________ - ____|____ | - | | | - |________| | - |_________| - reaching_out -*/ static bool -loop_basic_block_p (const basic_block cfg_bb) +vlmax_avl_p (rtx x) { - if (JUMP_P (BB_END (cfg_bb)) && any_condjump_p (BB_END (cfg_bb))) - { - edge e; - edge_iterator ei; - FOR_EACH_EDGE (e, ei, cfg_bb->succs) - if (e->dest->index == cfg_bb->index) - return true; - } - return false; + return x && rtx_equal_p (x, RVV_VLMAX); } /* Return true if it is an RVV instruction depends on VTYPE global @@ -171,13 +277,6 @@ has_vl_op (rtx_insn *rinsn) return recog_memoized (rinsn) >= 0 && get_attr_has_vl_op (rinsn); } -/* Is this a SEW value that can be encoded into the VTYPE format. */ -static bool -valid_sew_p (size_t sew) -{ - return exact_log2 (sew) && sew >= 8 && sew <= 64; -} - /* Return true if the instruction ignores VLMUL field of VTYPE. 
*/ static bool ignore_vlmul_insn_p (rtx_insn *rinsn) @@ -223,7 +322,7 @@ vector_config_insn_p (rtx_insn *rinsn) static bool vsetvl_insn_p (rtx_insn *rinsn) { - if (!vector_config_insn_p (rinsn)) + if (!rinsn || !vector_config_insn_p (rinsn)) return false; return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi); @@ -239,34 +338,13 @@ vsetvl_discard_result_insn_p (rtx_insn *rinsn) || INSN_CODE (rinsn) == CODE_FOR_vsetvl_discard_resultsi); } -/* Return true if it is vsetvl zero, zero. */ -static bool -vsetvl_vtype_change_only_p (rtx_insn *rinsn) -{ - if (!vector_config_insn_p (rinsn)) - return false; - return (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only); -} - -static bool -after_or_same_p (const insn_info *insn1, const insn_info *insn2) -{ - return insn1->compare_with (insn2) >= 0; -} - static bool real_insn_and_same_bb_p (const insn_info *insn, const bb_info *bb) { return insn != nullptr && insn->is_real () && insn->bb () == bb; } -static bool -before_p (const insn_info *insn1, const insn_info *insn2) -{ - return insn1->compare_with (insn2) < 0; -} - -/* Helper function to get VL operand. */ +/* Helper function to get VL operand for VLMAX insn. */ static rtx get_vl (rtx_insn *rinsn) { @@ -278,224 +356,6 @@ get_vl (rtx_insn *rinsn) return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0)); } -/* An "anticipatable occurrence" is one that is the first occurrence in the - basic block, the operands are not modified in the basic block prior - to the occurrence and the output is not used between the start of - the block and the occurrence. - - For VSETVL instruction, we have these following formats: - 1. vsetvl zero, rs1. - 2. vsetvl zero, imm. - 3. vsetvl rd, rs1. - - So base on these circumstances, a DEM is considered as a local anticipatable - occurrence should satisfy these following conditions: - - 1). rs1 (avl) are not modified in the basic block prior to the VSETVL. - 2). 
rd (vl) are not modified in the basic block prior to the VSETVL. - 3). rd (vl) is not used between the start of the block and the occurrence. - - Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE - is modified prior to the occurrence. This case is already considered as - a non-local anticipatable occurrence. -*/ -static bool -anticipatable_occurrence_p (const bb_info *bb, const vector_insn_info dem) -{ - insn_info *insn = dem.get_insn (); - /* The only possible operand we care of VSETVL is AVL. */ - if (dem.has_avl_reg ()) - { - /* rs1 (avl) are not modified in the basic block prior to the VSETVL. */ - rtx avl = dem.get_avl_or_vl_reg (); - if (dem.dirty_p ()) - { - gcc_assert (!vsetvl_insn_p (insn->rtl ())); - - /* Earliest VSETVL will be inserted at the end of the block. */ - for (const insn_info *i : bb->real_nondebug_insns ()) - { - /* rs1 (avl) are not modified in the basic block prior to the - VSETVL. */ - if (find_access (i->defs (), REGNO (avl))) - return false; - if (vlmax_avl_p (dem.get_avl ())) - { - /* rd (avl) is not used between the start of the block and - the occurrence. Note: Only for Dirty and VLMAX-avl. */ - if (find_access (i->uses (), REGNO (avl))) - return false; - } - } - - return true; - } - else if (!vlmax_avl_p (avl)) - { - set_info *set = dem.get_avl_source (); - /* If it's undefined, it's not anticipatable conservatively. */ - if (!set) - return false; - if (real_insn_and_same_bb_p (set->insn (), bb) - && before_p (set->insn (), insn)) - return false; - for (insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - /* rs1 (avl) are not modified in the basic block prior to the - VSETVL. */ - if (find_access (i->defs (), REGNO (avl))) - return false; - } - } - } - - /* rd (vl) is not used between the start of the block and the occurrence. 
*/ - if (vsetvl_insn_p (insn->rtl ())) - { - rtx dest = get_vl (insn->rtl ()); - for (insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - /* rd (vl) is not used between the start of the block and the - * occurrence. */ - if (find_access (i->uses (), REGNO (dest))) - return false; - /* rd (vl) are not modified in the basic block prior to the VSETVL. */ - if (find_access (i->defs (), REGNO (dest))) - return false; - } - } - - return true; -} - -/* An "available occurrence" is one that is the last occurrence in the - basic block and the operands are not modified by following statements in - the basic block [including this insn]. - - For VSETVL instruction, we have these following formats: - 1. vsetvl zero, rs1. - 2. vsetvl zero, imm. - 3. vsetvl rd, rs1. - - So base on these circumstances, a DEM is considered as a local available - occurrence should satisfy these following conditions: - - 1). rs1 (avl) are not modified by following statements in - the basic block. - 2). rd (vl) are not modified by following statements in - the basic block. - - Note: We don't need to check VL/VTYPE here since DEM is UNKNOWN if VL/VTYPE - is modified prior to the occurrence. This case is already considered as - a non-local available occurrence. -*/ -static bool -available_occurrence_p (const bb_info *bb, const vector_insn_info dem) -{ - insn_info *insn = dem.get_insn (); - /* The only possible operand we care of VSETVL is AVL. */ - if (dem.has_avl_reg ()) - { - if (!vlmax_avl_p (dem.get_avl ())) - { - rtx dest = NULL_RTX; - insn_info *i = insn; - if (vsetvl_insn_p (insn->rtl ())) - { - dest = get_vl (insn->rtl ()); - /* For user vsetvl a2, a2 instruction, we consider it as - available even though it modifies "a2". 
*/ - i = i->next_nondebug_insn (); - } - for (; real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) - { - if (read_vl_insn_p (i->rtl ())) - continue; - /* rs1 (avl) are not modified by following statements in - the basic block. */ - if (find_access (i->defs (), REGNO (dem.get_avl ()))) - return false; - /* rd (vl) are not modified by following statements in - the basic block. */ - if (dest && find_access (i->defs (), REGNO (dest))) - return false; - } - } - } - return true; -} - -static bool -insn_should_be_added_p (const insn_info *insn, unsigned int types) -{ - if (insn->is_real () && (types & REAL_SET)) - return true; - if (insn->is_phi () && (types & PHI_SET)) - return true; - if (insn->is_bb_head () && (types & BB_HEAD_SET)) - return true; - if (insn->is_bb_end () && (types & BB_END_SET)) - return true; - return false; -} - -/* Recursively find all define instructions. The kind of instruction is - specified by the DEF_TYPE. */ -static hash_set -get_all_sets (phi_info *phi, unsigned int types) -{ - hash_set insns; - auto_vec work_list; - hash_set visited_list; - if (!phi) - return hash_set (); - work_list.safe_push (phi); - - while (!work_list.is_empty ()) - { - phi_info *phi = work_list.pop (); - visited_list.add (phi); - for (use_info *use : phi->inputs ()) - { - def_info *def = use->def (); - set_info *set = safe_dyn_cast (def); - if (!set) - return hash_set (); - - gcc_assert (!set->insn ()->is_debug_insn ()); - - if (insn_should_be_added_p (set->insn (), types)) - insns.add (set); - if (set->insn ()->is_phi ()) - { - phi_info *new_phi = as_a (set); - if (!visited_list.contains (new_phi)) - work_list.safe_push (new_phi); - } - } - } - return insns; -} - -static hash_set -get_all_sets (set_info *set, bool /* get_real_inst */ real_p, - bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p) -{ - if (real_p && phi_p && param_p) - return get_all_sets (safe_dyn_cast (set), - REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET); - - else if (real_p 
&& param_p) - return get_all_sets (safe_dyn_cast (set), - REAL_SET | BB_HEAD_SET | BB_END_SET); - - else if (real_p) - return get_all_sets (safe_dyn_cast (set), REAL_SET); - return hash_set (); -} - /* Helper function to get AVL operand. */ static rtx get_avl (rtx_insn *rinsn) @@ -511,15 +371,6 @@ get_avl (rtx_insn *rinsn) return recog_data.operand[get_attr_vl_op_idx (rinsn)]; } -static set_info * -get_same_bb_set (hash_set &sets, const basic_block cfg_bb) -{ - for (set_info *set : sets) - if (set->bb ()->cfg_bb () == cfg_bb) - return set; - return nullptr; -} - /* Helper function to get SEW operand. We always have SEW value for all RVV instructions that have VTYPE OP. */ static uint8_t @@ -589,365 +440,186 @@ has_vector_insn (function *fn) return false; } -/* Emit vsetvl instruction. */ -static rtx -gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl) +static vlmul_type +calculate_vlmul (unsigned int sew, unsigned int ratio) { - rtx avl = info.get_avl (); - /* if optimization == 0 and the instruction is vmv.x.s/vfmv.f.s, - set the value of avl to (const_int 0) so that VSETVL PASS will - insert vsetvl correctly.*/ - if (info.has_avl_no_reg ()) - avl = GEN_INT (0); - rtx sew = gen_int_mode (info.get_sew (), Pmode); - rtx vlmul = gen_int_mode (info.get_vlmul (), Pmode); - rtx ta = gen_int_mode (info.get_ta (), Pmode); - rtx ma = gen_int_mode (info.get_ma (), Pmode); - - if (insn_type == VSETVL_NORMAL) - { - gcc_assert (vl != NULL_RTX); - return gen_vsetvl (Pmode, vl, avl, sew, vlmul, ta, ma); - } - else if (insn_type == VSETVL_VTYPE_CHANGE_ONLY) - return gen_vsetvl_vtype_change_only (sew, vlmul, ta, ma); - else - return gen_vsetvl_discard_result (Pmode, avl, sew, vlmul, ta, ma); + const vlmul_type ALL_LMUL[] + = {LMUL_1, LMUL_2, LMUL_4, LMUL_8, LMUL_F8, LMUL_F4, LMUL_F2}; + for (const vlmul_type vlmul : ALL_LMUL) + if (calculate_ratio (sew, vlmul) == ratio) + return vlmul; + return LMUL_RESERVED; } -static rtx -gen_vsetvl_pat (rtx_insn 
*rinsn, const vector_insn_info &info, - rtx vl = NULL_RTX) +/* Get the currently supported maximum sew used in the int rvv instructions. */ +static uint8_t +get_max_int_sew () { - rtx new_pat; - vl_vtype_info new_info = info; - if (info.get_insn () && info.get_insn ()->rtl () - && fault_first_load_p (info.get_insn ()->rtl ())) - new_info.set_avl_info ( - avl_info (get_avl (info.get_insn ()->rtl ()), nullptr)); - if (vl) - new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl); - else - { - if (vsetvl_insn_p (rinsn)) - new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, get_vl (rinsn)); - else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only) - new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX); - else - new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX); - } - return new_pat; + if (TARGET_VECTOR_ELEN_64) + return 64; + else if (TARGET_VECTOR_ELEN_32) + return 32; + gcc_unreachable (); } -static void -emit_vsetvl_insn (enum vsetvl_type insn_type, enum emit_type emit_type, - const vl_vtype_info &info, rtx vl, rtx_insn *rinsn) -{ - rtx pat = gen_vsetvl_pat (insn_type, info, vl); - if (dump_file) - { - fprintf (dump_file, "\nInsert vsetvl insn PATTERN:\n"); - print_rtl_single (dump_file, pat); - fprintf (dump_file, "\nfor insn:\n"); - print_rtl_single (dump_file, rinsn); - } - - if (emit_type == EMIT_DIRECT) - emit_insn (pat); - else if (emit_type == EMIT_BEFORE) - emit_insn_before (pat, rinsn); - else - emit_insn_after (pat, rinsn); +/* Get the currently supported maximum sew used in the float rvv instructions. + */ +static uint8_t +get_max_float_sew () +{ + if (TARGET_VECTOR_ELEN_FP_64) + return 64; + else if (TARGET_VECTOR_ELEN_FP_32) + return 32; + else if (TARGET_VECTOR_ELEN_FP_16) + return 16; + gcc_unreachable (); } -static void -eliminate_insn (rtx_insn *rinsn) +/* Helper function to get VL operand. 
*/ +static rtx +get_vl2 (rtx_insn *rinsn) { - if (dump_file) + if (has_vl_op (rinsn)) { - fprintf (dump_file, "\nEliminate insn %d:\n", INSN_UID (rinsn)); - print_rtl_single (dump_file, rinsn); + extract_insn_cached (rinsn); + return recog_data.operand[get_attr_vl_op_idx (rinsn)]; } - if (in_sequence_p ()) - remove_insn (rinsn); - else - delete_insn (rinsn); + return SET_DEST (XVECEXP (PATTERN (rinsn), 0, 0)); } -static vsetvl_type -insert_vsetvl (enum emit_type emit_type, rtx_insn *rinsn, - const vector_insn_info &info, const vector_insn_info &prev_info) +/* Count the number of REGNO in RINSN. */ +static int +count_regno_occurrences (rtx_insn *rinsn, unsigned int regno) { - /* Use X0, X0 form if the AVL is the same and the SEW+LMUL gives the same - VLMAX. */ - if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p () - && info.compatible_avl_p (prev_info) && info.same_vlmax_p (prev_info)) - { - emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_VTYPE_CHANGE_ONLY; - } - - if (info.has_avl_imm ()) - { - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_DISCARD_RESULT; - } - - if (info.has_avl_no_reg ()) - { - /* We can only use x0, x0 if there's no chance of the vtype change causing - the previous vl to become invalid. */ - if (prev_info.valid_or_dirty_p () && !prev_info.unknown_p () - && info.same_vlmax_p (prev_info)) - { - emit_vsetvl_insn (VSETVL_VTYPE_CHANGE_ONLY, emit_type, info, NULL_RTX, - rinsn); - return VSETVL_VTYPE_CHANGE_ONLY; - } - /* Otherwise use an AVL of 0 to avoid depending on previous vl. */ - vl_vtype_info new_info = info; - new_info.set_avl_info (avl_info (const0_rtx, nullptr)); - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, new_info, NULL_RTX, - rinsn); - return VSETVL_DISCARD_RESULT; - } - - /* Use X0 as the DestReg unless AVLReg is X0. 
We also need to change the - opcode if the AVLReg is X0 as they have different register classes for - the AVL operand. */ - if (vlmax_avl_p (info.get_avl ())) - { - gcc_assert (has_vtype_op (rinsn) || vsetvl_insn_p (rinsn)); - /* For user vsetvli a5, zero, we should use get_vl to get the VL - operand "a5". */ - rtx vl_op = info.get_avl_or_vl_reg (); - gcc_assert (!vlmax_avl_p (vl_op)); - emit_vsetvl_insn (VSETVL_NORMAL, emit_type, info, vl_op, rinsn); - return VSETVL_NORMAL; - } - - emit_vsetvl_insn (VSETVL_DISCARD_RESULT, emit_type, info, NULL_RTX, rinsn); - - if (dump_file) - { - fprintf (dump_file, "Update VL/VTYPE info, previous info="); - prev_info.dump (dump_file); - } - return VSETVL_DISCARD_RESULT; + int count = 0; + extract_insn (rinsn); + for (int i = 0; i < recog_data.n_operands; i++) + if (refers_to_regno_p (regno, recog_data.operand[i])) + count++; + return count; } -/* Get VL/VTYPE information for INSN. */ -static vl_vtype_info -get_vl_vtype_info (const insn_info *insn) +enum def_type { - set_info *set = nullptr; - rtx avl = ::get_avl (insn->rtl ()); - if (avl && REG_P (avl)) - { - if (vlmax_avl_p (avl) && has_vl_op (insn->rtl ())) - set - = find_access (insn->uses (), REGNO (get_vl (insn->rtl ())))->def (); - else if (!vlmax_avl_p (avl)) - set = find_access (insn->uses (), REGNO (avl))->def (); - else - set = nullptr; - } - - uint8_t sew = get_sew (insn->rtl ()); - enum vlmul_type vlmul = get_vlmul (insn->rtl ()); - uint8_t ratio = get_attr_ratio (insn->rtl ()); - /* when get_attr_ratio is invalid, this kind of instructions - doesn't care about ratio. However, we still need this value - in demand info backward analysis. */ - if (ratio == INVALID_ATTRIBUTE) - ratio = calculate_ratio (sew, vlmul); - bool ta = tail_agnostic_p (insn->rtl ()); - bool ma = mask_agnostic_p (insn->rtl ()); - - /* If merge operand is undef value, we prefer agnostic. 
*/ - int merge_op_idx = get_attr_merge_op_idx (insn->rtl ()); - if (merge_op_idx != INVALID_ATTRIBUTE - && satisfies_constraint_vu (recog_data.operand[merge_op_idx])) - { - ta = true; - ma = true; - } - - vl_vtype_info info (avl_info (avl, set), sew, vlmul, ratio, ta, ma); - return info; -} + REAL_SET = 1 << 0, + PHI_SET = 1 << 1, + BB_HEAD_SET = 1 << 2, + BB_END_SET = 1 << 3, + /* ??? TODO: In RTL_SSA framework, we have REAL_SET, + PHI_SET, BB_HEAD_SET, BB_END_SET and + CLOBBER_DEF def_info types. Currently, + we conservatively do not optimize clobber + def since we don't see the case that we + need to optimize it. */ + CLOBBER_DEF = 1 << 4 +}; -/* Change insn and Assert the change always happens. */ -static void -validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group) +static bool +insn_should_be_added_p (const insn_info *insn, unsigned int types) { - bool change_p = validate_change (object, loc, new_rtx, in_group); - gcc_assert (change_p); + if (insn->is_real () && (types & REAL_SET)) + return true; + if (insn->is_phi () && (types & PHI_SET)) + return true; + if (insn->is_bb_head () && (types & BB_HEAD_SET)) + return true; + if (insn->is_bb_end () && (types & BB_END_SET)) + return true; + return false; } -static void -change_insn (rtx_insn *rinsn, rtx new_pat) +static const hash_set +get_all_real_uses (insn_info *insn, unsigned regno) { - /* We don't apply change on RTL_SSA here since it's possible a - new INSN we add in the PASS before which doesn't have RTL_SSA - info yet.*/ - if (dump_file) - { - fprintf (dump_file, "\nChange PATTERN of insn %d from:\n", - INSN_UID (rinsn)); - print_rtl_single (dump_file, PATTERN (rinsn)); - } + gcc_assert (insn->is_real ()); - validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false); + hash_set uses; + auto_vec work_list; + hash_set visited_list; - if (dump_file) + for (def_info *def : insn->defs ()) { - fprintf (dump_file, "\nto:\n"); - print_rtl_single (dump_file, PATTERN (rinsn)); + if 
(!def->is_reg () || def->regno () != regno) + continue; + set_info *set = safe_dyn_cast (def); + if (!set) + continue; + for (use_info *use : set->nondebug_insn_uses ()) + if (use->insn ()->is_real ()) + uses.add (use); + for (use_info *use : set->phi_uses ()) + work_list.safe_push (use->phi ()); } -} -static const insn_info * -get_forward_read_vl_insn (const insn_info *insn) -{ - const bb_info *bb = insn->bb (); - for (const insn_info *i = insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) + while (!work_list.is_empty ()) { - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; - if (read_vl_insn_p (i->rtl ())) - return i; - } - return nullptr; -} + phi_info *phi = work_list.pop (); + visited_list.add (phi); -static const insn_info * -get_backward_fault_first_load_insn (const insn_info *insn) -{ - const bb_info *bb = insn->bb (); - for (const insn_info *i = insn->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->prev_nondebug_insn ()) - { - if (fault_first_load_p (i->rtl ())) - return i; - if (find_access (i->defs (), VL_REGNUM)) - return nullptr; + for (use_info *use : phi->nondebug_insn_uses ()) + if (use->insn ()->is_real ()) + uses.add (use); + for (use_info *use : phi->phi_uses ()) + if (!visited_list.contains (use->phi ())) + work_list.safe_push (use->phi ()); } - return nullptr; + return uses; } -static bool -change_insn (function_info *ssa, insn_change change, insn_info *insn, - rtx new_pat) +/* Recursively find all define instructions. The kind of instruction is + specified by the DEF_TYPE. 
*/ +static hash_set +get_all_sets (phi_info *phi, unsigned int types) { - rtx_insn *rinsn = insn->rtl (); - auto attempt = ssa->new_change_attempt (); - if (!restrict_movement (change)) - return false; + hash_set insns; + auto_vec work_list; + hash_set visited_list; + if (!phi) + return hash_set (); + work_list.safe_push (phi); - if (dump_file) + while (!work_list.is_empty ()) { - fprintf (dump_file, "\nChange PATTERN of insn %d from:\n", - INSN_UID (rinsn)); - print_rtl_single (dump_file, PATTERN (rinsn)); - } - - insn_change_watermark watermark; - validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, true); - - /* These routines report failures themselves. */ - if (!recog (attempt, change) || !change_is_worthwhile (change, false)) - return false; + phi_info *phi = work_list.pop (); + visited_list.add (phi); + for (use_info *use : phi->inputs ()) + { + def_info *def = use->def (); + set_info *set = safe_dyn_cast (def); + if (!set) + return hash_set (); - /* Fix bug: - (insn 12 34 13 2 (set (reg:RVVM4DI 120 v24 [orig:134 _1 ] [134]) - (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (const_int 0 [0]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (sign_extend:RVVM4DI (vec_duplicate:RVVM4SI (reg:SI 15 a5 - [140])))) (unspec:RVVM4DI [ (const_int 0 [0]) ] UNSPEC_VUNDEF))) - "rvv.c":8:12 2784 {pred_single_widen_addsvnx8di_scalar} (expr_list:REG_EQUIV - (mem/c:RVVM4DI (reg:DI 10 a0 [142]) [1 +0 S[64, 64] A128]) - (expr_list:REG_EQUAL (if_then_else:RVVM4DI (unspec:RVVMF8BI [ - (const_vector:RVVMF8BI repeat [ - (const_int 1 [0x1]) - ]) - (reg/v:DI 13 a3 [orig:139 vl ] [139]) - (const_int 2 [0x2]) repeated x2 - (const_int 0 [0]) - (reg:SI 66 vl) - (reg:SI 67 vtype) - ] UNSPEC_VPREDICATE) - (plus:RVVM4DI (reg/v:RVVM4DI 104 v8 [orig:137 op1 ] [137]) - (const_vector:RVVM4DI repeat [ - 
(const_int 2730 [0xaaa]) - ])) - (unspec:RVVM4DI [ - (const_int 0 [0]) - ] UNSPEC_VUNDEF)) - (nil)))) - Here we want to remove use "a3". However, the REG_EQUAL/REG_EQUIV note use - "a3" which made us fail in change_insn. We reference to the - 'aarch64-cc-fusion.cc' and add this method. */ - remove_reg_equal_equiv_notes (rinsn); - confirm_change_group (); - ssa->change_insn (change); + gcc_assert (!set->insn ()->is_debug_insn ()); - if (dump_file) - { - fprintf (dump_file, "\nto:\n"); - print_rtl_single (dump_file, PATTERN (rinsn)); + if (insn_should_be_added_p (set->insn (), types)) + insns.add (set); + if (set->insn ()->is_phi ()) + { + phi_info *new_phi = as_a (set); + if (!visited_list.contains (new_phi)) + work_list.safe_push (new_phi); + } + } } - return true; + return insns; } -static void -change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info, - rtx vl = NULL_RTX) +static hash_set +get_all_sets (set_info *set, bool /* get_real_inst */ real_p, + bool /*get_phi*/ phi_p, bool /* get_function_parameter*/ param_p) { - rtx_insn *rinsn; - if (vector_config_insn_p (insn->rtl ())) - { - rinsn = insn->rtl (); - gcc_assert (vsetvl_insn_p (rinsn) && "Can't handle X0, rs1 vsetvli yet"); - } - else - { - gcc_assert (has_vtype_op (insn->rtl ())); - rinsn = PREV_INSN (insn->rtl ()); - gcc_assert (vector_config_insn_p (rinsn)); - } - rtx new_pat = gen_vsetvl_pat (rinsn, info, vl); - change_insn (rinsn, new_pat); -} + if (real_p && phi_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | PHI_SET | BB_HEAD_SET | BB_END_SET); -static bool -avl_source_has_vsetvl_p (set_info *avl_source) -{ - if (!avl_source) - return false; - if (!avl_source->insn ()) - return false; - if (avl_source->insn ()->is_real ()) - return vsetvl_insn_p (avl_source->insn ()->rtl ()); - hash_set sets = get_all_sets (avl_source, true, false, true); - for (const auto set : sets) - { - if (set->insn ()->is_real () && vsetvl_insn_p (set->insn ()->rtl ())) - return true; - } 
- return false; + else if (real_p && param_p) + return get_all_sets (safe_dyn_cast (set), + REAL_SET | BB_HEAD_SET | BB_END_SET); + + else if (real_p) + return get_all_sets (safe_dyn_cast (set), REAL_SET); + return hash_set (); } static bool @@ -959,93 +631,14 @@ source_equal_p (insn_info *insn1, insn_info *insn2) rtx_insn *rinsn2 = insn2->rtl (); if (!rinsn1 || !rinsn2) return false; + rtx note1 = find_reg_equal_equiv_note (rinsn1); rtx note2 = find_reg_equal_equiv_note (rinsn2); - rtx single_set1 = single_set (rinsn1); - rtx single_set2 = single_set (rinsn2); - if (read_vl_insn_p (rinsn1) && read_vl_insn_p (rinsn2)) - { - const insn_info *load1 = get_backward_fault_first_load_insn (insn1); - const insn_info *load2 = get_backward_fault_first_load_insn (insn2); - return load1 && load2 && load1 == load2; - } - if (note1 && note2 && rtx_equal_p (note1, note2)) return true; - - /* Since vsetvl instruction is not single SET. - We handle this case specially here. */ - if (vsetvl_insn_p (insn1->rtl ()) && vsetvl_insn_p (insn2->rtl ())) - { - /* For example: - vsetvl1 a6,a5,e32m1 - RVV 1 (use a6 as AVL) - vsetvl2 a5,a5,e8mf4 - RVV 2 (use a5 as AVL) - We consider AVL of RVV 1 and RVV 2 are same so that we can - gain more optimization opportunities. - - Note: insn1_info.compatible_avl_p (insn2_info) - will make sure there is no instruction between vsetvl1 and vsetvl2 - modify a5 since their def will be different if there is instruction - modify a5 and compatible_avl_p will return false. */ - vector_insn_info insn1_info, insn2_info; - insn1_info.parse_insn (insn1); - insn2_info.parse_insn (insn2); - - /* To avoid dead loop, we don't optimize a vsetvli def has vsetvli - instructions which will complicate the situation. 
*/ - if (avl_source_has_vsetvl_p (insn1_info.get_avl_source ()) - || avl_source_has_vsetvl_p (insn2_info.get_avl_source ())) - return false; - - if (insn1_info.same_vlmax_p (insn2_info) - && insn1_info.compatible_avl_p (insn2_info)) - return true; - } - - /* We only handle AVL is set by instructions with no side effects. */ - if (!single_set1 || !single_set2) - return false; - if (!rtx_equal_p (SET_SRC (single_set1), SET_SRC (single_set2))) - return false; - /* RTL_SSA uses include REG_NOTE. Consider this following case: - - insn1 RTL: - (insn 41 39 42 4 (set (reg:DI 26 s10 [orig:159 loop_len_46 ] [159]) - (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 15 a5 [orig:201 _149 ] [201]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 2 uses: - 1. (reg:DI 15 a5 [orig:201 _149 ] [201]) - twice. - 2. (reg:DI 14 a4 [276]) - once. - - insn2 RTL: - (insn 38 353 351 4 (set (reg:DI 27 s11 [orig:160 loop_len_47 ] [160]) - (umin:DI (reg:DI 15 a5 [orig:199 _146 ] [199]) - (reg:DI 14 a4 [276]))) 408 {*umindi3} - (expr_list:REG_EQUAL (umin:DI (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - (const_int 2 [0x2])) - (nil))) - The RTL_SSA uses of this instruction has 3 uses: - 1. (reg:DI 15 a5 [orig:199 _146 ] [199]) - once - 2. (reg:DI 14 a4 [276]) - once - 3. (reg:DI 28 t3 [orig:200 ivtmp_147 ] [200]) - once - - Return false when insn1->uses ().size () != insn2->uses ().size () - */ - if (insn1->uses ().size () != insn2->uses ().size ()) - return false; - for (size_t i = 0; i < insn1->uses ().size (); i++) - if (insn1->uses ()[i] != insn2->uses ()[i]) - return false; - return true; + return false; } -/* Helper function to get single same real RTL source. - return NULL if it is not a single real RTL source. */ static insn_info * extract_single_source (set_info *set) { @@ -1066,7 +659,7 @@ extract_single_source (set_info *set) NULL so that VSETVL PASS will insert vsetvl directly. 
*/ if (set->insn ()->is_artificial ()) return nullptr; - if (!source_equal_p (set->insn (), first_insn)) + if (set != *sets.begin () && !source_equal_p (set->insn (), first_insn)) return nullptr; } @@ -1074,3115 +667,2825 @@ extract_single_source (set_info *set) } static unsigned -calculate_sew (vlmul_type vlmul, unsigned int ratio) +get_expr_id (unsigned bb_index, unsigned regno, unsigned num_bbs) { - for (const unsigned sew : ALL_SEW) - if (calculate_ratio (sew, vlmul) == ratio) - return sew; - return 0; + return regno * num_bbs + bb_index; } - -static vlmul_type -calculate_vlmul (unsigned int sew, unsigned int ratio) +static unsigned +get_regno (unsigned expr_id, unsigned num_bb) { - for (const vlmul_type vlmul : ALL_LMUL) - if (calculate_ratio (sew, vlmul) == ratio) - return vlmul; - return LMUL_RESERVED; + return expr_id / num_bb; } +static unsigned +get_bb_index (unsigned expr_id, unsigned num_bb) +{ + return expr_id % num_bb; +} + +/* These flags indicate the minimum demand of the vl and vtype values by the + RVV instruction. For example, DEMAND_RATIO_P indicates that this RVV + instruction only needs the SEW/LMUL ratio to remain the same, and does not + require SEW and LMUL to be fixed. + Therefore, if the former RVV instruction needs DEMAND_RATIO_P and the latter + instruction needs DEMAND_SEW_LMUL_P and its SEW/LMUL is the same as that of + the former instruction, then we can make the minimum demand of the former + instruction strict to DEMAND_SEW_LMUL_P, and its required SEW and LMUL are + the SEW and LMUL of the latter instruction, and the vsetvl instruction + generated according to the new demand can also be used for the latter + instruction, so there is no need to insert a separate vsetvl instruction for + the latter instruction.
*/ +enum demand_flags : unsigned +{ + DEMAND_EMPTY_P = 0, + DEMAND_SEW_P = 1 << 0, + DEMAND_LMUL_P = 1 << 1, + DEMAND_RATIO_P = 1 << 2, + DEMAND_GE_SEW_P = 1 << 3, + DEMAND_TAIL_POLICY_P = 1 << 4, + DEMAND_MASK_POLICY_P = 1 << 5, + DEMAND_AVL_P = 1 << 6, + DEMAND_NON_ZERO_AVL_P = 1 << 7, +}; -static bool -incompatible_avl_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return !info1.compatible_avl_p (info2) && !info2.compatible_avl_p (info1); -} +/* We split the demand information into three parts. They are sew and lmul + related (sew_lmul_demand_type), tail and mask policy related + (policy_demand_type) and avl related (avl_demand_type). Then we define three + interfaces avaiable_with, compatible_with and merge_with. avaiable_with is + used to determine whether the two vsetvl infos prev_info and next_info are + available or not. If prev_info is available for next_info, it means that the + RVV insn corresponding to next_info on the path from prev_info to next_info + can be used without inserting a separate vsetvl instruction. compatible_with + is used to determine whether prev_info is compatible with next_info, and if + so, merge_with can be used to merge the stricter demand information from + next_info into prev_info so that prev_info becomes available to next_info. 
+ */ -static bool -different_sew_p (const vector_insn_info &info1, const vector_insn_info &info2) +enum class sew_lmul_demand_type : unsigned { - return info1.get_sew () != info2.get_sew (); -} + sew_lmul = demand_flags::DEMAND_SEW_P | demand_flags::DEMAND_LMUL_P, + ratio_only = demand_flags::DEMAND_RATIO_P, + sew_only = demand_flags::DEMAND_SEW_P, + ge_sew = demand_flags::DEMAND_GE_SEW_P, + ratio_and_ge_sew + = demand_flags::DEMAND_RATIO_P | demand_flags::DEMAND_GE_SEW_P, +}; -static bool -different_lmul_p (const vector_insn_info &info1, const vector_insn_info &info2) +enum class policy_demand_type : unsigned { - return info1.get_vlmul () != info2.get_vlmul (); -} + tail_mask_policy + = demand_flags::DEMAND_TAIL_POLICY_P | demand_flags::DEMAND_MASK_POLICY_P, + tail_policy_only = demand_flags::DEMAND_TAIL_POLICY_P, + mask_policy_only = demand_flags::DEMAND_MASK_POLICY_P, + ignore_policy = demand_flags::DEMAND_EMPTY_P, +}; -static bool -different_ratio_p (const vector_insn_info &info1, const vector_insn_info &info2) +enum class avl_demand_type : unsigned { - return info1.get_ratio () != info2.get_ratio (); -} + avl = demand_flags::DEMAND_AVL_P, + non_zero_avl = demand_flags::DEMAND_NON_ZERO_AVL_P, + ignore_avl = demand_flags::DEMAND_EMPTY_P, +}; -static bool -different_tail_policy_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info1.get_ta () != info2.get_ta (); -} -static bool -different_mask_policy_p (const vector_insn_info &info1, - const vector_insn_info &info2) +class vsetvl_info { - return info1.get_ma () != info2.get_ma (); -} +private: + insn_info *m_insn; + bb_info *m_bb; + rtx m_avl; + rtx m_vl; + set_info *m_avl_def; + uint8_t m_sew; + uint8_t m_max_sew; + vlmul_type m_vlmul; + uint8_t m_ratio; + bool m_ta; + bool m_ma; + + sew_lmul_demand_type m_sew_lmul_demand; + policy_demand_type m_policy_demand; + avl_demand_type m_avl_demand; + + enum class state_type + { + UNINITIALIZED, + VALID, + UNKNOWN, + EMPTY, + }; + state_type 
m_state; + + bool m_ignore; + bool change_vtype_only; + insn_info *m_read_vl_insn; + bool use_by_non_rvv_insn; -static bool -possible_zero_avl_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return !info1.has_non_zero_avl () || !info2.has_non_zero_avl (); -} - -static bool -second_ratio_invalid_for_first_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (info1.get_sew (), info2.get_ratio ()) - == LMUL_RESERVED; -} - -static bool -second_ratio_invalid_for_first_lmul_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_sew (info1.get_vlmul (), info2.get_ratio ()) == 0; -} - -static bool -float_insn_valid_sew_p (const vector_insn_info &info, unsigned int sew) -{ - if (info.get_insn () && info.get_insn ()->is_real () - && get_attr_type (info.get_insn ()->rtl ()) == TYPE_VFMOVFV) - { - if (sew == 16) - return TARGET_VECTOR_ELEN_FP_16; - else if (sew == 32) - return TARGET_VECTOR_ELEN_FP_32; - else if (sew == 64) - return TARGET_VECTOR_ELEN_FP_64; - } - return true; -} - -static bool -second_sew_less_than_first_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info2.get_sew () < info1.get_sew () - || !float_insn_valid_sew_p (info1, info2.get_sew ()); -} - -static bool -first_sew_less_than_second_sew_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info1.get_sew () < info2.get_sew () - || !float_insn_valid_sew_p (info2, info1.get_sew ()); -} - -/* return 0 if LMUL1 == LMUL2. - return -1 if LMUL1 < LMUL2. - return 1 if LMUL1 > LMUL2. 
*/ -static int -compare_lmul (vlmul_type vlmul1, vlmul_type vlmul2) -{ - if (vlmul1 == vlmul2) - return 0; - - switch (vlmul1) - { - case LMUL_1: - if (vlmul2 == LMUL_2 || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_2: - if (vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_4: - if (vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_8: - return -1; - case LMUL_F2: - if (vlmul2 == LMUL_1 || vlmul2 == LMUL_2 || vlmul2 == LMUL_4 - || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_F4: - if (vlmul2 == LMUL_F2 || vlmul2 == LMUL_1 || vlmul2 == LMUL_2 - || vlmul2 == LMUL_4 || vlmul2 == LMUL_8) - return 1; - else - return -1; - case LMUL_F8: - return 0; - default: - gcc_unreachable (); - } -} - -static bool -second_lmul_less_than_first_lmul_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return compare_lmul (info2.get_vlmul (), info1.get_vlmul ()) == -1; -} - -static bool -second_ratio_less_than_first_ratio_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return info2.get_ratio () < info1.get_ratio (); -} - -static CONSTEXPR const demands_cond incompatible_conds[] = { -#define DEF_INCOMPATIBLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, \ - GE_SEW1, TAIL_POLICTY1, MASK_POLICY1, AVL2, \ - SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, \ - TAIL_POLICTY2, MASK_POLICY2, COND) \ - {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, TAIL_POLICTY1, \ - MASK_POLICY1}, \ - {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2}}, \ - COND}, -#include "riscv-vsetvl.def" -}; - -static unsigned -greatest_sew (const vector_insn_info &info1, const vector_insn_info &info2) -{ - return std::max (info1.get_sew (), info2.get_sew ()); -} - -static unsigned -first_sew (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_sew (); -} - -static unsigned -second_sew (const vector_insn_info &, const 
vector_insn_info &info2) -{ - return info2.get_sew (); -} - -static vlmul_type -first_vlmul (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_vlmul (); -} - -static vlmul_type -second_vlmul (const vector_insn_info &, const vector_insn_info &info2) -{ - return info2.get_vlmul (); -} - -static unsigned -first_ratio (const vector_insn_info &info1, const vector_insn_info &) -{ - return info1.get_ratio (); -} - -static unsigned -second_ratio (const vector_insn_info &, const vector_insn_info &info2) -{ - return info2.get_ratio (); -} - -static vlmul_type -vlmul_for_first_sew_second_ratio (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (info1.get_sew (), info2.get_ratio ()); -} - -static vlmul_type -vlmul_for_greatest_sew_second_ratio (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_vlmul (MAX (info1.get_sew (), info2.get_sew ()), - info2.get_ratio ()); -} - -static unsigned -ratio_for_second_sew_first_vlmul (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - return calculate_ratio (info2.get_sew (), info1.get_vlmul ()); -} - -static CONSTEXPR const demands_fuse_rule fuse_rules[] = { -#define DEF_SEW_LMUL_FUSE_RULE(DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, \ - DEMAND_GE_SEW1, DEMAND_SEW2, DEMAND_LMUL2, \ - DEMAND_RATIO2, DEMAND_GE_SEW2, NEW_DEMAND_SEW, \ - NEW_DEMAND_LMUL, NEW_DEMAND_RATIO, \ - NEW_DEMAND_GE_SEW, NEW_SEW, NEW_VLMUL, \ - NEW_RATIO) \ - {{{DEMAND_ANY, DEMAND_SEW1, DEMAND_LMUL1, DEMAND_RATIO1, DEMAND_ANY, \ - DEMAND_GE_SEW1, DEMAND_ANY, DEMAND_ANY}, \ - {DEMAND_ANY, DEMAND_SEW2, DEMAND_LMUL2, DEMAND_RATIO2, DEMAND_ANY, \ - DEMAND_GE_SEW2, DEMAND_ANY, DEMAND_ANY}}, \ - NEW_DEMAND_SEW, \ - NEW_DEMAND_LMUL, \ - NEW_DEMAND_RATIO, \ - NEW_DEMAND_GE_SEW, \ - NEW_SEW, \ - NEW_VLMUL, \ - NEW_RATIO}, -#include "riscv-vsetvl.def" -}; - -static bool -always_unavailable (const vector_insn_info &, const vector_insn_info &) -{ - return true; 
-} - -static bool -avl_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) -{ - return !info2.compatible_avl_p (info1.get_avl_info ()); -} - -static bool -sew_unavailable_p (const vector_insn_info &info1, const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO)) - { - if (info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return info1.get_sew () != info2.get_sew (); - } - return true; -} - -static bool -lmul_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.get_vlmul () == info2.get_vlmul () && !info2.demand_p (DEMAND_SEW) - && !info2.demand_p (DEMAND_RATIO)) - return false; - return true; -} - -static bool -ge_sew_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL) && !info2.demand_p (DEMAND_RATIO) - && info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return true; -} - -static bool -ge_sew_lmul_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_RATIO) && info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - return true; -} - -static bool -ge_sew_ratio_unavailable_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (!info2.demand_p (DEMAND_LMUL)) - { - if (info2.demand_p (DEMAND_GE_SEW)) - return info1.get_sew () < info2.get_sew (); - /* Demand GE_SEW should be available for non-demand SEW. 
*/ - else if (!info2.demand_p (DEMAND_SEW)) - return false; - } - return true; -} - -static CONSTEXPR const demands_cond unavailable_conds[] = { -#define DEF_UNAVAILABLE_COND(AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, \ - TAIL_POLICTY1, MASK_POLICY1, AVL2, SEW2, LMUL2, \ - RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2, COND) \ - {{{AVL1, SEW1, LMUL1, RATIO1, NONZERO_AVL1, GE_SEW1, TAIL_POLICTY1, \ - MASK_POLICY1}, \ - {AVL2, SEW2, LMUL2, RATIO2, NONZERO_AVL2, GE_SEW2, TAIL_POLICTY2, \ - MASK_POLICY2}}, \ - COND}, -#include "riscv-vsetvl.def" -}; - -static bool -same_sew_lmul_demand_p (const bool *dems1, const bool *dems2) -{ - return dems1[DEMAND_SEW] == dems2[DEMAND_SEW] - && dems1[DEMAND_LMUL] == dems2[DEMAND_LMUL] - && dems1[DEMAND_RATIO] == dems2[DEMAND_RATIO] && !dems1[DEMAND_GE_SEW] - && !dems2[DEMAND_GE_SEW]; -} - -static bool -propagate_avl_across_demands_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info2.demand_p (DEMAND_AVL)) - { - if (info2.demand_p (DEMAND_NONZERO_AVL)) - return info1.demand_p (DEMAND_AVL) - && !info1.demand_p (DEMAND_NONZERO_AVL) && info1.has_avl_reg (); - } - else - return info1.demand_p (DEMAND_AVL) && info1.has_avl_reg (); - return false; -} - -static bool -reg_available_p (const insn_info *insn, const vector_insn_info &info) -{ - if (info.has_avl_reg () && !info.get_avl_source ()) - return false; - insn_info *def_insn = info.get_avl_source ()->insn (); - if (def_insn->bb () == insn->bb ()) - return before_p (def_insn, insn); - else - return dominated_by_p (CDI_DOMINATORS, insn->bb ()->cfg_bb (), - def_insn->bb ()->cfg_bb ()); -} - -/* Return true if the instruction support relaxed compatible check. 
*/ -static bool -support_relaxed_compatible_p (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (fault_first_load_p (info1.get_insn ()->rtl ()) - && info2.demand_p (DEMAND_AVL) && info2.has_avl_reg () - && info2.get_avl_source () && info2.get_avl_source ()->insn ()->is_phi ()) - { - hash_set sets - = get_all_sets (info2.get_avl_source (), true, false, false); - for (set_info *set : sets) - { - if (read_vl_insn_p (set->insn ()->rtl ())) - { - const insn_info *insn - = get_backward_fault_first_load_insn (set->insn ()); - if (insn == info1.get_insn ()) - return info2.compatible_vtype_p (info1); - } - } - } - return false; -} - -/* Count the number of REGNO in RINSN. */ -static int -count_regno_occurrences (rtx_insn *rinsn, unsigned int regno) -{ - int count = 0; - extract_insn (rinsn); - for (int i = 0; i < recog_data.n_operands; i++) - if (refers_to_regno_p (regno, recog_data.operand[i])) - count++; - return count; -} - -/* Return TRUE if the demands can be fused. */ -static bool -demands_can_be_fused_p (const vector_insn_info &be_fused, - const vector_insn_info &to_fuse) -{ - return be_fused.compatible_p (to_fuse) && !be_fused.available_p (to_fuse); -} - -/* Return true if we can fuse VSETVL demand info into predecessor of earliest - * edge. */ -static bool -earliest_pred_can_be_fused_p (const bb_info *earliest_pred, - const vector_insn_info &earliest_info, - const vector_insn_info &expr, rtx *vlmax_vl) -{ - /* Backward VLMAX VL: - bb 3: - vsetivli zero, 1 ... -> vsetvli t1, zero - vmv.s.x - bb 5: - vsetvli t1, zero ... -> to be elided. - vlse16.v - - We should forward "t1". */ - if (!earliest_info.has_avl_reg () && expr.has_avl_reg ()) - { - rtx avl_or_vl_reg = expr.get_avl_or_vl_reg (); - gcc_assert (avl_or_vl_reg); - const insn_info *last_insn = earliest_info.get_insn (); - /* To fuse demand on earlest edge, we make sure AVL/VL - didn't change from the consume insn to the predecessor - of the edge. 
*/ - for (insn_info *i = earliest_pred->end_insn ()->prev_nondebug_insn (); - real_insn_and_same_bb_p (i, earliest_pred) - && after_or_same_p (i, last_insn); - i = i->prev_nondebug_insn ()) - { - if (find_access (i->defs (), REGNO (avl_or_vl_reg))) - return false; - if (find_access (i->uses (), REGNO (avl_or_vl_reg))) - return false; - } - if (vlmax_vl && vlmax_avl_p (expr.get_avl ())) - *vlmax_vl = avl_or_vl_reg; - } - - return true; -} - -/* Return true if the current VSETVL 1 is dominated by preceding VSETVL 2. - - VSETVL 2 dominates VSETVL 1 should satisfy this following check: - - - VSETVL 2 should have the RATIO (SEW/LMUL) with VSETVL 1. - - VSETVL 2 is user vsetvl (vsetvl VL, AVL) - - VSETVL 2 "VL" result is the "AVL" of VSETL1. */ -static bool -vsetvl_dominated_by_p (const basic_block cfg_bb, - const vector_insn_info &vsetvl1, - const vector_insn_info &vsetvl2, bool fuse_p) -{ - if (!vsetvl1.valid_or_dirty_p () || !vsetvl2.valid_or_dirty_p ()) - return false; - if (!has_vl_op (vsetvl1.get_insn ()->rtl ()) - || !vsetvl_insn_p (vsetvl2.get_insn ()->rtl ())) - return false; - - hash_set sets - = get_all_sets (vsetvl1.get_avl_source (), true, false, false); - set_info *set = get_same_bb_set (sets, cfg_bb); - - if (!vsetvl1.has_avl_reg () || vlmax_avl_p (vsetvl1.get_avl ()) - || !vsetvl2.same_vlmax_p (vsetvl1) || !set - || set->insn () != vsetvl2.get_insn ()) - return false; - - if (fuse_p && vsetvl2.same_vtype_p (vsetvl1)) - return false; - else if (!fuse_p && !vsetvl2.same_vtype_p (vsetvl1)) - return false; - return true; -} - -avl_info::avl_info (const avl_info &other) -{ - m_value = other.get_value (); - m_source = other.get_source (); -} - -avl_info::avl_info (rtx value_in, set_info *source_in) - : m_value (value_in), m_source (source_in) -{} - -bool -avl_info::single_source_equal_p (const avl_info &other) const -{ - set_info *set1 = m_source; - set_info *set2 = other.get_source (); - insn_info *insn1 = extract_single_source (set1); - insn_info *insn2 = 
extract_single_source (set2); - if (!insn1 || !insn2) - return false; - return source_equal_p (insn1, insn2); -} - -bool -avl_info::multiple_source_equal_p (const avl_info &other) const -{ - /* When the def info is same in RTL_SSA namespace, it's safe - to consider they are avl compatible. */ - if (m_source == other.get_source ()) - return true; - - /* We only consider handle PHI node. */ - if (!m_source->insn ()->is_phi () || !other.get_source ()->insn ()->is_phi ()) - return false; - - phi_info *phi1 = as_a (m_source); - phi_info *phi2 = as_a (other.get_source ()); - - if (phi1->is_degenerate () && phi2->is_degenerate ()) - { - /* Degenerate PHI means the PHI node only have one input. */ - - /* If both PHI nodes have the same single input in use list. - We consider they are AVL compatible. */ - if (phi1->input_value (0) == phi2->input_value (0)) - return true; - } - /* TODO: We can support more optimization cases in the future. */ - return false; -} - -avl_info & -avl_info::operator= (const avl_info &other) -{ - m_value = other.get_value (); - m_source = other.get_source (); - return *this; -} - -bool -avl_info::operator== (const avl_info &other) const -{ - if (!m_value) - return !other.get_value (); - if (!other.get_value ()) - return false; - - if (GET_CODE (m_value) != GET_CODE (other.get_value ())) - return false; - - /* Handle CONST_INT AVL. */ - if (CONST_INT_P (m_value)) - return INTVAL (m_value) == INTVAL (other.get_value ()); - - /* Handle VLMAX AVL. */ - if (vlmax_avl_p (m_value)) - return vlmax_avl_p (other.get_value ()); - if (vlmax_avl_p (other.get_value ())) - return false; - - /* If any source is undef value, we think they are not equal. */ - if (!m_source || !other.get_source ()) - return false; - - /* If both sources are single source (defined by a single real RTL) - and their definitions are same. 
*/ - if (single_source_equal_p (other)) - return true; - - return multiple_source_equal_p (other); -} - -bool -avl_info::operator!= (const avl_info &other) const -{ - return !(*this == other); -} - -bool -avl_info::has_non_zero_avl () const -{ - if (has_avl_imm ()) - return INTVAL (get_value ()) > 0; - if (has_avl_reg ()) - return vlmax_avl_p (get_value ()); - return false; -} - -/* Initialize VL/VTYPE information. */ -vl_vtype_info::vl_vtype_info (avl_info avl_in, uint8_t sew_in, - enum vlmul_type vlmul_in, uint8_t ratio_in, - bool ta_in, bool ma_in) - : m_avl (avl_in), m_sew (sew_in), m_vlmul (vlmul_in), m_ratio (ratio_in), - m_ta (ta_in), m_ma (ma_in) -{ - gcc_assert (valid_sew_p (m_sew) && "Unexpected SEW"); -} - -bool -vl_vtype_info::operator== (const vl_vtype_info &other) const -{ - return same_avl_p (other) && m_sew == other.get_sew () - && m_vlmul == other.get_vlmul () && m_ta == other.get_ta () - && m_ma == other.get_ma () && m_ratio == other.get_ratio (); -} - -bool -vl_vtype_info::operator!= (const vl_vtype_info &other) const -{ - return !(*this == other); -} - -bool -vl_vtype_info::same_avl_p (const vl_vtype_info &other) const -{ - /* We need to compare both RTL and SET. If both AVL are CONST_INT. - For example, const_int 3 and const_int 4, we need to compare - RTL. If both AVL are REG and their REGNO are same, we need to - compare SET. */ - return get_avl () == other.get_avl () - && get_avl_source () == other.get_avl_source (); -} - -bool -vl_vtype_info::same_vtype_p (const vl_vtype_info &other) const -{ - return get_sew () == other.get_sew () && get_vlmul () == other.get_vlmul () - && get_ta () == other.get_ta () && get_ma () == other.get_ma (); -} - -bool -vl_vtype_info::same_vlmax_p (const vl_vtype_info &other) const -{ - return get_ratio () == other.get_ratio (); -} - -/* Compare the compatibility between Dem1 and Dem2. 
- If Dem1 > Dem2, Dem1 has bigger compatibility then Dem2 - meaning Dem1 is easier be compatible with others than Dem2 - or Dem2 is stricter than Dem1. - For example, Dem1 (demand SEW + LMUL) > Dem2 (demand RATIO). */ -bool -vector_insn_info::operator>= (const vector_insn_info &other) const -{ - if (support_relaxed_compatible_p (*this, other)) - { - unsigned array_size = sizeof (unavailable_conds) / sizeof (demands_cond); - /* Bypass AVL unavailable cases. */ - for (unsigned i = 2; i < array_size; i++) - if (unavailable_conds[i].pair.match_cond_p (this->get_demands (), - other.get_demands ()) - && unavailable_conds[i].incompatible_p (*this, other)) - return false; - return true; - } - - if (!other.compatible_p (static_cast (*this))) - return false; - if (!this->compatible_p (static_cast (other))) - return true; - - if (*this == other) - return true; - - for (const auto &cond : unavailable_conds) - if (cond.pair.match_cond_p (this->get_demands (), other.get_demands ()) - && cond.incompatible_p (*this, other)) - return false; - - return true; -} - -bool -vector_insn_info::operator== (const vector_insn_info &other) const -{ - gcc_assert (!uninit_p () && !other.uninit_p () - && "Uninitialization should not happen"); - - /* Empty is only equal to another Empty. */ - if (empty_p ()) - return other.empty_p (); - if (other.empty_p ()) - return empty_p (); - - /* Unknown is only equal to another Unknown. */ - if (unknown_p ()) - return other.unknown_p (); - if (other.unknown_p ()) - return unknown_p (); - - for (size_t i = 0; i < NUM_DEMAND; i++) - if (m_demands[i] != other.demand_p ((enum demand_type) i)) - return false; - - /* We should consider different INSN demands as different - expression. Otherwise, we will be doing incorrect vsetvl - elimination. */ - if (m_insn != other.get_insn ()) - return false; - - if (!same_avl_p (other)) - return false; - - /* If the full VTYPE is valid, check that it is the same. 
*/ - return same_vtype_p (other); -} - -void -vector_insn_info::parse_insn (rtx_insn *rinsn) -{ - *this = vector_insn_info (); - if (!NONDEBUG_INSN_P (rinsn)) - return; - if (optimize == 0 && !has_vtype_op (rinsn)) - return; - gcc_assert (!vsetvl_discard_result_insn_p (rinsn)); - m_state = VALID; - extract_insn_cached (rinsn); - rtx avl = ::get_avl (rinsn); - m_avl = avl_info (avl, nullptr); - m_sew = ::get_sew (rinsn); - m_vlmul = ::get_vlmul (rinsn); - m_ta = tail_agnostic_p (rinsn); - m_ma = mask_agnostic_p (rinsn); -} - -void -vector_insn_info::parse_insn (insn_info *insn) -{ - *this = vector_insn_info (); - - /* Return if it is debug insn for the consistency with optimize == 0. */ - if (insn->is_debug_insn ()) - return; - - /* We set it as unknown since we don't what will happen in CALL or ASM. */ - if (insn->is_call () || insn->is_asm ()) - { - set_unknown (); +public: + vsetvl_info () + : m_insn (nullptr), m_bb (nullptr), m_avl (NULL_RTX), m_vl (NULL_RTX), + m_avl_def (nullptr), m_sew (0), m_max_sew (0), m_vlmul (LMUL_RESERVED), + m_ratio (0), m_ta (false), m_ma (false), + m_sew_lmul_demand (sew_lmul_demand_type::sew_lmul), + m_policy_demand (policy_demand_type::tail_mask_policy), + m_avl_demand (avl_demand_type::avl), m_state (state_type::UNINITIALIZED), + m_ignore (false), change_vtype_only (false), m_read_vl_insn (nullptr), + use_by_non_rvv_insn (false) + {} + + vsetvl_info (insn_info *insn) : vsetvl_info () { parse_insn (insn); } + + vsetvl_info (rtx_insn *insn) : vsetvl_info () { parse_insn (insn); } + + void set_avl (rtx avl) { m_avl = avl; } + void set_vl (rtx vl) { m_vl = vl; } + void set_avl_def (set_info *avl_def) { m_avl_def = avl_def; } + void set_sew (uint8_t sew) { m_sew = sew; } + void set_vlmul (vlmul_type vlmul) { m_vlmul = vlmul; } + void set_ratio (uint8_t ratio) { m_ratio = ratio; } + void set_ta (bool ta) { m_ta = ta; } + void set_ma (bool ma) { m_ma = ma; } + void set_ignore () { m_ignore = true; } + void set_bb (bb_info *bb) { m_bb = 
bb; } + void set_max_sew (uint8_t max_sew) { m_max_sew = max_sew; } + void set_change_vtype_only () { change_vtype_only = true; } + void set_read_vl_insn (insn_info *insn) { m_read_vl_insn = insn; } + + rtx get_avl () const { return m_avl; } + rtx get_vl () const { return m_vl; } + set_info *get_avl_def () const { return m_avl_def; } + uint8_t get_sew () const { return m_sew; } + vlmul_type get_vlmul () const { return m_vlmul; } + uint8_t get_ratio () const { return m_ratio; } + bool get_ta () const { return m_ta; } + bool get_ma () const { return m_ma; } + insn_info *get_insn () const { return m_insn; } + bool ignore_p () const { return m_ignore; } + bb_info *get_bb () const { return m_bb; } + uint8_t get_max_sew () const { return m_max_sew; } + insn_info *get_read_vl_insn () const { return m_read_vl_insn; } + bool use_by_non_rvv_insn_p () const { return use_by_non_rvv_insn; } + + bool has_imm_avl () const { return m_avl && CONST_INT_P (m_avl); } + bool has_vlmax_avl () const { return vlmax_avl_p (m_avl); } + bool has_reg_avl () const + { + return m_avl && REG_P (m_avl) && !has_vlmax_avl (); + } + bool has_non_zero_avl () const + { + if (has_imm_avl ()) + return INTVAL (m_avl) > 0; + return has_vlmax_avl (); + } + bool has_reg_vl () const + { + gcc_assert (!m_vl || REG_P (m_vl)); + return m_vl && REG_P (m_vl); + } + bool has_same_ratio (const vsetvl_info &other) const + { + return get_ratio () == other.get_ratio (); + } + bool is_in_origin_bb () const { return get_insn ()->bb () == get_bb (); } + void update_avl (const vsetvl_info &other) + { + m_avl = other.get_avl (); + m_vl = other.get_vl (); + m_avl_def = other.get_avl_def (); + } + + bool uninit_p () const { return m_state == state_type::UNINITIALIZED; } + bool valid_p () const { return m_state == state_type::VALID; } + bool unknown_p () const { return m_state == state_type::UNKNOWN; } + bool empty_p () const { return m_state == state_type::EMPTY; } + bool change_vtype_only_p () const { return 
change_vtype_only; } + + void set_valid () { m_state = state_type::VALID; } + void set_unknown () { m_state = state_type::UNKNOWN; } + void set_empty () { m_state = state_type::EMPTY; } + + void set_sew_lmul_demand (sew_lmul_demand_type demand) + { + m_sew_lmul_demand = demand; + } + void set_policy_demand (policy_demand_type demand) + { + m_policy_demand = demand; + } + void set_avl_demand (avl_demand_type demand) { m_avl_demand = demand; } + + sew_lmul_demand_type get_sew_lmul_demand () const + { + return m_sew_lmul_demand; + } + policy_demand_type get_policy_demand () const { return m_policy_demand; } + avl_demand_type get_avl_demand () const { return m_avl_demand; } + + void normalize_demand (unsigned demand_flags) + { + switch (demand_flags + & (DEMAND_SEW_P | DEMAND_LMUL_P | DEMAND_RATIO_P | DEMAND_GE_SEW_P)) + { + case (unsigned) sew_lmul_demand_type::sew_lmul: + m_sew_lmul_demand = sew_lmul_demand_type::sew_lmul; + break; + case (unsigned) sew_lmul_demand_type::ratio_only: + m_sew_lmul_demand = sew_lmul_demand_type::ratio_only; + break; + case (unsigned) sew_lmul_demand_type::sew_only: + m_sew_lmul_demand = sew_lmul_demand_type::sew_only; + break; + case (unsigned) sew_lmul_demand_type::ge_sew: + m_sew_lmul_demand = sew_lmul_demand_type::ge_sew; + break; + case (unsigned) sew_lmul_demand_type::ratio_and_ge_sew: + m_sew_lmul_demand = sew_lmul_demand_type::ratio_and_ge_sew; + break; + default: + gcc_unreachable (); + } + + switch (demand_flags & (DEMAND_TAIL_POLICY_P | DEMAND_MASK_POLICY_P)) + { + case (unsigned) policy_demand_type::tail_mask_policy: + m_policy_demand = policy_demand_type::tail_mask_policy; + break; + case (unsigned) policy_demand_type::tail_policy_only: + m_policy_demand = policy_demand_type::tail_policy_only; + break; + case (unsigned) policy_demand_type::mask_policy_only: + m_policy_demand = policy_demand_type::mask_policy_only; + break; + case (unsigned) policy_demand_type::ignore_policy: + m_policy_demand = 
policy_demand_type::ignore_policy; + break; + default: + gcc_unreachable (); + } + + switch (demand_flags & (DEMAND_AVL_P | DEMAND_NON_ZERO_AVL_P)) + { + case (unsigned) avl_demand_type::avl: + m_avl_demand = avl_demand_type::avl; + break; + case (unsigned) avl_demand_type::non_zero_avl: + m_avl_demand = avl_demand_type::non_zero_avl; + break; + case (unsigned) avl_demand_type::ignore_avl: + m_avl_demand = avl_demand_type::ignore_avl; + break; + default: + gcc_unreachable (); + } + } + + void parse_insn (rtx_insn *rinsn) + { + if (!NONDEBUG_INSN_P (rinsn)) return; - } - - /* If this is something that updates VL/VTYPE that we don't know about, set - the state to unknown. */ - if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ()) - && (find_access (insn->defs (), VL_REGNUM) - || find_access (insn->defs (), VTYPE_REGNUM))) - { - set_unknown (); + if (optimize == 0 && !has_vtype_op (rinsn)) return; - } - - if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ())) - return; - - /* Warning: This function has to work on both the lowered (i.e. post - emit_local_forward_vsetvls) and pre-lowering forms. The main implication - of this is that it can't use the value of a SEW, VL, or Policy operand as - they might be stale after lowering. */ - vl_vtype_info::operator= (get_vl_vtype_info (insn)); - m_insn = insn; - m_state = VALID; - if (vector_config_insn_p (insn->rtl ())) - { - m_demands[DEMAND_AVL] = true; - m_demands[DEMAND_RATIO] = true; + gcc_assert (!vsetvl_discard_result_insn_p (rinsn)); + set_valid (); + extract_insn_cached (rinsn); + m_avl = ::get_avl (rinsn); + if (has_vlmax_avl () || vsetvl_insn_p (rinsn)) + m_vl = ::get_vl (rinsn); + m_sew = ::get_sew (rinsn); + m_vlmul = ::get_vlmul (rinsn); + m_ta = tail_agnostic_p (rinsn); + m_ma = mask_agnostic_p (rinsn); + } + + void parse_insn (insn_info *insn) + { + m_insn = insn; + m_bb = insn->bb (); + /* Return if it is debug insn for the consistency with optimize == 0. 
*/ + if (insn->is_debug_insn ()) return; - } - - if (has_vl_op (insn->rtl ())) - m_demands[DEMAND_AVL] = true; - - if (get_attr_ratio (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_RATIO] = true; - else - { - /* TODO: By default, if it doesn't demand RATIO, we set it - demand SEW && LMUL both. Some instructions may demand SEW - only and ignore LMUL, will fix it later. */ - m_demands[DEMAND_SEW] = true; - if (!ignore_vlmul_insn_p (insn->rtl ())) - m_demands[DEMAND_LMUL] = true; - } - - if (get_attr_ta (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_TAIL_POLICY] = true; - if (get_attr_ma (insn->rtl ()) != INVALID_ATTRIBUTE) - m_demands[DEMAND_MASK_POLICY] = true; - - if (vector_config_insn_p (insn->rtl ())) - return; - - if (scalar_move_insn_p (insn->rtl ())) - { - if (m_avl.has_non_zero_avl ()) - m_demands[DEMAND_NONZERO_AVL] = true; - if (m_ta) - m_demands[DEMAND_GE_SEW] = true; - } - if (!m_avl.has_avl_reg () || vlmax_avl_p (get_avl ()) || !m_avl.get_source ()) - return; - if (!m_avl.get_source ()->insn ()->is_real () - && !m_avl.get_source ()->insn ()->is_phi ()) - return; + /* We set it as unknown since we don't what will happen in CALL or ASM. */ + if (insn->is_call () || insn->is_asm ()) + { + set_unknown (); + return; + } + + /* If this is something that updates VL/VTYPE that we don't know about, set + the state to unknown. 
*/ + if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ()) + && (find_access (insn->defs (), VL_REGNUM) + || find_access (insn->defs (), VTYPE_REGNUM))) + { + set_unknown (); + return; + } + + if (!vector_config_insn_p (insn->rtl ()) && !has_vtype_op (insn->rtl ())) + /* uninitialized */ + return; - insn_info *def_insn = extract_single_source (m_avl.get_source ()); - if (!def_insn || !vsetvl_insn_p (def_insn->rtl ())) - return; + set_valid (); + + m_avl = ::get_avl (insn->rtl ()); + if (m_avl) + { + if (vsetvl_insn_p (insn->rtl ()) || has_vlmax_avl ()) + m_vl = ::get_vl (insn->rtl ()); + + if (has_reg_avl ()) + m_avl_def = find_access (insn->uses (), REGNO (m_avl))->def (); + } + + m_sew = ::get_sew (insn->rtl ()); + m_vlmul = ::get_vlmul (insn->rtl ()); + m_ratio = get_attr_ratio (insn->rtl ()); + /* when get_attr_ratio is invalid, this kind of instructions + doesn't care about ratio. However, we still need this value + in demand info backward analysis. */ + if (m_ratio == INVALID_ATTRIBUTE) + m_ratio = calculate_ratio (m_sew, m_vlmul); + m_ta = tail_agnostic_p (insn->rtl ()); + m_ma = mask_agnostic_p (insn->rtl ()); + + /* If merge operand is undef value, we prefer agnostic. */ + int merge_op_idx = get_attr_merge_op_idx (insn->rtl ()); + if (merge_op_idx != INVALID_ATTRIBUTE + && satisfies_constraint_vu (recog_data.operand[merge_op_idx])) + { + m_ta = true; + m_ma = true; + } + + /* Determine the demand info of the RVV insn. */ + m_max_sew = get_max_int_sew (); + unsigned demand_flags = 0; + if (vector_config_insn_p (insn->rtl ())) + { + demand_flags |= demand_flags::DEMAND_AVL_P; + demand_flags |= demand_flags::DEMAND_RATIO_P; + } + else + { + if (has_vl_op (insn->rtl ())) + { + if (scalar_move_insn_p (insn->rtl ())) + { + /* If the avl for vmv.s.x comes from the vsetvl instruction, we + don't know if the avl is non-zero, so it is set to + DEMAND_AVL_P for now. 
it may be corrected to + DEMAND_NON_ZERO_AVL_P later when more information is + available. + */ + if (has_non_zero_avl ()) + demand_flags |= demand_flags::DEMAND_NON_ZERO_AVL_P; + else + demand_flags |= demand_flags::DEMAND_AVL_P; + } + else + demand_flags |= demand_flags::DEMAND_AVL_P; + } - vector_insn_info new_info; - new_info.parse_insn (def_insn); - if (!same_vlmax_p (new_info) && !scalar_move_insn_p (insn->rtl ())) - return; + if (get_attr_ratio (insn->rtl ()) != INVALID_ATTRIBUTE) + demand_flags |= demand_flags::DEMAND_RATIO_P; + else + { + if (scalar_move_insn_p (insn->rtl ()) && m_ta) + { + demand_flags |= demand_flags::DEMAND_GE_SEW_P; + m_max_sew = get_attr_type (insn->rtl ()) == TYPE_VFMOVFV + ? get_max_float_sew () + : get_max_int_sew (); + } + else + demand_flags |= demand_flags::DEMAND_SEW_P; + + if (!ignore_vlmul_insn_p (insn->rtl ())) + demand_flags |= demand_flags::DEMAND_LMUL_P; + } - if (new_info.has_avl ()) - { - if (new_info.has_avl_imm ()) - set_avl_info (avl_info (new_info.get_avl (), nullptr)); - else - { - if (vlmax_avl_p (new_info.get_avl ())) - set_avl_info (avl_info (new_info.get_avl (), get_avl_source ())); - else - { - /* Conservatively propagate non-VLMAX AVL of user vsetvl: - 1. The user vsetvl should be same block with the rvv insn. - 2. The user vsetvl is the only def insn of rvv insn. - 3. The AVL is not modified between def-use chain. - 4. The VL is only used by insn within EBB. - */ - bool modified_p = false; - for (insn_info *i = def_insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, get_insn ()->bb ()); - i = i->next_nondebug_insn ()) - { - /* Consider this following sequence: - - insn 1: vsetvli a5,a3,e8,mf4,ta,mu - insn 2: vsetvli zero,a5,e32,m1,ta,ma - ... - vle32.v v1,0(a1) - vsetvli a2,zero,e32,m1,ta,ma - vadd.vv v1,v1,v1 - vsetvli zero,a5,e32,m1,ta,ma - vse32.v v1,0(a0) - ... - insn 3: sub a3,a3,a5 - ... 
- - We can local AVL propagate "a3" from insn 1 to insn 2 - if no insns between insn 1 and insn 2 modify "a3 even - though insn 3 modifies "a3". - Otherwise, we can't perform local AVL propagation. - - Early break if we reach the insn 2. */ - if (!before_p (i, insn)) - break; - if (find_access (i->defs (), REGNO (new_info.get_avl ()))) - { - modified_p = true; - break; - } - } + if (!m_ta) + demand_flags |= demand_flags::DEMAND_TAIL_POLICY_P; + if (!m_ma) + demand_flags |= demand_flags::DEMAND_MASK_POLICY_P; + } + + normalize_demand (demand_flags); + + /* Optimize AVL from the vsetvl instruction. */ + insn_info *def_insn = extract_single_source (get_avl_def ()); + if (def_insn && vsetvl_insn_p (def_insn->rtl ())) + { + vsetvl_info def_info = vsetvl_info (def_insn); + if ((scalar_move_insn_p (insn->rtl ()) + || def_info.get_ratio () == get_ratio ()) + && (def_info.has_vlmax_avl () || def_info.has_imm_avl ())) + { + update_avl (def_info); + if (scalar_move_insn_p (insn->rtl ()) && has_non_zero_avl ()) + m_avl_demand = avl_demand_type::non_zero_avl; + } + } + + /* Determine if dest operand(vl) has been used by non-RVV instructions. 
*/ + if (has_reg_vl ()) + { + const hash_set vl_uses + = get_all_real_uses (get_insn (), REGNO (get_vl ())); + for (use_info *use : vl_uses) + { + gcc_assert (use->insn ()->is_real ()); + rtx_insn *rinsn = use->insn ()->rtl (); + if (!has_vl_op (rinsn) + || count_regno_occurrences (rinsn, REGNO (get_vl ())) != 1) + { + use_by_non_rvv_insn = true; + break; + } + rtx avl = ::get_avl (rinsn); + if (!avl || REGNO (get_vl ()) != REGNO (avl)) + { + use_by_non_rvv_insn = true; + break; + } + } + } - bool has_live_out_use = false; - for (use_info *use : m_avl.get_source ()->all_uses ()) - { - if (use->is_live_out_use ()) - { - has_live_out_use = true; - break; - } - } - if (!modified_p && !has_live_out_use - && def_insn == m_avl.get_source ()->insn () - && m_insn->bb () == def_insn->bb ()) - set_avl_info (new_info.get_avl_info ()); - } - } - } + /* Collect the read vl insn for the fault-only-first rvv loads. */ + if (fault_first_load_p (insn->rtl ())) + { + for (insn_info *i = insn->next_nondebug_insn (); + i->bb () == insn->bb (); i = i->next_nondebug_insn ()) + { + if (find_access (i->defs (), VL_REGNUM)) + break; + if (i->rtl () && read_vl_insn_p (i->rtl ())) + { + m_read_vl_insn = i; + break; + } + } + } + } + + bool operator== (const vsetvl_info &other) const + { + gcc_assert (!uninit_p () && !other.uninit_p () + && "Uninitialization should not happen"); + + if (empty_p ()) + return other.empty_p (); + if (unknown_p ()) + return other.unknown_p (); + + return get_insn () == other.get_insn () && get_bb () == other.get_bb () + && get_avl () == other.get_avl () && get_vl () == other.get_vl () + && get_avl_def () == other.get_avl_def () + && get_sew () == other.get_sew () + && get_vlmul () == other.get_vlmul () && get_ta () == other.get_ta () + && get_ma () == other.get_ma () + && get_avl_demand () == other.get_avl_demand () + && get_sew_lmul_demand () == other.get_sew_lmul_demand () + && get_policy_demand () == other.get_policy_demand (); + } + + void dump (FILE *file, 
const char *indent = "") const + { + if (uninit_p ()) + { + fprintf (file, "UNINITIALIZED.\n"); + return; + } + else if (unknown_p ()) + { + fprintf (file, "UNKNOWN.\n"); + return; + } + else if (empty_p ()) + { + fprintf (file, "EMPTY.\n"); + return; + } + else if (valid_p ()) + fprintf (file, "VALID (insn %u, bb %u)%s\n", get_insn ()->uid (), + get_bb ()->index (), ignore_p () ? " (ignore)" : ""); + else + gcc_unreachable (); - if (scalar_move_insn_p (insn->rtl ()) && m_avl.has_non_zero_avl ()) - m_demands[DEMAND_NONZERO_AVL] = true; -} + fprintf (file, "%sDemand fields:", indent); + if (m_sew_lmul_demand == sew_lmul_demand_type::sew_lmul) + fprintf (file, " demand_sew_lmul"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ratio_only) + fprintf (file, " demand_ratio_only"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::sew_only) + fprintf (file, " demand_sew_only"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ge_sew) + fprintf (file, " demand_ge_sew"); + else if (m_sew_lmul_demand == sew_lmul_demand_type::ratio_and_ge_sew) + fprintf (file, " demand_ratio_and_ge_sew"); + + if (m_policy_demand == policy_demand_type::tail_mask_policy) + fprintf (file, " demand_tail_mask_policy"); + else if (m_policy_demand == policy_demand_type::tail_policy_only) + fprintf (file, " demand_tail_policy_only"); + else if (m_policy_demand == policy_demand_type::mask_policy_only) + fprintf (file, " demand_mask_policy_only"); + + if (m_avl_demand == avl_demand_type::avl) + fprintf (file, " demand_avl"); + else if (m_avl_demand == avl_demand_type::non_zero_avl) + fprintf (file, " demand_non_zero_avl"); + fprintf (file, "\n"); + + fprintf (file, "%sSEW=%d, ", indent, get_sew ()); + fprintf (file, "VLMUL=%s, ", vlmul_to_str (get_vlmul ())); + fprintf (file, "RATIO=%d, ", get_ratio ()); + fprintf (file, "MAX_SEW=%d\n", get_max_sew ()); + + fprintf (file, "%sTAIL_POLICY=%s, ", indent, policy_to_str (get_ta ())); + fprintf (file, "MASK_POLICY=%s\n", policy_to_str 
(get_ma ())); + + fprintf (file, "%sAVL=", indent); + print_rtl_single (file, get_avl ()); + fprintf (file, "%sVL=", indent); + print_rtl_single (file, get_vl ()); + if (change_vtype_only_p ()) + fprintf (file, "%schange vtype only\n", indent); + if (get_read_vl_insn ()) + fprintf (file, "%sread_vl_insn: insn %u\n", indent, + get_read_vl_insn ()->uid ()); + if (use_by_non_rvv_insn_p ()) + fprintf (file, "%suse_by_non_rvv_insn=true\n", indent); + } +}; bool -vector_insn_info::compatible_p (const vector_insn_info &other) const +same_equiv_note_p (const vsetvl_info &prev, const vsetvl_info &next) { - gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () - && "Can't compare invalid demanded infos"); - - for (const auto &cond : incompatible_conds) - if (cond.dual_incompatible_p (*this, other)) - return false; - return true; + set_info *set1 = prev.get_avl_def (); + set_info *set2 = next.get_avl_def (); + insn_info *insn1 = extract_single_source (set1); + insn_info *insn2 = extract_single_source (set2); + if (!insn1 || !insn2) + return false; + return source_equal_p (insn1, insn2); } -bool -vector_insn_info::skip_avl_compatible_p (const vector_insn_info &other) const +class demand_system { - gcc_assert (valid_or_dirty_p () && other.valid_or_dirty_p () - && "Can't compare invalid demanded infos"); - unsigned array_size = sizeof (incompatible_conds) / sizeof (demands_cond); - /* Bypass AVL incompatible cases. 
*/ - for (unsigned i = 1; i < array_size; i++) - if (incompatible_conds[i].dual_incompatible_p (*this, other)) - return false; - return true; -} +private: + sbitmap *m_avl_def_in; + sbitmap *m_avl_def_out; -bool -vector_insn_info::compatible_avl_p (const vl_vtype_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); - if (!demand_p (DEMAND_AVL)) - return true; - if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) - return true; - return get_avl_info () == other.get_avl_info (); -} + /* predictors. */ -bool -vector_insn_info::compatible_avl_p (const avl_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare AVL in unknown state"); - gcc_assert (demand_p (DEMAND_AVL) && "Can't compare AVL undemand state"); - if (!demand_p (DEMAND_AVL)) - return true; - if (demand_p (DEMAND_NONZERO_AVL) && other.has_non_zero_avl ()) + inline bool always_true (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + { return true; - return get_avl_info () == other; -} - -bool -vector_insn_info::compatible_vtype_p (const vl_vtype_info &other) const -{ - gcc_assert (valid_or_dirty_p () && "Can't compare invalid vl_vtype_info"); - gcc_assert (!unknown_p () && "Can't compare VTYPE in unknown state"); - if (demand_p (DEMAND_SEW)) - { - if (!demand_p (DEMAND_GE_SEW) && m_sew != other.get_sew ()) - return false; - if (demand_p (DEMAND_GE_SEW) && m_sew > other.get_sew ()) - return false; - } - if (demand_p (DEMAND_LMUL) && m_vlmul != other.get_vlmul ()) - return false; - if (demand_p (DEMAND_RATIO) && m_ratio != other.get_ratio ()) - return false; - if (demand_p (DEMAND_TAIL_POLICY) && m_ta != other.get_ta ()) + } + inline bool always_false (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + { return false; - if 
(demand_p (DEMAND_MASK_POLICY) && m_ma != other.get_ma ()) - return false; - return true; -} - -/* Determine whether the vector instructions requirements represented by - Require are compatible with the previous vsetvli instruction represented - by this. INSN is the instruction whose requirements we're considering. */ -bool -vector_insn_info::compatible_p (const vl_vtype_info &curr_info) const -{ - gcc_assert (!uninit_p () && "Can't handle uninitialized info"); - if (empty_p ()) + } + + /* predictors for sew and lmul */ + + inline bool eq_lmul_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_vlmul () == next.get_vlmul (); + } + inline bool eq_sew_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew (); + } + inline bool eq_sew_lmul_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return eq_lmul_p (prev, next) && eq_sew_p (prev, next); + } + inline bool ge_next_sew_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew () + || (next.get_ta () && prev.get_sew () > next.get_sew ()); + } + inline bool ge_prev_sew_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.get_sew () == next.get_sew () + || (prev.get_ta () && prev.get_sew () < next.get_sew ()); + } + inline bool le_next_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_sew () <= next.get_max_sew (); + } + inline bool le_prev_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return next.get_sew () <= prev.get_max_sew (); + } + inline bool max_sew_overlap_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return !(prev.get_sew () > next.get_max_sew () + || next.get_sew () > prev.get_max_sew ()); + } + inline bool eq_ratio_p (const vsetvl_info &prev, const vsetvl_info &next) + { + return prev.has_same_ratio (next); + } + inline bool has_prev_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + 
return prev.get_ratio () >= (next.get_sew () / 8); + } + inline bool has_next_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return next.get_ratio () >= (prev.get_sew () / 8); + } + + inline bool ge_next_sew_and_eq_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_next_sew_p (prev, next) && eq_ratio_p (prev, next); + } + inline bool ge_next_sew_and_le_next_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_next_sew_p (prev, next) && le_next_max_sew_p (prev, next); + } + inline bool + ge_next_sew_and_le_next_max_sew_and_has_next_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_next_sew_p (prev, next) && le_next_max_sew_p (prev, next) + && has_next_ratio_p (prev, next); + } + inline bool ge_prev_sew_and_le_prev_max_sew_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_prev_sew_p (prev, next) && le_prev_max_sew_p (prev, next); + } + inline bool max_sew_overlap_and_has_next_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return has_next_ratio_p (prev, next) && max_sew_overlap_p (prev, next); + } + inline bool + ge_prev_sew_and_le_prev_max_sew_and_eq_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_prev_sew_p (prev, next) && eq_ratio_p (prev, next) + && le_prev_max_sew_p (prev, next); + } + inline bool max_sew_overlap_and_has_prev_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return has_prev_ratio_p (prev, next) && max_sew_overlap_p (prev, next); + } + inline bool + ge_prev_sew_and_le_prev_max_sew_and_has_prev_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return ge_prev_sew_p (prev, next) && has_prev_ratio_p (prev, next) + && le_prev_max_sew_p (prev, next); + } + inline bool max_sew_overlap_and_eq_ratio_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return eq_ratio_p (prev, next) && max_sew_overlap_p (prev, next); + } + + /* predictors for tail 
and mask policy */ + + inline bool eq_tail_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ta () == next.get_ta (); + } + inline bool eq_mask_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ma () == next.get_ma (); + } + inline bool eq_tail_mask_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return eq_tail_policy_p (prev, next) && eq_mask_policy_p (prev, next); + } + + inline bool comp_tail_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ta () || next.get_ta () || eq_tail_policy_p (prev, next); + } + + inline bool comp_mask_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return prev.get_ma () || next.get_ma () || eq_mask_policy_p (prev, next); + } + + inline bool comp_tail_mask_policy_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return comp_tail_policy_p (prev, next) && comp_mask_policy_p (prev, next); + } + + /* predictors for avl */ + + inline bool def_or_use_vl_p (insn_info *i, const vsetvl_info &info) + { + return info.has_reg_vl () + && (find_access (i->uses (), REGNO (info.get_vl ())) + || find_access (i->defs (), REGNO (info.get_vl ()))); + } + inline bool def_avl_p (insn_info *i, const vsetvl_info &info) + { + return info.has_reg_avl () + && find_access (i->defs (), REGNO (info.get_avl ())); + } + + inline bool def_reg_between (insn_info *prev_insn, insn_info *curr_insn, + unsigned regno) + { + gcc_assert (prev_insn->compare_with (curr_insn) < 0); + /* Both insns are in the same BB, ordered top to bottom; no edge is crossed. */ + for (insn_info *i = curr_insn->prev_nondebug_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def of regno + if (find_access (i->defs (), regno)) + return true; + } return false; + } - /* Nothing is compatible with Unknown. 
*/ - if (unknown_p ()) - return false; + inline bool same_reg_avl_p (const vsetvl_info &prev, const vsetvl_info &next) + { + if (!prev.has_reg_avl () || !next.has_reg_avl ()) + return false; - /* If the instruction doesn't need an AVLReg and the SEW matches, consider - it compatible. */ - if (!demand_p (DEMAND_AVL)) - if (m_sew == curr_info.get_sew ()) + if (same_equiv_note_p (prev, next)) return true; - return compatible_avl_p (curr_info) && compatible_vtype_p (curr_info); -} - -bool -vector_insn_info::available_p (const vector_insn_info &other) const -{ - return *this >= other; -} - -void -vector_insn_info::fuse_avl (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - set_insn (info1.get_insn ()); - if (info1.demand_p (DEMAND_AVL)) - { - if (info1.demand_p (DEMAND_NONZERO_AVL)) - { - if (info2.demand_p (DEMAND_AVL) - && !info2.demand_p (DEMAND_NONZERO_AVL)) - { - set_avl_info (info2.get_avl_info ()); - set_demand (DEMAND_AVL, true); - set_demand (DEMAND_NONZERO_AVL, false); - return; - } - } - set_avl_info (info1.get_avl_info ()); - set_demand (DEMAND_NONZERO_AVL, info1.demand_p (DEMAND_NONZERO_AVL)); - } - else - { - set_avl_info (info2.get_avl_info ()); - set_demand (DEMAND_NONZERO_AVL, info2.demand_p (DEMAND_NONZERO_AVL)); - } - set_demand (DEMAND_AVL, - info1.demand_p (DEMAND_AVL) || info2.demand_p (DEMAND_AVL)); -} - -void -vector_insn_info::fuse_sew_lmul (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - /* We need to fuse sew && lmul according to demand info: - - 1. GE_SEW. - 2. SEW. - 3. LMUL. - 4. RATIO. 
*/ - if (same_sew_lmul_demand_p (info1.get_demands (), info2.get_demands ())) - { - set_demand (DEMAND_SEW, info2.demand_p (DEMAND_SEW)); - set_demand (DEMAND_LMUL, info2.demand_p (DEMAND_LMUL)); - set_demand (DEMAND_RATIO, info2.demand_p (DEMAND_RATIO)); - set_demand (DEMAND_GE_SEW, info2.demand_p (DEMAND_GE_SEW)); - set_sew (info2.get_sew ()); - set_vlmul (info2.get_vlmul ()); - set_ratio (info2.get_ratio ()); - return; - } - for (const auto &rule : fuse_rules) - { - if (rule.pair.match_cond_p (info1.get_demands (), info2.get_demands ())) - { - set_demand (DEMAND_SEW, rule.demand_sew_p); - set_demand (DEMAND_LMUL, rule.demand_lmul_p); - set_demand (DEMAND_RATIO, rule.demand_ratio_p); - set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); - set_sew (rule.new_sew (info1, info2)); - set_vlmul (rule.new_vlmul (info1, info2)); - set_ratio (rule.new_ratio (info1, info2)); - return; - } - if (rule.pair.match_cond_p (info2.get_demands (), info1.get_demands ())) - { - set_demand (DEMAND_SEW, rule.demand_sew_p); - set_demand (DEMAND_LMUL, rule.demand_lmul_p); - set_demand (DEMAND_RATIO, rule.demand_ratio_p); - set_demand (DEMAND_GE_SEW, rule.demand_ge_sew_p); - set_sew (rule.new_sew (info2, info1)); - set_vlmul (rule.new_vlmul (info2, info1)); - set_ratio (rule.new_ratio (info2, info1)); - return; - } - } - gcc_unreachable (); -} - -void -vector_insn_info::fuse_tail_policy (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.demand_p (DEMAND_TAIL_POLICY)) - { - set_ta (info1.get_ta ()); - demand (DEMAND_TAIL_POLICY); - } - else if (info2.demand_p (DEMAND_TAIL_POLICY)) - { - set_ta (info2.get_ta ()); - demand (DEMAND_TAIL_POLICY); - } - else - set_ta (get_default_ta ()); -} - -void -vector_insn_info::fuse_mask_policy (const vector_insn_info &info1, - const vector_insn_info &info2) -{ - if (info1.demand_p (DEMAND_MASK_POLICY)) - { - set_ma (info1.get_ma ()); - demand (DEMAND_MASK_POLICY); - } - else if (info2.demand_p (DEMAND_MASK_POLICY)) - { - 
set_ma (info2.get_ma ()); - demand (DEMAND_MASK_POLICY); - } - else - set_ma (get_default_ma ()); -} - -vector_insn_info -vector_insn_info::local_merge (const vector_insn_info &merge_info) const -{ - if (!vsetvl_insn_p (get_insn ()->rtl ()) && *this != merge_info) - gcc_assert (this->compatible_p (merge_info) - && "Can't merge incompatible demanded infos"); - - vector_insn_info new_info; - new_info.set_valid (); - /* For local backward data flow, we always update INSN && AVL as the - latest INSN and AVL so that we can keep track status of each INSN. */ - new_info.fuse_avl (merge_info, *this); - new_info.fuse_sew_lmul (*this, merge_info); - new_info.fuse_tail_policy (*this, merge_info); - new_info.fuse_mask_policy (*this, merge_info); - return new_info; -} + if (REGNO (prev.get_avl ()) != REGNO (next.get_avl ())) + return false; -vector_insn_info -vector_insn_info::global_merge (const vector_insn_info &merge_info, - unsigned int bb_index) const -{ - if (!vsetvl_insn_p (get_insn ()->rtl ()) && *this != merge_info) - gcc_assert (this->compatible_p (merge_info) - && "Can't merge incompatible demanded infos"); - - vector_insn_info new_info; - new_info.set_valid (); - - /* For global data flow, we should keep original INSN and AVL if they - valid since we should keep the life information of each block. - - For example: - bb 0 -> bb 1. - We should keep INSN && AVL of bb 1 since we will eventually emit - vsetvl instruction according to INSN and AVL of bb 1. */ - new_info.fuse_avl (*this, merge_info); - /* Recompute the AVL source whose block index is equal to BB_INDEX. 
*/ - if (new_info.get_avl_source () - && new_info.get_avl_source ()->insn ()->is_phi () - && new_info.get_avl_source ()->bb ()->index () != bb_index) - { - hash_set<set_info *> sets - = get_all_sets (new_info.get_avl_source (), true, true, true); - new_info.set_avl_source (nullptr); - bool can_find_set_p = false; - set_info *first_set = nullptr; - for (set_info *set : sets) - { - if (!first_set) - first_set = set; - if (set->bb ()->index () == bb_index) - { - gcc_assert (!can_find_set_p); - new_info.set_avl_source (set); - can_find_set_p = true; - } - } - if (!can_find_set_p && sets.elements () == 1 - && first_set->insn ()->is_real ()) - new_info.set_avl_source (first_set); - } + insn_info *prev_insn = prev.get_insn (); + if (prev.get_bb () != prev_insn->bb ()) + prev_insn = prev.get_bb ()->end_insn (); - /* Make sure VLMAX AVL always has a set_info the get VL. */ - if (vlmax_avl_p (new_info.get_avl ())) - { - if (this->get_avl_source ()) - new_info.set_avl_source (this->get_avl_source ()); - else - { - gcc_assert (merge_info.get_avl_source ()); - new_info.set_avl_source (merge_info.get_avl_source ()); - } - } + insn_info *next_insn = next.get_insn (); + if (next.get_bb () != next_insn->bb ()) + next_insn = next.get_bb ()->end_insn (); - new_info.fuse_sew_lmul (*this, merge_info); - new_info.fuse_tail_policy (*this, merge_info); - new_info.fuse_mask_policy (*this, merge_info); - return new_info; -} + /* In fact, next's vl may be modified as long as vl is not used. */ + return safe_move_avl_vl_p (prev_insn, next_insn, next, false); + } -/* Wrapper helps to return the AVL or VL operand for the - vector_insn_info. Return AVL if the AVL is not VLMAX. - Otherwise, return the VL operand. 
*/ -rtx -vector_insn_info::get_avl_or_vl_reg (void) const -{ - gcc_assert (has_avl_reg ()); - if (!vlmax_avl_p (get_avl ())) - return get_avl (); + inline bool equal_avl_p (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); - rtx_insn *rinsn = get_insn ()->rtl (); - if (has_vl_op (rinsn) || vsetvl_insn_p (rinsn)) - { - rtx vl = ::get_vl (rinsn); - /* For VLMAX, we should make sure we get the - REG to emit 'vsetvl VL,zero' since the 'VL' - should be the REG according to RVV ISA. */ - if (REG_P (vl)) - return vl; - } + if (prev.get_ratio () != next.get_ratio ()) + return false; - /* We always has avl_source if it is VLMAX AVL. */ - gcc_assert (get_avl_source ()); - return get_avl_reg_rtx (); -} + if (next.has_reg_vl () && next.use_by_non_rvv_insn_p ()) + return false; -bool -vector_insn_info::update_fault_first_load_avl (insn_info *insn) -{ - // Update AVL to vl-output of the fault first load. - const insn_info *read_vl = get_forward_read_vl_insn (insn); - if (read_vl) - { - rtx vl = SET_DEST (PATTERN (read_vl->rtl ())); - def_info *def = find_access (read_vl->defs (), REGNO (vl)); - set_info *set = safe_dyn_cast<set_info *> (def); - set_avl_info (avl_info (vl, set)); - set_insn (insn); + if (vector_config_insn_p (prev.get_insn ()->rtl ()) && next.get_avl_def () + && next.get_avl_def ()->insn () == prev.get_insn ()) return true; - } - return false; -} - -static const char * -vlmul_to_str (vlmul_type vlmul) -{ - switch (vlmul) - { - case LMUL_1: - return "m1"; - case LMUL_2: - return "m2"; - case LMUL_4: - return "m4"; - case LMUL_8: - return "m8"; - case LMUL_RESERVED: - return "INVALID LMUL"; - case LMUL_F8: - return "mf8"; - case LMUL_F4: - return "mf4"; - case LMUL_F2: - return "mf2"; - - default: - gcc_unreachable (); - } -} -static const char * -policy_to_str (bool agnostic_p) -{ - return agnostic_p ? 
"agnostic" : "undisturbed"; -} + if (prev.get_read_vl_insn ()) + { + if (!next.has_reg_avl () || !next.get_avl_def ()) + return false; + insn_info *avl_def_insn = extract_single_source (next.get_avl_def ()); + return avl_def_insn == prev.get_read_vl_insn (); + } + + if (prev == next && prev.has_reg_avl ()) + { + /* The case where a single BB forms a loop by itself. */ + insn_info *insn = prev.get_insn (); + bb_info *bb = insn->bb (); + for (insn_info *i = insn; real_insn_and_same_bb_p (i, bb); + i = i->next_nondebug_insn ()) + if (find_access (i->defs (), REGNO (prev.get_avl ()))) + return false; + } -void -vector_insn_info::dump (FILE *file) const -{ - fprintf (file, "["); - if (uninit_p ()) - fprintf (file, "UNINITIALIZED,"); - else if (valid_p ()) - fprintf (file, "VALID,"); - else if (unknown_p ()) - fprintf (file, "UNKNOWN,"); - else if (empty_p ()) - fprintf (file, "EMPTY,"); - else - fprintf (file, "DIRTY,"); - - fprintf (file, "Demand field={%d(VL),", demand_p (DEMAND_AVL)); - fprintf (file, "%d(DEMAND_NONZERO_AVL),", demand_p (DEMAND_NONZERO_AVL)); - fprintf (file, "%d(SEW),", demand_p (DEMAND_SEW)); - fprintf (file, "%d(DEMAND_GE_SEW),", demand_p (DEMAND_GE_SEW)); - fprintf (file, "%d(LMUL),", demand_p (DEMAND_LMUL)); - fprintf (file, "%d(RATIO),", demand_p (DEMAND_RATIO)); - fprintf (file, "%d(TAIL_POLICY),", demand_p (DEMAND_TAIL_POLICY)); - fprintf (file, "%d(MASK_POLICY)}\n", demand_p (DEMAND_MASK_POLICY)); - - fprintf (file, "AVL="); - print_rtl_single (file, get_avl ()); - fprintf (file, "SEW=%d,", get_sew ()); - fprintf (file, "VLMUL=%s,", vlmul_to_str (get_vlmul ())); - fprintf (file, "RATIO=%d,", get_ratio ()); - fprintf (file, "TAIL_POLICY=%s,", policy_to_str (get_ta ())); - fprintf (file, "MASK_POLICY=%s", policy_to_str (get_ma ())); - fprintf (file, "]\n"); - - if (valid_p ()) - { - if (get_insn ()) - { - fprintf (file, "The real INSN="); - print_rtl_single (file, get_insn ()->rtl ()); - } - } -} + if (prev.has_vlmax_avl () && next.has_vlmax_avl ()) + return true; + else if 
(prev.has_imm_avl () && next.has_imm_avl ()) + return INTVAL (prev.get_avl ()) == INTVAL (next.get_avl ()); + else if (prev.has_reg_vl () && next.has_reg_avl () + && REGNO (prev.get_vl ()) == REGNO (next.get_avl ())) + { + insn_info *prev_insn = prev.get_insn (); + if (prev.get_bb () != prev_insn->bb ()) + prev_insn = prev.get_bb ()->end_insn (); + + insn_info *next_insn = next.get_insn (); + if (next.get_bb () != next_insn->bb ()) + next_insn = next.get_bb ()->end_insn (); + + return safe_move_avl_vl_p (prev_insn, next_insn, next, false); + // if (prev.get_bb () != next.get_bb () || !prev.is_in_origin_bb ()) + // return false; + // insn_info *prev_insn = prev.get_insn (); + // insn_info *curr_insn = next.get_bb ()->end_insn (); + // if (def_reg_between (prev_insn, curr_insn, REGNO (next.get_avl ()))) + // return false; + // return true; + } + else if (prev.has_reg_avl () && next.has_reg_avl ()) + return same_reg_avl_p (prev, next); -vector_infos_manager::vector_infos_manager () -{ - vector_edge_list = nullptr; - vector_kill = nullptr; - vector_del = nullptr; - vector_insert = nullptr; - vector_antic = nullptr; - vector_transp = nullptr; - vector_comp = nullptr; - vector_avin = nullptr; - vector_avout = nullptr; - vector_antin = nullptr; - vector_antout = nullptr; - vector_earliest = nullptr; - vector_insn_infos.safe_grow_cleared (get_max_uid ()); - vector_block_infos.safe_grow_cleared (last_basic_block_for_fn (cfun)); - if (!optimize) - { - basic_block cfg_bb; - rtx_insn *rinsn; - FOR_ALL_BB_FN (cfg_bb, cfun) - { - vector_block_infos[cfg_bb->index].local_dem = vector_insn_info (); - vector_block_infos[cfg_bb->index].reaching_out = vector_insn_info (); - FOR_BB_INSNS (cfg_bb, rinsn) - vector_insn_infos[INSN_UID (rinsn)].parse_insn (rinsn); - } - } - else - { - for (const bb_info *bb : crtl->ssa->bbs ()) - { - vector_block_infos[bb->index ()].local_dem = vector_insn_info (); - vector_block_infos[bb->index ()].reaching_out = vector_insn_info (); - for (insn_info 
*insn : bb->real_insns ()) - vector_insn_infos[insn->uid ()].parse_insn (insn); - vector_block_infos[bb->index ()].probability = profile_probability (); - } - } -} + return false; + } + inline bool equal_avl_or_prev_non_zero_avl_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + return equal_avl_p (prev, next) || prev.has_non_zero_avl (); + } + + inline bool can_use_next_avl_p (const vsetvl_info &prev, + const vsetvl_info &next) + { + if (!next.has_reg_avl () && !next.has_reg_vl ()) + return true; -void -vector_infos_manager::create_expr (vector_insn_info &info) -{ - for (size_t i = 0; i < vector_exprs.length (); i++) - if (*vector_exprs[i] == info) + insn_info *prev_insn = prev.get_insn (); + if (prev.get_bb () != prev_insn->bb ()) + prev_insn = prev.get_bb ()->end_insn (); + + insn_info *next_insn = next.get_insn (); + if (next.get_bb () != next_insn->bb ()) + next_insn = next.get_bb ()->end_insn (); + + return safe_move_avl_vl_p (prev_insn, next_insn, next); + } + + inline bool equal_avl_or_next_non_zero_avl_and_can_use_next_avl_p ( + const vsetvl_info &prev, const vsetvl_info &next) + { + return equal_avl_p (prev, next) + || (next.has_non_zero_avl () && can_use_next_avl_p (prev, next)); + } + + /* modifiers */ + + inline void nop (const vsetvl_info &prev ATTRIBUTE_UNUSED, + const vsetvl_info &next ATTRIBUTE_UNUSED) + {} + + /* modifiers for sew and lmul */ + + inline void use_min_max_sew (vsetvl_info &prev, const vsetvl_info &next) + { + prev.set_max_sew (MIN (prev.get_max_sew (), next.get_max_sew ())); + } + inline void use_next_sew (vsetvl_info &prev, const vsetvl_info &next) + { + prev.set_sew (next.get_sew ()); + use_min_max_sew (prev, next); + } + inline void use_max_sew (vsetvl_info &prev, const vsetvl_info &next) + { + auto max_sew = std::max (prev.get_sew (), next.get_sew ()); + prev.set_sew (max_sew); + use_min_max_sew (prev, next); + } + inline void use_next_sew_lmul (vsetvl_info &prev, const vsetvl_info &next) + { + use_next_sew (prev, 
next); + prev.set_vlmul (next.get_vlmul ()); + prev.set_ratio (next.get_ratio ()); + } + inline void use_next_sew_with_prev_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + use_next_sew (prev, next); + prev.set_vlmul (calculate_vlmul (next.get_sew (), prev.get_ratio ())); + } + inline void modify_lmul_with_next_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + prev.set_vlmul (calculate_vlmul (prev.get_sew (), next.get_ratio ())); + prev.set_ratio (next.get_ratio ()); + } + + inline void use_max_sew_and_lmul_with_next_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + prev.set_vlmul (calculate_vlmul (prev.get_sew (), next.get_ratio ())); + use_max_sew (prev, next); + prev.set_ratio (next.get_ratio ()); + } + + inline void use_max_sew_and_lmul_with_prev_ratio (vsetvl_info &prev, + const vsetvl_info &next) + { + auto max_sew = std::max (prev.get_sew (), next.get_sew ()); + prev.set_vlmul (calculate_vlmul (max_sew, prev.get_ratio ())); + prev.set_sew (max_sew); + } + + /* modifiers for tail and mask policy */ + + inline void use_tail_policy (vsetvl_info &prev, const vsetvl_info &next) + { + if (!next.get_ta ()) + prev.set_ta (next.get_ta ()); + } + inline void use_mask_policy (vsetvl_info &prev, const vsetvl_info &next) + { + if (!next.get_ma ()) + prev.set_ma (next.get_ma ()); + } + inline void use_tail_mask_policy (vsetvl_info &prev, const vsetvl_info &next) + { + use_tail_policy (prev, next); + use_mask_policy (prev, next); + } + + /* modifiers for avl */ + + inline void use_next_avl (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (can_use_next_avl_p (prev, next)); + prev.update_avl (next); + } + + inline void use_next_avl_when_not_equal (vsetvl_info &prev, + const vsetvl_info &next) + { + if (equal_avl_p (prev, next)) return; - vector_exprs.safe_push (&info); -} - -size_t -vector_infos_manager::get_expr_id (const vector_insn_info &info) const -{ - for (size_t i = 0; i < vector_exprs.length (); i++) - if (*vector_exprs[i] == 
info) - return i; - gcc_unreachable (); -} - -auto_vec<size_t> -vector_infos_manager::get_all_available_exprs ( - const vector_insn_info &info) const -{ - auto_vec<size_t> available_list; - for (size_t i = 0; i < vector_exprs.length (); i++) - if (info.available_p (*vector_exprs[i])) - available_list.safe_push (i); - return available_list; -} + gcc_assert (next.has_non_zero_avl ()); + use_next_avl (prev, next); + } -bool -vector_infos_manager::all_same_ratio_p (sbitmap bitdata) const -{ - if (bitmap_empty_p (bitdata)) - return false; +public: + demand_system () : m_avl_def_in (nullptr), m_avl_def_out (nullptr) {} + + void set_avl_in_out_data (sbitmap *avl_def_in, sbitmap *avl_def_out) + { + m_avl_def_in = avl_def_in; + m_avl_def_out = avl_def_out; + } + + /* Can we safely move vsetvl info between prev_insn and next_insn? */ + bool safe_move_avl_vl_p (insn_info *prev_insn, insn_info *next_insn, + const vsetvl_info &info, bool ignore_vl = false) + { + gcc_assert ((ignore_vl && info.has_reg_avl ()) + || (info.has_reg_avl () || info.has_reg_vl ())); + + gcc_assert (!prev_insn->is_debug_insn () && !next_insn->is_debug_insn ()); + if (prev_insn->bb () == next_insn->bb () + && prev_insn->compare_with (next_insn) < 0) + { + /* Both insns are in the same BB, ordered top to bottom; no edge is crossed. */ + for (insn_info *i = next_insn->prev_nondebug_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def and use of vl + if (!ignore_vl && def_or_use_vl_p (i, info)) + return false; - int ratio = -1; - unsigned int bb_index; - sbitmap_iterator sbi; + // no def of avl + if (def_avl_p (i, info)) + return false; + } + return true; + } + else + { + /* Crossing an edge: 1. between different BBs, 2. a loop within the same BB. */ 
+ if (!ignore_vl && info.has_reg_vl ()) + { + /* If vl is in the live-out set of prev_bb, + * info's vl cannot be safely modified at prev_insn. */ + bitmap live_out = df_get_live_out (prev_insn->bb ()->cfg_bb ()); + if (bitmap_bit_p (live_out, REGNO (info.get_vl ()))) + return false; + } - EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi) - { - if (ratio == -1) - ratio = vector_exprs[bb_index]->get_ratio (); - else if (vector_exprs[bb_index]->get_ratio () != ratio) - return false; - } - return true; -} + if (info.has_reg_avl () && m_avl_def_in && m_avl_def_out) + { + bool has_avl_out = false; + unsigned regno = REGNO (info.get_avl ()); + unsigned expr_id; + sbitmap_iterator sbi; + EXECUTE_IF_SET_IN_BITMAP (m_avl_def_out[prev_insn->bb ()->index ()], + 0, expr_id, sbi) + { + if (get_regno (expr_id, last_basic_block_for_fn (cfun)) + != regno) + continue; + has_avl_out = true; + if (!bitmap_bit_p (m_avl_def_in[next_insn->bb ()->index ()], + expr_id)) + return false; + } + /* The avl is not in the avl_out set of prev_bb. */ + if (!has_avl_out) + return false; + } -/* Return TRUE if the incoming vector configuration state - to CFG_BB is compatible with the vector configuration - state in CFG_BB, FALSE otherwise. 
*/ -bool -vector_infos_manager::all_avail_in_compatible_p (const basic_block cfg_bb) const -{ - const auto &info = vector_block_infos[cfg_bb->index].local_dem; - sbitmap avin = vector_avin[cfg_bb->index]; - unsigned int bb_index; - sbitmap_iterator sbi; - EXECUTE_IF_SET_IN_BITMAP (avin, 0, bb_index, sbi) - { - const auto &avin_info - = static_cast<const vl_vtype_info &> (*vector_exprs[bb_index]); - if (!info.compatible_p (avin_info)) - return false; - } - return true; -} + /* For an original info, check whether any insn before next_insn modifies avl or uses vl. */ + for (insn_info *i = next_insn; i != next_insn->bb ()->head_insn (); + i = i->prev_nondebug_insn ()) + { + // no def and use of vl + if (!ignore_vl && def_or_use_vl_p (i, info)) + return false; -bool -vector_infos_manager::all_same_avl_p (const basic_block cfg_bb, - sbitmap bitdata) const -{ - if (bitmap_empty_p (bitdata)) - return false; + // no def of avl + if (def_avl_p (i, info)) + return false; + } + + /* For an original info, check whether any insn after prev_insn modifies avl or uses vl. 
*/ + for (insn_info *i = prev_insn->bb ()->end_insn (); i != prev_insn; + i = i->prev_nondebug_insn ()) + { + // no def and use of vl + if (!ignore_vl && def_or_use_vl_p (i, info)) + return false; - const auto &block_info = vector_block_infos[cfg_bb->index]; - if (!block_info.local_dem.demand_p (DEMAND_AVL)) + // no def of avl + if (def_avl_p (i, info)) + return false; + } + } return true; + } + + bool compatible_sew_lmul_with (const vsetvl_info &prev, + const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + return COMPATIBLE_P (prev, next); - avl_info avl = block_info.local_dem.get_avl_info (); - unsigned int bb_index; - sbitmap_iterator sbi; +#include 
"riscv-vsetvl.def" - EXECUTE_IF_SET_IN_BITMAP (bitdata, 0, bb_index, sbi) - { - if (vector_exprs[bb_index]->get_avl_info () != avl) - return false; - } - return true; -} + gcc_unreachable (); + } + + bool available_sew_lmul_with (const vsetvl_info &prev, + const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + return AVAILABLE_P (prev, next); -bool -vector_infos_manager::earliest_fusion_worthwhile_p ( - const basic_block cfg_bb) const -{ - edge e; - edge_iterator ei; - profile_probability prob = profile_probability::uninitialized (); - FOR_EACH_EDGE (e, ei, cfg_bb->succs) - { - if (prob == profile_probability::uninitialized ()) - prob = vector_block_infos[e->dest->index].probability; - else if (prob == vector_block_infos[e->dest->index].probability) - continue; - else - /* We pick the highest probability among those incompatible VSETVL - infos. When all incompatible VSTEVL infos have same probability, we - don't pick any of them. 
*/ - return true; - } - return false; -} +#include "riscv-vsetvl.def" -bool -vector_infos_manager::vsetvl_dominated_by_all_preds_p ( - const basic_block cfg_bb, const vector_insn_info &info) const -{ - edge e; - edge_iterator ei; - FOR_EACH_EDGE (e, ei, cfg_bb->preds) - { - const auto &reaching_out = vector_block_infos[e->src->index].reaching_out; - if (e->src->index == cfg_bb->index && reaching_out.compatible_p (info)) - continue; - if (!vsetvl_dominated_by_p (e->src, info, reaching_out, false)) - return false; + gcc_unreachable (); + } + + void merge_sew_lmul_with (vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + sew_lmul_demand_type prev_flags = prev.get_sew_lmul_demand (); + sew_lmul_demand_type next_flags = next.get_sew_lmul_demand (); +#define DEF_SEW_LMUL_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == sew_lmul_demand_type::PREV_FLAGS \ + && next_flags == sew_lmul_demand_type::NEXT_FLAGS) \ + { \ + gcc_assert (COMPATIBLE_P (prev, next)); \ + FUSE (prev, next); \ + prev.set_sew_lmul_demand (sew_lmul_demand_type::NEW_FLAGS); \ + return; \ } - return true; -} -size_t -vector_infos_manager::expr_set_num (sbitmap bitdata) const -{ - size_t count = 0; - for (size_t i = 0; i < vector_exprs.length (); i++) - if (bitmap_bit_p (bitdata, i)) - count++; - return count; -} +#include "riscv-vsetvl.def" -void -vector_infos_manager::release (void) -{ - if (!vector_insn_infos.is_empty ()) - vector_insn_infos.release (); - if (!vector_block_infos.is_empty ()) - vector_block_infos.release (); - if (!vector_exprs.is_empty ()) - vector_exprs.release (); - - gcc_assert (to_refine_vsetvls.is_empty ()); - gcc_assert (to_delete_vsetvls.is_empty ()); - if (optimize > 0) - free_bitmap_vectors (); -} + gcc_unreachable (); + } -void -vector_infos_manager::create_bitmap_vectors (void) -{ - /* Create the bitmap vectors. 
*/ - vector_antic = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_transp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_comp = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_avin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_avout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_kill = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_antin = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - vector_antout = sbitmap_vector_alloc (last_basic_block_for_fn (cfun), - vector_exprs.length ()); - - bitmap_vector_ones (vector_transp, last_basic_block_for_fn (cfun)); - bitmap_vector_clear (vector_antic, last_basic_block_for_fn (cfun)); - bitmap_vector_clear (vector_comp, last_basic_block_for_fn (cfun)); - vector_edge_list = create_edge_list (); - vector_earliest = sbitmap_vector_alloc (NUM_EDGES (vector_edge_list), - vector_exprs.length ()); -} + bool compatible_policy_with (const vsetvl_info &prev, const vsetvl_info &next) + { + gcc_assert (prev.valid_p () && next.valid_p ()); + policy_demand_type prev_flags = prev.get_policy_demand (); + policy_demand_type next_flags = next.get_policy_demand (); +#define DEF_POLICY_RULE(PREV_FLAGS, NEXT_FLAGS, NEW_FLAGS, COMPATIBLE_P, \ + AVAILABLE_P, FUSE) \ + if (prev_flags == policy_demand_type::PREV_FLAGS \ + && next_flags == policy_demand_type::NEXT_FLAGS) \ + return COMPATIBLE_P (prev, next); -void -vector_infos_manager::free_bitmap_vectors (void) -{ - /* Finished. Free up all the things we've allocated. 
*/ - free_edge_list (vector_edge_list); - if (vector_del) - sbitmap_vector_free (vector_del); - if (vector_insert) - sbitmap_vector_free (vector_insert); - if (vector_kill) - sbitmap_vector_free (vector_kill); - if (vector_antic) - sbitmap_vector_free (vector_antic); - if (vector_transp) - sbitmap_vector_free (vector_transp); - if (vector_comp) - sbitmap_vector_free (vector_comp); - if (vector_avin) - sbitmap_vector_free (vector_avin); - if (vector_avout) - sbitmap_vector_free (vector_avout); - if (vector_antin) - sbitmap_vector_free (vector_antin); - if (vector_antout) - sbitmap_vector_free (vector_antout); - if (vector_earliest) - sbitmap_vector_free (vector_earliest); - - vector_edge_list = nullptr; - vector_kill = nullptr; - vector_del = nullptr; - vector_insert = nullptr; - vector_antic = nullptr; - vector_transp = nullptr; - vector_comp = nullptr; - vector_avin = nullptr; - vector_avout = nullptr; - vector_antin = nullptr; - vector_antout = nullptr; - vector_earliest = nullptr; -} +#include "riscv-vsetvl.def" -void -vector_infos_manager::dump (FILE *file) const -{ - basic_block cfg_bb; - rtx_insn *rinsn; + gcc_unreachable (); + } - fprintf (file, "\n"); - FOR_ALL_BB_FN (cfg_bb, cfun) - { - fprintf (file, "Local vector info of :\n", cfg_bb->index); - fprintf (file, "
="); - vector_block_infos[cfg_bb->index].local_dem.dump (file); - FOR_BB_INSNS (cfg_bb, rinsn) - { - if (!NONDEBUG_INSN_P (rinsn) || !has_vtype_op (rinsn)) - continue; - fprintf (file, "=", INSN_UID (rinsn)); - const auto &info = vector_insn_infos[INSN_UID (rinsn)]; - info.dump (file); - } - fprintf (file, "