From patchwork Sun Jun 25 12:20:57 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 112571
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, palmer@dabbelt.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Optimize VSETVL codegen of SELECT_VL with LEN_MASK_{LOAD, STORE}
Date: Sun, 25 Jun 2023 20:20:57 +0800
Message-Id: <20230625122057.195433-1-juzhe.zhong@rivai.ai>

This patch depends on the LEN_MASK_{LOAD,STORE} patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-June/622742.html

After enabling LEN_MASK_{LOAD,STORE}, I noticed a case where the code
generated by the VSETVL pass can be improved:

void
f (int32_t *__restrict a,
   int32_t *__restrict b,
   int32_t *__restrict cond,
   int n)
{
  for (int i = 0; i < 8; i++)
    if (cond[i])
      a[i] = b[i];
}

Before this patch:

f:
        vsetivli        a5,8,e8,mf4,tu,mu   --> Propagate "8" to the following vsetvl
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v0,0(a2)
        vsetvli a6,zero,e32,m1,ta,ma
        li      a3,8
        vmsne.vi        v0,v0,0
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v1,0(a1),v0.t
        vse32.v v1,0(a0),v0.t
        sub     a4,a3,a5
        beq     a3,a5,.L6
        slli    a5,a5,2
        add     a2,a2,a5
        add     a1,a1,a5
        add     a0,a0,a5
        vsetvli a5,a4,e8,mf4,tu,mu   --> Propagate "a4" to the following vsetvl
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v0,0(a2)
        vsetvli a6,zero,e32,m1,ta,ma
        vmsne.vi        v0,v0,0
        vsetvli zero,a5,e32,m1,ta,ma
        vle32.v v1,0(a1),v0.t
        vse32.v v1,0(a0),v0.t
.L6:
        ret

Currently the VSETVL pass only propagates a VLMAX AVL ("zero").
This patch also enables propagation of an immediate AVL and,
conservatively, of a non-VLMAX register AVL.

After this patch:

f:
        vsetivli        a5,8,e8,mf4,ta,ma
        vle32.v v0,0(a2)
        vsetvli a6,zero,e32,m1,ta,ma
        li      a3,8
        vmsne.vi        v0,v0,0
        vsetivli        zero,8,e32,m1,ta,ma
        vle32.v v1,0(a1),v0.t
        vse32.v v1,0(a0),v0.t
        sub     a4,a3,a5
        beq     a3,a5,.L6
        slli    a5,a5,2
        vsetvli a4,a4,e8,mf4,ta,ma
        add     a2,a2,a5
        vle32.v v0,0(a2)
        add     a1,a1,a5
        vsetvli a6,zero,e32,m1,ta,ma
        add     a0,a0,a5
        vmsne.vi        v0,v0,0
        vsetvli zero,a4,e32,m1,ta,ma
        vle32.v v1,0(a1),v0.t
        vse32.v v1,0(a0),v0.t
.L6:
        ret

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (vector_insn_info::parse_insn): Enhance
        AVL propagation.
        * config/riscv/riscv-vsetvl.h: New function.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/partial/select_vl-1.c: Add dump checks.
        * gcc.target/riscv/rvv/autovec/partial/select_vl-2.c: New test.
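
A note on why the AVL of the leading e8,mf4 vsetvl can be reused by the
e32,m1 instructions: both configurations have the same SEW/LMUL ratio and
therefore the same VLMAX (VLMAX = VLEN * LMUL / SEW). The following is only
a minimal sketch of that arithmetic. The helper names and the 128-bit VLEN
are assumptions made for illustration and are not part of the patch or of
GCC.

```cpp
// Illustrative helpers only; not GCC code. LMUL is passed as the
// fraction lmul_num / lmul_den so that mf4 can be written as 1/4.
constexpr unsigned vlmax(unsigned vlen_bits, unsigned sew_bits,
                         unsigned lmul_num, unsigned lmul_den) {
  // VLMAX = VLEN * LMUL / SEW
  return vlen_bits * lmul_num / (sew_bits * lmul_den);
}

// True when SEW_a / LMUL_a == SEW_b / LMUL_b (compared by
// cross-multiplication to stay in integer arithmetic).
constexpr bool same_sew_lmul_ratio(unsigned sew_a, unsigned num_a,
                                   unsigned den_a, unsigned sew_b,
                                   unsigned num_b, unsigned den_b) {
  return sew_a * den_a * num_b == sew_b * den_b * num_a;
}

// e8,mf4 and e32,m1 share the same ratio, hence the same VLMAX
// for any VLEN (128 is just an example value).
static_assert(same_sew_lmul_ratio(8, 1, 4, 32, 1, 1),
              "e8,mf4 and e32,m1 have the same SEW/LMUL ratio");
static_assert(vlmax(128, 8, 1, 4) == vlmax(128, 32, 1, 1),
              "same ratio implies same VLMAX");
```

In the patch itself this precondition corresponds to the existing
same_vlmax_p check that precedes the new propagation code.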
---
 gcc/config/riscv/riscv-vsetvl.cc              | 48 +++++++++++++++++--
 gcc/config/riscv/riscv-vsetvl.h               |  2 +
 .../riscv/rvv/autovec/partial/select_vl-1.c   |  5 +-
 .../riscv/rvv/autovec/partial/select_vl-2.c   | 25 ++++++++++
 4 files changed, 76 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-2.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 971c3f90742..2d576e8d5c1 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2003,9 +2003,51 @@ vector_insn_info::parse_insn (insn_info *insn)
   new_info.parse_insn (def_insn);
   if (!same_vlmax_p (new_info) && !scalar_move_insn_p (insn->rtl ()))
     return;
-  /* TODO: Currently, we don't forward AVL for non-VLMAX vsetvl.  */
-  if (vlmax_avl_p (new_info.get_avl ()))
-    set_avl_info (avl_info (new_info.get_avl (), get_avl_source ()));
+
+  if (new_info.has_avl ())
+    {
+      if (new_info.has_avl_imm ())
+        set_avl_info (avl_info (new_info.get_avl (), nullptr));
+      else
+        {
+          if (vlmax_avl_p (new_info.get_avl ()))
+            set_avl_info (avl_info (new_info.get_avl (), get_avl_source ()));
+          else
+            {
+              /* Conservatively propagate non-VLMAX AVL of user vsetvl:
+                 1. The user vsetvl should be same block with the rvv insn.
+                 2. The user vsetvl is the only def insn of rvv insn.
+                 3. The AVL is not modified between def-use chain.
+                 4. The VL is only used by insn within EBB.
+               */
+              bool modified_p = false;
+              for (insn_info *i = def_insn->next_nondebug_insn ();
+                   real_insn_and_same_bb_p (i, get_insn ()->bb ());
+                   i = i->next_nondebug_insn ())
+                {
+                  if (find_access (i->defs (), REGNO (new_info.get_avl ())))
+                    {
+                      modified_p = true;
+                      break;
+                    }
+                }
+
+              bool has_live_out_use = false;
+              for (use_info *use : m_avl.get_source ()->all_uses ())
+                {
+                  if (use->is_live_out_use ())
+                    {
+                      has_live_out_use = true;
+                      break;
+                    }
+                }
+              if (!modified_p && !has_live_out_use
+                  && def_insn == m_avl.get_source ()->insn ()
+                  && m_insn->bb () == def_insn->bb ())
+                set_avl_info (new_info.get_avl_info ());
+            }
+        }
+    }
 
   if (scalar_move_insn_p (insn->rtl ()) && m_avl.has_non_zero_avl ())
     m_demands[DEMAND_NONZERO_AVL] = true;
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index 4257451bb74..87cdd2e886e 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -180,6 +180,7 @@ public:
   bool has_avl_reg () const { return get_value () && REG_P (get_value ()); }
   bool has_avl_no_reg () const { return !get_value (); }
   bool has_non_zero_avl () const;
+  bool has_avl () const { return get_value (); }
 };
 
 /* Basic structure to save VL/VTYPE information.  */
@@ -219,6 +220,7 @@ public:
   bool has_avl_reg () const { return m_avl.has_avl_reg (); }
   bool has_avl_no_reg () const { return m_avl.has_avl_no_reg (); }
   bool has_non_zero_avl () const { return m_avl.has_non_zero_avl (); };
+  bool has_avl () const { return m_avl.has_avl (); }
 
   rtx get_avl () const { return m_avl.get_value (); }
   const avl_info &get_avl_info () const { return m_avl; }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c
index 74bbf40ee9f..e27090d79cf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns -fdump-tree-optimized-details" } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param riscv-autovec-preference=scalable -fno-vect-cost-model -fno-tree-loop-distribute-patterns -fdump-tree-optimized-details" } */
 
 #include
 
@@ -20,7 +20,10 @@
   TEST_TYPE (uint32_t) \
   TEST_TYPE (int64_t) \
   TEST_TYPE (uint64_t) \
+  TEST_TYPE (_Float16) \
   TEST_TYPE (float) \
   TEST_TYPE (double)
 
 TEST_ALL ()
+
+/* { dg-final { scan-tree-dump-times "\.SELECT_VL" 11 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-2.c
new file mode 100644
index 00000000000..eac7cbc757b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/select_vl-2.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=scalable -fno-schedule-insns --param riscv-autovec-lmul=m1 -O3 -ftree-vectorize" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include
+
+/*
+** foo:
+**	vsetivli\t[a-x0-9]+,\s*8,\s*e(8?|16?|32?|64),\s*m(1?|2?|4?|8?|f2?|f4?|f8),\s*t[au],\s*m[au]
+**	vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+**	...
+**	vsetvli\t[a-x0-9]+,\s*[a-x0-9]+,\s*e(8?|16?|32?|64),\s*m(1?|2?|4?|8?|f2?|f4?|f8),\s*t[au],\s*m[au]
+**	add\t[a-x0-9]+,[a-x0-9]+,[a-x0-9]+
+**	vle32\.v\tv[0-9]+,0\([a-x0-9]+\)
+**	...
+*/
+void
+foo (int32_t *__restrict a,
+     int32_t *__restrict b,
+     int32_t *__restrict cond)
+{
+  for (int i = 0; i < 8; i++)
+    if (cond[i])
+      a[i] = b[i];
+}
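
The four conditions listed in the new comment in parse_insn can be read as
one predicate. The sketch below restates them on a toy instruction list so
the logic can be followed without GCC's RTL-SSA classes. Every type and
function name in it (ToyInsn, ToyVsetvl, can_forward_avl) is hypothetical
and invented for this illustration. It also only checks the instructions
between the vsetvl and its use, whereas the loop in the patch keeps walking
while it stays in the same basic block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not GCC's RTL-SSA API: every name here is hypothetical.
struct ToyInsn {
  int block;              // id of the basic block holding the insn
  std::vector<int> defs;  // register numbers written by the insn
};

struct ToyVsetvl {
  std::size_t index;  // position of the user vsetvl in the insn list
  int avl_reg;        // register holding its (non-VLMAX) AVL operand
  bool vl_live_out;   // true if its VL result is used outside the block
};

// Returns true when the AVL of the user vsetvl `def` may be forwarded to
// the RVV insn at `use_index`. `sole_def` states whether that vsetvl is
// the only definition of the VL consumed by the RVV insn.
bool can_forward_avl(const std::vector<ToyInsn>& insns, const ToyVsetvl& def,
                     std::size_t use_index, bool sole_def) {
  assert(def.index < use_index && use_index < insns.size());

  // 1. The user vsetvl is in the same block as the RVV insn.
  if (insns[def.index].block != insns[use_index].block) return false;

  // 2. The user vsetvl is the only def insn of the RVV insn.
  if (!sole_def) return false;

  // 3. The AVL register is not redefined between def and use.
  for (std::size_t i = def.index + 1; i < use_index; ++i)
    for (int reg : insns[i].defs)
      if (reg == def.avl_reg) return false;

  // 4. The VL is not live out of the block.
  return !def.vl_live_out;
}
```

A caller would build the instruction list for one block, describe the
vsetvl, and ask whether forwarding is allowed; any write to the AVL
register in between makes the predicate return false, which mirrors the
modified_p flag in the patch.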