From patchwork Mon Jan 15 11:43:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 188115
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: Juzhe-Zhong
Subject: [Committed V3] RISC-V: Adjust loop len by costing 1 when NITER < VF
Date: Mon, 15 Jan 2024 19:43:48 +0800
Message-Id: <20240115114348.3149415-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list

Rebase in v3: Rebase to the trunk and commit it, as it's approved by Robin.
Update in v2: Add dynamic LMUL test.

This patch fixes the regression between GCC 13.2.0 and trunk GCC (GCC-14).

GCC 13.2.0:

        lui     a5,%hi(a)
        li      a4,19
        sb      a4,%lo(a)(a5)
        li      a0,0
        ret

Trunk GCC:

        vsetvli a5,zero,e8,mf2,ta,ma
        li      a4,-32768
        vid.v   v1
        vsetvli zero,zero,e16,m1,ta,ma
        addiw   a4,a4,104
        vmv.v.i v3,15
        lui     a1,%hi(a)
        li      a0,19
        vsetvli zero,zero,e8,mf2,ta,ma
        vadd.vi v1,v1,1
        sb      a0,%lo(a)(a1)
        vsetvli zero,zero,e16,m1,ta,ma
        vzext.vf2       v2,v1
        vmv.v.x v1,a4
        vminu.vv        v2,v2,v3
        vsrl.vv v1,v1,v2
        vslidedown.vi   v1,v1,17
        vmv.x.s a0,v1
        snez    a0,a0
        ret

The root cause is that we vectorize the code inefficiently because we don't
cost the loop len when NITERS < VF.

Leveraging the loop control costing of mask targets or rs6000 fixes the
regression.

Tested with no regression. OK for trunk?

        PR target/113281

gcc/ChangeLog:

        * config/riscv/riscv-vector-costs.cc (costs::adjust_vect_cost_per_loop): New function.
        (costs::finish_cost): Adjust cost for LOOP LEN with NITERS < VF.
        * config/riscv/riscv-vector-costs.h: New function.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c: New test.
        * gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c: New test.
        * gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c: New test.

---
 gcc/config/riscv/riscv-vector-costs.cc        | 57 +++++++++++++++++++
 gcc/config/riscv/riscv-vector-costs.h         |  2 +
 .../vect/costmodel/riscv/rvv/pr113281-3.c     | 18 ++++++
 .../vect/costmodel/riscv/rvv/pr113281-4.c     | 18 ++++++
 .../vect/costmodel/riscv/rvv/pr113281-5.c     | 18 ++++++
 5 files changed, 113 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c

diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 090275c7efe..90ab93b7506 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -1097,9 +1097,66 @@ costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
   return record_stmt_cost (stmt_info, where, count * stmt_cost);
 }
 
+/* For some target specific vectorization cost which can't be handled per stmt,
+   we check the requisite conditions and adjust the vectorization cost
+   accordingly if satisfied.  One typical example is to model and adjust
+   loop_len cost for known_lt (NITERS, VF).  */
+
+void
+costs::adjust_vect_cost_per_loop (loop_vec_info loop_vinfo)
+{
+  if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo)
+      && !LOOP_VINFO_USING_DECREMENTING_IV_P (loop_vinfo))
+    {
+      /* In middle-end loop vectorizer, we don't count the loop_len cost in
+         vect_estimate_min_profitable_iters when NITERS < VF, that is, we only
+         count cost of len that we need to iterate loop more than once with VF.
+         It's correct for most of the cases:
+
+         E.g. VF = [4, 4]
+           for (int i = 0; i < 3; i ++)
+             a[i] += b[i];
+
+         We don't need to cost MIN_EXPR or SELECT_VL for the case above.
+
+         However, for some inefficient vectorized cases, it does use MIN_EXPR
+         to generate len.
+
+         E.g. VF = [256, 256]
+
+         Loop body:
+           # loop_len_110 = PHI <18(2), _119(11)>
+           ...
+           _117 = MIN_EXPR ;
+           _118 = 18 - _117;
+           _119 = MIN_EXPR <_118, POLY_INT_CST [256, 256]>;
+           ...
+
+         Epilogue:
+           ...
+           _112 = .VEC_EXTRACT (vect_patt_27.14_109, _111);
+
+         We cost 1 unconditionally for this situation like other targets which
+         apply mask as the loop control.  */
+      rgroup_controls *rgc;
+      unsigned int num_vectors_m1;
+      unsigned int body_stmts = 0;
+      FOR_EACH_VEC_ELT (LOOP_VINFO_LENS (loop_vinfo), num_vectors_m1, rgc)
+        if (rgc->type)
+          body_stmts += num_vectors_m1 + 1;
+
+      add_stmt_cost (body_stmts, scalar_stmt, NULL, NULL, NULL_TREE, 0,
+                     vect_body);
+    }
+}
+
 void
 costs::finish_cost (const vector_costs *scalar_costs)
 {
+  if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo))
+    {
+      adjust_vect_cost_per_loop (loop_vinfo);
+    }
   vector_costs::finish_cost (scalar_costs);
 }
 
diff --git a/gcc/config/riscv/riscv-vector-costs.h b/gcc/config/riscv/riscv-vector-costs.h
index dc0d61f5d4a..4e2bbfd5ca9 100644
--- a/gcc/config/riscv/riscv-vector-costs.h
+++ b/gcc/config/riscv/riscv-vector-costs.h
@@ -96,6 +96,8 @@ private:
      V_REGS spills according to the analysis.  */
   bool m_has_unexpected_spills_p = false;
   void record_potential_unexpected_spills (loop_vec_info);
+
+  void adjust_vect_cost_per_loop (loop_vec_info);
 };
 
 } // namespace riscv_vector
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c
new file mode 100644
index 00000000000..706e19116c9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d -O3 -ftree-vectorize --param=riscv-autovec-lmul=m8" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-assembler-not {vset} } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c
new file mode 100644
index 00000000000..b0305db2d48
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d -O3 -ftree-vectorize --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-assembler-not {vset} } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c
new file mode 100644
index 00000000000..d3f5717b874
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr113281-5.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl4096b -mabi=lp64d -O3 -ftree-vectorize --param=riscv-autovec-lmul=dynamic" } */
+
+unsigned char a;
+
+int main() {
+  short b = a = 0;
+  for (; a != 19; a++)
+    if (a)
+      b = 32872 >> a;
+
+  if (b == 0)
+    return 0;
+  else
+    return 1;
+}
+
+/* { dg-final { scan-assembler-not {vset} } } */
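
Note for readers: the counting loop in costs::adjust_vect_cost_per_loop can be tried outside of GCC with a minimal standalone sketch. It is not GCC code. The names rgroup_model and count_len_body_stmts are hypothetical and exist only for illustration; a plain std::vector stands in for LOOP_VINFO_LENS, and a bool stands in for the rgc->type check.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GCC's rgroup_controls. Only the one property the
// counting loop inspects is modeled: has_type mirrors "rgc->type" being set.
struct rgroup_model
{
  bool has_type;
};

// Mirrors the FOR_EACH_VEC_ELT loop in the patch. The vector index plays the
// role of num_vectors_m1, so an active rgroup at index i contributes i + 1
// statements to the extra loop-body cost.
unsigned int
count_len_body_stmts (const std::vector<rgroup_model> &lens)
{
  unsigned int body_stmts = 0;
  for (std::size_t num_vectors_m1 = 0; num_vectors_m1 < lens.size ();
       ++num_vectors_m1)
    if (lens[num_vectors_m1].has_type)
      body_stmts += static_cast<unsigned int> (num_vectors_m1) + 1;
  return body_stmts;
}
```

With a single active length rgroup the sketch returns 1, which corresponds to the "cost 1" in the subject line; in the patch that count is then passed to add_stmt_cost as scalar_stmt cost in vect_body.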