From patchwork Fri Dec 1 00:51:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 172194 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp790730vqy; Thu, 30 Nov 2023 16:51:51 -0800 (PST) X-Google-Smtp-Source: AGHT+IEB4NLAR3WTAflQDkRhW9J8Idmfr3yb3C4YxWL0WGZNIVrNQWcX4UH3jE4UwT50GO7dwrq+ X-Received: by 2002:ac8:4083:0:b0:423:8fdf:804 with SMTP id p3-20020ac84083000000b004238fdf0804mr35632437qtl.8.1701391911064; Thu, 30 Nov 2023 16:51:51 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701391911; cv=pass; d=google.com; s=arc-20160816; b=cK0BCEFh+vJww+9XsGO57Hqpel4Pk8MYN9t3VBE84pBAyBiCv8YkroVuWiV/wPnogJ p31vnlfe7LJuF/0julOeY/3xYh0YT+5MiodYA8nNT8ncJH+go4/03nPwr4elXVW8nyBs iAzHGU3HECovCl4Uk82Dnrdcy8gQ5s9DiPeYE1i4bOZaiNCtgoPGET2BUGzEa1rFPSmE CMVEH4g9dSvl3Gjzhmm/Cr1uADEWP7kzPlblR/tJXxo2a1MVsn2g/y+1ncWnSfMLQuw5 //5D2q1ACDL0ah8mMNnD2nP3y0jUhuqZ77rq9z/wJ7jFYrXbxyTmv8uh4DM7Q17ZUBEF iY1g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:arc-filter:dmarc-filter:delivered-to; bh=m/p5Wo5K6TG9xtU1FaKvBypg301LPIrUYaBmrfez/oo=; fh=12MRPJmZ1mgDpHqWoogMKqnaGRGM2b7lcuJroqfjJiw=; b=zYyVBn/66Xjv+smqKPDtmKtHYi6jA/xzc6AcUAbsb2fpR59E4yYm3ry8fSVcPG4U97 8EP91CqajBmPJv12K8R5a6Y6suR4SonzAbXbMlIwnRlGQ+r7/4t60et5TP3rLmXUWk7v w7NoTbyPZTL6HqCjotGIqPmgWBusFj1Y/fmhAi/3MT+DFCeemfzCN5Uuy0NpDZloL0qn 6unlqIDAy1/D/d3ydiEquP3kyAEuxpdruIYwvrH5HB2vhi02RqIQXtaWknhKaRlq4nLI mgnPlVyTDpXOXT/dXlFO14JmdpqlnnDRS4mRXzfVoRNrsuICJrIB9iy7rdA5ZP7M6uf5 JrvQ== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id bw12-20020a05622a098c00b0042384cda308si2430163qtb.189.2023.11.30.16.51.51 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Nov 2023 16:51:51 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id CD3EE3857B97 for ; Fri, 1 Dec 2023 00:51:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr2.qq.com (smtpbgbr2.qq.com [54.207.22.56]) by sourceware.org (Postfix) with ESMTPS id 7544B3858D38 for ; Fri, 1 Dec 2023 00:51:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 7544B3858D38 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 7544B3858D38 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=54.207.22.56 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701391885; cv=none; b=sXZs7uwoCj6FNVNVaeaHAmLe+ZkHXb4E4F+JOhFpUQfniLhcs4L8OED5okrYTZlCkZPbGEElo+eKvUicKC0ObLb62rRzeJseBP/Y4uifTyO2k98SQ1ErPPWhE7uSARsRZiUL/9RZhyll3ovKU0IxH00BhktD8dJ4D0H1zGT0gOI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701391885; c=relaxed/simple; bh=kXZB28Mxk4XzqxCBKvVdN9W2MIcd8QI8XcXg9gBwqog=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=kWFTsLpUoLnv+wbN7c4Ex/TjqcVy6DMVHTQZCqdTA2k5GmZC94RWTX/seKgtdQU/7qkuQ0k0ZoMadX0728f4+qfJl83svETH2s92kRFDr93Z9+IKS1uq85H3/tOJ1oS1h2UAXY4yY8CnaqeqxGBKrhsJZybv8+TOzpx0ZlnVX7Y= ARC-Authentication-Results: i=1; server2.sourceware.org X-QQ-mid: bizesmtp72t1701391872to1p6sgp Received: from rios-cad122.hadoop.rioslab.org ( [58.60.1.26]) by bizesmtp.qq.com (ESMTP) with id ; Fri, 01 Dec 2023 08:51:11 +0800 (CST) X-QQ-SSF: 01400000000000G0V000000A0000000 X-QQ-FEAT: sHHZn/YyBq7wGnm0wub7KdR/waR1vYKc5C4VlPEC+/rYLDQrfRfBxhmQxDhme y33RmiexEh/Tx3jQnlu0yS34q5wHtL9dtDGwgaHHGZODWC6dflUlGPnQ7oNPsm9fwKxeI4Q 1HpsRvbDDzbE4CyS0hNpwBsvOH121Sp1umVRiQV72tV1xvnLnkDmLUyYSCoRGVVYhx+7KzO 7ibhThch6Hl9f32u8OwGHQx8irJfZkl/9cWlslO774BMYSbGRWhORLS8upDQvqjS/0pZM5V gYaFrjdZzWHzjOXM20oQqWGKfGEGrEtAqpFKZ7YzYn9GzAyHU780AKoG+dujRTBX99gmyJC NTWbmnYkWI+KHV7/Hq3n0kh4CJcGfy6it7zjfmY X-QQ-GoodBg: 2 X-BIZMAIL-ID: 957460602774576867 From: Juzhe-Zhong To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Fix VSETVL PASS regression Date: Fri, 1 Dec 2023 08:51:10 +0800 Message-Id: <20231201005110.2689714-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-8.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784038724563602176 X-GMAIL-MSGID: 1784038724563602176 This patch fix 2 regression (one is bug regression, the other is performance regression). Those 2 regressions are both we are comparing ratio for same AVL in wrong place. 1. BUG regression: avl_single-84.c: f0: li a5,999424 add a1,a1,a5 li a4,299008 add a5,a0,a5 addi a3,a4,992 addi a5,a5,576 addi a1,a1,576 vsetvli a4,zero,e8,m2,ta,ma add a0,a0,a3 vlm.v v1,0(a5) vsm.v v1,0(a1) vl1re64.v v1,0(a0) beq a2,zero,.L10 li a5,0 vsetvli zero,zero,e64,m1,tu,ma ---> This is totally incorrect since the ratio above is 4, wheras it is demanding ratio = 64 here. .L3: fcvt.d.lu fa5,a5 addi a5,a5,1 fadd.d fa5,fa5,fa0 vfmv.s.f v1,fa5 bne a5,a2,.L3 vfmv.f.s fa0,v1 ret .L10: vsetvli zero,zero,e64,m1,ta,ma vfmv.f.s fa0,v1 ret 2. Performance regression: before this patch: vsetvli a5,a4,e8,m1,ta,ma vsetvli zero,a5,e32,m1,tu,ma vmv.s.x v2,zero vmv.s.x v1,zero vsetvli zero,a5,e32,m4,tu,ma vle32.v v4,0(a1) vfmul.vv v4,v4,v4 vfredosum.vs v1,v4,v2 vfmv.f.s fa5,v1 fsw fa5,0(a0) sub a4,a4,a5 bne a4,zero,.L2 ret After this patch: vsetvli a5,a4,e32,m4,tu,ma vle32.v v4,0(a1) vmv.s.x v2,zero vmv.s.x v1,zero vfmul.vv v4,v4,v4 vfredosum.vs v1,v4,v2 vfmv.f.s fa5,v1 fsw fa5,0(a0) sub a4,a4,a5 bne a4,zero,.L2 ret Tested rv64gcv_zvfh_zfh passed no regression. zvl256b/zvl512b/zvl1024b/zve64d is runing. PR target/112776 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (pre_vsetvl::pre_global_vsetvl_info): Fix ratio. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/avl_single-84.c: Adapt test. * gcc.target/riscv/rvv/vsetvl/pr111037-3.c: Ditto. * gcc.target/riscv/rvv/vsetvl/pr112776.c: New test. --- gcc/config/riscv/riscv-vsetvl.cc | 13 ++++--- .../riscv/rvv/vsetvl/avl_single-84.c | 6 ++-- .../gcc.target/riscv/rvv/vsetvl/pr111037-3.c | 2 +- .../gcc.target/riscv/rvv/vsetvl/pr112776.c | 36 +++++++++++++++++++ 4 files changed, 46 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112776.c diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index b3e07d4c3aa..1da95daeeb0 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -1497,9 +1497,6 @@ private: { gcc_assert (prev.valid_p () && next.valid_p ()); - if (prev.get_ratio () != next.get_ratio ()) - return false; - if (next.has_vl () && next.vl_used_by_non_rvv_insn_p ()) return false; @@ -2188,7 +2185,7 @@ private: return true; } - bool preds_has_same_avl_p (const vsetvl_info &curr_info) + bool preds_all_same_avl_and_ratio_p (const vsetvl_info &curr_info) { gcc_assert ( !bitmap_empty_p (m_vsetvl_def_in[curr_info.get_bb ()->index ()])); @@ -2200,7 +2197,8 @@ private: { const vsetvl_info &prev_info = *m_vsetvl_def_exprs[expr_index]; if (!prev_info.valid_p () - || !m_dem.avl_available_p (prev_info, curr_info)) + || !m_dem.avl_available_p (prev_info, curr_info) + || prev_info.get_ratio () != curr_info.get_ratio ()) return false; } @@ -3171,7 +3169,7 @@ pre_vsetvl::pre_global_vsetvl_info () curr_info = block_info.local_infos[0]; } if (curr_info.valid_p () && !curr_info.vl_used_by_non_rvv_insn_p () - && preds_has_same_avl_p (curr_info)) + && preds_all_same_avl_and_ratio_p (curr_info)) curr_info.set_change_vtype_only (); vsetvl_info prev_info = vsetvl_info (); @@ -3179,7 +3177,8 @@ pre_vsetvl::pre_global_vsetvl_info () for (auto &curr_info : block_info.local_infos) { if (prev_info.valid_p () && curr_info.valid_p () - && m_dem.avl_available_p (prev_info, curr_info)) + && m_dem.avl_available_p (prev_info, curr_info) + && prev_info.get_ratio () == curr_info.get_ratio ()) curr_info.set_change_vtype_only (); prev_info = curr_info; } diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-84.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-84.c index a584dd97dc0..5cd0f285029 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-84.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-84.c @@ -17,6 +17,6 @@ double f0 (int8_t * restrict in, int8_t * restrict out, int n, int m, unsigned c } /* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e8,\s*m2,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-Os" no-opts "-Oz" no-opts "-O1" no-opts "-g" no-opts "-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*zero,\s*e64,\s*m1,\s*t[au],\s*m[au]} 2 { target { no-opts "-O0" no-opts "-Os" no-opts "-Oz" no-opts "-O1" no-opts "-g" no-opts "-funroll-loops" } } } } */ -/* { dg-final { scan-assembler-times {vsetvli} 3 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ - +/* { dg-final { scan-assembler-not {vsetvli\s+zero,\s*zero} { target { no-opts "-O0" no-opts "-Os" no-opts "-Oz" no-opts "-O1" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetivli} 2 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c index 0f40642c8b6..13344ecdd3b 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr111037-3.c @@ -13,4 +13,4 @@ void foo(_Float16 y, int16_t z, int64_t *i64p) } /* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*1,\s*e64,\s*m1,\s*t[au],\s*m[au]} 1 } } */ -/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*zero,\s*e16,\s*m1,\s*t[au],\s*m[au]} 1 } } */ +/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*1,\s*e16,\s*m1,\s*t[au],\s*m[au]} 1 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112776.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112776.c new file mode 100644 index 00000000000..853690178ac --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr112776.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void +foo (float *r, const float *x) +{ + int i, k; + + vfloat32m4_t x_vec; + vfloat32m4_t x_forward_vec; + vfloat32m4_t temp_vec; + vfloat32m1_t dst_vec; + vfloat32m1_t src_vec; + + float result = 0.0f; + float shift_prev = 0.0f; + + size_t n = 64; + for (size_t vl; n > 0; n -= vl) + { + vl = __riscv_vsetvl_e32m4 (n); + x_vec = __riscv_vle32_v_f32m4 (&x[0], vl); + x_forward_vec = __riscv_vle32_v_f32m4 (&x[0], vl); + temp_vec = __riscv_vfmul_vv_f32m4 (x_vec, x_forward_vec, vl); + src_vec = __riscv_vfmv_s_tu (src_vec, 0.0f, vl); + dst_vec = __riscv_vfmv_s_tu (dst_vec, 0.0f, vl); + dst_vec = __riscv_vfredosum_tu (dst_vec, temp_vec, src_vec, vl); + r[0] = __riscv_vfmv_f_s_f32m1_f32 (dst_vec); + } +} + +/* { dg-final { scan-assembler-times {vsetvli} 1 } } */ +/* { dg-final { scan-assembler-not {vsetivli} } } */ +/* { dg-final { scan-assembler-times {vsetvli\t[a-x0-9]+,\s*[a-x0-9]+,\s*e32,\s*m4,\s*tu,\s*m[au]} 1 } } */