From patchwork Wed Dec 27 01:52:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 183390
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Disallow transformation into VLMAX AVL for cond_len_xxx when length is in range [0, 31]
Date: Wed, 27 Dec 2023 09:52:22 +0800
Message-Id: <20231227015222.3393770-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list

Consider the following situation:

        vsetivli        zero,4,e32,m1,ta,ma
        vlseg4e32.v     v4,(a5)
        vlseg4e32.v     v12,(a3)
        vsetvli a5,zero,e32,m1,tu,ma  ---> This is redundant since VLMAX AVL = 4 when it is fixed-vlmax
        vfadd.vf        v3,v13,fa0
        vfadd.vf        v1,v12,fa1
        vfmul.vv        v17,v3,v5
        vfmul.vv        v16,v1,v5

The root cause is that we blindly transform COND_LEN_xxx into VLMAX AVL
whenever len == NUNITS.  However, we don't need to transform all of them:
when len is in the range [0, 31], it can be used as an immediate AVL
(vsetivli), so we don't need to consume a scalar register.

After this patch:

        vsetivli        zero,4,e32,m1,tu,ma
        addi    a4,a5,400
        vlseg4e32.v     v12,(a3)
        vfadd.vf        v3,v13,fa0
        vfadd.vf        v1,v12,fa1
        vlseg4e32.v     v4,(a4)
        vfadd.vf        v2,v14,fa1
        vfmul.vv        v17,v3,v5
        vfmul.vv        v16,v1,v5

Tested on both RV32 and RV64, no regression.

OK for trunk?

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (is_vlmax_len_p): New function.
        (expand_load_store): Disallow transformation into VLMAX when len is
        in range of [0,31].
        (expand_cond_len_op): Ditto.
        (expand_gather_scatter): Ditto.
        (expand_lanes_load_store): Ditto.
        (expand_fold_extract_last): Ditto.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/post-ra-avl.c: Adapt test.
        * gcc.target/riscv/rvv/base/vf_avl-2.c: New test.

---
 gcc/config/riscv/riscv-v.cc                   | 21 +++++++++++++------
 .../riscv/rvv/autovec/post-ra-avl.c           |  2 +-
 .../gcc.target/riscv/rvv/base/vf_avl-2.c      | 21 +++++++++++++++++++
 3 files changed, 37 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-2.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 038ab084a37..0cc7af58da6 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -68,6 +68,16 @@ imm_avl_p (machine_mode mode)
           : false;
 }
 
+/* Return true if LEN is equal to NUNITS that outbounds range of [0, 31]. */
+static bool
+is_vlmax_len_p (machine_mode mode, rtx len)
+{
+  poly_int64 value;
+  return poly_int_rtx_p (len, &value)
+         && known_eq (value, GET_MODE_NUNITS (mode))
+         && !satisfies_constraint_K (len);
+}
+
 /* Helper functions for insn_flags && insn_types */
 
 /* Return true if caller need pass mask operand for insn pattern with
@@ -3776,7 +3786,7 @@ expand_load_store (rtx *ops, bool is_load)
   rtx len = ops[3];
   machine_mode mode = GET_MODE (ops[0]);
 
-  if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+  if (is_vlmax_len_p (mode, len))
     {
       /* If the length operand is equal to VF, it is VLMAX load/store. */
       if (is_load)
@@ -3842,8 +3852,7 @@ expand_cond_len_op (unsigned icode, insn_flags op_type, rtx *ops, rtx len)
   machine_mode mask_mode = GET_MODE (mask);
   poly_int64 value;
   bool is_dummy_mask = rtx_equal_p (mask, CONSTM1_RTX (mask_mode));
-  bool is_vlmax_len
-    = poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode));
+  bool is_vlmax_len = is_vlmax_len_p (mode, len);
 
   unsigned insn_flags = HAS_DEST_P | HAS_MASK_P | HAS_MERGE_P | op_type;
   if (is_dummy_mask)
@@ -4012,7 +4021,7 @@ expand_gather_scatter (rtx *ops, bool is_load)
   unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
   poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
   poly_int64 value;
-  bool is_vlmax = poly_int_rtx_p (len, &value) && known_eq (value, nunits);
+  bool is_vlmax = is_vlmax_len_p (vec_mode, len);
 
   /* Extend the offset element to address width. */
   if (inner_offsize < BITS_PER_WORD)
@@ -4199,7 +4208,7 @@ expand_lanes_load_store (rtx *ops, bool is_load)
   rtx reg = is_load ? ops[0] : ops[1];
   machine_mode mode = GET_MODE (ops[0]);
 
-  if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+  if (is_vlmax_len_p (mode, len))
     {
       /* If the length operand is equal to VF, it is VLMAX load/store. */
       if (is_load)
@@ -4252,7 +4261,7 @@ expand_fold_extract_last (rtx *ops)
   rtx slide_vect = gen_reg_rtx (mode);
   insn_code icode;
 
-  if (poly_int_rtx_p (len, &value) && known_eq (value, GET_MODE_NUNITS (mode)))
+  if (is_vlmax_len_p (mode, len))
     len = NULL_RTX;
 
   /* Calculate the number of 1-bit in mask. */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/post-ra-avl.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/post-ra-avl.c
index f3d12bac7cd..c77b2d187fe 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/post-ra-avl.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/post-ra-avl.c
@@ -13,4 +13,4 @@ int foo() {
   return a;
 }
 
-/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero} 1 } } */
+/* { dg-final { scan-assembler-not {vsetvli} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-2.c
new file mode 100644
index 00000000000..5a94a51f308
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=fixed-vlmax" } */
+
+float f[12][100];
+
+void bad1(float v1, float v2)
+{
+  for (int r = 0; r < 100; r += 4)
+    {
+      int i = r + 1;
+      f[0][r] = f[1][r] * (f[2][r] + v2) - f[1][i] * (f[2][i] + v1);
+      f[0][i] = f[1][r] * (f[2][i] + v1) + f[1][i] * (f[2][r] + v2);
+      f[0][r+2] = f[1][r+2] * (f[2][r+2] + v2) - f[1][i+2] * (f[2][i+2] + v1);
+      f[0][i+2] = f[1][r+2] * (f[2][i+2] + v1) + f[1][i+2] * (f[2][r+2] + v2);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*4,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 } } */
+/* { dg-final { scan-assembler-times {vsetivli\s+zero,\s*1,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 } } */
+/* { dg-final { scan-assembler-times {vsetivli} 2 } } */
+/* { dg-final { scan-assembler-not {vsetvli} } } */
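
The decision made by is_vlmax_len_p can be modelled outside GCC. What follows is a minimal sketch, not GCC code. It assumes LEN and NUNITS are plain compile-time integers, whereas the real function works on rtx and poly_int64 values. The names fits_uimm5, is_vlmax_len, AvlKind and classify_avl are hypothetical and exist only for illustration. Constraint K is treated as the unsigned 5-bit immediate range [0, 31] that vsetivli can encode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for satisfies_constraint_K: true when the value
// fits the unsigned 5-bit immediate field of vsetivli, i.e. [0, 31].
static bool fits_uimm5(std::int64_t len) { return len >= 0 && len <= 31; }

// Simplified model of is_vlmax_len_p: take the VLMAX path only when the
// length equals the number of units AND cannot be encoded as an immediate.
static bool is_vlmax_len(std::int64_t len, std::int64_t nunits) {
  return len == nunits && !fits_uimm5(len);
}

// How the AVL would be supplied under this model (an assumption made for
// illustration; the actual vsetvl selection happens elsewhere in GCC).
enum class AvlKind {
  Immediate,  // vsetivli zero,<len>,...  no scalar register needed
  Vlmax,      // vsetvli  rd,zero,...     writes a scalar register
  Register    // vsetvli  zero,rs1,...    length held in a scalar register
};

static AvlKind classify_avl(std::int64_t len, std::int64_t nunits) {
  if (is_vlmax_len(len, nunits))
    return AvlKind::Vlmax;
  // A small constant length is preferred as an immediate even when it
  // happens to equal NUNITS; this is the case the patch stops converting.
  if (fits_uimm5(len))
    return AvlKind::Immediate;
  return AvlKind::Register;
}
```

In the situation from the commit message, NUNITS is 4, so classify_avl (4, 4) yields AvlKind::Immediate under this model, which corresponds to the single "vsetivli zero,4,e32,m1" the new test expects. A length of 64 with NUNITS of 64 would still take the VLMAX path.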