From patchwork Thu Jul 20 08:51:03 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 123089
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V2] RISC-V: Support in-order floating-point reduction
Date: Thu, 20 Jul 2023 16:51:03 +0800
Message-Id: <20230720085103.159227-1-juzhe.zhong@rivai.ai>

This patch depends on:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624995.html

Consider the following case:

float foo (float *__restrict a, int n)
{
  float result = 1.0;
  for (int i = 0; i < n; i++)
    result += a[i];
  return result;
}

Compile **without** -ffast-math:

Before this patch:

:4:21: missed: couldn't vectorize loop
:1:7: missed: not vectorized: relevant phi not supported: result_14 = PHI

After this patch:

foo:
        lui     a5,%hi(.LC0)
        flw     fa0,%lo(.LC0)(a5)
        ble     a1,zero,.L4
.L3:
        vsetvli a5,a1,e32,m1,ta,ma
        vle32.v v1,0(a0)
        slli    a4,a5,2
        sub     a1,a1,a5
        vfmv.s.f        v2,fa0
        add     a0,a0,a4
        vfredosum.vs    v1,v1,v2 ----------> FOLD_LEFT_PLUS
        vfmv.f.s        fa0,v1
        bne     a1,zero,.L3
        ret
.L4:
        ret

gcc/ChangeLog:

        * config/riscv/autovec.md (fold_left_plus_): New pattern.
        (mask_len_fold_left_plus_): Ditto.
        * config/riscv/riscv-protos.h (enum insn_type): New enum.
        (enum reduction_type): Ditto.
        (expand_reduction): Add in-order reduction.
        * config/riscv/riscv-v.cc (emit_nonvlmax_fp_reduction_insn): New function.
        (expand_reduction): Add in-order reduction.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c: New test.

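For readers wondering why the loop above cannot be vectorized with an unordered reduction unless -ffast-math is given: floating-point addition is not associative, so summing in a different order may change the result. The following is a minimal sketch and not part of the patch; the function names `sum_in_order` and `sum_reassociated` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Strict left-to-right sum.  This is the order the source loop
   specifies, and the order an in-order reduction such as
   vfredosum.vs has to preserve.  (Hypothetical helper.)  */
float
sum_in_order (const float *a, size_t n, float init)
{
  assert (a != NULL || n == 0);
  float r = init;
  for (size_t i = 0; i < n; i++)
    r += a[i];
  return r;
}

/* Reassociated sum using two partial accumulators (even and odd
   elements).  This only mimics the kind of reordering an unordered
   reduction is allowed to perform; it is not what any particular
   hardware does.  (Hypothetical helper.)  */
float
sum_reassociated (const float *a, size_t n, float init)
{
  assert (a != NULL || n == 0);
  float even = 0.0f, odd = 0.0f;
  for (size_t i = 0; i < n; i++)
    {
      if (i & 1)
        odd += a[i];
      else
        even += a[i];
    }
  return init + (even + odd);
}
```

Assuming IEEE single precision evaluated without excess precision, the input {1e8f, 1.0f, -1e8f, 1.0f} with an initial value of 0 gives 1.0f in order (the first 1.0f is absorbed by 1e8f) but 2.0f when reassociated, which is why the vectorizer needs a fold-left pattern for the strict case.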
---
 gcc/config/riscv/autovec.md                   | 39 ++++++++++++++
 gcc/config/riscv/riscv-protos.h               | 13 ++++-
 gcc/config/riscv/riscv-v.cc                   | 53 +++++++++++++++----
 .../riscv/rvv/autovec/reduc/reduc_strict-1.c  | 28 ++++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-2.c  | 26 +++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-3.c  | 18 +++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-4.c  | 24 +++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-5.c  | 28 ++++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-6.c  | 18 +++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-7.c  | 21 ++++++++
 .../rvv/autovec/reduc/reduc_strict_run-1.c    | 29 ++++++++++
 .../rvv/autovec/reduc/reduc_strict_run-2.c    | 31 +++++++++++
 12 files changed, 317 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 00947207f3f..667a877d009 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1687,3 +1687,42 @@
   riscv_vector::expand_reduction (SMIN, operands, f);
   DONE;
 })
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Left-to-right reductions
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfredosum.vs
+;; -------------------------------------------------------------------------
+
+;; Unpredicated in-order FP reductions.
+(define_expand "fold_left_plus_"
+  [(match_operand: 0 "register_operand")
+   (match_operand: 1 "register_operand")
+   (match_operand:VF 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_reduction (PLUS, operands,
+                                  operands[1],
+                                  riscv_vector::reduction_type::FOLD_LEFT);
+  DONE;
+})
+
+;; Predicated in-order FP reductions.
+(define_expand "mask_len_fold_left_plus_"
+  [(match_operand: 0 "register_operand")
+   (match_operand: 1 "register_operand")
+   (match_operand:VF 2 "register_operand")
+   (match_operand: 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  if (rtx_equal_p (operands[4], const0_rtx))
+    emit_move_insn (operands[0], operands[1]);
+  else
+    riscv_vector::expand_reduction (PLUS, operands,
+                                    operands[1],
+                                    riscv_vector::reduction_type::MASK_LEN_FOLD_LEFT);
+  DONE;
+})
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 16fb8dabca0..c9520f689e2 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -199,6 +199,7 @@ enum insn_type
   RVV_GATHER_M_OP = 5,
   RVV_SCATTER_M_OP = 4,
   RVV_REDUCTION_OP = 3,
+  RVV_REDUCTION_TU_OP = RVV_REDUCTION_OP + 2,
 };
 enum vlmul_type
 {
@@ -247,7 +248,7 @@ void emit_vlmax_merge_insn (unsigned, int, rtx *);
 void emit_vlmax_cmp_insn (unsigned, rtx *);
 void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
 void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
-void emit_scalar_move_insn (unsigned, rtx *);
+void emit_scalar_move_insn (unsigned, rtx *, rtx = 0);
 void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
 enum vlmul_type get_vlmul (machine_mode);
 unsigned int get_ratio (machine_mode);
@@ -270,6 +271,13 @@ enum mask_policy
   MASK_AGNOSTIC = 1,
   MASK_ANY = 2,
 };
+
+enum class reduction_type
+{
+  UNORDERED,
+  FOLD_LEFT,
+  MASK_LEN_FOLD_LEFT,
+};
 enum tail_policy get_prefer_tail_policy ();
 enum mask_policy get_prefer_mask_policy ();
 rtx get_avl_type_rtx (enum avl_type);
@@ -282,7 +290,8 @@ bool has_vi_variant_p (rtx_code, rtx);
 void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
 bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
 void expand_cond_len_binop (rtx_code, rtx *);
-void expand_reduction (rtx_code, rtx *, rtx);
+void expand_reduction (rtx_code, rtx *, rtx,
+                       reduction_type = reduction_type::UNORDERED);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
                           bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 53088edf909..e338be151d3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1023,11 +1023,11 @@ emit_nonvlmax_fp_tu_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
 /* Emit vmv.s.x instruction.  */

 void
-emit_scalar_move_insn (unsigned icode, rtx *ops)
+emit_scalar_move_insn (unsigned icode, rtx *ops, rtx len)
 {
   machine_mode dest_mode = GET_MODE (ops[0]);
   machine_mode mask_mode = get_mask_mode (dest_mode).require ();
-  insn_expander e (riscv_vector::RVV_SCALAR_MOV_OP,
+  insn_expander e (RVV_SCALAR_MOV_OP,
                    /* HAS_DEST_P */ true,
                    /* FULLY_UNMASKED_P */ false,
                    /* USE_REAL_MERGE_P */ true,
@@ -1038,7 +1038,7 @@ emit_scalar_move_insn (unsigned icode, rtx *ops)

   e.set_policy (TAIL_ANY);
   e.set_policy (MASK_ANY);
-  e.set_vl (CONST1_RTX (Pmode));
+  e.set_vl (len ? len : CONST1_RTX (Pmode));
   e.emit_insn ((enum insn_code) icode, ops);
 }

@@ -1196,6 +1196,26 @@ emit_vlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops)
   e.emit_insn ((enum insn_code) icode, ops);
 }

+/* Emit reduction instruction.  */
+static void
+emit_nonvlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops, rtx vl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (GET_MODE (ops[1])).require ();
+  insn_expander e (op_num,
+                   /* HAS_DEST_P */ true,
+                   /* FULLY_UNMASKED_P */ false,
+                   /* USE_REAL_MERGE_P */ true,
+                   /* HAS_AVL_P */ true,
+                   /* VLMAX_P */ false, dest_mode,
+                   mask_mode);
+
+  e.set_policy (TAIL_ANY);
+  e.set_rounding_mode (FRM_DYN);
+  e.set_vl (vl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* Emit merge instruction.  */

 static machine_mode
@@ -3343,9 +3363,10 @@ expand_cond_len_ternop (unsigned icode, rtx *ops)

 /* Expand reduction operations.  */
 void
-expand_reduction (rtx_code code, rtx *ops, rtx init)
+expand_reduction (rtx_code code, rtx *ops, rtx init, reduction_type type)
 {
-  machine_mode vmode = GET_MODE (ops[1]);
+  rtx vector = type == reduction_type::UNORDERED ? ops[1] : ops[2];
+  machine_mode vmode = GET_MODE (vector);
   machine_mode m1_mode = get_m1_mode (vmode).require ();
   machine_mode m1_mmode = get_mask_mode (m1_mode).require ();

@@ -3353,16 +3374,30 @@ expand_reduction (rtx_code code, rtx *ops, rtx init)
   rtx m1_mask = gen_scalar_move_mask (m1_mmode);
   rtx m1_undef = RVV_VUNDEF (m1_mode);
   rtx scalar_move_ops[] = {m1_tmp, m1_mask, m1_undef, init};
-  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops);
+  rtx len = type == reduction_type::MASK_LEN_FOLD_LEFT ? ops[4] : NULL_RTX;
+  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops,
+                         len);

   rtx m1_tmp2 = gen_reg_rtx (m1_mode);
-  rtx reduc_ops[] = {m1_tmp2, ops[1], m1_tmp};
+  rtx reduc_ops[] = {m1_tmp2, vector, m1_tmp};

   if (FLOAT_MODE_P (vmode) && code == PLUS)
     {
       insn_code icode
-        = code_for_pred_reduc_plus (UNSPEC_UNORDERED, vmode, m1_mode);
-      emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
+        = code_for_pred_reduc_plus (type == reduction_type::UNORDERED
+                                      ? UNSPEC_UNORDERED
+                                      : UNSPEC_ORDERED,
+                                    vmode, m1_mode);
+      if (type == reduction_type::MASK_LEN_FOLD_LEFT)
+        {
+          rtx mask = ops[3];
+          rtx mask_len_reduc_ops[]
+            = {m1_tmp2, mask, RVV_VUNDEF (m1_mode), vector, m1_tmp};
+          emit_nonvlmax_fp_reduction_insn (icode, RVV_REDUCTION_TU_OP,
+                                           mask_len_reduc_ops, len);
+        }
+      else
+        emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
     }
   else
     {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
new file mode 100644
index 00000000000..c293e9ae746
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include
+
+#define NUM_ELEMS(TYPE) ((int)(5 * (256 / sizeof (TYPE)) + 3))
+
+#define DEF_REDUC_PLUS(TYPE) \
+  TYPE __attribute__ ((noinline, noclone)) \
+  reduc_plus_##TYPE (TYPE *a, TYPE *b) \
+  { \
+    TYPE r = 0, q = 3; \
+    for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+      { \
+        r += a[i]; \
+        q -= b[i]; \
+      } \
+    return r * q; \
+  }
+
+#define TEST_ALL(T) \
+  T (_Float16) \
+  T (float) \
+  T (double)
+
+TEST_ALL (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
new file mode 100644
index 00000000000..2e1e7ab674d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#define NUM_ELEMS(TYPE) ((int) (5 * (256 / sizeof (TYPE)) + 3))
+
+#define DEF_REDUC_PLUS(TYPE) \
+void __attribute__ ((noinline, noclone)) \
+reduc_plus_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
+                   TYPE *restrict r, int n) \
+{ \
+  for (int i = 0; i < n; i++) \
+    { \
+      r[i] = 0; \
+      for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
+        r[i] += a[i][j]; \
+    } \
+}
+
+#define TEST_ALL(T) \
+  T (_Float16) \
+  T (float) \
+  T (double)
+
+TEST_ALL (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
new file mode 100644
index 00000000000..f559d40e60f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][2];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
new file mode 100644
index 00000000000..428d371d9cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][8];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+      tmp = tmp + mat[i][2];
+      tmp = tmp + mat[i][3];
+      tmp = tmp + mat[i][4];
+      tmp = tmp + mat[i][5];
+      tmp = tmp + mat[i][6];
+      tmp = tmp + mat[i][7];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
new file mode 100644
index 00000000000..24add2291f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][12];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+      tmp = tmp + mat[i][2];
+      tmp = tmp + mat[i][3];
+      tmp = tmp + mat[i][4];
+      tmp = tmp + mat[i][5];
+      tmp = tmp + mat[i][6];
+      tmp = tmp + mat[i][7];
+      tmp = tmp + mat[i][8];
+      tmp = tmp + mat[i][9];
+      tmp = tmp + mat[i][10];
+      tmp = tmp + mat[i][11];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
new file mode 100644
index 00000000000..c1567b067ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+float
+double_reduc (float (*i)[16])
+{
+  float l = 0;
+
+#pragma GCC unroll 0
+  for (int a = 0; a < 8; a++)
+    for (int b = 0; b < 100; b++)
+      l += i[b][a];
+  return l;
+}
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
+/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
new file mode 100644
index 00000000000..f742a824bb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+float
+double_reduc (float *i, float *j)
+{
+  float k = 0, l = 0;
+
+  for (int a = 0; a < 8; a++)
+    for (int b = 0; b < 100; b++)
+      {
+        k += i[b];
+        l += j[b];
+      }
+  return l * k;
+}
+
+/* { dg-final { scan-assembler-times {vle32\.v} 2 } } */
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
+/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
new file mode 100644
index 00000000000..516be97e9eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "reduc_strict-1.c"
+
+#define TEST_REDUC_PLUS(TYPE) \
+  { \
+    TYPE a[NUM_ELEMS (TYPE)]; \
+    TYPE b[NUM_ELEMS (TYPE)]; \
+    TYPE r = 0, q = 3; \
+    for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
+      { \
+        a[i] = (i * 0.1) * (i & 1 ? 1 : -1); \
+        b[i] = (i * 0.3) * (i & 1 ? 1 : -1); \
+        r += a[i]; \
+        q -= b[i]; \
+        asm volatile ("" ::: "memory"); \
+      } \
+    TYPE res = reduc_plus_##TYPE (a, b); \
+    if (res != r * q) \
+      __builtin_abort (); \
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (TEST_REDUC_PLUS);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
new file mode 100644
index 00000000000..0a4238d96f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
@@ -0,0 +1,31 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "reduc_strict-2.c"
+
+#define NROWS 5
+
+#define TEST_REDUC_PLUS(TYPE) \
+  { \
+    TYPE a[NROWS][NUM_ELEMS (TYPE)]; \
+    TYPE r[NROWS]; \
+    TYPE expected[NROWS] = {}; \
+    for (int i = 0; i < NROWS; ++i) \
+      for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
+        { \
+          a[i][j] = (i * 0.1 + j * 0.6) * (j & 1 ? 1 : -1); \
+          expected[i] += a[i][j]; \
+          asm volatile ("" ::: "memory"); \
+        } \
+    reduc_plus_##TYPE (a, r, NROWS); \
+    for (int i = 0; i < NROWS; ++i) \
+      if (r[i] != expected[i]) \
+        __builtin_abort (); \
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (TEST_REDUC_PLUS);
+  return 0;
+}
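
As a reading aid for the two expanders added in autovec.md, here is a scalar reference model of what the fold-left patterns are expected to compute. It is a sketch under stated assumptions and not code from the patch: the names `fold_left_plus_ref` and `mask_len_fold_left_plus_ref` are hypothetical, the mask is modelled as one byte per element, and the bias operand is taken to be 0, as the `const_0_operand` predicate on operand 5 requires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of the unpredicated pattern: start from
   INIT (operand 1) and add every element of VEC (operand 2) strictly
   from left to right.  */
float
fold_left_plus_ref (float init, const float *vec, size_t nelts)
{
  assert (vec != NULL || nelts == 0);
  float acc = init;
  for (size_t i = 0; i < nelts; i++)
    acc += vec[i];
  return acc;
}

/* Hypothetical scalar model of the predicated pattern: only elements
   whose index is below LEN (operand 4, bias assumed 0) and whose MASK
   entry (operand 3) is nonzero take part.  With LEN == 0 the result is
   INIT unchanged, matching the emit_move_insn shortcut in the
   expander.  */
float
mask_len_fold_left_plus_ref (float init, const float *vec,
                             const unsigned char *mask, size_t len)
{
  assert ((vec != NULL && mask != NULL) || len == 0);
  float acc = init;
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      acc += vec[i];
  return acc;
}
```

A model like this can serve as the expected-value side of a run test: the vectorized result should compare equal bit for bit, because the order of additions is the same.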