From patchwork Thu Jul 20 08:51:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
X-Patchwork-Id: 123089
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp2977452vqt;
        Thu, 20 Jul 2023 01:53:11 -0700 (PDT)
X-Google-Smtp-Source: 
 APBJJlEvFfOaGoHYVMOdIzNRE3d2GkKg2KZtbdB9eBPuYBBWREL6LFp5jJoKZijTXeWNN61rItIe
X-Received: by 2002:a17:907:16a3:b0:995:3c9e:a629 with SMTP id
 hc35-20020a17090716a300b009953c9ea629mr5306487ejc.31.1689843191396;
        Thu, 20 Jul 2023 01:53:11 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1689843191; cv=none;
        d=google.com; s=arc-20160816;
        b=JyKI2VMGE6DRgCMKHjhSFvNE0lQpkDEPL6Fmm+3oGkN3La4Qi8Rz8eqEmOplmINN7M
         zbfFYDs/NVXIr58a0K3Z1wjI7wTcWNyfw6A0BX3oKgZAz2Y9J1Bta70rEGOvrxfKTONp
         XjZDXcGkkqfVONMScCUQZpJG2JpzPLQvEme4Jm4v9R1Upaxcs3b4bgLufxocsMCQ8FRj
         LzeRCZ6i03mHyW6P1Liz2PfwIZEUtr49hJc+jbsuqrlrJwhz8pfAJjbn0hQaSkGzJ8wT
         JkMO/uwp/1TBzvOixPWcNcbrP8a/rM7zYwOJdFbLpiZpYSxzR8uZMYHVTLQHBKXpVLnF
         Hecg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=sender:errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:feedback-id
         :content-transfer-encoding:mime-version:message-id:date:subject:cc
         :to:from:dmarc-filter:delivered-to;
        bh=nNMJULOm/qUcB3M/nkh/zwvvlbXDmBpkxdqrbK9gPyQ=;
        fh=SuV1mxSfYh/fFJBV6FW8ZDQUWC7OLSIDYxyJSOKFLBQ=;
        b=ve5T/+RKqDo5chl7lJEIqR+QCbHzeC1aCsUyQHQTfpJTkWYrOWDABZfxu14k/jjBMw
         nkF4Jy7+QkNQYqPO06hIgwhtEFdxtnTaosxpuoQsHoO6BRlDtapnxxFzW/mJT4rGDvZK
         A8P2DMMbjlqZn2HJBiO1crVDwdjwF4BfmkuZeCeQSkfkoAPi84xj8u8Ttzs+T7scNqc4
         o46tRIb0EEaojRY3eGyp9WnSGprN3HAolVSs5Nh13Vz6DFFypoX3lqPZ+HosaaM8nYKt
         s5OyGsOAFZL1VKok+S9bf6szM6cvalOgBXtM8JRLjO/a88KCv3rQxbtHnnngPezmAsLv
         kAug==
ARC-Authentication-Results: i=1; mx.google.com;
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org.
 [2620:52:3:1:0:246e:9693:128c])
        by mx.google.com with ESMTPS id
 k17-20020a170906055100b00988c552332fsi391710eja.300.2023.07.20.01.53.10
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Thu, 20 Jul 2023 01:53:11 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 client-ip=2620:52:3:1:0:246e:9693:128c;
Authentication-Results: mx.google.com;
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 468AB385C41F
	for <ouuuleilei@gmail.com>; Thu, 20 Jul 2023 08:52:46 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from smtpbg153.qq.com (smtpbg153.qq.com [13.245.218.24])
 by sourceware.org (Postfix) with ESMTPS id 462CD3858280
 for <gcc-patches@gcc.gnu.org>; Thu, 20 Jul 2023 08:51:13 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 462CD3858280
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=rivai.ai
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai
X-QQ-mid: bizesmtp81t1689843065traaailv
Received: from server1.localdomain ( [58.60.1.22])
 by bizesmtp.qq.com (ESMTP) with
 id ; Thu, 20 Jul 2023 16:51:04 +0800 (CST)
X-QQ-SSF: 01400000000000G0U000000A0000000
X-QQ-FEAT: 3M0okmaRx3j13bav2o7CZVTMzdY3NKdX93Qft49q+COjd89LcZST0Up15W9SO
 JixHqgHqmu0Xiru+LdYyCXXTHtu4a6RTtXbOh/b0xtEZIgOPH/u/YRXX3aeoXCIzR7hZaT7
 JAmBmxKqQO4w1WkNv4KX6xcRUYuaS9g6U+QR3WrQW6i4DYNu/bbIFjkqfFB8o/9GnLVCx97
 YPBkLeVkfUS/w9KspImIa5y6W2FZu6KZeI60xpIf0KYA4rH7JHaH5/SVsHScNrzClGZBoHi
 TZqCRbQX/W+DU7sDSB6z42u9fYgyzrBVlXpLtZClO5Yrxr08Q16oUUTLLJhH66wHlTZebaR
 byCWz+uCAvncWRZNoCU8p1Bi5i7kTsaFIy1gnkYOFfbbkkzmJ4aFtuV/XzzgxQ86PrTtSDT
 P6voboWFjAI=
X-QQ-GoodBg: 2
X-BIZMAIL-ID: 2177787830173568740
From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Juzhe-Zhong <juzhe.zhong@rivai.ai>
Subject: [PATCH V2] RISC-V: Support in-order floating-point reduction
Date: Thu, 20 Jul 2023 16:51:03 +0800
Message-Id: <20230720085103.159227-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.1
MIME-Version: 1.0
X-QQ-SENDSIZE: 520
Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0
X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS,
 SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1771929014437809146
X-GMAIL-MSGID: 1771929014437809146

This patch is depending on:
https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624995.html

Consider this following case:
float foo (float *__restrict a, int n)
{
  float result = 1.0;
  for (int i = 0; i < n; i++)
   result += a[i];
  return result;
}

Compile with **NO** -ffast-math:

Before this patch:
<source>:4:21: missed: couldn't vectorize loop
<source>:1:7: missed: not vectorized: relevant phi not supported: result_14 = PHI <result_11(6), 1.0e+0(5)>

After this patch:
foo:
	lui	a5,%hi(.LC0)
	flw	fa0,%lo(.LC0)(a5)
	ble	a1,zero,.L4
.L3:
	vsetvli	a5,a1,e32,m1,ta,ma
	vle32.v	v1,0(a0)
	slli	a4,a5,2
	sub	a1,a1,a5
	vfmv.s.f	v2,fa0
	add	a0,a0,a4
	vfredosum.vs	v1,v1,v2     ----------> FOLD_LEFT_PLUS
	vfmv.f.s	fa0,v1
	bne	a1,zero,.L3
	ret
.L4:
	ret

gcc/ChangeLog:

	* config/riscv/autovec.md (fold_left_plus_<mode>): New pattern.
	(mask_len_fold_left_plus_<mode>): Ditto.
	* config/riscv/riscv-protos.h (enum insn_type): New enum.
	(enum reduction_type): Ditto.
	(expand_reduction): Add in-order reduction.
	* config/riscv/riscv-v.cc (emit_nonvlmax_fp_reduction_insn): New function.
	(expand_reduction): Add in-order reduction.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c: New test.
---
 gcc/config/riscv/autovec.md                   | 39 ++++++++++++++
 gcc/config/riscv/riscv-protos.h               | 13 ++++-
 gcc/config/riscv/riscv-v.cc                   | 53 +++++++++++++++----
 .../riscv/rvv/autovec/reduc/reduc_strict-1.c  | 28 ++++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-2.c  | 26 +++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-3.c  | 18 +++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-4.c  | 24 +++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-5.c  | 28 ++++++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-6.c  | 18 +++++++
 .../riscv/rvv/autovec/reduc/reduc_strict-7.c  | 21 ++++++++
 .../rvv/autovec/reduc/reduc_strict_run-1.c    | 29 ++++++++++
 .../rvv/autovec/reduc/reduc_strict_run-2.c    | 31 +++++++++++
 12 files changed, 317 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 00947207f3f..667a877d009 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -1687,3 +1687,42 @@
   riscv_vector::expand_reduction (SMIN, operands, f);
   DONE;
 })
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Left-to-right reductions
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfredosum.vs
+;; -------------------------------------------------------------------------
+
+;; Unpredicated in-order FP reductions.
+(define_expand "fold_left_plus_<mode>"
+  [(match_operand:<VEL> 0 "register_operand")
+   (match_operand:<VEL> 1 "register_operand")
+   (match_operand:VF 2 "register_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_reduction (PLUS, operands,
+				  operands[1],
+				  riscv_vector::reduction_type::FOLD_LEFT);
+  DONE;
+})
+
+;; Predicated in-order FP reductions.
+(define_expand "mask_len_fold_left_plus_<mode>"
+  [(match_operand:<VEL> 0 "register_operand")
+   (match_operand:<VEL> 1 "register_operand")
+   (match_operand:VF 2 "register_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+{
+  if (rtx_equal_p (operands[4], const0_rtx))
+    emit_move_insn (operands[0], operands[1]);
+  else
+    riscv_vector::expand_reduction (PLUS, operands,
+				    operands[1],
+				    riscv_vector::reduction_type::MASK_LEN_FOLD_LEFT);
+  DONE;
+})
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 16fb8dabca0..c9520f689e2 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -199,6 +199,7 @@ enum insn_type
   RVV_GATHER_M_OP = 5,
   RVV_SCATTER_M_OP = 4,
   RVV_REDUCTION_OP = 3,
+  RVV_REDUCTION_TU_OP = RVV_REDUCTION_OP + 2,
 };
 enum vlmul_type
 {
@@ -247,7 +248,7 @@ void emit_vlmax_merge_insn (unsigned, int, rtx *);
 void emit_vlmax_cmp_insn (unsigned, rtx *);
 void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
 void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
-void emit_scalar_move_insn (unsigned, rtx *);
+void emit_scalar_move_insn (unsigned, rtx *, rtx = 0);
 void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
 enum vlmul_type get_vlmul (machine_mode);
 unsigned int get_ratio (machine_mode);
@@ -270,6 +271,13 @@ enum mask_policy
   MASK_AGNOSTIC = 1,
   MASK_ANY = 2,
 };
+
+enum class reduction_type
+{
+  UNORDERED,
+  FOLD_LEFT,
+  MASK_LEN_FOLD_LEFT,
+};
 enum tail_policy get_prefer_tail_policy ();
 enum mask_policy get_prefer_mask_policy ();
 rtx get_avl_type_rtx (enum avl_type);
@@ -282,7 +290,8 @@ bool has_vi_variant_p (rtx_code, rtx);
 void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
 bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
 void expand_cond_len_binop (rtx_code, rtx *);
-void expand_reduction (rtx_code, rtx *, rtx);
+void expand_reduction (rtx_code, rtx *, rtx,
+		       reduction_type = reduction_type::UNORDERED);
 #endif
 bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
 			  bool, void (*)(rtx *, rtx));
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 53088edf909..e338be151d3 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1023,11 +1023,11 @@ emit_nonvlmax_fp_tu_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
 /* Emit vmv.s.x instruction.  */
 
 void
-emit_scalar_move_insn (unsigned icode, rtx *ops)
+emit_scalar_move_insn (unsigned icode, rtx *ops, rtx len)
 {
   machine_mode dest_mode = GET_MODE (ops[0]);
   machine_mode mask_mode = get_mask_mode (dest_mode).require ();
-  insn_expander<RVV_INSN_OPERANDS_MAX> e (riscv_vector::RVV_SCALAR_MOV_OP,
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (RVV_SCALAR_MOV_OP,
 					  /* HAS_DEST_P */ true,
 					  /* FULLY_UNMASKED_P */ false,
 					  /* USE_REAL_MERGE_P */ true,
@@ -1038,7 +1038,7 @@ emit_scalar_move_insn (unsigned icode, rtx *ops)
 
   e.set_policy (TAIL_ANY);
   e.set_policy (MASK_ANY);
-  e.set_vl (CONST1_RTX (Pmode));
+  e.set_vl (len ? len : CONST1_RTX (Pmode));
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
@@ -1196,6 +1196,26 @@ emit_vlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops)
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
+/* Emit reduction instruction.  */
+static void
+emit_nonvlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops, rtx vl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (GET_MODE (ops[1])).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (op_num,
+					  /* HAS_DEST_P */ true,
+					  /* FULLY_UNMASKED_P */ false,
+					  /* USE_REAL_MERGE_P */ true,
+					  /* HAS_AVL_P */ true,
+					  /* VLMAX_P */ false, dest_mode,
+					  mask_mode);
+
+  e.set_policy (TAIL_ANY);
+  e.set_rounding_mode (FRM_DYN);
+  e.set_vl (vl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* Emit merge instruction.  */
 
 static machine_mode
@@ -3343,9 +3363,10 @@ expand_cond_len_ternop (unsigned icode, rtx *ops)
 
 /* Expand reduction operations.  */
 void
-expand_reduction (rtx_code code, rtx *ops, rtx init)
+expand_reduction (rtx_code code, rtx *ops, rtx init, reduction_type type)
 {
-  machine_mode vmode = GET_MODE (ops[1]);
+  rtx vector = type == reduction_type::UNORDERED ? ops[1] : ops[2];
+  machine_mode vmode = GET_MODE (vector);
   machine_mode m1_mode = get_m1_mode (vmode).require ();
   machine_mode m1_mmode = get_mask_mode (m1_mode).require ();
 
@@ -3353,16 +3374,30 @@ expand_reduction (rtx_code code, rtx *ops, rtx init)
   rtx m1_mask = gen_scalar_move_mask (m1_mmode);
   rtx m1_undef = RVV_VUNDEF (m1_mode);
   rtx scalar_move_ops[] = {m1_tmp, m1_mask, m1_undef, init};
-  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops);
+  rtx len = type == reduction_type::MASK_LEN_FOLD_LEFT ? ops[4] : NULL_RTX;
+  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops,
+			 len);
 
   rtx m1_tmp2 = gen_reg_rtx (m1_mode);
-  rtx reduc_ops[] = {m1_tmp2, ops[1], m1_tmp};
+  rtx reduc_ops[] = {m1_tmp2, vector, m1_tmp};
 
   if (FLOAT_MODE_P (vmode) && code == PLUS)
     {
       insn_code icode
-	= code_for_pred_reduc_plus (UNSPEC_UNORDERED, vmode, m1_mode);
-      emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
+	= code_for_pred_reduc_plus (type == reduction_type::UNORDERED
+				      ? UNSPEC_UNORDERED
+				      : UNSPEC_ORDERED,
+				    vmode, m1_mode);
+      if (type == reduction_type::MASK_LEN_FOLD_LEFT)
+	{
+	  rtx mask = ops[3];
+	  rtx mask_len_reduc_ops[]
+	    = {m1_tmp2, mask, RVV_VUNDEF (m1_mode), vector, m1_tmp};
+	  emit_nonvlmax_fp_reduction_insn (icode, RVV_REDUCTION_TU_OP,
+					   mask_len_reduc_ops, len);
+	}
+      else
+	emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
     }
   else
     {
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
new file mode 100644
index 00000000000..c293e9ae746
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include <stdint-gcc.h>
+
+#define NUM_ELEMS(TYPE) ((int)(5 * (256 / sizeof (TYPE)) + 3))
+
+#define DEF_REDUC_PLUS(TYPE)			\
+  TYPE __attribute__ ((noinline, noclone))	\
+  reduc_plus_##TYPE (TYPE *a, TYPE *b)		\
+  {						\
+    TYPE r = 0, q = 3;				\
+    for (int i = 0; i < NUM_ELEMS (TYPE); i++)	\
+      {						\
+	r += a[i];				\
+	q -= b[i];				\
+      }						\
+    return r * q;				\
+  }
+
+#define TEST_ALL(T) \
+  T (_Float16) \
+  T (float) \
+  T (double)
+
+TEST_ALL (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
new file mode 100644
index 00000000000..2e1e7ab674d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#define NUM_ELEMS(TYPE) ((int) (5 * (256 / sizeof (TYPE)) + 3))
+
+#define DEF_REDUC_PLUS(TYPE)					\
+void __attribute__ ((noinline, noclone))			\
+reduc_plus_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)],	\
+		   TYPE *restrict r, int n)			\
+{								\
+  for (int i = 0; i < n; i++)					\
+    {								\
+      r[i] = 0;							\
+      for (int j = 0; j < NUM_ELEMS (TYPE); j++)		\
+        r[i] += a[i][j];					\
+    }								\
+}
+
+#define TEST_ALL(T) \
+  T (_Float16) \
+  T (float) \
+  T (double)
+
+TEST_ALL (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
new file mode 100644
index 00000000000..f559d40e60f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][2];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
new file mode 100644
index 00000000000..428d371d9cf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][8];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+      tmp = tmp + mat[i][2];
+      tmp = tmp + mat[i][3];
+      tmp = tmp + mat[i][4];
+      tmp = tmp + mat[i][5];
+      tmp = tmp + mat[i][6];
+      tmp = tmp + mat[i][7];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
new file mode 100644
index 00000000000..24add2291f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+double mat[100][12];
+
+double
+slp_reduc_plus (int n)
+{
+  double tmp = 0.0;
+  for (int i = 0; i < n; i++)
+    {
+      tmp = tmp + mat[i][0];
+      tmp = tmp + mat[i][1];
+      tmp = tmp + mat[i][2];
+      tmp = tmp + mat[i][3];
+      tmp = tmp + mat[i][4];
+      tmp = tmp + mat[i][5];
+      tmp = tmp + mat[i][6];
+      tmp = tmp + mat[i][7];
+      tmp = tmp + mat[i][8];
+      tmp = tmp + mat[i][9];
+      tmp = tmp + mat[i][10];
+      tmp = tmp + mat[i][11];
+    }
+  return tmp;
+}
+
+/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
new file mode 100644
index 00000000000..c1567b067ba
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+float
+double_reduc (float (*i)[16])
+{
+  float l = 0;
+
+#pragma GCC unroll 0
+  for (int a = 0; a < 8; a++)
+    for (int b = 0; b < 100; b++)
+      l += i[b][a];
+  return l;
+}
+
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
+/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
+/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
new file mode 100644
index 00000000000..f742a824bb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+float
+double_reduc (float *i, float *j)
+{
+  float k = 0, l = 0;
+
+  for (int a = 0; a < 8; a++)
+    for (int b = 0; b < 100; b++)
+      {
+        k += i[b];
+        l += j[b];
+      }
+  return l * k;
+}
+
+/* { dg-final { scan-assembler-times {vle32\.v} 2 } } */
+/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
+/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
+/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
new file mode 100644
index 00000000000..516be97e9eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
@@ -0,0 +1,29 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "reduc_strict-1.c"
+
+#define TEST_REDUC_PLUS(TYPE)			\
+  {						\
+    TYPE a[NUM_ELEMS (TYPE)];			\
+    TYPE b[NUM_ELEMS (TYPE)];			\
+    TYPE r = 0, q = 3;				\
+    for (int i = 0; i < NUM_ELEMS (TYPE); i++)	\
+      {						\
+	a[i] = (i * 0.1) * (i & 1 ? 1 : -1);	\
+	b[i] = (i * 0.3) * (i & 1 ? 1 : -1);	\
+	r += a[i];				\
+	q -= b[i];				\
+	asm volatile ("" ::: "memory");		\
+      }						\
+    TYPE res = reduc_plus_##TYPE (a, b);	\
+    if (res != r * q)				\
+      __builtin_abort ();			\
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (TEST_REDUC_PLUS);
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
new file mode 100644
index 00000000000..0a4238d96f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
@@ -0,0 +1,31 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
+
+#include "reduc_strict-2.c"
+
+#define NROWS 5
+
+#define TEST_REDUC_PLUS(TYPE)					\
+  {								\
+    TYPE a[NROWS][NUM_ELEMS (TYPE)];				\
+    TYPE r[NROWS];						\
+    TYPE expected[NROWS] = {};					\
+    for (int i = 0; i < NROWS; ++i)				\
+      for (int j = 0; j < NUM_ELEMS (TYPE); ++j)		\
+	{							\
+	  a[i][j] = (i * 0.1 + j * 0.6) * (j & 1 ? 1 : -1);	\
+	  expected[i] += a[i][j];				\
+	  asm volatile ("" ::: "memory");			\
+	}							\
+    reduc_plus_##TYPE (a, r, NROWS);				\
+    for (int i = 0; i < NROWS; ++i)				\
+      if (r[i] != expected[i])					\
+	__builtin_abort ();					\
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (TEST_REDUC_PLUS);
+  return 0;
+}