From patchwork Thu Dec 14 03:23:43 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 178440
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Add RVV builtin vectorization cost model
Date: Thu, 14 Dec 2023 11:23:43 +0800
Message-Id: <20231214032343.124505-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

This patch fixes PR111153:

	ble	a1,zero,.L8
	addiw	a5,a1,-1
	li	a4,4
	addi	sp,sp,-16
	mv	a2,a0
	sext.w	a3,a1
	bleu	a5,a4,.L9
	srliw	a4,a3,2
	slli	a4,a4,4
	mv	a5,a0
	add	a4,a4,a0
	vsetivli	zero,4,e32,m1,ta,ma
	vmv.v.i	v1,0
	vse32.v	v1,0(sp)
.L4:
	vle32.v	v1,0(a5)	---> This loop always processes 4 elements, which is fine for VLEN = 128 bits but wastes a huge amount of computation units when VLEN > 128 bits
	vle32.v	v2,0(sp)
	addi	a5,a5,16
	vadd.vv	v1,v2,v1
	vse32.v	v1,0(sp)
	bne	a4,a5,.L4
	ld	a5,0(sp)
	lw	a4,0(sp)
	andi	a1,a1,-4
	srai	a5,a5,32
	addw	a5,a4,a5
	lw	a4,8(sp)
	addw	a5,a5,a4
	ld	a4,8(sp)
	srai	a4,a4,32
	addw	a0,a5,a4
	beq	a3,a1,.L15
.L3:
	subw	a3,a3,a1
	slli	a5,a1,32
	slli	a3,a3,32
	srli	a3,a3,32
	srli	a5,a5,30
	add	a2,a2,a5
	vsetvli	a5,a3,e8,mf4,tu,mu
	vsetvli	a4,zero,e32,m1,ta,ma
	sub	a1,a3,a5
	vmv.v.i	v1,0
	vsetvli	zero,a3,e32,m1,tu,ma
	vle32.v	v2,0(a2)
	vmv.v.v	v1,v2
	bne	a3,a5,.L21
.L7:
	vsetvli	a4,zero,e32,m1,ta,ma
	vmv.s.x	v2,zero
	vredsum.vs	v1,v1,v2
	vmv.x.s	a5,v1
	addw	a0,a0,a5
.L15:
	addi	sp,sp,16
	jr	ra
.L21:
	slli	a5,a5,2
	add	a2,a2,a5
	vsetvli	zero,a1,e32,m1,tu,ma
	vle32.v	v2,0(a2)
	vadd.vv	v1,v1,v2
	j	.L7
.L8:
	li	a0,0
	ret
.L9:
	li	a1,0
	li	a0,0
	j	.L3

The root cause is that we were missing an RVV builtin vectorization cost model.

After this patch:

	ble	a1,zero,.L4
	vsetvli	a5,zero,e32,m1,ta,ma
	vmv.v.i	v1,0
.L3:
	vsetvli	a5,a1,e32,m1,tu,ma
	vle32.v	v2,0(a0)
	slli	a4,a5,2
	sub	a1,a1,a5
	add	a0,a0,a4
	vadd.vv	v1,v2,v1
	bne	a1,zero,.L3
	li	a5,0
	vsetivli	zero,1,e32,m1,ta,ma
	vmv.s.x	v2,a5
	vsetvli	a5,zero,e32,m1,ta,ma
	vredsum.vs	v1,v1,v2
	vmv.x.s	a0,v1
	ret
.L4:
	li	a0,0
	ret

	PR target/111153

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (struct common_vector_cost): New struct.
	(struct scalable_vector_cost): Ditto.
	(struct cpu_vector_cost): Ditto.
	* config/riscv/riscv-vector-costs.cc (costs::add_stmt_cost): Add RVV builtin vectorization cost
	* config/riscv/riscv.cc (struct riscv_tune_param): Ditto.
	(get_common_costs): New function.
	(riscv_builtin_vectorization_cost): Ditto.
	(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST): New targethook.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/costmodel/riscv/rvv/pr111153.c: New test.

---
 gcc/config/riscv/riscv-protos.h               |  76 ++++++++++
 gcc/config/riscv/riscv-vector-costs.cc        |   5 +-
 gcc/config/riscv/riscv.cc                     | 143 ++++++++++++++++++
 .../vect/costmodel/riscv/rvv/pr111153.c       |  18 +++
 4 files changed, 239 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 85ab1db2088..7de0b031001 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -200,6 +200,82 @@ struct riscv_cpu_info {
 
 extern const riscv_cpu_info *riscv_find_cpu (const char *);
 
+/* Common vector costs in any kind of vectorization (e.g VLA and VLS).  */
+struct common_vector_cost
+{
+  /* Cost of any integer vector operation, excluding the ones handled
+     specially below.  */
+  const int int_stmt_cost;
+
+  /* Cost of any fp vector operation, excluding the ones handled
+     specially below.  */
+  const int fp_stmt_cost;
+
+  /* Gather/scatter vectorization cost.  */
+  const int gather_load_cost;
+  const int scatter_store_cost;
+
+  /* Cost of a vector-to-scalar operation.  */
+  const int vec_to_scalar_cost;
+
+  /* Cost of a scalar-to-vector operation.  */
+  const int scalar_to_vec_cost;
+
+  /* Cost of a permute operation.  */
+  const int permute_cost;
+
+  /* Cost of an aligned vector load.  */
+  const int align_load_cost;
+
+  /* Cost of an aligned vector store.  */
+  const int align_store_cost;
+
+  /* Cost of an unaligned vector load.  */
+  const int unalign_load_cost;
+
+  /* Cost of an unaligned vector store.  */
+  const int unalign_store_cost;
+};
+
+/* scalable vectorization (VLA) specific cost.  */
+struct scalable_vector_cost : common_vector_cost
+{
+  CONSTEXPR scalable_vector_cost (const common_vector_cost &base)
+    : common_vector_cost (base)
+  {}
+
+  /* TODO: We will need more other kinds of vector cost for VLA.
+     E.g. fold_left reduction cost, lanes load/store cost, ..., etc.  */
+};
+
+/* Cost for vector insn classes.  */
+struct cpu_vector_cost
+{
+  /* Cost of any integer scalar operation, excluding load and store.  */
+  const int scalar_int_stmt_cost;
+
+  /* Cost of any fp scalar operation, excluding load and store.  */
+  const int scalar_fp_stmt_cost;
+
+  /* Cost of a scalar load.  */
+  const int scalar_load_cost;
+
+  /* Cost of a scalar store.  */
+  const int scalar_store_cost;
+
+  /* Cost of a taken branch.  */
+  const int cond_taken_branch_cost;
+
+  /* Cost of a not-taken branch.  */
+  const int cond_not_taken_branch_cost;
+
+  /* Cost of an VLS modes operations.  */
+  const common_vector_cost *vls;
+
+  /* Cost of an VLA modes operations.  */
+  const scalable_vector_cost *vla;
+};
+
 /* Routines implemented in riscv-selftests.cc.  */
 #if CHECKING_P
 namespace selftest {
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index 7888cef58fe..e7bc9ed5233 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -750,9 +750,8 @@ costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
                       stmt_vec_info stmt_info, slp_tree, tree vectype,
                       int misalign, vect_cost_model_location where)
 {
-  /* TODO: Use default STMT cost model.
-           We will support more accurate STMT cost model later.  */
-  int stmt_cost = default_builtin_vectorization_cost (kind, vectype, misalign);
+  int stmt_cost
+    = targetm.vectorize.builtin_vectorization_cost (kind, vectype, misalign);
 
   /* Do one-time initialization based on the vinfo.  */
   loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 69a8a503f30..2dc44244309 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -281,6 +281,7 @@ struct riscv_tune_param
   bool slow_unaligned_access;
   bool use_divmod_expansion;
   unsigned int fusible_ops;
+  const struct cpu_vector_cost *vec_costs;
 };
 
 
@@ -348,6 +349,50 @@ const enum reg_class riscv_regno_to_class[FIRST_PSEUDO_REGISTER] = {
   VD_REGS,      VD_REGS,        VD_REGS,        VD_REGS,
 };
 
+/* Generic costs for VLS vector operations.  */
+static const common_vector_cost generic_vls_vector_cost = {
+  1, /* int_stmt_cost  */
+  1, /* fp_stmt_cost  */
+  1, /* gather_load_cost  */
+  1, /* scatter_store_cost  */
+  1, /* vec_to_scalar_cost  */
+  1, /* scalar_to_vec_cost  */
+  1, /* permute_cost  */
+  3, /* align_load_cost  */
+  3, /* align_store_cost  */
+  3, /* unalign_load_cost  */
+  3, /* unalign_store_cost  */
+};
+
+/* Generic costs for VLA vector operations.  */
+static const scalable_vector_cost generic_vla_vector_cost = {
+  {
+    1, /* int_stmt_cost  */
+    1, /* fp_stmt_cost  */
+    1, /* gather_load_cost  */
+    1, /* scatter_store_cost  */
+    1, /* vec_to_scalar_cost  */
+    1, /* scalar_to_vec_cost  */
+    1, /* permute_cost  */
+    3, /* align_load_cost  */
+    3, /* align_store_cost  */
+    3, /* unalign_load_cost  */
+    3, /* unalign_store_cost  */
+  },
+};
+
+/* Generic costs for vector insn classes.  */
+static const struct cpu_vector_cost generic_vector_cost = {
+  1,                        /* scalar_int_stmt_cost  */
+  1,                        /* scalar_fp_stmt_cost  */
+  1,                        /* scalar_load_cost  */
+  1,                        /* scalar_store_cost  */
+  3,                        /* cond_taken_branch_cost  */
+  1,                        /* cond_not_taken_branch_cost  */
+  &generic_vls_vector_cost, /* vls  */
+  &generic_vla_vector_cost, /* vla */
+};
+
 /* Costs to use when optimizing for rocket.  */
 static const struct riscv_tune_param rocket_tune_info = {
   {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},       /* fp_add */
@@ -362,6 +407,7 @@ static const struct riscv_tune_param rocket_tune_info = {
   true,                                         /* slow_unaligned_access */
   false,                                        /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,                           /* fusible_ops */
+  NULL,                                         /* vector cost */
 };
 
 /* Costs to use when optimizing for Sifive 7 Series.  */
@@ -378,6 +424,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   true,                                         /* slow_unaligned_access */
   false,                                        /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,                           /* fusible_ops */
+  NULL,                                         /* vector cost */
 };
 
 /* Costs to use when optimizing for T-HEAD c906.  */
@@ -394,6 +441,7 @@ static const struct riscv_tune_param thead_c906_tune_info = {
   false,                                        /* slow_unaligned_access */
   false,                                        /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,                           /* fusible_ops */
+  NULL,                                         /* vector cost */
 };
 
 /* Costs to use when optimizing for a generic ooo profile.  */
@@ -410,6 +458,7 @@ static const struct riscv_tune_param generic_ooo_tune_info = {
   false,                                        /* slow_unaligned_access */
   false,                                        /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,                           /* fusible_ops */
+  &generic_vector_cost,                         /* vector cost */
 };
 
 /* Costs to use when optimizing for size.  */
@@ -426,6 +475,7 @@ static const struct riscv_tune_param optimize_size_tune_info = {
   false,                                        /* slow_unaligned_access */
   false,                                        /* use_divmod_expansion */
   RISCV_FUSE_NOTHING,                           /* fusible_ops */
+  NULL,                                         /* vector cost */
 };
 
 static bool riscv_avoid_shrink_wrapping_separate ();
@@ -10192,6 +10242,95 @@ riscv_frame_pointer_required (void)
   return riscv_save_frame_pointer && !crtl->is_leaf;
 }
 
+/* Return the appropriate common costs for vectors of type VECTYPE.  */
+static const common_vector_cost *
+get_common_costs (tree vectype)
+{
+  const cpu_vector_cost *costs = tune_param->vec_costs;
+  gcc_assert (costs);
+
+  if (vectype && riscv_v_ext_vls_mode_p (TYPE_MODE (vectype)))
+    return costs->vls;
+  return costs->vla;
+}
+
+/* Implement targetm.vectorize.builtin_vectorization_cost.  */
+
+static int
+riscv_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
+                                  tree vectype, int misalign ATTRIBUTE_UNUSED)
+{
+  unsigned elements;
+  const cpu_vector_cost *costs = tune_param->vec_costs;
+  bool fp = false;
+
+  if (vectype != NULL)
+    fp = FLOAT_TYPE_P (vectype);
+
+  if (costs != NULL)
+    {
+      const common_vector_cost *common_costs = get_common_costs (vectype);
+      gcc_assert (common_costs != NULL);
+      switch (type_of_cost)
+        {
+        case scalar_stmt:
+          return fp ? costs->scalar_fp_stmt_cost : costs->scalar_int_stmt_cost;
+
+        case scalar_load:
+          return costs->scalar_load_cost;
+
+        case scalar_store:
+          return costs->scalar_store_cost;
+
+        case vector_stmt:
+          return fp ? common_costs->fp_stmt_cost : common_costs->int_stmt_cost;
+
+        case vector_load:
+          return common_costs->align_load_cost;
+
+        case vector_store:
+          return common_costs->align_store_cost;
+
+        case vec_to_scalar:
+          return common_costs->vec_to_scalar_cost;
+
+        case scalar_to_vec:
+          return common_costs->scalar_to_vec_cost;
+
+        case unaligned_load:
+          return common_costs->unalign_load_cost;
+        case vector_gather_load:
+          return common_costs->gather_load_cost;
+
+        case unaligned_store:
+          return common_costs->unalign_store_cost;
+        case vector_scatter_store:
+          return common_costs->scatter_store_cost;
+
+        case cond_branch_taken:
+          return costs->cond_taken_branch_cost;
+
+        case cond_branch_not_taken:
+          return costs->cond_not_taken_branch_cost;
+
+        case vec_perm:
+          return common_costs->permute_cost;
+
+        case vec_promote_demote:
+          return fp ? common_costs->fp_stmt_cost : common_costs->int_stmt_cost;
+
+        case vec_construct:
+          elements = estimated_poly_value (TYPE_VECTOR_SUBPARTS (vectype));
+          return elements / 2 + 1;
+
+        default:
+          gcc_unreachable ();
+        }
+    }
+
+  return default_builtin_vectorization_cost (type_of_cost, vectype, misalign);
+}
+
 /* Implement targetm.vectorize.create_costs.  */
 
 static vector_costs *
@@ -10582,6 +10721,10 @@ extract_base_offset_in_addr (rtx mem, rtx *base, rtx *offset)
 #undef TARGET_FRAME_POINTER_REQUIRED
 #define TARGET_FRAME_POINTER_REQUIRED riscv_frame_pointer_required
 
+#undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
+#define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \
+  riscv_builtin_vectorization_cost
+
 #undef TARGET_VECTORIZE_CREATE_COSTS
 #define TARGET_VECTORIZE_CREATE_COSTS riscv_vectorize_create_costs
 
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153.c
new file mode 100644
index 00000000000..06e08ec5f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr111153.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -mtune=generic-ooo" } */
+
+#define DEF_REDUC_PLUS(TYPE)                                                   \
+  TYPE __attribute__ ((noinline, noclone))                                     \
+  reduc_plus_##TYPE (TYPE *__restrict a, int n)                                \
+  {                                                                            \
+    TYPE r = 0;                                                                \
+    for (int i = 0; i < n; ++i)                                                \
+      r += a[i];                                                               \
+    return r;                                                                  \
+  }
+
+#define TEST_PLUS(T) T (int)
+
+TEST_PLUS (DEF_REDUC_PLUS)
+
+/* { dg-final { scan-assembler-not {vsetivli\s+zero,\s*4} } } */
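
For readers who want the cost lookup without the GCC plumbing, here is a minimal standalone sketch of the dispatch that riscv_builtin_vectorization_cost performs. It is not GCC code: the type and function names (CostKind, CommonVectorCost, CpuVectorCost, statement_cost, kUseDefault) are hypothetical, only a subset of the cost fields is modelled, and the VLS/VLA choice is passed in as a plain bool instead of being derived from the vector mode. The table values are the generic ones from the patch.

```cpp
// Simplified, hypothetical model of the cost lookup in this patch.
// None of these names exist in GCC; they only mirror the shape of the tables.

enum class CostKind { ScalarStmt, VectorStmt, VectorLoad, CondBranchTaken, VecConstruct };

// Subset of the per-vector-kind costs (shared by VLS and VLA in this sketch).
struct CommonVectorCost {
  int int_stmt_cost;
  int fp_stmt_cost;
  int align_load_cost;
};

// Subset of the per-CPU costs, with one table for VLS and one for VLA.
struct CpuVectorCost {
  int scalar_int_stmt_cost;
  int scalar_fp_stmt_cost;
  int cond_taken_branch_cost;
  const CommonVectorCost *vls;
  const CommonVectorCost *vla;
};

// Values taken from the generic tables in the patch.
constexpr CommonVectorCost kGenericVls{1, 1, 3};
constexpr CommonVectorCost kGenericVla{1, 1, 3};
constexpr CpuVectorCost kGenericCosts{1, 1, 3, &kGenericVls, &kGenericVla};

// Sentinel invented for this sketch: "no tuning table, defer to the default hook".
constexpr int kUseDefault = -1;

constexpr int statement_cost(const CpuVectorCost *costs, CostKind kind,
                             bool is_vls, bool is_fp, unsigned elements) {
  if (costs == nullptr)
    return kUseDefault;  // tunings with a NULL vector cost table take this path

  // Pick the VLS or VLA table, as get_common_costs does in the patch.
  const CommonVectorCost *common = is_vls ? costs->vls : costs->vla;

  switch (kind) {
    case CostKind::ScalarStmt:
      return is_fp ? costs->scalar_fp_stmt_cost : costs->scalar_int_stmt_cost;
    case CostKind::VectorStmt:
      return is_fp ? common->fp_stmt_cost : common->int_stmt_cost;
    case CostKind::VectorLoad:
      return common->align_load_cost;
    case CostKind::CondBranchTaken:
      return costs->cond_taken_branch_cost;
    case CostKind::VecConstruct:
      // Same formula as the patch: elements / 2 + 1.
      return static_cast<int>(elements / 2 + 1);
  }
  return kUseDefault;
}
```

With the generic tables, a vector load costs 3, a plain vector statement costs 1, and constructing a 4-element vector costs 4 / 2 + 1 = 3. Passing a null table models the tunings whose vector cost entry is NULL in the patch.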