From patchwork Thu Nov 30 06:49:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 171748 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp194764vqy; Wed, 29 Nov 2023 22:49:44 -0800 (PST) X-Google-Smtp-Source: AGHT+IF5gvo+3qi30u6G4HIuXf++8FJC/Aco5PiuA2IqACMuhWMYlLupcrWD3/qLORZVxNj3ak0V X-Received: by 2002:a05:6214:3103:b0:67a:35c1:2946 with SMTP id ks3-20020a056214310300b0067a35c12946mr17012700qvb.64.1701326984148; Wed, 29 Nov 2023 22:49:44 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1701326984; cv=pass; d=google.com; s=arc-20160816; b=TtijcHNXyrz5QYWlyt00hYC4oNsizPSrEgSrbVNRNLXLYlPvYiMHIZwQJGrWZoSJZv EZsSyL2aZ8IFutA5Kl6M8eBorQBgLRwEfOLmvjl9igl1Q8ZwuB7IMHcUW1HZoI6dwfYq MpG78WqXUKwdPTG6SDzuqpHrajVlC5LarUTpjpvnUlu+POKPuyOnJZWxNF5ctvV/kgjn VZxIUgj/w/J8/nOoSw9CKyPU0KKpaR84Xf9noGZzgBqzt4L3n2ANuB04oeTHL2VtI2dh nM+47EjcSqcYyD+isV4GYmd2r1AJ3d/UOabag9KASyHT6Dy0Ru+KAPh737D6HCccGLdy bQ3A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:arc-filter:dmarc-filter:delivered-to; bh=Sqf/GHaswPDD+eJs1t/m0NprMrRpt8gxVzv9b32ztrg=; fh=12MRPJmZ1mgDpHqWoogMKqnaGRGM2b7lcuJroqfjJiw=; b=kgaSMuXWVdB/eoXh7NHbs7JsJ1VDkCfYnCHRckBtswMWtoPAd4lry4e2usSBCq87qf CgwJTN62ljF64yTKonLIrAl9JbdE2+VIy5+4vfv9NxHgifI93Zwb7GWnvq5UsqhpNTBa F0iMP5jBdmQSsXMlNFJI3eX+vko7ELwEsprdP+hUb3aJL+xh6dmrEIvN5D41izBL5ZIB BDg0MnWuxr7eEHCj3VAGekfw9cRRf/rbCqXTI3wTyzq2+4G0Vl+MY1qcoY1DftWgt2KZ C4cZH+IE4ABgFwAyx8h91N6rGTQekFAKhU1jzUfkVyqi57IO4gclI6RL87LxeXuh/uF2 cQaw== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id t9-20020a05621405c900b0067a3ac71713si419207qvz.283.2023.11.29.22.49.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Nov 2023 22:49:44 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DF51B38582A5 for ; Thu, 30 Nov 2023 06:49:43 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr1.qq.com (smtpbgbr1.qq.com [54.207.19.206]) by sourceware.org (Postfix) with ESMTPS id E705D3858D37 for ; Thu, 30 Nov 2023 06:49:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E705D3858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E705D3858D37 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=54.207.19.206 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701326960; cv=none; b=Lvf/pYV3sGBlaVp3cpnftZm9cyL9oOplK1ZlvHq8vEyvbAznBxZA+kCyadFd5nBcb7R+eLhGbIhpgHjX0heGSqFt63sME7TWnEi8CuyeUeyEVAxFle6KXOrCs0RpUQ4lDrRDMWA+BBWSrS4pJZHXaK2LnRQ2tBQnq1YiqxpvsmE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701326960; c=relaxed/simple; bh=vxN2eZzU8m6GFXvyGVtmlN2+XXIWPcdM0X7ktyNaocI=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=rdCLB/Es10mXY8PqP8iBK13JGHQxvSh0Rm6rYgt4ZUwhl0jn/FMHTPp4EWAvwbjWSRpwkzQwoplC2UpFhYp4VYEsT3Tw9/G4MxC2Uz8M2R8Z+pDocdOUHj6OYLxmRl/x1oh/+5VgrKfeKckIoCBMNDYOwpO/LtYzbGY2pREp90Q= ARC-Authentication-Results: i=1; server2.sourceware.org X-QQ-mid: bizesmtp82t1701326947tjqqib1w Received: from rios-cad122.hadoop.rioslab.org ( [58.60.1.26]) by bizesmtp.qq.com (ESMTP) with id ; Thu, 30 Nov 2023 14:49:06 +0800 (CST) X-QQ-SSF: 01400000000000G0V000000A0000000 X-QQ-FEAT: QityeSR92A2WZqMuvQVYATMP696UoX/UicWl9oz8lLUI3VtpZ8Sj2Rx0d0FQP sysYpD6Y3hScE20b7KUKrbcg4XpJ/veEIq6S4G4TpzEoeJNQJZTkveVLBOfJG73npIgWuLr dcb6gfo3xp0jbSXoNedr8NdmoAQNFdxr6X3qdeSuOcoEME5GsX22xT7lQ9mqiJ1eS9S8laJ CHfgc/on9/G5MrXHXQnlRjstPCS9yK5EmkiBcyxsCMWebd66xI6SINQv0dUlKnwiIjQuaQX nTZ3lEc1kSveNxRBQdHOEkdIj4nM1HXyn7IpUaQJ97yAv05/xqu+iwb/Hi4lLzIsViRgYQl 80gBiTIyEtJWesExv5Hekig9OH0YecNbaZvHOblv/1mkonuGeUGN3tI1nO+3ifqa0araYXl rVmlLBc7fc0= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 12245314244515872647 From: Juzhe-Zhong To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Support widening register overlap for vf4/vf8 Date: Thu, 30 Nov 2023 14:49:05 +0800 Message-Id: <20231130064905.2716758-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-8.5 required=5.0 tests=GIT_PATCH_0, KAM_DMARC_STATUS, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1783970643685639391 X-GMAIL-MSGID: 1783970643685639391 size_t foo (char const *buf, size_t len) { size_t sum = 0; size_t vl = __riscv_vsetvlmax_e8m8 (); size_t step = vl * 4; const char *it = buf, *end = buf + len; for (; it + step <= end;) { vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl); it += vl; vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl); it += vl; vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl); it += vl; vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl); it += vl; asm volatile("nop" ::: "memory"); vint64m8_t vw0 = __riscv_vsext_vf8_i64m8 (v0, vl); vint64m8_t vw1 = __riscv_vsext_vf8_i64m8 (v1, vl); vint64m8_t vw2 = __riscv_vsext_vf8_i64m8 (v2, vl); vint64m8_t vw3 = __riscv_vsext_vf8_i64m8 (v3, vl); asm volatile("nop" ::: "memory"); size_t sum0 = __riscv_vmv_x_s_i64m8_i64 (vw0); size_t sum1 = __riscv_vmv_x_s_i64m8_i64 (vw1); size_t sum2 = __riscv_vmv_x_s_i64m8_i64 (vw2); size_t sum3 = __riscv_vmv_x_s_i64m8_i64 (vw3); sum += sumation (sum0, sum1, sum2, sum3); } return sum; } Before this patch: add a3,s0,s1 add a4,s6,s1 add a5,s7,s1 vsetvli zero,s0,e64,m8,ta,ma vle8.v v4,0(s1) vle8.v v3,0(a3) mv s1,s2 vle8.v v2,0(a4) vle8.v v1,0(a5) nop vsext.vf8 v8,v4 vsext.vf8 v16,v2 vs8r.v v8,0(sp) vsext.vf8 v24,v1 vsext.vf8 v8,v3 nop vmv.x.s a1,v8 vl8re64.v v8,0(sp) vmv.x.s a3,v24 vmv.x.s a2,v16 vmv.x.s a0,v8 add s2,s2,s5 call sumation add s3,s3,a0 bgeu s4,s2,.L5 After this patch: add a3,s0,s1 add a4,s6,s1 add a5,s7,s1 vsetvli zero,s0,e64,m8,ta,ma vle8.v v15,0(s1) vle8.v v23,0(a3) mv s1,s2 vle8.v v31,0(a4) vle8.v v7,0(a5) vsext.vf8 v8,v15 vsext.vf8 v16,v23 vsext.vf8 v24,v31 vsext.vf8 v0,v7 vmv.x.s a3,v0 vmv.x.s a2,v24 vmv.x.s a1,v16 vmv.x.s a0,v8 add s2,s2,s5 call sumation add s3,s3,a0 bgeu s4,s2,.L5 PR target/112431 gcc/ChangeLog: * config/riscv/vector.md: Add widening overlap of vf2/vf4. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr112431-16.c: New test. * gcc.target/riscv/rvv/base/pr112431-17.c: New test. * gcc.target/riscv/rvv/base/pr112431-18.c: New test. --- gcc/config/riscv/vector.md | 38 ++++++----- .../gcc.target/riscv/rvv/base/pr112431-16.c | 68 +++++++++++++++++++ .../gcc.target/riscv/rvv/base/pr112431-17.c | 51 ++++++++++++++ .../gcc.target/riscv/rvv/base/pr112431-18.c | 51 ++++++++++++++ 4 files changed, 190 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 6b891c11324..e5d62c6e58b 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -3704,43 +3704,45 @@ ;; Vector Quad-Widening Sign-extend and Zero-extend. (define_insn "@pred__vf4" - [(set (match_operand:VQEXTI 0 "register_operand" "=&vr,&vr") + [(set (match_operand:VQEXTI 0 "register_operand" "=vr, vr, vr, vr, ?&vr, ?&vr") (if_then_else:VQEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK") - (match_operand 5 "const_int_operand" " i, i") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VQEXTI - (match_operand: 3 "register_operand" " vr, vr")) - (match_operand:VQEXTI 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 3 "register_operand" " W43, W43, W86, W86, vr, vr")) + (match_operand:VQEXTI 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vext.vf4\t%0,%3%p1" [(set_attr "type" "vext") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "W43,W43,W86,W86,none,none")]) ;; Vector Oct-Widening Sign-extend and Zero-extend. (define_insn "@pred__vf8" - [(set (match_operand:VOEXTI 0 "register_operand" "=&vr,&vr") + [(set (match_operand:VOEXTI 0 "register_operand" "=vr, vr, ?&vr, ?&vr") (if_then_else:VOEXTI (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK") - (match_operand 5 "const_int_operand" " i, i") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_extend:VOEXTI - (match_operand: 3 "register_operand" " vr, vr")) - (match_operand:VOEXTI 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 3 "register_operand" " W87, W87, vr, vr")) + (match_operand:VOEXTI 2 "vector_merge_operand" " vu, 0, vu, 0")))] "TARGET_VECTOR" "vext.vf8\t%0,%3%p1" [(set_attr "type" "vext") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "W87,W87,none,none")]) ;; Vector Widening Add/Subtract/Multiply. (define_insn "@pred_dual_widen_" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c new file mode 100644 index 00000000000..98f42458883 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-16.c @@ -0,0 +1,68 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +size_t __attribute__ ((noinline)) +sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3, size_t sum4, + size_t sum5, size_t sum6, size_t sum7) +{ + return sum0 + sum1 + sum2 + sum3 + sum4 + sum5 + sum6 + sum7; +} + +size_t +foo (char const *buf, size_t len) +{ + size_t sum = 0; + size_t vl = __riscv_vsetvlmax_e8m8 (); + size_t step = vl * 4; + const char *it = buf, *end = buf + len; + for (; it + step <= end;) + { + vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v4 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v5 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v6 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v7 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + + asm volatile("nop" ::: "memory"); + vint32m4_t vw0 = __riscv_vsext_vf4_i32m4 (v0, vl); + vint32m4_t vw1 = __riscv_vsext_vf4_i32m4 (v1, vl); + vint32m4_t vw2 = __riscv_vsext_vf4_i32m4 (v2, vl); + vint32m4_t vw3 = __riscv_vsext_vf4_i32m4 (v3, vl); + vint32m4_t vw4 = __riscv_vsext_vf4_i32m4 (v4, vl); + vint32m4_t vw5 = __riscv_vsext_vf4_i32m4 (v5, vl); + vint32m4_t vw6 = __riscv_vsext_vf4_i32m4 (v6, vl); + vint32m4_t vw7 = __riscv_vsext_vf4_i32m4 (v7, vl); + + asm volatile("nop" ::: "memory"); + size_t sum0 = __riscv_vmv_x_s_i32m4_i32 (vw0); + size_t sum1 = __riscv_vmv_x_s_i32m4_i32 (vw1); + size_t sum2 = __riscv_vmv_x_s_i32m4_i32 (vw2); + size_t sum3 = __riscv_vmv_x_s_i32m4_i32 (vw3); + size_t sum4 = __riscv_vmv_x_s_i32m4_i32 (vw4); + size_t sum5 = __riscv_vmv_x_s_i32m4_i32 (vw5); + size_t sum6 = __riscv_vmv_x_s_i32m4_i32 (vw6); + size_t sum7 = __riscv_vmv_x_s_i32m4_i32 (vw7); + + sum += sumation (sum0, sum1, sum2, sum3, sum4, sum5, sum6, sum7); + } + return sum; +} + +/* { dg-final { scan-assembler-not {vmv1r} } } */ +/* { dg-final { scan-assembler-not {vmv2r} } } */ +/* { dg-final { scan-assembler-not {vmv4r} } } */ +/* { dg-final { scan-assembler-not {vmv8r} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c new file mode 100644 index 00000000000..9b60005344d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-17.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +size_t __attribute__ ((noinline)) +sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3) +{ + return sum0 + sum1 + sum2 + sum3; +} + +size_t +foo (char const *buf, size_t len) +{ + size_t sum = 0; + size_t vl = __riscv_vsetvlmax_e8m8 (); + size_t step = vl * 4; + const char *it = buf, *end = buf + len; + for (; it + step <= end;) + { + vint8m2_t v0 = __riscv_vle8_v_i8m2 ((void *) it, vl); + it += vl; + vint8m2_t v1 = __riscv_vle8_v_i8m2 ((void *) it, vl); + it += vl; + vint8m2_t v2 = __riscv_vle8_v_i8m2 ((void *) it, vl); + it += vl; + vint8m2_t v3 = __riscv_vle8_v_i8m2 ((void *) it, vl); + it += vl; + + asm volatile("nop" ::: "memory"); + vint32m8_t vw0 = __riscv_vsext_vf4_i32m8 (v0, vl); + vint32m8_t vw1 = __riscv_vsext_vf4_i32m8 (v1, vl); + vint32m8_t vw2 = __riscv_vsext_vf4_i32m8 (v2, vl); + vint32m8_t vw3 = __riscv_vsext_vf4_i32m8 (v3, vl); + + asm volatile("nop" ::: "memory"); + size_t sum0 = __riscv_vmv_x_s_i32m8_i32 (vw0); + size_t sum1 = __riscv_vmv_x_s_i32m8_i32 (vw1); + size_t sum2 = __riscv_vmv_x_s_i32m8_i32 (vw2); + size_t sum3 = __riscv_vmv_x_s_i32m8_i32 (vw3); + + sum += sumation (sum0, sum1, sum2, sum3); + } + return sum; +} + +/* { dg-final { scan-assembler-not {vmv1r} } } */ +/* { dg-final { scan-assembler-not {vmv2r} } } */ +/* { dg-final { scan-assembler-not {vmv4r} } } */ +/* { dg-final { scan-assembler-not {vmv8r} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c new file mode 100644 index 00000000000..dd65b2fa098 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr112431-18.c @@ -0,0 +1,51 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +size_t __attribute__ ((noinline)) +sumation (size_t sum0, size_t sum1, size_t sum2, size_t sum3) +{ + return sum0 + sum1 + sum2 + sum3; +} + +size_t +foo (char const *buf, size_t len) +{ + size_t sum = 0; + size_t vl = __riscv_vsetvlmax_e8m8 (); + size_t step = vl * 4; + const char *it = buf, *end = buf + len; + for (; it + step <= end;) + { + vint8m1_t v0 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v1 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v2 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + vint8m1_t v3 = __riscv_vle8_v_i8m1 ((void *) it, vl); + it += vl; + + asm volatile("nop" ::: "memory"); + vint64m8_t vw0 = __riscv_vsext_vf8_i64m8 (v0, vl); + vint64m8_t vw1 = __riscv_vsext_vf8_i64m8 (v1, vl); + vint64m8_t vw2 = __riscv_vsext_vf8_i64m8 (v2, vl); + vint64m8_t vw3 = __riscv_vsext_vf8_i64m8 (v3, vl); + + asm volatile("nop" ::: "memory"); + size_t sum0 = __riscv_vmv_x_s_i64m8_i64 (vw0); + size_t sum1 = __riscv_vmv_x_s_i64m8_i64 (vw1); + size_t sum2 = __riscv_vmv_x_s_i64m8_i64 (vw2); + size_t sum3 = __riscv_vmv_x_s_i64m8_i64 (vw3); + + sum += sumation (sum0, sum1, sum2, sum3); + } + return sum; +} + +/* { dg-final { scan-assembler-not {vmv1r} } } */ +/* { dg-final { scan-assembler-not {vmv2r} } } */ +/* { dg-final { scan-assembler-not {vmv4r} } } */ +/* { dg-final { scan-assembler-not {vmv8r} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */