From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong <juzhe.zhong@rivai.ai>
Subject: [PATCH V2] RISC-V: Enable basic VLS auto-vectorization
Date: Mon, 31 Jul 2023 10:13:57 +0800
Message-Id: <20230731021357.3815294-1-juzhe.zhong@rivai.ai>

Consider the following case:

void
foo (int8_t *in, int8_t *out, int8_t x)
{
  for (int i = 0; i < 16; i++)
    in[i] = x;
}

Compile option: --param=riscv-autovec-preference=scalable -fno-builtin

Before this patch:

foo:
        li      a5,16
        csrr    a4,vlenb
        vsetvli a3,zero,e8,m1,ta,ma
        vmv.v.x v1,a2
        bleu    a5,a4,.L2
        mv      a5,a4
.L2:
        vsetvli zero,a5,e8,m1,ta,ma
        vse8.v  v1,0(a0)
        ret

After this patch:

foo:
        vsetivli        zero,16,e8,mf8,ta,ma
        vmv.v.x v1,a2
        vse8.v  v1,0(a0)
        ret

gcc/ChangeLog:

        * config/riscv/autovec-vls.md (@vec_duplicate<mode>): New pattern.
        * config/riscv/riscv-v.cc (autovectorize_vector_modes): Add VLS autovec support.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/v-1.c: Adapt test.
        * gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c: Ditto.
        * gcc.target/riscv/rvv/autovec/vls/dup-1.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-2.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-3.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-4.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-5.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-6.c: New test.
        * gcc.target/riscv/rvv/autovec/vls/dup-7.c: New test.

---
 gcc/config/riscv/autovec-vls.md               |  19 ++
 gcc/config/riscv/riscv-v.cc                   |  21 ++-
 .../gcc.target/riscv/rvv/autovec/v-1.c        |   2 +-
 .../gcc.target/riscv/rvv/autovec/vls/dup-1.c  | 168 ++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-2.c  | 153 ++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-3.c  | 153 ++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-4.c  | 137 ++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-5.c  | 137 ++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-6.c  | 122 +++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/dup-7.c  | 122 +++++++++++++
 .../riscv/rvv/autovec/zve32f_zvl128b-1.c      |   2 +-
 .../riscv/rvv/autovec/zve64d_zvl128b-1.c      |   2 +-
 .../riscv/rvv/autovec/zve64f_zvl128b-1.c      |   2 +-
 13 files changed, 1034 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-7.c

diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md
index 9ece317ca4e..1a64dfdd91e 100644
--- a/gcc/config/riscv/autovec-vls.md
+++ b/gcc/config/riscv/autovec-vls.md
@@ -139,3 +139,22 @@
   "vmv%m1r.v\t%0,%1"
   [(set_attr "type" "vmov")
    (set_attr "mode" "<MODE>")])
+
+;; -----------------------------------------------------------------
+;; ---- Duplicate Operations
+;; -----------------------------------------------------------------
+
+(define_insn_and_split "@vec_duplicate<mode>"
+  [(set (match_operand:VLS 0 "register_operand")
+        (vec_duplicate:VLS
+          (match_operand:<VEL> 1 "reg_or_int_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    riscv_vector::emit_vlmax_insn (code_for_pred_broadcast (<MODE>mode),
+                                   riscv_vector::RVV_UNOP, operands);
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 9e89f970a4c..c10e51b362e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -2533,7 +2533,6 @@ autovectorize_vector_modes (vector_modes *modes, bool)
 {
   if (autovec_use_vlmax_p ())
     {
-      /* TODO: We will support RVV VLS auto-vectorization mode in the future. */
       poly_uint64 full_size
         = BYTES_PER_RISCV_VECTOR * ((int) riscv_autovec_lmul);
 
@@ -2561,7 +2560,25 @@ autovectorize_vector_modes (vector_modes *modes, bool)
           modes->safe_push (mode);
         }
     }
-  return 0;
+  unsigned int flag = 0;
+  if (TARGET_VECTOR_VLS)
+    {
+      /* Enable VECT_COMPARE_COSTS between VLA modes VLS modes for scalable
+         auto-vectorization.  */
+      flag |= VECT_COMPARE_COSTS;
+      /* Push all VLSmodes according to TARGET_MIN_VLEN.  */
+      unsigned int i = 0;
+      unsigned int base_size = TARGET_MIN_VLEN * riscv_autovec_lmul / 8;
+      unsigned int size = base_size;
+      machine_mode mode;
+      while (size > 0 && get_vector_mode (QImode, size).exists (&mode))
+        {
+          modes->safe_push (mode);
+          i++;
+          size = base_size / (1U << i);
+        }
+    }
+  return flag;
 }
 
 /* If the given VECTOR_MODE is an RVV mode, first get the largest number
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
index e68d05f5f48..ebbe5e210c5 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/v-1.c
@@ -3,4 +3,4 @@
 
 #include "template-1.h"
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 6 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-1.c
new file mode 100644
index 00000000000..1f520f2b0a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-1.c
@@ -0,0 +1,168 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "def.h"
+
+/*
+** foo1:
+** vsetivli\s+zero,\s*4,\s*e8,\s*mf8,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo1 (int8_t *in, int8_t *out, int8_t x)
+{
+  for (int i = 0; i < 4; i++)
+    in[i] = x;
+}
+
+/*
+** foo2:
+** vsetivli\s+zero,\s*8,\s*e8,\s*mf8,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo2 (int8_t *in, int8_t *out, int8_t x)
+{
+  for (int i = 0; i < 8; i++)
+    in[i] = x;
+}
+
+/*
+**
foo3: +** vsetivli\s+zero,\s*16,\s*e8,\s*mf8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo3 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 16; i++) + in[i] = x; +} + +/* +** foo4: +** li\s+[a-x0-9]+,32 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo4 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 32; i++) + in[i] = x; +} + +/* +** foo5: +** li\s+[a-x0-9]+,64 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo5 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 64; i++) + in[i] = x; +} + +/* +** foo6: +** li\s+[a-x0-9]+,128 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo6 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 128; i++) + in[i] = x; +} + +/* +** foo7: +** li\s+[a-x0-9]+,256 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo7 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 256; i++) + in[i] = x; +} + +/* +** foo8: +** li\s+[a-x0-9]+,512 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m1,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo8 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 512; i++) + in[i] = x; +} + +/* +** foo9: +** 
li\s+[a-x0-9]+,1024 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo9 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 1024; i++) + in[i] = x; +} + +/* +** foo10: +** li\s+[a-x0-9]+,4096 +** addi\s+[a-x0-9]+,[a-x0-9]+,-2048 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo10 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 2048; i++) + in[i] = x; +} + +/* +** foo11: +** li\s+[a-x0-9]+,4096 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*m8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse8\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo11 (int8_t *in, int8_t *out, int8_t x) +{ + for (int i = 0; i < 4096; i++) + in[i] = x; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-2.c new file mode 100644 index 00000000000..1a930d059c8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-2.c @@ -0,0 +1,153 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "def.h" + +/* +** foo1: +** vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo1 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 4; i++) + in[i] = x; +} + +/* +** foo2: +** vsetivli\s+zero,\s*8,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** 
vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo2 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 8; i++) + in[i] = x; +} + +/* +** foo3: +** vsetivli\s+zero,\s*16,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo3 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 16; i++) + in[i] = x; +} + +/* +** foo4: +** li\s+[a-x0-9]+,32 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo4 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 32; i++) + in[i] = x; +} + +/* +** foo5: +** li\s+[a-x0-9]+,64 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo5 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 64; i++) + in[i] = x; +} + +/* +** foo6: +** li\s+[a-x0-9]+,128 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo6 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 128; i++) + in[i] = x; +} + +/* +** foo7: +** li\s+[a-x0-9]+,256 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m1,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo7 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 256; i++) + in[i] = x; +} + +/* +** foo8: +** li\s+[a-x0-9]+,512 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m2,\s*t[au],\s*m[au] +** 
vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo8 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 512; i++) + in[i] = x; +} + +/* +** foo9: +** li\s+[a-x0-9]+,1024 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo9 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 1024; i++) + in[i] = x; +} + +/* +** foo10: +** li\s+[a-x0-9]+,4096 +** addi\s+[a-x0-9]+,[a-x0-9]+,-2048 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo10 (int16_t *in, int16_t *out, int16_t x) +{ + for (int i = 0; i < 2048; i++) + in[i] = x; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-3.c new file mode 100644 index 00000000000..46fb5a525a5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-3.c @@ -0,0 +1,153 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "def.h" + +/* +** foo1: +** vsetivli\s+zero,\s*4,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo1 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 4; i++) + in[i] = x; +} + +/* +** foo2: +** vsetivli\s+zero,\s*8,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret 
+*/ +void +foo2 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 8; i++) + in[i] = x; +} + +/* +** foo3: +** vsetivli\s+zero,\s*16,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo3 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 16; i++) + in[i] = x; +} + +/* +** foo4: +** li\s+[a-x0-9]+,32 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo4 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 32; i++) + in[i] = x; +} + +/* +** foo5: +** li\s+[a-x0-9]+,64 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo5 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 64; i++) + in[i] = x; +} + +/* +** foo6: +** li\s+[a-x0-9]+,128 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*mf2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo6 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 128; i++) + in[i] = x; +} + +/* +** foo7: +** li\s+[a-x0-9]+,256 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m1,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo7 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 256; i++) + in[i] = x; +} + +/* +** foo8: +** li\s+[a-x0-9]+,512 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** 
vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo8 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 512; i++) + in[i] = x; +} + +/* +** foo9: +** li\s+[a-x0-9]+,1024 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo9 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 1024; i++) + in[i] = x; +} + +/* +** foo10: +** li\s+[a-x0-9]+,4096 +** addi\s+[a-x0-9]+,[a-x0-9]+,-2048 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e16,\s*m8,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse16\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo10 (_Float16 *in, _Float16 *out, _Float16 x) +{ + for (int i = 0; i < 2048; i++) + in[i] = x; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-4.c new file mode 100644 index 00000000000..7e46dc42526 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-4.c @@ -0,0 +1,137 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "def.h" + +/* +** foo1: +** vsetivli\s+zero,\s*4,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo1 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 4; i++) + in[i] = x; +} + +/* +** foo2: +** vsetivli\s+zero,\s*8,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo2 (int32_t *in, int32_t *out, int32_t x) 
+{ + for (int i = 0; i < 8; i++) + in[i] = x; +} + +/* +** foo3: +** vsetivli\s+zero,\s*16,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo3 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 16; i++) + in[i] = x; +} + +/* +** foo4: +** li\s+[a-x0-9]+,32 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo4 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 32; i++) + in[i] = x; +} + +/* +** foo5: +** li\s+[a-x0-9]+,64 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo5 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 64; i++) + in[i] = x; +} + +/* +** foo6: +** li\s+[a-x0-9]+,128 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo6 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 128; i++) + in[i] = x; +} + +/* +** foo7: +** li\s+[a-x0-9]+,256 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m2,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo7 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 256; i++) + in[i] = x; +} + +/* +** foo8: +** li\s+[a-x0-9]+,512 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m4,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo8 (int32_t *in, int32_t *out, 
int32_t x) +{ + for (int i = 0; i < 512; i++) + in[i] = x; +} + +/* +** foo9: +** li\s+[a-x0-9]+,1024 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m8,\s*t[au],\s*m[au] +** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo9 (int32_t *in, int32_t *out, int32_t x) +{ + for (int i = 0; i < 1024; i++) + in[i] = x; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-5.c new file mode 100644 index 00000000000..9b9327bdd4d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-5.c @@ -0,0 +1,137 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "def.h" + +/* +** foo1: +** vsetivli\s+zero,\s*4,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo1 (float *in, float *out, float x) +{ + for (int i = 0; i < 4; i++) + in[i] = x; +} + +/* +** foo2: +** vsetivli\s+zero,\s*8,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo2 (float *in, float *out, float x) +{ + for (int i = 0; i < 8; i++) + in[i] = x; +} + +/* +** foo3: +** vsetivli\s+zero,\s*16,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo3 (float *in, float *out, float x) +{ + for (int i = 0; i < 16; i++) + in[i] = x; +} + +/* +** foo4: +** li\s+[a-x0-9]+,32 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** 
vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo4 (float *in, float *out, float x) +{ + for (int i = 0; i < 32; i++) + in[i] = x; +} + +/* +** foo5: +** li\s+[a-x0-9]+,64 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*mf2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo5 (float *in, float *out, float x) +{ + for (int i = 0; i < 64; i++) + in[i] = x; +} + +/* +** foo6: +** li\s+[a-x0-9]+,128 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo6 (float *in, float *out, float x) +{ + for (int i = 0; i < 128; i++) + in[i] = x; +} + +/* +** foo7: +** li\s+[a-x0-9]+,256 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m2,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo7 (float *in, float *out, float x) +{ + for (int i = 0; i < 256; i++) + in[i] = x; +} + +/* +** foo8: +** li\s+[a-x0-9]+,512 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m4,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo8 (float *in, float *out, float x) +{ + for (int i = 0; i < 512; i++) + in[i] = x; +} + +/* +** foo9: +** li\s+[a-x0-9]+,1024 +** vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m8,\s*t[au],\s*m[au] +** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+ +** vse32\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\) +** ret +*/ +void +foo9 (float *in, float *out, float x) +{ + for (int i = 0; i < 1024; i++) + in[i] = x; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-6.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-6.c
new file mode 100644
index 00000000000..52d5a65b44e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-6.c
@@ -0,0 +1,122 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "def.h"
+
+/*
+** foo1:
+** vsetivli\s+zero,\s*4,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo1 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 4; i++)
+    in[i] = x;
+}
+
+/*
+** foo2:
+** vsetivli\s+zero,\s*8,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo2 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 8; i++)
+    in[i] = x;
+}
+
+/*
+** foo3:
+** vsetivli\s+zero,\s*16,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo3 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 16; i++)
+    in[i] = x;
+}
+
+/*
+** foo4:
+** li\s+[a-x0-9]+,32
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo4 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 32; i++)
+    in[i] = x;
+}
+
+/*
+** foo5:
+** li\s+[a-x0-9]+,64
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo5 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 64; i++)
+    in[i] = x;
+}
+
+/*
+** foo6:
+** li\s+[a-x0-9]+,128
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m2,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo6 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 128; i++)
+    in[i] = x;
+}
+
+/*
+** foo7:
+** li\s+[a-x0-9]+,256
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m4,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo7 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 256; i++)
+    in[i] = x;
+}
+
+/*
+** foo8:
+** li\s+[a-x0-9]+,512
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m8,\s*t[au],\s*m[au]
+** vmv\.v\.x\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo8 (int64_t *in, int64_t *out, int64_t x)
+{
+  for (int i = 0; i < 512; i++)
+    in[i] = x;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-7.c
new file mode 100644
index 00000000000..39f27ece2e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/dup-7.c
@@ -0,0 +1,122 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh_zvl4096b -mabi=lp64d -O3 -fno-builtin -fno-schedule-insns -fno-schedule-insns2 --param riscv-autovec-lmul=m8" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "def.h"
+
+/*
+** foo1:
+** vsetivli\s+zero,\s*4,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo1 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 4; i++)
+    in[i] = x;
+}
+
+/*
+** foo2:
+** vsetivli\s+zero,\s*8,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo2 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 8; i++)
+    in[i] = x;
+}
+
+/*
+** foo3:
+** vsetivli\s+zero,\s*16,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo3 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 16; i++)
+    in[i] = x;
+}
+
+/*
+** foo4:
+** li\s+[a-x0-9]+,32
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo4 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 32; i++)
+    in[i] = x;
+}
+
+/*
+** foo5:
+** li\s+[a-x0-9]+,64
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m1,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo5 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 64; i++)
+    in[i] = x;
+}
+
+/*
+** foo6:
+** li\s+[a-x0-9]+,128
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m2,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo6 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 128; i++)
+    in[i] = x;
+}
+
+/*
+** foo7:
+** li\s+[a-x0-9]+,256
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m4,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo7 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 256; i++)
+    in[i] = x;
+}
+
+/*
+** foo8:
+** li\s+[a-x0-9]+,512
+** vsetvli\s+zero,\s*[a-x0-9]+,\s*e64,\s*m8,\s*t[au],\s*m[au]
+** vfmv\.v\.f\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*[a-x0-9]+
+** vse64\.v\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),0\s*\([a-x0-9]+\)
+** ret
+*/
+void
+foo8 (double *in, double *out, double x)
+{
+  for (int i = 0; i < 512; i++)
+    in[i] = x;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
index ecfda79e19a..345e2f963d5 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve32f_zvl128b-1.c
@@ -3,4 +3,4 @@

 #include "template-1.h"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 3 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
index 6b320ca6f38..e13c27dcdb0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64d_zvl128b-1.c
@@ -3,4 +3,4 @@

 #include "template-1.h"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 6 "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
index ae3f066477c..e767629ae54 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/zve64f_zvl128b-1.c
@@ -3,4 +3,4 @@

 #include "template-1.h"

-/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 4 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 5 "vect" } } */