From patchwork Thu Mar 30 01:28:04 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 76864
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, Juzhe-Zhong
Subject: [GCC14 QUEUE PATCH] RISC-V: Optimize fault only first load
Date: Thu, 30 Mar 2023 09:28:04 +0800
Message-Id: <20230330012804.110539-1-juzhe.zhong@rivai.ai>

From: Juzhe-Zhong

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (pass_vsetvl::cleanup_insns): Adapt PASS.
        * config/riscv/vector-iterators.md: New unspec.
        * config/riscv/vector.md: Optimize fault only first load pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/vsetvl/ffload-1.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-2.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-3.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-4.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-5.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-6.c: New test.
        * gcc.target/riscv/rvv/vsetvl/ffload-7.c: New test.

---
 gcc/config/riscv/riscv-vsetvl.cc              |  3 +-
 gcc/config/riscv/vector-iterators.md          |  1 +
 gcc/config/riscv/vector.md                    | 10 ++++-
 .../gcc.target/riscv/rvv/vsetvl/ffload-1.c    | 21 +++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-2.c    | 28 ++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-3.c    | 28 ++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-4.c    | 37 +++++++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-5.c    | 29 +++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-6.c    | 29 +++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/ffload-7.c    | 32 ++++++++++++++++
 10 files changed, 216 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 58568b45010..4d043c0645b 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -4003,7 +4003,8 @@ pass_vsetvl::cleanup_insns (void) const
         if (!has_vl_op (rinsn) || !REG_P (get_vl (rinsn)))
           continue;
         rtx avl = get_vl (rinsn);
-        if (count_occurrences (PATTERN (rinsn), avl, 0) == 1)
+        if (count_occurrences (PATTERN (rinsn), avl, 0) == 1
+            || fault_first_load_p (rinsn))
           {
             /* Get the list of uses for the new instruction.  */
             auto attempt = crtl->ssa->new_change_attempt ();
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 34e486e48ca..8fff61eff30 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -80,6 +80,7 @@
   UNSPEC_VRGATHEREI16
   UNSPEC_VCOMPRESS
   UNSPEC_VLEFF
+  UNSPEC_MODIFY_VL
 ])
 
 (define_mode_iterator V [
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index b0a4d4cea69..92adfb06122 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -7537,7 +7537,15 @@
           (unspec:V
             [(match_operand:V 3 "memory_operand" " m, m, m, m")] UNSPEC_VLEFF)
           (match_operand:V 2 "vector_merge_operand" " vu, 0, vu, 0")))
-   (set (reg:SI VL_REGNUM) (unspec:SI [(match_dup 0)] UNSPEC_VLEFF))]
+   (set (reg:SI VL_REGNUM)
+          (unspec:SI
+            [(if_then_else:V
+               (unspec:
+                [(match_dup 1) (match_dup 4) (match_dup 5)
+                 (match_dup 6) (match_dup 7)
+                 (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+               (unspec:V [(match_dup 3)] UNSPEC_VLEFF)
+               (match_dup 2))] UNSPEC_MODIFY_VL))]
   "TARGET_VECTOR"
   "vleff.v\t%0,%3%p1"
   [(set_attr "type" "vldff")
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
new file mode 100644
index 00000000000..b2b7eafa945
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int cond,size_t *new_vl,size_t *new_vl2)
+{
+  size_t vl = 101;
+
+  vint8mf8_t v = __riscv_vle8_v_i8mf8 (in, vl);
+  __riscv_vse8_v_i8mf8 (out, v, vl);
+  vbool64_t mask = __riscv_vlm_v_b64 (in + 100, vl);
+  vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + 100, new_vl, vl);
+  __riscv_vse8_v_i8mf8 (out + 100, v2, *new_vl);
+  v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v2, in + 200, new_vl2, vl);
+  __riscv_vse8_v_i8mf8 (out + 200, v2, *new_vl2);
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-times {csrr} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
+/* { dg-final { scan-assembler-not {vmv} { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
new file mode 100644
index 00000000000..c0e21d461e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
new file mode 100644
index 00000000000..9e90b189bd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < m; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
new file mode 100644
index 00000000000..eee027e4d48
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-4.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (int i = 0 ; i < n * n; i++)
+    out[i] = out[i] + out[i];
+
+  for (int i = 0 ; i < n * n * n; i++)
+    out[i] = out[i] * out[i];
+
+  for (int i = 0 ; i < n * n * n * n; i++)
+    out[i] = out[i] * out[i];
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
new file mode 100644
index 00000000000..895180cc54e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  size_t new_vl;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &new_vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, new_vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, new_vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
new file mode 100644
index 00000000000..1b32f4ab24b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  size_t new_vl;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &new_vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, new_vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 3 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c
new file mode 100644
index 00000000000..1c08b75873d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
+
+#include "riscv_vector.h"
+
+void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
+{
+  size_t vl = 101;
+  if (cond)
+    vl = m * 2;
+  else
+    vl = m * 2 * vl;
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
+      __riscv_vse8_v_i8mf8 (out + i, v, vl);
+
+      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
+
+      vint8mf8_t v2 = __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 100, &vl, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
+    }
+
+  for (size_t i = 0; i < n; i++)
+    {
+      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
+      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */