From patchwork Fri Mar 10 03:02:05 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 67186
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, Ju-Zhe Zhong
Subject: [PATCH] RISC-V: Fine tune RA constraint for narrow instructions
Date: Fri, 10 Mar 2023 11:02:05 +0800
Message-Id: <20230310030205.90760-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3

From: Ju-Zhe Zhong

According to the RVV ISA, a narrow instruction may overlap its source operand when the destination EEW is smaller than the source EEW and the overlap is in the lowest-numbered part of the source register group (e.g., when LMUL=1, vnsrl.wi v0,v0,3 is legal, but a destination of v1 is not).

Based on this rule, we should allow the destination of a narrow instruction to partially overlap its source, so that the register allocator can make fuller use of the vector registers.
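To make the rule concrete, here is a minimal sketch (it mirrors f6 in narrow_constraint-1.c below; the v2/v3 register assignment mentioned in the comment is only illustrative, the allocator is free to choose other registers):

#include "riscv_vector.h"

/* Narrow an LMUL=2 source to an LMUL=1 result.  Under the RVV overlap
   rule the destination may share the lowest-numbered register of the
   source group, e.g. SRC in v2-v3 and the narrowed result also in v2,
   so no extra vmv is needed to copy the result out of the group.  */
void narrow (int16_t *base, int8_t *out, size_t vl)
{
  vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl);
  vint8m1_t v = __riscv_vncvt_x_x_w_i8m1 (src, vl);
  __riscv_vse8_v_i8m1 (out, v, vl);
}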
Consider the cases at https://godbolt.org/z/o6sc4eqGj: for some of them LLVM emits redundant move instructions, and for others redundant register spills. After this patch, GCC gets the expected register allocation and code generation across the different register-pressure cases.

gcc/ChangeLog:

        * config/riscv/vector.md: Fine tune RA constraints.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/base/narrow_constraint-1.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-10.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-11.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-2.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-3.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-4.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-5.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-6.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-7.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-8.c: New test.
        * gcc.target/riscv/rvv/base/narrow_constraint-9.c: New test.
---
 gcc/config/riscv/vector.md                    | 168 ++++----
 .../riscv/rvv/base/narrow_constraint-1.c      | 319 ++++++++++++++
 .../riscv/rvv/base/narrow_constraint-10.c     | 293 +++++++++++++
 .../riscv/rvv/base/narrow_constraint-11.c     | 293 +++++++++++++
 .../riscv/rvv/base/narrow_constraint-2.c      | 370 +++++++++++++++++
 .../riscv/rvv/base/narrow_constraint-3.c      | 392 ++++++++++++++++++
 .../riscv/rvv/base/narrow_constraint-4.c      | 319 ++++++++++++++
 .../riscv/rvv/base/narrow_constraint-5.c      | 319 ++++++++++++++
 .../riscv/rvv/base/narrow_constraint-6.c      | 369 +++++++++++++++++
 .../riscv/rvv/base/narrow_constraint-7.c      | 392 ++++++++++++++++++
 .../riscv/rvv/base/narrow_constraint-8.c      | 319 ++++++++++++++
 .../riscv/rvv/base/narrow_constraint-9.c      | 319 ++++++++++++++
 12 files changed, 3788 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-9.c

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index d3013844e5f..977d3f2042c 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -3058,42 +3058,42 @@
 ;; lowest-numbered part of the source register group
 ;; e.g, when LMUL = 1, vnsrl.wi v0,v0,3 is legal but a destination of v1 is not.
(define_insn "@pred_narrow_" - [(set (match_operand: 0 "register_operand" "=vd, vr, &vr, vd, vr, &vr") + [(set (match_operand: 0 "register_operand" "=vd,vd, vr, vr,vd, vr, &vr, &vr, vd, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1, vm,Wc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,vm,Wc1,Wc1,vm,Wc1,vmWc1,vmWc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK,rK, rK, rK,rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (truncate: (any_shiftrt:VWEXTI - (match_operand:VWEXTI 3 "register_operand" " 0, 0, vr, 0, 0, vr") - (match_operand: 4 "vector_shift_operand" " vr, vr, vr, vk, vk, vk"))) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu,0vu,0vu, 0vu")))] + (match_operand:VWEXTI 3 "register_operand" " vr,vr, vr, vr, 0, 0, vr, vr, 0, 0, vr, vr") + (match_operand: 4 "vector_shift_operand" " 0, 0, 0, 0,vr, vr, vr, vr, vk, vk, vk, vk"))) + (match_operand: 2 "vector_merge_operand" " 0,vu, 0, vu,vu, vu, vu, 0, vu, vu, vu, 0")))] "TARGET_VECTOR" "vn.w%o4\t%0,%3,%v4%p1" [(set_attr "type" "vnshift") (set_attr "mode" "")]) (define_insn "@pred_narrow__scalar" - [(set (match_operand: 0 "register_operand" "=vd, vr, &vr, vd, vr, &vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1, vm,Wc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (truncate: (any_shiftrt:VWEXTI - (match_operand:VWEXTI 3 "register_operand" " 0, 0, vr, 0, 0, vr") - (match_operand 4 "pmode_reg_or_uimm5_operand" " r, r, r, K, K, K"))) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu,0vu,0vu, 0vu")))] + (match_operand:VWEXTI 3 "register_operand" " 0, 0, 0, 0, vr, vr") + (match_operand 4 "pmode_reg_or_uimm5_operand" " rK, rK, rK, rK, rK, rK"))) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vn.w%o4\t%0,%3,%4%p1" [(set_attr "type" "vnshift") @@ -3101,19 +3101,19 @@ ;; vncvt.x.x.w (define_insn "@pred_trunc" - [(set (match_operand: 0 "register_operand" "=vd, vr, &vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 
7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (truncate: - (match_operand:VWEXTI 3 "register_operand" " 0, 0, vr")) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu")))] + (match_operand:VWEXTI 3 "register_operand" " 0, 0, 0, 0, vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vncvt.x.x.w\t%0,%3%p1" [(set_attr "type" "vnshift") @@ -3496,40 +3496,40 @@ ;; CLIP (define_insn "@pred_narrow_clip" - [(set (match_operand: 0 "register_operand" "=vd, vr, &vr, vd, vr, &vr") + [(set (match_operand: 0 "register_operand" "=vd,vd, vr, vr,vd, vr, &vr, &vr, vd, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1, vm,Wc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm,vm,Wc1,Wc1,vm,Wc1,vmWc1,vmWc1, vm,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK,rK, rK, rK,rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec: - [(match_operand:VWEXTI 3 "register_operand" " 0, 0, vr, 0, 0, vr") - (match_operand: 4 "vector_shift_operand" " vr, vr, vr, vk, vk, vk")] VNCLIP) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu,0vu,0vu, 0vu")))] + [(match_operand:VWEXTI 3 "register_operand" " vr,vr, vr, vr, 0, 0, vr, vr, 0, 0, vr, vr") + (match_operand: 4 "vector_shift_operand" " 0, 0, 0, 0,vr, vr, vr, vr, vk, vk, vk, vk")] VNCLIP) + (match_operand: 2 "vector_merge_operand" " 0,vu, 0, vu,vu, vu, vu, 0, vu, vu, vu, 0")))] "TARGET_VECTOR" "vnclip.w%o4\t%0,%3,%v4%p1" [(set_attr "type" "vnclip") (set_attr "mode" "")]) (define_insn "@pred_narrow_clip_scalar" - [(set (match_operand: 0 "register_operand" "=vd, vr, &vr, vd, vr, &vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1, vm,Wc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec: - [(match_operand:VWEXTI 3 "register_operand" " 0, 0, vr, 0, 0, vr") - (match_operand 4 "pmode_reg_or_uimm5_operand" " r, r, r, K, K, K")] VNCLIP) - (match_operand: 2 
"vector_merge_operand" "0vu,0vu, 0vu,0vu,0vu, 0vu")))] + [(match_operand:VWEXTI 3 "register_operand" " 0, 0, 0, 0, vr, vr") + (match_operand 4 "pmode_reg_or_uimm5_operand" " rK, rK, rK, rK, rK, rK")] VNCLIP) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vnclip.w%o4\t%0,%3,%4%p1" [(set_attr "type" "vnclip") @@ -6342,96 +6342,96 @@ ;; ------------------------------------------------------------------------------- (define_insn "@pred_narrow_fcvt_x_f" - [(set (match_operand: 0 "register_operand" "=vd, vr, ?&vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec: - [(match_operand:VF 3 "register_operand" " 0, 0, vr")] VFCVTS) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu")))] + [(match_operand:VF 3 "register_operand" " 0, 0, 0, 0, vr, vr")] VFCVTS) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vfncvt.x.f.w\t%0,%3%p1" [(set_attr "type" "vfncvtftoi") (set_attr "mode" "")]) (define_insn "@pred_narrow_" - [(set (match_operand: 0 "register_operand" "=vd, vr, ?&vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_fix: - (match_operand:VF 3 "register_operand" " 0, 0, vr")) - (match_operand: 2 "vector_merge_operand" " 0vu,0vu, 0vu")))] + (match_operand:VF 3 "register_operand" " 0, 0, 0, 0, vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vfncvt.rtz.x.f.w\t%0,%3%p1" [(set_attr "type" "vfncvtftoi") (set_attr "mode" "")]) (define_insn "@pred_narrow_" - [(set (match_operand: 0 "register_operand" "=vd, vr, ?&vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, 
i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_float: - (match_operand:VWCONVERTI 3 "register_operand" " 0, 0, vr")) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu")))] + (match_operand:VWCONVERTI 3 "register_operand" " 0, 0, 0, 0, vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vfncvt.f.x.w\t%0,%3%p1" [(set_attr "type" "vfncvtitof") (set_attr "mode" "")]) (define_insn "@pred_trunc" - [(set (match_operand: 0 "register_operand" "=vd, vr, ?&vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (float_truncate: - (match_operand:VWEXTF 3 "register_operand" " 0, 0, vr")) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu")))] + (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vfncvt.f.f.w\t%0,%3%p1" [(set_attr "type" "vfncvtftof") (set_attr "mode" "")]) (define_insn "@pred_rod_trunc" - [(set (match_operand: 0 "register_operand" "=vd, vr, ?&vr") + [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" " vm,Wc1,vmWc1") - (match_operand 4 "vector_length_operand" " rK, rK, rK") - (match_operand 5 "const_int_operand" " i, i, i") - (match_operand 6 "const_int_operand" " i, i, i") - (match_operand 7 "const_int_operand" " i, i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec: [(float_truncate: - (match_operand:VWEXTF 3 "register_operand" " 0, 0, vr"))] UNSPEC_ROD) - (match_operand: 2 "vector_merge_operand" "0vu,0vu, 0vu")))] + (match_operand:VWEXTF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))] UNSPEC_ROD) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vfncvt.rod.f.f.w\t%0,%3%p1" [(set_attr "type" "vfncvtftof") diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-1.c new file mode 100644 index 00000000000..0cdf60cde06 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-1.c @@ -0,0 +1,319 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base, vl); + vint8mf8_t v = 
__riscv_vncvt_x_x_w_i8mf8(src,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base, vl); + vint8mf8_t src2 = __riscv_vle8_v_i8mf8 ((int8_t *)(base + 100), vl); + vint8mf8_t v = __riscv_vncvt_x_x_w_i8mf8_tu(src2,src,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base, vl); + vint8mf8_t v = __riscv_vncvt_x_x_w_i8mf8(src,vl); + vint16mf4_t v2 = __riscv_vadd_vv_i16mf4 (src, src,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse16_v_i16mf4 ((int16_t *)out,v2,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base + 100*i, vl); + vint8mf8_t v = __riscv_vncvt_x_x_w_i8mf8(src,vl); + vint16mf4_t v2 = __riscv_vadd_vv_i16mf4 (src, src,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_i16mf4 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base, vl); + vint8mf8_t v = __riscv_vncvt_x_x_w_i8mf8(src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + vint16mf4_t v2 = __riscv_vadd_vv_i16mf4 (src, src,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse16_v_i16mf4 ((int16_t *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n) +{ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vint8mf8_t v = __riscv_vncvt_x_x_w_i8mf8_m(m,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vle8_v_i8mf8_tu (v, base2, vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f6 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f7 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + vint8m1_t src2 = __riscv_vle8_v_i8m1 ((int8_t *)(base + 100), vl); + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1_tu(src2,src,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f8 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + vint16m2_t v2 = __riscv_vadd_vv_i16m2 (src, src,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse16_v_i16m2 ((int16_t *)out,v2,vl); +} + +void f9 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint16m2_t src = __riscv_vle16_v_i16m2 (base + 100*i, vl); + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + vint16m2_t v2 = __riscv_vadd_vv_i16m2 (src, src,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + __riscv_vse16_v_i16m2 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f10 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + vint16m2_t v2 = __riscv_vadd_vv_i16m2 (src, src,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse16_v_i16m2 ((int16_t *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vint8m1_t v = 
__riscv_vncvt_x_x_w_i8m1_m(m,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vle8_v_i8m1_tu (v, base2, vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f12 (int16_t *base,int8_t *out,size_t vl, int n) +{ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base + 100*i, vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f13 (int16_t *base,int8_t *out,size_t vl, int n) +{ + vint8m1_t v = __riscv_vle8_v_i8m1 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vint16m2_t src = __riscv_vle16_v_i16m2 (base + 100*i, vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f14 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((int8_t *)(base + 1000 * i), vl); + vint16mf4_t src = __riscv_vle16_v_i16mf4 (base + 100*i, vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f15 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((int8_t *)(base + 1000 * i), vl); + vint16m2_t src = __riscv_vle16_v_i16m2 (base + 100*i, vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f16 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((int8_t *)(base + 1000 * i), vl); + vint16mf4_t src1 = __riscv_vle16_v_i16mf4 (base + 100*i, vl); + vint16mf4_t src2 = __riscv_vle16_v_i16mf4 (base + 200*i, vl); + vint16mf4_t src3 = __riscv_vle16_v_i16mf4 (base + 300*i, vl); + vint16mf4_t src4 = __riscv_vle16_v_i16mf4 (base + 400*i, vl); + vint16mf4_t src5 = __riscv_vle16_v_i16mf4 (base + 500*i, vl); + vint16mf4_t src6 = __riscv_vle16_v_i16mf4 (base + 600*i, vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src1,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src2,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src3,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src4,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src5,vl); + v = __riscv_vncvt_x_x_w_i8mf8_tu(v,src6,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f17 (int16_t *base,int8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((int8_t *)(base + 1000 * i), vl); + vint16m2_t src1 = __riscv_vle16_v_i16m2 (base + 100*i, vl); + vint16m2_t src2 = __riscv_vle16_v_i16m2 (base + 200*i, vl); + vint16m2_t src3 = 
__riscv_vle16_v_i16m2 (base + 300*i, vl); + vint16m2_t src4 = __riscv_vle16_v_i16m2 (base + 400*i, vl); + vint16m2_t src5 = __riscv_vle16_v_i16m2 (base + 500*i, vl); + vint16m2_t src6 = __riscv_vle16_v_i16m2 (base + 600*i, vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src1,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src2,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src3,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src4,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src5,vl); + v = __riscv_vncvt_x_x_w_i8m1_tu(v,src6,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n) +{ + vint32mf2_t src = __riscv_vle32_v_i32mf2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vint16mf4_t v = __riscv_vncvt_x_x_w_i16mf4_m(m,src,vl); + vint16mf4_t v2 = __riscv_vle16_v_i16mf4_tu (v, base2 + i, vl); + vint8mf8_t v3 = __riscv_vncvt_x_x_w_i8mf8_m(m,v2,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n) +{ + vint32m4_t src = __riscv_vle32_v_i32m4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vint16m2_t v = __riscv_vncvt_x_x_w_i16m2_m(m,src,vl); + vint16m2_t v2 = __riscv_vle16_v_i16m2_tu (v, base2 + i, vl); + vint8m1_t v3 = __riscv_vncvt_x_x_w_i8m1_m(m,v2,vl); + vint8m1_t v4 = __riscv_vncvt_x_x_w_i8m1_tumu(m,v3,v2,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_i8m1 (out + 222*i,v4,vl); + } +} + +void f20 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + /* Only allow vncvt SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f21 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m1_t src = __riscv_vle16_v_i16m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vint8mf2_t v = __riscv_vncvt_x_x_w_i8mf2(src,vl); + /* Only allow vncvt SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse8_v_i8mf2 (out,v,vl); +} + +void f22 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t src = __riscv_vle16_v_i16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vint8m1_t v = __riscv_vncvt_x_x_w_i8m1(src,vl); + /* Only allow v29. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vadd_vv_i8m1 (v,v,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-10.c new file mode 100644 index 00000000000..5b371482d9b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-10.c @@ -0,0 +1,293 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f1 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t src2 = __riscv_vle32_v_f32mf2 ((void *)(base + 100), vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2_tu(src2,src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); + __riscv_vse64_v_f64m1 ((void *)out,v2,vl); +} + +void f3 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + __riscv_vse64_v_f64m1 ((void *)(out + 200*i),v2,vl); + } +} + +void f4 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); + __riscv_vse64_v_f64m1 ((void *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2_m(m,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vle32_v_f32mf2_tu (v, base2, vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f6 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); +} + +void f7 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t src2 = __riscv_vle32_v_f32m1 ((void *)(base + 100), vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1_tu(src2,src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); +} + +void f8 (void *base,void 
*out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); + __riscv_vse64_v_f64m2 ((void *)out,v2,vl); +} + +void f9 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + __riscv_vse64_v_f64m2 ((void *)(out + 200*i),v2,vl); + } +} + +void f10 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); + __riscv_vse64_v_f64m2 ((void *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool32_t m = __riscv_vlm_v_b32 (base + i, vl); + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1_m(m,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vle32_v_f32m1_tu (v, base2, vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f12 (void *base,void *out,size_t vl, int n) +{ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f13 (void *base,void *out,size_t vl, int n) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000 * i), vl); + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f15 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000 * i), vl); + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = 
__riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f16 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000 * i), vl); + vfloat64m1_t src1 = __riscv_vle64_v_f64m1 (base + 100*i, vl); + vfloat64m1_t src2 = __riscv_vle64_v_f64m1 (base + 200*i, vl); + vfloat64m1_t src3 = __riscv_vle64_v_f64m1 (base + 300*i, vl); + vfloat64m1_t src4 = __riscv_vle64_v_f64m1 (base + 400*i, vl); + vfloat64m1_t src5 = __riscv_vle64_v_f64m1 (base + 500*i, vl); + vfloat64m1_t src6 = __riscv_vle64_v_f64m1 (base + 600*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src1,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src2,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src3,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src4,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src5,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src6,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000 * i), vl); + vfloat64m2_t src1 = __riscv_vle64_v_f64m2 (base + 100*i, vl); + vfloat64m2_t src2 = __riscv_vle64_v_f64m2 (base + 200*i, vl); + vfloat64m2_t src3 = __riscv_vle64_v_f64m2 (base + 300*i, vl); + vfloat64m2_t src4 = __riscv_vle64_v_f64m2 (base + 400*i, vl); + vfloat64m2_t src5 = __riscv_vle64_v_f64m2 (base + 500*i, vl); + vfloat64m2_t src6 = __riscv_vle64_v_f64m2 (base + 600*i, vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src1,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src2,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src3,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src4,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src5,vl); + v = __riscv_vfncvt_f_f_w_f32m1_tu(v,src6,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + /* Only allow vncvt SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse32_v_f32m1 (out,v,vl); +} + +void f19 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + /* Only allow vncvt SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f20 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + /* Only allow load v30,v31. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vfloat32m1_t v = __riscv_vfncvt_f_f_w_f32m1(src,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vfadd_vv_f32m1 (v,v,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse32_v_f32m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-11.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-11.c new file mode 100644 index 00000000000..4d4e4f30e9f --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-11.c @@ -0,0 +1,293 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f1 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t src2 = __riscv_vle32_v_f32mf2 ((void *)(base + 100), vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2_tu(src2,src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); + __riscv_vse64_v_f64m1 ((void *)out,v2,vl); +} + +void f3 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + __riscv_vse64_v_f64m1 ((void *)(out + 200*i),v2,vl); + } +} + +void f4 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + vfloat64m1_t v2 = __riscv_vfadd_vv_f64m1 (src, src,vl); + __riscv_vse32_v_f32mf2 (out,v,vl); + __riscv_vse64_v_f64m1 ((void *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2_m(m,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vle32_v_f32mf2_tu (v, base2, vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f6 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + 
__riscv_vse32_v_f32m1 (out,v,vl); +} + +void f7 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t src2 = __riscv_vle32_v_f32m1 ((void *)(base + 100), vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(src2,src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); +} + +void f8 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); + __riscv_vse64_v_f64m2 ((void *)out,v2,vl); +} + +void f9 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + __riscv_vse64_v_f64m2 ((void *)(out + 200*i),v2,vl); + } +} + +void f10 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + vfloat64m2_t v2 = __riscv_vfadd_vv_f64m2 (src, src,vl); + __riscv_vse32_v_f32m1 (out,v,vl); + __riscv_vse64_v_f64m2 ((void *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool32_t m = __riscv_vlm_v_b32 (base + i, vl); + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1_m(m,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vle32_v_f32m1_tu (v, base2, vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f12 (void *base,void *out,size_t vl, int n) +{ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f13 (void *base,void *out,size_t vl, int n) +{ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000 * i), vl); + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base + 100*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f15 (void *base,void 
*out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000 * i), vl); + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base + 100*i, vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f16 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32mf2_t v = __riscv_vle32_v_f32mf2 ((void *)(base + 1000 * i), vl); + vfloat64m1_t src1 = __riscv_vle64_v_f64m1 (base + 100*i, vl); + vfloat64m1_t src2 = __riscv_vle64_v_f64m1 (base + 200*i, vl); + vfloat64m1_t src3 = __riscv_vle64_v_f64m1 (base + 300*i, vl); + vfloat64m1_t src4 = __riscv_vle64_v_f64m1 (base + 400*i, vl); + vfloat64m1_t src5 = __riscv_vle64_v_f64m1 (base + 500*i, vl); + vfloat64m1_t src6 = __riscv_vle64_v_f64m1 (base + 600*i, vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src1,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src2,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src3,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src4,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src5,vl); + v = __riscv_vfncvt_f_f_w_f32mf2_tu(v,src6,vl); + __riscv_vse32_v_f32mf2 (out + 100*i,v,vl); + } +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vfloat32m1_t v = __riscv_vle32_v_f32m1 ((void *)(base + 1000 * i), vl); + vfloat64m2_t src1 = __riscv_vle64_v_f64m2 (base + 100*i, vl); + vfloat64m2_t src2 = __riscv_vle64_v_f64m2 (base + 200*i, vl); + vfloat64m2_t src3 = __riscv_vle64_v_f64m2 (base + 300*i, vl); + vfloat64m2_t src4 = __riscv_vle64_v_f64m2 (base + 400*i, vl); + vfloat64m2_t src5 = __riscv_vle64_v_f64m2 (base + 500*i, vl); + vfloat64m2_t src6 = __riscv_vle64_v_f64m2 (base + 600*i, vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src1,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src2,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src3,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src4,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src5,vl); + v = __riscv_vfncvt_rod_f_f_w_f32m1_tu(v,src6,vl); + __riscv_vse32_v_f32m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + /* Only allow vncvt SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse32_v_f32m1 (out,v,vl); +} + +void f19 (void *base,void *out,size_t vl) +{ + vfloat64m1_t src = __riscv_vle64_v_f64m1 (base, vl); + /* Only allow load v31. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vfloat32mf2_t v = __riscv_vfncvt_f_f_w_f32mf2(src,vl); + /* Only allow vncvt SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse32_v_f32mf2 (out,v,vl); +} + +void f20 (void *base,void *out,size_t vl) +{ + vfloat64m2_t src = __riscv_vle64_v_f64m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vfloat32m1_t v = __riscv_vfncvt_rod_f_f_w_f32m1(src,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vfadd_vv_f32m1 (v,v,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse32_v_f32m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-2.c new file mode 100644 index 00000000000..28ea6217ce7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-2.c @@ -0,0 +1,370 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */ + +#include "riscv_vector.h" + +void f0 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f3 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f4 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = 
__riscv_vnsrl_wv_u8mf8_tumu(m,v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f5 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8_m(m,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f6 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8_m(m,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f7 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8mf8 (src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f8 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8mf8 (src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f9 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f10 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f11 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8_tu (v2,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f12 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8 (src,v2,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f13 (void *base,void *base2,void *out,size_t vl, int n) +{ 
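+  /* SRC stays live across the whole loop and the mask is reloaded on
+     every iteration, while the tail-undisturbed vnsrl reuses V as
+     merge value, shift operand and destination; the allocator must
+     honor the overlap rule (narrow destination only in the
+     lowest-numbered part of the wide source group) without emitting a
+     vmv (see the dg-final directives below).  */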
+ vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8_m(m,src,vl,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v2,vl); + } +} + +void f15 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v2,vl); + } +} + +void f16 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vncvt_x_x_w_u8mf8(src,vl); + vuint8mf8_t v3 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse8_v_u8mf8 (out + 100,v3,vl); +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 (base + 200*i, vl); + vuint8mf8_t v = __riscv_vnsrl_wv_u8mf8(src,src2,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + asm volatile ("":::"memory"); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((out + 200*i),src,vl); + __riscv_vse8_v_u8mf8 ((out + 300*i),src2,vl); + } +} + +void f18 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f19 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8mf8(src,v,vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 200*i,v2,vl); + } +} + +void f20 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = 
__riscv_vnsrl_wv_u8mf8(src,v,vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + v2 = __riscv_vnsrl_wv_u8mf8(src,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 200*i,v2,vl); + } +} + +void f21 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 (out + 200*i,src,vl); + } +} + +void f22 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src1,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src2,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src3,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src4,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src5,v,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src6,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f23 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 1000 * i), vl); + vuint8mf8_t v2 = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 2000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src1,v2,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src2,v2,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src3,v2,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src4,v2,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src5,v2,vl); + v = __riscv_vnsrl_wv_u8mf8_tu(v,src6,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f24 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200, vl); + vuint8mf8_t src3 = __riscv_vle8_v_u8mf8 (base + 300, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnsrl_wv_u16mf4_m(m,src,src2,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnsrl_wv_u8mf8_m(m,v2,src3,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f25 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", 
"v27", "v28", "v29", "v30"); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 100, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + __riscv_vse8_v_u8mf8 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); +} + +void f26 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 100, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnsrl_wv_u8mf8(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + __riscv_vse8_v_u8mf8 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-3.c new file mode 100644 index 00000000000..72bff02c518 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-3.c @@ -0,0 +1,392 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */ + +#include "riscv_vector.h" + +void f0 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f1 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f3 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f4 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1_tumu(m,v,src,v,vl); + 
__riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f5 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1_m(m,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f6 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1_m(m,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f7 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8m1 (src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f8 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8m1 (src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f9 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f10 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f11 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1_tu (v2,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f12 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1 (src,v2,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f13 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + 
i, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1_m(m,src,vl,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v2,vl); + } +} + +void f15 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v2,vl); + } +} + +void f16 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vncvt_x_x_w_u8m1(src,vl); + vuint8m1_t v3 = __riscv_vnsrl_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse8_v_u8m1 (out + 100,v3,vl); +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 (base + 200*i, vl); + vuint8m1_t v = __riscv_vnsrl_wv_u8m1(src,src2,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + asm volatile ("":::"memory"); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((out + 200*i),src,vl); + __riscv_vse8_v_u8m1 ((out + 300*i),src2,vl); + } +} + +void f18 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f19 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 200*i,v2,vl); + } +} + +void f20 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8m1(src,v,vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + v2 = __riscv_vnsrl_wv_u8m1(src,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 200*i,v2,vl); + } +} + 
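+/* f21 to f23 chain tail-undisturbed vnsrl at LMUL = 2: V is merge
+   value and destination (and in f21/f22 also the narrow shift
+   operand), while the wide sources, and V2 in f23, stay live across
+   the chain, so a destination may overlap a source group only in its
+   lowest-numbered part.  No vmv or csrr is expected (see the dg-final
+   directives at the end of the file).  */
+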
+void f21 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 (out + 200*i,src,vl); + } +} + +void f22 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src1,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src2,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src3,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src4,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src5,v,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src6,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f23 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 1000 * i), vl); + vuint8m1_t v2 = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 2000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src1,v2,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src2,v2,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src3,v2,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src4,v2,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src5,v2,vl); + v = __riscv_vnsrl_wv_u8m1_tu(v,src6,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f24 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200, vl); + vuint8m1_t src3 = __riscv_vle8_v_u8m1 (base + 300, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnsrl_wv_u16m2_m(m,src,src2,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnsrl_wv_u8m1_m(m,v2,src3,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + } +} + +void f25 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", 
"v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + __riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); +} + +void f26 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28"); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + __riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); +} + +void f27 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28"); + vuint8m1_t v2 = __riscv_vnsrl_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v29", "v28", "v30", "v31"); + __riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v29", "v28", "v30", "v31"); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-4.c new file mode 100644 index 00000000000..28971a0aad8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-4.c @@ -0,0 +1,319 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 100), vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8_tu(src2,src,shift,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + 
__riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f4 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8_m(m,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f6 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f7 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 ((int8_t *)(base + 100), vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1_tu(src2,src,shift,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f8 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f9 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f10 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1_m(m,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f12 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = 
__riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f13 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f15 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f16 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src1,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src2,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src3,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src4,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src5,shift,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src6,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f17 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = 
__riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src1,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src2,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src3,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src4,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src5,shift,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src6,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnsrl_wx_u16mf4_m(m,src,shift,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnsrl_wx_u8mf8_m(m,v2,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnsrl_wx_u16m2_m(m,src,shift,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnsrl_wx_u8m1_m(m,v2,shift,vl); + vuint8m1_t v4 = __riscv_vnsrl_wx_u8m1_tumu(m,v3,v2,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_u8m1 (out + 222*i,v4,vl); + } +} + +void f20 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + /* Only allow vncvt SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f21 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m1_t src = __riscv_vle16_v_u16m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vuint8mf2_t v = __riscv_vnsrl_wx_u8mf2(src,shift,vl); + /* Only allow vncvt SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse8_v_u8mf2 (out,v,vl); +} + +void f22 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,shift,vl); + /* Only allow v29. 
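+     (the asm statement that follows preserves nothing but v29, so the
+     narrowed result has to end up there.)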
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vadd_vv_u8m1 (v,v,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-5.c new file mode 100644 index 00000000000..26675bcc87c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-5.c @@ -0,0 +1,319 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,31,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 100), vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8_tu(src2,src,31,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f4 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8(src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnsrl_wx_u8mf8_m(m,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f6 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f7 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 ((int8_t *)(base + 100), vl); + vuint8m1_t v = 
__riscv_vnsrl_wx_u8m1_tu(src2,src,31,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f8 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f9 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f10 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vnsrl_wx_u8m1_m(m,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f12 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f13 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f15 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = 
__riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f16 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src1,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src2,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src3,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src4,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src5,31,vl); + v = __riscv_vnsrl_wx_u8mf8_tu(v,src6,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f17 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src1,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src2,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src3,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src4,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src5,31,vl); + v = __riscv_vnsrl_wx_u8m1_tu(v,src6,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnsrl_wx_u16mf4_m(m,src,31,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnsrl_wx_u8mf8_m(m,v2,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnsrl_wx_u16m2_m(m,src,31,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnsrl_wx_u8m1_m(m,v2,31,vl); + vuint8m1_t v4 = __riscv_vnsrl_wx_u8m1_tumu(m,v3,v2,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_u8m1 (out + 222*i,v4,vl); + } +} + +void f20 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. 
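+     (v0 through v29 are clobbered, which leaves only the aligned
+     v30/v31 pair for the LMUL = 2 load.)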
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + /* Only allow vncvt SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f21 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m1_t src = __riscv_vle16_v_u16m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vuint8mf2_t v = __riscv_vnsrl_wx_u8mf2(src,31,vl); + /* Only allow vncvt SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse8_v_u8mf2 (out,v,vl); +} + +void f22 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnsrl_wx_u8m1(src,31,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vadd_vv_u8m1 (v,v,vl); + /* Only allow v29. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-6.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-6.c new file mode 100644 index 00000000000..fd7ffd3c97b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-6.c @@ -0,0 +1,369 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */ + +#include "riscv_vector.h" + +void f0 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f3 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f4 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8_tumu(m,v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f5 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8_m(m,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f6 (void *base,void *out,size_t vl, size_t shift) +{ + vbool64_t m = __riscv_vlm_v_b64 (base + 500, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8_m(m,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f7 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8mf8 (src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f8 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); 
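+  /* SRC is stored again below, so it must stay live across both
+     narrowing clips; any overlap with the u8mf8 results is legal only
+     in the lowest-numbered part of its register group.  */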
+ vuint8mf8_t v2 = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8mf8 (src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f9 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f10 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 (out+100,src,vl); +} + +void f11 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8_tu (v2,src,v,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); + __riscv_vse16_v_u16mf4 (out+200,src,vl); +} + +void f12 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8 (src,v2,vl); + __riscv_vse8_v_u8mf8 (out,v2,vl); + __riscv_vse8_v_u8mf8 (out+100,v,vl); +} + +void f13 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8_m(m,src,vl,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v2,vl); + } +} + +void f15 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 600, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v = 
__riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v2,vl); + } +} + +void f16 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vncvt_x_x_w_u8mf8(src,vl); + vuint8mf8_t v3 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse8_v_u8mf8 (out + 100,v3,vl); +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 (base + 200*i, vl); + vuint8mf8_t v = __riscv_vnclipu_wv_u8mf8(src,src2,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + asm volatile ("":::"memory"); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((out + 200*i),src,vl); + __riscv_vse8_v_u8mf8 ((out + 300*i),src2,vl); + } +} + +void f18 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f19 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 200*i,v2,vl); + } +} + +void f20 (void *base,void *out,size_t vl, int n) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8mf8(src,v,vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + v2 = __riscv_vnclipu_wv_u8mf8(src,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse8_v_u8mf8 (out + 200*i,v2,vl); + } +} + +void f21 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 (out + 200*i,src,vl); + } +} + +void f22 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t 
src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src1,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src2,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src3,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src4,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src5,v,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src6,v,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f23 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 1000 * i), vl); + vuint8mf8_t v2 = __riscv_vle8_v_u8mf8 ((uint8_t *)(base + 2000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src1,v2,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src2,v2,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src3,v2,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src4,v2,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src5,v2,vl); + v = __riscv_vnclipu_wv_u8mf8_tu(v,src6,v2,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f24 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200, vl); + vuint8mf8_t src3 = __riscv_vle8_v_u8mf8 (base + 300, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnclipu_wv_u16mf4_m(m,src,src2,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnclipu_wv_u8mf8_m(m,v2,src3,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f25 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 100, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + __riscv_vse8_v_u8mf8 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); +} + +void f26 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 (base + 100, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v2 = __riscv_vnclipu_wv_u8mf8(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + 
"v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + __riscv_vse8_v_u8mf8 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-7.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-7.c new file mode 100644 index 00000000000..70ba7d7459e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-7.c @@ -0,0 +1,392 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-schedule-insns -fno-schedule-insns2" } */ + +#include "riscv_vector.h" + +void f0 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f1 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f3 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f4 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1_tumu(m,v,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f5 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1_m(m,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f6 (void *base,void *out,size_t vl, size_t shift) +{ + vbool8_t m = __riscv_vlm_v_b8 (base + 500, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1_m(m,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f7 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8m1 (src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f8 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = 
__riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8m1 (src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f9 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f10 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 (out+100,src,vl); +} + +void f11 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1_tu (v2,src,v,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); + __riscv_vse16_v_u16m2 (out+200,src,vl); +} + +void f12 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v2 = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1 (src,v2,vl); + __riscv_vse8_v_u8m1 (out,v2,vl); + __riscv_vse8_v_u8m1 (out+100,v,vl); +} + +void f13 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1_m(m,src,vl,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v2,vl); + } +} + +void f15 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 600, vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + v = 
__riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v2,vl); + } +} + +void f16 (uint16_t *base,uint8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vncvt_x_x_w_u8m1(src,vl); + vuint8m1_t v3 = __riscv_vnclipu_wv_u8m1(src,v,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse8_v_u8m1 (out + 100,v3,vl); +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 (base + 200*i, vl); + vuint8m1_t v = __riscv_vnclipu_wv_u8m1(src,src2,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + asm volatile ("":::"memory"); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((out + 200*i),src,vl); + __riscv_vse8_v_u8m1 ((out + 300*i),src2,vl); + } +} + +void f18 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f19 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 200*i,v2,vl); + } +} + +void f20 (void *base,void *out,size_t vl, int n) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8m1(src,v,vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + v2 = __riscv_vnclipu_wv_u8m1(src,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse8_v_u8m1 (out + 200*i,v2,vl); + } +} + +void f21 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 (out + 200*i,src,vl); + } +} + +void f22 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = 
__riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src1,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src2,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src3,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src4,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src5,v,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src6,v,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f23 (uint16_t *base,uint8_t *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 1000 * i), vl); + vuint8m1_t v2 = __riscv_vle8_v_u8m1 ((uint8_t *)(base + 2000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src1,v2,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src2,v2,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src3,v2,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src4,v2,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src5,v2,vl); + v = __riscv_vnclipu_wv_u8m1_tu(v,src6,v2,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f24 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200, vl); + vuint8m1_t src3 = __riscv_vle8_v_u8m1 (base + 300, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnclipu_wv_u16m2_m(m,src,src2,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnclipu_wv_u8m1_m(m,v2,src3,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + } +} + +void f25 (void *base,void *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + __riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); +} + +void f26 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28"); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + 
__riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); +} + +void f27 (void *base,void *out,size_t vl, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 (base + 100, vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28"); + vuint8m1_t v2 = __riscv_vnclipu_wv_u8m1(src,v,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + __riscv_vse8_v_u8m1 (out,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-8.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-8.c new file mode 100644 index 00000000000..ec8a5565bd9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-8.c @@ -0,0 +1,319 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 100), vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8_tu(src2,src,shift,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f4 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for 
(int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8_m(m,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f6 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f7 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 ((int8_t *)(base + 100), vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1_tu(src2,src,shift,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f8 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f9 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f10 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1_m(m,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f12 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f13 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (int16_t *base,int8_t 
*out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f15 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f16 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src1,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src2,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src3,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src4,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src5,shift,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src6,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f17 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src1,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src2,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src3,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src4,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src5,shift,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src6,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnclipu_wx_u16mf4_m(m,src,shift,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnclipu_wx_u8mf8_m(m,v2,shift,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + 
vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnclipu_wx_u16m2_m(m,src,shift,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnclipu_wx_u8m1_m(m,v2,shift,vl); + vuint8m1_t v4 = __riscv_vnclipu_wx_u8m1_tumu(m,v3,v2,shift,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_u8m1 (out + 222*i,v4,vl); + } +} + +void f20 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + /* Only allow vnclipu SRC == DEST v30. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f21 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m1_t src = __riscv_vle16_v_u16m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vuint8mf2_t v = __riscv_vnclipu_wx_u8mf2(src,shift,vl); + /* Only allow vnclipu SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse8_v_u8mf2 (out,v,vl); +} + +void f22 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,shift,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vadd_vv_u8m1 (v,v,vl); + /* Only allow v29. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-9.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-9.c new file mode 100644 index 00000000000..ff34749bdb9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-9.c @@ -0,0 +1,319 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,31,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t src2 = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 100), vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8_tu(src2,src,31,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f4 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8(src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + vuint16mf4_t v2 = __riscv_vadd_vv_u16mf4 (src, src,vl); + __riscv_vse8_v_u8mf8 (out,v,vl); + __riscv_vse16_v_u16mf4 ((int16_t *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint8mf8_t v = __riscv_vnclipu_wx_u8mf8_m(m,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vle8_v_u8mf8_tu (v, base2, vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f6 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f7 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t src2 = __riscv_vle8_v_u8m1 ((int8_t *)(base + 100), vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1_tu(src2,src,31,vl); + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f8 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + vuint16m2_t v2 = 
__riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f9 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)(out + 200*i),v2,vl); + } +} + +void f10 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + vuint16m2_t v2 = __riscv_vadd_vv_u16m2 (src, src,vl); + __riscv_vse8_v_u8m1 (out,v,vl); + __riscv_vse16_v_u16m2 ((int16_t *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint8m1_t v = __riscv_vnclipu_wx_u8m1_m(m,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vle8_v_u8m1_tu (v, base2, vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f12 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f13 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f14 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f15 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src = __riscv_vle16_v_u16m2 (base + 100*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + v = 
__riscv_vnclipu_wx_u8m1_tu(v,src,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f16 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8mf8_t v = __riscv_vle8_v_u8mf8 ((int8_t *)(base + 1000 * i), vl); + vuint16mf4_t src1 = __riscv_vle16_v_u16mf4 (base + 100*i, vl); + vuint16mf4_t src2 = __riscv_vle16_v_u16mf4 (base + 200*i, vl); + vuint16mf4_t src3 = __riscv_vle16_v_u16mf4 (base + 300*i, vl); + vuint16mf4_t src4 = __riscv_vle16_v_u16mf4 (base + 400*i, vl); + vuint16mf4_t src5 = __riscv_vle16_v_u16mf4 (base + 500*i, vl); + vuint16mf4_t src6 = __riscv_vle16_v_u16mf4 (base + 600*i, vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src1,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src2,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src3,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src4,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src5,31,vl); + v = __riscv_vnclipu_wx_u8mf8_tu(v,src6,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v,vl); + } +} + +void f17 (int16_t *base,int8_t *out,size_t vl, int n, size_t shift) +{ + for (int i = 0; i < n; i++){ + vuint8m1_t v = __riscv_vle8_v_u8m1 ((int8_t *)(base + 1000 * i), vl); + vuint16m2_t src1 = __riscv_vle16_v_u16m2 (base + 100*i, vl); + vuint16m2_t src2 = __riscv_vle16_v_u16m2 (base + 200*i, vl); + vuint16m2_t src3 = __riscv_vle16_v_u16m2 (base + 300*i, vl); + vuint16m2_t src4 = __riscv_vle16_v_u16m2 (base + 400*i, vl); + vuint16m2_t src5 = __riscv_vle16_v_u16m2 (base + 500*i, vl); + vuint16m2_t src6 = __riscv_vle16_v_u16m2 (base + 600*i, vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src1,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src2,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src3,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src4,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src5,31,vl); + v = __riscv_vnclipu_wx_u8m1_tu(v,src6,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32mf2_t src = __riscv_vle32_v_u32mf2 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vuint16mf4_t v = __riscv_vnclipu_wx_u16mf4_m(m,src,31,vl); + vuint16mf4_t v2 = __riscv_vle16_v_u16mf4_tu (v, base2 + i, vl); + vuint8mf8_t v3 = __riscv_vnclipu_wx_u8mf8_m(m,v2,31,vl); + __riscv_vse8_v_u8mf8 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n, size_t shift) +{ + vuint32m4_t src = __riscv_vle32_v_u32m4 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint16m2_t v = __riscv_vnclipu_wx_u16m2_m(m,src,31,vl); + vuint16m2_t v2 = __riscv_vle16_v_u16m2_tu (v, base2 + i, vl); + vuint8m1_t v3 = __riscv_vnclipu_wx_u8m1_m(m,v2,31,vl); + vuint8m1_t v4 = __riscv_vnclipu_wx_u8m1_tumu(m,v3,v2,31,vl); + __riscv_vse8_v_u8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_u8m1 (out + 222*i,v4,vl); + } +} + +void f20 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + /* Only allow vnclipu SRC == DEST v30. 
*/ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +void f21 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m1_t src = __riscv_vle16_v_u16m1 (base, vl); + /* Only allow load v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + vuint8mf2_t v = __riscv_vnclipu_wx_u8mf2(src,31,vl); + /* Only allow vnclipu SRC == DEST v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vse8_v_u8mf2 (out,v,vl); +} + +void f22 (int16_t *base,int8_t *out,size_t vl, size_t shift) +{ + vuint16m2_t src = __riscv_vle16_v_u16m2 (base, vl); + /* Only allow load v30,v31. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vuint8m1_t v = __riscv_vnclipu_wx_u8m1(src,31,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + v = __riscv_vadd_vv_u8m1 (v,v,vl); + /* Only allow v29. */ + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vse8_v_u8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */
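For reference, the register-allocation behavior these tests lock in can be sketched with a stand-alone narrowing loop (an illustrative sketch only, not part of the patch: the function name narrow_kernel and its parameters are invented here, and the intrinsic signatures follow the ones used in the tests above). With the relaxed constraint, the allocator may place the vnclipu destination on the lowest-numbered register of its wide source group, so no vmv copy is required:

#include "riscv_vector.h"

/* Sketch: dest EEW (8) < source EEW (16), so per the RVV overlap rule the
   u8m1 destination may be allocated over the lowest-numbered register of
   the u16m2 source group (e.g. dest v8 over source group v8/v9).  */
void narrow_kernel (uint16_t *in, uint8_t *out, size_t n, size_t shift)
{
  while (n > 0)
    {
      size_t vl = __riscv_vsetvl_e16m2 (n);
      vuint16m2_t wide = __riscv_vle16_v_u16m2 (in, vl);
      vuint8m1_t narrow = __riscv_vnclipu_wx_u8m1 (wide, shift, vl);
      __riscv_vse8_v_u8m1 (out, narrow, vl);
      in += vl;
      out += vl;
      n -= vl;
    }
}

Built with the options the tests use (-march=rv32gcv -mabi=ilp32d -O3), the expectation encoded in the scan-assembler-not directives is that no vmv appears between the vnclipu and the store.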