From patchwork Fri Mar 24 11:29:27 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 74488
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, Juzhe-Zhong
Subject: [GCC14 QUEUE PATCH] RISC-V: Fine tune RVV narrow instruction (source EEW > dest DEST) RA constraint
Date: Fri, 24 Mar 2023 19:29:27 +0800
Message-Id: <20230324112927.285817-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3

From: Juzhe-Zhong

gcc/ChangeLog:

	* config/riscv/vector.md (*pred_cmp_merge_tie_mask): New pattern.
	(*pred_ltge_merge_tie_mask): Ditto.
	(*pred_cmp_scalar_merge_tie_mask): Ditto.
	(*pred_eqne_scalar_merge_tie_mask): Ditto.
	(*pred_cmp_extended_scalar_merge_tie_mask): Ditto.
	(*pred_eqne_extended_scalar_merge_tie_mask): Ditto.
	(*pred_cmp_narrow_merge_tie_mask): Ditto.
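As an illustration of the case the new *_merge_tie_mask patterns target: the destination mask, the v0 mask operand and the merge (maskedoff) operand are all the same register, so the compare can be emitted with the destination tied to v0 (e.g. vmseq.vv v0,v8,v16,v0.t) instead of requiring an early-clobbered temporary. Below is a minimal C sketch of that situation; it assumes the __riscv_-prefixed RVV intrinsics with the _mu (mask-undisturbed) policy suffix, and the function name and shape are illustrative rather than taken from the patch's testsuite.

#include <stddef.h>
#include <riscv_vector.h>

/* Masked compare in which the mask operand, the merge value and the
   result are all the same vbool32_t.  With the tie-mask patterns this
   can stay in v0 (vmseq.vv v0,vs2,vs1,v0.t) rather than being forced
   into a separate destination register.  */
vbool32_t
tie_mask (vbool32_t mask, vint32m1_t a, vint32m1_t b, size_t vl)
{
  return __riscv_vmseq_vv_i32m1_b32_mu (mask, mask, a, b, vl);
}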
gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c:
	* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c:
	* gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
---
 gcc/config/riscv/vector.md                     | 780 ++++++++++++------
 .../riscv/rvv/base/binop_vv_constraint-4.c     |   2 +-
 .../riscv/rvv/base/binop_vx_constraint-150.c   |   2 +-
 .../riscv/rvv/base/narrow_constraint-12.c      | 303 +++++++
 .../riscv/rvv/base/narrow_constraint-13.c      | 133 +++
 .../riscv/rvv/base/narrow_constraint-14.c      | 133 +++
 .../riscv/rvv/base/narrow_constraint-15.c      | 127 +++
 .../riscv/rvv/base/narrow_constraint-16.c      | 127 +++
 .../riscv/rvv/base/narrow_constraint-17.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-18.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-19.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-20.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-21.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-22.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-23.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-24.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-25.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-26.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-27.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-28.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-29.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-30.c      | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-31.c      | 231 ++++++
 23 files changed, 4811 insertions(+), 261 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c
 create mode 100644
gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 1ddc1d3fd39..52597750f69 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1490,63 +1490,63 @@ ;; DEST eew is smaller than SOURCE eew. (define_insn "@pred_indexed_load_x2_smaller_eew" - [(set (match_operand:VEEWTRUNC2 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC2 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC2 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC2 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - (match_operand:VEEWTRUNC2 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC2 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "")]) (define_insn "@pred_indexed_load_x4_smaller_eew" - [(set (match_operand:VEEWTRUNC4 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC4 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC4 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC4 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - 
(match_operand:VEEWTRUNC4 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC4 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "")]) (define_insn "@pred_indexed_load_x8_smaller_eew" - [(set (match_operand:VEEWTRUNC8 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC8 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC8 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC8 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - (match_operand:VEEWTRUNC8 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC8 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") @@ -2420,15 +2420,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) (define_insn "@pred_madc" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr") (unspec: [(plus:VI - (match_operand:VI 1 "register_operand" " vr, vr") - (match_operand:VI 2 "vector_arith_operand" " vr, vi")) - (match_operand: 3 "register_operand" " vm, vm") + (match_operand:VI 1 "register_operand" " %0, vr, vr") + (match_operand:VI 2 "vector_arith_operand" "vrvi, vr, vi")) + (match_operand: 3 "register_operand" " vm, vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK, rK") - (match_operand 5 "const_int_operand" " i, i") + [(match_operand 4 "vector_length_operand" " rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2439,15 +2439,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_msbc" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (unspec: [(minus:VI - (match_operand:VI 1 "register_operand" " vr") - (match_operand:VI 2 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand:VI 1 "register_operand" " 0, vr, vr") + (match_operand:VI 2 "register_operand" " vr, 0, vr")) + (match_operand: 3 "register_operand" " vm, vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2458,16 +2458,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn 
"@pred_madc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "register_operand" " r")) - (match_operand:VI_QHS 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "register_operand" " r, r")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2478,16 +2478,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_msbc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2527,16 +2527,16 @@ }) (define_insn "*pred_madc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2547,17 +2547,17 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "*pred_madc_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2597,16 +2597,16 @@ }) (define_insn "*pred_msbc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: 
[(minus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2617,17 +2617,17 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "*pred_msbc_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2638,14 +2638,14 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_madc_overflow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr") (unspec: [(plus:VI - (match_operand:VI 1 "register_operand" " vr, vr") - (match_operand:VI 2 "vector_arith_operand" " vr, vi")) + (match_operand:VI 1 "register_operand" " %0, vr, vr") + (match_operand:VI 2 "vector_arith_operand" "vrvi, vr, vi")) (unspec: - [(match_operand 3 "vector_length_operand" " rK, rK") - (match_operand 4 "const_int_operand" " i, i") + [(match_operand 3 "vector_length_operand" " rK, rK, rK") + (match_operand 4 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2656,14 +2656,14 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_msbc_overflow" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr, &vr") (unspec: [(minus:VI - (match_operand:VI 1 "register_operand" " vr") - (match_operand:VI 2 "register_operand" " vr")) + (match_operand:VI 1 "register_operand" " 0, vr, vr, vr") + (match_operand:VI 2 "register_operand" " vr, 0, vr, vi")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 4 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2674,15 +2674,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_madc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" 
" rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2693,15 +2693,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_msbc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2740,15 +2740,15 @@ }) (define_insn "*pred_madc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2759,16 +2759,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "*pred_madc_overflow_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2807,15 +2807,15 @@ }) (define_insn "*pred_msbc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2826,16 +2826,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn 
"*pred_msbc_overflow_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -3617,6 +3617,29 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_ltge_operator" + [(match_operand:VI 3 "register_operand" " vr") + (match_operand:VI 4 "vector_arith_operand" "vrvi")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.v%o4\t%0,%3,%v4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp" [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") @@ -3639,19 +3662,19 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr, &vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_ltge_operator" - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:VI 5 "vector_arith_operand" " vr, vr, vi, vi")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VI 5 "vector_arith_operand" " vrvi, vrvi, 0, 0, vrvi, 0, 0, vrvi, vrvi")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") @@ -3674,6 +3697,29 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_ltge_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "ltge_operator" + [(match_operand:VI 3 "register_operand" " vr") + (match_operand:VI 4 "vector_neg_arith_operand" "vrvj")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.v%o4\t%0,%3,%v4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_ltge" [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") @@ -3696,19 +3742,19 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_ltge_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr, &vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "ltge_operator" - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:VI 5 "vector_neg_arith_operand" " vr, vr, vj, vj")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VI 5 "vector_neg_arith_operand" " vrvj, vrvj, 0, 0, vrvj, 0, 0, vrvj, vrvj")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") @@ -3732,6 +3778,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_QHS 3 "register_operand" " vr") + (vec_duplicate:VI_QHS + (match_operand: 4 "register_operand" " r"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3755,20 +3825,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_QHS 4 "register_operand" " vr, vr") + [(match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_QHS - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3792,6 +3862,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_QHS + (match_operand: 4 "register_operand" " r")) + (match_operand:VI_QHS 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3815,20 +3909,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_QHS - (match_operand: 5 "register_operand" " r, r")) - (match_operand:VI_QHS 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")) + (match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3909,6 +4003,54 @@ DONE; }) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_D 3 "register_operand" " vr") + (vec_duplicate:VI_D + (match_operand: 4 "register_operand" " r"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_D + (match_operand: 4 "register_operand" " r")) + (match_operand:VI_D 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3932,20 +4074,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_D 4 "register_operand" " vr, vr") + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_D - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3974,25 +4116,50 @@ ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_D - (match_operand: 5 "register_operand" " r, r")) - (match_operand:VI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")) + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") (set_attr "mode" "")]) +(define_insn "*pred_cmp_extended_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_D 3 "register_operand" " vr") + (vec_duplicate:VI_D + (sign_extend: + (match_operand: 4 "register_operand" " r")))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL 
<= 1 to get better codegen. (define_insn "*pred_cmp_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -4016,26 +4183,51 @@ (set_attr "mode" "")]) (define_insn "*pred_cmp_extended_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_D 4 "register_operand" " vr, vr") + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r")))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") (set_attr "mode" "")]) +(define_insn "*pred_eqne_extended_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_D + (sign_extend: + (match_operand: 4 "register_operand" " r"))) + (match_operand:VI_D 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. 
(define_insn "*pred_eqne_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -4059,21 +4251,21 @@ (set_attr "mode" "")]) (define_insn "*pred_eqne_extended_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r"))) - (match_operand:VI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))) + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -4111,6 +4303,7 @@ { enum rtx_code code = GET_CODE (operands[3]); rtx undef = RVV_VUNDEF (mode); + rtx tmp = gen_reg_rtx (mode); if (code == GEU && rtx_equal_p (operands[5], const0_rtx)) { /* If vmsgeu with 0 immediate, expand it to vmset. */ @@ -4157,12 +4350,11 @@ - pseudoinstruction: vmsge{u}.vx vd, va, x - expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd. */ emit_insn ( - gen_pred_cmp_scalar (operands[0], operands[1], operands[2], + gen_pred_cmp_scalar (tmp, operands[1], operands[2], operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn (gen_pred_nand (operands[0], CONSTM1_RTX (mode), - undef, operands[0], operands[0], - operands[6], operands[8])); + undef, tmp, tmp, operands[6], operands[8])); } else { @@ -4171,13 +4363,12 @@ /* masked va >= x, vd == v0 - pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt - expansion: vmslt{u}.vx vt, va, x; vmandn.mm vd, vd, vt. */ - rtx reg = gen_reg_rtx (mode); emit_insn (gen_pred_cmp_scalar ( - reg, CONSTM1_RTX (mode), undef, operands[3], operands[4], + tmp, CONSTM1_RTX (mode), undef, operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn ( gen_pred_andnot (operands[0], CONSTM1_RTX (mode), undef, - operands[1], reg, operands[6], operands[8])); + operands[1], tmp, operands[6], operands[8])); } else { @@ -4186,10 +4377,10 @@ - expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0. 
*/ emit_insn (gen_pred_cmp_scalar ( - operands[0], operands[1], operands[2], operands[3], operands[4], + tmp, operands[1], operands[2], operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn (gen_pred (XOR, mode, operands[0], - CONSTM1_RTX (mode), undef, operands[0], + CONSTM1_RTX (mode), undef, tmp, operands[1], operands[6], operands[8])); } } @@ -6296,21 +6487,44 @@ [(set_attr "type" "vfcmp") (set_attr "mode" "")]) +(define_insn "*pred_cmp_narrow_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "signed_order_operator" + [(match_operand:VF 3 "register_operand" " vr") + (match_operand:VF 4 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vv\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_cmp_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:VF 4 "register_operand" " vr, vr") - (match_operand:VF 5 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + [(match_operand:VF 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VF 5 "register_operand" " vr, vr, 0, 0, vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vv\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6334,6 +6548,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "signed_order_operator" + [(match_operand:VF 3 "register_operand" " vr") + (vec_duplicate:VF + (match_operand: 4 "register_operand" " f"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vf\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr 
"avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -6357,20 +6595,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:VF 4 "register_operand" " vr, vr") + [(match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VF - (match_operand: 5 "register_operand" " f, f"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f, f, f"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6394,6 +6632,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VF + (match_operand: 4 "register_operand" " f")) + (match_operand:VF 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vf\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -6417,20 +6679,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VF - (match_operand: 5 "register_operand" " f, f")) - (match_operand:VF 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f, f, f")) + (match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6730,44 +6992,44 @@ ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64* (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_reduc:VI (vec_duplicate:VI (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VI 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VI 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vred.vs\t%0,%3,%4%p1" [(set_attr "type" "vired") (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_reduc:VI_ZVE32 (vec_duplicate:VI_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VI_ZVE32 3 "register_operand" " vr, vr, 
vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VI_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vred.vs\t%0,%3,%4%p1" [(set_attr "type" "vired") @@ -6810,90 +7072,90 @@ (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_freduc:VF (vec_duplicate:VF (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vfred.vs\t%0,%3,%4%p1" [(set_attr "type" "vfredu") (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_freduc:VF_ZVE32 (vec_duplicate:VF_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF_ZVE32 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vfred.vs\t%0,%3,%4%p1" [(set_attr "type" "vfredu") (set_attr "mode" "")]) (define_insn "@pred_reduc_plus" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (plus:VF (vec_duplicate:VF (vec_select: 
- (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC)] ORDER))] + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC)] ORDER))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vfredsum.vs\t%0,%3,%4%p1" [(set_attr "type" "vfred") (set_attr "mode" "")]) (define_insn "@pred_reduc_plus" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (plus:VF_ZVE32 (vec_duplicate:VF_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF_ZVE32 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC)] ORDER))] + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC)] ORDER))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vfredsum.vs\t%0,%3,%4%p1" [(set_attr "type" "vfred") diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c index 552c264d895..e16db932f15 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c @@ -24,4 +24,4 @@ void f2 (void * in, void *out, int32_t x) __riscv_vsm_v_b32 (out, m4, 4); } -/* { dg-final { scan-assembler-times {vmv} 2 } } */ +/* { dg-final { scan-assembler-not {vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c index 55a222f47ea..e92a8115f09 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c @@ -18,4 +18,4 @@ void f1 (void * in, void *out, int32_t x) /* { dg-final { scan-assembler-times {vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t} 1 } } */ /* { dg-final { scan-assembler-times {vmxor\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */ /* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 1 } } */ -/* { dg-final { scan-assembler-times {vmv} 1 } } */ +/* { dg-final { scan-assembler-not {vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c new file mode 100644 index 00000000000..df5b2dc5c51 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c @@ -0,0 +1,303 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = 
__riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f1 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t bindex2 = __riscv_vle8_v_i8mf8 ((void *)(base + 100), vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8_tu(bindex2,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse64_v_u64m1 ((void *)out,v2,vl); +} + +void f3 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + __riscv_vse64_v_u64m1 ((void *)(out + 200*i),v2,vl); + } +} + +void f4 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse64_v_u64m1 ((void *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8_m(m,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vle8_v_i8mf8_tu (v, base2, vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f6 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f7 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t src = __riscv_vle8_v_i8m1 ((void *)(base + 100), vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1_tu(src,base,bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f8 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse64_v_u64m8 ((void *)out,v2,vl); +} + +void f9 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + __riscv_vse64_v_u64m8 ((void *)(out + 200*i),v2,vl); + } +} + +void f10 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse64_v_u64m8 ((void *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int 
n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1_m(m,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vle8_v_i8m1_tu (v, base2, vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f12 (void *base,void *out,size_t vl, int n) +{ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f13 (void *base,void *out,size_t vl, int n) +{ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000 * i), vl); + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f15 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000 * i), vl); + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f16 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000 * i), vl); + vuint64m1_t bindex1 = __riscv_vle64_v_u64m1 (base + 100*i, vl); + vuint64m1_t bindex2 = __riscv_vle64_v_u64m1 (base + 200*i, vl); + vuint64m1_t bindex3 = __riscv_vle64_v_u64m1 (base + 300*i, vl); + vuint64m1_t bindex4 = __riscv_vle64_v_u64m1 (base + 400*i, vl); + vuint64m1_t bindex5 = __riscv_vle64_v_u64m1 (base + 500*i, vl); + vuint64m1_t bindex6 = __riscv_vle64_v_u64m1 (base + 600*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex1,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex2,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex3,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex4,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex5,vl); + v = 
__riscv_vluxei64_v_i8mf8_tu(v,base,bindex6,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000 * i), vl); + vuint64m8_t bindex1 = __riscv_vle64_v_u64m8 (base + 100*i, vl); + vuint64m8_t bindex2 = __riscv_vle64_v_u64m8 (base + 200*i, vl); + vuint64m8_t bindex3 = __riscv_vle64_v_u64m8 (base + 300*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex1,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex2,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex3,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint32m4_t v = __riscv_vluxei64_v_u32m4_m(m,base,bindex,vl); + vuint32m4_t v2 = __riscv_vle32_v_u32m4_tu (v, base2 + i, vl); + vint8m1_t v3 = __riscv_vluxei32_v_i8m1_m(m,base,v2,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl); + vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl); + vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl); + vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_i8m1 (out + 222*i,v4,vl); + } +} +void f20 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f21 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vbool8_t m = __riscv_vlm_v_b8 (base, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1_m(m,base,bindex,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f22 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + asm volatile("#" :: + : "v0", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + v = 
__riscv_vadd_vv_i8m1 (v,v,vl); + asm volatile("#" :: + : "v0", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-times {vmv} 1 } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c new file mode 100644 index 00000000000..521af15ee5e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c @@ -0,0 +1,133 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", 
"v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c new file mode 100644 index 00000000000..66a8791aeb2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c @@ -0,0 +1,133 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", 
"v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", 
"v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c new file mode 100644 index 00000000000..b3add7b7bc7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c @@ -0,0 +1,127 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t 
*base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c new file mode 100644 index 00000000000..468471c438a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c @@ -0,0 +1,127 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = 
__riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c new file mode 100644 index 00000000000..97df21dd743 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = 
__riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, 
vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = 
__riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vv_i32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmseq_vv_i32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vv_i32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vv_i32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmseq_vv_i32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c new file mode 100644 index 00000000000..56c95d9c884 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
__riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = 
__riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vv_i32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmslt_vv_i32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vv_i32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vv_i32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmslt_vv_i32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c new file mode 100644 index 00000000000..d50e497d6c9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" 
:: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + 
asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, x,32); + 
mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c new file mode 100644 index 00000000000..4e77c51d058 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm 
volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + 
asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c new file mode 100644 index 00000000000..4f7efd508b1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
__riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1, -16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", 
"v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + 
vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m1_b32 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c new file mode 100644 index 00000000000..92084be99b2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1, -15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", 
"v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = 
__riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, -15,32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m1_b32 (v, -15,4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, -15,32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c new file mode 100644 index 00000000000..f9817caca1e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = 
__riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + 
vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, x, 4); + 
for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c new file mode 100644 index 00000000000..62d1f6dddd5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", 
"v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + 
__riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c new file mode 100644 index 00000000000..250c3fdb89a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,-16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", 
"v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not 
{vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c new file mode 100644 index 00000000000..72e2d210c05 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,-15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + 
"v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + 
vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, -15, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, -15, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, -15, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c new file mode 100644 index 00000000000..0842700475c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 
(base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + 
vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c new file mode 100644 index 00000000000..9c1eddfac7e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options 
"-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", 
"v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, 
m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c new file mode 100644 index 00000000000..6988c24bd92 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = 
__riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, 
v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vv_f32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmfeq_vv_f32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmfeq_vv_f32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vv_f32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmfeq_vv_f32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c new file mode 100644 index 00000000000..fe181de4d56 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, 
vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 
(base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, float x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + 
vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vf_f32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmfeq_vf_f32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, float x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmfeq_vf_f32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vf_f32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmfeq_vf_f32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c new file mode 100644 index 00000000000..ae5b4ed6913 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", 
"v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = 
__riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, float x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmflt_vf_f32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmflt_vf_f32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, float x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmflt_vf_f32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmflt_vf_f32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmflt_vf_f32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */