From patchwork Fri Mar 24 11:28:46 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 74487
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Fine tune RVV narrow instruction (source EEW > dest EEW) RA constraint
Date: Fri, 24 Mar 2023 19:28:46 +0800
Message-Id: <20230324112846.283687-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3

From: Juzhe-Zhong
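The new *_merge_tie_mask patterns and the relaxed constraints on the
*_narrow comparison patterns let the register allocator tie the
comparison destination to the mask/merge operand instead of always
forcing an early-clobbered temporary.  As a rough illustration only
(this sketch is not taken from the new tests, and it assumes the
__riscv_* RVV intrinsic naming used by the port), the kind of source
this targets is a mask-undisturbed compare whose predicate and
maskedoff operands are the same mask register:

    #include <riscv_vector.h>

    /* The mask is used both as the predicate and as the maskedoff value,
       so the comparison result can be allocated to the same mask register
       (v0) and emitted as a single vmseq.vv ...,v0.t.  The i8m2 operands
       also exercise the narrow case the patch touches, where the source
       LMUL is greater than the LMUL of the mask destination.  */
    vbool4_t
    tie_mask_compare (vbool4_t mask, vint8m2_t a, vint8m2_t b, size_t vl)
    {
      return __riscv_vmseq_vv_i8m2_b4_mu (mask, mask, a, b, vl);
    }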
gcc/ChangeLog:

	* config/riscv/vector.md (*pred_cmp_merge_tie_mask): New pattern.
	(*pred_ltge_merge_tie_mask): Ditto.
	(*pred_cmp_scalar_merge_tie_mask): Ditto.
	(*pred_eqne_scalar_merge_tie_mask): Ditto.
	(*pred_cmp_extended_scalar_merge_tie_mask): Ditto.
	(*pred_eqne_extended_scalar_merge_tie_mask): Ditto.
	(*pred_cmp_narrow_merge_tie_mask): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adjust.
	* gcc.target/riscv/rvv/base/binop_vx_constraint-150.c: Adjust.
	* gcc.target/riscv/rvv/base/narrow_constraint-12.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-13.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-14.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-15.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-16.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
	* gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
---
 gcc/config/riscv/vector.md                    | 780 ++++++++++++------
 .../riscv/rvv/base/binop_vv_constraint-4.c    |   2 +-
 .../riscv/rvv/base/binop_vx_constraint-150.c  |   2 +-
 .../riscv/rvv/base/narrow_constraint-12.c     | 303 +++
 .../riscv/rvv/base/narrow_constraint-13.c     | 133 +++
 .../riscv/rvv/base/narrow_constraint-14.c     | 133 +++
 .../riscv/rvv/base/narrow_constraint-15.c     | 127 +++
 .../riscv/rvv/base/narrow_constraint-16.c     | 127 +++
 .../riscv/rvv/base/narrow_constraint-17.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-18.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-19.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-20.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-21.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-22.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-23.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-24.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-25.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-26.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-27.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-28.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-29.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-30.c     | 231 ++++++
 .../riscv/rvv/base/narrow_constraint-31.c     | 231 ++++++
 23 files changed, 4811 insertions(+), 261 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c
 create mode 100644
gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 1ddc1d3fd39..52597750f69 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1490,63 +1490,63 @@ ;; DEST eew is smaller than SOURCE eew. (define_insn "@pred_indexed_load_x2_smaller_eew" - [(set (match_operand:VEEWTRUNC2 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC2 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC2 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC2 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - (match_operand:VEEWTRUNC2 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC2 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "")]) (define_insn "@pred_indexed_load_x4_smaller_eew" - [(set (match_operand:VEEWTRUNC4 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC4 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC4 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC4 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - 
(match_operand:VEEWTRUNC4 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC4 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") (set_attr "mode" "")]) (define_insn "@pred_indexed_load_x8_smaller_eew" - [(set (match_operand:VEEWTRUNC8 0 "register_operand" "=&vr, &vr") + [(set (match_operand:VEEWTRUNC8 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") (if_then_else:VEEWTRUNC8 (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 5 "vector_length_operand" " rK, rK") - (match_operand 6 "const_int_operand" " i, i") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 6 "const_int_operand" " i, i, i, i, i, i") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (unspec:VEEWTRUNC8 - [(match_operand 3 "pmode_register_operand" " r, r") + [(match_operand 3 "pmode_register_operand" " r, r, r, r, r, r") (mem:BLK (scratch)) - (match_operand: 4 "register_operand" " vr, vr")] ORDER) - (match_operand:VEEWTRUNC8 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 4 "register_operand" " 0, 0, 0, 0, vr, vr")] ORDER) + (match_operand:VEEWTRUNC8 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR" "vlxei.v\t%0,(%3),%4%p1" [(set_attr "type" "vldx") @@ -2420,15 +2420,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) (define_insn "@pred_madc" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr") (unspec: [(plus:VI - (match_operand:VI 1 "register_operand" " vr, vr") - (match_operand:VI 2 "vector_arith_operand" " vr, vi")) - (match_operand: 3 "register_operand" " vm, vm") + (match_operand:VI 1 "register_operand" " %0, vr, vr") + (match_operand:VI 2 "vector_arith_operand" "vrvi, vr, vi")) + (match_operand: 3 "register_operand" " vm, vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK, rK") - (match_operand 5 "const_int_operand" " i, i") + [(match_operand 4 "vector_length_operand" " rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2439,15 +2439,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_msbc" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (unspec: [(minus:VI - (match_operand:VI 1 "register_operand" " vr") - (match_operand:VI 2 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand:VI 1 "register_operand" " 0, vr, vr") + (match_operand:VI 2 "register_operand" " vr, 0, vr")) + (match_operand: 3 "register_operand" " vm, vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2458,16 +2458,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn 
"@pred_madc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "register_operand" " r")) - (match_operand:VI_QHS 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "register_operand" " r, r")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2478,16 +2478,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_msbc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2527,16 +2527,16 @@ }) (define_insn "*pred_madc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2547,17 +2547,17 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "*pred_madc_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))] "TARGET_VECTOR" @@ -2597,16 +2597,16 @@ }) (define_insn "*pred_msbc_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: 
[(minus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2617,17 +2617,17 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "*pred_msbc_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) - (match_operand: 3 "register_operand" " vm") + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) + (match_operand: 3 "register_operand" " vm, vm") (unspec: - [(match_operand 4 "vector_length_operand" " rK") - (match_operand 5 "const_int_operand" " i") + [(match_operand 4 "vector_length_operand" " rK, rK") + (match_operand 5 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMSBC))] "TARGET_VECTOR" @@ -2638,14 +2638,14 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[5])"))]) (define_insn "@pred_madc_overflow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr") (unspec: [(plus:VI - (match_operand:VI 1 "register_operand" " vr, vr") - (match_operand:VI 2 "vector_arith_operand" " vr, vi")) + (match_operand:VI 1 "register_operand" " %0, vr, vr") + (match_operand:VI 2 "vector_arith_operand" "vrvi, vr, vi")) (unspec: - [(match_operand 3 "vector_length_operand" " rK, rK") - (match_operand 4 "const_int_operand" " i, i") + [(match_operand 3 "vector_length_operand" " rK, rK, rK") + (match_operand 4 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2656,14 +2656,14 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_msbc_overflow" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr, &vr") (unspec: [(minus:VI - (match_operand:VI 1 "register_operand" " vr") - (match_operand:VI 2 "register_operand" " vr")) + (match_operand:VI 1 "register_operand" " 0, vr, vr, vr") + (match_operand:VI 2 "register_operand" " vr, 0, vr, vi")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 4 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2674,15 +2674,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_madc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" 
" rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2693,15 +2693,15 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "@pred_msbc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_QHS (vec_duplicate:VI_QHS - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_QHS 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_QHS 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2740,15 +2740,15 @@ }) (define_insn "*pred_madc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2759,16 +2759,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn "*pred_madc_overflow_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(plus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2807,15 +2807,15 @@ }) (define_insn "*pred_msbc_overflow_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D - (match_operand: 2 "reg_or_0_operand" " rJ")) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ")) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -2826,16 +2826,16 @@ (set (attr "avl_type") (symbol_ref "INTVAL (operands[4])"))]) (define_insn 
"*pred_msbc_overflow_extended_scalar" - [(set (match_operand: 0 "register_operand" "=&vr") + [(set (match_operand: 0 "register_operand" "=vr, &vr") (unspec: [(minus:VI_D (vec_duplicate:VI_D (sign_extend: - (match_operand: 2 "reg_or_0_operand" " rJ"))) - (match_operand:VI_D 1 "register_operand" " vr")) + (match_operand: 2 "reg_or_0_operand" " rJ, rJ"))) + (match_operand:VI_D 1 "register_operand" " 0, vr")) (unspec: - [(match_operand 3 "vector_length_operand" " rK") - (match_operand 4 "const_int_operand" " i") + [(match_operand 3 "vector_length_operand" " rK, rK") + (match_operand 4 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_OVERFLOW))] "TARGET_VECTOR" @@ -3617,6 +3617,29 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_ltge_operator" + [(match_operand:VI 3 "register_operand" " vr") + (match_operand:VI 4 "vector_arith_operand" "vrvi")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.v%o4\t%0,%3,%v4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp" [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") @@ -3639,19 +3662,19 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr, &vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_ltge_operator" - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:VI 5 "vector_arith_operand" " vr, vr, vi, vi")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VI 5 "vector_arith_operand" " vrvi, vrvi, 0, 0, vrvi, 0, 0, vrvi, vrvi")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") @@ -3674,6 +3697,29 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_ltge_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "ltge_operator" + [(match_operand:VI 3 "register_operand" " vr") + (match_operand:VI 4 "vector_neg_arith_operand" "vrvj")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.v%o4\t%0,%3,%v4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_ltge" [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") @@ -3696,19 +3742,19 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_ltge_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr, &vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "ltge_operator" - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:VI 5 "vector_neg_arith_operand" " vr, vr, vj, vj")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VI 5 "vector_neg_arith_operand" " vrvj, vrvj, 0, 0, vrvj, 0, 0, vrvj, vrvj")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") @@ -3732,6 +3778,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_QHS 3 "register_operand" " vr") + (vec_duplicate:VI_QHS + (match_operand: 4 "register_operand" " r"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3755,20 +3825,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_QHS 4 "register_operand" " vr, vr") + [(match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_QHS - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3792,6 +3862,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_QHS + (match_operand: 4 "register_operand" " r")) + (match_operand:VI_QHS 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3815,20 +3909,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_QHS - (match_operand: 5 "register_operand" " r, r")) - (match_operand:VI_QHS 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")) + (match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3909,6 +4003,54 @@ DONE; }) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_D 3 "register_operand" " vr") + (vec_duplicate:VI_D + (match_operand: 4 "register_operand" " r"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_D + (match_operand: 4 "register_operand" " r")) + (match_operand:VI_D 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -3932,20 +4074,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_D 4 "register_operand" " vr, vr") + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_D - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -3974,25 +4116,50 @@ ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_D - (match_operand: 5 "register_operand" " r, r")) - (match_operand:VI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")) + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") (set_attr "mode" "")]) +(define_insn "*pred_cmp_extended_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "comparison_except_eqge_operator" + [(match_operand:VI_D 3 "register_operand" " vr") + (vec_duplicate:VI_D + (sign_extend: + (match_operand: 4 "register_operand" " r")))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL 
<= 1 to get better codegen. (define_insn "*pred_cmp_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -4016,26 +4183,51 @@ (set_attr "mode" "")]) (define_insn "*pred_cmp_extended_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:VI_D 4 "register_operand" " vr, vr") + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r")))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r")))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") (set_attr "mode" "")]) +(define_insn "*pred_eqne_extended_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VI_D + (sign_extend: + (match_operand: 4 "register_operand" " r"))) + (match_operand:VI_D 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vms%B2.vx\t%0,%3,%4,v0.t" + [(set_attr "type" "vicmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. 
(define_insn "*pred_eqne_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -4059,21 +4251,21 @@ (set_attr "mode" "")]) (define_insn "*pred_eqne_extended_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r"))) - (match_operand:VI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r, r, r"))) + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") @@ -4111,6 +4303,7 @@ { enum rtx_code code = GET_CODE (operands[3]); rtx undef = RVV_VUNDEF (mode); + rtx tmp = gen_reg_rtx (mode); if (code == GEU && rtx_equal_p (operands[5], const0_rtx)) { /* If vmsgeu with 0 immediate, expand it to vmset. */ @@ -4157,12 +4350,11 @@ - pseudoinstruction: vmsge{u}.vx vd, va, x - expansion: vmslt{u}.vx vd, va, x; vmnand.mm vd, vd, vd. */ emit_insn ( - gen_pred_cmp_scalar (operands[0], operands[1], operands[2], + gen_pred_cmp_scalar (tmp, operands[1], operands[2], operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn (gen_pred_nand (operands[0], CONSTM1_RTX (mode), - undef, operands[0], operands[0], - operands[6], operands[8])); + undef, tmp, tmp, operands[6], operands[8])); } else { @@ -4171,13 +4363,12 @@ /* masked va >= x, vd == v0 - pseudoinstruction: vmsge{u}.vx vd, va, x, v0.t, vt - expansion: vmslt{u}.vx vt, va, x; vmandn.mm vd, vd, vt. */ - rtx reg = gen_reg_rtx (mode); emit_insn (gen_pred_cmp_scalar ( - reg, CONSTM1_RTX (mode), undef, operands[3], operands[4], + tmp, CONSTM1_RTX (mode), undef, operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn ( gen_pred_andnot (operands[0], CONSTM1_RTX (mode), undef, - operands[1], reg, operands[6], operands[8])); + operands[1], tmp, operands[6], operands[8])); } else { @@ -4186,10 +4377,10 @@ - expansion: vmslt{u}.vx vd, va, x, v0.t; vmxor.mm vd, vd, v0. 
*/ emit_insn (gen_pred_cmp_scalar ( - operands[0], operands[1], operands[2], operands[3], operands[4], + tmp, operands[1], operands[2], operands[3], operands[4], operands[5], operands[6], operands[7], operands[8])); emit_insn (gen_pred (XOR, mode, operands[0], - CONSTM1_RTX (mode), undef, operands[0], + CONSTM1_RTX (mode), undef, tmp, operands[1], operands[6], operands[8])); } } @@ -6296,21 +6487,44 @@ [(set_attr "type" "vfcmp") (set_attr "mode" "")]) +(define_insn "*pred_cmp_narrow_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "signed_order_operator" + [(match_operand:VF 3 "register_operand" " vr") + (match_operand:VF 4 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vv\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_cmp_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:VF 4 "register_operand" " vr, vr") - (match_operand:VF 5 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + [(match_operand:VF 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr") + (match_operand:VF 5 "register_operand" " vr, vr, 0, 0, vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vv\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6334,6 +6548,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_cmp_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "signed_order_operator" + [(match_operand:VF 3 "register_operand" " vr") + (vec_duplicate:VF + (match_operand: 4 "register_operand" " f"))]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vf\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr 
"avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -6357,20 +6595,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_cmp_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:VF 4 "register_operand" " vr, vr") + [(match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr") (vec_duplicate:VF - (match_operand: 5 "register_operand" " f, f"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f, f, f"))]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6394,6 +6632,30 @@ "TARGET_VECTOR" {}) +(define_insn "*pred_eqne_scalar_merge_tie_mask" + [(set (match_operand: 0 "register_operand" "=vm") + (if_then_else: + (unspec: + [(match_operand: 1 "register_operand" " 0") + (match_operand 5 "vector_length_operand" " rK") + (match_operand 6 "const_int_operand" " i") + (match_operand 7 "const_int_operand" " i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (match_operator: 2 "equality_operator" + [(vec_duplicate:VF + (match_operand: 4 "register_operand" " f")) + (match_operand:VF 3 "register_operand" " vr")]) + (match_dup 1)))] + "TARGET_VECTOR" + "vmf%B2.vf\t%0,%3,%4,v0.t" + [(set_attr "type" "vfcmp") + (set_attr "mode" "") + (set_attr "merge_op_idx" "1") + (set_attr "vl_op_idx" "5") + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))]) + ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" [(set (match_operand: 0 "register_operand" "=vr, vr") @@ -6417,20 +6679,20 @@ ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" - [(set (match_operand: 0 "register_operand" "=&vr, &vr") + [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:VF - (match_operand: 5 "register_operand" " f, f")) - (match_operand:VF 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f, f, f")) + (match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))] "TARGET_VECTOR && known_gt (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") @@ -6730,44 +6992,44 @@ ;; For example, The LMUL = 1 corresponding mode of VNx16QImode is VNx4QImode ;; for -march=rv*zve32* wheras VNx8QImode for -march=rv*zve64* (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_reduc:VI (vec_duplicate:VI (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VI 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VI 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vred.vs\t%0,%3,%4%p1" [(set_attr "type" "vired") (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_reduc:VI_ZVE32 (vec_duplicate:VI_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VI_ZVE32 3 "register_operand" " vr, vr, 
vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VI_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vred.vs\t%0,%3,%4%p1" [(set_attr "type" "vired") @@ -6810,90 +7072,90 @@ (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_freduc:VF (vec_duplicate:VF (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vfred.vs\t%0,%3,%4%p1" [(set_attr "type" "vfredu") (set_attr "mode" "")]) (define_insn "@pred_reduc_" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (any_freduc:VF_ZVE32 (vec_duplicate:VF_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF_ZVE32 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC))] + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vfred.vs\t%0,%3,%4%p1" [(set_attr "type" "vfredu") (set_attr "mode" "")]) (define_insn "@pred_reduc_plus" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (plus:VF (vec_duplicate:VF (vec_select: 
- (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC)] ORDER))] + (match_operand:VF 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC)] ORDER))] "TARGET_VECTOR && TARGET_MIN_VLEN > 32" "vfredsum.vs\t%0,%3,%4%p1" [(set_attr "type" "vfred") (set_attr "mode" "")]) (define_insn "@pred_reduc_plus" - [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr") (unspec: [(unspec: [(unspec: - [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1") - (match_operand 5 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 6 "const_int_operand" " i, i, i, i") - (match_operand 7 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (plus:VF_ZVE32 (vec_duplicate:VF_ZVE32 (vec_select: - (match_operand: 4 "register_operand" " vr, vr, vr, vr") + (match_operand: 4 "register_operand" " vr, vr") (parallel [(const_int 0)]))) - (match_operand:VF_ZVE32 3 "register_operand" " vr, vr, vr, vr")) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")] UNSPEC_REDUC)] ORDER))] + (match_operand:VF_ZVE32 3 "register_operand" " vr, vr")) + (match_operand: 2 "vector_merge_operand" " vu, 0")] UNSPEC_REDUC)] ORDER))] "TARGET_VECTOR && TARGET_MIN_VLEN == 32" "vfredsum.vs\t%0,%3,%4%p1" [(set_attr "type" "vfred") diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c index 552c264d895..e16db932f15 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c @@ -24,4 +24,4 @@ void f2 (void * in, void *out, int32_t x) __riscv_vsm_v_b32 (out, m4, 4); } -/* { dg-final { scan-assembler-times {vmv} 2 } } */ +/* { dg-final { scan-assembler-not {vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c index 55a222f47ea..e92a8115f09 100644 --- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vx_constraint-150.c @@ -18,4 +18,4 @@ void f1 (void * in, void *out, int32_t x) /* { dg-final { scan-assembler-times {vmslt\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+,\s*v0.t} 1 } } */ /* { dg-final { scan-assembler-times {vmxor\.mm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */ /* { dg-final { scan-assembler-times {vmnot\.m\s+v[0-9]+,\s*v[0-9]+} 1 } } */ -/* { dg-final { scan-assembler-times {vmv} 1 } } */ +/* { dg-final { scan-assembler-not {vmv} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c new file mode 100644 index 00000000000..df5b2dc5c51 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-12.c @@ -0,0 +1,303 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = 
__riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f1 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t bindex2 = __riscv_vle8_v_i8mf8 ((void *)(base + 100), vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8_tu(bindex2,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); +} + +void f2 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse64_v_u64m1 ((void *)out,v2,vl); +} + +void f3 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + __riscv_vse64_v_u64m1 ((void *)(out + 200*i),v2,vl); + } +} + +void f4 (void *base,void *out,size_t vl) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8(base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + vuint64m1_t v2 = __riscv_vadd_vv_u64m1 (bindex, bindex,vl); + __riscv_vse8_v_i8mf8 (out,v,vl); + __riscv_vse64_v_u64m1 ((void *)out,v2,vl); +} + +void f5 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool64_t m = __riscv_vlm_v_b64 (base + i, vl); + vint8mf8_t v = __riscv_vluxei64_v_i8mf8_m(m,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vle8_v_i8mf8_tu (v, base2, vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f6 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f7 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t src = __riscv_vle8_v_i8m1 ((void *)(base + 100), vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1_tu(src,base,bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f8 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse64_v_u64m8 ((void *)out,v2,vl); +} + +void f9 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + __riscv_vse64_v_u64m8 ((void *)(out + 200*i),v2,vl); + } +} + +void f10 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + vuint64m8_t v2 = __riscv_vadd_vv_u64m8 (bindex, bindex,vl); + __riscv_vse8_v_i8m1 (out,v,vl); + __riscv_vse64_v_u64m8 ((void *)out,v2,vl); +} + +void f11 (void *base,void *base2,void *out,size_t vl, int 
n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vint8m1_t v = __riscv_vluxei64_v_i8m1_m(m,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vle8_v_i8m1_tu (v, base2, vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f12 (void *base,void *out,size_t vl, int n) +{ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f13 (void *base,void *out,size_t vl, int n) +{ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000), vl); + for (int i = 0; i < n; i++){ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f14 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000 * i), vl); + vuint64m1_t bindex = __riscv_vle64_v_u64m1 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f15 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000 * i), vl); + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f16 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8mf8_t v = __riscv_vle8_v_i8mf8 ((void *)(base + 1000 * i), vl); + vuint64m1_t bindex1 = __riscv_vle64_v_u64m1 (base + 100*i, vl); + vuint64m1_t bindex2 = __riscv_vle64_v_u64m1 (base + 200*i, vl); + vuint64m1_t bindex3 = __riscv_vle64_v_u64m1 (base + 300*i, vl); + vuint64m1_t bindex4 = __riscv_vle64_v_u64m1 (base + 400*i, vl); + vuint64m1_t bindex5 = __riscv_vle64_v_u64m1 (base + 500*i, vl); + vuint64m1_t bindex6 = __riscv_vle64_v_u64m1 (base + 600*i, vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex1,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex2,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex3,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex4,vl); + v = __riscv_vluxei64_v_i8mf8_tu(v,base,bindex5,vl); + v = 
__riscv_vluxei64_v_i8mf8_tu(v,base,bindex6,vl); + __riscv_vse8_v_i8mf8 (out + 100*i,v,vl); + } +} + +void f17 (void *base,void *out,size_t vl, int n) +{ + for (int i = 0; i < n; i++){ + vint8m1_t v = __riscv_vle8_v_i8m1 ((void *)(base + 1000 * i), vl); + vuint64m8_t bindex1 = __riscv_vle64_v_u64m8 (base + 100*i, vl); + vuint64m8_t bindex2 = __riscv_vle64_v_u64m8 (base + 200*i, vl); + vuint64m8_t bindex3 = __riscv_vle64_v_u64m8 (base + 300*i, vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex1,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex2,vl); + v = __riscv_vluxei64_v_i8m1_tu(v,base,bindex3,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v,vl); + } +} + +void f18 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint32m4_t v = __riscv_vluxei64_v_u32m4_m(m,base,bindex,vl); + vuint32m4_t v2 = __riscv_vle32_v_u32m4_tu (v, base2 + i, vl); + vint8m1_t v3 = __riscv_vluxei32_v_i8m1_m(m,base,v2,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v3,vl); + } +} + +void f19 (void *base,void *base2,void *out,size_t vl, int n) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base + 100, vl); + for (int i = 0; i < n; i++){ + vbool8_t m = __riscv_vlm_v_b8 (base + i, vl); + vuint64m8_t v = __riscv_vluxei64_v_u64m8_m(m,base,bindex,vl); + vuint64m8_t v2 = __riscv_vle64_v_u64m8_tu (v, base2 + i, vl); + vint8m1_t v3 = __riscv_vluxei64_v_i8m1_m(m,base,v,vl); + vint8m1_t v4 = __riscv_vluxei64_v_i8m1_m(m,base,v2,vl); + __riscv_vse8_v_i8m1 (out + 100*i,v3,vl); + __riscv_vse8_v_i8m1 (out + 222*i,v4,vl); + } +} +void f20 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f21 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + vbool8_t m = __riscv_vlm_v_b8 (base, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1_m(m,base,bindex,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +void f22 (void *base,void *out,size_t vl) +{ + vuint64m8_t bindex = __riscv_vle64_v_u64m8 (base, vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23"); + + vint8m1_t v = __riscv_vluxei64_v_i8m1(base,bindex,vl); + asm volatile("#" :: + : "v0", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + v = 
__riscv_vadd_vv_i8m1 (v,v,vl); + asm volatile("#" :: + : "v0", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vse8_v_i8m1 (out,v,vl); +} + +/* { dg-final { scan-assembler-times {vmv} 1 } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c new file mode 100644 index 00000000000..521af15ee5e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-13.c @@ -0,0 +1,133 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmadc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", 
"v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmadc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c new file mode 100644 index 00000000000..66a8791aeb2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-14.c @@ -0,0 +1,133 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", 
"v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + vbool8_t m = __riscv_vlm_v_b8 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + m = __riscv_vmsbc_vvm_i16m2_b8 (v0, v1, m, 4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + vbool32_t m = __riscv_vlm_v_b32 ((uint8_t *)(base + 200), vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + m = __riscv_vmsbc_vvm_i16mf2_b32 (v0, v1, m, 4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", 
"v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c new file mode 100644 index 00000000000..b3add7b7bc7 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-15.c @@ -0,0 +1,127 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmadc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t 
*base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmadc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c new file mode 100644 index 00000000000..468471c438a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-16.c @@ -0,0 +1,127 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f1 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = __riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f2 (int16_t *base,int8_t *out,size_t vl) +{ + vint16m2_t v0 = __riscv_vle16_v_i16m2 (base, vl); + vint16m2_t v1 = 
__riscv_vle16_v_i16m2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27"); + + vbool8_t m = __riscv_vmsbc_vv_i16m2_b8 (v0, v1,4); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,m,vl); +} + +void f3 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f4 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +void f5 (int16_t *base,int8_t *out,size_t vl) +{ + vint16mf2_t v0 = __riscv_vle16_v_i16mf2 (base, vl); + vint16mf2_t v1 = __riscv_vle16_v_i16mf2 ((int16_t *)(base + 100), vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29"); + + vbool32_t m = __riscv_vmsbc_vv_i16mf2_b32 (v0, v1,4); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v30", "v31"); + + __riscv_vsm_v_b32 (out,m,vl); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c new file mode 100644 index 00000000000..97df21dd743 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = 
__riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, 
vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = 
__riscv_vmseq_vv_i32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vv_i32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmseq_vv_i32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vv_i32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vv_i32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmseq_vv_i32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c new file mode 100644 index 00000000000..56c95d9c884 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
__riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = 
__riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vv_i32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmslt_vv_i32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vv_i32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vv_i32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmslt_vv_i32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c new file mode 100644 index 00000000000..d50e497d6c9 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" 
:: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl); + 
asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, x,32); + 
mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c new file mode 100644 index 00000000000..4e77c51d058 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm 
volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + 
asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c new file mode 100644 index 00000000000..4f7efd508b1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
__riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1, -16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", 
"v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1, -16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + 
vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i32m1_b32 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c new file mode 100644 index 00000000000..92084be99b2 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1, -15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", 
"v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl); + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl); + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1, -15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b2 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = 
__riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4); + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m8_b4 (v, -15,4); + for (int i = 0; i < n; i++){ + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4); + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, -15,32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4); + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i32m1_b32 (v, -15,4); + for (int i = 0; i < n; i++){ + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4); + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, -15,32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c new file mode 100644 index 00000000000..f9817caca1e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = 
__riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + 
vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, x, 4); + 
for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, x,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c new file mode 100644 index 00000000000..62d1f6dddd5 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", 
"v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + 
__riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, int32_t x) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, x, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, x,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c new file mode 100644 index 00000000000..250c3fdb89a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + 
vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,-16,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", 
"v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,-16,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, -16, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, -16,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, -16, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not 
{vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c new file mode 100644 index 00000000000..72e2d210c05 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,-15,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + 
"v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,-15,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + 
vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, -15, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, -15, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, -15,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, -15, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c new file mode 100644 index 00000000000..0842700475c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 
(base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + 
vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmseq_vx_i64m1_b64 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c new file mode 100644 index 00000000000..9c1eddfac7e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options 
"-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", 
"v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl) +{ + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl); + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl); + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b8 (out,v,vl); +} + +void f7 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f8 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f9 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, m5, 4); +} + +void f10 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m4, 4); +} + +void f11 (void * in, void *out) +{ + vbool8_t mask = *(vbool8_t*)in; + asm volatile ("":::"memory"); + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4); + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4); + __riscv_vsm_v_b8 (out, m3, 4); + __riscv_vsm_v_b8 (out, 
m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool8_t mask = *(vbool8_t*)base1; + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4); + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4); + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b8 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool64_t mask = *(vbool64_t*)base1; + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4); + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4); + mask = __riscv_vmslt_vx_i64m1_b64 (v, 0xAAAA, 4); + for (int i = 0; i < n; i++){ + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4); + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, 0xAAAA,32); + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32); + } + __riscv_vsm_v_b64 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c new file mode 100644 index 00000000000..6988c24bd92 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m1,v1,v2,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = 
__riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v1,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m2,v1,v2,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, 
v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m4, v2, v2, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, int32_t x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vv_f32m8_b4_m (mask, v3, v4,32); + mask = __riscv_vmfeq_vv_f32m8_b4_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmfeq_vv_f32m1_b32 (v, v2, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vv_f32m1_b32_m (mask, v3, v4,32); + mask = __riscv_vmfeq_vv_f32m1_b32_mu (mask, mask, v4, v4, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c new file mode 100644 index 00000000000..fe181de4d56 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, 
vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 
(base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, float x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + 
vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vf_f32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmfeq_vf_f32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, float x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmfeq_vf_f32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmfeq_vf_f32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmfeq_vf_f32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c new file mode 100644 index 00000000000..ae5b4ed6913 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c @@ -0,0 +1,231 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */ + +#include "riscv_vector.h" + +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m1,v1,x,vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", 
"v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v27", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x) +{ + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl); + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl); + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl); + asm volatile("#" :: + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25", + "v26", "v28", "v29", "v30", "v31"); + + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m2,v1,x,vl); + asm volatile("#" :: + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9", + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17", + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25", + "v26", "v28", "v29", "v30", "v31"); + + __riscv_vsm_v_b4 (out,v,vl); +} + +void f7 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f8 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = 
__riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f9 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); + vbool4_t m5 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m4, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m5, 4); +} + +void f10 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f11 (void * in, void *out, float x) +{ + vbool4_t mask = *(vbool4_t*)in; + asm volatile ("":::"memory"); + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4); + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4); + __riscv_vsm_v_b4 (out, m3, 4); + __riscv_vsm_v_b4 (out, m4, 4); +} + +void f12 (void* base1,void* base2,void* out,int n, float x) +{ + vbool4_t mask = *(vbool4_t*)base1; + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4); + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4); + mask = __riscv_vmflt_vf_f32m8_b4 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4); + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4); + mask = __riscv_vmflt_vf_f32m8_b4_m (mask, v3, x,32); + mask = __riscv_vmflt_vf_f32m8_b4_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b4 (out, mask, 32); +} + +void f13 (void* base1,void* base2,void* out,int n, float x) +{ + vbool32_t mask = *(vbool32_t*)base1; + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4); + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4); + mask = __riscv_vmflt_vf_f32m1_b32 (v, x, 4); + for (int i = 0; i < n; i++){ + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4); + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4); + mask = __riscv_vmflt_vf_f32m1_b32_m (mask, v3, x,32); + mask = __riscv_vmflt_vf_f32m1_b32_mu (mask, mask, v4, x, 32); + } + __riscv_vsm_v_b32 (out, mask, 32); +} + +/* { dg-final { scan-assembler-not {vmv} } } */ +/* { dg-final { scan-assembler-not {csrr} } } */
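For reference, every new test above follows the same idiom: load the compare operands and the mask, clobber all vector registers except the narrow window the pattern is expected to use, run the masked compare, clobber everything except that window again, then store the mask result. The trailing scan-assembler-not checks for vmv and csrr then confirm that register allocation satisfied the tied-mask constraint in place, with no extra vector move and no vlenb-sized stack spill. The following condensed sketch restates that idiom outside the patch; the function name check_tied_mask and the standalone framing are illustrative only and are not part of the diff:

#include <stddef.h>
#include "riscv_vector.h"

void check_tied_mask (void *base, void *mask_base, void *out, size_t vl)
{
  vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base, vl);   /* EEW=64, LMUL=8 source */
  vbool8_t m1 = __riscv_vlm_v_b8 (mask_base, vl);      /* incoming mask */

  /* Leave only v0 and the v8-v15 group free, so the source vector and
     the mask have nowhere else to go.  */
  asm volatile("#" ::
               : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
                 "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
                 "v26", "v27", "v28", "v29", "v30", "v31");

  /* Masked compare producing a mask (dest EEW < source EEW): with the
     right RA constraint the result can tie to the mask operand.  */
  vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m (m1, v1, -15, vl);

  /* Clobber everything except v0 before the store; any copy or spill
     the pattern forced would show up as vmv or csrr in the assembly.  */
  asm volatile("#" ::
               : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
                 "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
                 "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
                 "v26", "v27", "v28", "v29", "v30", "v31");

  __riscv_vsm_v_b8 (out, v, vl);
}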