From patchwork Fri Nov 25 16:06:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 26088
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, Ju-Zhe Zhong
Subject: [PATCH] RISC-V: Add duplicate vector support.
Date: Sat, 26 Nov 2022 00:06:39 +0800
Message-Id: <20221125160639.43024-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.1

From: Ju-Zhe Zhong

gcc/ChangeLog:

	* config/riscv/constraints.md (Wdm): New constraint.
	* config/riscv/predicates.md (direct_broadcast_operand): New predicate.
	* config/riscv/riscv-protos.h (RVV_VLMAX): New macro.
	(emit_pred_op): Refine function.
	* config/riscv/riscv-selftests.cc (run_const_vector_selftests): New function.
	(run_broadcast_selftests): Ditto.
	(BROADCAST_TEST): New tests.
	(riscv_run_selftests): More tests.
	* config/riscv/riscv-v.cc (emit_pred_move): Refine function.
	(emit_vlmax_vsetvl): Ditto.
	(emit_pred_op): Ditto.
	(expand_const_vector): New function.
	(legitimize_move): Add constant vector support.
	* config/riscv/riscv.cc (riscv_print_operand): New asm print rule for const vector.
	* config/riscv/riscv.h (X0_REGNUM): New macro.
	* config/riscv/vector-iterators.md: New attribute.
	* config/riscv/vector.md (vec_duplicate<mode>): New pattern.
	(@pred_broadcast<mode>): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/dup-1.c: New test.
	* gcc.target/riscv/rvv/base/dup-2.c: New test.

---
 gcc/config/riscv/constraints.md               |   5 +
 gcc/config/riscv/predicates.md                |   5 +
 gcc/config/riscv/riscv-protos.h               |   2 +
 gcc/config/riscv/riscv-selftests.cc           | 127 +++++
 gcc/config/riscv/riscv-v.cc                   |  86 ++-
 gcc/config/riscv/riscv.cc                     |  13 +
 gcc/config/riscv/riscv.h                      |   3 +
 gcc/config/riscv/vector-iterators.md          |   9 +
 gcc/config/riscv/vector.md                    |  53 +-
 .../gcc.target/riscv/rvv/base/dup-1.c         | 521 ++++++++++++++++++
 .../gcc.target/riscv/rvv/base/dup-2.c         |  75 +++
 11 files changed, 881 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md index 4088c48150a..51cffb2bcb6 100644 --- a/gcc/config/riscv/constraints.md +++ b/gcc/config/riscv/constraints.md @@ -151,3 +151,8 @@ A constraint that matches a vector of immediate all ones." (and (match_code "const_vector") (match_test "op == CONSTM1_RTX (GET_MODE (op))"))) + +(define_constraint "Wdm" + "Vector duplicate memory operand" + (and (match_operand 0 "memory_operand") + (match_code "reg" "0")))
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index dfd98761b8b..5a5a49bf7c0 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -286,6 +286,11 @@ (match_test "GET_CODE (op) == UNSPEC && (XINT (op, 1) == UNSPEC_VUNDEF)")))) +;; The scalar operand can be directly broadcast by RVV instructions. +(define_predicate "direct_broadcast_operand" + (ior (match_operand 0 "register_operand") + (match_test "satisfies_constraint_Wdm (op)"))) + ;; A CONST_INT operand that has exactly two bits cleared. 
(define_predicate "const_nottwobits_operand" (and (match_code "const_int")
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 2ec3af05aa4..27692ffb210 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -119,6 +119,7 @@ extern void riscv_run_selftests (void); #endif namespace riscv_vector { +#define RVV_VLMAX gen_rtx_REG (Pmode, X0_REGNUM) /* Routines implemented in riscv-vector-builtins.cc. */ extern void init_builtins (void); extern const char *mangle_builtin_type (const_tree); @@ -130,6 +131,7 @@ extern tree builtin_decl (unsigned, bool); extern rtx expand_builtin (unsigned int, tree, rtx); extern bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT); extern bool legitimize_move (rtx, rtx, machine_mode); +extern void emit_pred_op (unsigned, rtx, rtx, machine_mode); enum tail_policy { TAIL_UNDISTURBED = 0,
diff --git a/gcc/config/riscv/riscv-selftests.cc b/gcc/config/riscv/riscv-selftests.cc index 636874ebc0f..1bf1a648fa1 100644 --- a/gcc/config/riscv/riscv-selftests.cc +++ b/gcc/config/riscv/riscv-selftests.cc @@ -33,6 +33,9 @@ along with GCC; see the file COPYING3. If not see #include "expr.h" #include "selftest.h" #include "selftest-rtl.h" +#include "insn-attr.h" +#include "target.h" +#include "optabs.h" #if CHECKING_P using namespace selftest; @@ -230,12 +233,136 @@ run_poly_int_selftests (void) run_poly_int_selftest ("rv32imafd_zve32x1p0", ABI_ILP32D, POLY_TEST_DIMODE, worklist); } + +static void +run_const_vector_selftests (void) +{ + /* We don't need to repeat the tests for every march && mabi. + Just pick the march && mabi that fully supports all RVV modes. 
*/ + riscv_selftest_arch_abi_setter rv ("rv64imafdcv", ABI_LP64D); + rtl_dump_test t (SELFTEST_LOCATION, locate_file ("riscv/empty-func.rtl")); + set_new_first_and_last_insn (NULL, NULL); + + machine_mode mode; + std::vector<HOST_WIDE_INT> worklist = {-111, -17, -16, 7, 15, 16, 111}; + + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_INT) + { + if (riscv_v_ext_vector_mode_p (mode)) + { + for (const HOST_WIDE_INT &val : worklist) + { + start_sequence (); + rtx dest = gen_reg_rtx (mode); + rtx dup = gen_const_vec_duplicate (mode, GEN_INT (val)); + emit_move_insn (dest, dup); + rtx_insn *insn = get_last_insn (); + rtx src = XEXP (SET_SRC (PATTERN (insn)), 1); + /* 1. Should be vmv.v.i for values in the range -16 ~ 15. + 2. Should be vmv.v.x for values outside -16 ~ 15. */ + if (IN_RANGE (val, -16, 15)) + ASSERT_TRUE (rtx_equal_p (src, dup)); + else + ASSERT_TRUE ( + rtx_equal_p (src, + gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0)))); + end_sequence (); + } + } + } + + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_FLOAT) + { + if (riscv_v_ext_vector_mode_p (mode)) + { + scalar_mode inner_mode = GET_MODE_INNER (mode); + REAL_VALUE_TYPE f = REAL_VALUE_ATOF ("0.2928932", inner_mode); + rtx ele = const_double_from_real_value (f, inner_mode); + + start_sequence (); + rtx dest = gen_reg_rtx (mode); + rtx dup = gen_const_vec_duplicate (mode, ele); + emit_move_insn (dest, dup); + rtx_insn *insn = get_last_insn (); + rtx src = XEXP (SET_SRC (PATTERN (insn)), 1); + /* Should always be vfmv.v.f. */ + ASSERT_TRUE ( + rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0)))); + end_sequence (); + } + } + + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_BOOL) + { + /* Test vmset.m. 
*/ + if (riscv_v_ext_vector_mode_p (mode)) + { + start_sequence (); + rtx dest = gen_reg_rtx (mode); + emit_move_insn (dest, CONSTM1_RTX (mode)); + rtx_insn *insn = get_last_insn (); + rtx src = XEXP (SET_SRC (PATTERN (insn)), 1); + ASSERT_TRUE (rtx_equal_p (src, CONSTM1_RTX (mode))); + end_sequence (); + } + } +} + +static void +run_broadcast_selftests (void) +{ + /* We don't need to repeat the tests for every march && mabi. + Just pick the march && mabi that fully supports all RVV modes. */ + riscv_selftest_arch_abi_setter rv ("rv64imafdcv", ABI_LP64D); + rtl_dump_test t (SELFTEST_LOCATION, locate_file ("riscv/empty-func.rtl")); + set_new_first_and_last_insn (NULL, NULL); + + machine_mode mode; + +#define BROADCAST_TEST(MODE_CLASS) \ + FOR_EACH_MODE_IN_CLASS (mode, MODE_CLASS) \ + { \ + if (riscv_v_ext_vector_mode_p (mode)) \ + { \ + rtx_insn *insn; \ + rtx src; \ + scalar_mode inner_mode = GET_MODE_INNER (mode); \ + /* Test vlse.v with zero stride. */ \ + start_sequence (); \ + rtx addr = gen_reg_rtx (Pmode); \ + rtx mem = gen_rtx_MEM (inner_mode, addr); \ + expand_vector_broadcast (mode, mem); \ + insn = get_last_insn (); \ + src = XEXP (SET_SRC (PATTERN (insn)), 1); \ + ASSERT_TRUE (MEM_P (XEXP (src, 0))); \ + ASSERT_TRUE ( \ + rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0)))); \ + end_sequence (); \ + /* Test vmv.v.x or vfmv.v.f. */ \ + start_sequence (); \ + rtx reg = gen_reg_rtx (inner_mode); \ + expand_vector_broadcast (mode, reg); \ + insn = get_last_insn (); \ + src = XEXP (SET_SRC (PATTERN (insn)), 1); \ + ASSERT_TRUE (REG_P (XEXP (src, 0))); \ + ASSERT_TRUE ( \ + rtx_equal_p (src, gen_rtx_VEC_DUPLICATE (mode, XEXP (src, 0)))); \ + end_sequence (); \ + } \ + } + + BROADCAST_TEST (MODE_VECTOR_INT) + BROADCAST_TEST (MODE_VECTOR_FLOAT) +} + namespace selftest { /* Run all target-specific selftests. 
*/ void riscv_run_selftests (void) { run_poly_int_selftests (); + run_const_vector_selftests (); + run_broadcast_selftests (); } } // namespace selftest #endif /* #if CHECKING_P */ diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index e0459e3f610..fbd8bbfe254 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -40,6 +40,7 @@ #include "target.h" #include "expr.h" #include "optabs.h" +#include "tm-constrs.h" using namespace riscv_vector; @@ -104,34 +105,80 @@ const_vec_all_same_in_range_p (rtx x, HOST_WIDE_INT minval, && IN_RANGE (INTVAL (elt), minval, maxval)); } -/* Emit an RVV unmask && vl mov from SRC to DEST. */ -static void -emit_pred_move (rtx dest, rtx src, machine_mode mask_mode) +static rtx +emit_vlmax_vsetvl (machine_mode vmode) { - insn_expander<7> e; - machine_mode mode = GET_MODE (dest); rtx vl = gen_reg_rtx (Pmode); - unsigned int sew = GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL + unsigned int sew = GET_MODE_CLASS (vmode) == MODE_VECTOR_BOOL ? 8 - : GET_MODE_BITSIZE (GET_MODE_INNER (mode)); + : GET_MODE_BITSIZE (GET_MODE_INNER (vmode)); - emit_insn (gen_vsetvl_no_side_effects ( - Pmode, vl, gen_rtx_REG (Pmode, 0), gen_int_mode (sew, Pmode), - gen_int_mode ((unsigned int) mode, Pmode), const1_rtx, const1_rtx)); + emit_insn ( + gen_vsetvl_no_side_effects (Pmode, vl, RVV_VLMAX, gen_int_mode (sew, Pmode), + gen_int_mode ((unsigned int) vmode, Pmode), + const1_rtx, const1_rtx)); + return vl; +} + +/* Emit an RVV unmask && vl mov from SRC to DEST. 
*/ +void +emit_pred_op (unsigned icode, rtx dest, rtx src, machine_mode mask_mode) +{ + insn_expander<7> e; + machine_mode mode = GET_MODE (dest); e.add_output_operand (dest, mode); e.add_all_one_mask_operand (mask_mode); e.add_vundef_operand (mode); - e.add_input_operand (src, mode); + e.add_input_operand (src, GET_MODE (src)); - e.add_input_operand (vl, Pmode); + rtx vlmax = emit_vlmax_vsetvl (mode); + e.add_input_operand (vlmax, Pmode); e.add_policy_operand (TAIL_AGNOSTIC, MASK_AGNOSTIC); - enum insn_code icode; - icode = code_for_pred_mov (mode); - e.expand (icode, true); + e.expand ((enum insn_code) icode, MEM_P (dest) || MEM_P (src)); +} + +static void +expand_const_vector (rtx target, rtx src, machine_mode mask_mode) +{ + machine_mode mode = GET_MODE (target); + scalar_mode elt_mode = GET_MODE_INNER (mode); + if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL) + { + rtx elt; + gcc_assert ( + const_vec_duplicate_p (src, &elt) + && (rtx_equal_p (elt, const0_rtx) || rtx_equal_p (elt, const1_rtx))); + emit_pred_op (code_for_pred_mov (mode), target, src, mode); + return; + } + + rtx elt; + if (const_vec_duplicate_p (src, &elt)) + { + rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx (mode); + /* Element in range -16 ~ 15 integer or 0.0 floating-point, + we use vmv.v.i instruction. */ + if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src)) + emit_pred_op (code_for_pred_mov (mode), tmp, src, mask_mode); + else + emit_pred_op (code_for_pred_broadcast (mode), tmp, + force_reg (elt_mode, elt), mask_mode); + + if (tmp != target) + emit_move_insn (target, tmp); + return; + } + + /* TODO: We only support const duplicate vector for now. More cases + will be supported when we support auto-vectorization: + + 1. series vector. + 2. multiple elts duplicate vector. + 3. multiple patterns with multiple elts. */ } /* Expand a pre-RA RVV data move from SRC to DEST. 
@@ -140,6 +187,11 @@ bool legitimize_move (rtx dest, rtx src, machine_mode mask_mode) { machine_mode mode = GET_MODE (dest); + if (CONST_VECTOR_P (src)) + { + expand_const_vector (dest, src, mask_mode); + return true; + } if (known_ge (GET_MODE_SIZE (mode), BYTES_PER_RISCV_VECTOR) && GET_MODE_CLASS (mode) != MODE_VECTOR_BOOL) { @@ -153,12 +205,12 @@ legitimize_move (rtx dest, rtx src, machine_mode mask_mode) { rtx tmp = gen_reg_rtx (mode); if (MEM_P (src)) - emit_pred_move (tmp, src, mask_mode); + emit_pred_op (code_for_pred_mov (mode), tmp, src, mask_mode); else emit_move_insn (tmp, src); src = tmp; } - emit_pred_move (dest, src, mask_mode); + emit_pred_op (code_for_pred_mov (mode), dest, src, mask_mode); return true; } diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 7bfc0e9f595..0267494ae5a 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -4205,6 +4205,19 @@ riscv_print_operand (FILE *file, rtx op, int letter) switch (letter) { + case 'v': { + rtx elt; + + if (!const_vec_duplicate_p (op, &elt)) + output_operand_lossage ("invalid vector constant"); + else if (satisfies_constraint_Wc0 (op)) + asm_fprintf (file, "0"); + else if (satisfies_constraint_vi (op)) + asm_fprintf (file, "%wd", INTVAL (elt)); + else + output_operand_lossage ("invalid vector constant"); + break; + } case 'm': { if (riscv_v_ext_vector_mode_p (mode)) { diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index b05c3c1545c..defb475f948 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -651,6 +651,9 @@ enum reg_class #define FP_ARG_FIRST (FP_REG_FIRST + 10) #define FP_ARG_LAST (FP_ARG_FIRST + MAX_ARGS_IN_REGISTERS - 1) +/* Helper macro for RVV vsetvl instruction generation. */ +#define X0_REGNUM GP_REG_FIRST + #define CALLEE_SAVED_REG_NUMBER(REGNO) \ ((REGNO) >= 8 && (REGNO) <= 9 ? (REGNO) - 8 : \ (REGNO) >= 18 && (REGNO) <= 27 ? 
(REGNO) - 16 : -1)
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index 9d4a9dc8a0e..92c4bd0a6a3 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -71,6 +71,15 @@ (VNx1DF "VNx1BI") (VNx2DF "VNx2BI") (VNx4DF "VNx4BI") (VNx8DF "VNx8BI") ]) +(define_mode_attr VEL [ + (VNx1QI "QI") (VNx2QI "QI") (VNx4QI "QI") (VNx8QI "QI") (VNx16QI "QI") (VNx32QI "QI") (VNx64QI "QI") + (VNx1HI "HI") (VNx2HI "HI") (VNx4HI "HI") (VNx8HI "HI") (VNx16HI "HI") (VNx32HI "HI") + (VNx1SI "SI") (VNx2SI "SI") (VNx4SI "SI") (VNx8SI "SI") (VNx16SI "SI") + (VNx1DI "DI") (VNx2DI "DI") (VNx4DI "DI") (VNx8DI "DI") + (VNx1SF "SF") (VNx2SF "SF") (VNx4SF "SF") (VNx8SF "SF") (VNx16SF "SF") + (VNx1DF "DF") (VNx2DF "DF") (VNx4DF "DF") (VNx8DF "DF") +]) + (define_mode_attr sew [ (VNx1QI "8") (VNx2QI "8") (VNx4QI "8") (VNx8QI "8") (VNx16QI "8") (VNx32QI "8") (VNx64QI "8") (VNx1HI "16") (VNx2HI "16") (VNx4HI "16") (VNx8HI "16") (VNx16HI "16") (VNx32HI "16")
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 01418ac5fcf..0dace449316 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -151,6 +151,26 @@ [(set_attr "type" "vmov") (set_attr "mode" "<MODE>")]) +;; ----------------------------------------------------------------- +;; ---- Duplicate Operations +;; ----------------------------------------------------------------- + +;; According to GCC internals: +;; This pattern only handles duplicates of non-constant inputs. +;; Constant vectors go through the movm pattern instead. +;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT. 
+(define_expand "vec_duplicate<mode>" + [(set (match_operand:V 0 "register_operand") + (vec_duplicate:V + (match_operand:<VEL> 1 "direct_broadcast_operand")))] + "TARGET_VECTOR" + { + riscv_vector::emit_pred_op (code_for_pred_broadcast (<MODE>mode), operands[0], + operands[1], <VM>mode); + DONE; + } +) + ;; ----------------------------------------------------------------- ;; ---- 6. Configuration-Setting Instructions ;; ----------------------------------------------------------------- @@ -368,7 +388,7 @@ vle<sew>.v\t%0,%3%p1 vse<sew>.v\t%3,%0%p1 vmv.v.v\t%0,%3 - vmv.v.i\t%0,v%3" + vmv.v.i\t%0,%v3" "&& register_operand (operands[0], <MODE>mode) && register_operand (operands[3], <MODE>mode) && satisfies_constraint_vu (operands[2])" @@ -407,3 +427,34 @@ "" [(set_attr "type" "vldm,vstm,vimov,vmalu,vmalu") (set_attr "mode" "<MODE>")]) + +;; ------------------------------------------------------------------------------- +;; ---- Predicated Broadcast +;; ------------------------------------------------------------------------------- +;; Includes: +;; - 7.5. 
Vector Strided Instructions (zero stride) +;; - 11.16 Vector Integer Move Instructions (vmv.v.x) +;; - 13.16 Vector Floating-Point Move Instruction (vfmv.v.f) +;; ------------------------------------------------------------------------------- + +(define_insn "@pred_broadcast<mode>" + [(set (match_operand:V 0 "register_operand" "=vr, vr, vr, vr") + (if_then_else:V + (unspec:<VM> + [(match_operand:<VM> 1 "vector_mask_operand" " Wc1, Wc1, vm, Wc1") + (match_operand 4 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 5 "const_int_operand" " i, i, i, i") + (match_operand 6 "const_int_operand" " i, i, i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (vec_duplicate:V + (match_operand:<VEL> 3 "direct_broadcast_operand" " r, f, Wdm, Wdm")) + (match_operand:V 2 "vector_merge_operand" "vu0, vu0, vu0, vu0")))] + "TARGET_VECTOR" + "@ + vmv.v.x\t%0,%3 + vfmv.v.f\t%0,%3 + vlse<sew>.v\t%0,%3,zero,%1.t + vlse<sew>.v\t%0,%3,zero" + [(set_attr "type" "vimov,vfmov,vlds,vlds") + (set_attr "mode" "<MODE>")])
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c new file mode 100644 index 00000000000..2a83afae056 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-1.c @@ -0,0 +1,521 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 -O3 -fgimple" } */ + +#include "riscv_vector.h" + +void __GIMPLE (ssa,guessed_local(1073741824)) +f1 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8mf8_t *)out_2(D)) = _Literal (vint8mf8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f2 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8mf4_t *)out_2(D)) = _Literal (vint8mf4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f3 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8mf2_t *)out_2(D)) = _Literal (vint8mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f4 (void * out) +{ + 
__BB(2,guessed_local(1073741824)): + __MEM ((vint8m1_t *)out_2(D)) = _Literal (vint8m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f5 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8m2_t *)out_2(D)) = _Literal (vint8m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f6 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8m4_t *)out_2(D)) = _Literal (vint8m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f7 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint8m8_t *)out_2(D)) = _Literal (vint8m8_t) 0; + return; + +} + +void __GIMPLE (ssa,guessed_local(1073741824)) +f8 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8mf8_t *)out_2(D)) = _Literal (vuint8mf8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f9 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8mf4_t *)out_2(D)) = _Literal (vuint8mf4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f10 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8mf2_t *)out_2(D)) = _Literal (vuint8mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f11 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8m1_t *)out_2(D)) = _Literal (vuint8m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f12 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8m2_t *)out_2(D)) = _Literal (vuint8m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f13 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8m4_t *)out_2(D)) = _Literal (vuint8m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f14 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint8m8_t *)out_2(D)) = _Literal (vuint8m8_t) 0; + return; + +} + +void __GIMPLE (ssa,guessed_local(1073741824)) +f15 
(void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16mf4_t *)out_2(D)) = _Literal (vint16mf4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f16 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16mf2_t *)out_2(D)) = _Literal (vint16mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f17 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16m1_t *)out_2(D)) = _Literal (vint16m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f18 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16m2_t *)out_2(D)) = _Literal (vint16m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f19 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16m4_t *)out_2(D)) = _Literal (vint16m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f20 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint16m8_t *)out_2(D)) = _Literal (vint16m8_t) 0; + return; + +} + +void __GIMPLE (ssa,guessed_local(1073741824)) +f21 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16mf4_t *)out_2(D)) = _Literal (vuint16mf4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f22 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16mf2_t *)out_2(D)) = _Literal (vuint16mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f23 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16m1_t *)out_2(D)) = _Literal (vuint16m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f24 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16m2_t *)out_2(D)) = _Literal (vuint16m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f25 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16m4_t *)out_2(D)) = _Literal (vuint16m4_t) 0; + return; + +} + + +void 
__GIMPLE (ssa,guessed_local(1073741824)) +f26 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint16m8_t *)out_2(D)) = _Literal (vuint16m8_t) 0; + return; + +} + +void __GIMPLE (ssa,guessed_local(1073741824)) +f27 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint32mf2_t *)out_2(D)) = _Literal (vint32mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f28 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint32m1_t *)out_2(D)) = _Literal (vint32m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f29 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint32m2_t *)out_2(D)) = _Literal (vint32m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f30 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint32m4_t *)out_2(D)) = _Literal (vint32m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f31 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint32m8_t *)out_2(D)) = _Literal (vint32m8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f32 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint32mf2_t *)out_2(D)) = _Literal (vuint32mf2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f33 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint32m1_t *)out_2(D)) = _Literal (vuint32m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f34 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint32m2_t *)out_2(D)) = _Literal (vuint32m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f35 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint32m4_t *)out_2(D)) = _Literal (vuint32m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f36 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint32m8_t *)out_2(D)) = _Literal 
(vuint32m8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f37 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint64m1_t *)out_2(D)) = _Literal (vint64m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f38 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint64m2_t *)out_2(D)) = _Literal (vint64m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f39 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint64m4_t *)out_2(D)) = _Literal (vint64m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f40 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vint64m8_t *)out_2(D)) = _Literal (vint64m8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f41 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint64m1_t *)out_2(D)) = _Literal (vuint64m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f42 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint64m2_t *)out_2(D)) = _Literal (vuint64m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f43 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint64m4_t *)out_2(D)) = _Literal (vuint64m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f44 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vuint64m8_t *)out_2(D)) = _Literal (vuint64m8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f45 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat32m1_t *)out_2(D)) = _Literal (vfloat32m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f46 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat32m2_t *)out_2(D)) = _Literal (vfloat32m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f47 (void * out) +{ + __BB(2,guessed_local(1073741824)): + 
__MEM ((vfloat32m4_t *)out_2(D)) = _Literal (vfloat32m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f48 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat32m8_t *)out_2(D)) = _Literal (vfloat32m8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f49 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat64m1_t *)out_2(D)) = _Literal (vfloat64m1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f50 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat64m2_t *)out_2(D)) = _Literal (vfloat64m2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f51 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat64m4_t *)out_2(D)) = _Literal (vfloat64m4_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f52 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vfloat64m8_t *)out_2(D)) = _Literal (vfloat64m8_t) 0; + return; + +} + +/* { dg-final { scan-assembler-times {vmv\.v\.i\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1]),\s*0} 52 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c new file mode 100644 index 00000000000..c6903039c2a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/dup-2.c @@ -0,0 +1,75 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 -O3 -fgimple" } */ + +#include "riscv_vector.h" + +void __GIMPLE (ssa,guessed_local(1073741824)) +f1 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool1_t *)out_2(D)) = _Literal (vbool1_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f2 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool2_t *)out_2(D)) = _Literal (vbool2_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f3 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool4_t *)out_2(D)) = _Literal (vbool4_t) 
0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f4 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool8_t *)out_2(D)) = _Literal (vbool8_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f5 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool16_t *)out_2(D)) = _Literal (vbool16_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f6 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool32_t *)out_2(D)) = _Literal (vbool32_t) 0; + return; + +} + + +void __GIMPLE (ssa,guessed_local(1073741824)) +f7 (void * out) +{ + __BB(2,guessed_local(1073741824)): + __MEM ((vbool64_t *)out_2(D)) = _Literal (vbool64_t) 0; + return; + +} + +/* { dg-final { scan-assembler-times {vmclr\.m\s+(?:v[0-9]|v[1-2][0-9]|v3[0-1])} 7 } } */
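
Note on the constant-duplicate selection exercised by run_const_vector_selftests, dup-1.c and dup-2.c: the rule in expand_const_vector can be sketched outside GCC as a small standalone C++ model. This is only an illustration, not code from the patch. The names fits_simm5, select_int_splat, select_float_splat and check_selftest_worklist are hypothetical, and treating only +0.0 as the vmv.v.i case for floating-point is an assumption about how the Wc0 constraint matches.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// Hypothetical helper: vmv.v.i takes a 5-bit signed immediate, so only
// values in -16..15 can be duplicated without a scalar register.
constexpr bool fits_simm5 (std::int64_t v) { return v >= -16 && v <= 15; }

// Mnemonic expected for an integer constant duplicate (illustrative model
// of the vi-constraint check followed by the force_reg fallback).
std::string select_int_splat (std::int64_t v)
{
  return fits_simm5 (v) ? "vmv.v.i" : "vmv.v.x";
}

// Mnemonic expected for a floating-point constant duplicate.  Assumption:
// only +0.0 counts as the all-zeros vector; any other value is first
// forced into an FP register and then broadcast.
std::string select_float_splat (double v)
{
  return (v == 0.0 && !std::signbit (v)) ? "vmv.v.i" : "vfmv.v.f";
}

// Walk the same integer worklist as run_const_vector_selftests and check
// each value against the expectation stated in that selftest's comment.
void check_selftest_worklist ()
{
  assert (select_int_splat (-111) == "vmv.v.x");
  assert (select_int_splat (-17) == "vmv.v.x");
  assert (select_int_splat (-16) == "vmv.v.i");
  assert (select_int_splat (7) == "vmv.v.i");
  assert (select_int_splat (15) == "vmv.v.i");
  assert (select_int_splat (16) == "vmv.v.x");
  assert (select_int_splat (111) == "vmv.v.x");
}
```

The model leaves out mask modes (dup-2.c expects vmclr.m for all-zero masks) and the memory-operand case, which the pred_broadcast pattern handles with a zero-stride vlse.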