From patchwork Wed May 10 04:00:35 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 91834
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Support const series vector for RVV auto-vectorization
Date: Wed, 10 May 2023 12:00:35 +0800
Message-Id: <20230510040035.2972636-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3

From: Juzhe-Zhong

This patch is a prerequisite for further RVV auto-vectorization support.
Once we enable even very simple binary operations, we hit the following ICE:

during RTL pass: expand
add_run-1.c: In function 'main':
add_run-1.c:28:1: internal compiler error: Segmentation fault
0x1618ea3 crash_signal
        ../../../riscv-gcc/gcc/toplev.cc:314
0xe76cd9 single_set(rtx_insn const*)
        ../../../riscv-gcc/gcc/rtl.h:3602
0x1080f8a emit_move_insn(rtx_def*, rtx_def*)
        ../../../riscv-gcc/gcc/expr.cc:4342
0x170c458 insert_value_copy_on_edge
        ../../../riscv-gcc/gcc/tree-outof-ssa.cc:352
0x170d58e eliminate_phi
        ../../../riscv-gcc/gcc/tree-outof-ssa.cc:785
0x170df17 expand_phi_nodes(ssaexpand*)
        ../../../riscv-gcc/gcc/tree-outof-ssa.cc:1024
0xef27e2 execute
        ../../../riscv-gcc/gcc/cfgexpand.cc:6818

This happens because the loop vectorizer assumes that the target can handle
const series vectors as soon as binary operations are enabled. Without that
support, it is easy to trigger an ICE like the one above.

gcc/ChangeLog:

        * config/riscv/autovec.md (@vec_series): New pattern.
        * config/riscv/riscv-protos.h (expand_vec_series): New function.
        * config/riscv/riscv-v.cc (emit_binop): Ditto.
        (emit_indexop): Ditto.
        (expand_vec_series): Ditto.
        (expand_const_vector): Add series vector handling.
        * config/riscv/riscv.cc (riscv_const_insns): Enable series vector for testing.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/series-1.c: New test.
        * gcc.target/riscv/rvv/autovec/series_run-1.c: New test.

---
 gcc/config/riscv/autovec.md                   |  24 ++++
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-v.cc                   | 118 +++++++++++++++++-
 gcc/config/riscv/riscv.cc                     |  27 +++-
 .../gcc.target/riscv/rvv/autovec/series-1.c   |  50 ++++++++
 .../riscv/rvv/autovec/series_run-1.c          |  20 +++
 6 files changed, 236 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/series-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/series_run-1.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index f1c5ff5951b..99dc4f046b0 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -58,3 +58,27 @@
     DONE;
   }
 )
+
+;; =========================================================================
+;; == Vector creation
+;; =========================================================================
+
+;; -------------------------------------------------------------------------
+;; ---- [INT] Linear series
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vid.v
+;; - vmul.vx
+;; - vadd.vx/vadd.vi
+;; -------------------------------------------------------------------------
+
+(define_expand "@vec_series"
+  [(match_operand:VI 0 "register_operand")
+   (match_operand: 1 "reg_or_int_operand")
+   (match_operand: 2 "reg_or_int_operand")]
+  "TARGET_VECTOR"
+  {
+    riscv_vector::expand_vec_series (operands[0], operands[1], operands[2]);
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index c0293a306f9..e8a728ae226 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -219,6 +219,7 @@ rtx gen_avl_for_scalar_move (rtx);
 void expand_tuple_move (machine_mode, rtx *);
 machine_mode preferred_simd_mode (scalar_mode);
 opt_machine_mode get_mask_mode (machine_mode);
+void expand_vec_series (rtx, rtx, rtx);
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 7ca49ca67c1..0c3b1b4c40b 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -248,6 +248,111 @@ emit_nonvlmax_op (unsigned icode, rtx dest, rtx src, rtx len,
   emit_pred_op (icode, NULL_RTX, dest, src, len, mask_mode, false);
 }
 
+/* Emit binary operations.  */
+
+static void
+emit_binop (unsigned icode, rtx *ops, machine_mode mask_mode,
+            machine_mode scalar_mode)
+{
+  insn_expander<9> e;
+  machine_mode mode = GET_MODE (ops[0]);
+  e.add_output_operand (ops[0], mode);
+  e.add_all_one_mask_operand (mask_mode);
+  e.add_vundef_operand (mode);
+  if (VECTOR_MODE_P (GET_MODE (ops[1])))
+    e.add_input_operand (ops[1], GET_MODE (ops[1]));
+  else
+    e.add_input_operand (ops[1], scalar_mode);
+  if (VECTOR_MODE_P (GET_MODE (ops[2])))
+    e.add_input_operand (ops[2], GET_MODE (ops[2]));
+  else
+    e.add_input_operand (ops[2], scalar_mode);
+  rtx vlmax = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (mode, vlmax);
+  e.add_input_operand (vlmax, Pmode);
+  e.add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy ());
+  e.add_avl_type_operand (avl_type::VLMAX);
+  e.expand ((enum insn_code) icode, false);
+}
+
+/* Emit vid.v instruction.  */
+
+static void
+emit_indexop (rtx target, machine_mode mask_mode)
+{
+  insn_expander<7> e;
+  machine_mode mode = GET_MODE (target);
+  e.add_output_operand (target, mode);
+  e.add_all_one_mask_operand (mask_mode);
+  e.add_vundef_operand (mode);
+  rtx vlmax = gen_reg_rtx (Pmode);
+  emit_vlmax_vsetvl (mode, vlmax);
+  e.add_input_operand (vlmax, Pmode);
+  e.add_policy_operand (get_prefer_tail_policy (), get_prefer_mask_policy ());
+  e.add_avl_type_operand (avl_type::VLMAX);
+  e.expand (code_for_pred_series (mode), false);
+}
+
+/* Expand series const vector.  */
+
+void
+expand_vec_series (rtx dest, rtx base, rtx step)
+{
+  machine_mode mode = GET_MODE (dest);
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  machine_mode mask_mode;
+  gcc_assert (get_mask_mode (mode).exists (&mask_mode));
+
+  /* VECT_IV = BASE + I * STEP.  */
+
+  /* Step 1: Generate I = { 0, 1, 2, ... } by vid.v.  */
+  rtx tmp = gen_reg_rtx (mode);
+  emit_indexop (tmp, mask_mode);
+  if (rtx_equal_p (step, const1_rtx) && rtx_equal_p (base, const0_rtx))
+    {
+      emit_move_insn (dest, tmp);
+      return;
+    }
+
+  /* Step 2: Generate I * STEP.
+     - STEP is 1, we don't emit any instructions.
+     - STEP is power of 2, we use vsll.vi/vsll.vx.
+     - STEP is non-power of 2, we use vmul.vx.  */
+  rtx tmp2 = gen_reg_rtx (mode);
+  if (!rtx_equal_p (step, const1_rtx))
+    {
+      if (CONST_INT_P (step) && pow2p_hwi (INTVAL (step)))
+        {
+          /* Emit logical left shift operation.  */
+          int shift = exact_log2 (INTVAL (step));
+          rtx shift_amount = gen_int_mode (shift, Pmode);
+          rtx ops[3] = {tmp2, tmp, shift_amount};
+          insn_code icode = code_for_pred_scalar (ASHIFT, mode);
+          emit_binop (icode, ops, mask_mode, Pmode);
+        }
+      else
+        {
+          rtx ops[3] = {tmp2, tmp, step};
+          insn_code icode = code_for_pred_scalar (MULT, mode);
+          emit_binop (icode, ops, mask_mode, inner_mode);
+        }
+      if (rtx_equal_p (base, const0_rtx))
+        {
+          emit_move_insn (dest, tmp2);
+          return;
+        }
+    }
+
+  /* Step 3: Generate BASE + I * STEP.
+     - BASE is 0, we don't emit any instructions.
+     - BASE is not 0, we use vadd.vx/vadd.vi.  */
+  rtx tmp3 = gen_reg_rtx (mode);
+  rtx ops[3] = {tmp3, rtx_equal_p (step, const1_rtx) ? tmp : tmp2, base};
+  insn_code icode = code_for_pred_scalar (PLUS, mode);
+  emit_binop (icode, ops, mask_mode, inner_mode);
+  emit_move_insn (dest, tmp3);
+}
+
 static void
 expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
 {
@@ -280,12 +385,19 @@ expand_const_vector (rtx target, rtx src, machine_mode mask_mode)
       return;
     }
 
+  /* Support scalable const series vector.  */
+  rtx base, step;
+  if (const_vec_series_p (src, &base, &step))
+    {
+      emit_insn (gen_vec_series (mode, target, base, step));
+      return;
+    }
+
   /* TODO: We only support const duplicate vector for now. More cases
      will be supported when we support auto-vectorization:
 
-       1. series vector.
-       2. multiple elts duplicate vector.
-       3. multiple patterns with multiple elts.  */
+       1. multiple elts duplicate vector.
+       2. multiple patterns with multiple elts.  */
 }
 
 /* Expand a pre-RA RVV data move from SRC to DEST.
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ff90c44d811..84e9267bcb2 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1266,9 +1266,34 @@ riscv_const_insns (rtx x)
       }
 
     case CONST_DOUBLE:
-    case CONST_VECTOR:
       /* We can use x0 to load floating-point zero.  */
       return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
+    case CONST_VECTOR:
+      {
+        /* TODO: This is not accurate, we will need to
+           adapt the COST of CONST_VECTOR in the future
+           for the following cases:
+
+          - 1. const duplicate vector with element value
+               in range of [-16, 15].
+          - 2. const duplicate vector with element value
+               out range of [-16, 15].
+          - 3. const series vector.
+          ...etc.  */
+        if (riscv_v_ext_vector_mode_p (GET_MODE (x)))
+          {
+            /* const series vector.  */
+            rtx base, step;
+            if (const_vec_series_p (x, &base, &step))
+              {
+                /* This is not accurate, we will need to adapt the COST
+                 * accurately according to BASE && STEP.  */
+                return 1;
+              }
+          }
+        /* TODO: We may support more const vector in the future.  */
+        return x == CONST0_RTX (GET_MODE (x)) ? 1 : 0;
+      }
 
     case CONST:
       /* See if we can refer to X directly.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/series-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/series-1.c
new file mode 100644
index 00000000000..a01f6ce7411
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/series-1.c
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4" } */
+
+#include
+
+#define NUM_ELEMS(TYPE) (64 / sizeof (TYPE))
+
+#define DEF_LOOP(TYPE, BASE, STEP, SUFFIX)                                     \
+  void __attribute__ ((noinline, noclone))                                     \
+  loop_##TYPE##_##SUFFIX (TYPE *restrict a)                                    \
+  {                                                                            \
+    for (int i = 0; i < NUM_ELEMS (TYPE); ++i)                                 \
+      a[i] = (BASE) + i * (STEP);                                              \
+  }
+
+#define TEST_SEW32_TYPES(T, BASE, STEP, SUFFIX)                                \
+  T (uint32_t, BASE, STEP, SUFFIX)                                             \
+  T (int32_t, BASE, STEP, SUFFIX)
+
+#define TEST_ALL(T)                                                            \
+  TEST_SEW32_TYPES (T, 0, 1, b0s1)                                             \
+  TEST_SEW32_TYPES (T, 0, 2, b0s2)                                             \
+  TEST_SEW32_TYPES (T, 0, 3, b0s3)                                             \
+  TEST_SEW32_TYPES (T, 0, 8, b0s8)                                             \
+  TEST_SEW32_TYPES (T, 0, 9, b0s9)                                             \
+  TEST_SEW32_TYPES (T, 0, 16, b0s16)                                           \
+  TEST_SEW32_TYPES (T, 0, 17, b0s17)                                           \
+  TEST_SEW32_TYPES (T, 0, 32, b0s32)                                           \
+  TEST_SEW32_TYPES (T, 0, 33, b0s33)                                           \
+  TEST_SEW32_TYPES (T, -16, 1, bm16s1)                                         \
+  TEST_SEW32_TYPES (T, 15, 1, b15s1)                                           \
+  TEST_SEW32_TYPES (T, -17, 1, bm17s1)                                         \
+  TEST_SEW32_TYPES (T, 16, 1, b16s1)                                           \
+  TEST_SEW32_TYPES (T, -16, 128, bm16s128)                                     \
+  TEST_SEW32_TYPES (T, 15, 128, b15s128)                                       \
+  TEST_SEW32_TYPES (T, -17, 128, bm17s128)                                     \
+  TEST_SEW32_TYPES (T, 16, 128, b16s128)                                       \
+  TEST_SEW32_TYPES (T, -16, 179, bm16s179)                                     \
+  TEST_SEW32_TYPES (T, 15, 179, b15s179)                                       \
+  TEST_SEW32_TYPES (T, -17, 179, bm17s179)                                     \
+  TEST_SEW32_TYPES (T, 16, 179, b16s179)                                       \
+  TEST_SEW32_TYPES (T, -16, 65536, bm16s65536)                                 \
+  TEST_SEW32_TYPES (T, 15, 65536, b15s65536)                                   \
+  TEST_SEW32_TYPES (T, -17, 65536, bm17s65536)                                 \
+  TEST_SEW32_TYPES (T, 16, 65536, b16s65536)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {vid\.v\s+v[0-9]+} 50 } } */
+/* { dg-final { scan-assembler-times {vsll\.vi\s+v[0-9]+} 24 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/series_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/series_run-1.c
new file mode 100644
index 00000000000..09a20809c65
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/series_run-1.c
@@ -0,0 +1,20 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4" } */
+
+#include "series-1.c"
+
+#define TEST_LOOP(TYPE, BASE, STEP, SUFFIX)                                    \
+  {                                                                            \
+    TYPE array[NUM_ELEMS (TYPE)] = {};                                         \
+    loop_##TYPE##_##SUFFIX (array);                                            \
+    for (int i = 0; i < NUM_ELEMS (TYPE); i++)                                 \
+      if (array[i] != (TYPE) (BASE + i * STEP))                                \
+        __builtin_abort ();                                                    \
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
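
Note for reviewers: the three-step lowering in expand_vec_series can be modeled with plain scalar C. The sketch below is only an illustration under stated assumptions. The names series_op, is_pow2 and model_vec_series are hypothetical and do not exist in GCC, the lanes are 32-bit with wrap-around arithmetic, and a STEP of zero or a negative STEP is simply sent down the multiply path. The returned flags show which of vid.v, vsll, vmul.vx and vadd would be needed for a given (BASE, STEP) pair.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the lowering; these names are illustrative
   and are not part of GCC.  Each flag stands for one modeled instruction.  */
enum series_op
{
  OP_VID = 1,   /* vid.v: lanes = { 0, 1, 2, ... }.  */
  OP_SHIFT = 2, /* vsll: lanes <<= log2 (STEP).  */
  OP_MUL = 4,   /* vmul.vx: lanes *= STEP.  */
  OP_ADD = 8    /* vadd.vx/vadd.vi: lanes += BASE.  */
};

/* Return 1 if X is a positive power of two (assumption: zero and negative
   values are not treated as powers of two in this model).  */
static int
is_pow2 (int32_t x)
{
  return x > 0 && (x & (x - 1)) == 0;
}

/* Fill OUT[0..N) with BASE + I * STEP using wrap-around 32-bit lanes and
   return the set of modeled instructions that were needed.  */
static unsigned
model_vec_series (uint32_t *out, size_t n, int32_t base, int32_t step)
{
  unsigned ops = OP_VID;

  /* Step 1: I = { 0, 1, 2, ... }.  */
  for (size_t i = 0; i < n; i++)
    out[i] = (uint32_t) i;

  /* Step 2: I * STEP, skipped entirely when STEP is 1.  */
  if (step != 1)
    {
      if (is_pow2 (step))
        {
          /* Power-of-two STEP: shift left by log2 (STEP).  */
          unsigned shift = 0;
          while (((uint32_t) 1 << shift) != (uint32_t) step)
            shift++;
          for (size_t i = 0; i < n; i++)
            out[i] <<= shift;
          ops |= OP_SHIFT;
        }
      else
        {
          /* Any other STEP: multiply by the scalar.  */
          for (size_t i = 0; i < n; i++)
            out[i] *= (uint32_t) step;
          ops |= OP_MUL;
        }
    }

  /* Step 3: BASE + I * STEP, skipped when BASE is 0.  */
  if (base != 0)
    {
      for (size_t i = 0; i < n; i++)
        out[i] += (uint32_t) base;
      ops |= OP_ADD;
    }

  return ops;
}
```

Under this model, 12 of the 25 (BASE, STEP) pairs in series-1.c have a power-of-two STEP other than 1. With two element types per pair, that is consistent with the 24 vsll.vi and 50 vid.v matches the test expects.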