From patchwork Fri Aug 11 08:45:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 134370
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com,
 rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH] RISC-V: Fix vec_series expander [PR110985]
Date: Fri, 11 Aug 2023 16:45:26 +0800
Message-Id: <20230811084526.237649-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.1
List-Id: Gcc-patches mailing list

This patch fixes the bug reported in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110985

expand_vec_series now emits its result into DEST only when DEST is a
register operand; otherwise it computes into a fresh pseudo register
and moves that into DEST at the end.  Previously the special case
{nunits - 1, nunits - 2, ..., 0} used DEST directly as the output
operand of the vrsub instruction.

	PR target/110985

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (expand_vec_series): Refactor the expander.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c: New test.

---
 gcc/config/riscv/riscv-v.cc                   | 74 +++++++--------
 .../riscv/rvv/autovec/vls-vlmax/pr110985.c    | 90 +++++++++++++++++++
 2 files changed, 129 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a3062c90618..5f9b296c92e 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -1309,6 +1309,7 @@ expand_vec_series (rtx dest, rtx base, rtx step)
   machine_mode mode = GET_MODE (dest);
   poly_int64 nunits_m1 = GET_MODE_NUNITS (mode) - 1;
   poly_int64 value;
+  rtx result = register_operand (dest, mode) ? dest : gen_reg_rtx (mode);
 
   /* VECT_IV = BASE + I * STEP.  */
 
@@ -1317,15 +1318,10 @@ expand_vec_series (rtx dest, rtx base, rtx step)
   rtx op[] = {vid};
   emit_vlmax_insn (code_for_pred_series (mode), RVV_MISC_OP, op);
 
-  /* Step 2: Generate I * STEP.
-     - STEP is 1, we don't emit any instructions.
-     - STEP is power of 2, we use vsll.vi/vsll.vx.
-     - STEP is non-power of 2, we use vmul.vx.  */
   rtx step_adj;
-  if (rtx_equal_p (step, const1_rtx))
-    step_adj = vid;
-  else if (rtx_equal_p (step, constm1_rtx) && poly_int_rtx_p (base, &value)
-           && known_eq (nunits_m1, value))
+  if (rtx_equal_p (step, constm1_rtx)
+      && poly_int_rtx_p (base, &value)
+      && known_eq (nunits_m1, value))
     {
       /* Special case: {nunits - 1, nunits - 2, ... , 0}.
 
@@ -1334,46 +1330,54 @@ expand_vec_series (rtx dest, rtx base, rtx step)
          Code sequence:
            vid.v v
            vrsub nunits - 1, v.  */
-      rtx ops[] = {dest, vid, gen_int_mode (nunits_m1, GET_MODE_INNER (mode))};
+      rtx ops[]
+        = {result, vid, gen_int_mode (nunits_m1, GET_MODE_INNER (mode))};
       insn_code icode = code_for_pred_sub_reverse_scalar (mode);
       emit_vlmax_insn (icode, RVV_BINOP, ops);
-      return;
     }
   else
     {
-      step_adj = gen_reg_rtx (mode);
-      if (CONST_INT_P (step) && pow2p_hwi (INTVAL (step)))
+      /* Step 2: Generate I * STEP.
+         - STEP is 1, we don't emit any instructions.
+         - STEP is power of 2, we use vsll.vi/vsll.vx.
+         - STEP is non-power of 2, we use vmul.vx.  */
+      if (rtx_equal_p (step, const1_rtx))
+        step_adj = vid;
+      else
         {
-          /* Emit logical left shift operation.  */
-          int shift = exact_log2 (INTVAL (step));
-          rtx shift_amount = gen_int_mode (shift, Pmode);
-          insn_code icode = code_for_pred_scalar (ASHIFT, mode);
-          rtx ops[] = {step_adj, vid, shift_amount};
-          emit_vlmax_insn (icode, RVV_BINOP, ops);
+          step_adj = gen_reg_rtx (mode);
+          if (CONST_INT_P (step) && pow2p_hwi (INTVAL (step)))
+            {
+              /* Emit logical left shift operation.  */
+              int shift = exact_log2 (INTVAL (step));
+              rtx shift_amount = gen_int_mode (shift, Pmode);
+              insn_code icode = code_for_pred_scalar (ASHIFT, mode);
+              rtx ops[] = {step_adj, vid, shift_amount};
+              emit_vlmax_insn (icode, RVV_BINOP, ops);
+            }
+          else
+            {
+              insn_code icode = code_for_pred_scalar (MULT, mode);
+              rtx ops[] = {step_adj, vid, step};
+              emit_vlmax_insn (icode, RVV_BINOP, ops);
+            }
         }
+
+      /* Step 3: Generate BASE + I * STEP.
+         - BASE is 0, use result of vid.
+         - BASE is not 0, we use vadd.vx/vadd.vi.  */
+      if (rtx_equal_p (base, const0_rtx))
+        emit_move_insn (result, step_adj);
       else
         {
-          insn_code icode = code_for_pred_scalar (MULT, mode);
-          rtx ops[] = {step_adj, vid, step};
+          insn_code icode = code_for_pred_scalar (PLUS, mode);
+          rtx ops[] = {result, step_adj, base};
           emit_vlmax_insn (icode, RVV_BINOP, ops);
         }
     }
 
-  /* Step 3: Generate BASE + I * STEP.
-     - BASE is 0, use result of vid.
-     - BASE is not 0, we use vadd.vx/vadd.vi.  */
-  if (rtx_equal_p (base, const0_rtx))
-    {
-      emit_move_insn (dest, step_adj);
-    }
-  else
-    {
-      rtx result = gen_reg_rtx (mode);
-      insn_code icode = code_for_pred_scalar (PLUS, mode);
-      rtx ops[] = {result, step_adj, base};
-      emit_vlmax_insn (icode, RVV_BINOP, ops);
-      emit_move_insn (dest, result);
-    }
+  if (result != dest)
+    emit_move_insn (dest, result);
 }
 
 static void
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c
new file mode 100644
index 00000000000..7710654c1bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/pr110985.c
@@ -0,0 +1,90 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl256b -mabi=lp64d -O3 --param=riscv-autovec-preference=fixed-vlmax -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include 
+
+typedef int16_t vnx16i __attribute__ ((vector_size (32)));
+
+/*
+** foo1:
+**	vsetivli\s+zero,\s*16,\s*e16,\s*m1,\s*t[au],\s*m[au]
+**	vid\.v\s+v[0-9]+
+**	vrsub\.vi\s+v[0-9]+,\s*v[0-9]+,\s*15
+**	vs1r\.v\s+v[0-9]+,\s*0\([a-x0-9]+\)
+**	ret
+*/
+void
+foo1 (int16_t *__restrict out)
+{
+  vnx16i v = {15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0};
+  *(vnx16i *) out = v;
+}
+
+/*
+** foo2:
+**	vsetivli\s+zero,\s*16,\s*e16,\s*m1,\s*t[au],\s*m[au]
+**	vid\.v\s+v[0-9]+
+**	li\s+[a-x0-9]+,\s*7
+**	vmul\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+**	vadd\.vi\s+v[0-9]+,\s*v[0-9]+,\s*3
+**	vs1r\.v\s+v[0-9]+,\s*0\([a-x0-9]+\)
+**	ret
+*/
+void
+foo2 (int16_t *__restrict out)
+{
+  vnx16i v
+    = {3, 3 + 7 * 1, 3 + 7 * 2, 3 + 7 * 3, 3 + 7 * 4, 3 + 7 * 5,
+       3 + 7 * 6, 3 + 7 * 7, 3 + 7 * 8, 3 + 7 * 9, 3 + 7 * 10, 3 + 7 * 11,
+       3 + 7 * 12, 3 + 7 * 13, 3 + 7 * 14, 3 + 7 * 15};
+  *(vnx16i *) out = v;
+}
+
+/*
+** foo3:
+**	vsetivli\s+zero,\s*16,\s*e16,\s*m1,\s*t[au],\s*m[au]
+**	vid\.v\s+v[0-9]+
+**	vs1r\.v\s+v[0-9]+,\s*0\([a-x0-9]+\)
+**	ret
+*/
+void
+foo3 (int16_t *__restrict out)
+{
+  vnx16i v
+    = {0, 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  *(vnx16i *) out = v;
+}
+
+/*
+** foo4:
+**	vsetivli\s+zero,\s*16,\s*e16,\s*m1,\s*t[au],\s*m[au]
+**	vid\.v\s+v[0-9]+
+**	li\s+[a-x0-9]+,\s*6
+**	vmul\.vx\s+v[0-9]+,\s*v[0-9]+,\s*[a-x0-9]+
+**	vs1r\.v\s+v[0-9]+,\s*0\([a-x0-9]+\)
+**	ret
+*/
+void
+foo4 (int16_t *__restrict out)
+{
+  vnx16i v
+    = {0*6, 1*6,2*6,3*6,4*6,5*6,6*6,7*6,8*6,9*6,10*6,11*6,12*6,13*6,14*6,15*6};
+  *(vnx16i *) out = v;
+}
+
+/*
+** foo5:
+**	vsetivli\s+zero,\s*16,\s*e16,\s*m1,\s*t[au],\s*m[au]
+**	vid\.v\s+v[0-9]+
+**	vadd\.vi\s+v[0-9]+,\s*v[0-9]+,\s*-16
+**	vs1r\.v\s+v[0-9]+,\s*0\([a-x0-9]+\)
+**	ret
+*/
+void
+foo5 (int16_t *__restrict out)
+{
+  vnx16i v
+    = {0-16, 1-16,2-16,3-16,4-16,5-16,6-16,7-16,8-16,9-16,10-16,11-16,12-16,13-16,14-16,15-16};
+  *(vnx16i *) out = v;
+}