From patchwork Fri Nov 3 00:36:03 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 161145
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: Juzhe-Zhong
Subject: [Committed V3] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]
Date: Fri, 3 Nov 2023 08:36:03 +0800
Message-Id: <20231103003603.3613011-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
List-Id: Gcc-patches mailing list

With the compile option --param=riscv-autovec-preference=fixed-vlmax, we
generate redundant AVL/VL toggling:

        vsetvli         a5,a3,e8,mf4,ta,ma      -> should be changed into e32m1
        vle32.v         v1,0(a1)
        vle32.v         v2,0(a0)
        vsetivli        zero,4,e32,m1,ta,ma     -> redundant
        slli            a2,a5,2
        vadd.vv         v1,v1,v2
        sub             a3,a3,a5
        vsetvli         zero,a5,e32,m1,ta,ma    -> redundant
        vse32.v         v1,0(a4)
        add             a0,a0,a2
        add             a1,a1,a2
        add             a4,a4,a2
        bne             a3,zero,.L3

The root cause is that we simplify the AVL into an immediate AVL too early
in the FIXED-VLMAX case.  The later avlprop pass then fails to propagate the
AVL generated by (SELECT_VL/vsetvl VL, AVL) into the normal RVV instructions.
So we need to remove the immediate AVL simplification from the 'expand' stage.

After this patch:

        vsetvli         a5,a3,e32,m1,ta,ma
        slli            a2,a5,2
        vle32.v         v1,0(a1)
        vle32.v         v2,0(a0)
        sub             a3,a3,a5
        vadd.vv         v1,v1,v2
        vse32.v         v1,0(a4)
        add             a0,a0,a2
        add             a1,a1,a2
        add             a4,a4,a2
        bne             a3,zero,.L3

With the simplification removed from 'expand', the following case still has
to be handled:

typedef int8_t vnx2qi __attribute__ ((vector_size (2)));

__attribute__ ((noipa)) void
f_vnx2qi (int8_t a, int8_t b, int8_t *out)
{
  vnx2qi v = {a, b};
  *(vnx2qi *) out = v;
}

Here we should use vsetivli zero,2 instead of vsetvli a5,zero.  That
simplification is now done in the avlprop pass, which is also part of this
patch, so that such cases do not regress.

        PR target/112326

gcc/ChangeLog:

        * config/riscv/riscv-avlprop.cc (get_insn_vtype_mode): New function.
        (simplify_replace_vlmax_avl): Ditto.
        (pass_avlprop::execute): Add immediate AVL simplification.
        * config/riscv/riscv-protos.h (imm_avl_p): Rename.
        * config/riscv/riscv-v.cc (const_vlmax_p): Ditto.
        (imm_avl_p): Ditto.
        (emit_vlmax_insn): Adapt for new interface name.
        * config/riscv/vector.md (mode_idx): New attribute.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/pr112326.c: New test.

---
 gcc/config/riscv/riscv-avlprop.cc             | 95 ++++++++++++++-----
 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv-v.cc                   | 28 ++----
 gcc/config/riscv/vector.md                    | 29 +++++-
 .../gcc.target/riscv/rvv/autovec/pr112326.c   | 16 ++++
 5 files changed, 124 insertions(+), 45 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c

diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc
index bcd77a3047a..1dfaa8742da 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -109,6 +109,48 @@ vlmax_ta_p (rtx_insn *rinsn)
   return vlmax_avl_type_p (rinsn) && tail_agnostic_p (rinsn);
 }
 
+static machine_mode
+get_insn_vtype_mode (rtx_insn *rinsn)
+{
+  extract_insn_cached (rinsn);
+  int mode_idx = get_attr_mode_idx (rinsn);
+  gcc_assert (mode_idx != INVALID_ATTRIBUTE);
+  return GET_MODE (recog_data.operand[mode_idx]);
+}
+
+static void
+simplify_replace_vlmax_avl (rtx_insn *rinsn, rtx new_avl)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "\nPropagating AVL: ");
+      print_rtl_single (dump_file, new_avl);
+      fprintf (dump_file, "into: ");
+      print_rtl_single (dump_file, rinsn);
+    }
+  /* Replace AVL operand.  */
+  extract_insn_cached (rinsn);
+  rtx avl = recog_data.operand[get_attr_vl_op_idx (rinsn)];
+  int count = count_regno_occurrences (rinsn, REGNO (avl));
+  gcc_assert (count == 1);
+  rtx new_pat = simplify_replace_rtx (PATTERN (rinsn), avl, new_avl);
+  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
+
+  /* Change AVL TYPE into NONVLMAX if it is VLMAX.  */
+  if (vlmax_avl_type_p (rinsn))
+    {
+      int index = get_attr_avl_type_idx (rinsn);
+      gcc_assert (index != INVALID_ATTRIBUTE);
+      validate_change_or_fail (rinsn, recog_data.operand_loc[index],
+                               get_avl_type_rtx (avl_type::NONVLMAX), false);
+    }
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Successfully to match this instruction: ");
+      print_rtl_single (dump_file, rinsn);
+    }
+}
+
 const pass_data pass_data_avlprop = {
   RTL_PASS,  /* type */
   "avlprop", /* name */
@@ -384,34 +426,35 @@ pass_avlprop::execute (function *fn)
   for (const auto prop : *m_avl_propagations)
     {
       rtx_insn *rinsn = prop.first->rtl ();
+      simplify_replace_vlmax_avl (rinsn, prop.second);
+    }
+
+  if (riscv_autovec_preference == RVV_FIXED_VLMAX)
+    {
+      /* Simplify VLMAX AVL into immediate AVL.
+         E.g. Simplify this following case:
+
+              vsetvl a5, zero, e32, m1
+              vadd.vv
+
+            into:
+
+              vsetvl zero, 4, e32, m1
+              vadd.vv
+         if GET_MODE_NUNITS (RVVM1SImode) == 4.  */
       if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "\nSimplifying VLMAX AVL into IMM AVL\n\n");
+      for (auto &candidate : m_candidates)
         {
-          fprintf (dump_file, "\nPropagating AVL: ");
-          print_rtl_single (dump_file, prop.second);
-          fprintf (dump_file, "into: ");
-          print_rtl_single (dump_file, rinsn);
-        }
-      /* Replace AVL operand.  */
-      extract_insn_cached (rinsn);
-      rtx avl = recog_data.operand[get_attr_vl_op_idx (rinsn)];
-      int count = count_regno_occurrences (rinsn, REGNO (avl));
-      gcc_assert (count == 1);
-      rtx new_pat = simplify_replace_rtx (PATTERN (rinsn), avl, prop.second);
-      validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
-
-      /* Change AVL TYPE into NONVLMAX if it is VLMAX.  */
-      if (vlmax_avl_type_p (rinsn))
-        {
-          int index = get_attr_avl_type_idx (rinsn);
-          gcc_assert (index != INVALID_ATTRIBUTE);
-          validate_change_or_fail (rinsn, recog_data.operand_loc[index],
-                                   get_avl_type_rtx (avl_type::NONVLMAX),
-                                   false);
-        }
-      if (dump_file && (dump_flags & TDF_DETAILS))
-        {
-          fprintf (dump_file, "Successfully to match this instruction: ");
-          print_rtl_single (dump_file, rinsn);
+          rtx_insn *rinsn = candidate.second->rtl ();
+          machine_mode vtype_mode = get_insn_vtype_mode (rinsn);
+          if (candidate.first == AVLPROP_VLMAX_TA
+              && !m_avl_propagations->get (candidate.second)
+              && imm_avl_p (vtype_mode))
+            {
+              rtx new_avl = gen_int_mode (GET_MODE_NUNITS (vtype_mode), Pmode);
+              simplify_replace_vlmax_avl (rinsn, new_avl);
+            }
         }
     }
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 02056591ec6..6a0c59bd63f 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -593,6 +593,7 @@ bool vlmax_avl_p (rtx);
 uint8_t get_sew (rtx_insn *);
 enum vlmul_type get_vlmul (rtx_insn *);
 int count_regno_occurrences (rtx_insn *, unsigned int);
+bool imm_avl_p (machine_mode);
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 668f3cd706b..f9363fc9355 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -55,17 +55,17 @@ using namespace riscv_vector;
 
 namespace riscv_vector {
 
-/* Return true if vlmax is constant value and can be used in vsetivl.  */
-static bool
-const_vlmax_p (machine_mode mode)
+/* Return true if NUNTIS <=31 so that we can use immediate AVL in vsetivli.  */
+bool
+imm_avl_p (machine_mode mode)
 {
-  poly_uint64 nuints = GET_MODE_NUNITS (mode);
+  poly_uint64 nunits = GET_MODE_NUNITS (mode);
 
-  return nuints.is_constant ()
-    /* The vsetivli can only hold register 0~31.  */
-    ? (IN_RANGE (nuints.to_constant (), 0, 31))
-    /* Only allowed in VLS-VLMAX mode.  */
-    : false;
+  return nunits.is_constant ()
+           /* The vsetivli can only hold register 0~31.  */
+           ? (IN_RANGE (nunits.to_constant (), 0, 31))
+           /* Only allowed in VLS-VLMAX mode.  */
+           : false;
 }
 
 /* Helper functions for insn_flags && insn_types */
@@ -298,14 +298,6 @@ public:
             len = force_reg (Pmode, len);
             vls_p = true;
           }
-        else if (const_vlmax_p (vtype_mode))
-          {
-            /* Optimize VLS-VLMAX code gen, we can use vsetivli instead of
-               the vsetvli to obtain the value of vlmax.  */
-            poly_uint64 nunits = GET_MODE_NUNITS (vtype_mode);
-            len = gen_int_mode (nunits, Pmode);
-            vls_p = true;
-          }
         else if (can_create_pseudo_p ())
           {
             len = gen_reg_rtx (Pmode);
@@ -370,7 +362,7 @@ void
 emit_vlmax_insn (unsigned icode, unsigned insn_flags, rtx *ops)
 {
   insn_expander e (insn_flags, true);
-  gcc_assert (can_create_pseudo_p () || const_vlmax_p (e.get_vtype_mode (ops)));
+  gcc_assert (can_create_pseudo_p () || imm_avl_p (e.get_vtype_mode (ops)));
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index a1a78120525..ce5c5be8e42 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -708,6 +708,32 @@
            (const_int 5)]
         (const_int INVALID_ATTRIBUTE)))
 
+;; The index of operand[] represents the machine mode of the instruction.
+(define_attr "mode_idx" ""
+        (cond [(eq_attr "type" "vlde,vste,vldm,vstm,vlds,vsts,vldux,vldox,vldff,vldr,vstr,\
+                                vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,vialu,vext,vicalu,\
+                                vshift,vicmp,viminmax,vimul,vidiv,vimuladd,vimerge,vimov,\
+                                vsalu,vaalu,vsmul,vsshift,vfalu,vfmul,vfdiv,vfmuladd,vfsqrt,vfrecp,\
+                                vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
+                                vfcvtitof,vfncvtitof,vfncvtftoi,vfncvtftof,vmalu,vmiota,vmidx,\
+                                vimovxv,vfmovfv,vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,\
+                                vgather,vcompress,vmov,vnclip,vnshift")
+               (const_int 0)
+
+               (eq_attr "type" "vimovvx,vfmovvf")
+               (const_int 1)
+
+               (eq_attr "type" "vssegte,vmpop,vmffs")
+               (const_int 2)
+
+               (eq_attr "type" "vstux,vstox,vssegts,vssegtux,vssegtox,vfcvtftoi,vfwcvtitof,vfwcvtftoi,
+                                vfwcvtftof,vmsfs,vired,viwred,vfredu,vfredo,vfwredu,vfwredo")
+               (const_int 3)
+
+               (eq_attr "type" "viwalu,viwmul,viwmuladd,vfwalu,vfwmul,vfwmuladd")
+               (const_int 4)]
+              (const_int INVALID_ATTRIBUTE)))
+
 ;; The index of operand[] to get the avl op.
 (define_attr "vl_op_idx" ""
   (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vldm,vstm,vmalu,vsts,vstux,\
@@ -1207,7 +1233,8 @@
 }
   [(set_attr "type" "vmov,vlde,vste")
    (set_attr "mode" "")
-   (set (attr "avl_type_idx") (const_int INVALID_ATTRIBUTE))])
+   (set (attr "avl_type_idx") (const_int INVALID_ATTRIBUTE))
+   (set (attr "mode_idx") (const_int INVALID_ATTRIBUTE))])
 
 ;; -----------------------------------------------------------------
 ;; ---- VLS Moves Operations
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c
new file mode 100644
index 00000000000..2ad50139cb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+
+void
+f (int *__restrict y, int *__restrict x, int *__restrict z, int n)
+{
+  for (int i = 0; i < n; ++i)
+    x[i] = y[i] + x[i];
+}
+
+/* { dg-final { scan-assembler-times {vsetvli} 1 } } */
+/* { dg-final { scan-assembler-not {vsetivli} } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*[a-x0-9]+,\s*[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-not {vsetvli\s*[a-x0-9]+,\s*zero} } } */
+/* { dg-final { scan-assembler-not {vsetvli\s*zero} } } */
+/* { dg-final { scan-assembler-not {vsetivli\s*zero} } } */
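
Note for readers (not part of the patch): the range check in imm_avl_p can
be illustrated with a small standalone C sketch.  It assumes a fixed VLEN,
as in the fixed-vlmax case, and uses the RVV relation
VLMAX = VLEN * LMUL / SEW; vsetivli encodes its AVL as a 5-bit unsigned
immediate, hence the 0..31 limit.  The names vlmax_elems and
fits_vsetivli_imm are hypothetical and do not exist in GCC, which performs
the check on GET_MODE_NUNITS of a machine mode instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: number of elements in a full vector register group,
   VLMAX = VLEN * LMUL / SEW.  LMUL is passed as a fraction lmul_num/lmul_den
   so that fractional settings such as mf4 (1/4) can be expressed.  */
uint64_t
vlmax_elems (uint64_t vlen_bits, unsigned sew_bits,
             unsigned lmul_num, unsigned lmul_den)
{
  assert (sew_bits != 0 && lmul_den != 0);
  return vlen_bits * lmul_num / ((uint64_t) sew_bits * lmul_den);
}

/* Hypothetical stand-in for the range test in imm_avl_p: the AVL field of
   vsetivli is a 5-bit unsigned immediate, so only 0..31 can be encoded.  */
bool
fits_vsetivli_imm (uint64_t nunits)
{
  return nunits <= 31;
}
```

With VLEN = 128, e32/m1 gives 4 elements, which matches the
"GET_MODE_NUNITS (RVVM1SImode) == 4" example in the avlprop comment and fits
the immediate form.  e8/m8 at the same VLEN gives 128 elements, which does
not fit, so such an instruction would have to keep a register AVL.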