From patchwork Thu Nov 2 03:34:27 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 160778
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH V2] RISC-V: Fix redundant vsetvl in fixed-vlmax vectorized codes[PR112326]
Date: Thu, 2 Nov 2023 11:34:27 +0800
Message-Id: <20231102033427.178709-1-juzhe.zhong@rivai.ai>

With the compile option --param=riscv-autovec-preference=fixed-vlmax, we
get redundant AVL/VL toggling:

        vsetvli  a5,a3,e8,mf4,ta,ma     -> should be changed into e32m1
        vle32.v  v1,0(a1)
        vle32.v  v2,0(a0)
        vsetivli zero,4,e32,m1,ta,ma    -> redundant
        slli     a2,a5,2
        vadd.vv  v1,v1,v2
        sub      a3,a3,a5
        vsetvli  zero,a5,e32,m1,ta,ma   -> redundant
        vse32.v  v1,0(a4)
        add      a0,a0,a2
        add      a1,a1,a2
        add      a4,a4,a2
        bne      a3,zero,.L3

The root cause is that we simplify the AVL into an immediate AVL too early
in the FIXED-VLMAX case.  The later avlprop pass then fails to propagate
the AVL generated by (SELECT_VL/vsetvl VL, AVL) into the normal RVV
instructions.

So we need to remove the immediate AVL simplification from the 'expand'
stage.

After this patch:

        vsetvli  a5,a3,e32,m1,ta,ma
        slli     a2,a5,2
        vle32.v  v1,0(a1)
        vle32.v  v2,0(a0)
        sub      a3,a3,a5
        vadd.vv  v1,v1,v2
        vse32.v  v1,0(a4)
        add      a0,a0,a2
        add      a1,a1,a2
        add      a4,a4,a2
        bne      a3,zero,.L3

With the simplification removed from 'expand', the following case needs to
be handled elsewhere:

        typedef int8_t vnx2qi __attribute__ ((vector_size (2)));

        __attribute__ ((noipa)) void
        f_vnx2qi (int8_t a, int8_t b, int8_t *out)
        {
          vnx2qi v = {a, b};
          *(vnx2qi *) out = v;
        }

Here we should use vsetivli zero,2 instead of vsetvl a5,zero.  This
simplification is now done in the avlprop pass, which is also included in
this patch so that such cases do not regress.

	PR target/112326

gcc/ChangeLog:

	* config/riscv/riscv-avlprop.cc (get_insn_vtype_mode): New function.
	(simplify_replace_vlmax_avl): Ditto.
	(pass_avlprop::execute): Add immediate AVL simplification.
	* config/riscv/riscv-protos.h (imm_avl_p): Rename.
	* config/riscv/riscv-v.cc (const_vlmax_p): Ditto.
	(imm_avl_p): Ditto.
	(emit_vlmax_insn): Adapt for new interface name.
	* config/riscv/vector.md (mode_idx): New attribute.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr112326.c: New test.

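Reviewer note: the immediate-AVL condition described above can be sketched in isolation. This is only a minimal illustration, not code from the patch: `vlmax` and `fits_vsetivli_imm` are hypothetical helpers, VLEN = 128 is an assumption (it matches the "GET_MODE_NUNITS (RVVM1SImode) == 4" case mentioned in the patch), and the range test simply restates the 0..31 check that imm_avl_p performs on a constant element count.

```cpp
#include <cstdint>

// Hypothetical helper, not part of GCC: VLMAX = VLEN / SEW * LMUL, with LMUL
// given as the fraction lmul_num / lmul_den (m1 = 1/1, mf4 = 1/4).
constexpr std::uint64_t
vlmax (std::uint64_t vlen, std::uint64_t sew, std::uint64_t lmul_num,
       std::uint64_t lmul_den)
{
  return vlen * lmul_num / (sew * lmul_den);
}

// Same range test as imm_avl_p, but for an element count that is already
// known to be constant: vsetivli encodes the AVL as a 5-bit unsigned
// immediate, so only 0..31 can be used.
constexpr bool
fits_vsetivli_imm (std::uint64_t nunits)
{
  return nunits <= 31;
}

// Assuming VLEN = 128: e32/m1 and e8/mf4 both give VLMAX = 4, which is the
// immediate seen in "vsetivli zero,4,e32,m1,ta,ma".
static_assert (vlmax (128, 32, 1, 1) == 4, "e32,m1");
static_assert (vlmax (128, 8, 1, 4) == 4, "e8,mf4");
// The 2-element vnx2qi store fits, so "vsetivli zero,2" is possible.
static_assert (fits_vsetivli_imm (2), "vnx2qi");
// 32 elements no longer fit; the AVL has to come from a register instead.
static_assert (!fits_vsetivli_imm (32), "too large");
```

The two VLMAX checks also show why the first vsetvli in the "before" listing could use e8,mf4 where e32,m1 was wanted: both settings yield the same VLMAX under the assumed VLEN.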
---
 gcc/config/riscv/riscv-avlprop.cc             | 95 ++++++++++++++-----
 gcc/config/riscv/riscv-protos.h               |  1 +
 gcc/config/riscv/riscv-v.cc                   | 24 ++---
 gcc/config/riscv/vector.md                    | 29 +++++-
 .../gcc.target/riscv/rvv/autovec/pr112326.c   | 16 ++++
 5 files changed, 122 insertions(+), 43 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c

diff --git a/gcc/config/riscv/riscv-avlprop.cc b/gcc/config/riscv/riscv-avlprop.cc
index c59eb7f6fa3..bec1e3c715a 100644
--- a/gcc/config/riscv/riscv-avlprop.cc
+++ b/gcc/config/riscv/riscv-avlprop.cc
@@ -109,6 +109,48 @@ vlmax_ta_p (rtx_insn *rinsn)
   return vlmax_avl_type_p (rinsn) && tail_agnostic_p (rinsn);
 }

+static machine_mode
+get_insn_vtype_mode (rtx_insn *rinsn)
+{
+  extract_insn_cached (rinsn);
+  int mode_idx = get_attr_mode_idx (rinsn);
+  gcc_assert (mode_idx != INVALID_ATTRIBUTE);
+  return GET_MODE (recog_data.operand[mode_idx]);
+}
+
+static void
+simplify_replace_vlmax_avl (rtx_insn *rinsn, rtx new_avl)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "\nPropagating AVL: ");
+      print_rtl_single (dump_file, new_avl);
+      fprintf (dump_file, "into: ");
+      print_rtl_single (dump_file, rinsn);
+    }
+  /* Replace AVL operand.  */
+  extract_insn_cached (rinsn);
+  rtx avl = recog_data.operand[get_attr_vl_op_idx (rinsn)];
+  int count = count_regno_occurrences (rinsn, REGNO (avl));
+  gcc_assert (count == 1);
+  rtx new_pat = simplify_replace_rtx (PATTERN (rinsn), avl, new_avl);
+  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
+
+  /* Change AVL TYPE into NONVLMAX if it is VLMAX.  */
+  if (vlmax_avl_type_p (rinsn))
+    {
+      int index = get_attr_avl_type_idx (rinsn);
+      gcc_assert (index != INVALID_ATTRIBUTE);
+      validate_change_or_fail (rinsn, recog_data.operand_loc[index],
+                               get_avl_type_rtx (avl_type::NONVLMAX), false);
+    }
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Successfully to match this instruction: ");
+      print_rtl_single (dump_file, rinsn);
+    }
+}
+
 const pass_data pass_data_avlprop = {
   RTL_PASS, /* type */
   "avlprop", /* name */
@@ -377,34 +419,35 @@ pass_avlprop::execute (function *fn)
   for (const auto prop : *m_avl_propagations)
     {
       rtx_insn *rinsn = prop.first->rtl ();
+      simplify_replace_vlmax_avl (rinsn, prop.second);
+    }
+
+  if (riscv_autovec_preference == RVV_FIXED_VLMAX)
+    {
+      /* Simplify VLMAX AVL into immediate AVL.
+         E.g. Simplify this following case:
+
+              vsetvl a5, zero, e32, m1
+              vadd.vv
+
+         into:
+
+              vsetvl zero, 4, e32, m1
+              vadd.vv
+         if GET_MODE_NUNITS (RVVM1SImode) == 4.  */
       if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "\nSimplifying VLMAX AVL into IMM AVL\n\n");
+      for (auto &candidate : m_candidates)
         {
-          fprintf (dump_file, "\nPropagating AVL: ");
-          print_rtl_single (dump_file, prop.second);
-          fprintf (dump_file, "into: ");
-          print_rtl_single (dump_file, rinsn);
-        }
-      /* Replace AVL operand.  */
-      extract_insn_cached (rinsn);
-      rtx avl = recog_data.operand[get_attr_vl_op_idx (rinsn)];
-      int count = count_regno_occurrences (rinsn, REGNO (avl));
-      gcc_assert (count == 1);
-      rtx new_pat = simplify_replace_rtx (PATTERN (rinsn), avl, prop.second);
-      validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
-
-      /* Change AVL TYPE into NONVLMAX if it is VLMAX.  */
-      if (vlmax_avl_type_p (rinsn))
-        {
-          int index = get_attr_avl_type_idx (rinsn);
-          gcc_assert (index != INVALID_ATTRIBUTE);
-          validate_change_or_fail (rinsn, recog_data.operand_loc[index],
-                                   get_avl_type_rtx (avl_type::NONVLMAX),
-                                   false);
-        }
-      if (dump_file && (dump_flags & TDF_DETAILS))
-        {
-          fprintf (dump_file, "Successfully to match this instruction: ");
-          print_rtl_single (dump_file, rinsn);
+          rtx_insn *rinsn = candidate.second->rtl ();
+          machine_mode vtype_mode = get_insn_vtype_mode (rinsn);
+          if (candidate.first == AVLPROP_VLMAX_TA
+              && !m_avl_propagations->get (candidate.second)
+              && imm_avl_p (vtype_mode))
+            {
+              rtx new_avl = gen_int_mode (GET_MODE_NUNITS (vtype_mode), Pmode);
+              simplify_replace_vlmax_avl (rinsn, new_avl);
+            }
         }
     }

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 02056591ec6..6a0c59bd63f 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -593,6 +593,7 @@ bool vlmax_avl_p (rtx);
 uint8_t get_sew (rtx_insn *);
 enum vlmul_type get_vlmul (rtx_insn *);
 int count_regno_occurrences (rtx_insn *, unsigned int);
+bool imm_avl_p (machine_mode);
 }

 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 668f3cd706b..679f922bc20 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -55,17 +55,17 @@ using namespace riscv_vector;

 namespace riscv_vector {

-/* Return true if vlmax is constant value and can be used in vsetivl.  */
-static bool
-const_vlmax_p (machine_mode mode)
+/* Return true if NUNTIS <=31 so that we can use immediate AVL in vsetivli.  */
+bool
+imm_avl_p (machine_mode mode)
 {
   poly_uint64 nuints = GET_MODE_NUNITS (mode);

   return nuints.is_constant ()
-    /* The vsetivli can only hold register 0~31.  */
-    ? (IN_RANGE (nuints.to_constant (), 0, 31))
-    /* Only allowed in VLS-VLMAX mode.  */
-    : false;
+           /* The vsetivli can only hold register 0~31.  */
+           ? (IN_RANGE (nuints.to_constant (), 0, 31))
+           /* Only allowed in VLS-VLMAX mode.  */
+           : false;
 }

 /* Helper functions for insn_flags && insn_types */
@@ -298,14 +298,6 @@ public:
                 len = force_reg (Pmode, len);
               vls_p = true;
             }
-          else if (const_vlmax_p (vtype_mode))
-            {
-              /* Optimize VLS-VLMAX code gen, we can use vsetivli instead of
-                 the vsetvli to obtain the value of vlmax.  */
-              poly_uint64 nunits = GET_MODE_NUNITS (vtype_mode);
-              len = gen_int_mode (nunits, Pmode);
-              vls_p = true;
-            }
           else if (can_create_pseudo_p ())
             {
               len = gen_reg_rtx (Pmode);
@@ -370,7 +362,7 @@ void
 emit_vlmax_insn (unsigned icode, unsigned insn_flags, rtx *ops)
 {
   insn_expander e (insn_flags, true);
-  gcc_assert (can_create_pseudo_p () || const_vlmax_p (e.get_vtype_mode (ops)));
+  gcc_assert (can_create_pseudo_p () || imm_avl_p (e.get_vtype_mode (ops)));

   e.emit_insn ((enum insn_code) icode, ops);
 }

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index a1a78120525..28baee59a9b 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -708,6 +708,32 @@
          (const_int 5)]
         (const_int INVALID_ATTRIBUTE)))

+;; The index of operand[] represents the machine mode of the instruction.
+(define_attr "mode_idx" ""
+        (cond [(eq_attr "type" "vlde,vste,vldm,vstm,vlds,vsts,vldux,vldox,vldff,vldr,vstr,\
+                                vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,vialu,vext,vicalu,\
+                                vshift,vicmp,viminmax,vimul,vidiv,vimuladd,vimerge,vimov,\
+                                vsalu,vaalu,vsmul,vsshift,vfalu,vfmul,vfdiv,vfmuladd,vfsqrt,vfrecp,\
+                                vfcmp,vfminmax,vfsgnj,vfclass,vfmerge,vfmov,\
+                                vfcvtitof,vfncvtitof,vfncvtftoi,vfncvtftof,vmalu,vmiota,vmidx,\
+                                vimovxv,vfmovfv,vslideup,vslidedown,vislide1up,vislide1down,vfslide1up,vfslide1down,\
+                                vgather,vcompress,vmov")
+               (const_int 0)
+
+               (eq_attr "type" "vimovvx,vfmovvf")
+               (const_int 1)
+
+               (eq_attr "type" "vssegte,vnshift,vmpop,vmffs")
+               (const_int 2)
+
+               (eq_attr "type" "vstux,vstox,vssegts,vssegtux,vssegtox,vfcvtftoi,vfwcvtitof,vfwcvtftoi,
+                                vfwcvtftof,vmsfs,vired,viwred,vfredu,vfredo,vfwredu,vfwredo")
+               (const_int 3)
+
+               (eq_attr "type" "viwalu,viwmul,viwmuladd,vnclip,vfwalu,vfwmul,vfwmuladd")
+               (const_int 4)]
+        (const_int INVALID_ATTRIBUTE)))
+
 ;; The index of operand[] to get the avl op.
 (define_attr "vl_op_idx" ""
   (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vldm,vstm,vmalu,vsts,vstux,\
@@ -1207,7 +1233,8 @@
   }
   [(set_attr "type" "vmov,vlde,vste")
    (set_attr "mode" "")
-   (set (attr "avl_type_idx") (const_int INVALID_ATTRIBUTE))])
+   (set (attr "avl_type_idx") (const_int INVALID_ATTRIBUTE))
+   (set (attr "mode_idx") (const_int INVALID_ATTRIBUTE))])

 ;; -----------------------------------------------------------------
 ;; ---- VLS Moves Operations
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c
new file mode 100644
index 00000000000..2ad50139cb2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112326.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model --param=riscv-autovec-preference=fixed-vlmax" } */
+
+void
+f (int *__restrict y, int *__restrict x, int *__restrict z, int n)
+{
+  for (int i = 0; i < n; ++i)
+    x[i] = y[i] + x[i];
+}
+
+/* { dg-final { scan-assembler-times {vsetvli} 1 } } */
+/* { dg-final { scan-assembler-not {vsetivli} } } */
+/* { dg-final { scan-assembler-times {vsetvli\s*[a-x0-9]+,\s*[a-x0-9]+} 1 } } */
+/* { dg-final { scan-assembler-not {vsetvli\s*[a-x0-9]+,\s*zero} } } */
+/* { dg-final { scan-assembler-not {vsetvli\s*zero} } } */
+/* { dg-final { scan-assembler-not {vsetivli\s*zero} } } */
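
Reviewer note: the dg-final directives of pr112326.c can be sanity-checked against the two loop bodies quoted in the commit message. The sketch below is only an approximation under stated assumptions: `count_matches` and `passes_pr112326_scans` are hypothetical helpers, and std::regex (ECMAScript syntax) stands in for the Tcl regexps that DejaGnu actually evaluates, which should behave the same for these simple patterns.

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Loop body quoted in the commit message before the patch (annotations removed).
const std::string kBeforeAsm =
  "vsetvli a5,a3,e8,mf4,ta,ma\n"
  "vle32.v v1,0(a1)\n"
  "vle32.v v2,0(a0)\n"
  "vsetivli zero,4,e32,m1,ta,ma\n"
  "slli a2,a5,2\n"
  "vadd.vv v1,v1,v2\n"
  "sub a3,a3,a5\n"
  "vsetvli zero,a5,e32,m1,ta,ma\n"
  "vse32.v v1,0(a4)\n";

// Loop body quoted in the commit message under "After this patch".
const std::string kAfterAsm =
  "vsetvli a5,a3,e32,m1,ta,ma\n"
  "slli a2,a5,2\n"
  "vle32.v v1,0(a1)\n"
  "vle32.v v2,0(a0)\n"
  "sub a3,a3,a5\n"
  "vadd.vv v1,v1,v2\n"
  "vse32.v v1,0(a4)\n";

// Hypothetical stand-in for scan-assembler-times: number of non-overlapping
// matches of PATTERN in TEXT.
std::size_t
count_matches (const std::string &text, const std::string &pattern)
{
  const std::regex re (pattern);
  const auto first = std::sregex_iterator (text.begin (), text.end (), re);
  return static_cast<std::size_t> (
    std::distance (first, std::sregex_iterator ()));
}

// True when all six directives of pr112326.c hold for ASM_TEXT.
bool
passes_pr112326_scans (const std::string &asm_text)
{
  return count_matches (asm_text, R"(vsetvli)") == 1
         && count_matches (asm_text, R"(vsetivli)") == 0
         && count_matches (asm_text, R"(vsetvli\s*[a-x0-9]+,\s*[a-x0-9]+)") == 1
         && count_matches (asm_text, R"(vsetvli\s*[a-x0-9]+,\s*zero)") == 0
         && count_matches (asm_text, R"(vsetvli\s*zero)") == 0
         && count_matches (asm_text, R"(vsetivli\s*zero)") == 0;
}
```

The character class [a-x0-9] excludes 'z', so "zero" can never satisfy a register slot in these patterns; the third directive therefore requires that both the VL destination and the AVL source of the single vsetvli are real registers.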