From patchwork Fri May 5 13:51:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 90446 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp419997vqo; Fri, 5 May 2023 06:52:36 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7b1+e2UJdgbyJORbVXmECKFjA2NPefl2Py09hL84XOLGMtuGXpm5YTHtKoTCapIUgXkKv0 X-Received: by 2002:a17:907:2ce1:b0:959:c07b:84e0 with SMTP id hz1-20020a1709072ce100b00959c07b84e0mr1144598ejc.50.1683294756303; Fri, 05 May 2023 06:52:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683294756; cv=none; d=google.com; s=arc-20160816; b=nfDNe6nr066yIMVQ2ANqMVVmUQQSd7GfLL2myZMI2XFDuwKPqrDYpWay4sz4WZVhcU hr+MeGZOcR74DX8A6O6iV+wwmSP89D3UZM15SkqNhbjBCIL1PBXJ8CxRKeHpuSoW4UPJ W0tvJFv5Bh60mMrfZEUa8jiOLgkQA1bu6F5fuYLZKSrGZmfjfd42vD7EUi1wQvSX+KiY ugbIXtlPkP48BbuTquA1oLmuQZsLnvhTB1b1S7QUEnR6+WBYgfHqpL0GvSxUzRyCliLJ x3Q4N8STPjmzVMQKQ/JsJQg8fleqVJfLKm8G4Kat+5j2i4UPmzcP8dPVJGxVZ2vxEnA7 YB+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:dmarc-filter:delivered-to; bh=xNlq/N2MHedpyvB8Vc2Stb7Rdzxrto5gZoc9zZfm12U=; b=pyek7JH4/uf2P1TLD07s4mBtcgR2GWvS2+0QNLMAyHMTvvik3aWMXQDdSjl0iOKDvH 8PSAzDtOVHWOXBS4twopv5SlSg4bOJFguV0RlLqZ6VFCUX2FKWexpvM9Z3IWLpkP7fc/ YRegssKDzhfr4jgNWiS0o7z/S6IAGeYDQG2fAmLESsPFTQ4xq3i3lmCRhKYv9phJpX1g bfj4sqBi79mGTE9pP9uYmqfB9MVOp6kqtRW78RpNfsr+y7GRQtiEHImPnSkqcVbs3T6Y uaGLbwF6VssXGS5O2YbI7sjpoLIRQpws1xON+DHA/qz4U2aM0mq8/Qcz1/CROM2jRSRr hsOA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id ka19-20020a170907991300b009596e2086a0si1357094ejc.83.2023.05.05.06.52.36 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 05 May 2023 06:52:36 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AFEBD3855586 for ; Fri, 5 May 2023 13:52:29 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr1.qq.com (smtpbgbr1.qq.com [54.207.19.206]) by sourceware.org (Postfix) with ESMTPS id 2EC833858D20 for ; Fri, 5 May 2023 13:52:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2EC833858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp83t1683294716toa62pay Received: from rios-cad5.localdomain ( [58.60.1.11]) by bizesmtp.qq.com (ESMTP) with id ; Fri, 05 May 2023 21:51:55 +0800 (CST) X-QQ-SSF: 01400000000000F0Q000000A0000000 X-QQ-FEAT: LrCnY+iDm+MOiytTQTVkheejLmKJad+JDs+Y/FOEoPxXnuLw/IC1Oa1q3moiD ng1M2gEfMtreNjgdbs5AGk7YBdt8xASZwiP396yWUYABxeihq9A8sfDqrvnvQaqUMyo8yeK WQMAGNP5mjX/lfi1FTRSmqUK+3wHOg70QeUyQP+jtaizF9Uu0sfUa1ioXTifeBm2ttI5hGE /grQuEJ+71cgKFrcfD6sXt2QSviPO0djOJSwBJhXOGWzR6FfYC1HFdYDR8/j3nHQFXWVbeO RCYbNRsvlyQBjt/o5A5Uq5JNf69Uhxhugm/ne8X6fetQ6QzsUkNhG1ex5xzrYXXxDeSo+mR TOxzYjO5o1SarG6A9xs/ecgrYSUpzbYA1NtPs6ErQtUNAEVmqs= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 4668670930139548864 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Fix PR109748 Date: Fri, 5 May 2023 21:51:53 +0800 Message-Id: <20230505135153.1308864-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_NUMSUBJECT, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1765062482198110254?= X-GMAIL-MSGID: =?utf-8?q?1765062482198110254?= From: Juzhe-Zhong This patch is fixing my recent optimization patch: https://github.com/gcc-mirror/gcc/commit/d51f2456ee51bd59a79b4725ca0e488c25260bbf In that patch, the new_info = parse_insn (i) is not correct. Since consider the following case: vsetvli a5,a4, e8,m1 .. vsetvli zero,a5, e32, m4 vle8.v vmacc.vv ... Since we have backward demand fusion in Phase 1, so the real demand of "vle8.v" is e32, m4. However, if we use parse_insn (vle8.v) = e8, m1 which is not correct. So this patch we change new_info = new_info.parse_insn (i) into: vector_insn_info new_info = m_vector_manager->vector_insn_infos[i->uid ()]; So that, we can correctly optimize codes into: vsetvli a5,a4, e32, m4 .. .. (vsetvli zero,a5, e32, m4 is removed) vle8.v vmacc.vv Since m_vector_manager->vector_insn_infos is the member variable of pass_vsetvl class. We remove static void function "local_eliminate_vsetvl_insn", and make it as the member function of pass_vsetvl class. PR target/109748 gcc/ChangeLog: * config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): Remove it. (pass_vsetvl::local_eliminate_vsetvl_insn): New function. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/vsetvl/pr109748.c: New test. --- gcc/config/riscv/riscv-vsetvl.cc | 102 ++++++++++-------- .../gcc.target/riscv/rvv/vsetvl/pr109748.c | 36 +++++++ 2 files changed, 93 insertions(+), 45 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc index 39b4d21210b..e1efd7b1c40 100644 --- a/gcc/config/riscv/riscv-vsetvl.cc +++ b/gcc/config/riscv/riscv-vsetvl.cc @@ -1056,51 +1056,6 @@ change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info) change_insn (rinsn, new_pat); } -static void -local_eliminate_vsetvl_insn (const vector_insn_info &dem) -{ - const insn_info *insn = dem.get_insn (); - if (!insn || insn->is_artificial ()) - return; - rtx_insn *rinsn = insn->rtl (); - const bb_info *bb = insn->bb (); - if (vsetvl_insn_p (rinsn)) - { - rtx vl = get_vl (rinsn); - for (insn_info *i = insn->next_nondebug_insn (); - real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) - { - if (i->is_call () || i->is_asm () - || find_access (i->defs (), VL_REGNUM) - || find_access (i->defs (), VTYPE_REGNUM)) - return; - - if (has_vtype_op (i->rtl ())) - { - if (!vsetvl_discard_result_insn_p (PREV_INSN (i->rtl ()))) - return; - rtx avl = get_avl (i->rtl ()); - if (avl != vl) - return; - set_info *def = find_access (i->uses (), REGNO (avl))->def (); - if (def->insn () != insn) - return; - - vector_insn_info new_info; - new_info.parse_insn (i); - if (!new_info.skip_avl_compatible_p (dem)) - return; - - new_info.set_avl_info (dem.get_avl_info ()); - new_info = dem.merge (new_info, LOCAL_MERGE); - change_vsetvl_insn (insn, new_info); - eliminate_insn (PREV_INSN (i->rtl ())); - return; - } - } - } -} - static bool source_equal_p (insn_info *insn1, insn_info *insn2) { @@ -2672,6 +2627,7 @@ private: void pre_vsetvl (void); /* Phase 5. */ + void local_eliminate_vsetvl_insn (const vector_insn_info &) const; void cleanup_insns (void) const; /* Phase 6. */ @@ -3993,6 +3949,62 @@ pass_vsetvl::pre_vsetvl (void) commit_edge_insertions (); } +/* Local user vsetvl optimizaiton: + + Case 1: + vsetvl a5,a4,e8,mf8 + ... + vsetvl zero,a5,e8,mf8 --> Eliminate directly. + + Case 2: + vsetvl a5,a4,e8,mf8 --> vsetvl a5,a4,e32,mf2 + ... + vsetvl zero,a5,e32,mf2 --> Eliminate directly. */ +void +pass_vsetvl::local_eliminate_vsetvl_insn (const vector_insn_info &dem) const +{ + const insn_info *insn = dem.get_insn (); + if (!insn || insn->is_artificial ()) + return; + rtx_insn *rinsn = insn->rtl (); + const bb_info *bb = insn->bb (); + if (vsetvl_insn_p (rinsn)) + { + rtx vl = get_vl (rinsn); + for (insn_info *i = insn->next_nondebug_insn (); + real_insn_and_same_bb_p (i, bb); i = i->next_nondebug_insn ()) + { + if (i->is_call () || i->is_asm () + || find_access (i->defs (), VL_REGNUM) + || find_access (i->defs (), VTYPE_REGNUM)) + return; + + if (has_vtype_op (i->rtl ())) + { + if (!vsetvl_discard_result_insn_p (PREV_INSN (i->rtl ()))) + return; + rtx avl = get_avl (i->rtl ()); + if (avl != vl) + return; + set_info *def = find_access (i->uses (), REGNO (avl))->def (); + if (def->insn () != insn) + return; + + vector_insn_info new_info + = m_vector_manager->vector_insn_infos[i->uid ()]; + if (!new_info.skip_avl_compatible_p (dem)) + return; + + new_info.set_avl_info (dem.get_avl_info ()); + new_info = dem.merge (new_info, LOCAL_MERGE); + change_vsetvl_insn (insn, new_info); + eliminate_insn (PREV_INSN (i->rtl ())); + return; + } + } + } +} + /* Before VSETVL PASS, RVV instructions pattern is depending on AVL operand implicitly. Since we will emit VSETVL instruction and make RVV instructions depending on VL/VTYPE global status registers, we remove the such AVL operand diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c new file mode 100644 index 00000000000..81c42c5a82a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */ + +#include "riscv_vector.h" + +int byte_mac_vec(unsigned char *a, unsigned char *b, int len) { + size_t vlmax = __riscv_vsetvlmax_e8m1(); + vint32m4_t vec_s = __riscv_vmv_v_x_i32m4(0, vlmax); + vint32m1_t vec_zero = __riscv_vmv_v_x_i32m1(0, vlmax); + int k = len; + + for (size_t vl; k > 0; k -= vl, a += vl, b += vl) { + vl = __riscv_vsetvl_e8m1(k); + + vuint8m1_t a8s = __riscv_vle8_v_u8m1(a, vl); + vuint8m1_t b8s = __riscv_vle8_v_u8m1(b, vl); + vuint32m4_t a8s_extended = __riscv_vzext_vf4_u32m4(a8s, vl); + vuint32m4_t b8s_extended = __riscv_vzext_vf4_u32m4(a8s, vl); + + vint32m4_t a8s_as_i32 = __riscv_vreinterpret_v_u32m4_i32m4(a8s_extended); + vint32m4_t b8s_as_i32 = __riscv_vreinterpret_v_u32m4_i32m4(b8s_extended); + + vec_s = __riscv_vmacc_vv_i32m4_tu(vec_s, a8s_as_i32, b8s_as_i32, vl); + } + + vint32m1_t vec_sum = __riscv_vredsum_vs_i32m4_i32m1(vec_s, vec_zero, __riscv_vsetvl_e32m4(len)); + int sum = __riscv_vmv_x_s_i32m1_i32(vec_sum); + + return sum; +} + +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32,\s*m4,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32,\s*m4,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*e32,\s*m4,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ +/* { dg-final { scan-assembler-times {vsetvli} 4 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */