From patchwork Fri Sep 22 01:33:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Li Xu
X-Patchwork-Id: 143069
From: Li Xu
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, juzhe.zhong@rivai.ai, xuli
Subject: [PATCH] RISC-V: Optimization of vrgather.vv into vrgatherei16.vv[PR111451]
Date: Fri, 22 Sep 2023 01:33:09 +0000
Message-Id: <20230922013309.21359-1-xuli1@eswincomputing.com>
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list

From: xuli

Consider the following case:

typedef int32_t vnx32si __attribute__ ((vector_size (128)));

  __attribute__ ((noipa)) void permute_##TYPE (TYPE values1, TYPE values2,    \
                                               TYPE *out)                      \
  {                                                                            \
    TYPE v                                                                     \
      = __builtin_shufflevector (values1, values2, MASK_##NUNITS (0, NUNITS)); \
    *(TYPE *) out = v;                                                         \
  }

  T (vnx32si, 32)                                                              \

TEST_ALL (PERMUTE)

Before this patch:
        li      a4,31
        vsetvli a5,zero,e32,m8,ta,ma
        vl8re32.v       v24,0(a0)
        vid.v   v8
        vrsub.vx        v8,v8,a4
        vrgather.vv     v16,v24,v8
        vs8r.v  v16,0(a2)
        ret

The index vector held in "v8" occupies 8 registers, because vrgather.vv
uses the same element width (e32) and LMUL (m8) for the indices as for the
data. Optimize this into vrgatherei16.vv, which uses 16-bit index elements.

After this patch:
        vsetvli a5,zero,e16,m4,ta,ma
        li      a4,31
        vid.v   v4
        vl8re32.v       v16,0(a0)
        vrsub.vx        v4,v4,a4
        vsetvli zero,zero,e32,m8,ta,ma
        vrgatherei16.vv v8,v16,v4
        vs8r.v  v8,0(a2)
        ret

With vrgatherei16.vv, the index vector (now in "v4") occupies 4 registers
instead of 8, which reduces register consumption and register pressure.

gcc/ChangeLog:

        * config/riscv/riscv-v.cc (emit_vlmax_gather_insn): Optimize
        vrgather.vv into vrgatherei16.vv.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Adjust case.
        * gcc.target/riscv/rvv/autovec/vls/perm-4.c: Ditto.

---
 gcc/config/riscv/riscv-v.cc                   | 20 +++++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/perm-4.c      |  3 ++-
 .../gcc.target/riscv/rvv/autovec/vls/perm-4.c |  3 ++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 64a71a128d4..271e0ff6dfc 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -783,6 +783,8 @@ emit_vlmax_gather_insn (rtx target, rtx op, rtx sel)
   insn_code icode;
   machine_mode data_mode = GET_MODE (target);
   machine_mode sel_mode = GET_MODE (sel);
+  unsigned int data_sew = get_sew (data_mode);
+  enum vlmul_type data_lmul = get_vlmul (data_mode);
   if (maybe_ne (GET_MODE_SIZE (data_mode), GET_MODE_SIZE (sel_mode)))
     icode = code_for_pred_gatherei16 (data_mode);
   else if (const_vec_duplicate_p (sel, &elt))
@@ -790,6 +792,24 @@ emit_vlmax_gather_insn (rtx target, rtx op, rtx sel)
       icode = code_for_pred_gather_scalar (data_mode);
       sel = elt;
     }
+  else if (CONST_VECTOR_P (sel) && data_sew != 16
+           && data_sew != 8 && (data_lmul == LMUL_2
+           || data_lmul == LMUL_4 || data_lmul == LMUL_8))
+    {
+      /* If the inner mode of data is not QI or HI and data_lmul > 1,
+         emitting vrgatherei16.vv instruction will lower register
+         pressure.
+         data_mode  sel_mode  ei16
+         RVVM1QI    RVVM1QI   RVVM2HI  not needed
+         RVVM2QI    RVVM2QI   RVVM4HI  not needed
+         RVVM2HI    RVVM2HI   RVVM2HI  not needed
+         RVVM2SI    RVVM2SI   RVVM1HI  need
+         RVVM4SI    RVVM4SI   RVVM2HI  need
+         RVVM8DI    RVVM8DI   RVVM2HI  need  */
+      PUT_MODE (sel, get_vector_mode (HImode,
+                GET_MODE_NUNITS (data_mode)).require ());
+      icode = code_for_pred_gatherei16 (data_mode);
+    }
   else
     icode = code_for_pred_gather (data_mode);
   rtx ops[] = {target, op, sel};
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
index 9df69a0cc2c..7ab31043547 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
@@ -55,6 +55,7 @@
 
 TEST_ALL (PERMUTE)
 
-/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 31 } } */
+/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 19 } } */
+/* { dg-final { scan-assembler-times {vrgatherei16\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 12 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vi} 24 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vx} 7 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
index 46cad8ea2f4..4d6862cf1c0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
@@ -3,6 +3,7 @@
 
 #include "../vls-vlmax/perm-4.c"
 
-/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 31 } } */
+/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 19 } } */
+/* { dg-final { scan-assembler-times {vrgatherei16\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 12 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vi} 24 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vx} 7 } } */