From patchwork Fri Sep 22 02:37:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Li Xu
X-Patchwork-Id: 143105
From: Li Xu
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, palmer@dabbelt.com, juzhe.zhong@rivai.ai, xuli
Subject: [PATCH V2] RISC-V: Optimization of vrgather.vv into vrgatherei16.vv[PR111451]
Date: Fri, 22 Sep 2023 02:37:43 +0000
Message-Id: <20230922023743.332-1-xuli1@eswincomputing.com>
X-Mailer: git-send-email 2.17.1
List-Id: Gcc-patches mailing list

From: xuli

Consider the following case:

typedef int32_t vnx32si __attribute__ ((vector_size (128)));

  __attribute__ ((noipa)) void permute_##TYPE (TYPE values1, TYPE values2, \
                                               TYPE *out)                  \
  {                                                                        \
    TYPE v                                                                 \
      = __builtin_shufflevector (values1, values2, MASK_##NUNITS (0, NUNITS)); \
    *(TYPE *) out = v;                                                     \
  }

  T (vnx32si, 32) \

TEST_ALL (PERMUTE)

Before this patch:

        li      a4,31
        vsetvli a5,zero,e32,m8,ta,ma
        vl8re32.v       v24,0(a0)
        vid.v   v8
        vrsub.vx        v8,v8,a4
        vrgather.vv     v16,v24,v8
        vs8r.v  v16,0(a2)
        ret

The index vector "v8" is built with e32,m8, so it occupies 8 vector
registers.  The indices here never exceed 31, so they fit in 16 bits and
we can use vrgatherei16.vv instead, which takes int16 index elements.

After this patch:

        vsetvli a5,zero,e16,m4,ta,ma
        li      a4,31
        vid.v   v4
        vl8re32.v       v16,0(a0)
        vrsub.vx        v4,v4,a4
        vsetvli zero,zero,e32,m8,ta,ma
        vrgatherei16.vv v8,v16,v4
        vs8r.v  v8,0(a2)
        ret

With vrgatherei16.vv, the index vector (now "v4", built with e16,m4)
occupies 4 registers instead of 8, which reduces register usage and
register pressure.

	PR target/111451

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (emit_vlmax_gather_insn): Optimize
	vrgather.vv into vrgatherei16.vv.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c: Adjust case.
	* gcc.target/riscv/rvv/autovec/vls/perm-4.c: Ditto.

---
 gcc/config/riscv/riscv-v.cc                    | 18 ++++++++++++++++++
 .../riscv/rvv/autovec/vls-vlmax/perm-4.c       |  3 ++-
 .../gcc.target/riscv/rvv/autovec/vls/perm-4.c  |  3 ++-
 3 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 64a71a128d4..455efa7ea8a 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -790,6 +790,24 @@ emit_vlmax_gather_insn (rtx target, rtx op, rtx sel)
       icode = code_for_pred_gather_scalar (data_mode);
       sel = elt;
     }
+  else if (CONST_VECTOR_P (sel)
+	   && GET_MODE_BITSIZE (GET_MODE_INNER (sel_mode)) > 16
+	   && riscv_get_v_regno_alignment (data_mode) > 1)
+    {
+      /* If the inner mode of data is not QI or HI and data_lmul > 1,
+	 emitting vrgatherei16.vv instruction will lower register
+	 pressure.
+	 data_mode  sel_mode  ei16
+	 RVVM1QI    RVVM1QI   RVVM2HI   not needed
+	 RVVM2QI    RVVM2QI   RVVM4HI   not needed
+	 RVVM2HI    RVVM2HI   RVVM2HI   not needed
+	 RVVM2SI    RVVM2SI   RVVM1HI   need
+	 RVVM4SI    RVVM4SI   RVVM2HI   need
+	 RVVM8DI    RVVM8DI   RVVM2HI   need  */
+      PUT_MODE (sel, get_vector_mode (HImode,
+				      GET_MODE_NUNITS (data_mode)).require ());
+      icode = code_for_pred_gatherei16 (data_mode);
+    }
   else
     icode = code_for_pred_gather (data_mode);
   rtx ops[] = {target, op, sel};
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
index 9df69a0cc2c..7ab31043547 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls-vlmax/perm-4.c
@@ -55,6 +55,7 @@ TEST_ALL (PERMUTE)
-/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 31 } } */
+/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 19 } } */
+/* { dg-final { scan-assembler-times {vrgatherei16\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 12 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vi} 24 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vx} 7 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
index 46cad8ea2f4..4d6862cf1c0 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/perm-4.c
@@ -3,6 +3,7 @@
 #include "../vls-vlmax/perm-4.c"
-/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 31 } } */
+/* { dg-final { scan-assembler-times {vrgather\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 19 } } */
+/* { dg-final { scan-assembler-times {vrgatherei16\.vv\tv[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 12 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vi} 24 } } */
 /* { dg-final { scan-assembler-times {vrsub\.vx} 7 } } */