From patchwork Sun Oct 9 11:40:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dimitar Dimitrov
X-Patchwork-Id: 1837
From: Dimitar Dimitrov
To: gcc-patches@gcc.gnu.org
Subject: [committed] pru: Optimize DI shifts
Date: Sun, 9 Oct 2022 14:40:48 +0300
Message-Id: <20221009114049.29943-1-dimitar@dinux.eu>
X-Mailer: git-send-email 2.37.3
List-Id: Gcc-patches mailing list

This patch improves code generation for the PRU backend.  Committed to
trunk.

If the number of shift positions is a constant, then the DI shift
operation is expanded to a sequence of 2 to 4 machine instructions.
That is more efficient than the default action to call libgcc.

gcc/ChangeLog:

	* config/pru/pru.md (lshrdi3): New expand pattern.
	(ashldi3): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/pru/ashiftdi-1.c: New test.
	* gcc.target/pru/lshiftrtdi-1.c: New test.

Signed-off-by: Dimitar Dimitrov
---
 gcc/config/pru/pru.md                       | 196 ++++++++++++++++++++
 gcc/testsuite/gcc.target/pru/ashiftdi-1.c   |  53 ++++++
 gcc/testsuite/gcc.target/pru/lshiftrtdi-1.c |  53 ++++++
 3 files changed, 302 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/pru/ashiftdi-1.c
 create mode 100644 gcc/testsuite/gcc.target/pru/lshiftrtdi-1.c

diff --git a/gcc/config/pru/pru.md b/gcc/config/pru/pru.md
index 144cd35d809..53ffff07708 100644
--- a/gcc/config/pru/pru.md
+++ b/gcc/config/pru/pru.md
@@ -703,6 +703,202 @@ (define_insn "ashr3_single"
   [(set_attr "type" "alu")
    (set_attr "length" "12")])
 
+
+; 64-bit LSHIFTRT with a constant shift count can be expanded into
+; more efficient code sequence than a variable register shift.
+;
+; 1. For shift >= 32:
+;    dst_lo = (src_hi >> (shift - 32))
+;    dst_hi = 0
+;
+; 2. For shift==1 there is no need for a temporary:
+;    dst_lo = (src_lo >> 1)
+;    if (src_hi & 1)
+;       dst_lo |= (1 << 31)
+;    dst_hi = (src_hi >> 1)
+;
+; 3. For shift < 32:
+;    dst_lo = (src_lo >> shift)
+;    tmp = (src_hi << (32 - shift))
+;    dst_lo |= tmp
+;    dst_hi = (src_hi >> shift)
+;
+; 4. For shift in a register:
+;    Fall back to calling libgcc.
+(define_expand "lshrdi3"
+  [(set (match_operand:DI 0 "register_operand")
+        (lshiftrt:DI
+          (match_operand:DI 1 "register_operand")
+          (match_operand:QI 2 "const_int_operand")))]
+  ""
+{
+  gcc_assert (CONST_INT_P (operands[2]));
+
+  const int nshifts = INTVAL (operands[2]);
+  rtx dst_lo = simplify_gen_subreg (SImode, operands[0], DImode, 0);
+  rtx dst_hi = simplify_gen_subreg (SImode, operands[0], DImode, 4);
+  rtx src_lo = simplify_gen_subreg (SImode, operands[1], DImode, 0);
+  rtx src_hi = simplify_gen_subreg (SImode, operands[1], DImode, 4);
+
+  if (nshifts >= 32)
+    {
+      emit_insn (gen_rtx_SET (dst_lo,
+                              gen_rtx_LSHIFTRT (SImode,
+                                                src_hi,
+                                                GEN_INT (nshifts - 32))));
+      emit_insn (gen_rtx_SET (dst_hi, const0_rtx));
+      DONE;
+    }
+
+  gcc_assert (can_create_pseudo_p ());
+
+  /* The expansions which follow are safe only if DST_LO and SRC_HI
+     do not overlap.  If they do, then fix by using a temporary register.
+     Overlapping of DST_HI and SRC_LO is safe because by the time DST_HI
+     is set, SRC_LO is no longer live.  */
+  if (reg_overlap_mentioned_p (dst_lo, src_hi))
+    {
+      rtx new_src_hi = gen_reg_rtx (SImode);
+
+      emit_move_insn (new_src_hi, src_hi);
+      src_hi = new_src_hi;
+    }
+
+  if (nshifts == 1)
+    {
+      rtx_code_label *skip_hiset_label;
+      rtx j;
+
+      emit_insn (gen_rtx_SET (dst_lo,
+                              gen_rtx_LSHIFTRT (SImode, src_lo, const1_rtx)));
+
+      /* The code generated by `genemit' would create a LABEL_REF.  */
+      skip_hiset_label = gen_label_rtx ();
+      j = emit_jump_insn (gen_cbranch_qbbx_const (EQ,
+                                                  SImode,
+                                                  src_hi,
+                                                  GEN_INT (0),
+                                                  skip_hiset_label));
+      JUMP_LABEL (j) = skip_hiset_label;
+      LABEL_NUSES (skip_hiset_label)++;
+
+      emit_insn (gen_iorsi3 (dst_lo, dst_lo, GEN_INT (1 << 31)));
+      emit_label (skip_hiset_label);
+      emit_insn (gen_rtx_SET (dst_hi,
+                              gen_rtx_LSHIFTRT (SImode, src_hi, const1_rtx)));
+      DONE;
+    }
+
+  if (nshifts < 32)
+    {
+      rtx tmpval = gen_reg_rtx (SImode);
+
+      emit_insn (gen_rtx_SET (dst_lo,
+                              gen_rtx_LSHIFTRT (SImode,
+                                                src_lo,
+                                                GEN_INT (nshifts))));
+      emit_insn (gen_rtx_SET (tmpval,
+                              gen_rtx_ASHIFT (SImode,
+                                              src_hi,
+                                              GEN_INT (32 - nshifts))));
+      emit_insn (gen_iorsi3 (dst_lo, dst_lo, tmpval));
+      emit_insn (gen_rtx_SET (dst_hi,
+                              gen_rtx_LSHIFTRT (SImode,
+                                                src_hi,
+                                                GEN_INT (nshifts))));
+      DONE;
+    }
+  gcc_unreachable ();
+})
+
+; 64-bit ASHIFT with a constant shift count can be expanded into
+; more efficient code sequence than the libgcc call required by
+; a variable shift in a register.
+
+(define_expand "ashldi3"
+  [(set (match_operand:DI 0 "register_operand")
+        (ashift:DI
+          (match_operand:DI 1 "register_operand")
+          (match_operand:QI 2 "const_int_operand")))]
+  ""
+{
+  gcc_assert (CONST_INT_P (operands[2]));
+
+  const int nshifts = INTVAL (operands[2]);
+  rtx dst_lo = simplify_gen_subreg (SImode, operands[0], DImode, 0);
+  rtx dst_hi = simplify_gen_subreg (SImode, operands[0], DImode, 4);
+  rtx src_lo = simplify_gen_subreg (SImode, operands[1], DImode, 0);
+  rtx src_hi = simplify_gen_subreg (SImode, operands[1], DImode, 4);
+
+  if (nshifts >= 32)
+    {
+      emit_insn (gen_rtx_SET (dst_hi,
+                              gen_rtx_ASHIFT (SImode,
+                                              src_lo,
+                                              GEN_INT (nshifts - 32))));
+      emit_insn (gen_rtx_SET (dst_lo, const0_rtx));
+      DONE;
+    }
+
+  gcc_assert (can_create_pseudo_p ());
+
+  /* The expansions which follow are safe only if DST_HI and SRC_LO
+     do not overlap.  If they do, then fix by using a temporary register.
+     Overlapping of DST_LO and SRC_HI is safe because by the time DST_LO
+     is set, SRC_HI is no longer live.  */
+  if (reg_overlap_mentioned_p (dst_hi, src_lo))
+    {
+      rtx new_src_lo = gen_reg_rtx (SImode);
+
+      emit_move_insn (new_src_lo, src_lo);
+      src_lo = new_src_lo;
+    }
+
+  if (nshifts == 1)
+    {
+      rtx_code_label *skip_hiset_label;
+      rtx j;
+
+      emit_insn (gen_rtx_SET (dst_hi,
+                              gen_rtx_ASHIFT (SImode, src_hi, const1_rtx)));
+
+      skip_hiset_label = gen_label_rtx ();
+      j = emit_jump_insn (gen_cbranch_qbbx_const (EQ,
+                                                  SImode,
+                                                  src_lo,
+                                                  GEN_INT (31),
+                                                  skip_hiset_label));
+      JUMP_LABEL (j) = skip_hiset_label;
+      LABEL_NUSES (skip_hiset_label)++;
+
+      emit_insn (gen_iorsi3 (dst_hi, dst_hi, GEN_INT (1 << 0)));
+      emit_label (skip_hiset_label);
+      emit_insn (gen_rtx_SET (dst_lo,
+                              gen_rtx_ASHIFT (SImode, src_lo, const1_rtx)));
+      DONE;
+    }
+
+  if (nshifts < 32)
+    {
+      rtx tmpval = gen_reg_rtx (SImode);
+
+      emit_insn (gen_rtx_SET (dst_hi,
+                              gen_rtx_ASHIFT (SImode,
+                                              src_hi,
+                                              GEN_INT (nshifts))));
+      emit_insn (gen_rtx_SET (tmpval,
+                              gen_rtx_LSHIFTRT (SImode,
+                                                src_lo,
+                                                GEN_INT (32 - nshifts))));
+      emit_insn (gen_iorsi3 (dst_hi, dst_hi, tmpval));
+      emit_insn (gen_rtx_SET (dst_lo,
+                              gen_rtx_ASHIFT (SImode,
+                                              src_lo,
+                                              GEN_INT (nshifts))));
+      DONE;
+    }
+  gcc_unreachable ();
+})
 
 ;; Include ALU patterns with zero-extension of operands.  That's where
 ;; the real insns are defined.
diff --git a/gcc/testsuite/gcc.target/pru/ashiftdi-1.c b/gcc/testsuite/gcc.target/pru/ashiftdi-1.c
new file mode 100644
index 00000000000..516e5a86102
--- /dev/null
+++ b/gcc/testsuite/gcc.target/pru/ashiftdi-1.c
@@ -0,0 +1,53 @@
+/* Functional test for DI left shift.  */
+
+/* { dg-do run } */
+/* { dg-options "-pedantic-errors" } */
+
+#include
+#include
+
+extern void abort (void);
+
+uint64_t __attribute__((noinline)) ashift_1 (uint64_t a)
+{
+  return a << 1;
+}
+
+uint64_t __attribute__((noinline)) ashift_10 (uint64_t a)
+{
+  return a << 10;
+}
+
+uint64_t __attribute__((noinline)) ashift_32 (uint64_t a)
+{
+  return a << 32;
+}
+
+uint64_t __attribute__((noinline)) ashift_36 (uint64_t a)
+{
+  return a << 36;
+}
+
+int
+main (int argc, char** argv)
+{
+  if (ashift_1 (0xaaaa5555aaaa5555ull) != 0x5554aaab5554aaaaull)
+    abort();
+  if (ashift_10 (0xaaaa5555aaaa5555ull) != 0xa95556aaa9555400ull)
+    abort();
+  if (ashift_32 (0xaaaa5555aaaa5555ull) != 0xaaaa555500000000ull)
+    abort();
+  if (ashift_36 (0xaaaa5555aaaa5555ull) != 0xaaa5555000000000ull)
+    abort();
+
+  if (ashift_1 (0x1234567822334455ull) != 0x2468acf0446688aaull)
+    abort();
+  if (ashift_10 (0x1234567822334455ull) != 0xd159e088cd115400ull)
+    abort();
+  if (ashift_32 (0x1234567822334455ull) != 0x2233445500000000ull)
+    abort();
+  if (ashift_36 (0x1234567822334455ull) != 0x2334455000000000ull)
+    abort();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/pru/lshiftrtdi-1.c b/gcc/testsuite/gcc.target/pru/lshiftrtdi-1.c
new file mode 100644
index 00000000000..7adae6ccc13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/pru/lshiftrtdi-1.c
@@ -0,0 +1,53 @@
+/* Functional test for DI right shift.  */
+
+/* { dg-do run } */
+/* { dg-options "-pedantic-errors" } */
+
+#include
+#include
+
+extern void abort (void);
+
+uint64_t __attribute__((noinline)) lshift_1 (uint64_t a)
+{
+  return a >> 1;
+}
+
+uint64_t __attribute__((noinline)) lshift_10 (uint64_t a)
+{
+  return a >> 10;
+}
+
+uint64_t __attribute__((noinline)) lshift_32 (uint64_t a)
+{
+  return a >> 32;
+}
+
+uint64_t __attribute__((noinline)) lshift_36 (uint64_t a)
+{
+  return a >> 36;
+}
+
+int
+main (int argc, char** argv)
+{
+  if (lshift_1 (0xaaaa5555aaaa5555ull) != 0x55552aaad5552aaaull)
+    abort();
+  if (lshift_10 (0xaaaa5555aaaa5555ull) != 0x002aaa95556aaa95ull)
+    abort();
+  if (lshift_32 (0xaaaa5555aaaa5555ull) != 0x00000000aaaa5555ull)
+    abort();
+  if (lshift_36 (0xaaaa5555aaaa5555ull) != 0x000000000aaaa555ull)
+    abort();
+
+  if (lshift_1 (0x1234567822334455ull) != 0x091a2b3c1119a22aull)
+    abort();
+  if (lshift_10 (0x1234567822334455ull) != 0x00048d159e088cd1ull)
+    abort();
+  if (lshift_32 (0x1234567822334455ull) != 0x0000000012345678ull)
+    abort();
+  if (lshift_36 (0x1234567822334455ull) != 0x0000000001234567ull)
+    abort();
+
+  return 0;
+}
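
For readers who want to check the arithmetic behind the constant-shift cases listed in the pru.md comment (shift >= 32, shift == 1, shift < 32), the following is a minimal host-side sketch in plain C. It is not part of the patch and does not use any GCC internals. The names model_lshrdi3 and model_ashldi3 are hypothetical, and restricting the shift count to 1..63 is an assumption of this model (a count of 0 would make the `32 - nshifts` sub-shift undefined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the lshrdi3 expansion: logical right shift of a
   64-bit value held as two 32-bit halves.  Not GCC code.  */
uint64_t
model_lshrdi3 (uint32_t src_lo, uint32_t src_hi, unsigned nshifts)
{
  uint32_t dst_lo, dst_hi;

  /* Assumption of this sketch: the count is a constant in 1..63.  */
  assert (nshifts >= 1 && nshifts < 64);

  if (nshifts >= 32)
    {
      /* Case 1: only the high word contributes.  */
      dst_lo = src_hi >> (nshifts - 32);
      dst_hi = 0;
    }
  else if (nshifts == 1)
    {
      /* Case 2: carry bit 0 of the high word into bit 31 of the low.  */
      dst_lo = src_lo >> 1;
      if (src_hi & 1u)
        dst_lo |= UINT32_C (1) << 31;
      dst_hi = src_hi >> 1;
    }
  else
    {
      /* Case 3: merge the bits that cross the word boundary.  */
      uint32_t tmp = src_hi << (32 - nshifts);
      dst_lo = (src_lo >> nshifts) | tmp;
      dst_hi = src_hi >> nshifts;
    }
  return ((uint64_t) dst_hi << 32) | dst_lo;
}

/* Hypothetical model of the ashldi3 expansion: the mirror image of the
   right shift, with the roles of the two halves swapped.  */
uint64_t
model_ashldi3 (uint32_t src_lo, uint32_t src_hi, unsigned nshifts)
{
  uint32_t dst_lo, dst_hi;

  assert (nshifts >= 1 && nshifts < 64);

  if (nshifts >= 32)
    {
      dst_hi = src_lo << (nshifts - 32);
      dst_lo = 0;
    }
  else if (nshifts == 1)
    {
      /* Carry bit 31 of the low word into bit 0 of the high word.  */
      dst_hi = src_hi << 1;
      if (src_lo & (UINT32_C (1) << 31))
        dst_hi |= 1u;
      dst_lo = src_lo << 1;
    }
  else
    {
      uint32_t tmp = src_lo >> (32 - nshifts);
      dst_hi = (src_hi << nshifts) | tmp;
      dst_lo = src_lo << nshifts;
    }
  return ((uint64_t) dst_hi << 32) | dst_lo;
}
```

Comparing either function against the native 64-bit `>>` or `<<` on a host compiler, for the same inputs the new tests use, is a quick way to confirm that the three cases cover every constant count from 1 to 63.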