From patchwork Fri Feb 17 07:54:49 2023
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 58392
Message-ID: <20fe3e31-0660-386c-7e6d-bc0b6c0f64ad@yahoo.co.jp>
Date: Fri, 17 Feb 2023 16:54:49 +0900
To: GCC Patches
Cc: Max Filippov
Subject: [PATCH v7] xtensa: Eliminate the use of callee-saved register that saves and restores only once
From: Takayuki 'January June' Suwa

In the CALL0 ABI, values that must survive function calls are placed in
the callee-saved registers (A12 through A15) and referenced later.
However, it is often the case that such a register is written only once
and read only once, each time by a simple register-register move.  In
that case the register itself is unnecessary, because the value can be
stored to and loaded from the register's stack slot directly.  There are
two exceptions: i. the register saved to/restored from is the stack
pointer, ii. the function needs an additional stack pointer adjustment
to grow the stack.

For example, if there are no other occurrences of register A14 in the
following code:

;; before
        ; prologue {
  ...
        s32i.n  a14, sp, 16
  ...                           ;; no frame pointer needed
                                ;; no additional stack growth
        ; } prologue
  ...
        mov.n   a14, a6         ;; A6 is not SP
  ...
        call0   foo
  ...
        mov.n   a8, a14         ;; A8 is not SP
  ...
        ; epilogue {
  ...
        l32i.n  a14, sp, 16
  ...
        ; } epilogue

it can be rewritten like this:

;; after
        ; prologue {
  ...
        (no save needed)
  ...
        ; } prologue
  ...
        s32i.n  a6, sp, 16      ;; replaced with A14's slot
  ...
        call0   foo
  ...
        l32i.n  a8, sp, 16      ;; through SP
  ...
        ; epilogue {
  ...
        (no restoration needed)
  ...
        ; } epilogue

This patch adds the above logic to the function prologue/epilogue RTL
expander code.

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (machine_function): Add new member
	'eliminated_callee_saved_regs'.
	(xtensa_can_eliminate_callee_saved_reg_p): New function to
	determine whether the register can be eliminated or not.
	(xtensa_expand_prologue): Invoke the above function and eliminate
	the use of a callee-saved register by accessing its stack slot
	directly through the stack pointer (or the frame pointer if
	needed).
	(xtensa_expand_epilogue): Do not emit the register restoration
	insn from its stack slot if the register has been eliminated.

gcc/testsuite/ChangeLog:

	* gcc.target/xtensa/elim_callee_saved.c: New.
---
 gcc/config/xtensa/xtensa.cc                   | 134 ++++++++++++++----
 .../gcc.target/xtensa/elim_callee_saved.c     |  37 +++++
 2 files changed, 146 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/xtensa/elim_callee_saved.c

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 3e2e22d4cbe..d987f1dfede 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -105,6 +105,7 @@ struct GTY(()) machine_function
   bool epilogue_done;
   bool inhibit_logues_a1_adjusts;
   rtx last_logues_a9_content;
+  bitmap eliminated_callee_saved_regs;
 };
 
 static void xtensa_option_override (void);
@@ -3343,6 +3344,65 @@ xtensa_emit_adjust_stack_ptr (HOST_WIDE_INT offset, int flags)
     cfun->machine->last_logues_a9_content = GEN_INT (offset);
 }
 
+static bool
+xtensa_can_eliminate_callee_saved_reg_p (unsigned int regno,
+                                         rtx_insn **p_insnS,
+                                         rtx_insn **p_insnR)
+{
+  df_ref ref;
+  rtx_insn *insn, *insnS = NULL, *insnR = NULL;
+  rtx pattern;
+
+  if (!optimize || !df || call_used_or_fixed_reg_p (regno)
+      || (frame_pointer_needed && regno == HARD_FRAME_POINTER_REGNUM))
+    return false;
+
+  for (ref = DF_REG_DEF_CHAIN (regno);
+       ref; ref = DF_REF_NEXT_REG (ref))
+    if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+        || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
+      continue;
+    else if (GET_CODE (pattern = PATTERN (insn)) == SET
+             && REG_P (SET_DEST (pattern))
+             && REGNO (SET_DEST (pattern)) == regno
+             && REG_NREGS (SET_DEST (pattern)) == 1
+             && REG_P (SET_SRC (pattern)))
+      {
+        if (insnS)
+          return false;
+        insnS = insn;
+        continue;
+      }
+    else
+      return false;
+
+  for (ref = DF_REG_USE_CHAIN (regno);
+       ref; ref = DF_REF_NEXT_REG (ref))
+    if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+        || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
+      continue;
+    else if (GET_CODE (pattern = PATTERN (insn)) == SET
+             && REG_P (SET_SRC (pattern))
+             && REGNO (SET_SRC (pattern)) == regno
+             && REG_NREGS (SET_SRC (pattern)) == 1
+             && REG_P (SET_DEST (pattern)))
+      {
+        if (insnR)
+          return false;
+        insnR = insn;
+        continue;
+      }
+    else
+      return false;
+
+  if (!insnS || !insnR)
+    return false;
+
+  *p_insnS = insnS, *p_insnR = insnR;
+
+  return true;
+}
+
 /* minimum frame = reg save area (4 words) plus static chain (1 word)
    and the total number of words must be a multiple of 128 bits.  */
 #define MIN_FRAME_SIZE (8 * UNITS_PER_WORD)
@@ -3382,6 +3442,7 @@ xtensa_expand_prologue (void)
       df_ref ref;
       bool stack_pointer_needed = frame_pointer_needed
                                   || crtl->calls_eh_return;
+      bool large_stack_needed;
 
       /* Check if the function body really needs the stack pointer.  */
       if (!stack_pointer_needed && df)
@@ -3430,23 +3491,44 @@ xtensa_expand_prologue (void)
             }
         }
 
+      cfun->machine->eliminated_callee_saved_regs
+        = bitmap_alloc (&bitmap_default_obstack);
+      large_stack_needed = total_size > 1024
+                           || (!callee_save_size && total_size > 128);
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-        {
-          if (xtensa_call_save_reg(regno))
-            {
-              rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
-              rtx mem = gen_frame_mem (SImode, x);
-              rtx reg = gen_rtx_REG (SImode, regno);
+        if (xtensa_call_save_reg(regno))
+          {
+            rtx x = gen_rtx_PLUS (Pmode,
+                                  stack_pointer_rtx, GEN_INT (offset));
+            rtx mem = gen_frame_mem (SImode, x);
+            rtx_insn *insnS, *insnR;
+
+            if (!large_stack_needed
+                && xtensa_can_eliminate_callee_saved_reg_p (regno,
+                                                            &insnS, &insnR))
+              {
+                if (frame_pointer_needed)
+                  mem = replace_rtx (mem, stack_pointer_rtx,
+                                     hard_frame_pointer_rtx);
+                SET_DEST (PATTERN (insnS)) = mem;
+                df_insn_rescan (insnS);
+                SET_SRC (PATTERN (insnR)) = copy_rtx (mem);
+                df_insn_rescan (insnR);
+                bitmap_set_bit (cfun->machine->eliminated_callee_saved_regs,
+                                regno);
+              }
+            else
+              {
+                rtx reg = gen_rtx_REG (SImode, regno);
 
-              offset -= UNITS_PER_WORD;
-              insn = emit_move_insn (mem, reg);
-              RTX_FRAME_RELATED_P (insn) = 1;
-              add_reg_note (insn, REG_FRAME_RELATED_EXPR,
-                            gen_rtx_SET (mem, reg));
-            }
-        }
-      if (total_size > 1024
-          || (!callee_save_size && total_size > 128))
+                insn = emit_move_insn (mem, reg);
+                RTX_FRAME_RELATED_P (insn) = 1;
+                add_reg_note (insn, REG_FRAME_RELATED_EXPR,
+                              gen_rtx_SET (mem, reg));
+              }
+            offset -= UNITS_PER_WORD;
+          }
+      if (large_stack_needed)
         xtensa_emit_adjust_stack_ptr (callee_save_size - total_size,
                                       ADJUST_SP_NEED_NOTE);
     }
@@ -3535,16 +3617,18 @@ xtensa_expand_epilogue (bool sibcall_p)
         emit_insn (gen_blockage ());
 
       for (regno = 0; regno < FIRST_PSEUDO_REGISTER; ++regno)
-        {
-          if (xtensa_call_save_reg(regno))
-            {
-              rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
-
-              offset -= UNITS_PER_WORD;
-              emit_move_insn (gen_rtx_REG (SImode, regno),
-                              gen_frame_mem (SImode, x));
-            }
-        }
+        if (xtensa_call_save_reg(regno))
+          {
+            if (! bitmap_bit_p (cfun->machine->eliminated_callee_saved_regs,
+                                regno))
+              {
+                rtx x = gen_rtx_PLUS (Pmode,
+                                      stack_pointer_rtx, GEN_INT (offset));
+                emit_move_insn (gen_rtx_REG (SImode, regno),
+                                gen_frame_mem (SImode, x));
+              }
+            offset -= UNITS_PER_WORD;
+          }
 
       if (sibcall_p)
         emit_use (gen_rtx_REG (SImode, A0_REG));
diff --git a/gcc/testsuite/gcc.target/xtensa/elim_callee_saved.c b/gcc/testsuite/gcc.target/xtensa/elim_callee_saved.c
new file mode 100644
index 00000000000..d123e6c01cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/xtensa/elim_callee_saved.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=call0" } */
+
+extern void foo(void);
+
+/* eliminated one register (the reservoir of variable 'a') by its stack slot through the stack pointer.  */
+int test0(int a) {
+  int array[252];  /* the maximum bound of non-large stack.  */
+  foo();
+  asm volatile("" : : "m"(array));
+  return a;
+}
+
+/* cannot eliminate if large stack is needed, because the offset from TOS cannot fit into a single L32I/S32I instruction.  */
+int test1(int a) {
+  int array[10000];  /* requires large stack.  */
+  foo();
+  asm volatile("" : : "m"(array));
+  return a;
+}
+
+/* register A15 is the reservoir of the stack pointer and cannot be eliminated if the frame pointer is needed.
+   other registers still can be, but through the frame pointer rather than the stack pointer.  */
+int test2(int a) {
+  int* p = __builtin_alloca(16);
+  foo();
+  asm volatile("" : : "r"(p));
+  return a;
+}
+
+/* in -O0 the composite hard registers may still remain unsplit at pro_and_epilogue and must be excluded.  */
+extern double bar(void);
+int __attribute__((optimize(0))) test3(int a) {
+  return bar() + a;
+}
+
+/* { dg-final { scan-assembler-times "a15, 8" 2 } } */
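
Note for readers unfamiliar with GCC's DF chains: the eligibility test in xtensa_can_eliminate_callee_saved_reg_p can be modeled outside the compiler roughly as below. This is only a sketch under simplifying assumptions. struct toy_insn, enum toy_kind and toy_can_eliminate_p are hypothetical names invented for illustration, not GCC APIs. The checks on optimize, df, call_used_or_fixed_reg_p, the frame pointer and large stacks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for RTL; these types are illustrative, not GCC's.  */
enum toy_kind { TOY_REG, TOY_MEM, TOY_OTHER };

struct toy_insn
{
  bool debug_p;             /* models DEBUG_INSN_P */
  enum toy_kind dest_kind;  /* destination operand of a single SET */
  unsigned int dest_regno;  /* meaningful when dest_kind == TOY_REG */
  enum toy_kind src_kind;   /* source operand of a single SET */
  unsigned int src_regno;   /* meaningful when src_kind == TOY_REG */
};

/* Return true if REGNO is written by exactly one register-register move
   and read by exactly one register-register move, ignoring debug insns.
   On success, store the indices of those two insns.  */
static bool
toy_can_eliminate_p (const struct toy_insn *insns, size_t n,
                     unsigned int regno, size_t *p_save, size_t *p_restore)
{
  bool have_save = false, have_restore = false;
  size_t save = 0, restore = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      const struct toy_insn *insn = &insns[i];
      bool defs = insn->dest_kind == TOY_REG && insn->dest_regno == regno;
      bool uses = insn->src_kind == TOY_REG && insn->src_regno == regno;

      if (insn->debug_p || (!defs && !uses))
        continue;
      /* A definition must be the only one and a move from a register.  */
      if (defs)
        {
          if (have_save || insn->src_kind != TOY_REG)
            return false;
          have_save = true, save = i;
        }
      /* A use must be the only one and a move into a register.  */
      if (uses)
        {
          if (have_restore || insn->dest_kind != TOY_REG)
            return false;
          have_restore = true, restore = i;
        }
    }
  if (!have_save || !have_restore)
    return false;
  *p_save = save;
  *p_restore = restore;
  return true;
}
```

With the "before" sequence from the commit message (mov.n a14, a6 ... call0 foo ... mov.n a8, a14) the model accepts A14 and reports the two moves. A second read of A14, or a direct store of A14 to memory, makes it refuse, which corresponds to the "return false" paths in the patch.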