From patchwork Sat May 6 17:05:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 90783
Subject: Pushed: [PATCH v2] LoongArch: Enable shrink wrapping
From: Xi Ruoyao
To: Lulu Cheng, Guo Jie, gcc-patches@gcc.gnu.org
Cc: WANG Xuerui, Chenghua Xu
Date: Sun, 07 May 2023 01:05:34 +0800
In-Reply-To: <9d90ec7762d255495a46e699ea7d0ea344399f90.camel@xry111.site>
References: <20230423131903.155998-1-xry111@xry111.site>
 <182bca4f-c605-8e0c-19ce-e840258b05d2@loongson.cn>
 <9d90ec7762d255495a46e699ea7d0ea344399f90.camel@xry111.site>
User-Agent: Evolution 3.48.1
List-Id: Gcc-patches mailing list

On Wed, 2023-04-26 at 21:29 +0800, Xi Ruoyao via Gcc-patches wrote:
> >
> > Do you have any questions about the test cases mentioned by
> > Guo Jie? If there is no problem, modify the test case,
> >
> > I think the code can be merged into the main branch.
>
> I'll rewrite the test and commit in a few days (now I'm occupied with
> something :( ).

The patch has been pushed as follows (with the test updated).
Unfortunately, I forgot to update the change log to include the SPEC
result and the change to the test case :(.

-- >8 --

From d90eed13ae655fbb4adb173fdae392b082e82a56 Mon Sep 17 00:00:00 2001
From: Xi Ruoyao
Date: Sun, 23 Apr 2023 20:52:22 +0800
Subject: [PATCH] LoongArch: Enable shrink wrapping

This commit implements the target macros for shrink wrapping of function
prologues/epilogues on LoongArch.

Bootstrapped and regtested on loongarch64-linux-gnu.  I don't have
access to SPEC CPU, so I hope the reviewer can run a benchmark to see
whether there is a real benefit.

gcc/ChangeLog:

	* config/loongarch/loongarch.h (struct machine_function): Add
	reg_is_wrapped_separately array for register wrapping
	information.
	* config/loongarch/loongarch.cc
	(loongarch_get_separate_components): New function.
	(loongarch_components_for_bb): Likewise.
	(loongarch_disqualify_components): Likewise.
	(loongarch_process_components): Likewise.
	(loongarch_emit_prologue_components): Likewise.
	(loongarch_emit_epilogue_components): Likewise.
	(loongarch_set_handled_components): Likewise.
	(TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS): Define.
	(TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB): Likewise.
	(TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS): Likewise.
	(TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS): Likewise.
	(TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS): Likewise.
	(TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Likewise.
	(loongarch_for_each_saved_reg): Skip registers that are wrapped
	separately.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/shrink-wrap.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 179 +++++++++++++++++-
 gcc/config/loongarch/loongarch.h              |   2 +
 .../gcc.target/loongarch/shrink-wrap.c        |  19 ++
 3 files changed, 197 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/shrink-wrap.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index d808cb3a5ae..7f4e0e59573 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "rtl-iter.h"
 #include "opts.h"
+#include "function-abi.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -1017,19 +1018,23 @@ loongarch_for_each_saved_reg (HOST_WIDE_INT sp_offset,
   for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
     if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
       {
-        loongarch_save_restore_reg (word_mode, regno, offset, fn);
+        if (!cfun->machine->reg_is_wrapped_separately[regno])
+          loongarch_save_restore_reg (word_mode, regno, offset, fn);
+
         offset -= UNITS_PER_WORD;
       }
 
   /* This loop must iterate over the same space as its companion in
      loongarch_compute_frame_info.  */
   offset = cfun->machine->frame.fp_sp_offset - sp_offset;
+  machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+
   for (int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
     if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
       {
-        machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+        if (!cfun->machine->reg_is_wrapped_separately[regno])
+          loongarch_save_restore_reg (mode, regno, offset, fn);
 
-        loongarch_save_restore_reg (mode, regno, offset, fn);
         offset -= GET_MODE_SIZE (mode);
       }
 }
@@ -6633,6 +6638,151 @@ loongarch_asan_shadow_offset (void)
   return TARGET_64BIT ? (HOST_WIDE_INT_1 << 46) : 0;
 }
 
+static sbitmap
+loongarch_get_separate_components (void)
+{
+  HOST_WIDE_INT offset;
+  sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
+  bitmap_clear (components);
+  offset = cfun->machine->frame.gp_sp_offset;
+
+  /* The stack should be aligned to a 16-byte boundary, so we can make use of
+     ldptr instructions.  */
+  gcc_assert (offset % UNITS_PER_WORD == 0);
+
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+      {
+        /* We can wrap general registers saved at [sp, sp + 32768) using the
+           ldptr/stptr instructions.  For large offsets a pseudo register
+           might be needed which cannot be created during the shrink
+           wrapping pass.
+
+           TODO: This may need revising when we add LA32, as ldptr.w is not
+           guaranteed to be available by the manual.  */
+        if (offset < 32768)
+          bitmap_set_bit (components, regno);
+
+        offset -= UNITS_PER_WORD;
+      }
+
+  offset = cfun->machine->frame.fp_sp_offset;
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
+      {
+        /* We can only wrap FP registers with imm12 offsets.  For large
+           offsets a pseudo register might be needed which cannot be
+           created during the shrink wrapping pass.  */
+        if (IMM12_OPERAND (offset))
+          bitmap_set_bit (components, regno);
+
+        offset -= UNITS_PER_FPREG;
+      }
+
+  /* Don't mess with the hard frame pointer.  */
+  if (frame_pointer_needed)
+    bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);
+
+  bitmap_clear_bit (components, RETURN_ADDR_REGNUM);
+
+  return components;
+}
+
+static sbitmap
+loongarch_components_for_bb (basic_block bb)
+{
+  /* Registers are used in a bb if they are in the IN, GEN, or KILL sets.  */
+  auto_bitmap used;
+  bitmap_copy (used, DF_LIVE_IN (bb));
+  bitmap_ior_into (used, &DF_LIVE_BB_INFO (bb)->gen);
+  bitmap_ior_into (used, &DF_LIVE_BB_INFO (bb)->kill);
+
+  sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
+  bitmap_clear (components);
+
+  function_abi_aggregator callee_abis;
+  rtx_insn *insn;
+  FOR_BB_INSNS (bb, insn)
+    if (CALL_P (insn))
+      callee_abis.note_callee_abi (insn_callee_abi (insn));
+
+  HARD_REG_SET extra_caller_saves =
+    callee_abis.caller_save_regs (*crtl->abi);
+
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (!fixed_regs[regno]
+        && !crtl->abi->clobbers_full_reg_p (regno)
+        && (TEST_HARD_REG_BIT (extra_caller_saves, regno) ||
+            bitmap_bit_p (used, regno)))
+      bitmap_set_bit (components, regno);
+
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (!fixed_regs[regno]
+        && !crtl->abi->clobbers_full_reg_p (regno)
+        && (TEST_HARD_REG_BIT (extra_caller_saves, regno) ||
+            bitmap_bit_p (used, regno)))
+      bitmap_set_bit (components, regno);
+
+  return components;
+}
+
+static void
+loongarch_disqualify_components (sbitmap, edge, sbitmap, bool)
+{
+  /* Do nothing.  */
+}
+
+static void
+loongarch_process_components (sbitmap components, loongarch_save_restore_fn fn)
+{
+  HOST_WIDE_INT offset = cfun->machine->frame.gp_sp_offset;
+
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+      {
+        if (bitmap_bit_p (components, regno))
+          loongarch_save_restore_reg (word_mode, regno, offset, fn);
+
+        offset -= UNITS_PER_WORD;
+      }
+
+  offset = cfun->machine->frame.fp_sp_offset;
+  machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
+      {
+        if (bitmap_bit_p (components, regno))
+          loongarch_save_restore_reg (mode, regno, offset, fn);
+
+        offset -= UNITS_PER_FPREG;
+      }
+}
+
+static void
+loongarch_emit_prologue_components (sbitmap components)
+{
+  loongarch_process_components (components, loongarch_save_reg);
+}
+
+static void
+loongarch_emit_epilogue_components (sbitmap components)
+{
+  loongarch_process_components (components, loongarch_restore_reg);
+}
+
+static void
+loongarch_set_handled_components (sbitmap components)
+{
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (bitmap_bit_p (components, regno))
+      cfun->machine->reg_is_wrapped_separately[regno] = true;
+
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (bitmap_bit_p (components, regno))
+      cfun->machine->reg_is_wrapped_separately[regno] = true;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -6830,6 +6980,29 @@ loongarch_asan_shadow_offset (void)
 #undef TARGET_ASAN_SHADOW_OFFSET
 #define TARGET_ASAN_SHADOW_OFFSET loongarch_asan_shadow_offset
 
+#undef TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS
+#define TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS \
+  loongarch_get_separate_components
+
+#undef TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB
+#define TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB loongarch_components_for_bb
+
+#undef TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS
+#define TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS \
+  loongarch_disqualify_components
+
+#undef TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS \
+  loongarch_emit_prologue_components
+
+#undef TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS \
+  loongarch_emit_epilogue_components
+
+#undef TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS
+#define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS \
+  loongarch_set_handled_components
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index a9eff6a81bd..829acdaa9be 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -1147,6 +1147,8 @@ struct GTY (()) machine_function
   /* The current frame information, calculated by
      loongarch_compute_frame_info.  */
   struct loongarch_frame_info frame;
+
+  bool reg_is_wrapped_separately[FIRST_PSEUDO_REGISTER];
 };
 
 #endif
diff --git a/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c b/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c
new file mode 100644
index 00000000000..1431536c59c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fshrink-wrap" } */
+
+/* We should not save anything before checking the value of x.  */
+/* { dg-final { scan-assembler-not "st(ptr)?\\\.\[dw\].*b(eq|ne)z" } } */
+
+int
+foo (int x)
+{
+  __asm__ ("nop" :);
+  if (x)
+    {
+      __asm__ ("" ::: "s0", "s1");
+      return x;
+    }
+
+  __asm__ ("" ::: "s2", "s3");
+  return 0;
+}
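
Note on the slot arithmetic: the GP loops in loongarch_get_separate_components,
loongarch_process_components and loongarch_for_each_saved_reg all walk the save
mask in the same order and decrement the offset for every saved register,
whether or not that register is wrapped separately. The following is a minimal
standalone sketch of that invariant, not GCC code. Slot, gp_slots,
saved_in_prologue and the constants are hypothetical names made up for this
illustration; the register numbers (1 for $ra, 22 for $fp) and the 8-byte word
size assume LA64.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-ins for GCC internals. All names and constants here are
// illustrative assumptions, not the real GCC API.
constexpr int kWordBytes = 8;         // UNITS_PER_WORD, assuming LA64
constexpr int kRaRegno = 1;           // $ra, assumed RETURN_ADDR_REGNUM
constexpr int kFpRegno = 22;          // $fp, assumed HARD_FRAME_POINTER_REGNUM
constexpr int64_t kPtrLimit = 32768;  // exclusive bound checked by the patch

struct Slot {
  int regno;
  int64_t offset;  // sp-relative save slot
  bool separable;  // eligible for separate shrink wrapping
};

// Models the GP loop of loongarch_get_separate_components: every register in
// the save mask consumes one slot, whether or not it is eligible.
std::vector<Slot> gp_slots(uint32_t mask, int64_t gp_sp_offset,
                           bool frame_pointer_needed) {
  std::vector<Slot> slots;
  int64_t offset = gp_sp_offset;
  for (int regno = 0; regno < 32; ++regno) {
    if (!(mask & (UINT32_C(1) << regno)))
      continue;
    bool ok = offset < kPtrLimit;
    if (regno == kRaRegno)
      ok = false;  // the return address is never wrapped separately
    if (frame_pointer_needed && regno == kFpRegno)
      ok = false;  // leave the hard frame pointer alone
    slots.push_back({regno, offset, ok});
    offset -= kWordBytes;
  }
  return slots;
}

// Models the check added to loongarch_for_each_saved_reg: the ordinary
// prologue only saves registers that were not handled as components.
std::vector<int> saved_in_prologue(const std::vector<Slot> &slots,
                                   const std::vector<bool> &wrapped) {
  std::vector<int> regs;
  for (const Slot &s : slots)
    if (!wrapped[s.regno])
      regs.push_back(s.regno);
  return regs;
}
```

In this model, a mask holding $ra, $s0 (register 23) and $s1 (register 24)
with gp_sp_offset = 24 yields slots at offsets 24, 16 and 8. If $s0 is then
marked as wrapped, the ordinary prologue still saves $ra and $s1 at their
original offsets, which is why the offset has to be decremented even for
skipped registers.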