Message ID | 20230109135828.879136-2-mark.rutland@arm.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2169576wrt; Mon, 9 Jan 2023 05:59:27 -0800 (PST) X-Google-Smtp-Source: AMrXdXtmgj/CwQHt+ZVL6ceWvkDYFTrcsfQZoHaDKqJ5z4PnTCDLAWeoUJI3XcpQs0y3NotkZHoa X-Received: by 2002:a62:ed08:0:b0:577:272f:fdb with SMTP id u8-20020a62ed08000000b00577272f0fdbmr53043501pfh.29.1673272767000; Mon, 09 Jan 2023 05:59:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673272766; cv=none; d=google.com; s=arc-20160816; b=RPGlojGwxufbk9mWzmxWttRAJuU6UcdrfQaWcStgcBofdzyzzgEN78qszZlijGgB7P JmjIGQNmjW8+Ew+JxzIzpezqi3fyJwy8l2TLNsA4gL22K5UEHlN4VLlu32PKcl72m/lU /dPoCth6/3bACs9EkwjQpCvX9VVmc8eLouDk3SwBWDT3bdNWNKJ0H4i3+5UHhQ0cCaY0 jDapRC0Aet2OfZ8wItPN2bWPW2nNYPFnS9r0+gbZ9w85E6N0EMyBoJso6UsMqfIJsA63 bW3LxMcmfMuanwO5QjmkJnAjVLPp3Vlu4/AgUMmatYVgeFd5eo+PRD+HblOMNRp1cyue p+6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=QC6IVpFz0PA6Zbil2AxcWuJx+5gzCmPeehhx4+03dC4=; b=yV8dlPTa+HnavkVhiJVb695nki72oz6qc7+UTjC3LwKA+VANk3SR5lcDaGkvFsHOcG 7uERgGn2jyC3fogsb1pNJNEnLPG3E++NgL5S3on2vqQSbINAo5Y+RNaxzb7+BuauXGyV 2Th0ErbpNer5DpMTZyX7/pX0pzmwg44oVGu4uGa+WM0d7ZdXgkhzcDaJplWkrLxQhEbc N9LVN0lpWaGJPqiDx7J2g1E8vpeU+L11lNdk7lKlQCxSoFV/rGzRXgIIRQg634vmxsU5 FB5wkpHalBJ2sypSqA9mEwupiIdAI08K2/ZbTpcGPmkUPlnONoOM4s4AsFavga1XYw6i /P0g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f7-20020a056a001ac700b005818631179asi10400600pfv.95.2023.01.09.05.59.13; Mon, 09 Jan 2023 05:59:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234481AbjAIN6z (ORCPT <rfc822;274620705z@gmail.com> + 99 others); Mon, 9 Jan 2023 08:58:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237241AbjAIN6l (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 9 Jan 2023 08:58:41 -0500 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id DE339FCA; Mon, 9 Jan 2023 05:58:40 -0800 (PST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9B4601515; Mon, 9 Jan 2023 05:59:22 -0800 (PST) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id C43293F23F; Mon, 9 Jan 2023 05:58:38 -0800 (PST) From: Mark Rutland <mark.rutland@arm.com> To: linux-arm-kernel@lists.infradead.org Cc: catalin.marinas@arm.com, lenb@kernel.org, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org, mark.rutland@arm.com, mhiramat@kernel.org, ndesaulniers@google.com, ojeda@kernel.org, peterz@infradead.org, rafael.j.wysocki@intel.com, revest@chromium.org, robert.moore@intel.com, rostedt@goodmis.org, will@kernel.org Subject: [PATCH 1/8] Compiler attributes: GCC function alignment workarounds Date: Mon, 9 Jan 2023 13:58:21 +0000 Message-Id: <20230109135828.879136-2-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230109135828.879136-1-mark.rutland@arm.com> References: <20230109135828.879136-1-mark.rutland@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1754553665263390474?= X-GMAIL-MSGID: =?utf-8?q?1754553665263390474?= |
Series |
arm64/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS
|
|
Commit Message
Mark Rutland
Jan. 9, 2023, 1:58 p.m. UTC
From local testing, contemporary versions of of GCC (e.g. GCC 12.2.0)
don't respect '-falign-functions=N' in all cases. This is unfortunate,
as (for non-zero values of N) CONFIG_FUNCTION_ALIGNMENT=N will set
'-falign-functions=N', but this won't take effect for all functions.
LLVM appears to respect '-falign-functions=N' in call cases.
Today, for x86 this turns out to be functionally benign, though it does
somewhat undermine the CONFIG_FUNCTION_ALIGNMENT option, and it means
that CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not as robust as we'd
like.
On arm64 we'd like to use CONFIG_FUNCTION_ALIGNMENT to implement ftrace
functionality, and we'll require function alignment to be respected for
functional reasons.
As far as I can tell, GCC doesn't respect '-falign-functions=N':
* When the __weak__ attribute is used
GCC seems to forget the alignment specified by '-falign-functions=N',
but will respect the '__aligned__(N)' function attribute. Thus, we can
work around this by explciitly setting the alignment for weak
functions.
* When the __cold__ attribute is used
GCC seems to forget the alignment specified by '-falign-functions=N',
and also doesn't seem to respect the '__aligned__(N)' function
attribute. The only way to work around this is to not use the __cold__
attibute.
This patch implements workarounds for these two cases, using a function
attribute to set the alignment of __weak__ functions, and preventing the
use of the __cold__ attribute when CONFIG_FUNCTION_ALIGNMENT is
non-zero.
I've tested this by selecting CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y,
building and booting a kernel, and looking for misaligned text symbols:
* arm64:
Before:
# grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
4939
After:
# grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
908
* x86_64:
Before:
# grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
7969
After:
# grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
2057
With the patch applied, the remaining unaligned text labels are a
combination of static call trampolines, non-function labels in assembly,
and ACPICA functions, which will be dealt with in subsequent patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
---
include/linux/compiler_attributes.h | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
Comments
On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > * When the __weak__ attribute is used > > GCC seems to forget the alignment specified by '-falign-functions=N', > but will respect the '__aligned__(N)' function attribute. Thus, we can > work around this by explciitly setting the alignment for weak > functions. > > * When the __cold__ attribute is used > > GCC seems to forget the alignment specified by '-falign-functions=N', > and also doesn't seem to respect the '__aligned__(N)' function > attribute. The only way to work around this is to not use the __cold__ > attibute. If you happen to have a reduced case, then it would be nice to link it in the commit. A bug report to GCC would also be nice. I gave it a very quick try in Compiler Explorer, but I couldn't reproduce it, so I guess it depends on flags, non-trivial functions or something else. > + * '-falign-functions=N', and require alignment to be specificed via a function Nit: specificed -> specified > +#if CONFIG_FUNCTION_ALIGNMENT > 0 > +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) > +#else > +#define __function_aligned > +#endif Currently, the file is intended for attributes that do not depend on `CONFIG_*` options. What I usually mention is that we could change that policy, but otherwise these would go into e.g. `compiler_types.h`. > +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) > #define __cold __attribute__((__cold__)) > +#else > +#define __cold > +#endif Similarly, in this case this could go into `compiler-gcc.h` / `compiler-clang.h` etc., since the definition will be different for each. Cheers, Miguel
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > * When the __weak__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > work around this by explciitly setting the alignment for weak Whoops: s/explciitly/explicitly/ here too; I'll go re-proofread the series. > > functions. > > > > * When the __cold__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > and also doesn't seem to respect the '__aligned__(N)' function > > attribute. The only way to work around this is to not use the __cold__ > > attibute. Whoops: s/attibute/attribute/ > If you happen to have a reduced case, then it would be nice to link it > in the commit. A bug report to GCC would also be nice. > > I gave it a very quick try in Compiler Explorer, but I couldn't > reproduce it, so I guess it depends on flags, non-trivial functions or > something else. Sorry, that is something I had intendeed to do but I hadn't extracted a reproducer yet. I'll try to come up with something that can be included in the commit message and reported to GCC folk (and double-check at the same time that there's not another hidden cause) With this series applied and this patch reverted, it's possible to see when building defconfig + CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y, where scanning /proc/kallsyms with: $ grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' ... will show a bunch of cold functions (and their callees/callers), largely init/exit functions (so I'll double-check whether section handling as an effect), e.g. ffffdf08be173b8c t snd_soc_exit ffffdf08be173bc4 t apple_mca_driver_exit ffffdf08be173be8 t failover_exit ffffdf08be173c10 t inet_diag_exit ffffdf08be173c60 t tcp_diag_exit ffffdf08be173c84 t cubictcp_unregister ffffdf08be173cac t af_unix_exit ffffdf08be173cf4 t packet_exit ffffdf08be173d3c t cleanup_sunrpc ffffdf08be173d8c t exit_rpcsec_gss ffffdf08be173dc4 t exit_p9 ffffdf08be173dec T p9_client_exit ffffdf08be173e10 t p9_trans_fd_exit ffffdf08be173e58 t p9_virtio_cleanup ffffdf08be173e90 t exit_dns_resolver > > + * '-falign-functions=N', and require alignment to be specificed via a function > > Nit: specificed -> specified Thanks, fixed > > +#if CONFIG_FUNCTION_ALIGNMENT > 0 > > +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) > > +#else > > +#define __function_aligned > > +#endif > > Currently, the file is intended for attributes that do not depend on > `CONFIG_*` options. > > What I usually mention is that we could change that policy, but > otherwise these would go into e.g. `compiler_types.h`. I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the existing __weak and __cold defitions since those end up depending upon __function_aligned. I assume I should move them all? i.e. move __weak as well? > > +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) > > #define __cold __attribute__((__cold__)) > > +#else > > +#define __cold > > +#endif > > Similarly, in this case this could go into `compiler-gcc.h` / > `compiler-clang.h` etc., since the definition will be different for > each. Sure, can do. Thanks, Mark.
On Mon, Jan 9, 2023 at 6:06 PM Mark Rutland <mark.rutland@arm.com> wrote: > > Sorry, that is something I had intendeed to do but I hadn't extracted a > reproducer yet. I'll try to come up with something that can be included in the > commit message and reported to GCC folk (and double-check at the same time that > there's not another hidden cause) Yeah, no worries :) I suggested it because from my quick test it didn't appear to be reproducible trivially, so I thought having the reproducer would be nice. > I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the > existing __weak and __cold defitions since those end up depending upon > __function_aligned. > > I assume I should move them all? i.e. move __weak as well? Yeah, with the current policy, all should be moved since their behavior now depends on the config (eventually). Cheers, Miguel
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > * When the __weak__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > work around this by explciitly setting the alignment for weak > > functions. > > > > * When the __cold__ attribute is used > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > and also doesn't seem to respect the '__aligned__(N)' function > > attribute. The only way to work around this is to not use the __cold__ > > attibute. > > If you happen to have a reduced case, then it would be nice to link it > in the commit. A bug report to GCC would also be nice. > > I gave it a very quick try in Compiler Explorer, but I couldn't > reproduce it, so I guess it depends on flags, non-trivial functions or > something else. So having spent today coming up with tests, it turns out it's not quite as I described above, but in a sense worse. I'm posting a summary here for posterity; I'll try to get this to compiler folk shortly. GCC appears to not align cold functions to the alignment specified by `-falign-functions=N` when compiling at `-O1` or above. Alignment *can* be restored with explicit attributes on each function, but due to some interprocedural analysis, callees can be implicitly marked as cold (losing their default alignment), which means we don't have a reliable mechanism to ensure functions are always aligned short of annotating *every* function explicitly (and I suspect that's not sufficient due to some interprocedural optimizations). I've tested with the 12.1.0 binary release from the kernel.org cross toolchains page). LLVM always seems to repsect `-falign-functions=N` at both `-O1` and `-O2` (I tested the 10.0.0, 11.0.0, 11.0.1, 15.0.6 binary releases from llvm.org). For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-cold.c | #define __cold \ | __attribute__((cold)) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | __cold | void cold_func_a(void) { } | | __cold | void cold_func_b(void) { } | | __cold | void cold_func_c(void) { } | | static __cold | void static_cold_func_a(void) { } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-cold.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-cold.o | | test-cold.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <static_cold_func_a>: | 0: d65f03c0 ret | | 0000000000000004 <static_cold_func_b>: | 4: d65f03c0 ret | | 0000000000000008 <static_cold_func_c>: | 8: d65f03c0 ret | | 000000000000000c <cold_func_a>: | c: d65f03c0 ret | | 0000000000000010 <cold_func_b>: | 10: d65f03c0 ret | | 0000000000000014 <cold_func_c>: | 14: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-cold.o | | test-cold.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 00000018 0000000000000000 0000000000000000 00000040 2**2 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000058 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 00000070 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 00000070 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 00000083 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000090 0000000000000000 0000000000000000 00000088 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA In simple cases, alignment *can* be restored if an explicit function attribute is used. For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold.c | #define __aligned(n) \ | __attribute__((aligned(n))) | | #define __cold \ | __attribute__((cold)) __aligned(16) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | __cold | void cold_func_a(void) { } | | __cold | void cold_func_b(void) { } | | __cold | void cold_func_c(void) { } | | static __cold | void static_cold_func_a(void) { } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold.o | | test-aligned-cold.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <static_cold_func_a>: | 0: d65f03c0 ret | 4: d503201f nop | 8: d503201f nop | c: d503201f nop | | 0000000000000010 <static_cold_func_b>: | 10: d65f03c0 ret | 14: d503201f nop | 18: d503201f nop | 1c: d503201f nop | | 0000000000000020 <static_cold_func_c>: | 20: d65f03c0 ret | 24: d503201f nop | 28: d503201f nop | 2c: d503201f nop | | 0000000000000030 <cold_func_a>: | 30: d65f03c0 ret | 34: d503201f nop | 38: d503201f nop | 3c: d503201f nop | | 0000000000000040 <cold_func_b>: | 40: d65f03c0 ret | 44: d503201f nop | 48: d503201f nop | 4c: d503201f nop | | 0000000000000050 <cold_func_c>: | 50: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold.o | | test-aligned-cold.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 00000054 0000000000000000 0000000000000000 00000040 2**4 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000098 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 000000b0 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 000000b0 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 000000c3 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000090 0000000000000000 0000000000000000 000000c8 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA Unfortunately it appears that some interprocedural analysis determines that if a callee is only called/referenced from cold callers, the callee is marked as cold, and the alignment it would have got from the command line option is dropped. If it's given an explicit alignment attribute, the alignment is retained. For example: | [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold-caller.c | #define noinline \ | __attribute__((noinline)) | | #define __aligned(n) \ | __attribute__((aligned(n))) | | #define __cold \ | __attribute__((cold)) __aligned(16) | | #define EXPORT_FUNC_PTR(func) \ | typeof((func)) *__ptr_##func = (func) | | static noinline void callee_a(void) | { | asm volatile("// callee_a\n" ::: "memory"); | } | | static noinline void callee_b(void) | { | asm volatile("// callee_b\n" ::: "memory"); | } | | static noinline void callee_c(void) | { | asm volatile("// callee_c\n" ::: "memory"); | } | __cold | void cold_func_a(void) { callee_a(); } | | __cold | void cold_func_b(void) { callee_b(); } | | __cold | void cold_func_c(void) { callee_c(); } | | static __cold | void static_cold_func_a(void) { callee_a(); } | EXPORT_FUNC_PTR(static_cold_func_a); | | static __cold | void static_cold_func_b(void) { callee_b(); } | EXPORT_FUNC_PTR(static_cold_func_b); | | static __cold | void static_cold_func_c(void) { callee_c(); } | EXPORT_FUNC_PTR(static_cold_func_c); | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold-caller.c -O1 | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold-caller.o | | test-aligned-cold-caller.o: file format elf64-littleaarch64 | | | Disassembly of section .text: | | 0000000000000000 <callee_a>: | 0: d65f03c0 ret | | 0000000000000004 <callee_b>: | 4: d65f03c0 ret | | 0000000000000008 <callee_c>: | 8: d65f03c0 ret | c: d503201f nop | | 0000000000000010 <static_cold_func_a>: | 10: a9bf7bfd stp x29, x30, [sp, #-16]! | 14: 910003fd mov x29, sp | 18: 97fffffa bl 0 <callee_a> | 1c: a8c17bfd ldp x29, x30, [sp], #16 | 20: d65f03c0 ret | 24: d503201f nop | 28: d503201f nop | 2c: d503201f nop | | 0000000000000030 <static_cold_func_b>: | 30: a9bf7bfd stp x29, x30, [sp, #-16]! | 34: 910003fd mov x29, sp | 38: 97fffff3 bl 4 <callee_b> | 3c: a8c17bfd ldp x29, x30, [sp], #16 | 40: d65f03c0 ret | 44: d503201f nop | 48: d503201f nop | 4c: d503201f nop | | 0000000000000050 <static_cold_func_c>: | 50: a9bf7bfd stp x29, x30, [sp, #-16]! | 54: 910003fd mov x29, sp | 58: 97ffffec bl 8 <callee_c> | 5c: a8c17bfd ldp x29, x30, [sp], #16 | 60: d65f03c0 ret | 64: d503201f nop | 68: d503201f nop | 6c: d503201f nop | | 0000000000000070 <cold_func_a>: | 70: a9bf7bfd stp x29, x30, [sp, #-16]! | 74: 910003fd mov x29, sp | 78: 97ffffe2 bl 0 <callee_a> | 7c: a8c17bfd ldp x29, x30, [sp], #16 | 80: d65f03c0 ret | 84: d503201f nop | 88: d503201f nop | 8c: d503201f nop | | 0000000000000090 <cold_func_b>: | 90: a9bf7bfd stp x29, x30, [sp, #-16]! | 94: 910003fd mov x29, sp | 98: 97ffffdb bl 4 <callee_b> | 9c: a8c17bfd ldp x29, x30, [sp], #16 | a0: d65f03c0 ret | a4: d503201f nop | a8: d503201f nop | ac: d503201f nop | | 00000000000000b0 <cold_func_c>: | b0: a9bf7bfd stp x29, x30, [sp, #-16]! | b4: 910003fd mov x29, sp | b8: 97ffffd4 bl 8 <callee_c> | bc: a8c17bfd ldp x29, x30, [sp], #16 | c0: d65f03c0 ret | [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold-caller.o | | test-aligned-cold-caller.o: file format elf64-littleaarch64 | | Sections: | Idx Name Size VMA LMA File off Algn | 0 .text 000000c4 0000000000000000 0000000000000000 00000040 2**4 | CONTENTS, ALLOC, LOAD, READONLY, CODE | 1 .data 00000018 0000000000000000 0000000000000000 00000108 2**3 | CONTENTS, ALLOC, LOAD, RELOC, DATA | 2 .bss 00000000 0000000000000000 0000000000000000 00000120 2**0 | ALLOC | 3 .comment 00000013 0000000000000000 0000000000000000 00000120 2**0 | CONTENTS, READONLY | 4 .note.GNU-stack 00000000 0000000000000000 0000000000000000 00000133 2**0 | CONTENTS, READONLY | 5 .eh_frame 00000110 0000000000000000 0000000000000000 00000138 2**3 | CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA Thanks, Mark.
On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote: > On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > > > * When the __weak__ attribute is used > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > > work around this by explciitly setting the alignment for weak > > > functions. > > > > > > * When the __cold__ attribute is used > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > and also doesn't seem to respect the '__aligned__(N)' function > > > attribute. The only way to work around this is to not use the __cold__ > > > attibute. > > > > If you happen to have a reduced case, then it would be nice to link it > > in the commit. A bug report to GCC would also be nice. > > > > I gave it a very quick try in Compiler Explorer, but I couldn't > > reproduce it, so I guess it depends on flags, non-trivial functions or > > something else. > > So having spent today coming up with tests, it turns out it's not quite as I > described above, but in a sense worse. I'm posting a summary here for > posterity; I'll try to get this to compiler folk shortly. I've added the cold bits to an existing ticket: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 I have not been able to reproduce the issue with __weak__, so I'll go dig into that some more; it's likely I was mistaken there. Thanks, Mark.
On Thu, Jan 12, 2023 at 11:38:17AM +0000, Mark Rutland wrote: > On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote: > > On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote: > > > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote: > > > > > > > > As far as I can tell, GCC doesn't respect '-falign-functions=N': > > > > > > > > * When the __weak__ attribute is used > > > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > > but will respect the '__aligned__(N)' function attribute. Thus, we can > > > > work around this by explciitly setting the alignment for weak > > > > functions. > > > > > > > > * When the __cold__ attribute is used > > > > > > > > GCC seems to forget the alignment specified by '-falign-functions=N', > > > > and also doesn't seem to respect the '__aligned__(N)' function > > > > attribute. The only way to work around this is to not use the __cold__ > > > > attibute. > > > > > > If you happen to have a reduced case, then it would be nice to link it > > > in the commit. A bug report to GCC would also be nice. > > > > > > I gave it a very quick try in Compiler Explorer, but I couldn't > > > reproduce it, so I guess it depends on flags, non-trivial functions or > > > something else. > > > > So having spent today coming up with tests, it turns out it's not quite as I > > described above, but in a sense worse. I'm posting a summary here for > > posterity; I'll try to get this to compiler folk shortly. > > I've added the cold bits to an existing ticket: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 > > I have not been able to reproduce the issue with __weak__, so I'll go dig into > that some more; it's likely I was mistaken there. It turns out that was a red herring; GCC is actually implicitly marking the abort() function as cold, and as Linux's implementation happened to be marked as weak I assumed that was the culprit. I'll drop the changes to weak and update our abort implementation specifically, with a comment. I'll also go update the ticket above. Thanks, Mark.
On Fri, Jan 13, 2023 at 1:49 PM Mark Rutland <mark.rutland@arm.com> wrote: > > It turns out that was a red herring; GCC is actually implicitly marking the > abort() function as cold, and as Linux's implementation happened to be marked > as weak I assumed that was the culprit. That and your previous message probably explains probably why I couldn't reproduce it. Thanks a lot for all the details -- the `cold` issue is reproducible since gcc 4.6 at least: https://godbolt.org/z/PoxazzT9T The `abort` case appears to happen since gcc 8.1. Cheers, Miguel
diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h index 898b3458b24a0..dcb7ac67b764f 100644 --- a/include/linux/compiler_attributes.h +++ b/include/linux/compiler_attributes.h @@ -33,6 +33,17 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __aligned_largest __attribute__((__aligned__)) +/* + * Contemporary versions of GCC (e.g. 12.2.0) don't always respect + * '-falign-functions=N', and require alignment to be specificed via a function + * attribute in some cases. + */ +#if CONFIG_FUNCTION_ALIGNMENT > 0 +#define __function_aligned __aligned(CONFIG_FUNCTION_ALIGNMENT) +#else +#define __function_aligned +#endif + /* * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined @@ -78,8 +89,15 @@ /* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute * gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute + * + * GCC drops function alignment when the __cold__ attribute is used. Avoid the + * __cold__ attribute if function alignment is required. */ +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0) #define __cold __attribute__((__cold__)) +#else +#define __cold +#endif /* * Note the long name. @@ -369,8 +387,11 @@ /* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-weak-function-attribute * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-weak-variable-attribute + * + * GCC drops function alignment when the __weak__ attribute is used. This can + * be restored with function attributes. */ -#define __weak __attribute__((__weak__)) +#define __weak __attribute__((__weak__)) __function_aligned /* * Used by functions that use '__builtin_return_address'. These function