[1/8] Compiler attributes: GCC function alignment workarounds

Message ID 20230109135828.879136-2-mark.rutland@arm.com
State New
Headers
Series arm64/ftrace: Add support for DYNAMIC_FTRACE_WITH_CALL_OPS |

Commit Message

Mark Rutland Jan. 9, 2023, 1:58 p.m. UTC
  From local testing, contemporary versions of of GCC (e.g. GCC 12.2.0)
don't respect '-falign-functions=N' in all cases. This is unfortunate,
as (for non-zero values of N) CONFIG_FUNCTION_ALIGNMENT=N will set
'-falign-functions=N', but this won't take effect for all functions.
LLVM appears to respect '-falign-functions=N' in call cases.

Today, for x86 this turns out to be functionally benign, though it does
somewhat undermine the CONFIG_FUNCTION_ALIGNMENT option, and it means
that CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B is not as robust as we'd
like.

On arm64 we'd like to use CONFIG_FUNCTION_ALIGNMENT to implement ftrace
functionality, and we'll require function alignment to be respected for
functional reasons.

As far as I can tell, GCC doesn't respect '-falign-functions=N':

* When the __weak__ attribute is used

  GCC seems to forget the alignment specified by '-falign-functions=N',
  but will respect the '__aligned__(N)' function attribute. Thus, we can
  work around this by explciitly setting the alignment for weak
  functions.

* When the __cold__ attribute is used

  GCC seems to forget the alignment specified by '-falign-functions=N',
  and also doesn't seem to respect the '__aligned__(N)' function
  attribute. The only way to work around this is to not use the __cold__
  attibute.

This patch implements workarounds for these two cases, using a function
attribute to set the alignment of __weak__ functions, and preventing the
use of the __cold__ attribute when CONFIG_FUNCTION_ALIGNMENT is
non-zero.

I've tested this by selecting CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y,
building and booting a kernel, and looking for misaligned text symbols:

* arm64:

  Before:
    # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
    4939

  After:
    # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
    908

* x86_64:

  Before:
    # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
    7969

  After:
    # grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] ' | wc -l
    2057

With the patch applied, the remaining unaligned text labels are a
combination of static call trampolines, non-function labels in assembly,
and ACPICA functions, which will be dealt with in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Florent Revest <revest@chromium.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Will Deacon <will@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
---
 include/linux/compiler_attributes.h | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)
  

Comments

Miguel Ojeda Jan. 9, 2023, 2:43 p.m. UTC | #1
On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> As far as I can tell, GCC doesn't respect '-falign-functions=N':
>
> * When the __weak__ attribute is used
>
>   GCC seems to forget the alignment specified by '-falign-functions=N',
>   but will respect the '__aligned__(N)' function attribute. Thus, we can
>   work around this by explciitly setting the alignment for weak
>   functions.
>
> * When the __cold__ attribute is used
>
>   GCC seems to forget the alignment specified by '-falign-functions=N',
>   and also doesn't seem to respect the '__aligned__(N)' function
>   attribute. The only way to work around this is to not use the __cold__
>   attibute.

If you happen to have a reduced case, then it would be nice to link it
in the commit. A bug report to GCC would also be nice.

I gave it a very quick try in Compiler Explorer, but I couldn't
reproduce it, so I guess it depends on flags, non-trivial functions or
something else.

> + * '-falign-functions=N', and require alignment to be specificed via a function

Nit: specificed -> specified

> +#if CONFIG_FUNCTION_ALIGNMENT > 0
> +#define __function_aligned             __aligned(CONFIG_FUNCTION_ALIGNMENT)
> +#else
> +#define __function_aligned
> +#endif

Currently, the file is intended for attributes that do not depend on
`CONFIG_*` options.

What I usually mention is that we could change that policy, but
otherwise these would go into e.g. `compiler_types.h`.

> +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0)
>  #define __cold                          __attribute__((__cold__))
> +#else
> +#define __cold
> +#endif

Similarly, in this case this could go into `compiler-gcc.h` /
`compiler-clang.h` etc., since the definition will be different for
each.

Cheers,
Miguel
  
Mark Rutland Jan. 9, 2023, 5:06 p.m. UTC | #2
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote:
> On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > As far as I can tell, GCC doesn't respect '-falign-functions=N':
> >
> > * When the __weak__ attribute is used
> >
> >   GCC seems to forget the alignment specified by '-falign-functions=N',
> >   but will respect the '__aligned__(N)' function attribute. Thus, we can
> >   work around this by explciitly setting the alignment for weak

Whoops: s/explciitly/explicitly/ here too; I'll go re-proofread the series.

> >   functions.
> >
> > * When the __cold__ attribute is used
> >
> >   GCC seems to forget the alignment specified by '-falign-functions=N',
> >   and also doesn't seem to respect the '__aligned__(N)' function
> >   attribute. The only way to work around this is to not use the __cold__
> >   attibute.

Whoops: s/attibute/attribute/

> If you happen to have a reduced case, then it would be nice to link it
> in the commit. A bug report to GCC would also be nice.
> 
> I gave it a very quick try in Compiler Explorer, but I couldn't
> reproduce it, so I guess it depends on flags, non-trivial functions or
> something else.

Sorry, that is something I had intendeed to do but I hadn't extracted a
reproducer yet. I'll try to come up with something that can be included in the
commit message and reported to GCC folk (and double-check at the same time that
there's not another hidden cause)

With this series applied and this patch reverted, it's possible to see when
building defconfig + CONFIG_DEBUG_FORCE_FUNCTION_ALIGN_64B=y, where scanning
/proc/kallsyms with:

  $ grep ' [Tt] ' /proc/kallsyms | grep -iv '[048c]0 [Tt] '

... will show a bunch of cold functions (and their callees/callers), largely
init/exit functions (so I'll double-check whether section handling as an
effect), e.g.

  ffffdf08be173b8c t snd_soc_exit
  ffffdf08be173bc4 t apple_mca_driver_exit
  ffffdf08be173be8 t failover_exit
  ffffdf08be173c10 t inet_diag_exit
  ffffdf08be173c60 t tcp_diag_exit
  ffffdf08be173c84 t cubictcp_unregister
  ffffdf08be173cac t af_unix_exit
  ffffdf08be173cf4 t packet_exit
  ffffdf08be173d3c t cleanup_sunrpc
  ffffdf08be173d8c t exit_rpcsec_gss
  ffffdf08be173dc4 t exit_p9
  ffffdf08be173dec T p9_client_exit
  ffffdf08be173e10 t p9_trans_fd_exit
  ffffdf08be173e58 t p9_virtio_cleanup
  ffffdf08be173e90 t exit_dns_resolver

> > + * '-falign-functions=N', and require alignment to be specificed via a function
> 
> Nit: specificed -> specified

Thanks, fixed

> > +#if CONFIG_FUNCTION_ALIGNMENT > 0
> > +#define __function_aligned             __aligned(CONFIG_FUNCTION_ALIGNMENT)
> > +#else
> > +#define __function_aligned
> > +#endif
> 
> Currently, the file is intended for attributes that do not depend on
> `CONFIG_*` options.
> 
> What I usually mention is that we could change that policy, but
> otherwise these would go into e.g. `compiler_types.h`.

I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the
existing __weak and __cold defitions since those end up depending upon
__function_aligned.

I assume I should move them all? i.e. move __weak as well?

> > +#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0)
> >  #define __cold                          __attribute__((__cold__))
> > +#else
> > +#define __cold
> > +#endif
> 
> Similarly, in this case this could go into `compiler-gcc.h` /
> `compiler-clang.h` etc., since the definition will be different for
> each.

Sure, can do.

Thanks,
Mark.
  
Miguel Ojeda Jan. 9, 2023, 10:35 p.m. UTC | #3
On Mon, Jan 9, 2023 at 6:06 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> Sorry, that is something I had intendeed to do but I hadn't extracted a
> reproducer yet. I'll try to come up with something that can be included in the
> commit message and reported to GCC folk (and double-check at the same time that
> there's not another hidden cause)

Yeah, no worries :) I suggested it because from my quick test it
didn't appear to be reproducible trivially, so I thought having the
reproducer would be nice.

> I'm happy to move these, I just wasn't sure what the policy would be w.r.t. the
> existing __weak and __cold defitions since those end up depending upon
> __function_aligned.
>
> I assume I should move them all? i.e. move __weak as well?

Yeah, with the current policy, all should be moved since their
behavior now depends on the config (eventually).

Cheers,
Miguel
  
Mark Rutland Jan. 11, 2023, 6:27 p.m. UTC | #4
On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote:
> On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > As far as I can tell, GCC doesn't respect '-falign-functions=N':
> >
> > * When the __weak__ attribute is used
> >
> >   GCC seems to forget the alignment specified by '-falign-functions=N',
> >   but will respect the '__aligned__(N)' function attribute. Thus, we can
> >   work around this by explciitly setting the alignment for weak
> >   functions.
> >
> > * When the __cold__ attribute is used
> >
> >   GCC seems to forget the alignment specified by '-falign-functions=N',
> >   and also doesn't seem to respect the '__aligned__(N)' function
> >   attribute. The only way to work around this is to not use the __cold__
> >   attibute.
> 
> If you happen to have a reduced case, then it would be nice to link it
> in the commit. A bug report to GCC would also be nice.
> 
> I gave it a very quick try in Compiler Explorer, but I couldn't
> reproduce it, so I guess it depends on flags, non-trivial functions or
> something else.

So having spent today coming up with tests, it turns out it's not quite as I
described above, but in a sense worse. I'm posting a summary here for
posterity; I'll try to get this to compiler folk shortly.

GCC appears to not align cold functions to the alignment specified by
`-falign-functions=N` when compiling at `-O1` or above. Alignment *can* be
restored with explicit attributes on each function, but due to some
interprocedural analysis, callees can be implicitly marked as cold (losing
their default alignment), which means we don't have a reliable mechanism to
ensure functions are always aligned short of annotating *every* function
explicitly (and I suspect that's not sufficient due to some interprocedural optimizations).

I've tested with the 12.1.0 binary release from the kernel.org cross toolchains
page).

LLVM always seems to repsect `-falign-functions=N` at both `-O1` and `-O2` (I
tested the 10.0.0, 11.0.0, 11.0.1, 15.0.6 binary releases from llvm.org).

For example:

| [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-cold.c                                           
| #define __cold \
|         __attribute__((cold))
| 
| #define EXPORT_FUNC_PTR(func) \
|         typeof((func)) *__ptr_##func = (func)
| 
| __cold
| void cold_func_a(void) { }
| 
| __cold
| void cold_func_b(void) { }
| 
| __cold
| void cold_func_c(void) { }
| 
| static __cold
| void static_cold_func_a(void) { }
| EXPORT_FUNC_PTR(static_cold_func_a);
| 
| static __cold
| void static_cold_func_b(void) { }
| EXPORT_FUNC_PTR(static_cold_func_b);
| 
| static __cold
| void static_cold_func_c(void) { }
| EXPORT_FUNC_PTR(static_cold_func_c);
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-cold.c -O1
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-cold.o                     
| 
| test-cold.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <static_cold_func_a>:
|    0:   d65f03c0        ret
| 
| 0000000000000004 <static_cold_func_b>:
|    4:   d65f03c0        ret
| 
| 0000000000000008 <static_cold_func_c>:
|    8:   d65f03c0        ret
| 
| 000000000000000c <cold_func_a>:
|    c:   d65f03c0        ret
| 
| 0000000000000010 <cold_func_b>:
|   10:   d65f03c0        ret
| 
| 0000000000000014 <cold_func_c>:
|   14:   d65f03c0        ret
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-cold.o
| 
| test-cold.o:     file format elf64-littleaarch64
| 
| Sections:
| Idx Name          Size      VMA               LMA               File off  Algn
|   0 .text         00000018  0000000000000000  0000000000000000  00000040  2**2
|                   CONTENTS, ALLOC, LOAD, READONLY, CODE
|   1 .data         00000018  0000000000000000  0000000000000000  00000058  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, DATA
|   2 .bss          00000000  0000000000000000  0000000000000000  00000070  2**0
|                   ALLOC
|   3 .comment      00000013  0000000000000000  0000000000000000  00000070  2**0
|                   CONTENTS, READONLY
|   4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000083  2**0
|                   CONTENTS, READONLY
|   5 .eh_frame     00000090  0000000000000000  0000000000000000  00000088  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

In simple cases, alignment *can* be restored if an explicit function attribute
is used. For example:

| [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold.c                            
| #define __aligned(n) \
|         __attribute__((aligned(n)))
| 
| #define __cold \
|         __attribute__((cold)) __aligned(16)
| 
| #define EXPORT_FUNC_PTR(func) \
|         typeof((func)) *__ptr_##func = (func)
| 
| __cold
| void cold_func_a(void) { }
| 
| __cold
| void cold_func_b(void) { }
| 
| __cold
| void cold_func_c(void) { }
| 
| static __cold
| void static_cold_func_a(void) { }
| EXPORT_FUNC_PTR(static_cold_func_a);
| 
| static __cold
| void static_cold_func_b(void) { }
| EXPORT_FUNC_PTR(static_cold_func_b);
| 
| static __cold
| void static_cold_func_c(void) { }
| EXPORT_FUNC_PTR(static_cold_func_c);
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold.c -O1
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold.o                     
| 
| test-aligned-cold.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <static_cold_func_a>:
|    0:   d65f03c0        ret
|    4:   d503201f        nop
|    8:   d503201f        nop
|    c:   d503201f        nop
| 
| 0000000000000010 <static_cold_func_b>:
|   10:   d65f03c0        ret
|   14:   d503201f        nop
|   18:   d503201f        nop
|   1c:   d503201f        nop
| 
| 0000000000000020 <static_cold_func_c>:
|   20:   d65f03c0        ret
|   24:   d503201f        nop
|   28:   d503201f        nop
|   2c:   d503201f        nop
| 
| 0000000000000030 <cold_func_a>:
|   30:   d65f03c0        ret
|   34:   d503201f        nop
|   38:   d503201f        nop
|   3c:   d503201f        nop
| 
| 0000000000000040 <cold_func_b>:
|   40:   d65f03c0        ret
|   44:   d503201f        nop
|   48:   d503201f        nop
|   4c:   d503201f        nop
| 
| 0000000000000050 <cold_func_c>:
|   50:   d65f03c0        ret
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold.o
| 
| test-aligned-cold.o:     file format elf64-littleaarch64
| 
| Sections:
| Idx Name          Size      VMA               LMA               File off  Algn
|   0 .text         00000054  0000000000000000  0000000000000000  00000040  2**4
|                   CONTENTS, ALLOC, LOAD, READONLY, CODE
|   1 .data         00000018  0000000000000000  0000000000000000  00000098  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, DATA
|   2 .bss          00000000  0000000000000000  0000000000000000  000000b0  2**0
|                   ALLOC
|   3 .comment      00000013  0000000000000000  0000000000000000  000000b0  2**0
|                   CONTENTS, READONLY
|   4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  000000c3  2**0
|                   CONTENTS, READONLY
|   5 .eh_frame     00000090  0000000000000000  0000000000000000  000000c8  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA


Unfortunately it appears that some interprocedural analysis determines that if
a callee is only called/referenced from cold callers, the callee is marked as
cold, and the alignment it would have got from the command line option is
dropped. If it's given an explicit alignment attribute, the alignment is
retained.

For example:

| [mark@lakrids:/mnt/data/tests/gcc-alignment]% cat test-aligned-cold-caller.c                                    
| #define noinline \
|         __attribute__((noinline))
| 
| #define __aligned(n) \
|         __attribute__((aligned(n)))
| 
| #define __cold \
|         __attribute__((cold)) __aligned(16)
| 
| #define EXPORT_FUNC_PTR(func) \
|         typeof((func)) *__ptr_##func = (func)
| 
| static noinline void callee_a(void)
| {
|         asm volatile("// callee_a\n" ::: "memory");
| }
| 
| static noinline void callee_b(void)
| {
|         asm volatile("// callee_b\n" ::: "memory");
| }
| 
| static noinline void callee_c(void)
| {
|         asm volatile("// callee_c\n" ::: "memory");
| }
| __cold
| void cold_func_a(void) { callee_a(); }
| 
| __cold
| void cold_func_b(void) { callee_b(); }
| 
| __cold
| void cold_func_c(void) { callee_c(); }
| 
| static __cold
| void static_cold_func_a(void) { callee_a(); }
| EXPORT_FUNC_PTR(static_cold_func_a);
| 
| static __cold
| void static_cold_func_b(void) { callee_b(); }
| EXPORT_FUNC_PTR(static_cold_func_b);
| 
| static __cold
| void static_cold_func_c(void) { callee_c(); }
| EXPORT_FUNC_PTR(static_cold_func_c);
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-gcc -falign-functions=16 -c test-aligned-cold-caller.c -O1
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -d test-aligned-cold-caller.o                     
| 
| test-aligned-cold-caller.o:     file format elf64-littleaarch64
| 
| 
| Disassembly of section .text:
| 
| 0000000000000000 <callee_a>:
|    0:   d65f03c0        ret
| 
| 0000000000000004 <callee_b>:
|    4:   d65f03c0        ret
| 
| 0000000000000008 <callee_c>:
|    8:   d65f03c0        ret
|    c:   d503201f        nop
| 
| 0000000000000010 <static_cold_func_a>:
|   10:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   14:   910003fd        mov     x29, sp
|   18:   97fffffa        bl      0 <callee_a>
|   1c:   a8c17bfd        ldp     x29, x30, [sp], #16
|   20:   d65f03c0        ret
|   24:   d503201f        nop
|   28:   d503201f        nop
|   2c:   d503201f        nop
| 
| 0000000000000030 <static_cold_func_b>:
|   30:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   34:   910003fd        mov     x29, sp
|   38:   97fffff3        bl      4 <callee_b>
|   3c:   a8c17bfd        ldp     x29, x30, [sp], #16
|   40:   d65f03c0        ret
|   44:   d503201f        nop
|   48:   d503201f        nop
|   4c:   d503201f        nop
| 
| 0000000000000050 <static_cold_func_c>:
|   50:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   54:   910003fd        mov     x29, sp
|   58:   97ffffec        bl      8 <callee_c>
|   5c:   a8c17bfd        ldp     x29, x30, [sp], #16
|   60:   d65f03c0        ret
|   64:   d503201f        nop
|   68:   d503201f        nop
|   6c:   d503201f        nop
| 
| 0000000000000070 <cold_func_a>:
|   70:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   74:   910003fd        mov     x29, sp
|   78:   97ffffe2        bl      0 <callee_a>
|   7c:   a8c17bfd        ldp     x29, x30, [sp], #16
|   80:   d65f03c0        ret
|   84:   d503201f        nop
|   88:   d503201f        nop
|   8c:   d503201f        nop
| 
| 0000000000000090 <cold_func_b>:
|   90:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   94:   910003fd        mov     x29, sp
|   98:   97ffffdb        bl      4 <callee_b>
|   9c:   a8c17bfd        ldp     x29, x30, [sp], #16
|   a0:   d65f03c0        ret
|   a4:   d503201f        nop
|   a8:   d503201f        nop
|   ac:   d503201f        nop
| 
| 00000000000000b0 <cold_func_c>:
|   b0:   a9bf7bfd        stp     x29, x30, [sp, #-16]!
|   b4:   910003fd        mov     x29, sp
|   b8:   97ffffd4        bl      8 <callee_c>
|   bc:   a8c17bfd        ldp     x29, x30, [sp], #16
|   c0:   d65f03c0        ret
| [mark@lakrids:/mnt/data/tests/gcc-alignment]% usekorg 12.1.0 aarch64-linux-objdump -h test-aligned-cold-caller.o
| 
| test-aligned-cold-caller.o:     file format elf64-littleaarch64
| 
| Sections:
| Idx Name          Size      VMA               LMA               File off  Algn
|   0 .text         000000c4  0000000000000000  0000000000000000  00000040  2**4
|                   CONTENTS, ALLOC, LOAD, READONLY, CODE
|   1 .data         00000018  0000000000000000  0000000000000000  00000108  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, DATA
|   2 .bss          00000000  0000000000000000  0000000000000000  00000120  2**0
|                   ALLOC
|   3 .comment      00000013  0000000000000000  0000000000000000  00000120  2**0
|                   CONTENTS, READONLY
|   4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000133  2**0
|                   CONTENTS, READONLY
|   5 .eh_frame     00000110  0000000000000000  0000000000000000  00000138  2**3
|                   CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA

Thanks,
Mark.
  
Mark Rutland Jan. 12, 2023, 11:38 a.m. UTC | #5
On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote:
> On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote:
> > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > As far as I can tell, GCC doesn't respect '-falign-functions=N':
> > >
> > > * When the __weak__ attribute is used
> > >
> > >   GCC seems to forget the alignment specified by '-falign-functions=N',
> > >   but will respect the '__aligned__(N)' function attribute. Thus, we can
> > >   work around this by explciitly setting the alignment for weak
> > >   functions.
> > >
> > > * When the __cold__ attribute is used
> > >
> > >   GCC seems to forget the alignment specified by '-falign-functions=N',
> > >   and also doesn't seem to respect the '__aligned__(N)' function
> > >   attribute. The only way to work around this is to not use the __cold__
> > >   attibute.
> > 
> > If you happen to have a reduced case, then it would be nice to link it
> > in the commit. A bug report to GCC would also be nice.
> > 
> > I gave it a very quick try in Compiler Explorer, but I couldn't
> > reproduce it, so I guess it depends on flags, non-trivial functions or
> > something else.
> 
> So having spent today coming up with tests, it turns out it's not quite as I
> described above, but in a sense worse. I'm posting a summary here for
> posterity; I'll try to get this to compiler folk shortly.

I've added the cold bits to an existing ticket:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345

I have not been able to reproduce the issue with __weak__, so I'll go dig into
that some more; it's likely I was mistaken there.

Thanks,
Mark.
  
Mark Rutland Jan. 13, 2023, 12:49 p.m. UTC | #6
On Thu, Jan 12, 2023 at 11:38:17AM +0000, Mark Rutland wrote:
> On Wed, Jan 11, 2023 at 06:27:53PM +0000, Mark Rutland wrote:
> > On Mon, Jan 09, 2023 at 03:43:16PM +0100, Miguel Ojeda wrote:
> > > On Mon, Jan 9, 2023 at 2:58 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > As far as I can tell, GCC doesn't respect '-falign-functions=N':
> > > >
> > > > * When the __weak__ attribute is used
> > > >
> > > >   GCC seems to forget the alignment specified by '-falign-functions=N',
> > > >   but will respect the '__aligned__(N)' function attribute. Thus, we can
> > > >   work around this by explciitly setting the alignment for weak
> > > >   functions.
> > > >
> > > > * When the __cold__ attribute is used
> > > >
> > > >   GCC seems to forget the alignment specified by '-falign-functions=N',
> > > >   and also doesn't seem to respect the '__aligned__(N)' function
> > > >   attribute. The only way to work around this is to not use the __cold__
> > > >   attibute.
> > > 
> > > If you happen to have a reduced case, then it would be nice to link it
> > > in the commit. A bug report to GCC would also be nice.
> > > 
> > > I gave it a very quick try in Compiler Explorer, but I couldn't
> > > reproduce it, so I guess it depends on flags, non-trivial functions or
> > > something else.
> > 
> > So having spent today coming up with tests, it turns out it's not quite as I
> > described above, but in a sense worse. I'm posting a summary here for
> > posterity; I'll try to get this to compiler folk shortly.
> 
> I've added the cold bits to an existing ticket:
> 
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345
> 
> I have not been able to reproduce the issue with __weak__, so I'll go dig into
> that some more; it's likely I was mistaken there.

It turns out that was a red herring; GCC is actually implicitly marking the
abort() function as cold, and as Linux's implementation happened to be marked
as weak I assumed that was the culprit.

I'll drop the changes to weak and update our abort implementation specifically,
with a comment.

I'll also go update the ticket above.

Thanks,
Mark.
  
Miguel Ojeda Jan. 15, 2023, 9:32 p.m. UTC | #7
On Fri, Jan 13, 2023 at 1:49 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> It turns out that was a red herring; GCC is actually implicitly marking the
> abort() function as cold, and as Linux's implementation happened to be marked
> as weak I assumed that was the culprit.

That and your previous message probably explains probably why I
couldn't reproduce it.

Thanks a lot for all the details -- the `cold` issue is reproducible
since gcc 4.6 at least: https://godbolt.org/z/PoxazzT9T

The `abort` case appears to happen since gcc 8.1.

Cheers,
Miguel
  

Patch

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 898b3458b24a0..dcb7ac67b764f 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -33,6 +33,17 @@ 
 #define __aligned(x)                    __attribute__((__aligned__(x)))
 #define __aligned_largest               __attribute__((__aligned__))
 
+/*
+ * Contemporary versions of GCC (e.g. 12.2.0) don't always respect
+ * '-falign-functions=N', and require alignment to be specificed via a function
+ * attribute in some cases.
+ */
+#if CONFIG_FUNCTION_ALIGNMENT > 0
+#define __function_aligned		__aligned(CONFIG_FUNCTION_ALIGNMENT)
+#else
+#define __function_aligned
+#endif
+
 /*
  * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
  * available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined
@@ -78,8 +89,15 @@ 
 /*
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-cold-function-attribute
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html#index-cold-label-attribute
+ *
+ * GCC drops function alignment when the __cold__ attribute is used. Avoid the
+ * __cold__ attribute if function alignment is required.
  */
+#if !defined(CONFIG_CC_IS_GCC) || (CONFIG_FUNCTION_ALIGNMENT == 0)
 #define __cold                          __attribute__((__cold__))
+#else
+#define __cold
+#endif
 
 /*
  * Note the long name.
@@ -369,8 +387,11 @@ 
 /*
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-weak-function-attribute
  *   gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-weak-variable-attribute
+ *
+ * GCC drops function alignment when the __weak__ attribute is used. This can
+ * be restored with function attributes.
  */
-#define __weak                          __attribute__((__weak__))
+#define __weak                          __attribute__((__weak__)) __function_aligned
 
 /*
  * Used by functions that use '__builtin_return_address'. These function