[v2] x86/unwind/orc: add ELF section with ORC version identifier
Commit Message
From: Omar Sandoval <osandov@fb.com>
Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
two") changed the ORC format. Although ORC is internal to the kernel,
it's the only way for external tools to get reliable kernel stack traces
on x86-64. In particular, the drgn debugger [1] uses ORC for stack
unwinding, and these format changes broke it [2]. As the drgn
maintainer, I don't care how often or how much the kernel changes the
ORC format as long as I have a way to detect the change.

It suffices to store a version identifier in the vmlinux and kernel
module ELF files (to use when parsing ORC sections from ELF), and in
kernel memory (to use when parsing ORC from a core dump+symbol table).
Rather than hard-coding a version number that needs to be manually
bumped, Peterz suggested hashing the definitions from orc_types.h. If
there is a format change that isn't caught by this, the hashing script
can be updated.

This patch adds an .orc_header allocated ELF section containing the
20-byte hash to vmlinux and kernel modules, along with the corresponding
__start_orc_header and __stop_orc_header symbols in vmlinux.

1: https://github.com/osandov/drgn
2: https://github.com/osandov/drgn/issues/303

Signed-off-by: Omar Sandoval <osandov@fb.com>
---
Hi,

This is v2 of my patch to make it possible for external tools like drgn
to identify versions of the ORC format. As stated in v1 [1], I don't
want ORC to be stable ABI; I just need a way to identify the format
being used.

This version incorporates Peter's suggestion to hash the ORC definitions
instead of requiring a manual version number; this is easier to maintain
and more resilient to backports.

I would love to get this in before 6.4 is released, and then hopefully
backport it to 6.3-stable.

This is based on Linus' tree as of today (commit
fb054096aea0576f0c0a61c598e5e9676443ee86).

Thanks!
Omar
 arch/x86/Makefile                 | 12 ++++++++++++
 arch/x86/include/asm/Kbuild       |  1 +
 arch/x86/include/asm/orc_header.h | 19 +++++++++++++++++++
 arch/x86/kernel/unwind_orc.c      |  3 +++
 include/asm-generic/vmlinux.lds.h |  3 +++
 scripts/mod/modpost.c             |  5 +++++
 scripts/orc_hash.sh               | 16 ++++++++++++++++
 7 files changed, 59 insertions(+)
create mode 100644 arch/x86/include/asm/orc_header.h
create mode 100755 scripts/orc_hash.sh
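
To illustrate what consuming the new section could look like from the tool side, here is a minimal sketch in Python that pulls the raw .orc_header bytes out of a vmlinux or module ELF file. This is not drgn's implementation; the function name read_orc_header is made up for illustration, and the sketch assumes a little-endian ELF64 file without extended section numbering (e_shnum == 0 is not handled).

```python
import struct
from typing import Optional

# Elf64_Ehdr and Elf64_Shdr layouts for little-endian ELF64 files.
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
SHDR = struct.Struct("<IIQQQQIIQQ")


def read_orc_header(elf: bytes, wanted: bytes = b".orc_header") -> Optional[bytes]:
    """Return the contents of the named section, or None if it is absent.

    Hypothetical helper: assumes ELF64, little-endian, and no extended
    section numbering.
    """
    fields = EHDR.unpack_from(elf, 0)
    ident = fields[0]
    # EI_CLASS == 2 is ELFCLASS64, EI_DATA == 1 is little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    shoff, shentsize, shnum, shstrndx = fields[6], fields[11], fields[12], fields[13]

    def shdr(index):
        return SHDR.unpack_from(elf, shoff + index * shentsize)

    # The section-name string table is itself a section (index e_shstrndx).
    strhdr = shdr(shstrndx)
    strtab = elf[strhdr[4]:strhdr[4] + strhdr[5]]

    for index in range(shnum):
        hdr = shdr(index)
        name_off, offset, size = hdr[0], hdr[4], hdr[5]
        name = strtab[name_off:strtab.index(b"\0", name_off)]
        if name == wanted:
            return elf[offset:offset + size]
    return None
```

A tool would then compare the returned 20 bytes against the hashes of the ORC layouts it knows how to parse. A None result suggests the file was built without this patch (or without CONFIG_UNWINDER_ORC), in which case the tool has to fall back to some other way of guessing the format.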
Comments
On Tue, Jun 13, 2023 at 02:14:56PM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
> metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
> two") changed the ORC format. Although ORC is internal to the kernel,
> it's the only way for external tools to get reliable kernel stack traces
> on x86-64. In particular, the drgn debugger [1] uses ORC for stack
> unwinding, and these format changes broke it [2]. As the drgn
> maintainer, I don't care how often or how much the kernel changes the
> ORC format as long as I have a way to detect the change.
>
> It suffices to store a version identifier in the vmlinux and kernel
> module ELF files (to use when parsing ORC sections from ELF), and in
> kernel memory (to use when parsing ORC from a core dump+symbol table).
> Rather than hard-coding a version number that needs to be manually
> bumped, Peterz suggested hashing the definitions from orc_types.h. If
> there is a format change that isn't caught by this, the hashing script
> can be updated.
>
> This patch adds an .orc_header allocated ELF section containing the
> 20-byte hash to vmlinux and kernel modules, along with the corresponding
> __start_orc_header and __stop_orc_header symbols in vmlinux.
>
> 1: https://github.com/osandov/drgn
> 2: https://github.com/osandov/drgn/issues/303
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
Patch looks good to me; as a follow up I suppose we could verify the orc
hash on module load, to ensure the module and main kernel agree on the
ORC version used -- but we can do that later.
> ---
> Hi,
>
> This is v2 of my patch to make it possible for external tools like drgn
> to identify versions of the ORC format. As stated in v1 [1], I don't
> want ORC to be stable ABI; I just need a way to identify the format
> being used.
>
> This version incorporates Peter's suggestion to hash the ORC definitions
> instead of requiring a manual version number; this is easier to maintain
> and more resilient to backports.
>
> I would love to get this in before 6.4 is released, and then hopefully
> backport it to 6.3-stable.
So we're fairly late in the cycle and it would need justification to go
into objtool/urgent -- preferably only fixes at this point.
But given we 'broke' the ORC layout this cycle, we can mark this with
Fixes: for the two mentioned commits.
Josh?
On Wed, Jun 14, 2023 at 11:17:51AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 13, 2023 at 02:14:56PM -0700, Omar Sandoval wrote:
> > From: Omar Sandoval <osandov@fb.com>
> >
> > Commits ffb1b4a41016 ("x86/unwind/orc: Add 'signal' field to ORC
> > metadata") and fb799447ae29 ("x86,objtool: Split UNWIND_HINT_EMPTY in
> > two") changed the ORC format. Although ORC is internal to the kernel,
> > it's the only way for external tools to get reliable kernel stack traces
> > on x86-64. In particular, the drgn debugger [1] uses ORC for stack
> > unwinding, and these format changes broke it [2]. As the drgn
> > maintainer, I don't care how often or how much the kernel changes the
> > ORC format as long as I have a way to detect the change.
> >
> > It suffices to store a version identifier in the vmlinux and kernel
> > module ELF files (to use when parsing ORC sections from ELF), and in
> > kernel memory (to use when parsing ORC from a core dump+symbol table).
> > Rather than hard-coding a version number that needs to be manually
> > bumped, Peterz suggested hashing the definitions from orc_types.h. If
> > there is a format change that isn't caught by this, the hashing script
> > can be updated.
> >
> > This patch adds an .orc_header allocated ELF section containing the
> > 20-byte hash to vmlinux and kernel modules, along with the corresponding
> > __start_orc_header and __stop_orc_header symbols in vmlinux.
> >
> > 1: https://github.com/osandov/drgn
> > 2: https://github.com/osandov/drgn/issues/303
> >
> > Signed-off-by: Omar Sandoval <osandov@fb.com>
>
> Patch looks good to me; as a follow up I suppose we could verify the orc
> hash on module load, to ensure the module and main kernel agree on the
> ORC version used -- but we can do that later.
>
> > ---
> > Hi,
> >
> > This is v2 of my patch to make it possible for external tools like drgn
> > to identify versions of the ORC format. As stated in v1 [1], I don't
> > want ORC to be stable ABI; I just need a way to identify the format
> > being used.
> >
> > This version incorporates Peter's suggestion to hash the ORC definitions
> > instead of requiring a manual version number; this is easier to maintain
> > and more resilient to backports.
> >
> > I would love to get this in before 6.4 is released, and then hopefully
> > backport it to 6.3-stable.
>
> So we're fairly late in the cycle and it would need justification to go
> into objtool/urgent -- preferably only fixes at this point.
>
> But given we 'broke' the ORC layout this cycle, we can mark this with
> Fixes: for the two mentioned commits.
>
> Josh?
Ping, Josh, any chance of getting this in to 6.4? Sorry to be cutting it
so close.
On Tue, Jun 20, 2023 at 09:50:00AM -0700, Omar Sandoval wrote:
> > So we're fairly late in the cycle and it would need justification to go
> > into objtool/urgent -- preferably only fixes at this point.
> >
> > But given we 'broke' the ORC layout this cycle, we can mark this with
> > Fixes: for the two mentioned commits.
> >
> > Josh?
>
> Ping, Josh, any chance of getting this in to 6.4? Sorry to be cutting it
> so close.
Sorry, I had acked this privately and Peter queued it last week, but it
may have slipped through the cracks. He may still try to get it into 6.4.
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -305,6 +305,18 @@ ifeq ($(RETPOLINE_CFLAGS),)
 endif
 endif

+ifdef CONFIG_UNWINDER_ORC
+orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
+orc_hash_sh := $(srctree)/scripts/orc_hash.sh
+targets += $(orc_hash_h)
+quiet_cmd_orc_hash = GEN $@
+      cmd_orc_hash = mkdir -p $(dir $@); \
+		     $(CONFIG_SHELL) $(orc_hash_sh) < $< > $@
+$(orc_hash_h): $(srctree)/arch/x86/include/asm/orc_types.h $(orc_hash_sh) FORCE
+	$(call if_changed,orc_hash)
+archprepare: $(orc_hash_h)
+endif
+
 archclean:
 	$(Q)rm -rf $(objtree)/arch/i386
 	$(Q)rm -rf $(objtree)/arch/x86_64
--- a/arch/x86/include/asm/Kbuild
+++ b/arch/x86/include/asm/Kbuild
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0


+generated-y += orc_hash.h
 generated-y += syscalls_32.h
 generated-y += syscalls_64.h
 generated-y += syscalls_x32.h
new file mode 100644
--- /dev/null
+++ b/arch/x86/include/asm/orc_header.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* Copyright (c) Meta Platforms, Inc. and affiliates. */
+
+#ifndef _ORC_HEADER_H
+#define _ORC_HEADER_H
+
+#include <linux/types.h>
+#include <linux/compiler.h>
+#include <asm/orc_hash.h>
+
+/*
+ * The header is currently a 20-byte hash of the ORC entry definition; see
+ * scripts/orc_hash.sh.
+ */
+#define ORC_HEADER \
+	__used __section(".orc_header") __aligned(4) \
+	static const u8 orc_header[] = { ORC_HASH }
+
+#endif /* _ORC_HEADER_H */
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -7,6 +7,9 @@
 #include <asm/unwind.h>
 #include <asm/orc_types.h>
 #include <asm/orc_lookup.h>
+#include <asm/orc_header.h>
+
+ORC_HEADER;

 #define orc_warn(fmt, ...) \
 	printk_deferred_once(KERN_WARNING "WARNING: " fmt, ##__VA_ARGS__)
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -839,6 +839,9 @@

 #ifdef CONFIG_UNWINDER_ORC
 #define ORC_UNWIND_TABLE \
+	.orc_header : AT(ADDR(.orc_header) - LOAD_OFFSET) { \
+		BOUNDED_SECTION_BY(.orc_header, _orc_header) \
+	} \
 	. = ALIGN(4); \
 	.orc_unwind_ip : AT(ADDR(.orc_unwind_ip) - LOAD_OFFSET) { \
 		BOUNDED_SECTION_BY(.orc_unwind_ip, _orc_unwind_ip) \
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1979,6 +1979,11 @@ static void add_header(struct buffer *b, struct module *mod)
 	buf_printf(b, "#include <linux/vermagic.h>\n");
 	buf_printf(b, "#include <linux/compiler.h>\n");
 	buf_printf(b, "\n");
+	buf_printf(b, "#ifdef CONFIG_UNWINDER_ORC\n");
+	buf_printf(b, "#include <asm/orc_header.h>\n");
+	buf_printf(b, "ORC_HEADER;\n");
+	buf_printf(b, "#endif\n");
+	buf_printf(b, "\n");
 	buf_printf(b, "BUILD_SALT;\n");
 	buf_printf(b, "BUILD_LTO_INFO;\n");
 	buf_printf(b, "\n");
new file mode 100755
--- /dev/null
+++ b/scripts/orc_hash.sh
@@ -0,0 +1,16 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Copyright (c) Meta Platforms, Inc. and affiliates.
+
+set -e
+
+printf '%s' '#define ORC_HASH '
+
+awk '
+/^#define ORC_(REG|TYPE)_/ { print }
+/^struct orc_entry {$/ { p=1 }
+p { print }
+/^}/ { p=0 }' |
+ sha1sum |
+ cut -d " " -f 1 |
+ sed 's/\([0-9a-f]\{2\}\)/0x\1,/g'
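
As a cross-check on what the script above hashes, the same computation can be sketched in Python. This mirrors the awk | sha1sum | sed pipeline under the assumption that orc_types.h uses plain "\n" line endings; the function names are hypothetical, and the shell script remains the authoritative version.

```python
import hashlib
import re

# Same selection rule as the awk program: ORC_REG_*/ORC_TYPE_* defines,
# plus everything from "struct orc_entry {" through the closing "}" line.
DEFINE_RE = re.compile(r"^#define ORC_(REG|TYPE)_")


def orc_hash_input(header_text: str) -> str:
    """Return the exact text that the awk stage would feed to sha1sum."""
    selected = []
    in_struct = False
    for line in header_text.splitlines():
        if DEFINE_RE.search(line):
            selected.append(line)
        if line == "struct orc_entry {":
            in_struct = True
        if in_struct:
            selected.append(line)
        if line.startswith("}"):
            in_struct = False
    # awk's print appends a newline to every record.
    return "".join(line + "\n" for line in selected)


def orc_hash(header_text: str) -> bytes:
    """Return the 20-byte SHA-1 digest that ends up in .orc_header."""
    return hashlib.sha1(orc_hash_input(header_text).encode()).digest()


def orc_hash_define(header_text: str) -> str:
    """Format the digest the way the generated orc_hash.h line looks."""
    body = "".join(f"0x{byte:02x}," for byte in orc_hash(header_text))
    return "#define ORC_HASH " + body
```

Because only the selected lines are hashed, whitespace or comment changes inside struct orc_entry also change the hash, while changes elsewhere in orc_types.h do not. That matches the commit message's caveat that the script may need updating if a format change slips past it.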