[v2] x86/mm: Do not verify W^X at boot up on ftrace trampolines
Commit Message
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Adding on the kernel command line "ftrace=function" triggered:
------------[ cut here ]------------
CPA detected W^X violation: 8000000000000063 -> 0000000000000063 range:
0xffffffffc0013000 - 0xffffffffc0013fff PFN 10031b
WARNING: CPU: 0 PID: 0 at arch/x86/mm/pat/set_memory.c:609
verify_rwx+0x61/0x6d
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc1-test+ #3
Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
RIP: 0010:verify_rwx+0x61/0x6d
Code: e5 01 00 75 27 49 c1 e0 0c 48 89 d1 48 89 fe 48 c7 c7 5b b3 92 84 4e
8d 44 02 ff 48 89 da c6 05 71 29 e5 01 01 e8 35 90 e2 00 <0f> 0b 48 89 d8
5b 5d e9 6f 95 1a 01 0f 1f 44 00 00 55 48 89 e5 53
RSP: 0000:ffffffff84c03b08 EFLAGS: 00010086
RAX: 0000000000000000 RBX: 0000000000000063 RCX: 0000000000000003
RDX: 0000000000000003 RSI: ffffffff84c039b0 RDI: 0000000000000001
RBP: ffffffff84c03b10 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000025 R12: ffff8e730031c098
R13: 000000000010031b R14: 800000010031b063 R15: 8000000000000063
FS: 0000000000000000(0000) GS:ffff8e7416a00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff8e73fd801000 CR3: 00000001fcc22001 CR4: 00000000000606f0
Call Trace:
<TASK>
__change_page_attr_set_clr+0x146/0x8a6
? __mutex_unlock_slowpath+0x41/0x213
? mutex_unlock+0x12/0x18
? _vm_unmap_aliases+0x126/0x136
change_page_attr_set_clr+0x135/0x268
? find_vmap_area+0x32/0x3e
? __fentry__+0x10/0x10
change_page_attr_clear.constprop.0+0x16/0x1c
set_memory_x+0x2c/0x32
arch_ftrace_update_trampoline+0x218/0x2db
? ftrace_caller_op_ptr+0x17/0x17
ftrace_update_trampoline+0x16/0xa1
? tracing_gen_ctx+0x1c/0x1c
__register_ftrace_function+0x93/0xb2
ftrace_startup+0x21/0xf0
? tracing_gen_ctx+0x1c/0x1c
register_ftrace_function_nolock+0x26/0x40
register_ftrace_function+0x4e/0x143
? mutex_unlock+0x12/0x18
? tracing_gen_ctx+0x1c/0x1c
function_trace_init+0x7d/0xc3
tracer_init+0x23/0x2c
tracing_set_tracer+0x1d5/0x206
register_tracer+0x1c0/0x1e4
init_function_trace+0x90/0x96
early_trace_init+0x25c/0x352
start_kernel+0x424/0x6e4
x86_64_start_reservations+0x24/0x2a
x86_64_start_kernel+0x8c/0x95
secondary_startup_64_no_verify+0xe0/0xeb
</TASK>
---[ end trace 0000000000000000 ]---
This is because at boot up, kernel text is writable, and there's no reason
to do tricks to updated it. But the verifier does not distinguish updates
at boot up and at run time, and causes a warning at time of boot.
Add a check for system_state == SYSTEM_BOOTING and allow it if that is the
case and the address happens to be the ftrace trampoline.
Fixes: 652c5bf380ad0 ("x86/mm: Refuse W^X violations")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/r/20221024112730.180916b3@gandalf.local.home
- Hack in a check to only ignore this check if it is early boot *and* the
address being updated is the ftrace trampoline that is being created.
(Dave Hansen)
- Note, this is only needed if we can't come up with a proper way to
be able to call trace_poke() at early boot up. See the discussion on v1.
arch/x86/include/asm/ftrace.h | 2 ++
arch/x86/kernel/ftrace.c | 4 ++++
arch/x86/mm/pat/set_memory.c | 8 ++++++++
3 files changed, 14 insertions(+)
Comments
On Mon, Oct 24, 2022 at 3:25 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
>
> Adding on the kernel command line "ftrace=function" triggered:
This one I *really* detest.
If we have to have a special case, make it just be the simple
"system_state == SYSTEM_BOOTING", don't make it even nastier.
Special cases are bad. Making them these kinds of "this is
super-magical and special" is even worse.
Linus
On Mon, 24 Oct 2022 16:58:58 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, Oct 24, 2022 at 3:25 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> >
> > Adding on the kernel command line "ftrace=function" triggered:
>
> This one I *really* detest.
>
> If we have to have a special case, make it just be the simple
> "system_state == SYSTEM_BOOTING", don't make it even nastier.
>
> Special cases are bad. Making them these kinds of "this is
> super-magical and special" is even worse.
>
I was just trying to narrow down the one special case that allows the
one exception to get pass the security check. It's what they taught me in
security school. Only allow what you know is allowable and block everything
else.
But anyway, I'll just let anyone take the v1 patch. I'll add it to my tests
so that I can at least get my tests to finish.
-- Steve
On Mon, Oct 24, 2022 at 5:24 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> But anyway, I'll just let anyone take the v1 patch. I'll add it to my tests
> so that I can at least get my tests to finish.
Yeah, the v1 patch is the best option for now, and I'll just apply it
to my tree too.
I really hope that poking_mm creation can be cleaned up.
Linus
@@ -25,6 +25,8 @@
#ifndef __ASSEMBLY__
extern void __fentry__(void);
+extern long ftrace_updated_trampoline;
+
static inline unsigned long ftrace_call_adjust(unsigned long addr)
{
/*
@@ -417,7 +417,11 @@ create_trampoline(struct ftrace_ops *ops, unsigned int *tramp_size)
if (likely(system_state != SYSTEM_BOOTING))
set_memory_ro((unsigned long)trampoline, npages);
+ else
+ ftrace_updated_trampoline = trampoline;
set_memory_x((unsigned long)trampoline, npages);
+ ftrace_updated_trampoline = 0;
+
return (unsigned long)trampoline;
fail:
tramp_free(trampoline);
@@ -32,6 +32,7 @@
#include <asm/memtype.h>
#include <asm/hyperv-tlfs.h>
#include <asm/mshyperv.h>
+#include <asm/ftrace.h>
#include "../mm_internal.h"
@@ -579,6 +580,9 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long start,
return __pgprot(pgprot_val(prot) & ~forbidden);
}
+/* Store at early bootup the ftrace trampoline */
+long ftrace_updated_trampoline;
+
/*
* Validate strict W^X semantics.
*/
@@ -587,6 +591,10 @@ static inline pgprot_t verify_rwx(pgprot_t old, pgprot_t new, unsigned long star
{
unsigned long end;
+ /* Kernel text is rw at boot up and ftrace may call this */
+ if (system_state == SYSTEM_BOOTING && ftrace_updated_trampoline == start)
+ return new;
+
/*
* 32-bit has some unfixable W+X issues, like EFI code
* and writeable data being in the same page. Disable