[-tip] x86/kprobes: Handle removed INT3 in do_int3()
Commit Message
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Since x86 doesn't use stop_machine() to patch the kernel text,
there is a small chance that the another CPU removes the INT3
during do_int3(). In this case, if no INT3 notifier callbacks
handled that, the kernel calls die() because of a stray INT3.
Currently this is checked and recovered in the kprobe_int3_handler(),
but this is wrong because;
- If CONFIG_KPROBES is not set, kernel does not handle this case.
- After kprobe_int3_handler() ignores that INT3, that can be
removed before notify_die(DIE_INT3). And if the callback misses
it, kernel dies.
- It skips the INT3 notifier callbacks if the INT3 is NOT managed
by the kprobes. Another callback may be able to handle it.
Thus, move the removed INT3 recovering code to do_int3(),
after calling all callbacks.
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
arch/x86/kernel/kprobes/core.c | 15 +--------------
arch/x86/kernel/traps.c | 15 +++++++++++++++
2 files changed, 16 insertions(+), 14 deletions(-)
Comments
On Fri, Nov 25, 2022 at 10:09:02AM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Since x86 doesn't use stop_machine() to patch the kernel text,
> there is a small chance that the another CPU removes the INT3
> during do_int3(). In this case, if no INT3 notifier callbacks
> handled that, the kernel calls die() because of a stray INT3.
Please clarify; how would that happen? Should not everybody modifying
text take text_mutex ?
On Fri, 25 Nov 2022 08:41:19 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, Nov 25, 2022 at 10:09:02AM +0900, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Since x86 doesn't use stop_machine() to patch the kernel text,
> > there is a small chance that the another CPU removes the INT3
> > during do_int3(). In this case, if no INT3 notifier callbacks
> > handled that, the kernel calls die() because of a stray INT3.
>
> Please clarify; how would that happen? Should not everybody modifying
> text take text_mutex ?
The text_mutex doesn't stop executing do_int3() since do_int3() is
an exception and must not be blocked. That mutex is only blocking
the other kernel text modifiers, not INT3 handling.
If there is only kprobe using INT3, this must not happen because
kprobe_int3_handler() always find a struct kprobe corresponding
to the INT3 address. (from this reason, the current code is wrong too)
However, if there are other INT3 callbacks (e.g. alternatives and
ftrace via text_poke_bp*()) managing the INT3, this can happen.
The possible scenario is here;
1. CPU0 hits an INT3 which is managed by text_poke_bp().
2. CPU1 removes the INT3 from the text and *start* calling
text_poke_sync(). (note that text_poke_sync() will sync core to
serialize pipeline, thus the memory and dcache already updated)
3. CPU0 calls kprobe_int3_handler(), but it can not find the
corresponding kprobe from the address. Thus it reads the instruction
at the address, and find it is not INT3 anymore.
4. CPU0's kprobe_int3_handler() sets the address to the regs->ip,
and returns 1, which means "this INT3 is handled".
5. CPU0 returns from do_int3() and exit the exception, then it
handles the do_sync_core() via IPI.
6. CPU1 finish the text_poke_sync().
This works fine, but it is not handled by the INT3 owner's callback.
Oh! maybe we don't need to handle remove INT3 case, because all
callback MUST handle its own INT3. This can be done simply using
text_poke_sync() because it use an IPI, and the IPI is not handled
until all running INT3 handlers exit.
OK, let me update the patch to just drop the removed INT3 handling
from kprobes.
Thank you,
@@ -987,20 +987,7 @@ int kprobe_int3_handler(struct pt_regs *regs)
return 1;
}
}
-
- if (*addr != INT3_INSN_OPCODE) {
- /*
- * The breakpoint instruction was removed right
- * after we hit it. Another cpu has removed
- * either a probepoint or a debugger breakpoint
- * at this address. In either case, no further
- * handling of this interrupt is appropriate.
- * Back up over the (now missing) int3 and run
- * the original instruction.
- */
- regs->ip = (unsigned long)addr;
- return 1;
- } /* else: not a kprobe fault; let the kernel handle it */
+ /* This may not a kprobe fault; let the kernel handle it */
return 0;
}
@@ -788,6 +788,7 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
static bool do_int3(struct pt_regs *regs)
{
+ unsigned long addr = instruction_pointer(regs) - INT3_INSN_SIZE;
int res;
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
@@ -802,6 +803,20 @@ static bool do_int3(struct pt_regs *regs)
#endif
res = notify_die(DIE_INT3, "int3", regs, 0, X86_TRAP_BP, SIGTRAP);
+ if (unlikely(res != NOTIFY_STOP)) {
+ if (*(u8 *)addr != INT3_INSN_OPCODE) {
+ /*
+ * Another CPU removed the INT3 instruction before
+ * callbacks handle it. This is not a stray INT3
+ * but recoverable.
+ * Back up over the (now missing) INT3 and run
+ * the original instruction.
+ */
+ regs->ip = addr;
+ return true;
+ }
+ }
+
return res == NOTIFY_STOP;
}
NOKPROBE_SYMBOL(do_int3);