tracepoint: Allow livepatch module add trace event

Message ID 20221102160236.11696-1-iecedge@gmail.com
State New
Series tracepoint: Allow livepatch module add trace event

Commit Message

Jianlin Lv Nov. 2, 2022, 4:02 p.m. UTC
When the system must keep running, the preferred method for tracing
the kernel is dynamic tracing (kprobes), but the drawback of this
method is that events can be lost, especially when tracing packets
in the network stack.

Livepatching provides a potential solution: reimplement the function
of interest and insert a static tracepoint into the replacement.
In this way, custom, stable static tracepoints can be added without
rebooting the system.

Signed-off-by: Jianlin Lv <iecedge@gmail.com>
---
 kernel/tracepoint.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
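
As a rough illustration of the approach the commit message describes, a
livepatch module that replaces a function with a copy containing a static
tracepoint could be structured along the lines of the upstream livepatch
sample. All names below (my_subsys_trace.h, trace_my_subsys_event,
orig_function) are hypothetical, and the tracepoint is assumed to be defined
elsewhere with the usual TRACE_EVENT()/CREATE_TRACE_POINTS machinery; this is
a sketch, not part of the submitted patch.

/* Hypothetical livepatch module that replaces orig_function() with a copy
 * that fires a custom static tracepoint.
 */
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/livepatch.h>

#define CREATE_TRACE_POINTS
#include "my_subsys_trace.h"	/* TRACE_EVENT(my_subsys_event, ...) lives here */

/* Replacement function: same behaviour as the original, plus a tracepoint
 * at the point of interest. */
static int livepatch_orig_function(int arg)
{
	trace_my_subsys_event(arg);
	/* ... re-implementation of the original function body ... */
	return 0;
}

static struct klp_func funcs[] = {
	{
		.old_name = "orig_function",
		.new_func = livepatch_orig_function,
	},
	{ }
};

static struct klp_object objs[] = {
	{
		/* name == NULL means the patched function lives in vmlinux */
		.funcs = funcs,
	},
	{ }
};

static struct klp_patch patch = {
	.mod = THIS_MODULE,
	.objs = objs,
};

static int livepatch_trace_demo_init(void)
{
	return klp_enable_patch(&patch);
}

static void livepatch_trace_demo_exit(void)
{
}

module_init(livepatch_trace_demo_init);
module_exit(livepatch_trace_demo_exit);
MODULE_LICENSE("GPL");
MODULE_INFO(livepatch, "Y");

Loading such a module sets TAINT_LIVEPATCH, which is exactly the taint bit the
patch below adds to the allowed list in trace_module_has_bad_taint().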
  

Comments

Steven Rostedt Nov. 14, 2022, 5:20 p.m. UTC | #1
On Wed,  2 Nov 2022 16:02:36 +0000
Jianlin Lv <iecedge@gmail.com> wrote:

> When the system must keep running, the preferred method for tracing
> the kernel is dynamic tracing (kprobes), but the drawback of this
> method is that events can be lost, especially when tracing packets
> in the network stack.
> 
> Livepatching provides a potential solution: reimplement the function
> of interest and insert a static tracepoint into the replacement.
> In this way, custom, stable static tracepoints can be added without
> rebooting the system.

Well that's definitely one way to implement dynamic trace events! :-D

-- Steve

> 
> Signed-off-by: Jianlin Lv <iecedge@gmail.com>
> ---
>  kernel/tracepoint.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
> index f23144af5743..8d1507dd0724 100644
> --- a/kernel/tracepoint.c
> +++ b/kernel/tracepoint.c
> @@ -571,8 +571,8 @@ static void for_each_tracepoint_range(
>  bool trace_module_has_bad_taint(struct module *mod)
>  {
>  	return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP) |
> -			       (1 << TAINT_UNSIGNED_MODULE) |
> -			       (1 << TAINT_TEST));
> +				(1 << TAINT_UNSIGNED_MODULE) | (1 << TAINT_TEST) |
> +				(1 << TAINT_LIVEPATCH));
>  }
>  
>  static BLOCKING_NOTIFIER_HEAD(tracepoint_notify_list);
  
Steven Rostedt Nov. 14, 2022, 5:23 p.m. UTC | #2
On Wed,  2 Nov 2022 16:02:36 +0000
Jianlin Lv <iecedge@gmail.com> wrote:

> When the system must keep running, the preferred method for tracing
> the kernel is dynamic tracing (kprobes), but the drawback of this
> method is that events can be lost, especially when tracing packets
> in the network stack.

I'm not against this change, but the above is where I'm a bit confused. How
are events more likely to be lost with kprobes over a static event?

-- Steve
  
Jianlin Lv Nov. 15, 2022, 2:38 a.m. UTC | #3
On Tue, Nov 15, 2022 at 1:22 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed,  2 Nov 2022 16:02:36 +0000
> Jianlin Lv <iecedge@gmail.com> wrote:
>
> > When the system must keep running, the preferred method for tracing
> > the kernel is dynamic tracing (kprobes), but the drawback of this
> > method is that events can be lost, especially when tracing packets
> > in the network stack.
>
> I'm not against this change, but the above is where I'm a bit confused. How
> are events more likely to be lost with kprobes over a static event?

We have encountered a case of kprobes missing events; detailed
information can be found at the following link:
https://github.com/iovisor/bcc/issues/4198

After replacing the kprobe with 'bpf + raw tracepoint', no events were missed.

> -- Steve
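
For context, 'bpf + raw tracepoint' here typically means a libbpf-style
program attached to the raw tracepoint. A minimal sketch, assuming a
hypothetical tracepoint named my_subsys_event:

/* Minimal BPF program attached to a raw tracepoint (built with
 * clang -target bpf and loaded via libbpf); the tracepoint name is
 * hypothetical. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

char LICENSE[] SEC("license") = "GPL";

SEC("raw_tracepoint/my_subsys_event")
int handle_my_subsys_event(struct bpf_raw_tracepoint_args *ctx)
{
	/* ctx->args[] carries the raw tracepoint arguments */
	bpf_printk("my_subsys_event: arg0=%llu", ctx->args[0]);
	return 0;
}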
  
Steven Rostedt Nov. 15, 2022, 3:02 a.m. UTC | #4
On Tue, 15 Nov 2022 10:38:34 +0800
Jianlin Lv <iecedge@gmail.com> wrote:

> On Tue, Nov 15, 2022 at 1:22 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > On Wed,  2 Nov 2022 16:02:36 +0000
> > Jianlin Lv <iecedge@gmail.com> wrote:
> >  
> > > When the system must keep running, the preferred method for tracing
> > > the kernel is dynamic tracing (kprobes), but the drawback of this
> > > method is that events can be lost, especially when tracing packets
> > > in the network stack.
> >
> > I'm not against this change, but the above is where I'm a bit confused. How
> > are events more likely to be lost with kprobes over a static event?  
> 
> We have encountered a case of kprobes missing events; detailed
> information can be found at the following link:
> https://github.com/iovisor/bcc/issues/4198
> 
> After replacing the kprobe with 'bpf + raw tracepoint', no events were missed.
> 

Masami,

What's the reason that kprobes are not re-entrant when using ftrace?

-- Steve
  
Masami Hiramatsu (Google) Nov. 15, 2022, 3:07 p.m. UTC | #5
On Mon, 14 Nov 2022 22:02:16 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 15 Nov 2022 10:38:34 +0800
> Jianlin Lv <iecedge@gmail.com> wrote:
> 
> > On Tue, Nov 15, 2022 at 1:22 AM Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > On Wed,  2 Nov 2022 16:02:36 +0000
> > > Jianlin Lv <iecedge@gmail.com> wrote:
> > >  
> > > > When the system must keep running, the preferred method for tracing
> > > > the kernel is dynamic tracing (kprobes), but the drawback of this
> > > > method is that events can be lost, especially when tracing packets
> > > > in the network stack.
> > >
> > > I'm not against this change, but the above is where I'm a bit confused. How
> > > are events more likely to be lost with kprobes over a static event?  
> > 
> > We have encountered a case of kprobes missing events; detailed
> > information can be found at the following link:
> > https://github.com/iovisor/bcc/issues/4198
> > 
> > After replacing the kprobe with 'bpf + raw tracepoint', no events were missed.
> > 
> 
> Masami,
> 
> What's the reason that kprobes are not re-entrant when using ftrace?

I think we had discussed this issue when I dropped the irq_disable() from the
kprobe ftrace handler on x86; see commit a19b2e3d7839 ("kprobes/x86:
Remove IRQ disabling from ftrace-based/optimized kprobes").

Anyway, kprobes itself is not re-entrant (and has no need to be re-entrant
when using int3) because it uses a per-cpu variable to remember the
currently running kprobe while processing the int3 handling and the
single-step (trap) handling, so that it can return to the correct path
safely. It also has a single-stage "backup" (see save_previous_kprobe())
for unexpectedly re-entrant kprobes (e.g. calling a probed function from
a kprobe user handler).

Thus the kprobe user doesn't need to write re-entrant handler code.
Since a kprobe on a function entry is transparently converted to use ftrace,
we have to keep this limitation for kprobes on ftrace.

BTW, kprobe_ftrace_handler() now uses ftrace_test_recursion_trylock()
to avoid ftrace recursion; is that OK for this case?

Thank you,
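
The missed events reported above map onto the behaviour Masami describes: the
ftrace-based kprobe handler refuses same-context recursion and only counts,
rather than traces, a kprobe that fires while another kprobe is already being
handled on the same CPU. A simplified sketch of that path, modelled on the
upstream kprobe_ftrace_handler() with error handling and the actual pre/post
handler calls elided:

/* Simplified sketch of the ftrace-based kprobe entry path; modelled on
 * the upstream kprobe_ftrace_handler(), with most details elided. */
#include <linux/kprobes.h>
#include <linux/ftrace.h>

void kprobe_ftrace_handler_sketch(unsigned long ip, unsigned long parent_ip,
				  struct ftrace_ops *ops,
				  struct ftrace_regs *fregs)
{
	struct kprobe *p;
	int bit;

	/* Same-context recursion (e.g. normal -> normal) is refused here,
	 * so a nested hit in the same context is dropped outright. */
	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;

	p = get_kprobe((kprobe_opcode_t *)ip);
	if (unlikely(!p) || kprobe_disabled(p))
		goto out;

	if (kprobe_running()) {
		/* Another kprobe is already being handled on this CPU:
		 * the hit is only accounted as "missed", not traced. */
		kprobes_inc_nmissed_count(p);
	} else {
		/* set the current kprobe, call the pre/post handlers, ... */
	}
out:
	ftrace_test_recursion_unlock(bit);
}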
  
Steven Rostedt Nov. 15, 2022, 3:18 p.m. UTC | #6
On Wed, 16 Nov 2022 00:07:07 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> BTW, kprobe_ftrace_handler() now uses ftrace_test_recursion_trylock()
> to avoid ftrace recursion; is that OK for this case?

Note, the ftrace_test_recursion_trylock() only prevents "same context"
recursion. That is, it will not let normal context recurse into normal
context, or interrupt context recurse into interrupt context.

It works by breaking contexts up into 4 levels:

 1. normal
 2. softirq
 3. irq
 4. NMI

It allows higher levels to recurse into lower levels
(e.g. irq context into normal context).

Thus, the code within the ftrace_test_recursion_trylock() must itself be
re-entrant to handle being called from different contexts.

-- Steve
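
Put differently, any callback that takes the trylock must still tolerate
being entered from a higher context while a lower-context invocation is in
flight. A hypothetical callback following that contract (the ftrace_ops
registration and filter setup are omitted):

/* Hypothetical ftrace callback honouring the recursion-trylock contract. */
#include <linux/ftrace.h>

static void my_trace_callback(unsigned long ip, unsigned long parent_ip,
			      struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	int bit;

	bit = ftrace_test_recursion_trylock(ip, parent_ip);
	if (bit < 0)
		return;	/* same-context recursion refused */

	/* The body may still run from irq/NMI context on top of a
	 * normal-context invocation, so it must be re-entrant:
	 * no unguarded global state here. */

	ftrace_test_recursion_unlock(bit);
}

static struct ftrace_ops my_ops = {
	.func = my_trace_callback,
};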
  
Jianlin Lv Dec. 23, 2022, 4:52 a.m. UTC | #7
On Tue, Nov 15, 2022 at 11:17 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 16 Nov 2022 00:07:07 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > BTW, kprobe_ftrace_handler() now uses ftrace_test_recursion_trylock()
> > to avoid ftrace recursion; is that OK for this case?
>
> Note, the ftrace_test_recursion_trylock() only prevents "same context"
> recursion. That is, it will not let normal context recurse into normal
> context, or interrupt context recurse into interrupt context.
>
> It works by breaking contexts up into 4 levels:
>
>  1. normal
>  2. softirq
>  3. irq
>  4. NMI
>
> It allows higher levels to recurse into lower levels
> (e.g. irq context into normal context).
>
> Thus, the code within the ftrace_test_recursion_trylock() must itself be
> re-entrant to handle being called from different contexts.
>
> -- Steve

hi, Steve
Any other comments for code changes?
Is it possible for this patch to be merged?

Regards,
Jianlin
  
Steven Rostedt Dec. 23, 2022, 5:08 a.m. UTC | #8
On Fri, 23 Dec 2022 12:52:18 +0800
Jianlin Lv <iecedge@gmail.com> wrote:

> hi, Steve
> Any other comments for code changes?
> Is it possible for this patch to be merged?

Ah, I had it marked as waiting for a reply. But I think we got
sidetracked in the discussion.

Anyway, this is a trivial patch, I think I can get it in during -rc1.

-- Steve
  
Steven Rostedt Feb. 17, 2023, 1:47 a.m. UTC | #9
On Fri, 23 Dec 2022 00:08:08 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 23 Dec 2022 12:52:18 +0800
> Jianlin Lv <iecedge@gmail.com> wrote:
> 
> > hi, Steve
> > Any other comments for code changes?
> > Is it possible for this patch to be merged?  
> 
> Ah, I had it marked as waiting for a reply. But I think we got
> sidetracked in the discussion.
> 
> Anyway, this is a trivial patch, I think I can get it in during -rc1.
> 

And it appears that due to the Christmas holidays, I dropped the patch.

I'm adding it to the queue now.

-- Steve
  

Patch

diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index f23144af5743..8d1507dd0724 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -571,8 +571,8 @@  static void for_each_tracepoint_range(
 bool trace_module_has_bad_taint(struct module *mod)
 {
 	return mod->taints & ~((1 << TAINT_OOT_MODULE) | (1 << TAINT_CRAP) |
-			       (1 << TAINT_UNSIGNED_MODULE) |
-			       (1 << TAINT_TEST));
+				(1 << TAINT_UNSIGNED_MODULE) | (1 << TAINT_TEST) |
+				(1 << TAINT_LIVEPATCH));
 }
 
 static BLOCKING_NOTIFIER_HEAD(tracepoint_notify_list);