tracing/eprobe: Update cond flag before enabling trigger

Message ID 20221116192552.1066630-1-rafaelmendsr@gmail.com
State New
Headers
Series tracing/eprobe: Update cond flag before enabling trigger |

Commit Message

Rafael Mendonca Nov. 16, 2022, 7:25 p.m. UTC
  The eprobe's trigger, for the event it is attached to, needs access to the
trace record in order to perform its function. This need is indicated by
the TRIGGER_COND flag and properly set by the update_cond_flag() function.
Currently, the cond flag is set after calling
trace_event_trigger_enable_disable(), which leaves a small time gap in
which the eprobe's trigger could be invoked without a trace record, causing
a NULL pointer dereference issue.

This issue can be easily reproduced by enabling an event probe for
kmem/mm_page_alloc using any of its fields as such:

root@localhost:/sys/kernel/tracing# echo 'e kmem/mm_page_alloc $gfp_flags' > dynamic_events
root@localhost:/sys/kernel/tracing# echo 1 > events/eprobes/enable
[   30.847000] general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI
[   30.848618] KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
[   30.849498] CPU: 0 PID: 223 Comm: bash Not tainted 6.1.0-rc5+ #98
[...] stripped
[   30.850623] RIP: 0010:get_event_field.isra.0+0x1b8/0x3f0
[...] stripped
[   30.856173] Call Trace:
[   30.856340]  <TASK>
[   30.856489]  process_fetch_insn+0x1289/0x19b0
[   30.856789]  ? write_comp_data+0x2f/0x90
[   30.857061]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   30.857379]  ? __trace_eprobe_create+0x1d30/0x1d30
[   30.857701]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   30.858019]  ? write_comp_data+0x2f/0x90
[   30.858289]  ? write_comp_data+0x2f/0x90
[   30.858562]  eprobe_trigger_func+0x1253/0x2180
[   30.858867]  ? kasan_poison.part.0+0x3c/0x50
[   30.859165]  ? process_fetch_insn+0x19b0/0x19b0
[   30.859473]  ? get_page_from_freelist+0x10c1/0x29c0
[   30.859806]  ? rcu_read_lock_bh_held+0xd0/0xd0
[   30.860114]  ? write_comp_data+0x2f/0x90
[   30.860387]  event_triggers_call+0x293/0x410
[   30.860683]  __trace_trigger_soft_disabled+0xbb/0xe0
[   30.861019]  trace_event_raw_event_mm_page_alloc+0xb5/0xc0
[   30.861389]  __alloc_pages+0x4cd/0x760
[   30.861651]  ? __alloc_pages_slowpath.constprop.0+0x2480/0x2480
[   30.862047]  ? tracepoint_probe_register_prio_may_exist+0x100/0x100
[   30.862464]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   30.862780]  ? write_comp_data+0x2f/0x90
[   30.863055]  trace_buffered_event_enable+0xc6/0x440
[   30.863381]  __ftrace_event_enable_disable+0x16a/0x880
[   30.863728]  trace_event_enable_disable+0x2a/0x40
[   30.864043]  trace_event_trigger_enable_disable.part.0+0x70/0x90
[   30.864446]  trace_event_trigger_enable_disable+0x31/0xc0
[   30.864806]  eprobe_register+0x49a/0xdf0
[   30.865080]  ? disable_eprobe+0x370/0x370
[   30.865360]  ? mutex_unlock+0x16/0x20
[   30.865618]  ? write_comp_data+0x2f/0x90
[   30.865897]  __ftrace_event_enable_disable+0x4c1/0x880
[   30.866247]  __ftrace_set_clr_event_nolock+0x297/0x390
[   30.866598]  __ftrace_set_clr_event+0x43/0x70
[   30.866901]  system_enable_write+0x1f7/0x2a0
[   30.867214]  ? event_enable_write+0x290/0x290
[   30.867521]  ? rcu_read_lock_held+0xc0/0xc0
[   30.867842]  ? write_comp_data+0x2f/0x90
[   30.868176]  ? write_comp_data+0x2f/0x90
[   30.868467]  vfs_write+0x311/0xe50
[   30.868752]  ? event_enable_write+0x290/0x290
[   30.869067]  ? kernel_write+0x5b0/0x5b0
[   30.869329]  ? lockdep_hardirqs_on_prepare+0x410/0x410
[   30.869675]  ? lock_is_held_type+0xaf/0x120
[   30.869954]  ? close_fd+0x6e/0xb0
[   30.870182]  ? write_comp_data+0x2f/0x90
[   30.870453]  ? __sanitizer_cov_trace_pc+0x25/0x60
[   30.870772]  ? write_comp_data+0x2f/0x90
[   30.871043]  ? write_comp_data+0x2f/0x90
[   30.871315]  ksys_write+0x158/0x2a0
[   30.871558]  ? __ia32_sys_read+0xc0/0xc0
[   30.871827]  ? syscall_enter_from_user_mode+0x25/0x70
[   30.872169]  ? do_syscall_64+0x3b/0x90
[   30.872432]  __x64_sys_write+0x7c/0xc0
[   30.872691]  ? syscall_enter_from_user_mode+0x25/0x70
[   30.873028]  do_syscall_64+0x60/0x90
[   30.873279]  ? syscall_exit_to_user_mode+0x4a/0x60
[   30.873597]  ? do_syscall_64+0x6d/0x90
[   30.873857]  ? fpregs_assert_state_consistent+0x90/0xf0
[   30.874214]  ? syscall_exit_to_user_mode+0x4a/0x60
[   30.874535]  ? do_syscall_64+0x6d/0x90
[   30.874794]  ? do_syscall_64+0x6d/0x90
[   30.875118]  ? do_syscall_64+0x6d/0x90
[   30.875379]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   30.875712] RIP: 0033:0x7f7cc14da8f3
[...] stripped
[   30.879875]  </TASK>
[...] stripped
[   30.881930] ---[ end trace 0000000000000000 ]---
[   30.882237] RIP: 0010:get_event_field.isra.0+0x1b8/0x3f0
[...] stripped
[   30.887734] note: bash[223] exited with preempt_count 2

That happens because enable_eprobe() will eventually trigger the
kmem/mm_page_alloc trace event:

- enable_eprobe [trace_eprobe.c]
  - trace_event_trigger_enable_disable [trace_events_trigger.c]
    - trace_event_enable_disable [trace_events.c]
      - __ftrace_event_enable_disable [trace_events.c]
        - trace_buffered_event_enable [trace.c]
          - alloc_pages_node [gfp.h]
	   ...
            - __alloc_pages [page_alloc.c]
              - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set

By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
does not have the TRIGGER_COND flag set yet, which causes the eprobe's
trigger to be invoked (through the trace_trigger_soft_disabled() path)
without a trace record, causing a NULL pointer dereference when fetching
the event fields.

Fix this by setting the cond flag beforehand when enabling the eprobe's
trigger.

Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
---
 kernel/trace/trace_eprobe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Steven Rostedt Nov. 18, 2022, 2:17 a.m. UTC | #1
On Wed, 16 Nov 2022 16:25:51 -0300
Rafael Mendonca <rafaelmendsr@gmail.com> wrote:

> That happens because enable_eprobe() will eventually trigger the
> kmem/mm_page_alloc trace event:
> 
> - enable_eprobe [trace_eprobe.c]
>   - trace_event_trigger_enable_disable [trace_events_trigger.c]
>     - trace_event_enable_disable [trace_events.c]
>       - __ftrace_event_enable_disable [trace_events.c]
>         - trace_buffered_event_enable [trace.c]
>           - alloc_pages_node [gfp.h]
> 	   ...
>             - __alloc_pages [page_alloc.c]
>               - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set
> 
> By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
> does not have the TRIGGER_COND flag set yet, which causes the eprobe's
> trigger to be invoked (through the trace_trigger_soft_disabled() path)
> without a trace record, causing a NULL pointer dereference when fetching
> the event fields.
> 
> Fix this by setting the cond flag beforehand when enabling the eprobe's
> trigger.
> 
> Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
> Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
> ---

Thanks for the report, but I'm worried that this isn't enough because of
how memory ordering can happen on different architectures. That is, just
because you switch the order of updates, doesn't mean that the architecture
will honor it.

I don't want to add memory barriers in the fast path, but instead we can
simply check if rec is NULL in the handler.

So basically:


static void eprobe_trigger_func(struct event_trigger_data *data,
				struct trace_buffer *buffer, void *rec,
				struct ring_buffer_event *rbe)
{
	struct eprobe_data *edata = data->private_data;

	if (!rec)
		return;

	__eprobe_trace_func(edata, rec);
}

And this should be documented.

-- Steve
  
Steven Rostedt Nov. 18, 2022, 2:31 a.m. UTC | #2
On Thu, 17 Nov 2022 21:17:26 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 16 Nov 2022 16:25:51 -0300
> Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> 
> > That happens because enable_eprobe() will eventually trigger the
> > kmem/mm_page_alloc trace event:
> > 
> > - enable_eprobe [trace_eprobe.c]
> >   - trace_event_trigger_enable_disable [trace_events_trigger.c]
> >     - trace_event_enable_disable [trace_events.c]
> >       - __ftrace_event_enable_disable [trace_events.c]
> >         - trace_buffered_event_enable [trace.c]
> >           - alloc_pages_node [gfp.h]
> > 	   ...
> >             - __alloc_pages [page_alloc.c]
> >               - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set
> > 
> > By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
> > does not have the TRIGGER_COND flag set yet, which causes the eprobe's
> > trigger to be invoked (through the trace_trigger_soft_disabled() path)
> > without a trace record, causing a NULL pointer dereference when fetching
> > the event fields.
> > 
> > Fix this by setting the cond flag beforehand when enabling the eprobe's
> > trigger.
> > 
> > Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
> > Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
> > ---  
> 
> Thanks for the report, but I'm worried that this isn't enough because of
> how memory ordering can happen on different architectures. That is, just
> because you switch the order of updates, doesn't mean that the architecture
> will honor it.
> 
> I don't want to add memory barriers in the fast path, but instead we can
> simply check if rec is NULL in the handler.
> 
> So basically:
> 
> 
> static void eprobe_trigger_func(struct event_trigger_data *data,
> 				struct trace_buffer *buffer, void *rec,
> 				struct ring_buffer_event *rbe)
> {
> 	struct eprobe_data *edata = data->private_data;
> 
> 	if (!rec)
> 		return;
> 
> 	__eprobe_trace_func(edata, rec);
> }
> 
> And this should be documented.
> 

In fact, does this fix it for you?

I'm going to take this patch and reference you as a reported-by, as I have
a lot of urgent code that needs to got upstream, and I need to start
testing it.

Thanks!

-- Steve

diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index 5dd0617e5df6..6b31b74954d9 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -563,6 +563,9 @@ static void eprobe_trigger_func(struct event_trigger_data *data,
 {
 	struct eprobe_data *edata = data->private_data;
 
+	if (!rec)
+		return;
+
 	__eprobe_trace_func(edata, rec);
 }
  
Rafael Mendonca Nov. 18, 2022, 12:38 p.m. UTC | #3
On Thu, Nov 17, 2022 at 09:17:26PM -0500, Steven Rostedt wrote:
> On Wed, 16 Nov 2022 16:25:51 -0300
> Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> 
> > That happens because enable_eprobe() will eventually trigger the
> > kmem/mm_page_alloc trace event:
> > 
> > - enable_eprobe [trace_eprobe.c]
> >   - trace_event_trigger_enable_disable [trace_events_trigger.c]
> >     - trace_event_enable_disable [trace_events.c]
> >       - __ftrace_event_enable_disable [trace_events.c]
> >         - trace_buffered_event_enable [trace.c]
> >           - alloc_pages_node [gfp.h]
> > 	   ...
> >             - __alloc_pages [page_alloc.c]
> >               - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set
> > 
> > By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
> > does not have the TRIGGER_COND flag set yet, which causes the eprobe's
> > trigger to be invoked (through the trace_trigger_soft_disabled() path)
> > without a trace record, causing a NULL pointer dereference when fetching
> > the event fields.
> > 
> > Fix this by setting the cond flag beforehand when enabling the eprobe's
> > trigger.
> > 
> > Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
> > Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
> > ---
> 
> Thanks for the report, but I'm worried that this isn't enough because of
> how memory ordering can happen on different architectures. That is, just
> because you switch the order of updates, doesn't mean that the architecture
> will honor it.
> 
> I don't want to add memory barriers in the fast path, but instead we can
> simply check if rec is NULL in the handler.

Ok. That was my first approach but then looking for the same pattern in
trace_events_trigger.c and trace_events_hist.c I saw commit 4e4a4d75700d
("tracing: Update cond flag when enabling or disabling a trigger") [1]
and I thought the same would apply here without much thought.

[1] https://lore.kernel.org/lkml/20151126022048.111221642@goodmis.org/

> 
> So basically:
> 
> 
> static void eprobe_trigger_func(struct event_trigger_data *data,
> 				struct trace_buffer *buffer, void *rec,
> 				struct ring_buffer_event *rbe)
> {
> 	struct eprobe_data *edata = data->private_data;
> 
> 	if (!rec)
> 		return;
> 
> 	__eprobe_trace_func(edata, rec);
> }
> 
> And this should be documented.
> 
> -- Steve
  
Rafael Mendonca Nov. 18, 2022, 12:40 p.m. UTC | #4
On Thu, Nov 17, 2022 at 09:31:09PM -0500, Steven Rostedt wrote:
> On Thu, 17 Nov 2022 21:17:26 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Wed, 16 Nov 2022 16:25:51 -0300
> > Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> > 
> > > That happens because enable_eprobe() will eventually trigger the
> > > kmem/mm_page_alloc trace event:
> > > 
> > > - enable_eprobe [trace_eprobe.c]
> > >   - trace_event_trigger_enable_disable [trace_events_trigger.c]
> > >     - trace_event_enable_disable [trace_events.c]
> > >       - __ftrace_event_enable_disable [trace_events.c]
> > >         - trace_buffered_event_enable [trace.c]
> > >           - alloc_pages_node [gfp.h]
> > > 	   ...
> > >             - __alloc_pages [page_alloc.c]
> > >               - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set
> > > 
> > > By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
> > > does not have the TRIGGER_COND flag set yet, which causes the eprobe's
> > > trigger to be invoked (through the trace_trigger_soft_disabled() path)
> > > without a trace record, causing a NULL pointer dereference when fetching
> > > the event fields.
> > > 
> > > Fix this by setting the cond flag beforehand when enabling the eprobe's
> > > trigger.
> > > 
> > > Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
> > > Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
> > > ---  
> > 
> > Thanks for the report, but I'm worried that this isn't enough because of
> > how memory ordering can happen on different architectures. That is, just
> > because you switch the order of updates, doesn't mean that the architecture
> > will honor it.
> > 
> > I don't want to add memory barriers in the fast path, but instead we can
> > simply check if rec is NULL in the handler.
> > 
> > So basically:
> > 
> > 
> > static void eprobe_trigger_func(struct event_trigger_data *data,
> > 				struct trace_buffer *buffer, void *rec,
> > 				struct ring_buffer_event *rbe)
> > {
> > 	struct eprobe_data *edata = data->private_data;
> > 
> > 	if (!rec)
> > 		return;
> > 
> > 	__eprobe_trace_func(edata, rec);
> > }
> > 
> > And this should be documented.
> > 
> 
> In fact, does this fix it for you?

It does. I found this while enabling eprobe for all events in my setup.
Doing the same test wih the proposed patch it did not trigger any issue.
Thanks.

> 
> I'm going to take this patch and reference you as a reported-by, as I have
> a lot of urgent code that needs to got upstream, and I need to start
> testing it.
> 
> Thanks!
> 
> -- Steve
> 
> diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
> index 5dd0617e5df6..6b31b74954d9 100644
> --- a/kernel/trace/trace_eprobe.c
> +++ b/kernel/trace/trace_eprobe.c
> @@ -563,6 +563,9 @@ static void eprobe_trigger_func(struct event_trigger_data *data,
>  {
>  	struct eprobe_data *edata = data->private_data;
>  
> +	if (!rec)
> +		return;
> +
>  	__eprobe_trace_func(edata, rec);
>  }
>  
>
  
Rafael Mendonca Nov. 18, 2022, 1:34 p.m. UTC | #5
On Fri, Nov 18, 2022 at 09:40:09AM -0300, Rafael Mendonca wrote:
> On Thu, Nov 17, 2022 at 09:31:09PM -0500, Steven Rostedt wrote:
> > On Thu, 17 Nov 2022 21:17:26 -0500
> > Steven Rostedt <rostedt@goodmis.org> wrote:
> > 
> > > On Wed, 16 Nov 2022 16:25:51 -0300
> > > Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> > > 
> > > > That happens because enable_eprobe() will eventually trigger the
> > > > kmem/mm_page_alloc trace event:
> > > > 
> > > > - enable_eprobe [trace_eprobe.c]
> > > >   - trace_event_trigger_enable_disable [trace_events_trigger.c]
> > > >     - trace_event_enable_disable [trace_events.c]
> > > >       - __ftrace_event_enable_disable [trace_events.c]
> > > >         - trace_buffered_event_enable [trace.c]
> > > >           - alloc_pages_node [gfp.h]
> > > > 	   ...
> > > >             - __alloc_pages [page_alloc.c]
> > > >               - trace_mm_page_alloc // eprobe event file without TRIGGER_COND bit set
> > > > 
> > > > By the time kmem/mm_page_alloc trace event is hit, the eprobe event file
> > > > does not have the TRIGGER_COND flag set yet, which causes the eprobe's
> > > > trigger to be invoked (through the trace_trigger_soft_disabled() path)
> > > > without a trace record, causing a NULL pointer dereference when fetching
> > > > the event fields.
> > > > 
> > > > Fix this by setting the cond flag beforehand when enabling the eprobe's
> > > > trigger.
> > > > 
> > > > Fixes: 7491e2c44278 ("tracing: Add a probe that attaches to trace events")
> > > > Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com>
> > > > ---  
> > > 
> > > Thanks for the report, but I'm worried that this isn't enough because of
> > > how memory ordering can happen on different architectures. That is, just
> > > because you switch the order of updates, doesn't mean that the architecture
> > > will honor it.
> > > 
> > > I don't want to add memory barriers in the fast path, but instead we can
> > > simply check if rec is NULL in the handler.
> > > 
> > > So basically:
> > > 
> > > 
> > > static void eprobe_trigger_func(struct event_trigger_data *data,
> > > 				struct trace_buffer *buffer, void *rec,
> > > 				struct ring_buffer_event *rbe)
> > > {
> > > 	struct eprobe_data *edata = data->private_data;
> > > 
> > > 	if (!rec)
> > > 		return;
> > > 
> > > 	__eprobe_trace_func(edata, rec);
> > > }
> > > 
> > > And this should be documented.
> > > 
> > 
> > In fact, does this fix it for you?
> 
> It does. I found this while enabling eprobe for all events in my setup.
> Doing the same test wih the proposed patch it did not trigger any issue.
> Thanks.

It did not trigger the NULL pointer issue to be more specific. When
creating event probe for all events I was unable to create any event for
the xhci-hcd system:

root@localhost:/sys/kernel/tracing# echo 'e xhci-hcd/xhci_add_endpoint' > dynamic_events 
-bash: echo: write error: Invalid argument

Debugging the issue it seems that the problem is in the is_good_name()
check, which returns false for "xhci-hcd". Should we sanitize it by
converting '-' into '_'?

> 
> > 
> > I'm going to take this patch and reference you as a reported-by, as I have
> > a lot of urgent code that needs to got upstream, and I need to start
> > testing it.
> > 
> > Thanks!
> > 
> > -- Steve
> > 
> > diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
> > index 5dd0617e5df6..6b31b74954d9 100644
> > --- a/kernel/trace/trace_eprobe.c
> > +++ b/kernel/trace/trace_eprobe.c
> > @@ -563,6 +563,9 @@ static void eprobe_trigger_func(struct event_trigger_data *data,
> >  {
> >  	struct eprobe_data *edata = data->private_data;
> >  
> > +	if (!rec)
> > +		return;
> > +
> >  	__eprobe_trace_func(edata, rec);
> >  }
> >  
> >
  
Steven Rostedt Nov. 18, 2022, 4:19 p.m. UTC | #6
On Fri, 18 Nov 2022 10:34:40 -0300
Rafael Mendonca <rafaelmendsr@gmail.com> wrote:

> It did not trigger the NULL pointer issue to be more specific. When
> creating event probe for all events I was unable to create any event for
> the xhci-hcd system:
> 
> root@localhost:/sys/kernel/tracing# echo 'e xhci-hcd/xhci_add_endpoint' > dynamic_events 
> -bash: echo: write error: Invalid argument
> 
> Debugging the issue it seems that the problem is in the is_good_name()
> check, which returns false for "xhci-hcd". Should we sanitize it by

Ouch. I didn't realize that.

> converting '-' into '_'?

Actually, it's just the system name that's an issue. I tested this patch
and it appears to work.

-- Steve


diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 54ee5711c729..a16fb4c9642e 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1955,17 +1955,30 @@ static __always_inline void trace_iterator_reset(struct trace_iterator *iter)
 }
 
 /* Check the name is good for event/group/fields */
-static inline bool is_good_name(const char *name)
+static inline bool __is_good_name(const char *name, bool hash_ok)
 {
-	if (!isalpha(*name) && *name != '_')
+	if (!isalpha(*name) && *name != '_' && (!hash_ok || *name != '-'))
 		return false;
 	while (*++name != '\0') {
-		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
+		if (!isalpha(*name) && !isdigit(*name) && *name != '_' &&
+		    (!hash_ok || *name != '-'))
 			return false;
 	}
 	return true;
 }
 
+/* Check the name is good for event/group/fields */
+static inline bool is_good_name(const char *name)
+{
+	return __is_good_name(name, false);
+}
+
+/* Check the name is good for system */
+static inline bool is_good_system_name(const char *name)
+{
+	return __is_good_name(name, true);
+}
+
 /* Convert certain expected symbols into '_' when generating event names */
 static inline void sanitize_event_name(char *name)
 {
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 36dff277de46..bb2f95d7175c 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -246,7 +246,7 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
 			return -EINVAL;
 		}
 		strlcpy(buf, event, slash - event + 1);
-		if (!is_good_name(buf)) {
+		if (!is_good_system_name(buf)) {
 			trace_probe_log_err(offset, BAD_GROUP_NAME);
 			return -EINVAL;
 		}
  
Masami Hiramatsu (Google) Nov. 23, 2022, 4:01 p.m. UTC | #7
On Fri, 18 Nov 2022 11:19:40 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Fri, 18 Nov 2022 10:34:40 -0300
> Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> 
> > It did not trigger the NULL pointer issue to be more specific. When
> > creating event probe for all events I was unable to create any event for
> > the xhci-hcd system:
> > 
> > root@localhost:/sys/kernel/tracing# echo 'e xhci-hcd/xhci_add_endpoint' > dynamic_events 
> > -bash: echo: write error: Invalid argument
> > 
> > Debugging the issue it seems that the problem is in the is_good_name()
> > check, which returns false for "xhci-hcd". Should we sanitize it by
> 
> Ouch. I didn't realize that.

Maybe we need better error message so that user can notice which character
caused the error. 

> 
> > converting '-' into '_'?
> 
> Actually, it's just the system name that's an issue. I tested this patch
> and it appears to work.

Ah, the system name is more flexible than the event name because it has
TRACE_SYSTEM_VAR.

Steve, can you send me the below patch?

BTW, TRACE_DEFINE_ENUM() and TRACE_DEFINE_SIZEOF() macros are using
TRACE_SYSTEM instead of TRACE_SYSTEM_VAR. Should those use TRACE_SYSTEM_VAR
for defining a variable?

Thank you,

> 
> -- Steve
> 
> 
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 54ee5711c729..a16fb4c9642e 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1955,17 +1955,30 @@ static __always_inline void trace_iterator_reset(struct trace_iterator *iter)
>  }
>  
>  /* Check the name is good for event/group/fields */
> -static inline bool is_good_name(const char *name)
> +static inline bool __is_good_name(const char *name, bool hash_ok)
>  {
> -	if (!isalpha(*name) && *name != '_')
> +	if (!isalpha(*name) && *name != '_' && (!hash_ok || *name != '-'))
>  		return false;
>  	while (*++name != '\0') {
> -		if (!isalpha(*name) && !isdigit(*name) && *name != '_')
> +		if (!isalpha(*name) && !isdigit(*name) && *name != '_' &&
> +		    (!hash_ok || *name != '-'))
>  			return false;
>  	}
>  	return true;
>  }
>  
> +/* Check the name is good for event/group/fields */
> +static inline bool is_good_name(const char *name)
> +{
> +	return __is_good_name(name, false);
> +}
> +
> +/* Check the name is good for system */
> +static inline bool is_good_system_name(const char *name)
> +{
> +	return __is_good_name(name, true);
> +}
> +
>  /* Convert certain expected symbols into '_' when generating event names */
>  static inline void sanitize_event_name(char *name)
>  {
> diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
> index 36dff277de46..bb2f95d7175c 100644
> --- a/kernel/trace/trace_probe.c
> +++ b/kernel/trace/trace_probe.c
> @@ -246,7 +246,7 @@ int traceprobe_parse_event_name(const char **pevent, const char **pgroup,
>  			return -EINVAL;
>  		}
>  		strlcpy(buf, event, slash - event + 1);
> -		if (!is_good_name(buf)) {
> +		if (!is_good_system_name(buf)) {
>  			trace_probe_log_err(offset, BAD_GROUP_NAME);
>  			return -EINVAL;
>  		}
  
Steven Rostedt Nov. 23, 2022, 4:33 p.m. UTC | #8
On Thu, 24 Nov 2022 01:01:52 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> On Fri, 18 Nov 2022 11:19:40 -0500
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > On Fri, 18 Nov 2022 10:34:40 -0300
> > Rafael Mendonca <rafaelmendsr@gmail.com> wrote:
> >   
> > > It did not trigger the NULL pointer issue to be more specific. When
> > > creating event probe for all events I was unable to create any event for
> > > the xhci-hcd system:
> > > 
> > > root@localhost:/sys/kernel/tracing# echo 'e xhci-hcd/xhci_add_endpoint' > dynamic_events 
> > > -bash: echo: write error: Invalid argument
> > > 
> > > Debugging the issue it seems that the problem is in the is_good_name()
> > > check, which returns false for "xhci-hcd". Should we sanitize it by  
> > 
> > Ouch. I didn't realize that.  
> 
> Maybe we need better error message so that user can notice which character
> caused the error. 
> 
> >   
> > > converting '-' into '_'?  
> > 
> > Actually, it's just the system name that's an issue. I tested this patch
> > and it appears to work.  
> 
> Ah, the system name is more flexible than the event name because it has
> TRACE_SYSTEM_VAR.
> 
> Steve, can you send me the below patch?

I already did ;-)

  https://patchwork.kernel.org/project/linux-trace-kernel/patch/20221122122345.160f5077@gandalf.local.home/


> 
> BTW, TRACE_DEFINE_ENUM() and TRACE_DEFINE_SIZEOF() macros are using
> TRACE_SYSTEM instead of TRACE_SYSTEM_VAR. Should those use TRACE_SYSTEM_VAR
> for defining a variable?

Hmm, good question. Probably. But since nothing that has a bad name uses
it, we can fix that whenever (but before a system with a hyphen uses those
macros). Do you want to make the change, or shall I?

-- Steve
  

Patch

diff --git a/kernel/trace/trace_eprobe.c b/kernel/trace/trace_eprobe.c
index 5dd0617e5df6..5004c6f6f232 100644
--- a/kernel/trace/trace_eprobe.c
+++ b/kernel/trace/trace_eprobe.c
@@ -677,8 +677,8 @@  static int enable_eprobe(struct trace_eprobe *ep,
 
 	list_add_tail_rcu(&trigger->list, &file->triggers);
 
-	trace_event_trigger_enable_disable(file, 1);
 	update_cond_flag(file);
+	trace_event_trigger_enable_disable(file, 1);
 
 	return 0;
 }