[v2,0/9] tracing: Improbe BTF support on probe events

Message ID 168960739768.34107.15145201749042174448.stgit@devnote2
Headers
Series tracing: Improbe BTF support on probe events |

Message

Masami Hiramatsu (Google) July 17, 2023, 3:23 p.m. UTC
  Hi,

Here is the 2nd version of series to improve the BTF support on probe events.
The previous series is here:

https://lore.kernel.org/linux-trace-kernel/168699521817.528797.13179901018528120324.stgit@mhiramat.roam.corp.google.com/

In this version, I added a NULL check fix patch [1/9] (which will go to
fixes branch) and move BTF related API to kernel/bpf/btf.c [2/9] and add
a new BTF API [3/9] so that anyone can reuse it.
Also I decided to use '$retval' directly instead of 'retval' pseudo BTF
variable for field access at [5/9] because I introduced an idea to choose
function 'exit' event automatically if '$retval' is used [7/9]. With that
change, we can not use 'retval' because if a function has 'retval'
argument, we can not decide 'f func retval' is function exit or entry.
Selftest test case [8/9] and document [9/9] are also updated according to
those changes.

This series can be applied on top of "v6.5-rc2" kernel.

You can also get this series from:

git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext


Thank you,

---

Masami Hiramatsu (Google) (9):
      tracing/probes: Fix to add NULL check for BTF APIs
      bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
      bpf/btf: Add a function to search a member of a struct/union
      tracing/probes: Support BTF based data structure field access
      tracing/probes: Support BTF field access from $retval
      tracing/probes: Add string type check with BTF
      tracing/fprobe-event: Assume fprobe is a return event by $retval
      selftests/ftrace: Add BTF fields access testcases
      Documentation: tracing: Update fprobe event example with BTF field


 Documentation/trace/fprobetrace.rst                |   50 ++
 include/linux/btf.h                                |    7 
 kernel/bpf/btf.c                                   |   83 ++++
 kernel/trace/trace_fprobe.c                        |   58 ++-
 kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
 kernel/trace/trace_probe.h                         |   12 +
 .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
 .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
 8 files changed, 503 insertions(+), 126 deletions(-)

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
  

Comments

Alan Maguire July 19, 2023, 9:02 a.m. UTC | #1
On 17/07/2023 16:23, Masami Hiramatsu (Google) wrote:
> Hi,
> 
> Here is the 2nd version of series to improve the BTF support on probe events.
> The previous series is here:
> 
> https://lore.kernel.org/linux-trace-kernel/168699521817.528797.13179901018528120324.stgit@mhiramat.roam.corp.google.com/
> 
> In this version, I added a NULL check fix patch [1/9] (which will go to
> fixes branch) and move BTF related API to kernel/bpf/btf.c [2/9] and add
> a new BTF API [3/9] so that anyone can reuse it.
> Also I decided to use '$retval' directly instead of 'retval' pseudo BTF
> variable for field access at [5/9] because I introduced an idea to choose
> function 'exit' event automatically if '$retval' is used [7/9]. With that
> change, we can not use 'retval' because if a function has 'retval'
> argument, we can not decide 'f func retval' is function exit or entry.

this is fantastic work! (FWIW I ran into the retval argument issue with
ksnoop as well; I got around it by using "return" to signify the return
value since as a reserved word it won't clash with a variable name.
However in the trace subsystem context retval is used extensively so
makes sense to stick with that).

One thing we should probably figure out is a common approach to handling
ambiguous static functions that will work across ftrace and BPF.  A few
edge cases that are worth figuring out:

1. a static function with the same name exists in multiple modules,
either with different or identical function signatures
2. a static function has .isra.0 and other gcc suffixes applied to
static functions during optimization

As Alexei mentioned, we're still working on 1, so it would be good
to figure out a naming scheme that works well in both ftrace and BPF
contexts. There are a few hundred of these ambiguous functions. My
reading of the fprobe docs seems to suggest that there is no mechanism
to specify a specific module for a given symbol (as in ftrace filters),
is that right?

Jiri led a session on this topic at LSF/MM/BPF ; perhaps we should
carve out some time at Plumbers to discuss this?

With respect to 2, pahole v1.25 will generate representations for these
"."-suffixed functions in BTF via --btf_gen_optimized [1]. (BTF
representation is skipped if the optimizations impact on the registers
used for function arguments; if these don't match calling conventions
due to optimized-out params, we don't represent the function in BTF,
as the tracing expectations are violated).

However the BTF function name - in line with DWARF representation -
will not have the .isra suffix. So the thing to bear in mind is if
you use the function name with suffix as the fprobe function name,
a BTF lookup of that exact ("foo.isra.0") name will not find anything,
while a lookup of "foo" will succeed. I'll add some specifics in your
patch doing the lookups, but just wanted to highlight the issue at
the top-level.

Thanks!

Alan

[1]
https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/

> Selftest test case [8/9] and document [9/9] are also updated according to
> those changes.
> 
> This series can be applied on top of "v6.5-rc2" kernel.
> 
> You can also get this series from:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
> 
> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (Google) (9):
>       tracing/probes: Fix to add NULL check for BTF APIs
>       bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
>       bpf/btf: Add a function to search a member of a struct/union
>       tracing/probes: Support BTF based data structure field access
>       tracing/probes: Support BTF field access from $retval
>       tracing/probes: Add string type check with BTF
>       tracing/fprobe-event: Assume fprobe is a return event by $retval
>       selftests/ftrace: Add BTF fields access testcases
>       Documentation: tracing: Update fprobe event example with BTF field
> 
> 
>  Documentation/trace/fprobetrace.rst                |   50 ++
>  include/linux/btf.h                                |    7 
>  kernel/bpf/btf.c                                   |   83 ++++
>  kernel/trace/trace_fprobe.c                        |   58 ++-
>  kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
>  kernel/trace/trace_probe.h                         |   12 +
>  .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
>  .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
>  8 files changed, 503 insertions(+), 126 deletions(-)
> 
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
  
Masami Hiramatsu (Google) July 19, 2023, 4:01 p.m. UTC | #2
On Wed, 19 Jul 2023 10:02:06 +0100
Alan Maguire <alan.maguire@oracle.com> wrote:

> On 17/07/2023 16:23, Masami Hiramatsu (Google) wrote:
> > Hi,
> > 
> > Here is the 2nd version of series to improve the BTF support on probe events.
> > The previous series is here:
> > 
> > https://lore.kernel.org/linux-trace-kernel/168699521817.528797.13179901018528120324.stgit@mhiramat.roam.corp.google.com/
> > 
> > In this version, I added a NULL check fix patch [1/9] (which will go to
> > fixes branch) and move BTF related API to kernel/bpf/btf.c [2/9] and add
> > a new BTF API [3/9] so that anyone can reuse it.
> > Also I decided to use '$retval' directly instead of 'retval' pseudo BTF
> > variable for field access at [5/9] because I introduced an idea to choose
> > function 'exit' event automatically if '$retval' is used [7/9]. With that
> > change, we can not use 'retval' because if a function has 'retval'
> > argument, we can not decide 'f func retval' is function exit or entry.
> 
> this is fantastic work! (FWIW I ran into the retval argument issue with
> ksnoop as well; I got around it by using "return" to signify the return
> value since as a reserved word it won't clash with a variable name.
> However in the trace subsystem context retval is used extensively so
> makes sense to stick with that).

Thanks!

> 
> One thing we should probably figure out is a common approach to handling
> ambiguous static functions that will work across ftrace and BPF.  A few
> edge cases that are worth figuring out:
> 
> 1. a static function with the same name exists in multiple modules,
> either with different or identical function signatures
> 2. a static function has .isra.0 and other gcc suffixes applied to
> static functions during optimization
> 
> As Alexei mentioned, we're still working on 1, so it would be good
> to figure out a naming scheme that works well in both ftrace and BPF
> contexts. There are a few hundred of these ambiguous functions. My
> reading of the fprobe docs seems to suggest that there is no mechanism
> to specify a specific module for a given symbol (as in ftrace filters),
> is that right?

Yes, it doesn't have module specificaiton at this moment. I'll considering
to fix this. BTW, for the same-name functions, we are discussing another
approach. We also need to sync this with BTF. 

https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/

> 
> Jiri led a session on this topic at LSF/MM/BPF ; perhaps we should
> carve out some time at Plumbers to discuss this?

Yeah, good idea.

> 
> With respect to 2, pahole v1.25 will generate representations for these
> "."-suffixed functions in BTF via --btf_gen_optimized [1]. (BTF
> representation is skipped if the optimizations impact on the registers
> used for function arguments; if these don't match calling conventions
> due to optimized-out params, we don't represent the function in BTF,
> as the tracing expectations are violated).

Correct. But can't we know which argument is skipped by the optimization
from the DWARF? At least the function parameters will be changed.

> However the BTF function name - in line with DWARF representation -
> will not have the .isra suffix. So the thing to bear in mind is if
> you use the function name with suffix as the fprobe function name,
> a BTF lookup of that exact ("foo.isra.0") name will not find anything,
> while a lookup of "foo" will succeed. I'll add some specifics in your
> patch doing the lookups, but just wanted to highlight the issue at
> the top-level.

So, what about adding an index sorted list of the address and BTF entry
index as an expansion of the BTF? It allowed us to easily map the suffixed
symbol address (we can get it from kallsyms) to BTF quickly.
So the module will have

[BTF data][array length][BTF index array]

Index array member will be like this.

struct btf_index {
	u32	offset;	// offset from the start text
	u32	id:		// BTF type id
};

We can do binary search the function type id from the symbol address.

Thank you,

> 
> Thanks!
> 
> Alan
> 
> [1]
> https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> 
> > Selftest test case [8/9] and document [9/9] are also updated according to
> > those changes.
> > 
> > This series can be applied on top of "v6.5-rc2" kernel.
> > 
> > You can also get this series from:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
> > 
> > 
> > Thank you,
> > 
> > ---
> > 
> > Masami Hiramatsu (Google) (9):
> >       tracing/probes: Fix to add NULL check for BTF APIs
> >       bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
> >       bpf/btf: Add a function to search a member of a struct/union
> >       tracing/probes: Support BTF based data structure field access
> >       tracing/probes: Support BTF field access from $retval
> >       tracing/probes: Add string type check with BTF
> >       tracing/fprobe-event: Assume fprobe is a return event by $retval
> >       selftests/ftrace: Add BTF fields access testcases
> >       Documentation: tracing: Update fprobe event example with BTF field
> > 
> > 
> >  Documentation/trace/fprobetrace.rst                |   50 ++
> >  include/linux/btf.h                                |    7 
> >  kernel/bpf/btf.c                                   |   83 ++++
> >  kernel/trace/trace_fprobe.c                        |   58 ++-
> >  kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
> >  kernel/trace/trace_probe.h                         |   12 +
> >  .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
> >  .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
> >  8 files changed, 503 insertions(+), 126 deletions(-)
> > 
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
  
Alan Maguire July 20, 2023, 9:50 p.m. UTC | #3
On 19/07/2023 17:01, Masami Hiramatsu (Google) wrote:
> On Wed, 19 Jul 2023 10:02:06 +0100
> Alan Maguire <alan.maguire@oracle.com> wrote:
> 
>> On 17/07/2023 16:23, Masami Hiramatsu (Google) wrote:
>>> Hi,
>>>
>>> Here is the 2nd version of series to improve the BTF support on probe events.
>>> The previous series is here:
>>>
>>> https://lore.kernel.org/linux-trace-kernel/168699521817.528797.13179901018528120324.stgit@mhiramat.roam.corp.google.com/
>>>
>>> In this version, I added a NULL check fix patch [1/9] (which will go to
>>> fixes branch) and move BTF related API to kernel/bpf/btf.c [2/9] and add
>>> a new BTF API [3/9] so that anyone can reuse it.
>>> Also I decided to use '$retval' directly instead of 'retval' pseudo BTF
>>> variable for field access at [5/9] because I introduced an idea to choose
>>> function 'exit' event automatically if '$retval' is used [7/9]. With that
>>> change, we can not use 'retval' because if a function has 'retval'
>>> argument, we can not decide 'f func retval' is function exit or entry.
>>
>> this is fantastic work! (FWIW I ran into the retval argument issue with
>> ksnoop as well; I got around it by using "return" to signify the return
>> value since as a reserved word it won't clash with a variable name.
>> However in the trace subsystem context retval is used extensively so
>> makes sense to stick with that).
> 
> Thanks!
> 
>>
>> One thing we should probably figure out is a common approach to handling
>> ambiguous static functions that will work across ftrace and BPF.  A few
>> edge cases that are worth figuring out:
>>
>> 1. a static function with the same name exists in multiple modules,
>> either with different or identical function signatures
>> 2. a static function has .isra.0 and other gcc suffixes applied to
>> static functions during optimization
>>
>> As Alexei mentioned, we're still working on 1, so it would be good
>> to figure out a naming scheme that works well in both ftrace and BPF
>> contexts. There are a few hundred of these ambiguous functions. My
>> reading of the fprobe docs seems to suggest that there is no mechanism
>> to specify a specific module for a given symbol (as in ftrace filters),
>> is that right?
> 
> Yes, it doesn't have module specificaiton at this moment. I'll considering
> to fix this. BTW, for the same-name functions, we are discussing another
> approach. We also need to sync this with BTF. 
> 
> https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/
> 
>>
>> Jiri led a session on this topic at LSF/MM/BPF ; perhaps we should
>> carve out some time at Plumbers to discuss this?
> 
> Yeah, good idea.
> 
>>
>> With respect to 2, pahole v1.25 will generate representations for these
>> "."-suffixed functions in BTF via --btf_gen_optimized [1]. (BTF
>> representation is skipped if the optimizations impact on the registers
>> used for function arguments; if these don't match calling conventions
>> due to optimized-out params, we don't represent the function in BTF,
>> as the tracing expectations are violated).
> 
> Correct. But can't we know which argument is skipped by the optimization
> from the DWARF? At least the function parameters will be changed.
>

Yep; we use the expected registers to spot cases where something
has been optimized out.

>> However the BTF function name - in line with DWARF representation -
>> will not have the .isra suffix. So the thing to bear in mind is if
>> you use the function name with suffix as the fprobe function name,
>> a BTF lookup of that exact ("foo.isra.0") name will not find anything,
>> while a lookup of "foo" will succeed. I'll add some specifics in your
>> patch doing the lookups, but just wanted to highlight the issue at
>> the top-level.
> 
> So, what about adding an index sorted list of the address and BTF entry
> index as an expansion of the BTF? It allowed us to easily map the suffixed
> symbol address (we can get it from kallsyms) to BTF quickly.
> So the module will have
> 
> [BTF data][array length][BTF index array]
> 
> Index array member will be like this.
> 
> struct btf_index {
> 	u32	offset;	// offset from the start text
> 	u32	id:		// BTF type id
> };
> 
> We can do binary search the function type id from the symbol address.
> 

Yeah, I wonder if a representation that bridged between kallsyms and BTF
might be valuable? I don't _think_ it's as much of an issue for your
case though since you only need to do the BTF lookup once on fprobe
setup, right? Thanks!

Alan



> Thank you,
> 
>>
>> Thanks!
>>
>> Alan
>>
>> [1]
>> https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
>>
>>> Selftest test case [8/9] and document [9/9] are also updated according to
>>> those changes.
>>>
>>> This series can be applied on top of "v6.5-rc2" kernel.
>>>
>>> You can also get this series from:
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
>>>
>>>
>>> Thank you,
>>>
>>> ---
>>>
>>> Masami Hiramatsu (Google) (9):
>>>       tracing/probes: Fix to add NULL check for BTF APIs
>>>       bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
>>>       bpf/btf: Add a function to search a member of a struct/union
>>>       tracing/probes: Support BTF based data structure field access
>>>       tracing/probes: Support BTF field access from $retval
>>>       tracing/probes: Add string type check with BTF
>>>       tracing/fprobe-event: Assume fprobe is a return event by $retval
>>>       selftests/ftrace: Add BTF fields access testcases
>>>       Documentation: tracing: Update fprobe event example with BTF field
>>>
>>>
>>>  Documentation/trace/fprobetrace.rst                |   50 ++
>>>  include/linux/btf.h                                |    7 
>>>  kernel/bpf/btf.c                                   |   83 ++++
>>>  kernel/trace/trace_fprobe.c                        |   58 ++-
>>>  kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
>>>  kernel/trace/trace_probe.h                         |   12 +
>>>  .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
>>>  .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
>>>  8 files changed, 503 insertions(+), 126 deletions(-)
>>>
>>> --
>>> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>>>
> 
>
  
Masami Hiramatsu (Google) July 25, 2023, 11:45 p.m. UTC | #4
On Thu, 20 Jul 2023 22:50:18 +0100
Alan Maguire <alan.maguire@oracle.com> wrote:


> >> One thing we should probably figure out is a common approach to handling
> >> ambiguous static functions that will work across ftrace and BPF.  A few
> >> edge cases that are worth figuring out:
> >>
> >> 1. a static function with the same name exists in multiple modules,
> >> either with different or identical function signatures
> >> 2. a static function has .isra.0 and other gcc suffixes applied to
> >> static functions during optimization
> >>
> >> As Alexei mentioned, we're still working on 1, so it would be good
> >> to figure out a naming scheme that works well in both ftrace and BPF
> >> contexts. There are a few hundred of these ambiguous functions. My
> >> reading of the fprobe docs seems to suggest that there is no mechanism
> >> to specify a specific module for a given symbol (as in ftrace filters),
> >> is that right?
> > 
> > Yes, it doesn't have module specificaiton at this moment. I'll considering
> > to fix this. BTW, for the same-name functions, we are discussing another
> > approach. We also need to sync this with BTF. 
> > 
> > https://lore.kernel.org/all/20230714150326.1152359-1-alessandro.carminati@gmail.com/
> > 
> >>
> >> Jiri led a session on this topic at LSF/MM/BPF ; perhaps we should
> >> carve out some time at Plumbers to discuss this?
> > 
> > Yeah, good idea.
> > 
> >>
> >> With respect to 2, pahole v1.25 will generate representations for these
> >> "."-suffixed functions in BTF via --btf_gen_optimized [1]. (BTF
> >> representation is skipped if the optimizations impact on the registers
> >> used for function arguments; if these don't match calling conventions
> >> due to optimized-out params, we don't represent the function in BTF,
> >> as the tracing expectations are violated).
> > 
> > Correct. But can't we know which argument is skipped by the optimization
> > from the DWARF? At least the function parameters will be changed.
> >
> 
> Yep; we use the expected registers to spot cases where something
> has been optimized out.

I guess DWARF knows which register is optimized out and then BTF also
knows that?
Let me update my pahole too.

> >> However the BTF function name - in line with DWARF representation -
> >> will not have the .isra suffix. So the thing to bear in mind is if
> >> you use the function name with suffix as the fprobe function name,
> >> a BTF lookup of that exact ("foo.isra.0") name will not find anything,
> >> while a lookup of "foo" will succeed. I'll add some specifics in your
> >> patch doing the lookups, but just wanted to highlight the issue at
> >> the top-level.
> > 
> > So, what about adding an index sorted list of the address and BTF entry
> > index as an expansion of the BTF? It allowed us to easily map the suffixed
> > symbol address (we can get it from kallsyms) to BTF quickly.
> > So the module will have
> > 
> > [BTF data][array length][BTF index array]
> > 
> > Index array member will be like this.
> > 
> > struct btf_index {
> > 	u32	offset;	// offset from the start text
> > 	u32	id:		// BTF type id
> > };
> > 
> > We can do binary search the function type id from the symbol address.
> > 
> 
> Yeah, I wonder if a representation that bridged between kallsyms and BTF
> might be valuable? I don't _think_ it's as much of an issue for your
> case though since you only need to do the BTF lookup once on fprobe
> setup, right? Thanks!

Yes, but I'm thinking fprobe to support some sort of 'symbol+offset' format
to specify one of the "same-name" symbols. In that case, it is important to
identify which address the BTF type is pointed.

Thank you!

> 
> Alan
> 
> 
> 
> > Thank you,
> > 
> >>
> >> Thanks!
> >>
> >> Alan
> >>
> >> [1]
> >> https://lore.kernel.org/bpf/1675790102-23037-1-git-send-email-alan.maguire@oracle.com/
> >>
> >>> Selftest test case [8/9] and document [9/9] are also updated according to
> >>> those changes.
> >>>
> >>> This series can be applied on top of "v6.5-rc2" kernel.
> >>>
> >>> You can also get this series from:
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git topic/fprobe-event-ext
> >>>
> >>>
> >>> Thank you,
> >>>
> >>> ---
> >>>
> >>> Masami Hiramatsu (Google) (9):
> >>>       tracing/probes: Fix to add NULL check for BTF APIs
> >>>       bpf/btf: tracing: Move finding func-proto API and getting func-param API to BTF
> >>>       bpf/btf: Add a function to search a member of a struct/union
> >>>       tracing/probes: Support BTF based data structure field access
> >>>       tracing/probes: Support BTF field access from $retval
> >>>       tracing/probes: Add string type check with BTF
> >>>       tracing/fprobe-event: Assume fprobe is a return event by $retval
> >>>       selftests/ftrace: Add BTF fields access testcases
> >>>       Documentation: tracing: Update fprobe event example with BTF field
> >>>
> >>>
> >>>  Documentation/trace/fprobetrace.rst                |   50 ++
> >>>  include/linux/btf.h                                |    7 
> >>>  kernel/bpf/btf.c                                   |   83 ++++
> >>>  kernel/trace/trace_fprobe.c                        |   58 ++-
> >>>  kernel/trace/trace_probe.c                         |  402 +++++++++++++++-----
> >>>  kernel/trace/trace_probe.h                         |   12 +
> >>>  .../ftrace/test.d/dynevent/add_remove_btfarg.tc    |   11 +
> >>>  .../ftrace/test.d/dynevent/fprobe_syntax_errors.tc |    6 
> >>>  8 files changed, 503 insertions(+), 126 deletions(-)
> >>>
> >>> --
> >>> Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >>>
> > 
> >