[RFC,0/9] tracing: Add fprobe events

Message ID 166792255429.919356.14116090269057513181.stgit@devnote3

Masami Hiramatsu (Google) Nov. 8, 2022, 3:49 p.m. UTC
  Hi,

Here is a series of patches to improve fprobe and add basic fprobe event
support for ftrace (tracefs) and perf.

With this series, users can add new events on the entry and exit of kernel
functions (which can be ftraced). Unlike kprobe events, fprobe events can
only probe function entry and exit, and the reported IP address will be at
some offset from the symbol address. They can only trace the function
arguments, return value, and stacks (no registers).

The fprobe event syntax is as follows:

 f[:[GRP/][EVENT]] FUNCTION [FETCHARGS]
 f[MAXACTIVE][:[GRP/][EVENT]] FUNCTION%return [FETCHARGS]

E.g.

 # echo 'f vfs_read $arg1'  >> dynamic_events
 # echo 'f vfs_read%return $retval'  >> dynamic_events
 # cat dynamic_events
 f:fprobes/vfs_read_entry vfs_read arg1=$arg1
 f:fprobes/vfs_read_exit vfs_read%return arg1=$retval
 # echo 1 > events/fprobes/enable
 # head -n 20 trace | tail
 #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
 #              | |         |   |||||     |         |
              sh-142     [005] ...1.   448.386420: vfs_read_entry: (vfs_read+0x4/0x340) arg1=0xffff888007f7c540
              sh-142     [005] .....   448.386436: vfs_read_exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=0x1
              sh-142     [005] ...1.   448.386451: vfs_read_entry: (vfs_read+0x4/0x340) arg1=0xffff888007f7c540
              sh-142     [005] .....   448.386458: vfs_read_exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=0x1
              sh-142     [005] ...1.   448.386469: vfs_read_entry: (vfs_read+0x4/0x340) arg1=0xffff888007f7c540
              sh-142     [005] .....   448.386476: vfs_read_exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=0x1
              sh-142     [005] ...1.   448.602073: vfs_read_entry: (vfs_read+0x4/0x340) arg1=0xffff888007f7c540
              sh-142     [005] .....   448.602089: vfs_read_exit: (ksys_read+0x75/0x100 <- vfs_read) arg1=0x1
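
The session above could be scripted roughly as follows. This is only a
sketch and not part of the series: the helper names fprobe_add and
fprobe_latency and the TRACEFS variable are made up for illustration, and
the awk part assumes the default trace line layout shown above (a task name
without spaces, and the default FUNCTION_entry / FUNCTION_exit event names).

```shell
#!/bin/sh
# Illustrative helpers only; the names and TRACEFS are assumptions.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# fprobe_add FUNCTION entry|exit [FETCHARGS...]
# Append an "f ..." definition, in the syntax shown above, to dynamic_events.
fprobe_add() {
    if [ "$#" -lt 2 ]; then
        echo "usage: fprobe_add FUNCTION entry|exit [FETCHARGS...]" >&2
        return 1
    fi
    func=$1
    kind=$2
    shift 2
    case $kind in
        entry) target=$func ;;
        exit)  target="$func%return" ;;
        *)     echo "fprobe_add: unknown kind '$kind'" >&2
               return 1 ;;
    esac
    # Add the fetchargs only if any were given.
    printf '%s\n' "f $target${*:+ $*}" >> "$TRACEFS/dynamic_events"
}

# fprobe_latency FUNCTION < trace
# Pair each FUNCTION_entry line with the next FUNCTION_exit line of the
# same task and print the elapsed time in microseconds.
# Expected fields: TASK-PID [CPU] FLAGS TIMESTAMP: EVENT: ...
fprobe_latency() {
    awk -v fn="$1" '
    $5 == fn "_entry:" { t = $4; sub(/:$/, "", t); start[$1] = t; next }
    $5 == fn "_exit:" && ($1 in start) {
        t = $4; sub(/:$/, "", t)
        printf "%s %.0f us\n", $1, (t - start[$1]) * 1000000
        delete start[$1]
    }'
}
```

Usage (as root, from any directory):

 # fprobe_add vfs_read entry '$arg1'
 # fprobe_add vfs_read exit '$retval'
 # echo 1 > "$TRACEFS/events/fprobes/enable"
 # fprobe_latency vfs_read < "$TRACEFS/trace"

For the first entry/exit pair in the log above (448.386420 and 448.386436)
this prints "sh-142 16 us".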

Future work:
 - Trace multiple function entry/exit (wildcard).
 - Integrate it with the function graph tracer.
 - Use ftrace_regs instead of pt_regs and remove the dependency on
   CONFIG_DYNAMIC_FTRACE_WITH_REGS.
 - Support (limited) register access via ftrace_regs.
 - Support fprobe events in perf probe.
 - Support accessing entry data from the exit event.
 - Support BTF for trace arguments.

Fprobe events may eventually replace kprobe events for function entry
and exit on some archs (e.g. arm64).

Here's my current migration (kretprobe to fprobe) idea:

Phase 1. Introduce fprobe events. (THIS)
Phase 2. Introduce a generic function graph shadow stack.
Phase 3. Replace the rethook with the function shadow stack
         and use ftrace_regs in fprobe handlers.
Phase 4. Extend this fprobe support to other archs.

Even if kretprobe events are replaced with fprobe events, tracefs users can
transparently use fprobe events for function entry/exit with the 'p:...'
and 'r:...' syntax (for backward compatibility).

Thank you,

---

Masami Hiramatsu (Google) (9):
      fprobe: Pass entry_data to handlers
      lib/test_fprobe: Add private entry_data testcases
      fprobe: Add nr_maxactive to specify rethook_node pool size
      lib/test_fprobe: Add a test case for nr_maxactive
      fprobe: Skip exit_handler if entry_handler returns !0
      lib/test_fprobe: Add a testcase for skipping exit_handler
      docs: tracing: Update fprobe documentation
      fprobe: Pass return address to the handlers
      tracing/probes: Add fprobe-events


 Documentation/trace/fprobe.rst  |   16 -
 include/linux/fprobe.h          |   17 +
 include/linux/rethook.h         |    2 
 include/linux/trace_events.h    |    3 
 kernel/kprobes.c                |    1 
 kernel/trace/Kconfig            |   14 
 kernel/trace/Makefile           |    1 
 kernel/trace/bpf_trace.c        |   19 +
 kernel/trace/fprobe.c           |   45 +-
 kernel/trace/rethook.c          |    3 
 kernel/trace/trace.h            |   11 
 kernel/trace/trace_fprobe.c     | 1120 +++++++++++++++++++++++++++++++++++++++
 kernel/trace/trace_probe.c      |    4 
 kernel/trace/trace_probe.h      |    4 
 lib/test_fprobe.c               |  109 ++++
 samples/fprobe/fprobe_example.c |   11 
 16 files changed, 1349 insertions(+), 31 deletions(-)
 create mode 100644 kernel/trace/trace_fprobe.c

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>