[v6,0/5] Add ftrace direct call for arm64

Message ID 20230405180250.2046566-1-revest@chromium.org
Headers
Series Add ftrace direct call for arm64 |

Message

Florent Revest April 5, 2023, 6:02 p.m. UTC
  This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to be taken by the arm64 tree but it depends on the
trace-direct-v6.3-rc3 tag of the linux-trace tree:
  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
this depends on. [1]

Thanks to the ftrace refactoring under that tag, an ftrace_ops backing a ftrace
direct call will only ever point to *one* direct call. This means we can look up
the direct called trampoline address stored in the ops from the ftrace_caller
trampoline in the case when the destination would be out of reach of a BL
instruction at the ftrace callsite. This fixes limitations of previous attempts
such as [2].

This series has been tested on arm64 with:
1- CONFIG_FTRACE_SELFTEST
2- samples/ftrace/*.ko (cf: patch 4)
3- tools/testing/selftests/bpf/test_progs (cf: patch 5)

Changes since v5 [3]:
- Fixed saving the fourth argument of handle_mm_fault in both the x86 (patch 3)
  and arm64 (as part of patch 4) "ftrace-direct-too" sample trampolines
- Fixed the address of the traced function logged by some direct call samples
  (ftrace-direct-multi and ftrace-direct-multi-modify) by moving lr into x0

1: https://lore.kernel.org/all/ZB2Nl7fzpHoq5V20@FVFF77S0Q05N/
2: https://lore.kernel.org/all/20220913162732.163631-1-xukuohai@huaweicloud.com/
3: https://lore.kernel.org/bpf/20230403113552.2857693-1-revest@chromium.org/

Florent Revest (5):
  arm64: ftrace: Add direct call support
  arm64: ftrace: Simplify get_ftrace_plt
  samples: ftrace: Save required argument registers in sample
    trampolines
  arm64: ftrace: Add direct call trampoline samples support
  selftests/bpf: Update the tests deny list on aarch64

 arch/arm64/Kconfig                           |  6 ++
 arch/arm64/include/asm/ftrace.h              | 22 +++++
 arch/arm64/kernel/asm-offsets.c              |  6 ++
 arch/arm64/kernel/entry-ftrace.S             | 90 ++++++++++++++++----
 arch/arm64/kernel/ftrace.c                   | 46 +++++++---
 samples/ftrace/ftrace-direct-modify.c        | 34 ++++++++
 samples/ftrace/ftrace-direct-multi-modify.c  | 40 +++++++++
 samples/ftrace/ftrace-direct-multi.c         | 24 ++++++
 samples/ftrace/ftrace-direct-too.c           | 40 +++++++--
 samples/ftrace/ftrace-direct.c               | 24 ++++++
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++----------------
 11 files changed, 306 insertions(+), 108 deletions(-)
  

Comments

Mark Rutland April 11, 2023, 3:56 p.m. UTC | #1
On Wed, Apr 05, 2023 at 08:02:45PM +0200, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to be taken by the arm64 tree but it depends on the
> trace-direct-v6.3-rc3 tag of the linux-trace tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
> this depends on. [1]

Catalin, Will, are you happy to pick this via the arm64 tree, or for it to go
via the trace tree?

We'd been assuming the former, but it looks like there'll be a (simple) merge
conflict with the series adding FUNCTION_GRAPH_RETVAL:

  https://lore.kernel.org/lkml/cover.1680954589.git.pengdonglin@sangfor.com.cn/

... as both series add some definitions to arm64's asm-offsets.c in the same
place, and all those additions need to be kept. Other than that, the two series
are independent.

IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
trace tree, and if that's still the plan, maybe both should go that way?

Mark.

> Thanks to the ftrace refactoring under that tag, an ftrace_ops backing a ftrace
> direct call will only ever point to *one* direct call. This means we can look up
> the direct called trampoline address stored in the ops from the ftrace_caller
> trampoline in the case when the destination would be out of reach of a BL
> instruction at the ftrace callsite. This fixes limitations of previous attempts
> such as [2].
> 
> This series has been tested on arm64 with:
> 1- CONFIG_FTRACE_SELFTEST
> 2- samples/ftrace/*.ko (cf: patch 4)
> 3- tools/testing/selftests/bpf/test_progs (cf: patch 5)
> 
> Changes since v5 [3]:
> - Fixed saving the fourth argument of handle_mm_fault in both the x86 (patch 3)
>   and arm64 (as part of patch 4) "ftrace-direct-too" sample trampolines
> - Fixed the address of the traced function logged by some direct call samples
>   (ftrace-direct-multi and ftrace-direct-multi-modify) by moving lr into x0
> 
> 1: https://lore.kernel.org/all/ZB2Nl7fzpHoq5V20@FVFF77S0Q05N/
> 2: https://lore.kernel.org/all/20220913162732.163631-1-xukuohai@huaweicloud.com/
> 3: https://lore.kernel.org/bpf/20230403113552.2857693-1-revest@chromium.org/
> 
> Florent Revest (5):
>   arm64: ftrace: Add direct call support
>   arm64: ftrace: Simplify get_ftrace_plt
>   samples: ftrace: Save required argument registers in sample
>     trampolines
>   arm64: ftrace: Add direct call trampoline samples support
>   selftests/bpf: Update the tests deny list on aarch64
> 
>  arch/arm64/Kconfig                           |  6 ++
>  arch/arm64/include/asm/ftrace.h              | 22 +++++
>  arch/arm64/kernel/asm-offsets.c              |  6 ++
>  arch/arm64/kernel/entry-ftrace.S             | 90 ++++++++++++++++----
>  arch/arm64/kernel/ftrace.c                   | 46 +++++++---
>  samples/ftrace/ftrace-direct-modify.c        | 34 ++++++++
>  samples/ftrace/ftrace-direct-multi-modify.c  | 40 +++++++++
>  samples/ftrace/ftrace-direct-multi.c         | 24 ++++++
>  samples/ftrace/ftrace-direct-too.c           | 40 +++++++--
>  samples/ftrace/ftrace-direct.c               | 24 ++++++
>  tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++----------------
>  11 files changed, 306 insertions(+), 108 deletions(-)
> 
> -- 
> 2.40.0.577.gac1e443424-goog
>
  
Steven Rostedt April 11, 2023, 4:47 p.m. UTC | #2
On Tue, 11 Apr 2023 16:56:45 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> trace tree, and if that's still the plan, maybe both should go that way?

The conflict is minor, and I think I prefer to still have the ARM64 bits go
through the arm64 tree, as it will get better testing, and I don't like to
merge branches ;-)

I've added Linus to the Cc so he knows that there will be conflicts, but as
long as we mention it in our pull request, with a branch that includes the
solution, it should be fine going through two different trees.

-- Steve
  
Will Deacon April 11, 2023, 5:08 p.m. UTC | #3
On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2023 16:56:45 +0100
> Mark Rutland <mark.rutland@arm.com> wrote:
> 
> > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > trace tree, and if that's still the plan, maybe both should go that way?
> 
> The conflict is minor, and I think I prefer to still have the ARM64 bits go
> through the arm64 tree, as it will get better testing, and I don't like to
> merge branches ;-)
> 
> I've added Linus to the Cc so he knows that there will be conflicts, but as
> long as we mention it in our pull request, with a branch that includes the
> solution, it should be fine going through two different trees.

If it's just the simple asm-offsets conflict that Mark mentioned, then that
sounds fine to me. However, patches 3-5 don't seem to have anything to do
with arm64 at all and I'd prefer those to go via other trees (esp. as patch
3 is an independent -stable candidate and the last one is a bpf selftest
change which conflicts in -next).

So I'll queue the first two in arm64 on a branch (or-next/ftrace) based
on trace-direct-v6.3-rc3.

Will
  
Steven Rostedt April 11, 2023, 5:44 p.m. UTC | #4
On Tue, 11 Apr 2023 18:08:08 +0100
Will Deacon <will@kernel.org> wrote:

> On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2023 16:56:45 +0100
> > Mark Rutland <mark.rutland@arm.com> wrote:
> >   
> > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > trace tree, and if that's still the plan, maybe both should go that way?  
> > 
> > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > through the arm64 tree, as it will get better testing, and I don't like to
> > merge branches ;-)
> > 
> > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > long as we mention it in our pull request, with a branch that includes the
> > solution, it should be fine going through two different trees.  
> 
> If it's just the simple asm-offsets conflict that Mark mentioned, then that
> sounds fine to me. However, patches 3-5 don't seem to have anything to do

I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
it requires arch specific asm to handle the direct trampolines).

> with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> 3 is an independent -stable candidate and the last one is a bpf selftest
> change which conflicts in -next).
> 
> So I'll queue the first two in arm64 on a branch (or-next/ftrace) based
> on trace-direct-v6.3-rc3.

Are 3-5 dependent on those changes? If not, I can pull them into my tree.

-- Steve
  
Will Deacon April 11, 2023, 5:54 p.m. UTC | #5
On Tue, Apr 11, 2023 at 01:44:56PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2023 18:08:08 +0100
> Will Deacon <will@kernel.org> wrote:
> 
> > On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2023 16:56:45 +0100
> > > Mark Rutland <mark.rutland@arm.com> wrote:
> > >   
> > > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > > trace tree, and if that's still the plan, maybe both should go that way?  
> > > 
> > > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > > through the arm64 tree, as it will get better testing, and I don't like to
> > > merge branches ;-)
> > > 
> > > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > > long as we mention it in our pull request, with a branch that includes the
> > > solution, it should be fine going through two different trees.  
> > 
> > If it's just the simple asm-offsets conflict that Mark mentioned, then that
> > sounds fine to me. However, patches 3-5 don't seem to have anything to do
> 
> I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
> it requires arch specific asm to handle the direct trampolines).

Sorry, yes, I was thinking of arch/arm64/ and then failed spectacularly
at communicating :)

> > with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> > 3 is an independent -stable candidate and the last one is a bpf selftest
> > change which conflicts in -next).
> > 
> > So I'll queue the first two in arm64 on a branch (or-next/ftrace) based
> > on trace-direct-v6.3-rc3.
> 
> Are 3-5 dependent on those changes? If not, I can pull them into my tree.

Good question. Florent?

Will
  
Will Deacon April 11, 2023, 6:37 p.m. UTC | #6
On Wed, 5 Apr 2023 20:02:45 +0200, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to be taken by the arm64 tree but it depends on the
> trace-direct-v6.3-rc3 tag of the linux-trace tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
> this depends on. [1]
> 
> [...]

Applied first two to arm64 (for-next/ftrace), thanks!

[1/5] arm64: ftrace: Add direct call support
      https://git.kernel.org/arm64/c/2aa6ac03516d
[2/5] arm64: ftrace: Simplify get_ftrace_plt
      https://git.kernel.org/arm64/c/0f59dca63bf2

Cheers,
  
Mark Rutland April 12, 2023, 9:50 a.m. UTC | #7
On Tue, Apr 11, 2023 at 06:54:24PM +0100, Will Deacon wrote:
> On Tue, Apr 11, 2023 at 01:44:56PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2023 18:08:08 +0100
> > Will Deacon <will@kernel.org> wrote:
> > 
> > > On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2023 16:56:45 +0100
> > > > Mark Rutland <mark.rutland@arm.com> wrote:
> > > >   
> > > > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > > > trace tree, and if that's still the plan, maybe both should go that way?  
> > > > 
> > > > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > > > through the arm64 tree, as it will get better testing, and I don't like to
> > > > merge branches ;-)
> > > > 
> > > > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > > > long as we mention it in our pull request, with a branch that includes the
> > > > solution, it should be fine going through two different trees.  
> > > 
> > > If it's just the simple asm-offsets conflict that Mark mentioned, then that
> > > sounds fine to me. However, patches 3-5 don't seem to have anything to do
> > 
> > I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
> > it requires arch specific asm to handle the direct trampolines).
> 
> Sorry, yes, I was thinking of arch/arm64/ and then failed spectacularly
> at communicating :)
> 
> > > with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> > > 3 is an independent -stable candidate and the last one is a bpf selftest
> > > change which conflicts in -next).
> > > 
> > > So I'll queue the first two in arm64 on a branch (or-next/ftrace) based
> > > on trace-direct-v6.3-rc3.
> > 
> > Are 3-5 dependent on those changes? If not, I can pull them into my tree.
> 
> Good question. Florent?

Patch 3 (the fix to the ftrace test) does not depend upon patches 1 and 2. It
probably would've been better to queue that as a preparatory fix before the
other changes.

Patch 4 (adding arm64 support to the samples) depends on patch 3. The arm64
parts depends upon patch 1 to be selectable, and without patch 1 the samples
will behave the same as before. It could be queued independently of patch 1,
but won't have any effect until merged with patch 1.

Patch 5 (the bpf selftest list changes) depends on patch 1 alone.

Perhaps we could queue 1 and 2 via the arm64 tree, 3 and 4 via the ftrace tree,
and follow up with patch 5 via the bpf tree after -rc1?

Thanks,
Mark.
  
Steven Rostedt April 24, 2023, 8:09 p.m. UTC | #8
On Wed, 12 Apr 2023 10:50:21 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> Perhaps we could queue 1 and 2 via the arm64 tree, 3 and 4 via the ftrace tree,
> and follow up with patch 5 via the bpf tree after -rc1?

Any patches that you want through the ftrace tree, please send as a
separate queue to the linux-trace-kernel mailing list (and lkml) if you
haven't done that already. I'm still a thousand emails behind, and
walking through them while at the airport lounge.

-- Steve