[0/8] Add ftrace direct call for arm64

Message ID 20230201163420.1579014-1-revest@chromium.org

Message

Florent Revest Feb. 1, 2023, 4:34 p.m. UTC
This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to apply on top of the arm64 tree which contains Mark Rutland's
series on CALL_OPS [1] under the for-next/ftrace tag.

The first three patches consolidate the two existing ftrace APIs for registering
direct calls. They are split to make reviewers' lives easier, but if squashing them is
preferred, I'd be happy to do so in the next revision.
Currently, there are both a _ftrace_direct and a _ftrace_direct_multi API. Apart
from samples and selftests, there are no users of the _ftrace_direct API left
in-tree so this deletes it and renames the _ftrace_direct_multi API to
_ftrace_direct for simplicity.

The main benefit of this refactoring is that, with the API that's left, an
ftrace_ops backing a direct call will only ever point to one direct call. We can
therefore store the direct called trampoline address in the ops (patch 4) and
look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
destination would be out of reach of a BL instruction at the ftrace callsite.
(in this case, ftrace_caller acts as a lightweight intermediary trampoline)
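
As a rough illustration of the range problem described above (not code from this
series), here is a minimal user-space sketch. It relies on one architectural fact:
an A64 BL encodes a signed 26-bit word offset, so it reaches from -128 MiB to
+128 MiB - 4 around the callsite. The struct toy_ftrace_ops type, its direct_call
field, and both helper names are assumptions made up for this example; they only
model the idea of an ops carrying a single direct-called address.

```c
#include <stdbool.h>
#include <stdint.h>

/* A64 BL: signed 26-bit immediate, scaled by 4 => [-128 MiB, +128 MiB - 4]. */
#define BL_RANGE_MIN (-(INT64_C(1) << 27))
#define BL_RANGE_MAX ((INT64_C(1) << 27) - 4)

/* Hypothetical stand-in for an ops that backs exactly one direct call. */
struct toy_ftrace_ops {
	uint64_t direct_call;	/* address of the direct-called trampoline */
};

/* True if a BL placed at 'callsite' can branch straight to 'target'. */
static bool bl_can_reach(uint64_t callsite, uint64_t target)
{
	int64_t offset = (int64_t)(target - callsite);

	return offset >= BL_RANGE_MIN && offset <= BL_RANGE_MAX &&
	       (offset & 3) == 0;	/* instructions are 4-byte aligned */
}

/*
 * Pick where the callsite should branch: straight to the direct-called
 * trampoline when it is in range, otherwise to an intermediary
 * ('ftrace_caller' here) that would load ops->direct_call and jump to it.
 */
static uint64_t pick_branch_target(uint64_t callsite,
				   const struct toy_ftrace_ops *ops,
				   uint64_t ftrace_caller)
{
	if (bl_can_reach(callsite, ops->direct_call))
		return ops->direct_call;
	return ftrace_caller;
}
```

The fallback only works because each ops points to a single direct call, so the
intermediary has exactly one address to look up.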

This series has been tested on both arm64 and x86_64 with:
1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
2- samples/ftrace/*.ko (cf: patch 8)
3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

This follows up on prior art by Xu Kuohai [2].
The implementation here is totally different but the fix for ftrace selftests
(patch 6) is a trivial rebase of a patch originally by Xu so I kept his
authorship and trailers untouched on that patch; I hope that's ok.

1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/

Florent Revest (7):
  ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
  ftrace: Remove the legacy _ftrace_direct API
  ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
  ftrace: Store direct called addresses in their ops
  ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
  arm64: ftrace: Add direct call support
  arm64: ftrace: Add direct called trampoline samples support

Xu Kuohai (1):
  ftrace: Fix dead loop caused by direct call in ftrace selftest

 arch/arm64/Kconfig                          |   4 +
 arch/arm64/include/asm/ftrace.h             |  24 ++
 arch/arm64/kernel/asm-offsets.c             |   6 +
 arch/arm64/kernel/entry-ftrace.S            |  70 +++-
 arch/arm64/kernel/ftrace.c                  |  36 +-
 include/linux/ftrace.h                      |  51 +--
 kernel/bpf/trampoline.c                     |  14 +-
 kernel/trace/Kconfig                        |   2 +-
 kernel/trace/ftrace.c                       | 433 +-------------------
 kernel/trace/trace_selftest.c               |  14 +-
 samples/Kconfig                             |   2 +-
 samples/ftrace/ftrace-direct-modify.c       |  41 +-
 samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
 samples/ftrace/ftrace-direct-multi.c        |  28 +-
 samples/ftrace/ftrace-direct-too.c          |  35 +-
 samples/ftrace/ftrace-direct.c              |  33 +-
 16 files changed, 333 insertions(+), 504 deletions(-)
  

Comments

Xu Kuohai Feb. 2, 2023, 8:36 a.m. UTC | #1
On 2/2/2023 12:34 AM, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> series on CALL_OPS [1] under the for-next/ftrace tag.
>
> The first three patches consolidate the two existing ftrace APIs for registering
> direct calls. They are split to make reviewers' lives easier, but if squashing them is
> preferred, I'd be happy to do so in the next revision.
> Currently, there are both a _ftrace_direct and a _ftrace_direct_multi API. Apart
> from samples and selftests, there are no users of the _ftrace_direct API left
> in-tree so this deletes it and renames the _ftrace_direct_multi API to
> _ftrace_direct for simplicity.
> 
> The main benefit of this refactoring is that, with the API that's left, an
> ftrace_ops backing a direct call will only ever point to one direct call. We can
> therefore store the direct called trampoline address in the ops (patch 4) and
> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
> destination would be out of reach of a BL instruction at the ftrace callsite.
> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
> 
> This series has been tested on both arm64 and x86_64 with:
> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
> 2- samples/ftrace/*.ko (cf: patch 8)
> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

> 
> This follows up on prior art by Xu Kuohai [2].
> The implementation here is totally different but the fix for ftrace selftests
> (patch 6) is a trivial rebase of a patch originally by Xu so I kept his
> authorship and trailers untouched on that patch; I hope that's ok.
>

that's ok for me, thanks.

> 1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
> 2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/
> 
> Florent Revest (7):
>    ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
>    ftrace: Remove the legacy _ftrace_direct API
>    ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
>    ftrace: Store direct called addresses in their ops
>    ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
>    arm64: ftrace: Add direct call support
>    arm64: ftrace: Add direct called trampoline samples support
> 
> Xu Kuohai (1):
>    ftrace: Fix dead loop caused by direct call in ftrace selftest
> 
>   arch/arm64/Kconfig                          |   4 +
>   arch/arm64/include/asm/ftrace.h             |  24 ++
>   arch/arm64/kernel/asm-offsets.c             |   6 +
>   arch/arm64/kernel/entry-ftrace.S            |  70 +++-
>   arch/arm64/kernel/ftrace.c                  |  36 +-
>   include/linux/ftrace.h                      |  51 +--
>   kernel/bpf/trampoline.c                     |  14 +-
>   kernel/trace/Kconfig                        |   2 +-
>   kernel/trace/ftrace.c                       | 433 +-------------------
>   kernel/trace/trace_selftest.c               |  14 +-
>   samples/Kconfig                             |   2 +-
>   samples/ftrace/ftrace-direct-modify.c       |  41 +-
>   samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
>   samples/ftrace/ftrace-direct-multi.c        |  28 +-
>   samples/ftrace/ftrace-direct-too.c          |  35 +-
>   samples/ftrace/ftrace-direct.c              |  33 +-
>   16 files changed, 333 insertions(+), 504 deletions(-)
>
  
Daniel Borkmann Feb. 2, 2023, 10:50 a.m. UTC | #2
On 2/2/23 9:36 AM, Xu Kuohai wrote:
> On 2/2/2023 12:34 AM, Florent Revest wrote:
>> This series adds ftrace direct call support to arm64.
>> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
>>
>> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
>> series on CALL_OPS [1] under the for-next/ftrace tag.
>>
>> The first three patches consolidate the two existing ftrace APIs for registering
>> direct calls. They are split to make reviewers' lives easier, but if squashing them is
>> preferred, I'd be happy to do so in the next revision.
>> Currently, there are both a _ftrace_direct and a _ftrace_direct_multi API. Apart
>> from samples and selftests, there are no users of the _ftrace_direct API left
>> in-tree so this deletes it and renames the _ftrace_direct_multi API to
>> _ftrace_direct for simplicity.
>>
>> The main benefit of this refactoring is that, with the API that's left, an
>> ftrace_ops backing a direct call will only ever point to one direct call. We can
>> therefore store the direct called trampoline address in the ops (patch 4) and
>> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
>> destination would be out of reach of a BL instruction at the ftrace callsite.
>> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
>>
>> This series has been tested on both arm64 and x86_64 with:
>> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
>> 2- samples/ftrace/*.ko (cf: patch 8)
>> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)

Thanks a ton for working on this!

> so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

+1, with regard to logistics, if possible it might be nice to eventually get
this into a feature branch on arm64 tree, then we could pull it too from there
for bpf-next and hash out the BPF CI bits for arm64 in the meantime.

>> This follows up on prior art by Xu Kuohai [2].
>> The implementation here is totally different but the fix for ftrace selftests
>> (patch 6) is a trivial rebase of a patch originally by Xu so I kept his
>> authorship and trailers untouched on that patch; I hope that's ok.
>>
> 
> that's ok for me, thanks.
> 
>> 1: https://lore.kernel.org/all/20230123134603.1064407-1-mark.rutland@arm.com/
>> 2: https://lore.kernel.org/bpf/20220913162732.163631-1-xukuohai@huaweicloud.com/
>>
>> Florent Revest (7):
>>    ftrace: Replace uses of _ftrace_direct APIs with _ftrace_direct_multi
>>    ftrace: Remove the legacy _ftrace_direct API
>>    ftrace: Rename _ftrace_direct_multi APIs to _ftrace_direct APIs
>>    ftrace: Store direct called addresses in their ops
>>    ftrace: Make DIRECT_CALLS work WITH_ARGS and !WITH_REGS
>>    arm64: ftrace: Add direct call support
>>    arm64: ftrace: Add direct called trampoline samples support
>>
>> Xu Kuohai (1):
>>    ftrace: Fix dead loop caused by direct call in ftrace selftest
>>
>>   arch/arm64/Kconfig                          |   4 +
>>   arch/arm64/include/asm/ftrace.h             |  24 ++
>>   arch/arm64/kernel/asm-offsets.c             |   6 +
>>   arch/arm64/kernel/entry-ftrace.S            |  70 +++-
>>   arch/arm64/kernel/ftrace.c                  |  36 +-
>>   include/linux/ftrace.h                      |  51 +--
>>   kernel/bpf/trampoline.c                     |  14 +-
>>   kernel/trace/Kconfig                        |   2 +-
>>   kernel/trace/ftrace.c                       | 433 +-------------------
>>   kernel/trace/trace_selftest.c               |  14 +-
>>   samples/Kconfig                             |   2 +-
>>   samples/ftrace/ftrace-direct-modify.c       |  41 +-
>>   samples/ftrace/ftrace-direct-multi-modify.c |  44 +-
>>   samples/ftrace/ftrace-direct-multi.c        |  28 +-
>>   samples/ftrace/ftrace-direct-too.c          |  35 +-
>>   samples/ftrace/ftrace-direct.c              |  33 +-
>>   16 files changed, 333 insertions(+), 504 deletions(-)
>>
>
  
Florent Revest Feb. 2, 2023, 5:32 p.m. UTC | #3
On Thu, Feb 2, 2023 at 11:50 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 2/2/23 9:36 AM, Xu Kuohai wrote:
> > On 2/2/2023 12:34 AM, Florent Revest wrote:
> >> This series adds ftrace direct call support to arm64.
> >> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> >>
> >> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> >> series on CALL_OPS [1] under the for-next/ftrace tag.
> >>
> >> The first three patches consolidate the two existing ftrace APIs for registering
> >> direct calls. They are split to make reviewers' lives easier, but if squashing them is
> >> preferred, I'd be happy to do so in the next revision.
> >> Currently, there are both a _ftrace_direct and a _ftrace_direct_multi API. Apart
> >> from samples and selftests, there are no users of the _ftrace_direct API left
> >> in-tree so this deletes it and renames the _ftrace_direct_multi API to
> >> _ftrace_direct for simplicity.
> >>
> >> The main benefit of this refactoring is that, with the API that's left, an
> >> ftrace_ops backing a direct call will only ever point to one direct call. We can
> >> therefore store the direct called trampoline address in the ops (patch 4) and
> >> look it up from the ftrace trampoline on arm64 (patch 7) in the case when the
> >> destination would be out of reach of a BL instruction at the ftrace callsite.
> >> (in this case, ftrace_caller acts as a lightweight intermediary trampoline)
> >>
> >> This series has been tested on both arm64 and x86_64 with:
> >> 1- CONFIG_FTRACE_SELFTEST (cf: patch 6)
> >> 2- samples/ftrace/*.ko (cf: patch 8)
> >> 3- tools/testing/selftests/bpf/test_progs (both -t lsm and -t fentry_fexit)
>
> Thanks a ton for working on this!
>
> > so it's time to update DENYLIST.aarch64 to unblock tests that failed due to lack of direct call.

That's a good point Xu, thanks! I'll update the deny list in my next revision.
It looks like this series fixes *a lot* of these tests, so that's exciting. :)

> +1, with regard to logistics, if possible it might be nice to eventually get
> this into a feature branch on arm64 tree, then we could pull it too from there
> for bpf-next and hash out the BPF CI bits for arm64 in the meantime.

I believe that Manu Bretelle already wired up the BPF CI for arm64; is
there more work required?
Regarding the logistics, whatever works sgtm... :) I suppose it's up
to Catalin or Will.
  
Steven Rostedt Feb. 2, 2023, 8:06 p.m. UTC | #4
On Wed,  1 Feb 2023 17:34:12 +0100
Florent Revest <revest@chromium.org> wrote:

> It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> series on CALL_OPS [1] under the for-next/ftrace tag.

Just a note for future ftrace patches. Could you add the link to the
arm64 tree, so I don't need to go look for it ;-)

(Yes, I'm lazy)

-- Steve
  
Mark Rutland Feb. 3, 2023, 9:49 a.m. UTC | #5
On Thu, Feb 02, 2023 at 03:06:47PM -0500, Steven Rostedt wrote:
> On Wed,  1 Feb 2023 17:34:12 +0100
> Florent Revest <revest@chromium.org> wrote:
> 
> > It is meant to apply on top of the arm64 tree which contains Mark Rutland's
> > series on CALL_OPS [1] under the for-next/ftrace tag.
> 
> Just a note for future ftrace patches. Could you add the link to the
> arm64 tree, so I don't need to go look for it ;-)

For the benefit of others looking for it now, the arm64 tree lives at:

  https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/
  git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git

Mark.