[v10,0/2] Introducing trace buffer mapping by user-space

Message ID 20240105094729.2363579-1-vdonnefort@google.com

Vincent Donnefort Jan. 5, 2024, 9:47 a.m. UTC
  The tracing ring-buffers can be stored on disk or sent to the network
without any copy via splice. However, the latter doesn't allow real-time
processing of the traces. A solution is to give userspace direct access
to the ring-buffer pages via a mapping. An application can now become a
consumer of the ring-buffer, in a similar fashion to what trace_pipe
offers.
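
As a rough illustration of the consumer side (this is not code from the
series): the sketch below maps the first page of a file read-only and keeps
the fd open. It assumes the mapping is exposed through mmap() on a per-CPU
trace_pipe_raw file and that the first page is the meta-page. struct
rb_mapping and both helpers are hypothetical names used only here.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical handle: the open file plus its first mapped page. */
struct rb_mapping {
	int fd;		/* kept open for later ioctls on the mapping */
	void *meta;	/* first page; assumed to be the meta-page */
	size_t len;	/* length of the mapping, one system page */
};

/* Open @path and map its first page read-only. Returns 0 or -1. */
int rb_mapping_open(struct rb_mapping *m, const char *path)
{
	long page_size = sysconf(_SC_PAGESIZE);

	m->fd = -1;
	m->meta = NULL;
	m->len = 0;

	if (page_size <= 0)
		return -1;

	m->fd = open(path, O_RDONLY);
	if (m->fd < 0)
		return -1;

	/* MAP_SHARED: the reader sees updates made behind the mapping. */
	m->meta = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED,
		       m->fd, 0);
	if (m->meta == MAP_FAILED) {
		close(m->fd);
		m->fd = -1;
		m->meta = NULL;
		return -1;
	}

	m->len = (size_t)page_size;
	return 0;
}

/* Undo rb_mapping_open(); safe to call on a handle that failed to open. */
void rb_mapping_close(struct rb_mapping *m)
{
	if (m->meta)
		munmap(m->meta, m->len);
	if (m->fd >= 0)
		close(m->fd);
	m->meta = NULL;
	m->len = 0;
	m->fd = -1;
}
```

A caller would pass something like
/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw (path assumed here) and keep
the fd around for the GET_READER_PAGE ioctl mentioned in the v6 changelog.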

Support for this new feature in libtracefs can be found here:

  https://lore.kernel.org/all/20231228201100.78aae259@rorschach.local.home

Vincent

v9 -> v10:
  * Refactor rb_update_meta_page()
  * In-loop declaration for foreach_subbuf_page()
  * Check for cpu_buffer->mapped overflow

v8 -> v9:
  * Fix the unlock path in ring_buffer_map()
  * Fix cpu_buffer cast with rb_work_rq->is_cpu_buffer
  * Rebase on linux-trace/for-next (3cb3091138ca0921c4569bcf7ffa062519639b6a)

v7 -> v8:
  * Drop the subbufs renaming into bpages
  * Use subbuf as a name when relevant

v6 -> v7:
  * Rebase onto lore.kernel.org/lkml/20231215175502.106587604@goodmis.org/
  * Support for subbufs
  * Rename subbufs into bpages

v5 -> v6:
  * Rebase on next-20230802.
  * (unsigned long) -> (void *) cast for virt_to_page().
  * Add a wait for the GET_READER_PAGE ioctl.
  * Move the writer fields update (overrun/pages_lost/entries/pages_touched)
    into the irq_work.
  * Rearrange id in struct buffer_page.
  * Rearrange the meta-page.
  * ring_buffer_meta_page -> trace_buffer_meta_page.
  * Add meta_struct_len into the meta-page.

v4 -> v5:
  * Trivial rebase onto 6.5-rc3 (previously 6.4-rc3)

v3 -> v4:
  * Add to the meta-page:
       - pages_lost / pages_read (allows computing how full the
         ring-buffer is)
       - read (allows computing how many entries can be read)
       - A reader_page struct.
  * Rename ring_buffer_meta_header -> ring_buffer_meta
  * Rename ring_buffer_get_reader_page -> ring_buffer_map_get_reader_page
  * Properly consume events on ring_buffer_map_get_reader_page() with
    rb_advance_reader().

v2 -> v3:
  * Remove data page list (for non-consuming read)
    ** Implies removing order > 0 meta-page
  * Add a new meta page field ->read
  * Rename ring_buffer_meta_page_header into ring_buffer_meta_header

v1 -> v2:
  * Hide data_pages from the userspace struct
  * Fix META_PAGE_MAX_PAGES
  * Support for order > 0 meta-page
  * Add missing page->mapping.

---

Vincent Donnefort (2):
  ring-buffer: Introducing ring-buffer mapping functions
  tracing: Allow user-space mapping of the ring-buffer

 include/linux/ring_buffer.h     |   7 +
 include/uapi/linux/trace_mmap.h |  31 +++
 kernel/trace/ring_buffer.c      | 384 +++++++++++++++++++++++++++++++-
 kernel/trace/trace.c            |  79 ++++++-
 4 files changed, 497 insertions(+), 4 deletions(-)
 create mode 100644 include/uapi/linux/trace_mmap.h


base-commit: 3cb3091138ca0921c4569bcf7ffa062519639b6a
  

Comments

Masami Hiramatsu (Google) Jan. 9, 2024, 1:04 p.m. UTC | #1
Hi Vincent,

On Fri,  5 Jan 2024 09:47:27 +0000
Vincent Donnefort <vdonnefort@google.com> wrote:

> The tracing ring-buffers can be stored on disk or sent to network
> without any copy via splice. However, the latter doesn't allow real-time
> processing of the traces. A solution is to give userspace direct access
> to the ring-buffer pages via a mapping. An application can now become a
> consumer of the ring-buffer, in a similar fashion to what trace_pipe
> offers.

I think this is a very nice feature. But this series seems to contain just
the feature, with no documentation and no example code. Could you add two
patches for those? I know libtracefs already provides support code, but I
think it would be better to have test code under
tools/testing/selftests/ring_buffer.

I also wonder what happens if another operation (e.g. taking a snapshot)
runs while the ring buffer is mmapped.

Thank you,

  
Steven Rostedt Jan. 9, 2024, 1:20 p.m. UTC | #2
Hi Masami, thanks for looking at this.

On Tue, 9 Jan 2024 22:04:45 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> > The tracing ring-buffers can be stored on disk or sent to network
> > without any copy via splice. However, the latter doesn't allow real-time
> > processing of the traces. A solution is to give userspace direct access
> > to the ring-buffer pages via a mapping. An application can now become a
> > consumer of the ring-buffer, in a similar fashion to what trace_pipe
> > offers.  
> 
> I think this is a very nice feature. But this series seems to contain just
> the feature, with no documentation and no example code. Could you add two
> patches for those? I know libtracefs already provides support code, but I
> think it would be better to have test code under
> tools/testing/selftests/ring_buffer.

Yeah, we should have sample code and a test.

> 
> I also wonder what happens if another operation (e.g. taking a snapshot)
> runs while the ring buffer is mmapped.

Hmm, good point. We should disable snapshots while the buffer is mapped, and
also prevent mapping with the latency tracers if we are not already doing
that.

-- Steve
  
Vincent Donnefort Jan. 9, 2024, 1:47 p.m. UTC | #3
On Tue, Jan 09, 2024 at 08:20:57AM -0500, Steven Rostedt wrote:
> 
> Hi Masami, thanks for looking at this.
> 
> On Tue, 9 Jan 2024 22:04:45 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> 
> > > The tracing ring-buffers can be stored on disk or sent to network
> > > without any copy via splice. However, the latter doesn't allow real-time
> > > processing of the traces. A solution is to give userspace direct access
> > > to the ring-buffer pages via a mapping. An application can now become a
> > > consumer of the ring-buffer, in a similar fashion to what trace_pipe
> > > offers.  
> > 
> > I think this is a very nice feature. But this series seems to contain just
> > the feature, with no documentation and no example code. Could you add two
> > patches for those? I know libtracefs already provides support code, but I
> > think it would be better to have test code under
> > tools/testing/selftests/ring_buffer.
> 
> Yeah, we should have sample code and a test.

Ack. I will recycle what I had in the cover letter into a ring_buffer selftest.

> 
> > 
> > I also wonder what happens if another operation (e.g. taking a snapshot)
> > runs while the ring buffer is mmapped.
> 
> Hmm, good point. We should disable snapshots while the buffer is mapped, and
> also prevent mapping with the latency tracers if we are not already doing
> that.

ring_buffer_swap_cpu() is already disabled when the buffer is mapped, and
resize_disabled is set as well. Is anything else necessary?
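
To make the intended behaviour concrete, here is a small user-space model of
it. This is a toy, not the kernel code: struct toy_cpu_buffer and the toy_*()
helpers are hypothetical names, and the error codes are only a guess at what
makes sense. It shows a mapped counter with an overflow check (as in the v10
changelog), resizing pinned while at least one mapping exists, and a swap
that refuses to run on a mapped buffer.

```c
#include <errno.h>
#include <limits.h>

/* Hypothetical stand-in for the per-CPU buffer state. */
struct toy_cpu_buffer {
	unsigned int mapped;		/* number of active user mappings */
	unsigned int resize_disabled;	/* non-zero blocks resizing */
};

/* Take a mapping reference; refuse if the counter would wrap. */
int toy_map(struct toy_cpu_buffer *b)
{
	if (b->mapped == UINT_MAX)
		return -EBUSY;
	if (b->mapped++ == 0)
		b->resize_disabled++;	/* first mapping pins the geometry */
	return 0;
}

/* Drop a mapping reference; the last one re-enables resizing. */
int toy_unmap(struct toy_cpu_buffer *b)
{
	if (b->mapped == 0)
		return -ENODEV;
	if (--b->mapped == 0)
		b->resize_disabled--;
	return 0;
}

/* A snapshot-style swap must not pull pages from under a mapping. */
int toy_swap_cpu(const struct toy_cpu_buffer *a,
		 const struct toy_cpu_buffer *b)
{
	if (a->mapped || b->mapped)
		return -EBUSY;
	return 0;	/* the actual page swap would happen here */
}
```

In this model a snapshot taken while a buffer is mapped fails with -EBUSY
instead of swapping pages away from the reader, assuming snapshots go
through the swap path.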

> 
> -- Steve