Message ID | 20240111161712.1480333-5-vdonnefort@google.com |
---|---|
State | New |
Headers |
Date: Thu, 11 Jan 2024 16:17:11 +0000; Message-ID: <20240111161712.1480333-5-vdonnefort@google.com>; In-Reply-To: <20240111161712.1480333-1-vdonnefort@google.com>; References: <20240111161712.1480333-1-vdonnefort@google.com>; Subject: [PATCH v11 4/5] Documentation: tracing: Add ring-buffer mapping; From: Vincent Donnefort <vdonnefort@google.com>; To: rostedt@goodmis.org, mhiramat@kernel.org, linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org; Cc: mathieu.desnoyers@efficios.com, kernel-team@android.com, Vincent Donnefort <vdonnefort@google.com>; List-Id: <linux-kernel.vger.kernel.org>; X-Mailer: git-send-email 2.43.0.275.g3460e3d667-goog |
Series | Introducing trace buffer mapping by user-space |
Commit Message
Vincent Donnefort
Jan. 11, 2024, 4:17 p.m. UTC
It is now possible to mmap() a ring-buffer to stream its content. Add
some documentation and a code example.
Signed-off-by: Vincent Donnefort <vdonnefort@google.com>
Comments
On Thu, 11 Jan 2024 16:17:11 +0000 Vincent Donnefort <vdonnefort@google.com> wrote:

> It is now possible to mmap() a ring-buffer to stream its content. Add
> some documentation and a code example.
>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

[...]

> +	int main(void)
> +	{
> +		int page_size = getpagesize(), fd, reader_id;
> +		unsigned long meta_len, data_len;
> +		struct trace_buffer_meta *meta;
> +		void *map, *reader, *data;

nit: this example code has a compile warning.

rbmap.c: In function ‘main’:
rbmap.c:18:21: warning: variable ‘reader’ set but not used [-Wunused-but-set-variable]
   18 |         void *map, *reader, *data;
      |                     ^~~~~~

[...]

> +		reader_id = meta->reader.id;
> +		reader = data + meta->subbuf_size * reader_id;

So here, maybe you need;

printf("Current read sub-buffer address: %p\n", reader);

Thank you,

Hi Vincent,

On Thu, 11 Jan 2024 16:17:11 +0000 Vincent Donnefort <vdonnefort@google.com> wrote:

> It is now possible to mmap() a ring-buffer to stream its content. Add
> some documentation and a code example.
>
> Signed-off-by: Vincent Donnefort <vdonnefort@google.com>

[...]

> +Limitations
> +===========
> +When a mapping is in place on a Tracefs ring-buffer, it is not possible to
> +either resize it (either by increasing the entire size of the ring-buffer or
> +each subbuf). It is also not possible to use snapshot or splice.

I've played with the sample code.

- "free_buffer" just doesn't work when the process is mmap the ring buffer.
- After mmap the buffers, when the snapshot took, the IOCTL returns an error.

OK, but I rather like to fail snapshot with -EBUSY when the buffer is mmaped.

[...]

> +		reader_id = meta->reader.id;
> +		reader = data + meta->subbuf_size * reader_id;

Also, this caused a bus error if I add below 2 lines here.

printf("reader_id: %d, addr: %p\n", reader_id, reader);
printf("read data head: %lx\n", *(unsigned long *)reader);

-----
/ # cd /sys/kernel/tracing/
/sys/kernel/tracing # echo 1 > events/enable
[   17.941894] Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1
/sys/kernel/tracing #
/sys/kernel/tracing # echo 1 > buffer_percent
/sys/kernel/tracing # /mnt/rbmap2
entries:        245291
overrun:        203741
read:           0
subbufs_touched:2041
subbufs_lost:   1688
subbufs_read:   0
nr_subbufs:     355
reader_id: 1, addr: 0x7f0cde51a000
Bus error
-----

Is this expected behavior? how can I read the ring buffer?

Thank you,

On Sun, 14 Jan 2024 23:26:43 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> Hi Vincent,
>
> On Thu, 11 Jan 2024 16:17:11 +0000
> Vincent Donnefort <vdonnefort@google.com> wrote:

[...]

> > +		data_len = meta->subbuf_size * meta->nr_subbufs;
> > +		data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);

The above is buggy. It should be:

	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);

The last parameter is where to start the mapping from, which is just
after the meta page. The code is currently starting the map far away
from that.

-- Steve

> > +		if (data == MAP_FAILED)
> > +			exit(EXIT_FAILURE);
> > +
> > +		if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
> > +			exit(EXIT_FAILURE);
> > +
> > +		reader_id = meta->reader.id;
> > +		reader = data + meta->subbuf_size * reader_id;
>
> Also, this caused a bus error if I add below 2 lines here.
>
> printf("reader_id: %d, addr: %p\n", reader_id, reader);
> printf("read data head: %lx\n", *(unsigned long *)reader);

[...]

> reader_id: 1, addr: 0x7f0cde51a000
> Bus error
> -----
>
> Is this expected behavior? how can I read the ring buffer?

On Sun, 14 Jan 2024 11:23:24 -0500 Steven Rostedt <rostedt@goodmis.org> wrote:

> On Sun, 14 Jan 2024 23:26:43 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

[...]

> > > +		data_len = meta->subbuf_size * meta->nr_subbufs;
> > > +		data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
>
> The above is buggy. It should be:
>
> 	data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, meta_len);
>
> The last parameter is where to start the mapping from, which is just
> after the meta page. The code is currently starting the map far away
> from that.

Ah, indeed! I confirmed that fixed the bus error. Thank you!

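The role of the last mmap() argument in the exchange above can be illustrated without tracefs. The following is only a sketch, not code from the patch: an ordinary temporary file stands in for trace_pipe_raw, its first page plays the role of the meta-page, and the function name first_byte_after_header() is made up for the illustration. It shows that a mapping whose offset equals the header length starts at the first data byte.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Illustration only: a temporary file replaces trace_pipe_raw.
 * Page 0 is the stand-in "meta-page" and starts with 'M'.
 * Page 1 is the stand-in first sub-buffer and starts with 'D'.
 *
 * Returns the first byte visible through a mapping that starts
 * right after the header page, or -1 on error.
 */
static int first_byte_after_header(void)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *f = tmpfile();
	unsigned char *data;
	int fd, ret;

	if (!f || page <= 0)
		return -1;

	fd = fileno(f);

	/* Size the file to two pages and tag the start of each page. */
	if (ftruncate(fd, 2 * page) < 0 ||
	    pwrite(fd, "M", 1, 0) != 1 ||
	    pwrite(fd, "D", 1, page) != 1) {
		fclose(f);
		return -1;
	}

	/* The last argument is the file offset where the mapping starts. */
	data = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, page);
	if (data == MAP_FAILED) {
		fclose(f);
		return -1;
	}

	ret = data[0];

	munmap(data, page);
	fclose(f);

	return ret;
}
```

Called from a test program, first_byte_after_header() returns 'D', the tag written at the start of the second page. Passing the data length as the offset, as the example under review does, would instead start the mapping that many bytes into the file.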
diff --git a/Documentation/trace/index.rst b/Documentation/trace/index.rst
index 5092d6c13af5..0b300901fd75 100644
--- a/Documentation/trace/index.rst
+++ b/Documentation/trace/index.rst
@@ -29,6 +29,7 @@ Linux Tracing Technologies
    timerlat-tracer
    intel_th
    ring-buffer-design
+   ring-buffer-map
    stm
    sys-t
    coresight/index
diff --git a/Documentation/trace/ring-buffer-map.rst b/Documentation/trace/ring-buffer-map.rst
new file mode 100644
index 000000000000..2ba7b5339178
--- /dev/null
+++ b/Documentation/trace/ring-buffer-map.rst
@@ -0,0 +1,105 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==================================
+Tracefs ring-buffer memory mapping
+==================================
+
+:Author: Vincent Donnefort <vdonnefort@google.com>
+
+Overview
+========
+Tracefs ring-buffer memory map provides an efficient method to stream data
+as no memory copy is necessary. The application mapping the ring-buffer becomes
+then a consumer for that ring-buffer, in a similar fashion to trace_pipe.
+
+Memory mapping setup
+====================
+The mapping works with a mmap() of the trace_pipe_raw interface.
+
+The first system page of the mapping contains ring-buffer statistics and
+description. It is referred as the meta-page. One of the most important field of
+the meta-page is the reader. It contains the subbuf ID which can be safely read
+by the mapper (see ring-buffer-design.rst).
+
+The meta-page is followed by all the subbuf, ordered by ascendant ID. It is
+therefore effortless to know where the reader starts in the mapping:
+
+.. code-block:: c
+
+	reader_id = meta->reader->id;
+	reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size;
+
+When the application is done with the current reader, it can get a new one using
+the trace_pipe_raw ioctl() TRACE_MMAP_IOCTL_GET_READER. This ioctl also updates
+the meta-page fields.
+
+Limitations
+===========
+When a mapping is in place on a Tracefs ring-buffer, it is not possible to
+either resize it (either by increasing the entire size of the ring-buffer or
+each subbuf). It is also not possible to use snapshot or splice.
+
+Concurrent readers (either another application mapping that ring-buffer or the
+kernel with trace_pipe) are allowed but not recommended. They will compete for
+the ring-buffer and the output is unpredictable.
+
+Example
+=======
+
+.. code-block:: c
+
+	#include <fcntl.h>
+	#include <stdio.h>
+	#include <stdlib.h>
+	#include <unistd.h>
+
+	#include <linux/trace_mmap.h>
+
+	#include <sys/mman.h>
+	#include <sys/ioctl.h>
+
+	#define TRACE_PIPE_RAW "/sys/kernel/tracing/per_cpu/cpu0/trace_pipe_raw"
+
+	int main(void)
+	{
+		int page_size = getpagesize(), fd, reader_id;
+		unsigned long meta_len, data_len;
+		struct trace_buffer_meta *meta;
+		void *map, *reader, *data;
+
+		fd = open(TRACE_PIPE_RAW, O_RDONLY);
+		if (fd < 0)
+			exit(EXIT_FAILURE);
+
+		map = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
+		if (map == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		meta = (struct trace_buffer_meta *)map;
+		meta_len = meta->meta_page_size;
+
+		printf("entries:        %lu\n", meta->entries);
+		printf("overrun:        %lu\n", meta->overrun);
+		printf("read:           %lu\n", meta->read);
+		printf("subbufs_touched:%lu\n", meta->subbufs_touched);
+		printf("subbufs_lost:   %lu\n", meta->subbufs_lost);
+		printf("subbufs_read:   %lu\n", meta->subbufs_read);
+		printf("nr_subbufs:     %u\n", meta->nr_subbufs);
+
+		data_len = meta->subbuf_size * meta->nr_subbufs;
+		data = mmap(NULL, data_len, PROT_READ, MAP_SHARED, fd, data_len);
+		if (data == MAP_FAILED)
+			exit(EXIT_FAILURE);
+
+		if (ioctl(fd, TRACE_MMAP_IOCTL_GET_READER) < 0)
+			exit(EXIT_FAILURE);
+
+		reader_id = meta->reader.id;
+		reader = data + meta->subbuf_size * reader_id;
+
+		munmap(data, data_len);
+		munmap(meta, meta_len);
+		close (fd);
+
+		return 0;
+	}
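
The offset arithmetic described in the "Memory mapping setup" section of the patch can be written as two small helpers. This is a sketch under stated assumptions, not part of the submission: the names subbuf_file_offset() and subbuf_data_offset() are hypothetical, the sizes are passed in as plain integers rather than read from struct trace_buffer_meta, and the second helper assumes the data mapping starts at file offset meta_page_size, as suggested in the review.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: byte offset of sub-buffer `reader_id` from the
 * start of the trace_pipe_raw mapping, where the meta-page area
 * (meta_page_size bytes) is followed by the sub-buffers in ascending
 * ID order.
 */
static size_t subbuf_file_offset(size_t meta_page_size, size_t subbuf_size,
				 uint32_t reader_id)
{
	return meta_page_size + (size_t)reader_id * subbuf_size;
}

/*
 * Hypothetical helper: offset of the same sub-buffer inside a separate
 * data mapping. Assumes that mapping was created with meta_page_size as
 * its mmap() offset, so sub-buffer 0 sits at offset 0.
 */
static size_t subbuf_data_offset(size_t subbuf_size, uint32_t reader_id,
				 uint32_t nr_subbufs)
{
	/* A reader ID outside the mapped range would point past the mapping. */
	assert(reader_id < nr_subbufs);

	return (size_t)reader_id * subbuf_size;
}
```

With a 4096-byte meta-page and 4096-byte sub-buffers, subbuf_file_offset(4096, 4096, 1) evaluates to 8192 and subbuf_data_offset(4096, 1, 355) to 4096: the two differ by exactly the meta-page size, which is the offset the data mapping is expected to start from.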