[RFC] tracing/ring-buffer: Drop inappropriate WARN in rb_set_head_page()

Message ID 20230301034702.3449755-1-zhengyejian1@huawei.com
State New
Headers
Series [RFC] tracing/ring-buffer: Drop inappropriate WARN in rb_set_head_page() |

Commit Message

Zheng Yejian March 1, 2023, 3:47 a.m. UTC
  Following WARNING appears several times during test on v5.10 but
mainline kernel should have the same problem. However I currently
can't find the reproduction method.

WARNING: CPU: 29 PID: 686834 at kernel/trace/ring_buffer.c:1357
           rb_set_head_page+0x168/0x264
Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/trace/ring_buffer.c?h=linux-5.10.y#n1357
Call trace:
 rb_set_head_page+0x168/0x264
 rb_per_cpu_empty+0x34/0x15c
 ring_buffer_empty_cpu.part.0.isra.0+0x1a4/0x3f0
 ring_buffer_empty_cpu+0x74/0xb4
 __find_next_entry+0x14c/0x2f4
 trace_find_next_entry_inc+0x48/0x13c
 tracing_read_pipe+0x2c8/0x6b4
 vfs_read+0x144/0x324
 ksys_read+0x104/0x220
 __arm64_sys_read+0x54/0x70
 el0_svc_common.constprop.0+0xd8/0x37c
 do_el0_svc+0x50/0x120
 el0_svc+0x24/0x3c
 el0_sync_handler+0x17c/0x180
 el0_sync+0x160/0x180

The WARNING appears because rb_set_head_page() didn't grab the header
after three loops traversing buffer pages. This was not considered
to be expected, as comment said, writer possibly moves the header in
one loop.

However, supposing writer keeps moving the header, we may miss more
loops and it seems normal not to grab the header within three loops
in rb_set_head_page(). Therefore drop that RB_WARN_ON().

Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
---
 kernel/trace/ring_buffer.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
  

Comments

Steven Rostedt March 1, 2023, 2:46 p.m. UTC | #1
On Wed, 1 Mar 2023 11:47:02 +0800
Zheng Yejian <zhengyejian1@huawei.com> wrote:

> Following WARNING appears several times during test on v5.10 but
> mainline kernel should have the same problem. However I currently
> can't find the reproduction method.
> 
> WARNING: CPU: 29 PID: 686834 at kernel/trace/ring_buffer.c:1357
>            rb_set_head_page+0x168/0x264
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/trace/ring_buffer.c?h=linux-5.10.y#n1357
> Call trace:
>  rb_set_head_page+0x168/0x264
>  rb_per_cpu_empty+0x34/0x15c
>  ring_buffer_empty_cpu.part.0.isra.0+0x1a4/0x3f0
>  ring_buffer_empty_cpu+0x74/0xb4
>  __find_next_entry+0x14c/0x2f4
>  trace_find_next_entry_inc+0x48/0x13c
>  tracing_read_pipe+0x2c8/0x6b4
>  vfs_read+0x144/0x324
>  ksys_read+0x104/0x220
>  __arm64_sys_read+0x54/0x70
>  el0_svc_common.constprop.0+0xd8/0x37c
>  do_el0_svc+0x50/0x120
>  el0_svc+0x24/0x3c
>  el0_sync_handler+0x17c/0x180
>  el0_sync+0x160/0x180
> 
> The WARNING appears because rb_set_head_page() didn't grab the header
> after three loops traversing buffer pages. This was not considered
> to be expected, as comment said, writer possibly moves the header in
> one loop.
> 
> However, supposing writer keeps moving the header, we may miss more
> loops and it seems normal not to grab the header within three loops
> in rb_set_head_page(). Therefore drop that RB_WARN_ON().

It's normal to grab the header in two loops. I only made it three in case I
was wrong. If it took 4 tries, something is wrong. Just returning NULL
without setting the header will cause bugs elsewhere.

-- Steve


> 
> Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
> ---
>  kernel/trace/ring_buffer.c | 6 +-----
>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index af50d931b020..cbfa306570d3 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -1471,9 +1471,7 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer)
>  	page = head = cpu_buffer->head_page;
>  	/*
>  	 * It is possible that the writer moves the header behind
> -	 * where we started, and we miss in one loop.
> -	 * A second loop should grab the header, but we'll do
> -	 * three loops just because I'm paranoid.
> +	 * where we started, so we try three loops to grab the header.
>  	 */
>  	for (i = 0; i < 3; i++) {
>  		do {
> @@ -1485,8 +1483,6 @@ rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer)
>  		} while (page != head);
>  	}
>  
> -	RB_WARN_ON(cpu_buffer, 1);
> -
>  	return NULL;
>  }
>
  

Patch

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index af50d931b020..cbfa306570d3 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -1471,9 +1471,7 @@  rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer)
 	page = head = cpu_buffer->head_page;
 	/*
 	 * It is possible that the writer moves the header behind
-	 * where we started, and we miss in one loop.
-	 * A second loop should grab the header, but we'll do
-	 * three loops just because I'm paranoid.
+	 * where we started, so we try three loops to grab the header.
 	 */
 	for (i = 0; i < 3; i++) {
 		do {
@@ -1485,8 +1483,6 @@  rb_set_head_page(struct ring_buffer_per_cpu *cpu_buffer)
 		} while (page != head);
 	}
 
-	RB_WARN_ON(cpu_buffer, 1);
-
 	return NULL;
 }