[printk,v3,15/40] kdb: use srcu console list iterator

Message ID 20221107141638.3790965-16-john.ogness@linutronix.de
State New
Headers
Series reduce console_lock scope |

Commit Message

John Ogness Nov. 7, 2022, 2:16 p.m. UTC
  Guarantee safe iteration of the console list by using SRCU.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/debug/kdb/kdb_io.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
  

Comments

Daniel Thompson Nov. 9, 2022, 8:53 a.m. UTC | #1
On Mon, Nov 07, 2022 at 03:22:13PM +0106, John Ogness wrote:
> Guarantee safe iteration of the console list by using SRCU.
>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/debug/kdb/kdb_io.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 550fe8b456ec..ed8289ce4fcb 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -545,6 +545,7 @@ static void kdb_msg_write(const char *msg, int msg_len)
>  {
>  	struct console *c;
>  	const char *cp;
> +	int cookie;
>  	int len;
>
>  	if (msg_len == 0)
> @@ -558,7 +559,15 @@ static void kdb_msg_write(const char *msg, int msg_len)
>  		cp++;
>  	}
>
> -	for_each_console(c) {
> +	/*
> +	 * The console_srcu_read_lock() only provides safe console list
> +	 * traversal. The use of the ->write() callback relies on all other
> +	 * CPUs being stopped at the moment and console drivers being able to
> +	 * handle reentrance when @oops_in_progress is set. (Note that there
> +	 * is no guarantee for either criteria.)
> +	 */

The debugger entry protocol does ensure that other CPUs are either
stopped or unresponsive. In the case where the other CPU is unresponsive
(e.g. timed out after being asked to stop) then there is a "real" printk()
issued prior to any of the above interference with the console system to
the developer driving the debugger gets as much clue as we can offer them
about what is going on (typically this is emitted from regular interrupt
context).

Given this comment is part of the debugger code then for the
oops_in_progress hack it might be more helpful to describe what
the developer in front the debugger needs to do to have the most
reliable debug session possible.

  There is no guarantee that every console drivers can handle reentrance
  in this way; the developer deploying the debugger is responsible for
  ensuring that the console drivers they have selected handle reentrance
  appropriately.


Daniel.


> +	cookie = console_srcu_read_lock();
> +	for_each_console_srcu(c) {
>  		if (!console_is_enabled(c))
>  			continue;
>  		if (c == dbg_io_ops->cons)
> @@ -577,6 +586,7 @@ static void kdb_msg_write(const char *msg, int msg_len)
>  		--oops_in_progress;
>  		touch_nmi_watchdog();
>  	}
> +	console_srcu_read_unlock(cookie);
>  }
>
>  int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)
> --
> 2.30.2
>
  
John Ogness Nov. 9, 2022, 9:27 a.m. UTC | #2
Hi Daniel,

On 2022-11-09, Daniel Thompson <daniel.thompson@linaro.org> wrote:
>> +	/*
>> +	 * The console_srcu_read_lock() only provides safe console list
>> +	 * traversal. The use of the ->write() callback relies on all other
>> +	 * CPUs being stopped at the moment and console drivers being able to
>> +	 * handle reentrance when @oops_in_progress is set. (Note that there
>> +	 * is no guarantee for either criteria.)
>> +	 */
>
> The debugger entry protocol does ensure that other CPUs are either
> stopped or unresponsive. In the case where the other CPU is unresponsive
> (e.g. timed out after being asked to stop) then there is a "real" printk()
> issued prior to any of the above interference with the console system to
> the developer driving the debugger gets as much clue as we can offer them
> about what is going on (typically this is emitted from regular interrupt
> context).
>
> Given this comment is part of the debugger code then for the
> oops_in_progress hack it might be more helpful to describe what
> the developer in front the debugger needs to do to have the most
> reliable debug session possible.
>
>   There is no guarantee that every console drivers can handle reentrance
>   in this way; the developer deploying the debugger is responsible for
>   ensuring that the console drivers they have selected handle reentrance
>   appropriately.

Thanks for the explanation. I will change the comment to:

	/*
	 * The console_srcu_read_lock() only provides safe console list
	 * traversal. The use of the ->write() callback relies on all other
	 * CPUs being stopped at the moment and console drivers being able to
	 * handle reentrance when @oops_in_progress is set.
	 *
	 * There is no guarantee that every console driver can handle
	 * reentrance in this way; the developer deploying the debugger
	 * is responsible for ensuring that the console drivers they
	 * have selected handle reentrance appropriately.
	 */

John
  
Petr Mladek Nov. 9, 2022, 3:27 p.m. UTC | #3
On Wed 2022-11-09 10:33:55, John Ogness wrote:
> Hi Daniel,
> 
> On 2022-11-09, Daniel Thompson <daniel.thompson@linaro.org> wrote:
> >> +	/*
> >> +	 * The console_srcu_read_lock() only provides safe console list
> >> +	 * traversal. The use of the ->write() callback relies on all other
> >> +	 * CPUs being stopped at the moment and console drivers being able to
> >> +	 * handle reentrance when @oops_in_progress is set. (Note that there
> >> +	 * is no guarantee for either criteria.)
> >> +	 */
> >
> > The debugger entry protocol does ensure that other CPUs are either
> > stopped or unresponsive. In the case where the other CPU is unresponsive
> > (e.g. timed out after being asked to stop) then there is a "real" printk()
> > issued prior to any of the above interference with the console system to
> > the developer driving the debugger gets as much clue as we can offer them
> > about what is going on (typically this is emitted from regular interrupt
> > context).
> >
> > Given this comment is part of the debugger code then for the
> > oops_in_progress hack it might be more helpful to describe what
> > the developer in front the debugger needs to do to have the most
> > reliable debug session possible.
> >
> >   There is no guarantee that every console drivers can handle reentrance
> >   in this way; the developer deploying the debugger is responsible for
> >   ensuring that the console drivers they have selected handle reentrance
> >   appropriately.
> 
> Thanks for the explanation. I will change the comment to:
> 
> 	/*
> 	 * The console_srcu_read_lock() only provides safe console list
> 	 * traversal. The use of the ->write() callback relies on all other
> 	 * CPUs being stopped at the moment and console drivers being able to
> 	 * handle reentrance when @oops_in_progress is set.
> 	 *
> 	 * There is no guarantee that every console driver can handle
> 	 * reentrance in this way; the developer deploying the debugger
> 	 * is responsible for ensuring that the console drivers they
> 	 * have selected handle reentrance appropriately.
> 	 */

Looks good to me.

After merging this with the 10th patch that adds the data_race() annotated
check of CON_ENABLED flag:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
  

Patch

diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
index 550fe8b456ec..ed8289ce4fcb 100644
--- a/kernel/debug/kdb/kdb_io.c
+++ b/kernel/debug/kdb/kdb_io.c
@@ -545,6 +545,7 @@  static void kdb_msg_write(const char *msg, int msg_len)
 {
 	struct console *c;
 	const char *cp;
+	int cookie;
 	int len;
 
 	if (msg_len == 0)
@@ -558,7 +559,15 @@  static void kdb_msg_write(const char *msg, int msg_len)
 		cp++;
 	}
 
-	for_each_console(c) {
+	/*
+	 * The console_srcu_read_lock() only provides safe console list
+	 * traversal. The use of the ->write() callback relies on all other
+	 * CPUs being stopped at the moment and console drivers being able to
+	 * handle reentrance when @oops_in_progress is set. (Note that there
+	 * is no guarantee for either criteria.)
+	 */
+	cookie = console_srcu_read_lock();
+	for_each_console_srcu(c) {
 		if (!console_is_enabled(c))
 			continue;
 		if (c == dbg_io_ops->cons)
@@ -577,6 +586,7 @@  static void kdb_msg_write(const char *msg, int msg_len)
 		--oops_in_progress;
 		touch_nmi_watchdog();
 	}
+	console_srcu_read_unlock(cookie);
 }
 
 int vkdb_printf(enum kdb_msgsrc src, const char *fmt, va_list ap)