[printk,v2,25/26] rcu: Mark emergency section in rcu stalls

Message ID 20240218185726.1994771-26-john.ogness@linutronix.de
State New
Series wire up write_atomic() printing

Commit Message

John Ogness Feb. 18, 2024, 6:57 p.m. UTC
  Mark an emergency section within print_other_cpu_stall(), where
RCU stall information is printed. In this section, the CPU will
not perform console output for the printk() calls. Instead, a
flushing of the console output is triggered when exiting the
emergency section.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/rcu/tree_stall.h | 5 +++++
 1 file changed, 5 insertions(+)
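
The behavior described in the commit message (store messages while in the emergency section, flush on exit) can be illustrated with a small userspace sketch. This is a toy model only, not the kernel implementation: `struct toy_log`, `toy_printk()`, `toy_emergency_enter()` and `toy_emergency_exit()` are hypothetical names invented for illustration, and the nesting counter is an assumption of the model.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_LOG_CAP 16
#define TOY_MSG_LEN 64

/* Toy stand-in for a message buffer plus a single console (hypothetical). */
struct toy_log {
	char msgs[TOY_LOG_CAP][TOY_MSG_LEN];
	int stored;            /* messages written to the buffer */
	int flushed;           /* messages already sent to the console */
	int emergency_nesting; /* > 0 while inside an emergency section */
};

/* Send every message that was stored but not yet printed. */
static void toy_flush(struct toy_log *tl)
{
	while (tl->flushed < tl->stored)
		fputs(tl->msgs[tl->flushed++], stdout);
}

static void toy_emergency_enter(struct toy_log *tl)
{
	tl->emergency_nesting++;
}

/* Leaving the outermost section triggers the flush. */
static void toy_emergency_exit(struct toy_log *tl)
{
	assert(tl->emergency_nesting > 0);
	if (--tl->emergency_nesting == 0)
		toy_flush(tl);
}

/* Always store; print right away only outside an emergency section. */
static void toy_printk(struct toy_log *tl, const char *msg)
{
	if (tl->stored == TOY_LOG_CAP)
		return; /* toy model: drop the message when the buffer is full */
	snprintf(tl->msgs[tl->stored++], TOY_MSG_LEN, "%s", msg);
	if (tl->emergency_nesting == 0)
		toy_flush(tl);
}
```

With a zero-initialized `struct toy_log`, two `toy_printk()` calls made between `toy_emergency_enter()` and `toy_emergency_exit()` leave `flushed` at 0 until the exit call, which then prints both messages in order.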
  

Comments

Petr Mladek March 1, 2024, 3:13 p.m. UTC | #1
On Sun 2024-02-18 20:03:25, John Ogness wrote:
> Mark an emergency section within print_other_cpu_stall(), where
> RCU stall information is printed. In this section, the CPU will
> not perform console output for the printk() calls. Instead, a
> flushing of the console output is triggered when exiting the
> emergency section.
>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Reviewed-by: Petr Mladek <pmladek@suse.com>

I was just curious about one thing, but it seems to work well.

print_other_cpu_stall() prints backtraces on other CPUs via NMI.
The other CPUs would not see the emergency context. They would
call defer_console_output() because they are in NMI. As a result:

  + Legacy consoles might be flushed on other CPUs even before
    nbcon_cpu_emergency_exit() gets called.

  + nbcon consoles might still be flushed by the printk kthread
    until all messages get flushed directly by nbcon_cpu_emergency_exit().

As I wrote, the behavior is correct. It was just not obvious to me.


> --- a/kernel/rcu/tree_stall.h
> +++ b/kernel/rcu/tree_stall.h
> @@ -9,6 +9,7 @@
>  
>  #include <linux/kvm_para.h>
>  #include <linux/rcu_notifier.h>
> +#include <linux/console.h>
>  
>  //////////////////////////////////////////////////////////////////////////////
>  //
> @@ -604,6 +605,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
>  	if (rcu_stall_is_suppressed())
>  		return;
>  
> +	nbcon_cpu_emergency_enter();
> +
>  	/*
>  	 * OK, time to rat on our buddy...
>  	 * See Documentation/RCU/stallwarn.rst for info on how to debug
> @@ -658,6 +661,8 @@ static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
>  	panic_on_rcu_stall();
>  
>  	rcu_force_quiescent_state();  /* Kick them all. */
> +
> +	nbcon_cpu_emergency_exit();
>  }
>  
>  static void print_cpu_stall(unsigned long gps)

Best Regards,
Petr
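
The point raised in the review, that the emergency state belongs to the CPU that entered the section and is not seen by the CPUs printing backtraces from NMI, can be sketched as a per-CPU decision. This is a hedged userspace model, not kernel code: `struct toy_cpu`, `enum toy_path` and the `toy_*` functions are hypothetical, and the order of the two checks is an assumption of the sketch.

```c
#include <assert.h>

/* Hypothetical per-CPU state for the toy model. */
struct toy_cpu {
	int emergency_nesting; /* set only on the CPU that entered the section */
	int in_nmi;            /* nonzero while this CPU handles an NMI */
};

/* How a message printed on a given CPU would reach the consoles. */
enum toy_path {
	TOY_DIRECT,        /* printed by the calling CPU right away */
	TOY_FLUSH_AT_EXIT, /* stored now, flushed when the section is left */
	TOY_DEFERRED       /* stored now, printing handed to another context */
};

static void toy_cpu_emergency_enter(struct toy_cpu *cpu)
{
	cpu->emergency_nesting++;
}

static void toy_cpu_emergency_exit(struct toy_cpu *cpu)
{
	assert(cpu->emergency_nesting > 0);
	cpu->emergency_nesting--;
}

/* Only the state of the printing CPU itself is consulted. */
static enum toy_path toy_printk_path(const struct toy_cpu *cpu)
{
	if (cpu->emergency_nesting > 0)
		return TOY_FLUSH_AT_EXIT; /* assumption: checked before NMI */
	if (cpu->in_nmi)
		return TOY_DEFERRED;
	return TOY_DIRECT;
}
```

If CPU 0 enters the section and CPU 1 prints a backtrace from NMI, `toy_printk_path()` returns `TOY_FLUSH_AT_EXIT` for CPU 0 and `TOY_DEFERRED` for CPU 1, which mirrors why consoles may be flushed elsewhere before CPU 0 leaves the section.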
  

Patch

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index ac8e86babe44..efb2be8939a2 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -9,6 +9,7 @@ 
 
 #include <linux/kvm_para.h>
 #include <linux/rcu_notifier.h>
+#include <linux/console.h>
 
 //////////////////////////////////////////////////////////////////////////////
 //
@@ -604,6 +605,8 @@  static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	if (rcu_stall_is_suppressed())
 		return;
 
+	nbcon_cpu_emergency_enter();
+
 	/*
 	 * OK, time to rat on our buddy...
 	 * See Documentation/RCU/stallwarn.rst for info on how to debug
@@ -658,6 +661,8 @@  static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
 	panic_on_rcu_stall();
 
 	rcu_force_quiescent_state();  /* Kick them all. */
+
+	nbcon_cpu_emergency_exit();
 }
 
 static void print_cpu_stall(unsigned long gps)