[printk,v1,16/18] kernel/panic: Add atomic write enforcement to warn/panic

Message ID 20230302195618.156940-17-john.ogness@linutronix.de
State New
Headers
Series threaded/atomic console support |

Commit Message

John Ogness March 2, 2023, 7:56 p.m. UTC
  From: Thomas Gleixner <tglx@linutronix.de>

Invoke the atomic write enforcement functions for warn/panic to
ensure that the information gets out to the consoles.

For the panic case, add explicit intermediate atomic flush calls to
ensure immediate flushing at important points. Otherwise the atomic
flushing only occurs when dropping out of the elevated priority,
which for panic may never happen.

It is important to note that if there are any legacy consoles
registered, they will be attempting to directly print from the
printk-caller context, which may jeopardize the reliability of the
atomic consoles. Optimally there should be no legacy consoles
registered.

Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
---
 kernel/panic.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
  

Comments

Petr Mladek April 13, 2023, 10:08 a.m. UTC | #1
On Thu 2023-03-02 21:02:16, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Invoke the atomic write enforcement functions for warn/panic to
> ensure that the information gets out to the consoles.
> 
> For the panic case, add explicit intermediate atomic flush calls to
> ensure immediate flushing at important points. Otherwise the atomic
> flushing only occurs when dropping out of the elevated priority,
> which for panic may never happen.
> 
> It is important to note that if there are any legacy consoles
> registered, they will be attempting to directly print from the
> printk-caller context, which may jeopardize the reliability of the
> atomic consoles. Optimally there should be no legacy consoles
> registered.
> 
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...)
>  	if (_crash_kexec_post_notifiers)
>  		__crash_kexec(NULL);
>  
> +	cons_atomic_flush(NULL, true);

Do we need to explicitly flush the messages here?

cons_atomic_flush() is called also from vprintk_emit(). And there are
many messages printed with the PANIC priority above.

This makes an assumption that either printk() in PANIC context
does not try to show the messages immediately or that this
explicit console_atomic_flush() tries harder. I think
that both assumptions are wrong.

> +
>  	console_unblank();
>  
>  	/*
> @@ -353,6 +358,7 @@ void panic(const char *fmt, ...)
>  		 * We can't use the "normal" timers since we just panicked.
>  		 */
>  		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
> +		cons_atomic_flush(NULL, true);

Same here.
>  
>  		for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
>  			touch_nmi_watchdog();
> @@ -371,6 +377,7 @@ void panic(const char *fmt, ...)
>  		 */
>  		if (panic_reboot_mode != REBOOT_UNDEFINED)
>  			reboot_mode = panic_reboot_mode;
> +		cons_atomic_flush(NULL, true);

And here.

>  		emergency_restart();
>  	}
>  #ifdef __sparc__
> @@ -383,12 +390,16 @@ void panic(const char *fmt, ...)
>  	}
>  #endif
>  #if defined(CONFIG_S390)
> +	cons_atomic_flush(NULL, true);

And here.

>  	disabled_wait();
>  #endif
>  	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
>  
>  	/* Do not scroll important messages printed above */
>  	suppress_printk = 1;
> +
> +	cons_atomic_exit(CONS_PRIO_PANIC, prev_prio);

On the contrary, I would explicitly call cons_atomic_flush(NULL, false)
here instead of hiding it in cons_atomic_exit().

It would make it clear that this is the POINT where panic() tries
harder to get the messages out on NOBKL consoles.

> +
>  	local_irq_enable();
>  	for (i = 0; ; i += PANIC_TIMER_STEP) {
>  		touch_softlockup_watchdog();
> @@ -599,6 +610,10 @@ struct warn_args {
>  void __warn(const char *file, int line, void *caller, unsigned taint,
>  	    struct pt_regs *regs, struct warn_args *args)
>  {
> +	enum cons_prio prev_prio;
> +
> +	prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY);
> +
>  	disable_trace_on_warning();
>  
>  	if (file)
> @@ -630,6 +645,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>  
>  	/* Just a warning, don't kill lockdep. */
>  	add_taint(taint, LOCKDEP_STILL_OK);
> +
> +	cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio);
>  }

I would more this into separate patch and keep this one only for the
PANIC handling.

Also I think that we want to set the EMERGENCY prio also in oops_enter()?

Best Regards,
Petr
  
John Ogness April 13, 2023, 12:13 p.m. UTC | #2
On 2023-04-13, Petr Mladek <pmladek@suse.com> wrote:
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...)
>>  	if (_crash_kexec_post_notifiers)
>>  		__crash_kexec(NULL);
>>  
>> +	cons_atomic_flush(NULL, true);
>
> Do we need to explicitly flush the messages here?

This is where the atomic printing actually starts (after the full dump
has been inserted into the ringbuffer).

> cons_atomic_flush() is called also from vprintk_emit(). And there are
> many messages printed with the PANIC priority above.

vprintk_emit() does not print in this case. From cons_atomic_flush():

        /*
         * When in an elevated priority, the printk() calls are not
         * individually flushed. This is to allow the full output to
         * be dumped to the ringbuffer before starting with printing
         * the backlog.
         */
        if (cpu_state->prio > NBCON_PRIO_NORMAL && printk_caller_wctxt)
                return;

> This makes an assumption that either printk() in PANIC context
> does not try to show the messages immediately or that this
> explicit console_atomic_flush() tries harder. I think
> that both assumptions are wrong.

Both assumptions are correct, because until this point there has been no
effort to print.

>> @@ -353,6 +358,7 @@ void panic(const char *fmt, ...)
>>  		 * We can't use the "normal" timers since we just panicked.
>>  		 */
>>  		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
>> +		cons_atomic_flush(NULL, true);
>
> Same here.

This flush is just to make sure the rebooting message is
output. For nbcon consoles printk() calls are never synchronous except
for during early boot (before kthreads are ready).

The same goes for the other cons_atomic_flush() calls in this function.

>>  	disabled_wait();
>>  #endif
>>  	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
>>  
>>  	/* Do not scroll important messages printed above */
>>  	suppress_printk = 1;
>> +
>> +	cons_atomic_exit(CONS_PRIO_PANIC, prev_prio);
>
> On the contrary, I would explicitly call cons_atomic_flush(NULL, false)
> here instead of hiding it in cons_atomic_exit().

It is not hiding there. That is the semantic. After entering an atomic
block all printk's are only writing to the ringbuffer. On exiting the
atomic block the ringbuffer is flushed via atomic printing.

Exiting CONS_PRIO_PANIC has a special condition that it first tries to
safely flush all consoles, then will try the unsafe variant for consoles
that were not flushed.

> Also I think that we want to set the EMERGENCY prio also in
> oops_enter()?

Agreed.

John
  
Petr Mladek April 14, 2023, 10:10 a.m. UTC | #3
On Thu 2023-04-13 14:19:13, John Ogness wrote:
> On 2023-04-13, Petr Mladek <pmladek@suse.com> wrote:
> >> --- a/kernel/panic.c
> >> +++ b/kernel/panic.c
> >> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...)
> >>  	if (_crash_kexec_post_notifiers)
> >>  		__crash_kexec(NULL);
> >>  
> >> +	cons_atomic_flush(NULL, true);
> >
> > Do we need to explicitly flush the messages here?
> 
> This is where the atomic printing actually starts (after the full dump
> has been inserted into the ringbuffer).
> 
> > cons_atomic_flush() is called also from vprintk_emit(). And there are
> > many messages printed with the PANIC priority above.
> 
> vprintk_emit() does not print in this case. From cons_atomic_flush():
> 
>         /*
>          * When in an elevated priority, the printk() calls are not
>          * individually flushed. This is to allow the full output to
>          * be dumped to the ringbuffer before starting with printing
>          * the backlog.
>          */
>         if (cpu_state->prio > NBCON_PRIO_NORMAL && printk_caller_wctxt)
>                 return;

OK, what is the motivation for this behavior, please?
Does it has any advantages?

> 
> > This makes an assumption that either printk() in PANIC context
> > does not try to show the messages immediately or that this
> > explicit console_atomic_flush() tries harder. I think
> > that both assumptions are wrong.
> 
> Both assumptions are correct, because until this point there has been no
> effort to print.

Honestly, this makes me nervous. It means that panic() messages will
not reach the console unless they are explicitly flushed.

First, it is error-prone because it requires calling
console_atomic_flush() in all relevant code paths on the right
locations.

Second, it expects that panic() code could never fail between
the explicit console_atomic_flush() calls. If it failed, it might
be pretty useful to see the last printed message.

Third, messages might get lost when there are too many. And it is
realistic. For example, see panic_print_sys_info() it might add
quite long reports.

I would really prefer to flush atomic consoles with every printk()
unless there is a really good reason not to do it.

Best Regards,
Petr
  

Patch

diff --git a/kernel/panic.c b/kernel/panic.c
index da323209f583..db9834fbdf26 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -209,6 +209,7 @@  static void panic_print_sys_info(bool console_flush)
  */
 void panic(const char *fmt, ...)
 {
+	enum cons_prio prev_prio;
 	static char buf[1024];
 	va_list args;
 	long i, i_next = 0, len;
@@ -256,6 +257,8 @@  void panic(const char *fmt, ...)
 	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
 		panic_smp_self_stop();
 
+	prev_prio = cons_atomic_enter(CONS_PRIO_PANIC);
+
 	console_verbose();
 	bust_spinlocks(1);
 	va_start(args, fmt);
@@ -329,6 +332,8 @@  void panic(const char *fmt, ...)
 	if (_crash_kexec_post_notifiers)
 		__crash_kexec(NULL);
 
+	cons_atomic_flush(NULL, true);
+
 	console_unblank();
 
 	/*
@@ -353,6 +358,7 @@  void panic(const char *fmt, ...)
 		 * We can't use the "normal" timers since we just panicked.
 		 */
 		pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
+		cons_atomic_flush(NULL, true);
 
 		for (i = 0; i < panic_timeout * 1000; i += PANIC_TIMER_STEP) {
 			touch_nmi_watchdog();
@@ -371,6 +377,7 @@  void panic(const char *fmt, ...)
 		 */
 		if (panic_reboot_mode != REBOOT_UNDEFINED)
 			reboot_mode = panic_reboot_mode;
+		cons_atomic_flush(NULL, true);
 		emergency_restart();
 	}
 #ifdef __sparc__
@@ -383,12 +390,16 @@  void panic(const char *fmt, ...)
 	}
 #endif
 #if defined(CONFIG_S390)
+	cons_atomic_flush(NULL, true);
 	disabled_wait();
 #endif
 	pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
 
 	/* Do not scroll important messages printed above */
 	suppress_printk = 1;
+
+	cons_atomic_exit(CONS_PRIO_PANIC, prev_prio);
+
 	local_irq_enable();
 	for (i = 0; ; i += PANIC_TIMER_STEP) {
 		touch_softlockup_watchdog();
@@ -599,6 +610,10 @@  struct warn_args {
 void __warn(const char *file, int line, void *caller, unsigned taint,
 	    struct pt_regs *regs, struct warn_args *args)
 {
+	enum cons_prio prev_prio;
+
+	prev_prio = cons_atomic_enter(CONS_PRIO_EMERGENCY);
+
 	disable_trace_on_warning();
 
 	if (file)
@@ -630,6 +645,8 @@  void __warn(const char *file, int line, void *caller, unsigned taint,
 
 	/* Just a warning, don't kill lockdep. */
 	add_taint(taint, LOCKDEP_STILL_OK);
+
+	cons_atomic_exit(CONS_PRIO_EMERGENCY, prev_prio);
 }
 
 #ifndef __WARN_FLAGS