[printk,v2,24/26] panic: Mark emergency section in oops

Message ID 20240218185726.1994771-25-john.ogness@linutronix.de
State New
Headers
Series wire up write_atomic() printing |

Commit Message

John Ogness Feb. 18, 2024, 6:57 p.m. UTC
  Mark an emergency section beginning with oops_enter() until the
end of oops_exit(). In this section, the CPU will not perform
console output for the printk() calls. Instead, a flushing of the
console output is triggered when exiting the emergency section.

The very end of oops_exit() performs a kmsg_dump(). This is not
included in the emergency section because it is another
flushing mechanism that should occur after the consoles have
been triggered to flush.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/panic.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Petr Mladek March 1, 2024, 2:55 p.m. UTC | #1
On Sun 2024-02-18 20:03:24, John Ogness wrote:
> Mark an emergency section beginning with oops_enter() until the
> end of oops_exit(). In this section, the CPU will not perform
> console output for the printk() calls. Instead, a flushing of the
> console output is triggered when exiting the emergency section.
> 
> The very end of oops_exit() performs a kmsg_dump(). This is not
> included in the emergency section because it is another
> flushing mechanism that should occur after the consoles have
> been triggered to flush.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>  kernel/panic.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index d30d261f9246..9fa44bc38f46 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -634,6 +634,7 @@ bool oops_may_print(void)
>   */
>  void oops_enter(void)
>  {
> +	nbcon_cpu_emergency_enter();
>  	tracing_off();
>  	/* can't trust the integrity of the kernel anymore: */
>  	debug_locks_off();
> @@ -656,6 +657,7 @@ void oops_exit(void)
>  {
>  	do_oops_enter_exit();

The comment above oops_enter() function says:

/*
 * Called when the architecture enters its oops handler, before it prints
 * anything.  If this is the first CPU to oops, and it's oopsing the first
 * time then let it proceed.
 *
 * This is all enabled by the pause_on_oops kernel boot option.  We do all
 * this to ensure that oopses don't scroll off the screen.  It has the
 * side-effect of preventing later-oopsing CPUs from mucking up the display,
 * too.
 *
 * It turns out that the CPU which is allowed to print ends up pausing for
 * the right duration, whereas all the other CPUs pause for twice as long:
 * once in oops_enter(), once in oops_exit().
 */

and indeed do_oops_enter_exit(); does the waiting.

IMHO, we should enter() the emergency context after waiting in
oops_enter(). And exit() it before waiting in oops_exit(). Aka


 void oops_enter(void)
 {
 	tracing_off();
 	/* can't trust the integrity of the kernel anymore: */
 	debug_locks_off();
 	do_oops_enter_exit();
+ 	nbcon_cpu_emergency_enter();
 
 	if (sysctl_oops_all_cpu_backtrace)
 		trigger_all_cpu_backtrace();
 }

 void oops_exit(void)
 {
+	nbcon_cpu_emergency_exit();
 	do_oops_enter_exit();
 	print_oops_end_marker();
 	kmsg_dump(KMSG_DUMP_OOPS);
 }


>  	print_oops_end_marker();
> +	nbcon_cpu_emergency_exit();
>  	kmsg_dump(KMSG_DUMP_OOPS);
>  }

Otherwise, it looks good.

Best Regards,
Petr
  

Patch

diff --git a/kernel/panic.c b/kernel/panic.c
index d30d261f9246..9fa44bc38f46 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -634,6 +634,7 @@  bool oops_may_print(void)
  */
 void oops_enter(void)
 {
+	nbcon_cpu_emergency_enter();
 	tracing_off();
 	/* can't trust the integrity of the kernel anymore: */
 	debug_locks_off();
@@ -656,6 +657,7 @@  void oops_exit(void)
 {
 	do_oops_enter_exit();
 	print_oops_end_marker();
+	nbcon_cpu_emergency_exit();
 	kmsg_dump(KMSG_DUMP_OOPS);
 }