[printk,v3,13/14] printk: Avoid non-panic CPUs writing to ringbuffer

Message ID 20231214214201.499426-14-john.ogness@linutronix.de
State New
Headers
Series fix console flushing |

Commit Message

John Ogness Dec. 14, 2023, 9:42 p.m. UTC
  Commit 13fb0f74d702 ("printk: Avoid livelock with heavy printk
during panic") introduced a mechanism to silence non-panic CPUs
if too many messages are being dropped. Aside from trying to
workaround the livelock bugs of legacy consoles, it was also
intended to avoid losing panic messages. However, if non-panic
CPUs are writing to the ringbuffer, then reacting to dropped
messages is too late.

To avoid losing panic CPU messages, silence non-panic CPUs
immediately on panic.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 26 ++++++--------------------
 1 file changed, 6 insertions(+), 20 deletions(-)
  

Comments

Petr Mladek Feb. 2, 2024, 9:26 a.m. UTC | #1
On Thu 2023-12-14 22:48:00, John Ogness wrote:
> Commit 13fb0f74d702 ("printk: Avoid livelock with heavy printk
> during panic") introduced a mechanism to silence non-panic CPUs
> if too many messages are being dropped. Aside from trying to
> workaround the livelock bugs of legacy consoles, it was also
> intended to avoid losing panic messages. However, if non-panic
> CPUs are writing to the ringbuffer, then reacting to dropped
> messages is too late.
> 
> To avoid losing panic CPU messages, silence non-panic CPUs
> immediately on panic.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

I am slightly nervous about this change because it looks
too agresive ;-)

But it makes perfect sense. And it nicely complements the 10th
patch, see
https://lore.kernel.org/r/20231214214201.499426-11-john.ogness@linutronix.de
The 10th patch allows to skip messages in reserved state. It might
cause skipping random messages from other CPUs. So it really
looks better to skip/ignore them completely.

It would be nice to mention the relation to the 10th patch
in the commit message. Something like:

<proposal>
Another motivation is that non-finalized messages already might be
skipped in panic(). By other words, random messages from non-panic
CPUs might already get lost. It is better to ignore all to avoid
confusion.
</proposal>

With the updated commit message:

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
  

Patch

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index cb99c854a648..1685a71f3f71 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -462,12 +462,6 @@  static int console_msg_format = MSG_FORMAT_DEFAULT;
 static DEFINE_MUTEX(syslog_lock);
 
 #ifdef CONFIG_PRINTK
-/*
- * During panic, heavy printk by other CPUs can delay the
- * panic and risk deadlock on console resources.
- */
-static int __read_mostly suppress_panic_printk;
-
 DECLARE_WAIT_QUEUE_HEAD(log_wait);
 /* All 3 protected by @syslog_lock. */
 /* the next printk record to read by syslog(READ) or /proc/kmsg */
@@ -2322,7 +2316,12 @@  asmlinkage int vprintk_emit(int facility, int level,
 	if (unlikely(suppress_printk))
 		return 0;
 
-	if (unlikely(suppress_panic_printk) && other_cpu_in_panic())
+	/*
+	 * The messages on the panic CPU are the most important. If
+	 * non-panic CPUs are generating any messages, they will be
+	 * silently dropped.
+	 */
+	if (other_cpu_in_panic())
 		return 0;
 
 	if (level == LOGLEVEL_SCHED) {
@@ -2799,8 +2798,6 @@  void console_prepend_dropped(struct printk_message *pmsg, unsigned long dropped)
 bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
 			     bool is_extended, bool may_suppress)
 {
-	static int panic_console_dropped;
-
 	struct printk_buffers *pbufs = pmsg->pbufs;
 	const size_t scratchbuf_sz = sizeof(pbufs->scratchbuf);
 	const size_t outbuf_sz = sizeof(pbufs->outbuf);
@@ -2828,17 +2825,6 @@  bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
 	pmsg->seq = r.info->seq;
 	pmsg->dropped = r.info->seq - seq;
 
-	/*
-	 * Check for dropped messages in panic here so that printk
-	 * suppression can occur as early as possible if necessary.
-	 */
-	if (pmsg->dropped &&
-	    panic_in_progress() &&
-	    panic_console_dropped++ > 10) {
-		suppress_panic_printk = 1;
-		pr_warn_once("Too many dropped messages. Suppress messages on non-panic CPUs to prevent livelock.\n");
-	}
-
 	/* Skip record that has level above the console loglevel. */
 	if (may_suppress && suppress_message_printing(r.info->level))
 		goto out;