[printk,v3,10/14] printk: ringbuffer: Skip non-finalized records in panic

Message ID 20231214214201.499426-11-john.ogness@linutronix.de
State New
Headers
Series fix console flushing |

Commit Message

John Ogness Dec. 14, 2023, 9:41 p.m. UTC
  Normally a reader will stop once reaching a non-finalized
record. However, when a panic happens, writers from other CPUs
(or an interrupted context on the panic CPU) may have been
writing a record and were unable to finalize it. The panic CPU
will reserve/commit/finalize its panic records, but these will
be located after the non-finalized records. This results in
panic() not flushing the panic messages.

Extend _prb_read_valid() to skip over non-finalized records if
on the panic CPU.

Fixes: 896fbe20b4e2 ("printk: use the lockless ringbuffer")
Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk_ringbuffer.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)
  

Comments

Petr Mladek Feb. 1, 2024, 4:56 p.m. UTC | #1
On Thu 2023-12-14 22:47:57, John Ogness wrote:
> Normally a reader will stop once reaching a non-finalized
> record. However, when a panic happens, writers from other CPUs
> (or an interrupted context on the panic CPU) may have been
> writing a record and were unable to finalize it. The panic CPU
> will reserve/commit/finalize its panic records, but these will
> be located after the non-finalized records. This results in
> panic() not flushing the panic messages.
> 
> Extend _prb_read_valid() to skip over non-finalized records if
> on the panic CPU.
> 
> Fixes: 896fbe20b4e2 ("printk: use the lockless ringbuffer")
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Makes sense.

The most interesting information from other CPUs are backtraces. But
they should be finalized before reaching panic().
trigger_all_cpu_backtrace() waits until all CPUs print the backtrace.

In fact, it might be even useful because other CPUs might just spam
the ring buffer with mess caused by the panic() situation.
Messages which caused the panic() situation should be in the ring
buffer before panic() gets called.

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
  

Patch

diff --git a/kernel/printk/printk_ringbuffer.c b/kernel/printk/printk_ringbuffer.c
index b7748d7c44c1..d6ed33683b8b 100644
--- a/kernel/printk/printk_ringbuffer.c
+++ b/kernel/printk/printk_ringbuffer.c
@@ -2107,6 +2107,10 @@  u64 prb_next_reserve_seq(struct printk_ringbuffer *rb)
  *
  * On failure @seq is updated to a record that is not yet available to the
  * reader, but it will be the next record available to the reader.
+ *
+ * Note: When the current CPU is in panic, this function will skip over any
+ *       non-existent/non-finalized records in order to allow the panic CPU
+ *       to print any and all records that have been finalized.
  */
 static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			    struct printk_record *r, unsigned int *line_count)
@@ -2129,8 +2133,28 @@  static bool _prb_read_valid(struct printk_ringbuffer *rb, u64 *seq,
 			(*seq)++;
 
 		} else {
-			/* Non-existent/non-finalized record. Must stop. */
-			return false;
+			/*
+			 * Non-existent/non-finalized record. Must stop.
+			 *
+			 * For panic situations it cannot be expected that
+			 * non-finalized records will become finalized. But
+			 * there may be other finalized records beyond that
+			 * need to be printed for a panic situation. If this
+			 * is the panic CPU, skip this
+			 * non-existent/non-finalized record unless it is
+			 * at or beyond the head, in which case it is not
+			 * possible to continue.
+			 *
+			 * Note that new messages printed on panic CPU are
+			 * finalized when we are here. The only exception
+			 * might be the last message without trailing newline.
+			 * But it would have the sequence number returned
+			 * by "prb_next_reserve_seq() - 1".
+			 */
+			if (this_cpu_in_panic() && ((*seq + 1) < prb_next_reserve_seq(rb)))
+				(*seq)++;
+			else
+				return false;
 		}
 	}