[printk,v3,4/7] printk: Do not take console lock for console_flush_on_panic()

Message ID 20230717194607.145135-5-john.ogness@linutronix.de
State New
Headers
Series various cleanups |

Commit Message

John Ogness July 17, 2023, 7:46 p.m. UTC
  Currently console_flush_on_panic() will attempt to acquire the
console lock when flushing the buffer on panic. If it fails to
acquire the lock, it continues anyway because this is the last
chance to get any pending records printed.

The reason why the console lock was attempted at all was to
prevent any other CPUs from acquiring the console lock for
printing while the panic CPU was printing. But as of the
previous commit, non-panic CPUs will no longer attempt to
acquire the console lock in a panic situation. Therefore it is
no longer strictly necessary for a panic CPU to acquire the
console lock.

Avoiding taking the console lock when flushing in panic has
the additional benefit of avoiding possible deadlocks due to
semaphore usage in NMI context (semaphores are not NMI-safe)
and avoiding possible deadlocks if another CPU accesses the
semaphore and is stopped while holding one of the semaphore's
internal spinlocks.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/printk/printk.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)
  

Comments

Sergey Senozhatsky July 18, 2023, 3:18 a.m. UTC | #1
On (23/07/17 21:52), John Ogness wrote:
> 
> Currently console_flush_on_panic() will attempt to acquire the
> console lock when flushing the buffer on panic. If it fails to
> acquire the lock, it continues anyway because this is the last
> chance to get any pending records printed.
> 
> The reason why the console lock was attempted at all was to
> prevent any other CPUs from acquiring the console lock for
> printing while the panic CPU was printing. But as of the
> previous commit, non-panic CPUs will no longer attempt to
> acquire the console lock in a panic situation. Therefore it is
> no longer strictly necessary for a panic CPU to acquire the
> console lock.
> 
> Avoiding taking the console lock when flushing in panic has
> the additional benefit of avoiding possible deadlocks due to
> semaphore usage in NMI context (semaphores are not NMI-safe)
> and avoiding possible deadlocks if another CPU accesses the
> semaphore and is stopped while holding one of the semaphore's
> internal spinlocks.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
  
Petr Mladek July 18, 2023, 10:28 a.m. UTC | #2
On Mon 2023-07-17 21:52:04, John Ogness wrote:
> Currently console_flush_on_panic() will attempt to acquire the
> console lock when flushing the buffer on panic. If it fails to
> acquire the lock, it continues anyway because this is the last
> chance to get any pending records printed.
> 
> The reason why the console lock was attempted at all was to
> prevent any other CPUs from acquiring the console lock for
> printing while the panic CPU was printing. But as of the
> previous commit, non-panic CPUs will no longer attempt to
> acquire the console lock in a panic situation. Therefore it is
> no longer strictly necessary for a panic CPU to acquire the
> console lock.
> 
> Avoiding taking the console lock when flushing in panic has
> the additional benefit of avoiding possible deadlocks due to
> semaphore usage in NMI context (semaphores are not NMI-safe)
> and avoiding possible deadlocks if another CPU accesses the
> semaphore and is stopped while holding one of the semaphore's
> internal spinlocks.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Reviewed-by: Petr Mladek <pmladek@suse.com>

Best Regards,
Petr
  

Patch

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7219991885e6..51445e8ea730 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3118,14 +3118,24 @@  void console_unblank(void)
  */
 void console_flush_on_panic(enum con_flush_mode mode)
 {
+	bool handover;
+	u64 next_seq;
+
 	/*
-	 * If someone else is holding the console lock, trylock will fail
-	 * and may_schedule may be set.  Ignore and proceed to unlock so
-	 * that messages are flushed out.  As this can be called from any
-	 * context and we don't want to get preempted while flushing,
-	 * ensure may_schedule is cleared.
+	 * Ignore the console lock and flush out the messages. Attempting a
+	 * trylock would not be useful because:
+	 *
+	 *   - if it is contended, it must be ignored anyway
+	 *   - console_lock() and console_trylock() block and fail
+	 *     respectively in panic for non-panic CPUs
+	 *   - semaphores are not NMI-safe
+	 */
+
+	/*
+	 * If another context is holding the console lock,
+	 * @console_may_schedule might be set. Clear it so that
+	 * this context does not call cond_resched() while flushing.
 	 */
-	console_trylock();
 	console_may_schedule = 0;
 
 	if (mode == CONSOLE_REPLAY_ALL) {
@@ -3138,15 +3148,15 @@  void console_flush_on_panic(enum con_flush_mode mode)
 		cookie = console_srcu_read_lock();
 		for_each_console_srcu(c) {
 			/*
-			 * If the above console_trylock() failed, this is an
-			 * unsynchronized assignment. But in that case, the
+			 * This is an unsynchronized assignment, but the
 			 * kernel is in "hope and pray" mode anyway.
 			 */
 			c->seq = seq;
 		}
 		console_srcu_read_unlock(cookie);
 	}
-	console_unlock();
+
+	console_flush_all(false, &next_seq, &handover);
 }
 
 /*