[printk,v2,4/5] printk: Add per-console suspended state

Message ID 20230710134524.25232-5-john.ogness@linutronix.de
State New
Headers
Series various cleanups |

Commit Message

John Ogness July 10, 2023, 1:45 p.m. UTC
  Currently the global @console_suspended is used to determine if
consoles are in a suspended state. Its primary purpose is to allow
usage of the console_lock when suspended without causing console
printing. It is synchronized by the console_lock.

Rather than relying on the console_lock to determine suspended
state, make it an official per-console state that is set within
console->flags. This allows the state to be queried via SRCU.

Remove @console_suspended. Console printing will still be avoided
when suspended because console_is_usable() returns false when
the new suspended flag is set for that console.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 include/linux/console.h |  3 ++
 kernel/printk/printk.c  | 74 ++++++++++++++++++++++++-----------------
 2 files changed, 47 insertions(+), 30 deletions(-)
  

Comments

Sergey Senozhatsky July 11, 2023, 3:08 p.m. UTC | #1
On (23/07/10 15:51), John Ogness wrote:
[..]
> @@ -2623,8 +2647,6 @@ void console_lock(void)
>  		msleep(1000);
>  
>  	down_console_sem();
> -	if (console_suspended)
> -		return;
>  	console_locked = 1;
>  	console_may_schedule = 1;
>  }
> @@ -2645,10 +2667,6 @@ int console_trylock(void)
>  		return 0;
>  	if (down_trylock_console_sem())
>  		return 0;
> -	if (console_suspended) {
> -		up_console_sem();
> -		return 0;
> -	}
>  	console_locked = 1;
>  	console_may_schedule = 0;
>  	return 1;

Interesting. console_locked previously would not be set if
console is suspended, but now it's always set, which in theory
changes the way WARN_CONSOLE_UNLOCKED() macro works in some
cases?
  
John Ogness July 11, 2023, 3:23 p.m. UTC | #2
On 2023-07-12, Sergey Senozhatsky <senozhatsky@chromium.org> wrote:
>> @@ -2623,8 +2647,6 @@ void console_lock(void)
>>  		msleep(1000);
>>  
>>  	down_console_sem();
>> -	if (console_suspended)
>> -		return;
>>  	console_locked = 1;
>>  	console_may_schedule = 1;
>>  }
>> @@ -2645,10 +2667,6 @@ int console_trylock(void)
>>  		return 0;
>>  	if (down_trylock_console_sem())
>>  		return 0;
>> -	if (console_suspended) {
>> -		up_console_sem();
>> -		return 0;
>> -	}
>>  	console_locked = 1;
>>  	console_may_schedule = 0;
>>  	return 1;
>
> Interesting. console_locked previously would not be set if
> console is suspended, but now it's always set, which in theory
> changes the way WARN_CONSOLE_UNLOCKED() macro works in some
> cases?

Yes, Petr mentioned [0] this during the v1 review. His direct comment:

  "console_locked" seems to be used only in WARN_CONSOLE_UNLOCKED().
  I could imagine a corner case where, for example, "vt" code does
  not print the warning because it works as it works. But it does
  not make much sense. IMHO, such a code should get fixed. And it
  is just a warning after all.

And his final comment in that thread:

  I believe that @console_suspended is not longer needed.
  Let's replace it with the per-console flag and do not worry
  about races.

John Ogness

[0] https://lore.kernel.org/lkml/ZAieQtcs7YEuCCDa@alley
  
Sergey Senozhatsky July 11, 2023, 3:26 p.m. UTC | #3
On (23/07/11 17:29), John Ogness wrote:
[..]
> > Interesting. console_locked previously would not be set if
> > console is suspended, but now it's always set, which in theory
> > changes the way WARN_CONSOLE_UNLOCKED() macro works in some
> > cases?
> 
> Yes, Petr mentioned [0] this during the v1 review. His direct comment:
> 
>   "console_locked" seems to be used only in WARN_CONSOLE_UNLOCKED().
>   I could imagine a corner case where, for example, "vt" code does
>   not print the warning because it works as it works. But it does
>   not make much sense. IMHO, such a code should get fixed. And it
>   is just a warning after all.
> 
> And his final comment in that thread:
> 
>   I believe that @console_suspended is not longer needed.
>   Let's replace it with the per-console flag and do not worry
>   about races.

Oh, thanks for the pointers!

> [0] https://lore.kernel.org/lkml/ZAieQtcs7YEuCCDa@alley
  
Sergey Senozhatsky July 11, 2023, 3:30 p.m. UTC | #4
On (23/07/10 15:51), John Ogness wrote:
> Currently the global @console_suspended is used to determine if
> consoles are in a suspended state. Its primary purpose is to allow
> usage of the console_lock when suspended without causing console
> printing. It is synchronized by the console_lock.
> 
> Rather than relying on the console_lock to determine suspended
> state, make it an official per-console state that is set within
> console->flags. This allows the state to be queried via SRCU.
> 
> Remove @console_suspended. Console printing will still be avoided
> when suspended because console_is_usable() returns false when
> the new suspended flag is set for that console.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
  
Petr Mladek July 13, 2023, 3:53 p.m. UTC | #5
On Mon 2023-07-10 15:51:23, John Ogness wrote:
> Currently the global @console_suspended is used to determine if
> consoles are in a suspended state. Its primary purpose is to allow
> usage of the console_lock when suspended without causing console
> printing. It is synchronized by the console_lock.
> 
> Rather than relying on the console_lock to determine suspended
> state, make it an official per-console state that is set within
> console->flags. This allows the state to be queried via SRCU.
> 
> Remove @console_suspended. Console printing will still be avoided
> when suspended because console_is_usable() returns false when
> the new suspended flag is set for that console.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

Looks good to me:

Reviewed-by: Petr Mladek <pmladek@suse.com>

I have double checked the history. suspend_console() was added
into v2.6.18-rc1 by the commit ("Add support for suspending and
resuming the whole console subsystem").

The above commit added "secondary_console_sem". It was taken
by acquire_console_sem() instead of the normal "console_sem"
when "console_suspended" was set. It means that the normal
"console_sem" really was not taken.

The "secondary_console_sem" was removed in v2.6.29-rc6 by the commit
("PM: Fix suspend_console and resume_console to use only one
semaphore"). It solved races between code taking "console_sem"
and code "secondary_console_sem". This commit kept the handling of
"console_locked". It was not set when console_suspended was set
even though "console_sem" was actually taken.

IMHO, it was a bug in the commit removing "secondary_console_sem".
But it probably never caused any issues.

Anyway, this patch makes "console_locked" handling sane. And if some
tty code relies on the insane logic then it should get fixed.

Best Regards,
Petr
  

Patch

diff --git a/include/linux/console.h b/include/linux/console.h
index d3195664baa5..7de11c763eb3 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -154,6 +154,8 @@  static inline int con_debug_leave(void)
  *			receiving the printk spam for obvious reasons.
  * @CON_EXTENDED:	The console supports the extended output format of
  *			/dev/kmesg which requires a larger output buffer.
+ * @CON_SUSPENDED:	Indicates if a console is suspended. If true, the
+ *			printing callbacks must not be called.
  */
 enum cons_flags {
 	CON_PRINTBUFFER		= BIT(0),
@@ -163,6 +165,7 @@  enum cons_flags {
 	CON_ANYTIME		= BIT(4),
 	CON_BRL			= BIT(5),
 	CON_EXTENDED		= BIT(6),
+	CON_SUSPENDED		= BIT(7),
 };
 
 /**
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index e7632ed3b6fd..701908d389f4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -86,7 +86,7 @@  EXPORT_SYMBOL(oops_in_progress);
 static DEFINE_MUTEX(console_mutex);
 
 /*
- * console_sem protects updates to console->seq and console_suspended,
+ * console_sem protects updates to console->seq
  * and also provides serialization for console printing.
  */
 static DEFINE_SEMAPHORE(console_sem);
@@ -359,7 +359,7 @@  static bool panic_in_progress(void)
  * paths in the console code where we end up in places I want
  * locked without the console semaphore held).
  */
-static int console_locked, console_suspended;
+static int console_locked;
 
 /*
  *	Array of consoles built from command line options (console=)
@@ -2549,22 +2549,46 @@  MODULE_PARM_DESC(console_no_auto_verbose, "Disable console loglevel raise to hig
  */
 void suspend_console(void)
 {
+	struct console *con;
+
 	if (!console_suspend_enabled)
 		return;
 	pr_info("Suspending console(s) (use no_console_suspend to debug)\n");
 	pr_flush(1000, true);
-	console_lock();
-	console_suspended = 1;
-	up_console_sem();
+
+	console_list_lock();
+	for_each_console(con)
+		console_srcu_write_flags(con, con->flags | CON_SUSPENDED);
+	console_list_unlock();
+
+	/*
+	 * Ensure that all SRCU list walks have completed. All printing
+	 * contexts must be able to see that they are suspended so that it
+	 * is guaranteed that all printing has stopped when this function
+	 * completes.
+	 */
+	synchronize_srcu(&console_srcu);
 }
 
 void resume_console(void)
 {
+	struct console *con;
+
 	if (!console_suspend_enabled)
 		return;
-	down_console_sem();
-	console_suspended = 0;
-	console_unlock();
+
+	console_list_lock();
+	for_each_console(con)
+		console_srcu_write_flags(con, con->flags & ~CON_SUSPENDED);
+	console_list_unlock();
+
+	/*
+	 * Ensure that all SRCU list walks have completed. All printing
+	 * contexts must be able to see they are no longer suspended so
+	 * that they are guaranteed to wake up and resume printing.
+	 */
+	synchronize_srcu(&console_srcu);
+
 	pr_flush(1000, true);
 }
 
@@ -2623,8 +2647,6 @@  void console_lock(void)
 		msleep(1000);
 
 	down_console_sem();
-	if (console_suspended)
-		return;
 	console_locked = 1;
 	console_may_schedule = 1;
 }
@@ -2645,10 +2667,6 @@  int console_trylock(void)
 		return 0;
 	if (down_trylock_console_sem())
 		return 0;
-	if (console_suspended) {
-		up_console_sem();
-		return 0;
-	}
 	console_locked = 1;
 	console_may_schedule = 0;
 	return 1;
@@ -2674,6 +2692,9 @@  static inline bool console_is_usable(struct console *con)
 	if (!(flags & CON_ENABLED))
 		return false;
 
+	if ((flags & CON_SUSPENDED))
+		return false;
+
 	if (!con->write)
 		return false;
 
@@ -2992,11 +3013,6 @@  void console_unlock(void)
 	bool flushed;
 	u64 next_seq;
 
-	if (console_suspended) {
-		up_console_sem();
-		return;
-	}
-
 	/*
 	 * Console drivers are called with interrupts disabled, so
 	 * @console_may_schedule should be cleared before; however, we may
@@ -3702,8 +3718,7 @@  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 
 		/*
 		 * Hold the console_lock to guarantee safe access to
-		 * console->seq and to prevent changes to @console_suspended
-		 * until all consoles have been processed.
+		 * console->seq.
 		 */
 		console_lock();
 
@@ -3711,6 +3726,11 @@  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 		for_each_console_srcu(c) {
 			if (con && con != c)
 				continue;
+			/*
+			 * If consoles are not usable, it cannot be expected
+			 * that they make forward progress, so only increment
+			 * @diff for usable consoles.
+			 */
 			if (!console_is_usable(c))
 				continue;
 			printk_seq = c->seq;
@@ -3719,18 +3739,12 @@  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
 		}
 		console_srcu_read_unlock(cookie);
 
-		/*
-		 * If consoles are suspended, it cannot be expected that they
-		 * make forward progress, so timeout immediately. @diff is
-		 * still used to return a valid flush status.
-		 */
-		if (console_suspended)
-			remaining = 0;
-		else if (diff != last_diff && reset_on_progress)
+		if (diff != last_diff && reset_on_progress)
 			remaining = timeout_ms;
 
 		console_unlock();
 
+		/* Note: @diff is 0 if there are no usable consoles. */
 		if (diff == 0 || remaining == 0)
 			break;
 
@@ -3764,7 +3778,7 @@  static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
  * printer has been seen to make some forward progress.
  *
  * Context: Process context. May sleep while acquiring console lock.
- * Return: true if all enabled printers are caught up.
+ * Return: true if all usable printers are caught up.
  */
 static bool pr_flush(int timeout_ms, bool reset_on_progress)
 {