[printk,v2,06/11] printk: nbcon: Wire up nbcon console atomic flushing

Message ID 20230919230856.661435-7-john.ogness@linutronix.de
State New
Headers
Series wire up nbcon atomic printing |

Commit Message

John Ogness Sept. 19, 2023, 11:08 p.m. UTC
  From: Thomas Gleixner <tglx@linutronix.de>

Perform nbcon console atomic flushing at key call sites:

nbcon_atomic_exit() - When exiting from the outermost atomic
	printing section. If the priority was NBCON_PRIO_PANIC,
	attempt a second flush allowing unsafe hostile
	takeovers.

console_flush_on_panic() - Called from several call sites to
	trigger ringbuffer dumping in an urgent situation.

console_flush_on_panic() - Typically the panic() function will
	take care of atomic flushing the nbcon consoles on
	panic. However, there are several users of
	console_flush_on_panic() outside of panic().

printk_trigger_flush() - Used in urgent situations to trigger a
	dump in an irq_work context. However, the atomic
	flushing part happens in the calling context since an
	alternative context is not required.

Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
---
 kernel/printk/nbcon.c  | 10 ++++++++++
 kernel/printk/printk.c |  6 ++++++
 2 files changed, 16 insertions(+)
  

Comments

Petr Mladek Sept. 22, 2023, 5:41 p.m. UTC | #1
On Wed 2023-09-20 01:14:51, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Perform nbcon console atomic flushing at key call sites:
> 
> nbcon_atomic_exit() - When exiting from the outermost atomic
> 	printing section.

IMHO, it would not help because it all depends
whether nbcon_context_try_acquire() succeeds or now, see below.

> 	If the priority was NBCON_PRIO_PANIC,
> 	attempt a second flush allowing unsafe hostile
> 	takeovers.

I was first scared that this would be called by any printk()
in panic(). But it seems that nbcon_atomic_exit() is called
only one after disabling printk(). It might deserve a comment
where it is supposed to be used. See a proposal below.


> console_flush_on_panic() - Called from several call sites to
> 	trigger ringbuffer dumping in an urgent situation.
> 
> console_flush_on_panic() - Typically the panic() function will

This is a second description of console_flush_of_panic() which
looks like a mistake.

> 	take care of atomic flushing the nbcon consoles on
> 	panic. However, there are several users of
> 	console_flush_on_panic() outside of panic().

The generic panic() seems to use console_flush_on_panic() correctly
at the very end.

Hmm, I see that console_flush_on_panic() is called also in
arch/loongarch/kernel/reset.c and arch/powerpc/kernel/traps.c.

The loongarch code uses it in halt(). IMHO, it would be
better to use the normal flush. But maybe they accept the risk.
I know nothing about this architecture and who uses it.

The function defined on powerpc is then used in:

  + arch/powerpc/platforms/powernv/opal.c:

     IMHO, it should be replaced there by normal flush. It seems
     to call the generic panic() later.

  + arch/powerpc/platforms/ps3/setup.c:

      This seems to be used as a panic notifier ppc_panic_platform_handler().
      I am not completely sure but it does not look like the final flush.

  + arch/powerpc/platforms/pseries/setup.c:

      Same as ps3/setup.c. Also "just" a panic notifier.

Anyway, we should make clear that console_flush_on_panic() might break
the system and should be called as the last attempt to flush consoles.
The above arch-specific users are worth review.

> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1092,6 +1092,11 @@ void nbcon_atomic_flush_all(void)
>   * Return:	The previous priority that needs to be fed into
>   *		the corresponding nbcon_atomic_exit()
>   * Context:	Any context. Disables migration.
> + *
> + * When within an atomic printing section, no atomic printing occurs. This
> + * is to allow all emergency messages to be dumped into the ringbuffer before
> + * flushing the ringbuffer.

The comment sounds like it is an advantage. But I think that it would be
a disadvantage.

Fortunately, this is not true. Even the atomic context tries
to flush the messages immediately when it is able to get
the per-cpu lock. It happes via console_flush_all() called
from console_unlock().

> + * The actual atomic printing occurs when exiting
> + * the outermost atomic printing section.
>   */
>  enum nbcon_prio nbcon_atomic_enter(enum nbcon_prio prio)
>  {
> @@ -1130,8 +1135,13 @@ void nbcon_atomic_exit(enum nbcon_prio prio, enum nbcon_prio prev_prio)
>  {
>  	struct nbcon_cpu_state *cpu_state;
>  
> +	__nbcon_atomic_flush_all(false);

IMHO, this would not help. Either this context was able to acquire the
lock and flush each message directly. Or it would fail to get the lock
even here.

> +
>  	cpu_state = nbcon_get_cpu_state();
>  
> +	if (cpu_state->prio == NBCON_PRIO_PANIC)
> +		__nbcon_atomic_flush_all(true);

It would deserve a comment that nbcon_atomic_exit() is used
in panic() to calm down the consoles completely, similar effect
as setting the variable "suppress_printk".

> +
>  	/*
>  	 * Undo the nesting of nbcon_atomic_enter() at the CPU state
>  	 * priority.
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 6ef33cefa4d0..419c0fabc481 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3159,6 +3159,8 @@ void console_flush_on_panic(enum con_flush_mode mode)
>  		console_srcu_read_unlock(cookie);
>  	}
>  
> +	nbcon_atomic_flush_all();
> +
>  	console_flush_all(false, &next_seq, &handover);

It seems that console_flush_all() tries to flush nbcon conosoles
as well. And nbcon_atomic_flush_all() is explicitely called
close above this flush_on_panic(). This is from panic()
after this patchset is applied.


void panic(const char *fmt, ...)
{
[...]
	nbcon_atomic_flush_all();

	console_unblank();

	/*
	 * We may have ended up stopping the CPU holding the lock (in
	 * smp_send_stop()) while still having some valuable data in the console
	 * buffer.  Try to acquire the lock then release it regardless of the
	 * result.  The release will also print the buffers out.  Locks debug
	 * should be disabled to avoid reporting bad unlock balance when
	 * panic() is not being callled from OOPS.
	 */
	debug_locks_off();
	console_flush_on_panic(CONSOLE_FLUSH_PENDING);


By other words, we try to flush nbcon consoles 3 times almost
immediately after each other. I believe that there is a motivation
to do so. Anyway, I want to make sure that it was on purpose.

It would deserve some comment what is the purpose. Otherwise, people
would tend to remove the "redundant" calls.

>  }
>  
> @@ -3903,6 +3905,10 @@ void defer_console_output(void)
>  
>  void printk_trigger_flush(void)
>  {
> +	migrate_disable();
> +	nbcon_atomic_flush_all();
> +	migrate_enable();

Can this be actually called in NMI?

> +
>  	defer_console_output();
>  }

This function would deserve some description because it looks strange.
There are three names which are contradicting each other:

  + trigger_flush:    it sounds like it tells someone to do the flush

  + nbcon_atomic_flush_all: does the flush immediately

  + defer_console_output: sounds like it prevents the current context
	to flush the consoles immediately

We should either choose better names and/or add comments:

/**
 * console_flush_or_trigger() - Make sure that the consoles will get flushed.
 *
 * Try to flush consoles when possible or queue flushing consoles like
 * in the deferred printk.
 *
 * Context: Can be used in any context (including NMI?)
 */
void printk_flush_or_trigger(void)
{
	/*
	 * Try to flush consoles which do not depend on console_lock()
	 * and support safe atomic_write()
	 */
	migrate_disable();
	nbcon_atomic_flush_all();
	migrate_enable();

	/* Try flushing legacy consoles or queue the deferred handling. */
	if (!in_nmi() && console_trylock())
		console_unlock();
	else
		defer_console_output();
}


All in all. I feel a bit overwhelmed by the many *flush*() functions at
the moment. Some flush only nbcon consoles. Some flush all. Some
are using only the safe takeover/handover and some allow also
harsh takeover. Some ignore port->lock because of bust_spinlock().
Some are ignoring console_lock. They are called on different
locations. The nbcon variants are called explicitly and also
inside the generic *flush*() functions.

It is partly because it is Friday evening. Anyway, I need to think
more about it.

Best Regards,
Petr
  
John Ogness Sept. 25, 2023, 1:37 p.m. UTC | #2
On 2023-09-22, Petr Mladek <pmladek@suse.com> wrote:
>> Perform nbcon console atomic flushing at key call sites:
>> 
>> nbcon_atomic_exit() - When exiting from the outermost atomic
>> 	printing section.
>
> IMHO, it would not help because it all depends
> whether nbcon_context_try_acquire() succeeds or now, see below.

Sure, but it first attempts a _safe_ flush on all nbcon consoles that
alow it, _then_ it falls back to an _unsafe_ flush (which cannot fail,
but might explode).

>> 	If the priority was NBCON_PRIO_PANIC,
>> 	attempt a second flush allowing unsafe hostile
>> 	takeovers.
>
> I was first scared that this would be called by any printk()
> in panic(). But it seems that nbcon_atomic_exit() is called
> only one after disabling printk(). It might deserve a comment
> where it is supposed to be used. See a proposal below.

OK.

>> console_flush_on_panic() - Called from several call sites to
>> 	trigger ringbuffer dumping in an urgent situation.
>> 
>> console_flush_on_panic() - Typically the panic() function will
>
> This is a second description of console_flush_of_panic() which
> looks like a mistake.

Oops. The first one should not be there.

>> 	take care of atomic flushing the nbcon consoles on
>> 	panic. However, there are several users of
>> 	console_flush_on_panic() outside of panic().
>
> The generic panic() seems to use console_flush_on_panic() correctly
> at the very end.
>
> Hmm, I see that console_flush_on_panic() is called also in

[...]

> Anyway, we should make clear that console_flush_on_panic() might break
> the system and should be called as the last attempt to flush consoles.
> The above arch-specific users are worth review.

In an upcoming series you will see that console_flush_on_panic() only
takes the console lock if there are legacy consoles. Ideally, eventually
there will only be nbcon consoles, so your concern would disappear.

And if those users continue to use legacy consoles, then the risks will
be the same as now.

>>   * Return:	The previous priority that needs to be fed into
>>   *		the corresponding nbcon_atomic_exit()
>>   * Context:	Any context. Disables migration.
>> + *
>> + * When within an atomic printing section, no atomic printing occurs. This
>> + * is to allow all emergency messages to be dumped into the ringbuffer before
>> + * flushing the ringbuffer.
>
> The comment sounds like it is an advantage. But I think that it would be
> a disadvantage.

Please explain. At LPC2022 we agreed it was an advantage and it is
implemented on purpose using the atomic printing sections.

> Fortunately, this is not true. Even the atomic context tries
> to flush the messages immediately when it is able to get
> the per-cpu lock. It happes via console_flush_all() called
> from console_unlock().

You are talking about legacy consoles.

The goal of the nbcon consoles is to introduce a _new_ console type to
support controlled decisions for emergency printing. Legacy consoles
will continue to work (or not work) as before.

>> @@ -1130,8 +1135,13 @@ void nbcon_atomic_exit(enum nbcon_prio prio, enum nbcon_prio prev_prio)
>>  {
>>  	struct nbcon_cpu_state *cpu_state;
>>  
>> +	__nbcon_atomic_flush_all(false);
>
> IMHO, this would not help. Either this context was able to acquire the
> lock and flush each message directly.

What do you mean by "help"? We are exiting an atomic printing
section. This is where the accumulated emergency messages are to be
printed. If this is a nested atomic printing section, it does nothing
because the outermost atomic printing section will flush.

> Or it would fail to get the lock even here.

If it fails to get the lock, we are talking about the worst case
scenario. That scenario demands unsafe printing, which is only allowed
in panic. If we are not in panic, it is assumed the current owner will
take care of it.

>> +
>>  	cpu_state = nbcon_get_cpu_state();
>>  
>> +	if (cpu_state->prio == NBCON_PRIO_PANIC)
>> +		__nbcon_atomic_flush_all(true);
>
> It would deserve a comment that nbcon_atomic_exit() is used
> in panic() to calm down the consoles completely, similar effect
> as setting the variable "suppress_printk".

Actually, it is nbcon_atomic_enter() that does the calming down, but
only for the emergency CPU. Other CPUs are allowed to print as wildly as
they want. And all CPUs are allowed to continue to fill the ringbuffer.

@suppress_printk is quite different as it globally blocks any new
messages from being stored into the ringbuffer. This is not calming down
the consoles, but rather putting a finite limit on what they produce.

>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 6ef33cefa4d0..419c0fabc481 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -3159,6 +3159,8 @@ void console_flush_on_panic(enum con_flush_mode mode)
>>  		console_srcu_read_unlock(cookie);
>>  	}
>>  
>> +	nbcon_atomic_flush_all();
>> +
>>  	console_flush_all(false, &next_seq, &handover);
>
> It seems that console_flush_all() tries to flush nbcon conosoles
> as well. And nbcon_atomic_flush_all() is explicitely called
> close above this flush_on_panic(). This is from panic()
> after this patchset is applied.
>
>
> void panic(const char *fmt, ...)
> {
> [...]
> 	nbcon_atomic_flush_all();
>
> 	console_unblank();
>
> 	/*
> 	 * We may have ended up stopping the CPU holding the lock (in
> 	 * smp_send_stop()) while still having some valuable data in the console
> 	 * buffer.  Try to acquire the lock then release it regardless of the
> 	 * result.  The release will also print the buffers out.  Locks debug
> 	 * should be disabled to avoid reporting bad unlock balance when
> 	 * panic() is not being callled from OOPS.
> 	 */
> 	debug_locks_off();
> 	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
>
>
> By other words, we try to flush nbcon consoles 3 times almost
> immediately after each other.

"Almost immediately" != "immediately". Keep in mind that this is in
atomic printing context. Generally speaking, no messages will be seen on
the consoles yet. Anytime time there is a signficant piece (such as
console_unblank()) it would probably be wise to flush.

We will probably want even more flush points. But I wanted to at least
start with the risky points (such as console_unblank() when legacy
consoles still exist).

> It would deserve some comment what is the purpose. Otherwise, people
> would tend to remove the "redundant" calls.

OK.

>> @@ -3903,6 +3905,10 @@ void defer_console_output(void)
>>  
>>  void printk_trigger_flush(void)
>>  {
>> +	migrate_disable();
>> +	nbcon_atomic_flush_all();
>> +	migrate_enable();
>
> Can this be actually called in NMI?

No. I need to add in_nmi() checks.

>>  	defer_console_output();
>>  }
>
> This function would deserve some description because it looks strange.
> There are three names which are contradicting each other:
>
>   + trigger_flush:    it sounds like it tells someone to do the flush
>
>   + nbcon_atomic_flush_all: does the flush immediately
>
>   + defer_console_output: sounds like it prevents the current context
> 	to flush the consoles immediately

Agreed.

> We should either choose better names and/or add comments:
>
> /**
>  * console_flush_or_trigger() - Make sure that the consoles will get flushed.
>  *
>  * Try to flush consoles when possible or queue flushing consoles like
>  * in the deferred printk.
>  *
>  * Context: Can be used in any context (including NMI?)
>  */
> void printk_flush_or_trigger(void)
> {
> 	/*
> 	 * Try to flush consoles which do not depend on console_lock()
> 	 * and support safe atomic_write()
> 	 */
 	if (!in_nmi()) migrate_disable();
> 	nbcon_atomic_flush_all();
 	if (!in_nmi()) migrate_enable();
>
> 	/* Try flushing legacy consoles or queue the deferred handling. */
> 	if (!in_nmi() && console_trylock())
> 		console_unlock();
> 	else
> 		defer_console_output();
> }

I would be OK with this name and (fixed up for NMI) implementation.

> All in all. I feel a bit overwhelmed by the many *flush*() functions at
> the moment. Some flush only nbcon consoles. Some flush all. Some
> are using only the safe takeover/handover and some allow also
> harsh takeover. Some ignore port->lock because of bust_spinlock().
> Some are ignoring console_lock. They are called on different
> locations. The nbcon variants are called explicitly and also
> inside the generic *flush*() functions.

Agreed. This madness exists in part because we are continuing to support
legacy consoles. I tried to keep the two separate as much as
possible. But when it comes to flushing, there will be some overlap.

When working on the code, I tend to either look at the legacy consoles
_or_ the nbcon consoles. If you try to visualize the whole picture on a
system with legacy and nbcon consoles, it is a bit overwhelming. I
recommend focussing on them separately. I.e. when talking about nbcon
consoles, there is no reason to be concerned about the state of the
console lock or the port lock. When talking about legacy consoles, there
is no reason to be concerned about CPU atomic printing states.

John
  
Petr Mladek Sept. 26, 2023, 12:14 p.m. UTC | #3
On Mon 2023-09-25 15:43:03, John Ogness wrote:
> On 2023-09-22, Petr Mladek <pmladek@suse.com> wrote:
> >> console_flush_on_panic() - Called from several call sites to
> >> 	trigger ringbuffer dumping in an urgent situation.
> >> 
> >> console_flush_on_panic() - Typically the panic() function will
> >
> > This is a second description of console_flush_of_panic() which
> > looks like a mistake.
> 
> Oops. The first one should not be there.
> 
> >> 	take care of atomic flushing the nbcon consoles on
> >> 	panic. However, there are several users of
> >> 	console_flush_on_panic() outside of panic().
> >
> > The generic panic() seems to use console_flush_on_panic() correctly
> > at the very end.
> >
> > Hmm, I see that console_flush_on_panic() is called also in
> 
> [...]
> 
> > Anyway, we should make clear that console_flush_on_panic() might break
> > the system and should be called as the last attempt to flush consoles.
> > The above arch-specific users are worth review.
> 
> In an upcoming series you will see that console_flush_on_panic() only
> takes the console lock if there are legacy consoles. Ideally, eventually
> there will only be nbcon consoles, so your concern would disappear.

The legacy consoles have two risk levels:

  1. post->lock is ignored after bust_spinlocks()
  2. even console_lock is ignored in console_flush_on_panic()

The nbcon consoles have only one risk level:

  1. unsafe takeover is allowed

First, I thought that we wanted to allow the unsafe takeover in
console_flush_on_panic(). In that case, this function would
be dangerous even for nbcon consoles.

Now, I remember that we wanted to allow it only before entering
the infinite loop (blinking diodes). In this case,
console_flush_on_panic() would be really safe for nbcon consoles.


> And if those users continue to use legacy consoles, then the risks will
> be the same as now.
> 
> >>   * Return:	The previous priority that needs to be fed into
> >>   *		the corresponding nbcon_atomic_exit()
> >>   * Context:	Any context. Disables migration.
> >> + *
> >> + * When within an atomic printing section, no atomic printing occurs. This
> >> + * is to allow all emergency messages to be dumped into the ringbuffer before
> >> + * flushing the ringbuffer.
> >
> > The comment sounds like it is an advantage. But I think that it would be
> > a disadvantage.
> 
> Please explain. At LPC2022 we agreed it was an advantage and it is
> implemented on purpose using the atomic printing sections.

I am sorry but I do not remember the details. Do you remember
the motivation, please?

In each case, we can't just say that this works by design
because someone somewhere agreed on it. We must explain
why this is better and I do not see it at the moment.

I am terribly sorry if I agreed with this and I disagree now.
I have never been good in life discussion because there is
no enough time to think about all consequences.

Anyway, the proposed behavior (agreed on LPC2022) seems
to contradict the original plan from LPC 2019, see
https://lore.kernel.org/all/87k1acz5rx.fsf@linutronix.de/
Namely:

  --- cut ---
  3. Rather than defining emergency _messages_, we define an emergency
  _state_ where the kernel wants to flush the messages immediately before
  dying. Unlike oops_in_progress, this state will not be visible to
  anything outside of the printk infrastructure.

  4. When in emergency state, the kernel will use a new console callback
  write_atomic() to flush the messages in whatever context the CPU is in
  at that moment. Only consoles that implement the NMI-safe write_atomic()
  will be able to flush in this state.
  --- cut ---

We wanted to flush it ASAP.

I wonder if we discussed some limitations where the messages
could not be flushed immediately. Maybe, we discussed a scenario
when there are many pending messages which would delay
flushing the emergency ones. But we need to flush them anyway.

Now, I do not see any real advantage to first store all messages
and flush them later in the same context.

OK, flushign them immediately might cause delay when flushing the
first emergency one. But storing all might cause overwrinting
the first emergency messages.

I hope that my proposed change would actually make things easier
and will not affect that much the upcoming patchsets.

Best Regards,
Petr
  
John Ogness Oct. 5, 2023, 1:59 p.m. UTC | #4
On 2023-09-26, Petr Mladek <pmladek@suse.com> wrote:
> Anyway, the proposed behavior (agreed on LPC2022) seems
> to contradict the original plan from LPC 2019, see
> https://lore.kernel.org/all/87k1acz5rx.fsf@linutronix.de/
> Namely:
>
>   --- cut ---
>   3. Rather than defining emergency _messages_, we define an emergency
>   _state_ where the kernel wants to flush the messages immediately before
>   dying. Unlike oops_in_progress, this state will not be visible to
>   anything outside of the printk infrastructure.
>
>   4. When in emergency state, the kernel will use a new console callback
>   write_atomic() to flush the messages in whatever context the CPU is in
>   at that moment. Only consoles that implement the NMI-safe write_atomic()
>   will be able to flush in this state.
>   --- cut ---
>
> We wanted to flush it ASAP.

In 2019 we didn't have all the code yet. Yes, we assumed that flushing
each individual printk() call would be how to do it. But in 2022, during
a live demo with real code, we saw that it was a bad idea.

I disagree that the 2022 decisions are contradicting the 2019
decisions. We have added more details because now we have a real
implementation.

v2 establishes an emergency state before dying. v2 flushes the messages
safely before dying (and unsafely _immediately_ before dying). v2
flushes the messages in whatever context the CPU is in. v2 uses
write_atomic() to perform the flush.

John
  

Patch

diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 2a9ff18fc78c..82e6a1678363 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1092,6 +1092,11 @@  void nbcon_atomic_flush_all(void)
  * Return:	The previous priority that needs to be fed into
  *		the corresponding nbcon_atomic_exit()
  * Context:	Any context. Disables migration.
+ *
+ * When within an atomic printing section, no atomic printing occurs. This
+ * is to allow all emergency messages to be dumped into the ringbuffer before
+ * flushing the ringbuffer. The actual atomic printing occurs when exiting
+ * the outermost atomic printing section.
  */
 enum nbcon_prio nbcon_atomic_enter(enum nbcon_prio prio)
 {
@@ -1130,8 +1135,13 @@  void nbcon_atomic_exit(enum nbcon_prio prio, enum nbcon_prio prev_prio)
 {
 	struct nbcon_cpu_state *cpu_state;
 
+	__nbcon_atomic_flush_all(false);
+
 	cpu_state = nbcon_get_cpu_state();
 
+	if (cpu_state->prio == NBCON_PRIO_PANIC)
+		__nbcon_atomic_flush_all(true);
+
 	/*
 	 * Undo the nesting of nbcon_atomic_enter() at the CPU state
 	 * priority.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 6ef33cefa4d0..419c0fabc481 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3159,6 +3159,8 @@  void console_flush_on_panic(enum con_flush_mode mode)
 		console_srcu_read_unlock(cookie);
 	}
 
+	nbcon_atomic_flush_all();
+
 	console_flush_all(false, &next_seq, &handover);
 }
 
@@ -3903,6 +3905,10 @@  void defer_console_output(void)
 
 void printk_trigger_flush(void)
 {
+	migrate_disable();
+	nbcon_atomic_flush_all();
+	migrate_enable();
+
 	defer_console_output();
 }