[printk,v2,22/26] printk: nbcon: Implement emergency sections
Commit Message
From: Thomas Gleixner <tglx@linutronix.de>
In emergency situations (something has gone wrong but the
system continues to operate), usually important information
(such as a backtrace) is generated via printk(). Each
individual printk record has little meaning. It is the
collection of printk messages that is most often needed by
developers and users.
In order to help ensure that the collection of printk messages
in an emergency situation are all stored to the ringbuffer as
quickly as possible, disable console output for that CPU while
it is in the emergency situation. When exiting the emergency
situation, trigger the consoles to be flushed.
Add per-CPU emergency nesting tracking because an emergency
can arise while in an emergency situation.
Add functions to mark the beginning and end of emergency
sections where the urgent messages are generated.
Do not print if the current CPU is in an emergency state.
Trigger console flushing when exiting all emergency nesting.
Note that the emergency state is not system-wide. While one CPU
is in an emergency state, another CPU may continue to print
console messages.
Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
---
include/linux/console.h | 4 ++
kernel/printk/nbcon.c | 81 +++++++++++++++++++++++++++++++++++++++++
kernel/printk/printk.c | 25 ++++++++++---
3 files changed, 104 insertions(+), 6 deletions(-)
Comments
On Sun 2024-02-18 20:03:22, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In emergency situations (something has gone wrong but the
> system continues to operate), usually important information
> (such as a backtrace) is generated via printk(). Each
> individual printk record has little meaning. It is the
> collection of printk messages that is most often needed by
> developers and users.
>
> In order to help ensure that the collection of printk messages
> in an emergency situation are all stored to the ringbuffer as
> quickly as possible, disable console output for that CPU while
> it is in the emergency situation. When exiting the emergency
> situation, trigger the consoles to be flushed.
>
> Add per-CPU emergency nesting tracking because an emergency
> can arise while in an emergency situation.
>
> Add functions to mark the beginning and end of emergency
> sections where the urgent messages are generated.
>
> Do not print if the current CPU is in an emergency state.
>
> Trigger console flushing when exiting all emergency nesting.
>
> Note that the emergency state is not system-wide. While one CPU
> is in an emergency state, another CPU may continue to print
> console messages.
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2404,16 +2404,29 @@ asmlinkage int vprintk_emit(int facility, int level,
> * printing of all remaining records to all consoles so that
> * this context can return as soon as possible. Hopefully
> * another printk() caller will take over the printing.
> + *
> + * Also, nbcon_get_default_prio() requires migration disabled.
> */
> preempt_disable();
> +
> /*
> - * Try to acquire and then immediately release the console
> - * semaphore. The release will print out buffers. With the
> - * spinning variant, this context tries to take over the
> - * printing from another printing context.
> + * Do not emit for EMERGENCY priority. The console will be
> + * explicitly flushed when exiting the emergency section.
> */
> - if (console_trylock_spinning())
> - console_unlock();
> + if (nbcon_get_default_prio() == NBCON_PRIO_EMERGENCY) {
> + do_trylock_unlock = false;
This would cause calling defer_console_output() in this printk().
I think that we do not want it. It is done later by
nbcon_cpu_emergency_exit(). I think that we want something like:
/*
* Try to acquire and then immediately release the
* console semaphore. The release will print out
* buffers. With the spinning variant, this context
* tries to take over the printing from another
* printing context.
*
* Skip it in EMERGENCY priority. The console will be
* explicitly flushed when exiting the emergency section.
*/
if (nbcon_get_default_prio() != NBCON_PRIO_EMERGENCY) {
if (console_trylock_spinning())
console_unlock();
}
> + } else {
> + /*
> + * Try to acquire and then immediately release the
> + * console semaphore. The release will print out
> + * buffers. With the spinning variant, this context
> + * tries to take over the printing from another
> + * printing context.
> + */
> + if (console_trylock_spinning())
> + console_unlock();
> + }
> +
> preempt_enable();
> }
Otherwise, it looks good.
Best Regards,
Petr
On Sun 2024-02-18 20:03:22, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
>
> In emergency situations (something has gone wrong but the
> system continues to operate), usually important information
> (such as a backtrace) is generated via printk(). Each
> individual printk record has little meaning. It is the
> collection of printk messages that is most often needed by
> developers and users.
>
> In order to help ensure that the collection of printk messages
> in an emergency situation are all stored to the ringbuffer as
> quickly as possible, disable console output for that CPU while
> it is in the emergency situation. When exiting the emergency
> situation, trigger the consoles to be flushed.
>
> Add per-CPU emergency nesting tracking because an emergency
> can arise while in an emergency situation.
>
> Add functions to mark the beginning and end of emergency
> sections where the urgent messages are generated.
>
> Do not print if the current CPU is in an emergency state.
>
> Trigger console flushing when exiting all emergency nesting.
>
> Note that the emergency state is not system-wide. While one CPU
> is in an emergency state, another CPU may continue to print
> console messages.
>
> --- a/kernel/printk/nbcon.c
> +++ b/kernel/printk/nbcon.c
> @@ -1105,6 +1134,58 @@ void nbcon_atomic_flush_unsafe(void)
> __nbcon_atomic_flush_all(prb_next_reserve_seq(prb), true);
> }
>
> +/**
> + * nbcon_cpu_emergency_enter - Enter an emergency section where printk()
> + * messages for that CPU are only stored
> + *
> + * Upon exiting the emergency section, all stored messages are flushed.
> + *
> + * Context: Any context. Disables preemption.
> + *
> + * When within an emergency section, no printing occurs on that CPU. This
> + * is to allow all emergency messages to be dumped into the ringbuffer before
> + * flushing the ringbuffer. The actual printing occurs when exiting the
> + * outermost emergency section.
> + */
> +void nbcon_cpu_emergency_enter(void)
> +{
> + unsigned int *cpu_emergency_nesting;
> +
> + preempt_disable();
> +
> + cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> + (*cpu_emergency_nesting)++;
> +}
> +
> +/**
> + * nbcon_cpu_emergency_exit - Exit an emergency section and flush the
> + * stored messages
> + *
> + * Flushing only occurs when exiting all nesting for the CPU.
> + *
> + * Context: Any context. Enables preemption.
> + */
> +void nbcon_cpu_emergency_exit(void)
> +{
> + unsigned int *cpu_emergency_nesting;
> + bool do_trigger_flush = false;
> +
> + cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> +
> + WARN_ON_ONCE(*cpu_emergency_nesting == 0);
> +
> + if (*cpu_emergency_nesting == 1)
> + do_trigger_flush = true;
> +
> + /* Undo the nesting count of nbcon_cpu_emergency_enter(). */
> + (*cpu_emergency_nesting)--;
> +
> + preempt_enable();
> +
> + if (do_trigger_flush)
> + printk_trigger_flush();
Just an idea. printk_trigger_flush() calls defer_console_output().
It always moves the flushing into IRQ context.
It might make sense to add a flush() function which would try
to flush the messages directly from this context and
queue the IRQ work only when it fails. Something like:
void printk_emergency_flush()
{
/* nbcon consoles could be flushed in any context. */
if (have_nbcon_console)
nbcon_flush_all();
if (have_legacy_console) {
if (console_trylock())
console_unlock();
else
defer_console_output();
}
}
But wait, nbcon_flush_all() might have troubles to get the per-console
lock because it would be called with NBCON_PRIO_NORMAL.
Wait, wait, wait.
defer_console_output() schedules wake_up_klogd_work_func(). It flushes
only legacy consoles. It means that even emergency messages would
need to wait for the printk kthread.
By other words, it seems that the emergency context does not have any
effect for nbcon consoles.
It brings a question if any code would need to acquire the emergency
context at all.
I quess that this does not work as expected.
Or do I miss anything, please?
Best Regards,
Petr
On Fri 2024-03-01 16:49:54, Petr Mladek wrote:
> On Sun 2024-02-18 20:03:22, John Ogness wrote:
> > From: Thomas Gleixner <tglx@linutronix.de>
> >
> > In emergency situations (something has gone wrong but the
> > system continues to operate), usually important information
> > (such as a backtrace) is generated via printk(). Each
> > individual printk record has little meaning. It is the
> > collection of printk messages that is most often needed by
> > developers and users.
> >
> > In order to help ensure that the collection of printk messages
> > in an emergency situation are all stored to the ringbuffer as
> > quickly as possible, disable console output for that CPU while
> > it is in the emergency situation. When exiting the emergency
> > situation, trigger the consoles to be flushed.
> >
> > Add per-CPU emergency nesting tracking because an emergency
> > can arise while in an emergency situation.
> >
> > Add functions to mark the beginning and end of emergency
> > sections where the urgent messages are generated.
> >
> > Do not print if the current CPU is in an emergency state.
> >
> > Trigger console flushing when exiting all emergency nesting.
> >
> > Note that the emergency state is not system-wide. While one CPU
> > is in an emergency state, another CPU may continue to print
> > console messages.
> >
> > --- a/kernel/printk/nbcon.c
> > +++ b/kernel/printk/nbcon.c
> > @@ -1105,6 +1134,58 @@ void nbcon_atomic_flush_unsafe(void)
> > __nbcon_atomic_flush_all(prb_next_reserve_seq(prb), true);
> > }
> >
> > +/**
> > + * nbcon_cpu_emergency_enter - Enter an emergency section where printk()
> > + * messages for that CPU are only stored
> > + *
> > + * Upon exiting the emergency section, all stored messages are flushed.
> > + *
> > + * Context: Any context. Disables preemption.
> > + *
> > + * When within an emergency section, no printing occurs on that CPU. This
> > + * is to allow all emergency messages to be dumped into the ringbuffer before
> > + * flushing the ringbuffer. The actual printing occurs when exiting the
> > + * outermost emergency section.
> > + */
> > +void nbcon_cpu_emergency_enter(void)
> > +{
> > + unsigned int *cpu_emergency_nesting;
> > +
> > + preempt_disable();
> > +
> > + cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> > + (*cpu_emergency_nesting)++;
> > +}
> > +
> > +/**
> > + * nbcon_cpu_emergency_exit - Exit an emergency section and flush the
> > + * stored messages
> > + *
> > + * Flushing only occurs when exiting all nesting for the CPU.
> > + *
> > + * Context: Any context. Enables preemption.
> > + */
> > +void nbcon_cpu_emergency_exit(void)
> > +{
> > + unsigned int *cpu_emergency_nesting;
> > + bool do_trigger_flush = false;
> > +
> > + cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
> > +
> > + WARN_ON_ONCE(*cpu_emergency_nesting == 0);
> > +
> > + if (*cpu_emergency_nesting == 1)
> > + do_trigger_flush = true;
> > +
> > + /* Undo the nesting count of nbcon_cpu_emergency_enter(). */
> > + (*cpu_emergency_nesting)--;
> > +
> > + preempt_enable();
> > +
> > + if (do_trigger_flush)
> > + printk_trigger_flush();
>
> Just an idea. printk_trigger_flush() calls defer_console_output().
> It always moves the flushing into IRQ context.
>
> It might make sense to add a flush() function which would try
> to flush the messages directly from this context and
> queue the IRQ work only when it fails. Something like:
>
> void printk_emergency_flush()
> {
> /* nbcon consoles could be flushed in any context. */
> if (have_nbcon_console)
> nbcon_flush_all();
It seems that the previous version of the patchset called this
in nbcon_cpu_emergency_exit() before decrementing the cpu state.
Sigh, it was me who said that it was useless, see
https://lore.kernel.org/all/ZQ3R4Lz1LHQYsylw@alley/
I am sorry for this.
To my defense. It was a generic function added early in the patchset.
And it was used by both emergency and panic contexts. It is possible
that I did not know about the emergency context behavior at
the time of review.
> if (have_legacy_console) {
> if (console_trylock())
> console_unlock();
> else
> defer_console_output();
> }
> }
>
> But wait, nbcon_flush_all() might have troubles to get the per-console
> lock because it would be called with NBCON_PRIO_NORMAL.
>
>
> Wait, wait, wait.
>
> defer_console_output() schedules wake_up_klogd_work_func(). It flushes
> only legacy consoles. It means that even emergency messages would
> need to wait for the printk kthread.
>
> By other words, it seems that the emergency context does not have any
> effect for nbcon consoles.
It should get fixed by calling nbcon_flush_all() before decrementing
cpu_emergency_nesting counter.
Best Regards,
Petr
@@ -478,10 +478,14 @@ static inline bool console_is_registered(const struct console *con)
hlist_for_each_entry(con, &console_list, node)
#ifdef CONFIG_PRINTK
+extern void nbcon_cpu_emergency_enter(void);
+extern void nbcon_cpu_emergency_exit(void);
extern bool nbcon_can_proceed(struct nbcon_write_context *wctxt);
extern bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt);
extern bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt);
#else
+static inline void nbcon_cpu_emergency_enter(void) { }
+static inline void nbcon_cpu_emergency_exit(void) { }
static inline bool nbcon_can_proceed(struct nbcon_write_context *wctxt) { return false; }
static inline bool nbcon_enter_unsafe(struct nbcon_write_context *wctxt) { return false; }
static inline bool nbcon_exit_unsafe(struct nbcon_write_context *wctxt) { return false; }
@@ -929,6 +929,29 @@ static bool nbcon_emit_next_record(struct nbcon_write_context *wctxt)
return nbcon_context_exit_unsafe(ctxt);
}
+/* Track the nbcon emergency nesting per CPU. */
+static DEFINE_PER_CPU(unsigned int, nbcon_pcpu_emergency_nesting);
+static unsigned int early_nbcon_pcpu_emergency_nesting __initdata;
+
+/**
+ * nbcon_get_cpu_emergency_nesting - Get the per CPU emergency nesting pointer
+ *
+ * Return: Either a pointer to the per CPU emergency nesting counter of
+ * the current CPU or to the init data during early boot.
+ */
+static __ref unsigned int *nbcon_get_cpu_emergency_nesting(void)
+{
+ /*
+ * The value of __printk_percpu_data_ready gets set in normal
+ * context and before SMP initialization. As a result it could
+ * never change while inside an nbcon emergency section.
+ */
+ if (!printk_percpu_data_ready())
+ return &early_nbcon_pcpu_emergency_nesting;
+
+ return this_cpu_ptr(&nbcon_pcpu_emergency_nesting);
+}
+
/**
* nbcon_atomic_emit_one - Print one record for an nbcon console using the
* write_atomic() callback
@@ -971,9 +994,15 @@ static bool nbcon_atomic_emit_one(struct nbcon_write_context *wctxt)
*/
enum nbcon_prio nbcon_get_default_prio(void)
{
+ unsigned int *cpu_emergency_nesting;
+
if (this_cpu_in_panic())
return NBCON_PRIO_PANIC;
+ cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
+ if (*cpu_emergency_nesting)
+ return NBCON_PRIO_EMERGENCY;
+
return NBCON_PRIO_NORMAL;
}
@@ -1105,6 +1134,58 @@ void nbcon_atomic_flush_unsafe(void)
__nbcon_atomic_flush_all(prb_next_reserve_seq(prb), true);
}
+/**
+ * nbcon_cpu_emergency_enter - Enter an emergency section where printk()
+ * messages for that CPU are only stored
+ *
+ * Upon exiting the emergency section, all stored messages are flushed.
+ *
+ * Context: Any context. Disables preemption.
+ *
+ * When within an emergency section, no printing occurs on that CPU. This
+ * is to allow all emergency messages to be dumped into the ringbuffer before
+ * flushing the ringbuffer. The actual printing occurs when exiting the
+ * outermost emergency section.
+ */
+void nbcon_cpu_emergency_enter(void)
+{
+ unsigned int *cpu_emergency_nesting;
+
+ preempt_disable();
+
+ cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
+ (*cpu_emergency_nesting)++;
+}
+
+/**
+ * nbcon_cpu_emergency_exit - Exit an emergency section and flush the
+ * stored messages
+ *
+ * Flushing only occurs when exiting all nesting for the CPU.
+ *
+ * Context: Any context. Enables preemption.
+ */
+void nbcon_cpu_emergency_exit(void)
+{
+ unsigned int *cpu_emergency_nesting;
+ bool do_trigger_flush = false;
+
+ cpu_emergency_nesting = nbcon_get_cpu_emergency_nesting();
+
+ WARN_ON_ONCE(*cpu_emergency_nesting == 0);
+
+ if (*cpu_emergency_nesting == 1)
+ do_trigger_flush = true;
+
+ /* Undo the nesting count of nbcon_cpu_emergency_enter(). */
+ (*cpu_emergency_nesting)--;
+
+ preempt_enable();
+
+ if (do_trigger_flush)
+ printk_trigger_flush();
+}
+
/**
* nbcon_alloc - Allocate buffers needed by the nbcon console
* @con: Console to allocate buffers for
@@ -2404,16 +2404,29 @@ asmlinkage int vprintk_emit(int facility, int level,
* printing of all remaining records to all consoles so that
* this context can return as soon as possible. Hopefully
* another printk() caller will take over the printing.
+ *
+ * Also, nbcon_get_default_prio() requires migration disabled.
*/
preempt_disable();
+
/*
- * Try to acquire and then immediately release the console
- * semaphore. The release will print out buffers. With the
- * spinning variant, this context tries to take over the
- * printing from another printing context.
+ * Do not emit for EMERGENCY priority. The console will be
+ * explicitly flushed when exiting the emergency section.
*/
- if (console_trylock_spinning())
- console_unlock();
+ if (nbcon_get_default_prio() == NBCON_PRIO_EMERGENCY) {
+ do_trylock_unlock = false;
+ } else {
+ /*
+ * Try to acquire and then immediately release the
+ * console semaphore. The release will print out
+ * buffers. With the spinning variant, this context
+ * tries to take over the printing from another
+ * printing context.
+ */
+ if (console_trylock_spinning())
+ console_unlock();
+ }
+
preempt_enable();
}