[printk,v1,09/18] printk: nobkl: Add print state functions

Message ID 20230302195618.156940-10-john.ogness@linutronix.de
State New
Headers
Series threaded/atomic console support |

Commit Message

John Ogness March 2, 2023, 7:56 p.m. UTC
  From: Thomas Gleixner <tglx@linutronix.de>

Provide three functions which are related to the safe handover
mechanism and allow console drivers to denote takeover unsafe
sections:

 - console_can_proceed()

   Invoked by a console driver to check whether a handover request
   is pending or whether the console was taken over in a hostile
   fashion.

 - console_enter/exit_unsafe()

   Invoked by a console driver to denote that the driver output
   function is about to enter or to leave an critical region where a
   hostile take over is unsafe. These functions are also
   cancellation points.

   The unsafe state is stored in the console state and allows a
   takeover attempt to make informed decisions whether to take over
   and/or output on such a console at all. The unsafe state is also
   available to the driver in the write context for the
   atomic_write() output function so the driver can make informed
   decisions about the required actions or take a special emergency
   path.

Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
---
 include/linux/console.h      |   3 +
 kernel/printk/printk_nobkl.c | 139 +++++++++++++++++++++++++++++++++++
 2 files changed, 142 insertions(+)
  

Comments

Petr Mladek March 29, 2023, 1:58 p.m. UTC | #1
On Thu 2023-03-02 21:02:09, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Provide three functions which are related to the safe handover
> mechanism and allow console drivers to denote takeover unsafe
> sections:
> 
>  - console_can_proceed()
> 
>    Invoked by a console driver to check whether a handover request
>    is pending or whether the console was taken over in a hostile
>    fashion.
> 
> --- a/kernel/printk/printk_nobkl.c
> +++ b/kernel/printk/printk_nobkl.c
> @@ -947,6 +947,145 @@ static void cons_free_percpu_data(struct console *con)
>  	con->pcpu_data = NULL;
>  }
>  
> +/**
> + * console_can_proceed - Check whether printing can proceed
> + * @wctxt:	The write context that was handed to the write function
> + *
> + * Returns:	True if the state is correct. False if a handover
> + *		has been requested or if the console was taken
> + *		over.
> + *
> + * Must be invoked after the record was dumped into the assigned record
> + * buffer

The word "after" made me think about possible races when the record
buffer is being filled. The owner might loose the lock a hostile
way during this action. And we should prevent using the same buffer
when the other owner is still modifying the content.

It should be safe when the same buffer might be used only by nested
contexts. It does not matter if the outer context finishes writing
later. The nested context should not need the buffer anymore.

But a problem might happen when the same buffer is shared between
more non-nested contexts. One context might loose the lock a hostile way.
The other context might get the access after the hostile context
released the lock.

NORMAL and PANIC contexts are safe. These priorities have only
one context and both have their own buffers.

A problem might be with EMERGENCY contexts. Each CPU might have
its own EMERGENCY context. We might prevent this problem if
we do not allow to acquire the lock in EMERGENCY (and NORMAL)
context when panic() is running or after the first hostile
takeover.

If we want to detect these problems and always be on the safe side,
we might need to add a flag 1:1 connected with the buffers.

We either could put a flag into struct printk_buffers. Or we could
bundle this struct into another one for the atomic consoles.
I mean something like:

struct printk_atomic_buffers {
	struct printk_buffers pbufs;
	atomic_t write_lock;
}

And use it in cons_get_record():

#define PRINTK_BUFFER_WRITE_LOCKED_VAL 1

static int cons_get_record(struct cons_write_context *wctxt)
{
	int ret = 0;

	/*
	 * The buffer is not usable when another write context
	 * is still writing the record and lost the lock a hostille
	 * way.
	 */
	if (WARN_ON_ONCE(cmpxchg_acquire(&wctxt->pabufs->write_lock,
			    0, PRINTK_BUFFER_WRITE_LOCKED_VAL) != 0)) {
		return -EBUSY;
	}

	// Fill the buffers

	if (no new message) {
		ret = -ENOENT;
		goto unlock;
	}

unlock:
	/* Release the write lock */
	atomic_set_release(&wctxt->pabufs->write_lock, 0);
	return ret;
}


Note: This is related to the discussion about the 7th patch but
      it is not the same.

      This mail is only about using the same buffer for the same console.

      The other discussion was also about using the same buffer
      for more consoles. It is even more problematic because
      each console uses its own lock.

      It means that we would still need separate buffer for each
      interrupt context. Nested context might be able to get
      the lock for another console a regular way, see
      https://lore.kernel.org/all/ZBndaSUFd4ipvKwj@alley/

> + * and at appropriate safe places in the driver.  For unsafe driver
> + * sections see console_enter_unsafe().
> + *
> + * When this function returns false then the calling context is not allowed
> + * to go forward and has to back out immediately and carefully. The buffer
> + * content is no longer trusted either and the console lock is no longer
> + * held.
> + */
> +bool console_can_proceed(struct cons_write_context *wctxt)
> +{

Best Regards,
Petr
  
Petr Mladek March 29, 2023, 2:05 p.m. UTC | #2
On Thu 2023-03-02 21:02:09, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Provide three functions which are related to the safe handover
> mechanism and allow console drivers to denote takeover unsafe
> sections:
> 
>  - console_can_proceed()
> 
>    Invoked by a console driver to check whether a handover request
>    is pending or whether the console was taken over in a hostile
>    fashion.
> 
>  - console_enter/exit_unsafe()
> 
>    Invoked by a console driver to denote that the driver output
>    function is about to enter or to leave an critical region where a
>    hostile take over is unsafe. These functions are also
>    cancellation points.
> 
>    The unsafe state is stored in the console state and allows a
>    takeover attempt to make informed decisions whether to take over
>    and/or output on such a console at all. The unsafe state is also
>    available to the driver in the write context for the
>    atomic_write() output function so the driver can make informed
>    decisions about the required actions or take a special emergency
>    path.
> 
> --- a/kernel/printk/printk_nobkl.c
> +++ b/kernel/printk/printk_nobkl.c
> @@ -947,6 +947,145 @@ static void cons_free_percpu_data(struct console *con)
>  	con->pcpu_data = NULL;
>  }
>  
> +/**
> + * console_can_proceed - Check whether printing can proceed
> + * @wctxt:	The write context that was handed to the write function
> + *
> + * Returns:	True if the state is correct. False if a handover
> + *		has been requested or if the console was taken
> + *		over.
> + *
> + * Must be invoked after the record was dumped into the assigned record
> + * buffer and at appropriate safe places in the driver.  For unsafe driver
> + * sections see console_enter_unsafe().
> + *
> + * When this function returns false then the calling context is not allowed
> + * to go forward and has to back out immediately and carefully. The buffer
> + * content is no longer trusted either and the console lock is no longer
> + * held.
> + */
> +bool console_can_proceed(struct cons_write_context *wctxt)
> +{
> +	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> +	struct console *con = ctxt->console;
> +	struct cons_state state;
> +
> +	cons_state_read(con, CON_STATE_CUR, &state);
> +	/* Store it for analysis or reuse */
> +	copy_full_state(ctxt->old_state, state);
> +
> +	/* Make sure this context is still the owner. */
> +	if (!cons_state_full_match(state, ctxt->state))
> +		return false;
> +
> +	/*
> +	 * Having a safe point for take over and eventually a few
> +	 * duplicated characters or a full line is way better than a
> +	 * hostile takeover. Post processing can take care of the garbage.
> +	 * Continue if the requested priority is not sufficient.
> +	 */
> +	if (state.req_prio <= state.cur_prio)
> +		return true;
> +
> +	/*
> +	 * A console printer within an unsafe region is allowed to continue.
> +	 * It can perform the handover when exiting the safe region. Otherwise
> +	 * a hostile takeover will be necessary.
> +	 */
> +	if (state.unsafe)
> +		return true;

It took me quite some time to scratch my head around the above two comments.
The code is clear but the comments are somehow cryptic ;-)

It is probably because the 1st comment starts talking about a safe point.
But .unsafe is checked after the 2nd comment. And the word "allowed"
confused me in the 2nd comment.

I would explain this in these details in the function description. The
code will be self-explanatory then. I would write something like:

	 * The console is allowed to continue when it still owns the lock
	 * and there is no request from a higher priority context.
	 *
	 * The context might have lost the lock during a hostile takeover.
	 *
	 * The function will handover the lock when there is a request
	 * with a higher priority and the console is in a safe context.
	 * The new owner would print the same line again. But a duplicated
	 * part of a line is better than risking a hostile takeover in
	 * an unsafe context.

> +
> +	/* Release and hand over */
> +	cons_release(ctxt);
> +	/*
> +	 * This does not check whether the handover succeeded.

This is a bit ambiguous. What exactly means that the handover succeeded?
I guess that it means that the context with the higher priority
successfully took over the lock.

A "failure" might be when the other context timeouted and given up.
In that case, nobody would continue printing.

We actually should wake up the kthread when the lock was not
successfully passed. Or even better, we should release the lock
only when the request was still pending. It should be possible
with the cmpxchg().


> +	 * outermost callsite has to make the final decision whether printing
> +	 * should continue

This is a bit misleading. The current owner could not continue after
loosing the lock. It would need to re-start the entire operation.

Is this about the kthread or cons_flush*() layers? Yes, they have to
decide what to do next. Well, we should make it clear that we are talking
about this layer. The con->atomic_write() layer can only carefully
back off.

Maybe, we do not need to describe it here. It should be enough to
mention in the function description that the driver has to
carefully back off and that the buffer content is not longer
trusted. It is already mentioned there.

> +	 * or not (via reacquire, possibly hostile). The
> +	 * console is unlocked already so go back all the way instead of
> +	 * trying to implement heuristics in tons of places.
> +	 */
> +	return false;
> +}
> +
> +static bool __console_update_unsafe(struct cons_write_context *wctxt, bool unsafe)
 > +{
> +	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
> +	struct console *con = ctxt->console;
> +	struct cons_state new;
> +
> +	do  {
> +		if (!console_can_proceed(wctxt))
> +			return false;
> +		/*
> +		 * console_can_proceed() saved the real state in
> +		 * ctxt->old_state
> +		 */
> +		copy_full_state(new, ctxt->old_state);
> +		new.unsafe = unsafe;
> +
> +	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &ctxt->old_state, &new));

This updates only the bit in struct cons_state. But there is also
"bool unsafe" in struct cons_write_context. Should the boolean
be updated as well?

Or is the boolean needed at all? It seems that it is set in
cons_emit_record() and never read.

> +
> +	copy_full_state(ctxt->state, new);
> +	return true;
> +}

Best Regards,
Petr
  
John Ogness March 29, 2023, 2:33 p.m. UTC | #3
On 2023-03-29, Petr Mladek <pmladek@suse.com> wrote:
>> +/**
>> + * console_can_proceed - Check whether printing can proceed
>> + * @wctxt:	The write context that was handed to the write function
>> + *
>> + * Returns:	True if the state is correct. False if a handover
>> + *		has been requested or if the console was taken
>> + *		over.
>> + *
>> + * Must be invoked after the record was dumped into the assigned record
>> + * buffer
>
> The word "after" made me think about possible races when the record
> buffer is being filled. The owner might loose the lock a hostile
> way during this action. And we should prevent using the same buffer
> when the other owner is still modifying the content.
>
> It should be safe when the same buffer might be used only by nested
> contexts. It does not matter if the outer context finishes writing
> later. The nested context should not need the buffer anymore.
>
> But a problem might happen when the same buffer is shared between
> more non-nested contexts. One context might loose the lock a hostile way.
> The other context might get the access after the hostile context
> released the lock.

Hostile takeovers _only occur during panic_.

> NORMAL and PANIC contexts are safe. These priorities have only
> one context and both have their own buffers.
>
> A problem might be with EMERGENCY contexts. Each CPU might have
> its own EMERGENCY context. We might prevent this problem if
> we do not allow to acquire the lock in EMERGENCY (and NORMAL)
> context when panic() is running or after the first hostile
> takeover.

A hostile takeover means a CPU took ownership with PANIC priority. No
CPU can steal ownership from the PANIC owner. Once the PANIC owner
releases ownership, the panic message has been output to the atomic
consoles. Do we really care what happens after that?

John
  
Petr Mladek March 30, 2023, 11:54 a.m. UTC | #4
On Wed 2023-03-29 16:39:54, John Ogness wrote:
> On 2023-03-29, Petr Mladek <pmladek@suse.com> wrote:
> >> +/**
> >> + * console_can_proceed - Check whether printing can proceed
> >> + * @wctxt:	The write context that was handed to the write function
> >> + *
> >> + * Returns:	True if the state is correct. False if a handover
> >> + *		has been requested or if the console was taken
> >> + *		over.
> >> + *
> >> + * Must be invoked after the record was dumped into the assigned record
> >> + * buffer
> >
> > The word "after" made me think about possible races when the record
> > buffer is being filled. The owner might loose the lock a hostile
> > way during this action. And we should prevent using the same buffer
> > when the other owner is still modifying the content.
> >
> > It should be safe when the same buffer might be used only by nested
> > contexts. It does not matter if the outer context finishes writing
> > later. The nested context should not need the buffer anymore.
> >
> > But a problem might happen when the same buffer is shared between
> > more non-nested contexts. One context might loose the lock a hostile way.
> > The other context might get the access after the hostile context
> > released the lock.
> 
> Hostile takeovers _only occur during panic_.
>
> > NORMAL and PANIC contexts are safe. These priorities have only
> > one context and both have their own buffers.
> >
> > A problem might be with EMERGENCY contexts. Each CPU might have
> > its own EMERGENCY context. We might prevent this problem if
> > we do not allow to acquire the lock in EMERGENCY (and NORMAL)
> > context when panic() is running or after the first hostile
> > takeover.
> 
> A hostile takeover means a CPU took ownership with PANIC priority. No
> CPU can steal ownership from the PANIC owner. Once the PANIC owner
> releases ownership, the panic message has been output to the atomic
> consoles. Do we really care what happens after that?

I see. The hostile take over is allowed only in
cons_atomic_exit(CONS_PRIO_PANIC, prev_prio) that is called at the
very end of panic() before the infinite blinking.

It is true that we do not care at this moment. It is actually called
after "suppress_printk = 1;" so that there should not be any
new messages.

Anyway, it would be nice to document this subtle race somewhere.
I could imagine that people would want to risk the hostile
takeover even earlier so the race might get introduced.

Best Regards,
Petr
  

Patch

diff --git a/include/linux/console.h b/include/linux/console.h
index 942cc7f57798..0779757cb917 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -464,6 +464,9 @@  static inline bool console_is_registered(const struct console *con)
 	lockdep_assert_console_list_lock_held();			\
 	hlist_for_each_entry(con, &console_list, node)
 
+extern bool console_can_proceed(struct cons_write_context *wctxt);
+extern bool console_enter_unsafe(struct cons_write_context *wctxt);
+extern bool console_exit_unsafe(struct cons_write_context *wctxt);
 extern bool console_try_acquire(struct cons_write_context *wctxt);
 extern bool console_release(struct cons_write_context *wctxt);
 
diff --git a/kernel/printk/printk_nobkl.c b/kernel/printk/printk_nobkl.c
index 7184a93a5b0d..3318a79a150a 100644
--- a/kernel/printk/printk_nobkl.c
+++ b/kernel/printk/printk_nobkl.c
@@ -947,6 +947,145 @@  static void cons_free_percpu_data(struct console *con)
 	con->pcpu_data = NULL;
 }
 
+/**
+ * console_can_proceed - Check whether printing can proceed
+ * @wctxt:	The write context that was handed to the write function
+ *
+ * Returns:	True if the state is correct. False if a handover
+ *		has been requested or if the console was taken
+ *		over.
+ *
+ * Must be invoked after the record was dumped into the assigned record
+ * buffer and at appropriate safe places in the driver.  For unsafe driver
+ * sections see console_enter_unsafe().
+ *
+ * When this function returns false then the calling context is not allowed
+ * to go forward and has to back out immediately and carefully. The buffer
+ * content is no longer trusted either and the console lock is no longer
+ * held.
+ */
+bool console_can_proceed(struct cons_write_context *wctxt)
+{
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+	struct console *con = ctxt->console;
+	struct cons_state state;
+
+	cons_state_read(con, CON_STATE_CUR, &state);
+	/* Store it for analysis or reuse */
+	copy_full_state(ctxt->old_state, state);
+
+	/* Make sure this context is still the owner. */
+	if (!cons_state_full_match(state, ctxt->state))
+		return false;
+
+	/*
+	 * Having a safe point for take over and eventually a few
+	 * duplicated characters or a full line is way better than a
+	 * hostile takeover. Post processing can take care of the garbage.
+	 * Continue if the requested priority is not sufficient.
+	 */
+	if (state.req_prio <= state.cur_prio)
+		return true;
+
+	/*
+	 * A console printer within an unsafe region is allowed to continue.
+	 * It can perform the handover when exiting the safe region. Otherwise
+	 * a hostile takeover will be necessary.
+	 */
+	if (state.unsafe)
+		return true;
+
+	/* Release and hand over */
+	cons_release(ctxt);
+	/*
+	 * This does not check whether the handover succeeded. The
+	 * outermost callsite has to make the final decision whether printing
+	 * should continue or not (via reacquire, possibly hostile). The
+	 * console is unlocked already so go back all the way instead of
+	 * trying to implement heuristics in tons of places.
+	 */
+	return false;
+}
+
+/**
+ * __console_update_unsafe - Update the unsafe bit in @con->atomic_state
+ * @wctxt:	The write context that was handed to the write function
+ *
+ * Returns:	True if the state is correct. False if a handover
+ *		has been requested or if the console was taken
+ *		over.
+ *
+ * Must be invoked before an unsafe driver section is entered.
+ *
+ * When this function returns false then the calling context is not allowed
+ * to go forward and has to back out immediately and carefully. The buffer
+ * content is no longer trusted either and the console lock is no longer
+ * held.
+ *
+ * Internal helper to avoid duplicated code
+ */
+static bool __console_update_unsafe(struct cons_write_context *wctxt, bool unsafe)
+{
+	struct cons_context *ctxt = &ACCESS_PRIVATE(wctxt, ctxt);
+	struct console *con = ctxt->console;
+	struct cons_state new;
+
+	do  {
+		if (!console_can_proceed(wctxt))
+			return false;
+		/*
+		 * console_can_proceed() saved the real state in
+		 * ctxt->old_state
+		 */
+		copy_full_state(new, ctxt->old_state);
+		new.unsafe = unsafe;
+
+	} while (!cons_state_try_cmpxchg(con, CON_STATE_CUR, &ctxt->old_state, &new));
+
+	copy_full_state(ctxt->state, new);
+	return true;
+}
+
+/**
+ * console_enter_unsafe - Enter an unsafe region in the driver
+ * @wctxt:	The write context that was handed to the write function
+ *
+ * Returns:	True if the state is correct. False if a handover
+ *		has been requested or if the console was taken
+ *		over.
+ *
+ * Must be invoked before an unsafe driver section is entered.
+ *
+ * When this function returns false then the calling context is not allowed
+ * to go forward and has to back out immediately and carefully. The buffer
+ * content is no longer trusted either and the console lock is no longer
+ * held.
+ */
+bool console_enter_unsafe(struct cons_write_context *wctxt)
+{
+	return __console_update_unsafe(wctxt, true);
+}
+
+/**
+ * console_exit_unsafe - Exit an unsafe region in the driver
+ * @wctxt:	The write context that was handed to the write function
+ *
+ * Returns:	True if the state is correct. False if a handover
+ *		has been requested or if the console was taken
+ *		over.
+ *
+ * Must be invoked before an unsafe driver section is exited.
+ *
+ * When this function returns false then the calling context is not allowed
+ * to go forward and has to back out immediately and carefully. The buffer
+ * content is no longer trusted either and the console lock is no longer
+ * held.
+ */
+bool console_exit_unsafe(struct cons_write_context *wctxt)
+{
+	return __console_update_unsafe(wctxt, false);
+}
+
 /**
  * cons_nobkl_init - Initialize the NOBKL console specific data
  * @con:	Console to initialize