[printk,v2,6/8] printk: nbcon: Add ownership state functions

Message ID 20230728000233.50887-7-john.ogness@linutronix.de
State New
Headers
Series wire up nbcon consoles |

Commit Message

John Ogness July 28, 2023, 12:02 a.m. UTC
  From: Thomas Gleixner <tglx@linutronix.de>

Provide functions that are related to the safe handover mechanism
and allow console drivers to dynamically specify unsafe regions:

 - nbcon_context_can_proceed()

   Invoked by a console owner to check whether a handover request
   is pending or whether the console was taken over in a hostile
   fashion. If a handover request is pending, this function will
   also perform the handover, thus cancelling its own ownership.

 - nbcon_context_update_unsafe()

   Invoked by a console owner to denote that the driver is about
   to enter or leave a critical region where a hostile take over
   is unsafe. This function is also a cancellation point where
   loss of ownership can occur.

   The unsafe state is stored in the console state and allows a
   new context to make informed decisions whether to attempt a
   takeover of such a console. The unsafe state is also available
   to the driver so that it can make informed decisions about the
   required actions or take a special emergency path.

Co-developed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Thomas Gleixner (Intel) <tglx@linutronix.de>
---
 kernel/printk/printk_nbcon.c | 113 ++++++++++++++++++++++++++++++++++-
 1 file changed, 112 insertions(+), 1 deletion(-)
  

Comments

Petr Mladek Aug. 10, 2023, 12:56 p.m. UTC | #1
On Fri 2023-07-28 02:08:31, John Ogness wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> Provide functions that are related to the safe handover mechanism
> and allow console drivers to dynamically specify unsafe regions:
> --- a/kernel/printk/printk_nbcon.c
> +++ b/kernel/printk/printk_nbcon.c
> @@ -650,6 +649,118 @@ static void nbcon_context_release(struct nbcon_context *ctxt)
>  	ctxt->pbufs = NULL;
>  }
>  
> +/**
> + * nbcon_context_can_proceed - Check whether ownership can proceed
> + * @ctxt:	The nbcon context from nbcon_context_try_acquire()
> + * @cur:	The current console state
> + *
> + * Return:	True if the state is correct. False if ownership was
> + *		handed over or taken.
> + *
> + * Must be invoked after the record was dumped into the assigned buffer
> + * and at appropriate safe places in the driver.
> + *
> + * When this function returns false then the calling context is no longer
> + * the owner and is no longer allowed to go forward. In this case it must
> + * back out immediately and carefully. The buffer content is also no longer
> + * trusted since it no longer belongs to the calling context.
> + */
> +static bool nbcon_context_can_proceed(struct nbcon_context *ctxt, struct nbcon_state *cur)
> +{
[...]
> +	/*
> +	 * A console owner within an unsafe region is always allowed to
> +	 * proceed, even if there are waiters. It can perform a handover
> +	 * when exiting the unsafe region. Otherwise the waiter will
> +	 * need to perform an unsafe hostile takeover.
> +	 */
> +	if (cur->unsafe) {
> +		debug_store(cur->req_prio > cur->prio,
> +			    "handover: cpu%d IGNORING HANDOVER prio%d -> prio%d (unsafe)\n",
> +			    cpu, cur->prio, cur->req_prio);
> +		return true;
> +	}
[...]
> +}
> +
> +/**
> + * nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
> + * @ctxt:	The nbcon context from nbcon_context_try_acquire()
> + * @unsafe:	The new value for the unsafe bit
> + *
> + * Return:	True if the state is correct. False if ownership was
> + *		handed over or taken.
> + *
> + * Typically the unsafe bit is set during acquire. This function allows
> + * modifying the unsafe status without releasing ownership.
> + *
> + * When this function returns false then the calling context is no longer
> + * the owner and is no longer allowed to go forward. In this case it must
> + * back out immediately and carefully. The buffer content is also no longer
> + * trusted since it no longer belongs to the calling context.
> + *
> + * Internal helper to avoid duplicated code
> + */
> +__maybe_unused
> +static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
> +{
> +	struct console *con = ctxt->console;
> +	struct nbcon_state cur;
> +	struct nbcon_state new;
> +
> +	nbcon_state_read(con, &cur);
> +
> +	/* The unsafe bit must not be cleared if @hostile_unsafe is set. */
> +	if (!unsafe && cur.hostile_unsafe)
> +		return nbcon_context_can_proceed(ctxt, &cur);
> +
> +	do {
> +		if (!nbcon_context_can_proceed(ctxt, &cur))
> +			return false;

nbcon_context_can_proceed() returns "true" even when there
is a pending request. It happens when the current state is "unsafe".
But see below.

> +
> +		new.atom = cur.atom;
> +		new.unsafe = unsafe;
> +	} while (!nbcon_state_try_cmpxchg(con, &cur, &new));

If the new state is "safe" and there is a pending request
then we should release the lock and return false here.

It does not make sense to block the waiter just to realize
that we can't enter "unsafe" state again.

> +	ctxt->unsafe = unsafe;
> +
> +	return true;

An easy solution would be to do here:

	ctxt->unsafe = unsafe;
	return nbcon_context_can_proceed(ctxt, &cur);

> +}

But maybe, we can change the logic a bit. Something like:

/**
 * nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
 * @ctxt:	The nbcon context from nbcon_context_try_acquire()
 * @unsafe:	The new value for the unsafe bit
 *
 * Return:	True if the state is correct. False if ownership was
 *		handed over or taken.
 *
 * When this function returns false then the calling context is no longer
 * the owner and is no longer allowed to go forward. In this case it must
 * back out immediately and carefully. The buffer content is also no longer
 * trusted since it no longer belongs to the calling context.
 *
 * Internal helper to avoid duplicated code
 */
static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
{
	struct console *con = ctxt->console;
	struct nbcon_state cur;
	struct nbcon_state new;
	bool updated, can_proceed;

	if (!nbcon_context_can_proceed(ctxt, &cur))
		return false;

	/* The unsafe bit must not be cleared if @hostile_unsafe is set. */
	if (cur.hostile_unsafe)
		unsafe = true;

	if (cur.unsafe == unsafe)
		return true;

	do {
		new.atom = cur.atom;
		new.unsafe = unsafe;

		updated = nbcon_state_try_cmpxchg(con, &cur, &new));
		/*
		 * The state has changed. Either there is a new
		 * request lor there was a hostile takeover.
		 */
		can_proceed = nbcon_context_can_proceed(ctxt, &cur);
	} while (!updated && can_proceed);

	if (updated)
		ctxt->unsafe = unsafe;

	return can_proceed;
}

Best Regards,
Petr
  

Patch

diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
index 8229a0a00d5b..e41f2eff5ef6 100644
--- a/kernel/printk/printk_nbcon.c
+++ b/kernel/printk/printk_nbcon.c
@@ -623,7 +623,6 @@  static bool nbcon_owner_matches(struct nbcon_state *cur, int expected_cpu,
  * nbcon_context_release - Release the console
  * @ctxt:	The nbcon context from nbcon_context_try_acquire()
  */
-__maybe_unused
 static void nbcon_context_release(struct nbcon_context *ctxt)
 {
 	unsigned int cpu = smp_processor_id();
@@ -650,6 +649,118 @@  static void nbcon_context_release(struct nbcon_context *ctxt)
 	ctxt->pbufs = NULL;
 }
 
+/**
+ * nbcon_context_can_proceed - Check whether ownership can proceed
+ * @ctxt:	The nbcon context from nbcon_context_try_acquire()
+ * @cur:	The current console state
+ *
+ * Return:	True if the state is correct. False if ownership was
+ *		handed over or taken.
+ *
+ * Must be invoked after the record was dumped into the assigned buffer
+ * and at appropriate safe places in the driver.
+ *
+ * When this function returns false then the calling context is no longer
+ * the owner and is no longer allowed to go forward. In this case it must
+ * back out immediately and carefully. The buffer content is also no longer
+ * trusted since it no longer belongs to the calling context.
+ */
+static bool nbcon_context_can_proceed(struct nbcon_context *ctxt, struct nbcon_state *cur)
+{
+	unsigned int cpu = smp_processor_id();
+
+	/* Make sure this context is still the owner. */
+	if (!nbcon_owner_matches(cur, cpu, ctxt->prio)) {
+		debug_store(1, "handover: cpu%d DETECTED HOSTILE takeover\n", cpu);
+		return false;
+	}
+
+	/* The console owner can proceed if there is no waiter. */
+	if (cur->req_prio == NBCON_PRIO_NONE)
+		return true;
+
+	/*
+	 * A console owner within an unsafe region is always allowed to
+	 * proceed, even if there are waiters. It can perform a handover
+	 * when exiting the unsafe region. Otherwise the waiter will
+	 * need to perform an unsafe hostile takeover.
+	 */
+	if (cur->unsafe) {
+		debug_store(cur->req_prio > cur->prio,
+			    "handover: cpu%d IGNORING HANDOVER prio%d -> prio%d (unsafe)\n",
+			    cpu, cur->prio, cur->req_prio);
+		return true;
+	}
+
+	/* Waiters always have higher priorities than owners. */
+	WARN_ON_ONCE(cur->req_prio <= cur->prio);
+
+	debug_store(1, "handover: cpu%d HANDING OVER prio%d -> prio%d\n",
+		    cpu, cur->prio, cur->req_prio);
+
+
+	/*
+	 * Having a safe point for take over and eventually a few
+	 * duplicated characters or a full line is way better than a
+	 * hostile takeover. Post processing can take care of the garbage.
+	 * Release and hand over.
+	 */
+	nbcon_context_release(ctxt);
+
+	/*
+	 * It is not known whether the handover succeeded. The outermost
+	 * callsite has to make the final decision whether printing
+	 * should proceed or not (via reacquire, possibly hostile). The
+	 * console is now unlocked so go back all the way instead of
+	 * trying to implement heuristics in tons of places.
+	 */
+	return false;
+}
+
+/**
+ * nbcon_context_update_unsafe - Update the unsafe bit in @con->nbcon_state
+ * @ctxt:	The nbcon context from nbcon_context_try_acquire()
+ * @unsafe:	The new value for the unsafe bit
+ *
+ * Return:	True if the state is correct. False if ownership was
+ *		handed over or taken.
+ *
+ * Typically the unsafe bit is set during acquire. This function allows
+ * modifying the unsafe status without releasing ownership.
+ *
+ * When this function returns false then the calling context is no longer
+ * the owner and is no longer allowed to go forward. In this case it must
+ * back out immediately and carefully. The buffer content is also no longer
+ * trusted since it no longer belongs to the calling context.
+ *
+ * Internal helper to avoid duplicated code
+ */
+__maybe_unused
+static bool nbcon_context_update_unsafe(struct nbcon_context *ctxt, bool unsafe)
+{
+	struct console *con = ctxt->console;
+	struct nbcon_state cur;
+	struct nbcon_state new;
+
+	nbcon_state_read(con, &cur);
+
+	/* The unsafe bit must not be cleared if @hostile_unsafe is set. */
+	if (!unsafe && cur.hostile_unsafe)
+		return nbcon_context_can_proceed(ctxt, &cur);
+
+	do {
+		if (!nbcon_context_can_proceed(ctxt, &cur))
+			return false;
+
+		new.atom = cur.atom;
+		new.unsafe = unsafe;
+	} while (!nbcon_state_try_cmpxchg(con, &cur, &new));
+
+	ctxt->unsafe = unsafe;
+
+	return true;
+}
+
 /**
  * nbcon_init - Initialize the nbcon console specific data
  * @con:	Console to initialize