[23/30] x86/microcode: Provide new control functions

Message ID 20230810160806.278309863@linutronix.de
State New
Series x86/microcode: Cleanup and late loading enhancements

Commit Message

Thomas Gleixner Aug. 10, 2023, 6:38 p.m. UTC
  From: Thomas Gleixner <tglx@linutronix.de>

The current all-in-one code is unreadable and really not suited for adding
future features like uniform loading with package or system scope.

Provide a set of new control functions which split the handling of the
primary and secondary CPUs. These will replace the current all-in-one
rendezvous function in the next step. This is intentionally a separate
change because diff makes a completely unreadable mess otherwise.

The flow separates the primary and the secondary CPUs into their own
functions, which use the control field in the per-CPU ucode_ctrl struct.

   primary()			secondary()
    wait_for_all()		 wait_for_all()
    apply_ucode()		 wait_for_release()
    release()			 apply_ucode()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 arch/x86/kernel/cpu/microcode/core.c |   86 +++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)
---
  

Comments

Peter Zijlstra Aug. 10, 2023, 8:25 p.m. UTC | #1
On Thu, Aug 10, 2023 at 08:38:00PM +0200, Thomas Gleixner wrote:
> From: Thomas Gleixner <tglx@linutronix.de>
> 
> The current all-in-one code is unreadable and really not suited for adding
> future features like uniform loading with package or system scope.
> 
> Provide a set of new control functions which split the handling of the
> primary and secondary CPUs. These will replace the current all-in-one
> rendezvous function in the next step. This is intentionally a separate
> change because diff makes a completely unreadable mess otherwise.
> 
> The flow separates the primary and the secondary CPUs into their own
> functions, which use the control field in the per-CPU ucode_ctrl struct.
> 
>    primary()			secondary()
>     wait_for_all()		 wait_for_all()
>     apply_ucode()		 wait_for_release()
>     release()			 apply_ucode()

This hard-codes an SMT2 assumption, right? If someone were ever to do an
x86 SMT4 part, then SMT siblings 1, 2 and 3 would all apply concurrently
in secondary(). Is that intended?
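
To make the question concrete, here is a minimal single-threaded userspace
model of the control-field handshake. It is only a sketch: the enum names
mirror the patch, but struct model_core, the model_*() helpers, the reduced
ucode_state set and the idea of passing the apply() result in as a parameter
are all hypothetical stand-ins for the per-CPU ucode_ctrl data and the
microcode_ops callback. It shows that the release loop writes the same
control value to every sibling, so with four threads per core three
secondaries are cleared for apply() at the same time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Names mirror the patch; the values and everything below are a model. */
enum sibling_ctrl { SCTRL_WAIT, SCTRL_APPLY, SCTRL_DONE };
/* Assumption: reduced subset of the kernel's ucode_state values. */
enum ucode_state { UCODE_OK, UCODE_UPDATED, UCODE_ERROR };

#define MODEL_MAX_THREADS 8

/* Hypothetical stand-in for the per-CPU ucode_ctrl of one core. */
struct model_core {
	size_t nthreads;		/* SMT threads; index 0 is the primary */
	enum sibling_ctrl ctrl[MODEL_MAX_THREADS];
	enum ucode_state result[MODEL_MAX_THREADS];
};

static void model_core_init(struct model_core *c, size_t nthreads)
{
	assert(nthreads >= 1 && nthreads <= MODEL_MAX_THREADS);
	c->nthreads = nthreads;
	for (size_t i = 0; i < nthreads; i++) {
		c->ctrl[i] = SCTRL_WAIT;
		c->result[i] = UCODE_ERROR;
	}
}

/* Primary: record the apply() result, then release all siblings in one pass. */
static void model_primary(struct model_core *c, enum ucode_state applied)
{
	enum sibling_ctrl ctrl;

	c->result[0] = applied;
	c->ctrl[0] = SCTRL_DONE;

	if (applied == UCODE_UPDATED || applied == UCODE_OK)
		ctrl = SCTRL_APPLY;
	else
		ctrl = SCTRL_DONE;

	for (size_t i = 1; i < c->nthreads; i++)
		c->ctrl[i] = ctrl;
}

/* Number of secondaries currently cleared to run apply() concurrently. */
static size_t model_released_for_apply(const struct model_core *c)
{
	size_t n = 0;

	for (size_t i = 1; i < c->nthreads; i++)
		n += (c->ctrl[i] == SCTRL_APPLY);
	return n;
}

/*
 * Secondary: returns false while still in SCTRL_WAIT (the real code would
 * keep polling and eventually time out). Once released, it either applies
 * or copies the primary's result.
 */
static bool model_secondary(struct model_core *c, size_t idx,
			    enum ucode_state applied)
{
	assert(idx >= 1 && idx < c->nthreads);

	if (c->ctrl[idx] == SCTRL_WAIT)
		return false;

	if (c->ctrl[idx] == SCTRL_APPLY)
		c->result[idx] = applied;
	else
		c->result[idx] = c->result[0];

	c->ctrl[idx] = SCTRL_DONE;
	return true;
}
```

Initializing the model with four threads and calling model_primary() with
UCODE_UPDATED leaves model_released_for_apply() at 3. With two threads it
is 1, which is the SMT2 case the flow diagram describes. Whether concurrent
apply on more than one secondary is acceptable is a hardware question this
model cannot answer.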

> +	/*
> +	 * Wait for primary threads to complete. If one of them hangs due
> +	 * to the update, there is no way out. This is non-recoverable
> +	 * because the CPU might hold locks or resources and confuse the
> +	 * scheduler, watchdogs etc. There is no way to safely evacuate the
> +	 * machine.
> +	 */
> +	if (!wait_for_ctrl())
> +		panic("Microcode load: Primary CPU %d timed out\n", ctrl_cpu);

One way around this is to first hot-unplug the CPUs, then NMI-prod them
into the rendezvous, and only online them again if the ucode update is
successful. On failure, stick them in a (new) error state so that manual
online also fails, and scream murder, like above.

But yeah, rather unlikely, and for another day, etc.
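
For reference, the bounded wait that precedes the panic() can be modeled in
userspace roughly as below. This is a sketch under assumptions: struct
poll_hooks, struct poll_stats and the stats_*() and model_wait() helpers are
hypothetical and replace this_cpu_read(), udelay(1) and
touch_nmi_watchdog() with plain callbacks so that the loop shape (check,
delay, touch the watchdog every 1000th iteration) can be exercised without
a kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks standing in for the kernel primitives. */
struct poll_hooks {
	bool (*released)(void *ctx);		/* models ctrl != SCTRL_WAIT */
	void (*delay)(void *ctx);		/* models udelay(1) */
	void (*touch_watchdog)(void *ctx);	/* models touch_nmi_watchdog() */
	void *ctx;
};

/* Same loop shape as wait_for_ctrl() in the patch, with a caller-chosen bound. */
static bool poll_for_release(const struct poll_hooks *h, unsigned int max_iters)
{
	for (unsigned int i = 0; i < max_iters; i++) {
		if (h->released(h->ctx))
			return true;
		h->delay(h->ctx);
		if (!(i % 1000))
			h->touch_watchdog(h->ctx);
	}
	return false;
}

/* Counting harness: release after a fixed number of polls. */
struct poll_stats {
	unsigned int release_at;	/* poll index at which release is seen */
	unsigned int polls;
	unsigned int touches;
};

static bool stats_released(void *ctx)
{
	struct poll_stats *s = ctx;

	return s->polls++ >= s->release_at;
}

static void stats_delay(void *ctx)
{
	(void)ctx;	/* no real delay in the model */
}

static void stats_touch(void *ctx)
{
	struct poll_stats *s = ctx;

	s->touches++;
}

/* Runs one modeled wait and reports how often the watchdog was touched. */
static bool model_wait(unsigned int release_at, unsigned int max_iters,
		       unsigned int *touches)
{
	struct poll_stats st = { .release_at = release_at };
	struct poll_hooks h = { stats_released, stats_delay, stats_touch, &st };
	bool ok;

	assert(touches != NULL);
	ok = poll_for_release(&h, max_iters);
	*touches = st.touches;
	return ok;
}
```

A false return from model_wait() corresponds to the case where the patch
calls panic(). The alternative suggested above (an error state instead of a
panic) would hook in at that same point.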
  

Patch

--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -357,6 +357,92 @@  static bool wait_for_cpus(atomic_t *cnt)
 	return false;
 }
 
+static bool wait_for_ctrl(void)
+{
+	unsigned int timeout;
+
+	for (timeout = 0; timeout < USEC_PER_SEC; timeout++) {
+		if (this_cpu_read(ucode_ctrl.ctrl) != SCTRL_WAIT)
+			return true;
+		udelay(1);
+		if (!(timeout % 1000))
+			touch_nmi_watchdog();
+	}
+	return false;
+}
+
+static __maybe_unused void ucode_load_secondary(unsigned int cpu)
+{
+	unsigned int ctrl_cpu = this_cpu_read(ucode_ctrl.ctrl_cpu);
+	enum ucode_state ret;
+
+	/* Initial rendezvous to ensure that all CPUs have arrived */
+	if (!wait_for_cpus(&late_cpus_in)) {
+		pr_err_once("Microcode load: %d CPUs timed out\n",
+			    atomic_read(&late_cpus_in) - 1);
+		this_cpu_write(ucode_ctrl.result, UCODE_TIMEOUT);
+		return;
+	}
+
+	/*
+	 * Wait for primary threads to complete. If one of them hangs due
+	 * to the update, there is no way out. This is non-recoverable
+	 * because the CPU might hold locks or resources and confuse the
+	 * scheduler, watchdogs etc. There is no way to safely evacuate the
+	 * machine.
+	 */
+	if (!wait_for_ctrl())
+		panic("Microcode load: Primary CPU %d timed out\n", ctrl_cpu);
+
+	/*
+	 * If the primary succeeded then invoke the apply() callback,
+	 * otherwise copy the state from the primary thread.
+	 */
+	if (this_cpu_read(ucode_ctrl.ctrl) == SCTRL_APPLY)
+		ret = microcode_ops->apply_microcode(cpu);
+	else
+		ret = per_cpu(ucode_ctrl.result, ctrl_cpu);
+
+	this_cpu_write(ucode_ctrl.result, ret);
+	this_cpu_write(ucode_ctrl.ctrl, SCTRL_DONE);
+}
+
+static __maybe_unused void ucode_load_primary(unsigned int cpu)
+{
+	struct cpumask *secondaries = topology_sibling_cpumask(cpu);
+	enum sibling_ctrl ctrl;
+	enum ucode_state ret;
+	unsigned int sibling;
+
+	/* Initial rendezvous to ensure that all CPUs have arrived */
+	if (!wait_for_cpus(&late_cpus_in)) {
+		this_cpu_write(ucode_ctrl.result, UCODE_TIMEOUT);
+		pr_err_once("Microcode load: %d CPUs timed out\n",
+			    atomic_read(&late_cpus_in) - 1);
+		return;
+	}
+
+	ret = microcode_ops->apply_microcode(cpu);
+	this_cpu_write(ucode_ctrl.result, ret);
+	this_cpu_write(ucode_ctrl.ctrl, SCTRL_DONE);
+
+	/*
+	 * If the update was successful, let the siblings run the apply()
+	 * callback. If not, tell them it's done. This also covers the
+	 * case where the CPU has uniform loading at package or system
+	 * scope implemented but does not advertise it.
+	 */
+	if (ret == UCODE_UPDATED || ret == UCODE_OK)
+		ctrl = SCTRL_APPLY;
+	else
+		ctrl = SCTRL_DONE;
+
+	for_each_cpu(sibling, secondaries) {
+		if (sibling != cpu)
+			per_cpu(ucode_ctrl.ctrl, sibling) = ctrl;
+	}
+}
+
 static int ucode_load_cpus_stopped(void *unused)
 {
 	int cpu = smp_processor_id();