From patchwork Mon Oct 2 12:00:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 147409 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2a8e:b0:403:3b70:6f57 with SMTP id in14csp1599459vqb; Mon, 2 Oct 2023 11:04:07 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGT8DIts4emBXAgSyhIEZZKYBAb34VWJ70dy2+gR2/bi21gGdfPaNbwFsRsSeDDwAPqd44a X-Received: by 2002:a05:6a00:1f0f:b0:68f:b5a3:5ec6 with SMTP id be15-20020a056a001f0f00b0068fb5a35ec6mr466520pfb.0.1696269847590; Mon, 02 Oct 2023 11:04:07 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696269847; cv=none; d=google.com; s=arc-20160816; b=LfcR2/Pg9gbCS2y0B04MwYn3JtFiydqCj9c7u8DDOtS+uDjZmE+JiyHhuCFpJIjE0F wdA0lfYr6/AyL+72aUnA09Jmk0OB5h2OdyHfsfyLwQiVKsHbWa+WMz2v0ZRRVAEgPGRt xiyur8dHqA9fR6N0dY7q3Ak+u2DdPNEs5ABUsiSgqcxgX43I4gmVr/EDhqAe5CzClqx+ leHSjgnAbmRvqikYqO3FtVjlpPw8M2v4H1sJfYUyfjOSb6QkIV/HBhtkJrRRBcxH/m6/ 6ON0iHDi6iuthHUvtyXUWdHpP1iUf0nun4lJnLz2TweucezWstCT+O+EQQaWoXaeHU09 EcLw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:date:mime-version:references:subject:cc:to:from :dkim-signature:dkim-signature:message-id; bh=BZYJfmyEkcEUrvO2SGy5jScmTWgbT5L9UgtF0YXKdoY=; fh=u57tXYamzTrJA+Ht8n1u7SfTMptrQaIb6LVW+jsaYf4=; b=gT8O1v9p2wH9dGCbdhoYuCg6ZqEsV8tXXd8m1dBXUSjBMSYPsXIXK2hLukepWbXomD 0TeG8ML1lbkYPqSSaNaDC/3o8ZrognfOh8Tf8vGB9AZtMTcimJDMV23dPfixCsLGT6rH HWA5HDX16FuE+VEVvF/UdY2IBhWKIXuUeSHC9wyUzjRd48Xoodc2sdYyyCU4yytn3iBp N8qFY+33g/ALnmO0Ci3H+Q6B9Y17aDVAfkrpanN3VaquNVU1r4Y6CaU6Efp6LJOY5AlG h/QrUXLmBsUltYGVNudJY6ccRhT1IXiDHrehA2z5ypx9r7yeFwPzPhlRsZ2EWWoeQEq0 0VfQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=IF6PEv6o; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id ay42-20020a056a00302a00b0068c7033a5f5si27931427pfb.74.2023.10.02.11.04.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 02 Oct 2023 11:04:07 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=IF6PEv6o; dkim=neutral (no key) header.i=@linutronix.de; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 9E7BC81121F8; Mon, 2 Oct 2023 05:02:15 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237001AbjJBMBI (ORCPT + 18 others); Mon, 2 Oct 2023 08:01:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236955AbjJBMA1 (ORCPT ); Mon, 2 Oct 2023 08:00:27 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4411810D9 for ; Mon, 2 Oct 2023 05:00:08 -0700 (PDT) Message-ID: <20231002115903.545969323@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1696248006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=BZYJfmyEkcEUrvO2SGy5jScmTWgbT5L9UgtF0YXKdoY=; b=IF6PEv6oZJQwZyYqk/0hWecJqpLTSDyDZDHCr9F2Kh/jZ6wx3hjJyftauZ9oMUvT9yj2a2 tFOpzvN5VDhQuMsKlOL3MuUl5oaP1PjiDqP7EAqSODYA940lmCP+WixS20h2WH7qKP5Mkb xdeGJo3THpTHMbL/I0T+hU06tI0HBchN+JrIMoEZ8bfUb3iU66D1LPJI+5uyn7psSrjsF4 G3NxMSHOWubsUwTjWeoGaZ65AKIq7ZPmJ5r1peuYArOAhjswXLM9Rfhe3yWN1Hoi4v0gvb 6imh62cs3BQ4I+uaqbb4whdRVeeBwTKgvZ5Qb+UCUKgW0fSnp5buFR/ONGjIWQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1696248006; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=BZYJfmyEkcEUrvO2SGy5jScmTWgbT5L9UgtF0YXKdoY=; b=OZ5XG5hLbib6kqtcQi41zT0OPxlQJf20vlyyLoZ+Vt/glLTf++xmWdRRfI9elYcvbZDBZ7 LKb2psvzadyA2VDQ== From: Thomas Gleixner To: LKML Cc: x86@kernel.org, Borislav Petkov , "Chang S. Bae" , Arjan van de Ven , Nikolay Borisov Subject: [patch V4 26/30] x86/microcode: Protect against instrumentation References: <20231002115506.217091296@linutronix.de> MIME-Version: 1.0 Date: Mon, 2 Oct 2023 14:00:06 +0200 (CEST) X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Mon, 02 Oct 2023 05:02:15 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778667852013679379 X-GMAIL-MSGID: 1778667852013679379 From: Thomas Gleixner The wait for control loop in which the siblings are waiting for the microcode update on the primary thread must be protected against instrumentation as instrumentation can end up in #INT3, #DB or #PF, which then returns with IRET. That IRET reenables NMI which is the opposite of what the NMI rendezvouz is trying to achieve. Signed-off-by: Thomas Gleixner --- arch/x86/kernel/cpu/microcode/core.c | 110 ++++++++++++++++++++++++++--------- 1 file changed, 82 insertions(+), 28 deletions(-) --- --- a/arch/x86/kernel/cpu/microcode/core.c +++ b/arch/x86/kernel/cpu/microcode/core.c @@ -301,54 +301,65 @@ struct microcode_ctrl { DEFINE_STATIC_KEY_FALSE(microcode_nmi_handler_enable); static DEFINE_PER_CPU(struct microcode_ctrl, ucode_ctrl); +static unsigned int loops_per_usec; static atomic_t late_cpus_in; -static bool wait_for_cpus(atomic_t *cnt) +static noinstr bool wait_for_cpus(atomic_t *cnt) { - unsigned int timeout; + unsigned int timeout, loops; - WARN_ON_ONCE(atomic_dec_return(cnt) < 0); + WARN_ON_ONCE(raw_atomic_dec_return(cnt) < 0); for (timeout = 0; timeout < USEC_PER_SEC; timeout++) { - if (!atomic_read(cnt)) + if (!raw_atomic_read(cnt)) return true; - udelay(1); + for (loops = 0; loops < loops_per_usec; loops++) + cpu_relax(); /* If invoked directly, tickle the NMI watchdog */ - if (!microcode_ops->use_nmi && !(timeout % USEC_PER_MSEC)) + if (!microcode_ops->use_nmi && !(timeout % USEC_PER_MSEC)) { + instrumentation_begin(); touch_nmi_watchdog(); + instrumentation_end(); + } } /* Prevent the late comers from making progress and let them time out */ - atomic_inc(cnt); + raw_atomic_inc(cnt); return false; } -static bool wait_for_ctrl(void) +static noinstr bool wait_for_ctrl(void) { - unsigned int timeout; + unsigned int timeout, loops; for (timeout = 0; timeout < USEC_PER_SEC; timeout++) { - if (this_cpu_read(ucode_ctrl.ctrl) != SCTRL_WAIT) + if (raw_cpu_read(ucode_ctrl.ctrl) != SCTRL_WAIT) return true; - udelay(1); + + for (loops = 0; loops < loops_per_usec; loops++) + cpu_relax(); + /* If invoked directly, tickle the NMI watchdog */ - if (!microcode_ops->use_nmi && !(timeout % 1000)) + if (!microcode_ops->use_nmi && !(timeout % USEC_PER_MSEC)) { + instrumentation_begin(); touch_nmi_watchdog(); + instrumentation_end(); + } } return false; } -static void load_secondary(unsigned int cpu) +/* + * Protected against instrumentation up to the point where the primary + * thread completed the update. See microcode_nmi_handler() for details. + */ +static noinstr bool load_secondary_wait(unsigned int ctrl_cpu) { - unsigned int ctrl_cpu = this_cpu_read(ucode_ctrl.ctrl_cpu); - enum ucode_state ret; - /* Initial rendezvouz to ensure that all CPUs have arrived */ if (!wait_for_cpus(&late_cpus_in)) { - pr_err_once("load: %d CPUs timed out\n", atomic_read(&late_cpus_in) - 1); - this_cpu_write(ucode_ctrl.result, UCODE_TIMEOUT); - return; + raw_cpu_write(ucode_ctrl.result, UCODE_TIMEOUT); + return false; } /* @@ -358,9 +369,33 @@ static void load_secondary(unsigned int * scheduler, watchdogs etc. There is no way to safely evacuate the * machine. */ - if (!wait_for_ctrl()) - panic("Microcode load: Primary CPU %d timed out\n", ctrl_cpu); + if (wait_for_ctrl()) + return true; + + instrumentation_begin(); + panic("Microcode load: Primary CPU %d timed out\n", ctrl_cpu); + instrumentation_end(); +} +/* + * Protected against instrumentation up to the point where the primary + * thread completed the update. See microcode_nmi_handler() for details. + */ +static noinstr void load_secondary(unsigned int cpu) +{ + unsigned int ctrl_cpu = raw_cpu_read(ucode_ctrl.ctrl_cpu); + enum ucode_state ret; + + if (!load_secondary_wait(ctrl_cpu)) { + instrumentation_begin(); + pr_err_once("Microcode load: %d CPUs timed out\n", + atomic_read(&late_cpus_in) - 1); + instrumentation_end(); + return; + } + + /* Primary thread completed. Allow to invoke instrumentable code */ + instrumentation_begin(); /* * If the primary succeeded then invoke the apply() callback, * otherwise copy the state from the primary thread. @@ -372,6 +407,7 @@ static void load_secondary(unsigned int this_cpu_write(ucode_ctrl.result, ret); this_cpu_write(ucode_ctrl.ctrl, SCTRL_DONE); + instrumentation_end(); } static void load_primary(unsigned int cpu) @@ -409,25 +445,42 @@ static void load_primary(unsigned int cp } } -static bool microcode_update_handler(void) +static noinstr bool microcode_update_handler(void) { - unsigned int cpu = smp_processor_id(); + unsigned int cpu = raw_smp_processor_id(); - if (this_cpu_read(ucode_ctrl.ctrl_cpu) == cpu) + if (raw_cpu_read(ucode_ctrl.ctrl_cpu) == cpu) { + instrumentation_begin(); load_primary(cpu); - else + instrumentation_end(); + } else { load_secondary(cpu); + } + instrumentation_begin(); touch_nmi_watchdog(); + instrumentation_end(); return true; } -bool microcode_nmi_handler(void) +/* + * Protection against instrumentation is required for CPUs which are not + * safe against an NMI which is delivered to the secondary SMT sibling + * while the primary thread updates the microcode. Instrumentation can end + * up in #INT3, #DB and #PF. The IRET from those exceptions reenables NMI + * which is the opposite of what the NMI rendevouz is trying to achieve. + * + * The primary thread is safe versus instrumentation as the actual + * microcode update handles this correctly. It's only the sibling code + * path which must be NMI safe until the primary thread completed the + * update. + */ +bool noinstr microcode_nmi_handler(void) { - if (!this_cpu_read(ucode_ctrl.nmi_enabled)) + if (!raw_cpu_read(ucode_ctrl.nmi_enabled)) return false; - this_cpu_write(ucode_ctrl.nmi_enabled, false); + raw_cpu_write(ucode_ctrl.nmi_enabled, false); return microcode_update_handler(); } @@ -454,6 +507,7 @@ static int load_late_stop_cpus(void) pr_err("You should switch to early loading, if possible.\n"); atomic_set(&late_cpus_in, num_online_cpus()); + loops_per_usec = loops_per_jiffy / (TICK_NSEC / 1000); /* * Take a snapshot before the microcode update in order to compare and