[v2,01/13] x86/microcode/intel: Prevent printing updated microcode rev multiple times
Message ID | 20221103175901.164783-2-ashok.raj@intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp682374wru; Thu, 3 Nov 2022 11:01:10 -0700 (PDT) X-Google-Smtp-Source: AMsMyM68gXLAN6kSsCuhlXDmHvDUDjqlS4wd2r5uo1LxDLDcxTEj/0ymHnecGhxLFaFvCZzBEnWC X-Received: by 2002:a65:4c0e:0:b0:46a:eeb1:e78a with SMTP id u14-20020a654c0e000000b0046aeeb1e78amr27073734pgq.193.1667498470560; Thu, 03 Nov 2022 11:01:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667498470; cv=none; d=google.com; s=arc-20160816; b=YPaJvM/XZRfsR+66m/PrW1NE1QTFjXdwM0ETPsqfXnd7mv268eyMXnrY0blHAi3uBf d6rZkkYjqM/RnougSxR6Ebspo1+aq2WvduHe0og/cOSGM40UsqpYYfcDU+oIFs9y2rFd 40mkKZbRrFbppML5QFINw4F30KP0VyO7GJh5YsxGhOdo/4Fxe0fyUCr6ck+EmlKpKvDQ yJG6ztgLMpeo6DXiVht09QFKyZc3UxO1LCKTSdg1ZoeN5pMAE05thph7gnzrJAdLLeV7 joV1vXEzEEWE96d5TU9AhQkMC+cc9O3UXjBy1J0d0QvgWnz9bGaYgQnHIKXhN2n2BO3u 01tg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ZtRYN3/lWBQEhA9qI17tcYVOl/gNwOwCoS7CIHGd670=; b=ORjWKjvCzxdHco4D4ezlAMEztzGzIyKFV+IL2syF4uF2QBj5j9TB0CgVw3ChYMPpQJ lMzZltr2kP92Dnv3Z8DWd5DntrJk8D7/tm5ttifR/STZasEAVRaIP0MJFjYrnvcXgop5 WuegMpr508UxyI0okaZNg42icRBz/zvTbTTJWQIBB5HW3Y+dlEYT1MdwoDZuHNAOC9KZ TDHFMxQv87MNYQEPG2mvnRbci6/I8iD3hhbl+5+l4Mncu+IR6WD29ybjJ767nFj1RpDm uGEo10g3ZTTT3bI032DEYwj3h2GUbjUb0QllpwcxSX7N5VPlhO588jgbiZ04XBR3tKNM AR3Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=bIWFEOAe; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h14-20020a63df4e000000b0045cbd4e43a1si1791388pgj.57.2022.11.03.11.00.57; Thu, 03 Nov 2022 11:01:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=bIWFEOAe; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232081AbiKCR75 (ORCPT <rfc822;yves.mi.zy@gmail.com> + 99 others); Thu, 3 Nov 2022 13:59:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232042AbiKCR71 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 3 Nov 2022 13:59:27 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 53C972628 for <linux-kernel@vger.kernel.org>; Thu, 3 Nov 2022 10:59:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667498359; x=1699034359; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=IRafYvcGTb3XRIZ8BF10OMlvhWCpsdfukSgZuf8/1v8=; b=bIWFEOAeCGrPWyjKFHyXRC4UxDwtcJ3DglJJHuWAwJfDXrfHh9E0/wzM trifDp6L+lVtU0fBUvQ5Q7Bm3oo8D9P1ku7GM/7Dof/TFIDbcEP5hbH2a bJx26hsohmpVTRpjjlsgmpOJuX3nS6ZgnQTD94ipwORzdSsjEyi7uOSK5 D4RGpriRcl2eKseFNp7gwzunTi9DEMmJbX5mcjBE0FH2eDDr3TcKFCarr Vu33aq6MzyuvE4diI2KHzIG84wOodvOrlVbp5Pt08yB7TLE6KxxwrsE4R lV2a0xwfrJE2c11Hmyn87LlaPGd6NmvLC4ewGKnUBYjJusjTp/HY3PLUw w==; X-IronPort-AV: E=McAfee;i="6500,9779,10520"; a="308476963" X-IronPort-AV: E=Sophos;i="5.96,134,1665471600"; d="scan'208";a="308476963" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Nov 2022 10:59:18 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10520"; a="809762540" X-IronPort-AV: E=Sophos;i="5.96,134,1665471600"; d="scan'208";a="809762540" Received: from araj-dh-work.jf.intel.com ([10.165.157.158]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Nov 2022 10:59:17 -0700 From: Ashok Raj <ashok.raj@intel.com> To: Borislav Petkov <bp@alien8.de>, Thomas Gleixner <tglx@linutronix.de> Cc: "LKML Mailing List" <linux-kernel@vger.kernel.org>, X86-kernel <x86@kernel.org>, Tony Luck <tony.luck@intel.com>, Dave Hansen <dave.hansen@intel.com>, Arjan van de Ven <arjan.van.de.ven@intel.com>, Andy Lutomirski <luto@kernel.org>, Jacon Jun Pan <jacob.jun.pan@intel.com>, Tom Lendacky <thomas.lendacky@amd.com>, Kai Huang <kai.huang@intel.com>, Andrew Cooper <andrew.cooper3@citrix.com>, Ashok Raj <ashok.raj@intel.com> Subject: [v2 01/13] x86/microcode/intel: Prevent printing updated microcode rev multiple times Date: Thu, 3 Nov 2022 17:58:49 +0000 Message-Id: <20221103175901.164783-2-ashok.raj@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20221103175901.164783-1-ashok.raj@intel.com> References: <20221103175901.164783-1-ashok.raj@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-5.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748498876429794153?= X-GMAIL-MSGID: =?utf-8?q?1748498876429794153?= |
Series |
Make microcode late loading more robust
|
|
Commit Message
Ashok Raj
Nov. 3, 2022, 5:58 p.m. UTC
Commit b6f86689d5b7 ("x86/microcode: Rip out the subsys interface gunk") introduced a race where all CPUs follow this call chain: microcode_init()->schedule_on_each_cpu(setup_online_cpu)->collect_cpu_info This results in console spam where multiple CPUs print the signature. [ 33.688639] microcode: sig=0x50654, pf=0x80, revision=0x2006e05 [ 33.688659] microcode: sig=0x50654, pf=0x80, revision=0x2006e05 [ 33.688660] microcode: sig=0x50654, pf=0x80, revision=0x2006e05 Fix by making sure only boot CPU prints the message. Fixes: b6f86689d5b7 ("x86/microcode: Rip out the subsys interface gunk") Reported-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> --- arch/x86/kernel/cpu/microcode/intel.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
Comments
On Thu, Nov 03, 2022 at 05:58:49PM +0000, Ashok Raj wrote: > @@ -696,8 +697,7 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) > > csig->rev = c->microcode; > > - /* No extra locking on prev, races are harmless. */ > - if (csig->sig != prev.sig || csig->pf != prev.pf || csig->rev != prev.rev) { > + if (bsp && csig->rev != prev.rev) { This basically means that the loader is not going to support mixed steppings microcode. Yes, no? If yes, can I remove the patch cache too and use a single buffer for the current patch? That would simplify things even more.
> This basically means that the loader is not going to support mixed > steppings microcode. > > Yes, no? > > If yes, can I remove the patch cache too and use a single buffer for the > current patch? > > That would simplify things even more. multistepping is really not well supported, and for cases where it ends up happening, often a "full set" microcode file is made (where the kernel doesn't need to know) so I think by all means, if life is simpler, stop doing complicated things for mixed stepping
On Fri, Nov 04, 2022 at 01:53:22PM +0000, Van De Ven, Arjan wrote: > so I think by all means, if life is simpler, stop doing complicated > things for mixed stepping Ok, that's cool. Lemme put it on my TODO to remove the cache. Thx.
On Fri, Nov 04, 2022 at 04:52:07PM +0100, Borislav Petkov wrote: > On Fri, Nov 04, 2022 at 01:53:22PM +0000, Van De Ven, Arjan wrote: > > so I think by all means, if life is simpler, stop doing complicated > > things for mixed stepping > > Ok, that's cool. Lemme put it on my TODO to remove the cache. > I have a series cooking too, but this series got too long already. Didn't want to slow the minrev and the late load enabling patches :-) I'll submit right after. There is a bit more cleanup, I had planned. Wanted to add a check every AP that if its different fms+pf warn and bork late-load. We don't need a list, all we need is a single entry. We didn't push the fix below, but removing the cache is a better option. So that's already in my list. https://lore.kernel.org/lkml/20220817051127.3323755-2-ashok.raj@intel.com/ Cheers, Ashok
On Fri, Nov 04, 2022 at 11:28:36AM -0700, Ashok Raj wrote: > Wanted to add a check every AP that if its different fms+pf warn > and bork late-load. We don't need that check as we don't/won't support mixed steppings. It is as simple as that.
On Thu, Nov 03, 2022 at 05:58:49PM +0000, Ashok Raj wrote: > @@ -696,8 +697,7 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) > > csig->rev = c->microcode; > > - /* No extra locking on prev, races are harmless. */ > - if (csig->sig != prev.sig || csig->pf != prev.pf || csig->rev != prev.rev) { > + if (bsp && csig->rev != prev.rev) { > pr_info("sig=0x%x, pf=0x%x, revision=0x%x\n", > csig->sig, csig->pf, csig->rev); And now that we've established that we don't do mixed steppings anymore and the microcode revision is the same system-wide, you should simply drop this pr_info(), in your next patch you're adding +static u32 early_old_rev; That thing should simply be /* Currently applied microcode revision */ static u32 microcode_rev; and you simply update that one each time you update microcode and print it as the previous and the new one and then write the new one into this var and that's it. Simple. Thx.
On Sun, Nov 06, 2022 at 02:35:58PM +0100, Borislav Petkov wrote: > On Thu, Nov 03, 2022 at 05:58:49PM +0000, Ashok Raj wrote: > > @@ -696,8 +697,7 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) > > > > csig->rev = c->microcode; > > > > - /* No extra locking on prev, races are harmless. */ > > - if (csig->sig != prev.sig || csig->pf != prev.pf || csig->rev != prev.rev) { > > + if (bsp && csig->rev != prev.rev) { > > pr_info("sig=0x%x, pf=0x%x, revision=0x%x\n", > > csig->sig, csig->pf, csig->rev); > > And now that we've established that we don't do mixed steppings anymore > and the microcode revision is the same system-wide, you should simply > drop this pr_info(), in your next patch you're adding > > +static u32 early_old_rev; > > That thing should simply be > > /* Currently applied microcode revision */ > static u32 microcode_rev; > > and you simply update that one each time you update microcode and print > it as the previous and the new one and then write the new one into this > var and that's it. Simple. > I thought about it... Since microcode/core.c already provides this information. Only missing part is the microcode date which the common function doesn't seem to get this. Thought it might be useful. But I can certainly drop it, if you think this isn't much value.
On Sun, Nov 06, 2022 at 02:35:58PM +0100, Borislav Petkov wrote: > On Thu, Nov 03, 2022 at 05:58:49PM +0000, Ashok Raj wrote: > > @@ -696,8 +697,7 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) > > > > csig->rev = c->microcode; > > > > - /* No extra locking on prev, races are harmless. */ > > - if (csig->sig != prev.sig || csig->pf != prev.pf || csig->rev != prev.rev) { > > + if (bsp && csig->rev != prev.rev) { > > pr_info("sig=0x%x, pf=0x%x, revision=0x%x\n", > > csig->sig, csig->pf, csig->rev); > > And now that we've established that we don't do mixed steppings anymore > and the microcode revision is the same system-wide, you should simply > drop this pr_info(), in your next patch you're adding > > +static u32 early_old_rev; > > That thing should simply be > > /* Currently applied microcode revision */ > static u32 microcode_rev; > > and you simply update that one each time you update microcode and print > it as the previous and the new one and then write the new one into this > var and that's it. Simple. > I removed that preparing for the next round. We already have late-loading capture the previous rev already. So we don't need any new changes. [ 482.242727] microcode: Reload completed, microcode revision: 0x2b000070 -> 0x2b000081 Only missing is the ucode date, not a big deal missing it.
On Mon, Nov 07, 2022 at 04:12:59PM +0000, Ashok Raj wrote:
> Only missing is the ucode date, not a big deal missing it.
Yes, it isn't. One can find it out another way:
$ iucode-tool -l /lib/firmware/intel-ucode/06-9c-00
microcode bundle 1: /lib/firmware/intel-ucode/06-9c-00
selected microcodes:
001/001: sig 0x000906c0, pf_mask 0x01, 2022-02-19, rev 0x24000023, size 20480
^^^^^^^^^^
Hi Boris, On Mon, Nov 07, 2022 at 07:47:37PM +0100, Borislav Petkov wrote: > On Mon, Nov 07, 2022 at 04:12:59PM +0000, Ashok Raj wrote: > > Only missing is the ucode date, not a big deal missing it. > > Yes, it isn't. One can find it out another way: > > $ iucode-tool -l /lib/firmware/intel-ucode/06-9c-00 > microcode bundle 1: /lib/firmware/intel-ucode/06-9c-00 > selected microcodes: > 001/001: sig 0x000906c0, pf_mask 0x01, 2022-02-19, rev 0x24000023, size 20480 > ^^^^^^^^^^ That's correct, my thought as well. Did you get a chance to review rest of the patches? Thought I'll wait for more comments before I send the next rev. Patch2 is a simple fix that you suggested. Patch3 is a bug fix. I suspect some earlier upstream reports of ucode failure after update (early loading) might be related. The symptom is similar, but those are too old to followup. I got into a similare situation when i tried to update an incompatible uCode from initrd and system hung. I'm not positive, but seems highly likely. The following are early loading failures, not late loading. https://bugs.launchpad.net/ubuntu/+source/intel-microcode/+bug/1911959 https://forums.linuxmint.com/viewtopic.php?p=1827032#1827032 https://askubuntu.com/questions/1291486/boot-crash-after-latest-update-of-intel-microcode-nov-11-2020 Patch 4 is also a bugfix, today when we reload the same ucode even though nothing changes it seems to think some feature is new. When i added some more debug it turned out SGX was probably turned off by OS, but enumerated by microcode. So it always reports Cheers, Ashok
On 11/8/22 15:06, Ashok Raj wrote: > Patch3 is a bug fix. I suspect some earlier upstream reports of ucode > failure after update (early loading) might be related. The symptom is > similar, but those are too old to followup. I got into a similare situation > when i tried to update an incompatible uCode from initrd and system hung. Hi Ashok, If this really is a bug fix, could you please break it out and send it separately along with appropriate tags like Fixes and a cc:stable@? We handle bug fixes differently from new features.
On Tue, Nov 08, 2022 at 03:06:20PM -0800, Ashok Raj wrote: > That's correct, my thought as well. Did you get a chance to review rest of > the patches? Dave had a spot-on response to one of your colleagues pinging impatiently the same way: "Things are a bit busy in the review queue at the moment. As always, we'd love help reviewing stuff. So, while you're waiting for us to review this, could you perhaps look around and find a series that's also hurting for review tags?" So how about you help out with review while waiting? Ashok, I'll say this only once: if you don't stop this impatient pinging and the private bullsh*t emails - them especially! - I will ignore you completely. We've documented it all: "You should receive comments within a week or so; if that does not happen, make sure that you have sent your patches to the right place. Wait for a minimum of one week before resubmitting or pinging reviewers - possibly longer during busy times like merge windows." And yes, those rules apply to you too.
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c index 8c35c70029bf..8f7f8dd6680e 100644 --- a/arch/x86/kernel/cpu/microcode/intel.c +++ b/arch/x86/kernel/cpu/microcode/intel.c @@ -680,6 +680,7 @@ void reload_ucode_intel(void) static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) { + bool bsp = cpu_num == boot_cpu_data.cpu_index; static struct cpu_signature prev; struct cpuinfo_x86 *c = &cpu_data(cpu_num); unsigned int val[2]; @@ -696,8 +697,7 @@ static int collect_cpu_info(int cpu_num, struct cpu_signature *csig) csig->rev = c->microcode; - /* No extra locking on prev, races are harmless. */ - if (csig->sig != prev.sig || csig->pf != prev.pf || csig->rev != prev.rev) { + if (bsp && csig->rev != prev.rev) { pr_info("sig=0x%x, pf=0x%x, revision=0x%x\n", csig->sig, csig->pf, csig->rev); prev = *csig;