Message ID | 20231002115506.217091296@linutronix.de |
---|---|
Headers |
Message-ID: <20231002115506.217091296@linutronix.de>
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: x86@kernel.org, Borislav Petkov <bp@alien8.de>, "Chang S. Bae" <chang.seok.bae@intel.com>, Arjan van de Ven <arjan@linux.intel.com>, Nikolay Borisov <nik.borisov@suse.com>
Subject: [patch V4 00/30] x86/microcode: Cleanup and late loading enhancements
Date: Mon, 2 Oct 2023 13:59:34 +0200 (CEST)
List-ID: <linux-kernel.vger.kernel.org> |
Series |
x86/microcode: Cleanup and late loading enhancements
|
|
Message
Thomas Gleixner
Oct. 2, 2023, 11:59 a.m. UTC
This is a follow up on:

  https://lore.kernel.org/lkml/20230912065249.695681286@linutronix.de

Late microcode loading is desired by enterprise users. Late loading is problematic as it requires detailed knowledge about the change and an analysis of whether this change modifies something which is already in use by the kernel. Large enterprise customers have engineering teams and access to deep technical vendor support. The regular admin does not have such resources, so the kernel has always tainted itself after late loading.

Intel recently repurposed a previously reserved field in the microcode header to contain the minimal microcode revision which must be running on the CPU to make the load safe. This field is 0 in all older microcode revisions, which the kernel assumes to be unsafe. Minimal revision checking can be enforced via Kconfig or kernel command line. It then refuses to load an unsafe revision. The default loads unsafe revisions like before and taints the kernel. If a safe revision is loaded the kernel is not tainted.

But that does not solve all other known problems with late loading:

- Late loading on current Intel CPUs is unsafe vs. NMI when hyperthreading is enabled. If an NMI hits the secondary sibling while the primary loads the microcode, the machine can crash.

- Soft offline SMT siblings which are playing dead with MWAIT can cause damage too when the microcode update modifies MWAIT. That's a realistic scenario in the context of 'nosmt' mitigations. :(

Neither the core code nor the Intel specific code handles any of this at all.

While trying to implement this, I stumbled over dysfunctional, horribly complex and redundant code, which I decided to clean up first so the new functionality can be added on a clean slate.
So the series has several sections:

1) Move the 32bit early loading after paging enable

2) Cleanup of the Intel specific code

3) Implementation of proper core control logic to handle the NMI safe requirements

4) Support for minimal revision check in the core and the Intel specific parts.

Changes vs. V3:

- Rebased on v6.6-rc1

- Remove the early load magic which was required for physical address mode from the AMD code.

- Address the review comments from Borislav, which are mostly naming, comments and change logs. No functional changes vs. v3.

The series is also available from git:

  git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git ucode-v4

Thanks,

	tglx
---
 Documentation/admin-guide/kernel-parameters.txt |    5
 arch/x86/Kconfig                                |   25
 arch/x86/include/asm/apic.h                     |    5
 arch/x86/include/asm/cpu.h                      |   20
 arch/x86/include/asm/microcode.h                |   19
 arch/x86/kernel/Makefile                        |    1
 arch/x86/kernel/apic/apic_flat_64.c             |    2
 arch/x86/kernel/apic/ipi.c                      |    8
 arch/x86/kernel/apic/x2apic_cluster.c           |    1
 arch/x86/kernel/apic/x2apic_phys.c              |    1
 arch/x86/kernel/cpu/common.c                    |   12
 arch/x86/kernel/cpu/microcode/amd.c             |  129 +---
 arch/x86/kernel/cpu/microcode/core.c            |  637 ++++++++++++++--------
 arch/x86/kernel/cpu/microcode/intel.c           |  682 +++++++-----------------
 arch/x86/kernel/cpu/microcode/internal.h        |   32 -
 arch/x86/kernel/head32.c                        |    6
 arch/x86/kernel/head_32.S                       |   10
 arch/x86/kernel/nmi.c                           |    9
 arch/x86/kernel/smpboot.c                       |   12
 drivers/platform/x86/intel/ifs/load.c           |    8
 include/linux/cpuhotplug.h                      |    1
 21 files changed, 788 insertions(+), 837 deletions(-)
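The minimal revision policy described in the cover letter can be sketched as a small decision function. This is only an illustration and not the code from the series: the enum, the function names and the parameters are hypothetical, and the comparison `cur_rev >= min_rev` is an assumed reading of "the minimal microcode revision which must be running on the CPU".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcomes of a late-load request; names are illustrative only. */
enum late_load_result {
	LATE_LOAD_REFUSED,	/* enforcement on, revision not known to be safe */
	LATE_LOAD_OK_TAINTED,	/* loaded, kernel gets tainted as before */
	LATE_LOAD_OK_CLEAN,	/* loaded, minimal revision satisfied, no taint */
};

/*
 * min_rev: minimal revision field from the microcode header (0 in older files)
 * cur_rev: revision currently running on the CPU
 */
static bool late_load_is_safe(uint32_t min_rev, uint32_t cur_rev)
{
	/* A zero field means the header predates the scheme: treat as unsafe. */
	if (min_rev == 0)
		return false;

	/* Assumption: safe means the running revision is at least min_rev. */
	return cur_rev >= min_rev;
}

/* enforce: minimal revision checking enabled via Kconfig or command line */
static enum late_load_result late_load_decide(uint32_t min_rev,
					      uint32_t cur_rev, bool enforce)
{
	if (late_load_is_safe(min_rev, cur_rev))
		return LATE_LOAD_OK_CLEAN;

	/* Unsafe: refuse when enforcing, otherwise load and taint. */
	return enforce ? LATE_LOAD_REFUSED : LATE_LOAD_OK_TAINTED;
}
```

Under these assumptions, a header with a zero field yields `LATE_LOAD_OK_TAINTED` by default and `LATE_LOAD_REFUSED` when enforcement is on, which matches the three behaviours listed in the cover letter.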
Comments
Hi Thomas,

> ...
>
> The series is also available from git:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git ucode-v4
> ...

Test Result (same as ucode-v3)
------------------------------
Tested 'ucode-v4' on an Intel Sapphire Rapids server; both early load and late load worked well. For more details, please refer to the test below:

Tested Machine
--------------
Intel Sapphire Rapids server with 2 sockets, each containing 48 cores, resulting in a total of 192 threads.

Microcodes
----------
a) Microcode revision of CPU : 0xab000130
b) Microcode revision in the initramfs : 0xab000140 // for early load
c) Microcode revision in /lib/firmware/intel-ucode/* : 0xab000160 // for late load

[ Microcode b) & c) headers both contain minirev 0x2b0000a1. ]

Dmesg log
---------
// Early load OK.
[ 0.000000] microcode: updated early: 0xab000130 -> 0xab000140, date = 2022-11-04
...
[ 20.215653] microcode: Microcode Update Driver: v2.2.
...
// Late load OK.
[ 27.596822] microcode: Updated to revision 0xab000160, date = 2022-11-16
[ 27.606848] microcode: load: updated on 96 primary CPUs with 96 siblings
[ 27.614789] microcode: revision: 0xab000140 -> 0xab000160

Thanks!
-Qiuxu
On Sun, Oct 08, 2023 at 04:54:56PM +0800, Qiuxu Zhuo wrote:
> Test Result (same as ucode-v3)
> ------------------------------
> Tested 'ucode-v4' on an Intel Sapphire Rapids server that both early load
> and late load worked well. For more details, please refer to the test below:

Thanks.

I've found a couple of issues and once I'm done with my testing, I'll push tip:x86/microcode and you could run it then to make sure it all is still ok.
> From: Borislav Petkov <bp@alien8.de>
> ...
> > Test Result (same as ucode-v3)
> > ------------------------------
> > Tested 'ucode-v4' on an Intel Sapphire Rapids server that both early
> > load and late load worked well. For more details, please refer to the
> > test below:
>
> Thanks.
>
> I've found a couple of issues and once I'm done with my testing, I'll push
> tip:x86/microcode and you could run it then to make sure it all is still ok.

Hi Boris,

OK. I'll re-run the test once you push the code to tip:x86/microcode.

-Qiuxu
Hi Boris,

> From: Borislav Petkov <bp@alien8.de>
> ...
> > Test Result (same as ucode-v3)
> > ------------------------------
> > Tested 'ucode-v4' on an Intel Sapphire Rapids server that both early
> > load and late load worked well. For more details, please refer to the
> > test below:
>
> Thanks.
>
> I've found a couple of issues and once I'm done with my testing, I'll push
> tip:x86/microcode and you could run it then to make sure it all is still ok.

Test Result (same as ucode-v4)
------------------------------
Tested tip:x86/microcode (top commit 9975802d3f74) on an Intel Sapphire Rapids server; both early load and late load worked well. For more details, please refer to the test below:

Tested Machine
--------------
Intel Sapphire Rapids server with 2 sockets, each containing 48 cores, resulting in a total of 192 threads.

Microcodes
----------
a) Microcode revision of CPU : 0xab000130
b) Microcode revision in the initramfs : 0xab000140 // for early load
c) Microcode revision in /lib/firmware/intel-ucode/* : 0xab000160 // for late load

[ Microcode b) & c) headers both contain minirev 0x2b0000a1. ]

Dmesg log
---------
// Early load OK.
[ 0.000000] microcode: updated early: 0xab000130 -> 0xab000140, date = 2022-11-04
...
[ 20.261926] microcode: Microcode Update Driver: v2.2.
...
// Late load OK.
[ 27.400858] microcode: Updated to revision 0xab000160, date = 2022-11-16
[ 27.409978] microcode: load: updated on 96 primary CPUs with 96 siblings
[ 27.409997] microcode: revision: 0xab000140 -> 0xab000160

cpuinfo
-------
cat /proc/cpuinfo | grep -m1 microcode
microcode : 0xab000160

Thanks!
-Qiuxu
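The `/proc/cpuinfo` check at the end of the report above can also be done programmatically. The following is a rough sketch, not part of the series or of the tester's setup: both helper names are hypothetical, and it assumes the `microcode : 0x...` line format shown in the report. Unlike `grep`, it only accepts lines that begin with `microcode`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one cpuinfo-style line such as "microcode\t: 0xab000160".
 * Returns true and stores the revision if the line starts with "microcode"
 * and carries a hexadecimal value after the colon.
 */
static bool parse_microcode_line(const char *line, uint32_t *rev)
{
	unsigned int value;
	const char *colon;

	if (strncmp(line, "microcode", 9) != 0)
		return false;

	colon = strchr(line, ':');
	if (!colon)
		return false;

	/* %x skips leading whitespace and accepts an optional 0x prefix. */
	if (sscanf(colon + 1, "%x", &value) != 1)
		return false;

	*rev = value;
	return true;
}

/*
 * Return the first microcode revision found in the file at path
 * (for example "/proc/cpuinfo"), similar in spirit to grep -m1.
 */
static bool read_first_microcode_rev(const char *path, uint32_t *rev)
{
	char line[4096];
	bool found = false;
	FILE *f = fopen(path, "r");

	if (!f)
		return false;

	while (!found && fgets(line, sizeof(line), f))
		found = parse_microcode_line(line, rev);

	fclose(f);
	return found;
}
```

A caller could compare the returned value against the revision expected after the late load, 0xab000160 in the report above.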
On Tue, Oct 10, 2023 at 08:00:27AM +0000, Zhuo, Qiuxu wrote:
> Test Result (same as ucode-v4)
> ------------------------------
> Tested tip:x86/microcode (top commit 9975802d3f74) on an Intel Sapphire
> Rapids server that both early load and late load worked well. For more
> details, please refer to the test below:

Thanks!