From patchwork Tue Jun 13 12:17:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 107353
Message-ID: <20230613121615.639116359@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman
Subject: [patch V2 1/8] x86/smp: Make stop_other_cpus() more robust
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:17:55 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

Tony reported intermittent lockups on poweroff. His analysis identified the
wbinvd() in stop_this_cpu() as the culprit. This was added to ensure that
on SME enabled machines a kexec() does not leave any stale data in the
caches when switching from encrypted to non-encrypted mode or vice versa.

That wbinvd() is conditional on the SME feature bit which is read directly
from CPUID. But that readout does not check whether the CPUID leaf is
available or not. If it's not available the CPU will return the value of
the highest supported leaf instead. Depending on the content the "SME" bit
might be set or not.

That's incorrect but harmless. Making the CPUID readout conditional makes
the observed hangs go away, but it does not fix the underlying problem:

CPU0                                    CPU1

 stop_other_cpus()
   send_IPIs(REBOOT);                   stop_this_cpu()
   while (num_online_cpus() > 1);         set_online(false);
   proceed... -> hang
                                          wbinvd()

WBINVD is an expensive operation and if multiple CPUs issue it at the same
time the resulting delays are even larger.

But CPU0 already observed num_online_cpus() going down to 1 and proceeds
which causes the system to hang.

Make this more robust by adding a counter which is set to the number of
other online CPUs before sending the IPIs and decremented in
stop_this_cpu() after the WBINVD completed. Check for that counter in
stop_other_cpus() instead of watching num_online_cpus().

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Reported-by: Tony Battersby
Signed-off-by: Thomas Gleixner
Link: https://lore.kernel.org/all/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
---
 arch/x86/include/asm/cpu.h |    2 ++
 arch/x86/kernel/process.c  |   10 ++++++++++
 arch/x86/kernel/smp.c      |   15 ++++++++++++---
 3 files changed, 24 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -98,4 +98,6 @@ extern u64 x86_read_arch_cap_msr(void);
 int intel_find_matching_signature(void *mc, unsigned int csig, int cpf);
 int intel_microcode_sanity_check(void *mc, bool print_err, int hdr_type);
 
+extern atomic_t stop_cpus_count;
+
 #endif /* _ASM_X86_CPU_H */
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -759,6 +759,8 @@ bool xen_set_default_idle(void)
 }
 #endif
 
+atomic_t stop_cpus_count;
+
 void __noreturn stop_this_cpu(void *dummy)
 {
 	local_irq_disable();
@@ -783,6 +785,14 @@ void __noreturn stop_this_cpu(void *dumm
 	 */
 	if (cpuid_eax(0x8000001f) & BIT(0))
 		native_wbinvd();
+
+	/*
+	 * native_stop_other_cpus() will write to @stop_cpus_count after
+	 * observing that it went down to zero, which will invalidate the
+	 * cacheline on this CPU.
+	 */
+	atomic_dec(&stop_cpus_count);
+
 	for (;;) {
 		/*
 		 * Use native_halt() so that memory contents don't change
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -171,6 +172,8 @@ static void native_stop_other_cpus(int w
 		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
 			return;
 
+		atomic_set(&stop_cpus_count, num_online_cpus() - 1);
+
 		/* sync above data before sending IRQ */
 		wmb();
 
@@ -183,12 +186,12 @@ static void native_stop_other_cpus(int w
 		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
-		while (num_online_cpus() > 1 && timeout--)
+		while (atomic_read(&stop_cpus_count) > 0 && timeout--)
 			udelay(1);
 	}
 
 	/* if the REBOOT_VECTOR didn't work, try with the NMI */
-	if (num_online_cpus() > 1) {
+	if (atomic_read(&stop_cpus_count) > 0) {
 		/*
 		 * If NMI IPI is enabled, try to register the stop handler
 		 * and send the IPI. In any case try to wait for the other
@@ -208,7 +211,7 @@ static void native_stop_other_cpus(int w
 		 * one or more CPUs do not reach shutdown state.
 		 */
 		timeout = USEC_PER_MSEC * 10;
-		while (num_online_cpus() > 1 && (wait || timeout--))
+		while (atomic_read(&stop_cpus_count) > 0 && (wait || timeout--))
 			udelay(1);
 	}
 
@@ -216,6 +219,12 @@ static void native_stop_other_cpus(int w
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 	local_irq_restore(flags);
+
+	/*
+	 * Ensure that the cache line is invalidated on the other CPUs. See
+	 * comment vs. SME in stop_this_cpu().
+	 */
+	atomic_set(&stop_cpus_count, INT_MAX);
 }
 
 /*

From patchwork Tue Jun 13 12:17:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 107341
Message-ID: <20230613121615.697412459@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman
Subject: [patch V2 2/8] x86/smp: Dont access non-existing CPUID leaf
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:17:57 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

From: Tony Battersby

stop_this_cpu() tests CPUID leaf 0x8000001f::EAX unconditionally. CPUs
return the content of the highest supported leaf when a non-existing leaf
is read. So the result of the test is a lottery except on AMD CPUs which
support that leaf.

While harmless it's incorrect and causes the conditional wbinvd() to be
issued where not required.

Check whether the leaf is supported before reading it.
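
The effect of the guard can be illustrated outside the kernel. The
following is a minimal user-space sketch, assuming a hypothetical
struct fake_cpu and a hypothetical fake_cpuid_eax() helper which model a
CPU that answers a read of an unsupported leaf with unrelated data. None
of these names exist in the kernel. Only the leaf number 0x8000001f, bit 0
and the shape of the guarded condition are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SME_LEAF	0x8000001fu
#define SME_BIT		(1u << 0)

/* Hypothetical model of a CPU, for illustration only. */
struct fake_cpu {
	uint32_t max_ext_leaf;	/* highest supported extended leaf */
	uint32_t sme_leaf_eax;	/* EAX of leaf 0x8000001f when supported */
	uint32_t fallback_eax;	/* EAX handed back for unsupported leaves */
};

/* Models the readout: an unsupported leaf yields unrelated content. */
static uint32_t fake_cpuid_eax(const struct fake_cpu *cpu, uint32_t leaf)
{
	if (leaf > cpu->max_ext_leaf)
		return cpu->fallback_eax;
	return leaf == SME_LEAF ? cpu->sme_leaf_eax : 0;
}

/* Unguarded test: the result depends on whatever the fallback contains. */
static bool sme_unchecked(const struct fake_cpu *cpu)
{
	return (fake_cpuid_eax(cpu, SME_LEAF) & SME_BIT) != 0;
}

/* Guarded test: the leaf is only read when the CPU supports it. */
static bool sme_checked(const struct fake_cpu *cpu)
{
	return cpu->max_ext_leaf >= SME_LEAF &&
	       (fake_cpuid_eax(cpu, SME_LEAF) & SME_BIT) != 0;
}
```

For a modelled CPU with max_ext_leaf = 0x80000008 whose fallback data
happens to have bit 0 set, sme_unchecked() reports the bit as set while
sme_checked() does not. For a modelled CPU which supports the leaf both
variants agree.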

[ tglx: Adjusted changelog ]

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Signed-off-by: Tony Battersby
Signed-off-by: Thomas Gleixner
Link: https://lore.kernel.org/r/3817d810-e0f1-8ef8-0bbd-663b919ca49b@cybernetics.com
---
 arch/x86/kernel/process.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -763,13 +763,15 @@ atomic_t stop_cpus_count;
 
 void __noreturn stop_this_cpu(void *dummy)
 {
+	struct cpuinfo_x86 *c = this_cpu_ptr(&cpu_info);
+
 	local_irq_disable();
 	/*
 	 * Remove this CPU:
 	 */
 	set_cpu_online(smp_processor_id(), false);
 	disable_local_APIC();
-	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
+	mcheck_cpu_clear(c);
 
 	/*
 	 * Use wbinvd on processors that support SME. This provides support
@@ -783,7 +785,7 @@ void __noreturn stop_this_cpu(void *dumm
 	 * Test the CPUID bit directly because the machine might've cleared
 	 * X86_FEATURE_SME due to cmdline options.
 	 */
-	if (cpuid_eax(0x8000001f) & BIT(0))
+	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
 		native_wbinvd();
 
 	/*

From patchwork Tue Jun 13 12:17:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 107342
Message-ID: <20230613121615.762734722@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 3/8] x86/smp: Remove pointless wmb() from native_stop_other_cpus()
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:17:58 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

The wmb() after the successful atomic_cmpxchg() is complete voodoo along
with the comment stating "sync above data before sending IRQ".

There is no "above" data except for the atomic_t stopping_cpu which has
just been acquired. The reboot IPI handler does not check any data and
unconditionally disables the CPU.

Remove this cargo cult barrier.

Signed-off-by: Thomas Gleixner
Reviewed-by: Ashok Raj
---
 arch/x86/kernel/smp.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -174,9 +174,6 @@ static void native_stop_other_cpus(int w
 
 		atomic_set(&stop_cpus_count, num_online_cpus() - 1);
 
-		/* sync above data before sending IRQ */
-		wmb();
-
 		apic_send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*

From patchwork Tue Jun 13 12:17:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 107351
Message-ID: <20230613121615.820042015@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 4/8] x86/smp: Acquire stopping_cpu unconditionally
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:17:59 +0200 (CEST)
X-Mailing-List: linux-kernel@vger.kernel.org

There is no reason to acquire the stopping_cpu atomic_t only when there is
more than one online CPU.

Make it unconditional to prepare for fixing the kexec() problem when there
are present but "offline" CPUs which play dead in mwait_play_dead().

They need to be brought out of mwait before kexec() as kexec() can
overwrite text, pagetables, stacks and the monitored cacheline of the
original kernel. The latter causes mwait to resume execution which
obviously causes havoc on the kexec kernel and usually results in triple
faults.

Move the acquire out of the num_online_cpus() > 1 condition so the upcoming
'kick mwait' fixup is properly protected.

Signed-off-by: Thomas Gleixner
Reviewed-by: Ashok Raj
---
 arch/x86/kernel/smp.c |   14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -153,6 +153,12 @@ static void native_stop_other_cpus(int w
 	if (reboot_force)
 		return;
 
+	/* Only proceed if this is the first CPU to reach this code */
+	if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
+		return;
+
+	atomic_set(&stop_cpus_count, num_online_cpus() - 1);
+
 	/*
 	 * Use an own vector here because smp_call_function
 	 * does lots of things not suitable in a panic situation.
@@ -167,13 +173,7 @@ static void native_stop_other_cpus(int w
 	 * code. By syncing, we give the cpus up to one second to
 	 * finish their work before we force them off with the NMI.
 	 */
-	if (num_online_cpus() > 1) {
-		/* did someone beat us here? */
-		if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
-			return;
-
-		atomic_set(&stop_cpus_count, num_online_cpus() - 1);
-
+	if (atomic_read(&stop_cpus_count) > 0) {
 		apic_send_IPI_allbutself(REBOOT_VECTOR);
 
 		/*

From patchwork Tue Jun 13 12:18:00 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thomas Gleixner
X-Patchwork-Id: 107347
header.b=3Uwe3XzB; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=xhpN5AOu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t17-20020aa7d4d1000000b0051830de62c6si4357046edr.413.2023.06.13.05.28.00; Tue, 13 Jun 2023 05:28:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=3Uwe3XzB; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=xhpN5AOu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242330AbjFMMSQ (ORCPT + 99 others); Tue, 13 Jun 2023 08:18:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57124 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242293AbjFMMSD (ORCPT ); Tue, 13 Jun 2023 08:18:03 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51B5B10C6 for ; Tue, 13 Jun 2023 05:18:02 -0700 (PDT) Message-ID: <20230613121615.874928734@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1686658681; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; 
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 5/8] x86/smp: Use dedicated cache-line for mwait_play_dead()
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:18:00 +0200 (CEST)

Monitoring idletask::thread_info::flags in mwait_play_dead() has been an
obvious choice as all that is needed is a cache line which is not written
by other CPUs.

But there is a use case where a "dead" CPU needs to be brought out of
that mwait(): kexec().

The CPU needs to be brought out of mwait before kexec() as kexec() can
overwrite text, pagetables, stacks and the monitored cacheline of the
original kernel. The latter causes mwait to resume execution, which
obviously causes havoc on the kexec kernel and usually results in triple
faults.

Use a dedicated per CPU storage to prepare for that.

Signed-off-by: Thomas Gleixner
Reviewed-by: Ashok Raj
---
 arch/x86/kernel/smpboot.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -101,6 +101,17 @@ EXPORT_PER_CPU_SYMBOL(cpu_die_map);
 DEFINE_PER_CPU_READ_MOSTLY(struct cpuinfo_x86, cpu_info);
 EXPORT_PER_CPU_SYMBOL(cpu_info);
 
+struct mwait_cpu_dead {
+	unsigned int	control;
+	unsigned int	status;
+};
+
+/*
+ * Cache line aligned data for mwait_play_dead(). Separate on purpose so
+ * that it's unlikely to be touched by other CPUs.
+ */
+static DEFINE_PER_CPU_ALIGNED(struct mwait_cpu_dead, mwait_cpu_dead);
+
 /* Logical package management. We might want to allocate that dynamically */
 unsigned int __max_logical_packages __read_mostly;
 EXPORT_SYMBOL(__max_logical_packages);
@@ -1758,10 +1769,10 @@ EXPORT_SYMBOL_GPL(cond_wakeup_cpu0);
  */
 static inline void mwait_play_dead(void)
 {
+	struct mwait_cpu_dead *md = this_cpu_ptr(&mwait_cpu_dead);
 	unsigned int eax, ebx, ecx, edx;
 	unsigned int highest_cstate = 0;
 	unsigned int highest_subcstate = 0;
-	void *mwait_ptr;
 	int i;
 
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD ||
@@ -1796,13 +1807,6 @@ static inline void mwait_play_dead(void)
 			(highest_subcstate - 1);
 	}
 
-	/*
-	 * This should be a memory location in a cache line which is
-	 * unlikely to be touched by other processors. The actual
-	 * content is immaterial as it is not actually modified in any way.
-	 */
-	mwait_ptr = &current_thread_info()->flags;
-
 	wbinvd();
 
 	while (1) {
@@ -1814,9 +1818,9 @@ static inline void mwait_play_dead(void)
 		 * case where we return around the loop.
 		 */
 		mb();
-		clflush(mwait_ptr);
+		clflush(md);
 		mb();
-		__monitor(mwait_ptr, 0, 0);
+		__monitor(md, 0, 0);
 		mb();
 		__mwait(eax, 0);
 
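
The "dedicated cache line" idea of the patch above can be illustrated
outside the kernel. The following is a minimal userspace sketch, not the
kernel implementation: it assumes 64-byte cache lines, and every demo_*
name and DEMO_* constant is hypothetical. It pads one mwait_cpu_dead slot
per CPU to a full line and checks that two slots never share a line, which
is roughly the property DEFINE_PER_CPU_ALIGNED() is used for.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define DEMO_CACHE_LINE_SIZE	64	/* assumption: 64-byte cache lines */
#define DEMO_NR_CPUS		4	/* hypothetical CPU count */

struct mwait_cpu_dead {
	unsigned int control;
	unsigned int status;
};

/* Aligning the member pads every slot to a full cache line. */
struct demo_mwait_slot {
	alignas(DEMO_CACHE_LINE_SIZE) struct mwait_cpu_dead md;
};

/* Userspace stand-in for the per CPU storage: one slot per CPU. */
static struct demo_mwait_slot demo_slots[DEMO_NR_CPUS];

/* Number of the cache line which contains address p. */
static uintptr_t demo_line_of(const void *p)
{
	return (uintptr_t)p / DEMO_CACHE_LINE_SIZE;
}

/* Returns 1 if the slots of CPUs a and b share no cache line, else 0. */
static int demo_slots_isolated(unsigned int a, unsigned int b)
{
	const unsigned char *pa = (const unsigned char *)&demo_slots[a];
	const unsigned char *pb = (const unsigned char *)&demo_slots[b];
	uintptr_t a_first = demo_line_of(pa);
	uintptr_t a_last = demo_line_of(pa + sizeof(demo_slots[a]) - 1);
	uintptr_t b_first = demo_line_of(pb);
	uintptr_t b_last = demo_line_of(pb + sizeof(demo_slots[b]) - 1);

	return a_last < b_first || b_last < a_first;
}
```

Calling demo_slots_isolated(0, 1) reports whether a write to CPU 1's slot
could touch the line which CPU 0 would monitor. In this sketch that
property comes from the alignment alone and does not depend on what the
slots contain.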
Message-ID: <20230613121615.930971031@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 6/8] x86/smp: Cure kexec() vs. mwait_play_dead() breakage
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:18:02 +0200 (CEST)

TLDR: It's a mess.

When kexec() is executed on a system with "offline" CPUs, which are parked
in mwait_play_dead() it can end up in a triple fault during the bootup of
the kexec kernel or cause hard to diagnose data corruption.

The reason is that kexec() eventually overwrites the previous kernel's
text, page tables, data and stack. If it writes to the cache line which is
monitored by a previously offlined CPU, MWAIT resumes execution and ends
up executing the wrong text, dereferencing overwritten page tables or
corrupting the kexec kernel's data.

Cure this by bringing the offline CPUs out of MWAIT into HLT.

Write to the monitored cache line of each offline CPU, which makes MWAIT
resume execution. The written control word tells the offline CPUs to issue
HLT, which does not have the MWAIT problem.

That does not help if a stray NMI, MCE or SMI hits the offline CPUs, as
those make them come out of HLT.

A follow up change will put them into INIT, which protects at least against
NMI and SMI.

Fixes: ea53069231f9 ("x86, hotplug: Use mwait to offline a processor, fix the legacy case")
Reported-by: Ashok Raj
Signed-off-by: Thomas Gleixner
Tested-by: Ashok Raj
Reviewed-by: Ashok Raj
---
 arch/x86/include/asm/smp.h |  2 +
 arch/x86/kernel/smp.c      | 23 ++++++++---------
 arch/x86/kernel/smpboot.c  | 59 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 12 deletions(-)

--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -132,6 +132,8 @@ void wbinvd_on_cpu(int cpu);
 int wbinvd_on_all_cpus(void);
 void cond_wakeup_cpu0(void);
 
+void smp_kick_mwait_play_dead(void);
+
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
 void native_send_call_func_single_ipi(int cpu);
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -157,21 +158,19 @@ static void native_stop_other_cpus(int w
 	if (atomic_cmpxchg(&stopping_cpu, -1, safe_smp_processor_id()) != -1)
 		return;
 
-	atomic_set(&stop_cpus_count, num_online_cpus() - 1);
+	/* For kexec, ensure that offline CPUs are out of MWAIT and in HLT */
+	if (kexec_in_progress)
+		smp_kick_mwait_play_dead();
 
-	/*
-	 * Use an own vector here because smp_call_function
-	 * does lots of things not suitable in a panic situation.
-	 */
+	atomic_set(&stop_cpus_count, num_online_cpus() - 1);
 
 	/*
-	 * We start by using the REBOOT_VECTOR irq.
-	 * The irq is treated as a sync point to allow critical
-	 * regions of code on other cpus to release their spin locks
-	 * and re-enable irqs. Jumping straight to an NMI might
-	 * accidentally cause deadlocks with further shutdown/panic
-	 * code. By syncing, we give the cpus up to one second to
-	 * finish their work before we force them off with the NMI.
+	 * Start by using the REBOOT_VECTOR. That acts as a sync point to
+	 * allow critical regions of code on other cpus to leave their
+	 * critical regions. Jumping straight to an NMI might accidentally
+	 * cause deadlocks with further shutdown code. This gives the CPUs
+	 * up to one second to finish their work before forcing them off
+	 * with the NMI.
 	 */
 	if (atomic_read(&stop_cpus_count) > 0) {
 		apic_send_IPI_allbutself(REBOOT_VECTOR);
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -53,6 +53,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -106,6 +107,9 @@ struct mwait_cpu_dead {
 	unsigned int	status;
 };
 
+#define CPUDEAD_MWAIT_WAIT	0xDEADBEEF
+#define CPUDEAD_MWAIT_KEXEC_HLT	0x4A17DEAD
+
 /*
  * Cache line aligned data for mwait_play_dead(). Separate on purpose so
  * that it's unlikely to be touched by other CPUs.
@@ -173,6 +177,10 @@ static void smp_callin(void)
 {
 	int cpuid;
 
+	/* Mop up eventual mwait_play_dead() wreckage */
+	this_cpu_write(mwait_cpu_dead.status, 0);
+	this_cpu_write(mwait_cpu_dead.control, 0);
+
 	/*
 	 * If waken up by an INIT in an 82489DX configuration
 	 * cpu_callout_mask guarantees we don't get here before
@@ -1807,6 +1815,10 @@ static inline void mwait_play_dead(void)
 			(highest_subcstate - 1);
 	}
 
+	/* Set up state for the kexec() hack below */
+	md->status = CPUDEAD_MWAIT_WAIT;
+	md->control = CPUDEAD_MWAIT_WAIT;
+
 	wbinvd();
 
 	while (1) {
@@ -1824,10 +1836,57 @@ static inline void mwait_play_dead(void)
 		mb();
 		__mwait(eax, 0);
 
+		if (READ_ONCE(md->control) == CPUDEAD_MWAIT_KEXEC_HLT) {
+			/*
+			 * Kexec is about to happen. Don't go back into mwait() as
+			 * the kexec kernel might overwrite text and data including
+			 * page tables and stack. So mwait() would resume when the
+			 * monitor cache line is written to and then the CPU goes
+			 * south due to overwritten text, page tables and stack.
+			 *
+			 * Note: This does _NOT_ protect against a stray MCE, NMI,
+			 * SMI. They will resume execution at the instruction
+			 * following the HLT instruction and run into the problem
+			 * which this is trying to prevent.
+			 */
+			WRITE_ONCE(md->status, CPUDEAD_MWAIT_KEXEC_HLT);
+			while(1)
+				native_halt();
+		}
+
 		cond_wakeup_cpu0();
 	}
 }
 
+/*
+ * Kick all "offline" CPUs out of mwait on kexec(). See comment in
+ * mwait_play_dead().
+ */
+void smp_kick_mwait_play_dead(void)
+{
+	u32 newstate = CPUDEAD_MWAIT_KEXEC_HLT;
+	struct mwait_cpu_dead *md;
+	unsigned int cpu, i;
+
+	for_each_cpu_andnot(cpu, cpu_present_mask, cpu_online_mask) {
+		md = per_cpu_ptr(&mwait_cpu_dead, cpu);
+
+		/* Does it sit in mwait_play_dead() ? */
+		if (READ_ONCE(md->status) != CPUDEAD_MWAIT_WAIT)
+			continue;
+
+		/* Wait maximal 5ms */
+		for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
+			/* Bring it out of mwait */
+			WRITE_ONCE(md->control, newstate);
+			udelay(5);
+		}
+
+		if (READ_ONCE(md->status) != newstate)
+			pr_err("CPU%u is stuck in mwait_play_dead()\n", cpu);
+	}
+}
+
 void __noreturn hlt_play_dead(void)
 {
 	if (__this_cpu_read(cpu_info.x86) >= 4)
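
The control/status handshake of the patch above can be followed more
easily in a small model. The sketch below is a single-threaded userspace
approximation, not kernel code: there is no real MWAIT, HLT or
concurrency, the wakeup of the parked CPU is simulated by a direct
function call, and all demo_* names (including the 'responsive' knob that
models a stuck CPU) are hypothetical. Only the two magic values and the
struct layout are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUDEAD_MWAIT_WAIT	0xDEADBEEFu
#define CPUDEAD_MWAIT_KEXEC_HLT	0x4A17DEADu

struct mwait_cpu_dead {
	uint32_t control;
	uint32_t status;
};

enum demo_kick_result {
	DEMO_KICK_SKIPPED,	/* target was not parked in MWAIT */
	DEMO_KICK_ACKED,	/* target acknowledged and would sit in HLT */
	DEMO_KICK_STUCK,	/* target never acknowledged */
};

/* Model of a CPU entering mwait_play_dead(): announce the parked state. */
static void demo_park(struct mwait_cpu_dead *md)
{
	md->status = CPUDEAD_MWAIT_WAIT;
	md->control = CPUDEAD_MWAIT_WAIT;
}

/*
 * Model of one MWAIT wakeup on the parked CPU. Returns true once the CPU
 * has seen the kexec request and acknowledged it, false for a spurious
 * wakeup after which it would go back to MWAIT.
 */
static bool demo_wakeup(struct mwait_cpu_dead *md)
{
	if (md->control != CPUDEAD_MWAIT_KEXEC_HLT)
		return false;
	md->status = CPUDEAD_MWAIT_KEXEC_HLT;
	return true;
}

/*
 * Model of the kicking side for one target CPU. Writing the control word
 * is what would wake the target; here the wakeup is simulated by calling
 * demo_wakeup() when 'responsive' is set.
 */
static enum demo_kick_result demo_kick(struct mwait_cpu_dead *md,
				       unsigned int max_tries, bool responsive)
{
	unsigned int i;

	/* Only CPUs which announced the parked state are kicked */
	if (md->status != CPUDEAD_MWAIT_WAIT)
		return DEMO_KICK_SKIPPED;

	for (i = 0; md->status != CPUDEAD_MWAIT_KEXEC_HLT && i < max_tries; i++) {
		md->control = CPUDEAD_MWAIT_KEXEC_HLT;
		if (responsive)
			demo_wakeup(md);
	}

	return md->status == CPUDEAD_MWAIT_KEXEC_HLT ?
		DEMO_KICK_ACKED : DEMO_KICK_STUCK;
}
```

The status word is what lets the kicking side tell "never parked in
MWAIT", "parked and acknowledged" and "parked but stuck" apart. In the
patch the last case is the one reported with pr_err().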
Message-ID: <20230613121615.988238767@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 7/8] x86/smp: Split sending INIT IPI out into a helper function
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:18:03 +0200 (CEST)

Putting CPUs into INIT is a safer place during kexec() to park CPUs.

Split the INIT assert/deassert sequence out so it can be reused.

Signed-off-by: Thomas Gleixner
Reviewed-by: Ashok Raj
---
V2: Fix rebase screwup
---
 arch/x86/kernel/smpboot.c | 49 ++++++++++++++++++----------------------------
 1 file changed, 20 insertions(+), 29 deletions(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -853,47 +853,38 @@ wakeup_secondary_cpu_via_nmi(int apicid,
 	return (send_status | accept_status);
 }
 
-static int
-wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
+static void send_init_sequence(int phys_apicid)
 {
-	unsigned long send_status = 0, accept_status = 0;
-	int maxlvt, num_starts, j;
-
-	maxlvt = lapic_get_maxlvt();
+	int maxlvt = lapic_get_maxlvt();
 
-	/*
-	 * Be paranoid about clearing APIC errors.
-	 */
+	/* Be paranoid about clearing APIC errors. */
 	if (APIC_INTEGRATED(boot_cpu_apic_version)) {
-		if (maxlvt > 3)		/* Due to the Pentium erratum 3AP. */
+		/* Due to the Pentium erratum 3AP. */
+		if (maxlvt > 3)
 			apic_write(APIC_ESR, 0);
 		apic_read(APIC_ESR);
 	}
 
-	pr_debug("Asserting INIT\n");
-
-	/*
-	 * Turn INIT on target chip
-	 */
-	/*
-	 * Send IPI
-	 */
-	apic_icr_write(APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT,
-		       phys_apicid);
-
-	pr_debug("Waiting for send to finish...\n");
-	send_status = safe_apic_wait_icr_idle();
+	/* Assert INIT on the target CPU */
+	apic_icr_write(APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT, phys_apicid);
+	safe_apic_wait_icr_idle();
 
 	udelay(init_udelay);
 
-	pr_debug("Deasserting INIT\n");
-
-	/* Target chip */
-	/* Send IPI */
+	/* Deassert INIT on the target CPU */
 	apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, phys_apicid);
+	safe_apic_wait_icr_idle();
+}
+
+/*
+ * Wake up AP by INIT, INIT, STARTUP sequence.
+ */
+static int wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
+{
+	unsigned long send_status = 0, accept_status = 0;
+	int num_starts, j, maxlvt = lapic_get_maxlvt();
 
-	pr_debug("Waiting for send to finish...\n");
-	send_status = safe_apic_wait_icr_idle();
+	send_init_sequence(phys_apicid);
 
 	mb();
 
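
To make the assert/deassert pair in send_init_sequence() more concrete,
the sketch below records the two ICR command words in a log instead of
writing them to an APIC. It is an illustration only. The three APIC_*
values are what I understand arch/x86/include/asm/apicdef.h to define and
should be checked against the tree in use. The demo_* names are
hypothetical, and the ESR handling, ICR idle waits and udelay() of the
real function are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values, as recalled from asm/apicdef.h; verify before relying on them */
#define APIC_INT_LEVELTRIG	0x08000u
#define APIC_INT_ASSERT		0x04000u
#define APIC_DM_INIT		0x00500u

#define DEMO_LOG_MAX		8

struct demo_icr_write {
	uint32_t low;		/* command bits written to the ICR */
	uint32_t apicid;	/* destination APIC ID */
};

struct demo_icr_log {
	struct demo_icr_write w[DEMO_LOG_MAX];
	size_t n;
};

/* Stand-in for apic_icr_write(): record the write, touch no hardware. */
static void demo_icr_write(struct demo_icr_log *log, uint32_t low,
			   uint32_t apicid)
{
	if (log->n < DEMO_LOG_MAX) {
		log->w[log->n].low = low;
		log->w[log->n].apicid = apicid;
		log->n++;
	}
}

/* Same order of ICR writes as send_init_sequence(), minus waits and delays. */
static void demo_send_init_sequence(struct demo_icr_log *log, uint32_t apicid)
{
	/* Assert INIT on the target CPU */
	demo_icr_write(log, APIC_INT_LEVELTRIG | APIC_INT_ASSERT | APIC_DM_INIT,
		       apicid);

	/* Deassert INIT on the target CPU */
	demo_icr_write(log, APIC_INT_LEVELTRIG | APIC_DM_INIT, apicid);
}
```

With the assumed constants the two recorded command words differ only in
the APIC_INT_ASSERT bit.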
Message-ID: <20230613121616.043917725@linutronix.de>
From: Thomas Gleixner
To: LKML
Cc: x86@kernel.org, Mario Limonciello, Tom Lendacky, Tony Battersby,
 Ashok Raj, Tony Luck, Arjan van de Veen, Eric Biederman, Ashok Raj
Subject: [patch V2 8/8] x86/smp: Put CPUs into INIT on shutdown if possible
References: <20230613115353.599087484@linutronix.de>
Date: Tue, 13 Jun 2023 14:18:04 +0200 (CEST)

Parking CPUs in a HLT loop is not completely safe vs. kexec() as HLT can
resume execution due to NMI, SMI and MCE, which has the same issue as the
MWAIT loop.

Kicking the secondary CPUs into INIT makes this safe against NMI and SMI.

A broadcast MCE will take the machine down, but a broadcast MCE which makes
HLT resume and execute overwritten text, pagetables or data will end up in
a disaster too.

So choose the lesser of two evils and kick the secondary CPUs into INIT
unless the system has installed special wakeup mechanisms which are not
using INIT.

Signed-off-by: Thomas Gleixner
Reviewed-by: Ashok Raj
---
 arch/x86/include/asm/smp.h |  2 ++
 arch/x86/kernel/smp.c      | 38 +++++++++++++++++++++++++++++---------
 arch/x86/kernel/smpboot.c  | 19 +++++++++++++++++++
 3 files changed, 50 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -139,6 +139,8 @@ void native_send_call_func_ipi(const str
 void native_send_call_func_single_ipi(int cpu);
 void x86_idle_thread_init(unsigned int cpu, struct task_struct *idle);
 
+bool smp_park_nonboot_cpus_in_init(void);
+
 void smp_store_boot_cpu_info(void);
 void smp_store_cpu_info(int id);
 
--- a/arch/x86/kernel/smp.c
+++ b/arch/x86/kernel/smp.c
@@ -131,7 +131,7 @@ static int smp_stop_nmi_callback(unsigne
 }
 
 /*
- * this function calls the 'stop' function on all other CPUs in the system.
+ * Disable virtualization, APIC etc. and park the CPU in a HLT loop
  */
 DEFINE_IDTENTRY_SYSVEC(sysvec_reboot)
 {
@@ -148,8 +148,7 @@ static int register_stop_handler(void)
 
 static void native_stop_other_cpus(int wait)
 {
-	unsigned long flags;
-	unsigned long timeout;
+	unsigned long flags, timeout;
 
 	if (reboot_force)
 		return;
@@ -167,10 +166,10 @@ static void native_stop_other_cpus(int w
 	/*
 	 * Start by using the REBOOT_VECTOR. That acts as a sync point to
 	 * allow critical regions of code on other cpus to leave their
-	 * critical regions. Jumping straight to an NMI might accidentally
-	 * cause deadlocks with further shutdown code. This gives the CPUs
-	 * up to one second to finish their work before forcing them off
-	 * with the NMI.
+	 * critical regions. Jumping straight to NMI or INIT might
+	 * accidentally cause deadlocks with further shutdown code. This
+	 * gives the CPUs up to one second to finish their work before
+	 * forcing them off with the NMI or INIT.
 	 */
 	if (atomic_read(&stop_cpus_count) > 0) {
 		apic_send_IPI_allbutself(REBOOT_VECTOR);
@@ -178,7 +177,7 @@ static void native_stop_other_cpus(int w
 		/*
 		 * Don't wait longer than a second for IPI completion. The
 		 * wait request is not checked here because that would
-		 * prevent an NMI shutdown attempt in case that not all
+		 * prevent an NMI/INIT shutdown in case that not all
 		 * CPUs reach shutdown state.
 		 */
 		timeout = USEC_PER_SEC;
@@ -186,7 +185,27 @@ static void native_stop_other_cpus(int w
 			udelay(1);
 	}
 
-	/* if the REBOOT_VECTOR didn't work, try with the NMI */
+	/*
+	 * Park all nonboot CPUs in INIT including offline CPUs, if
+	 * possible. That's a safe place where they can't resume execution
+	 * of HLT and then execute the HLT loop from overwritten text or
+	 * page tables.
+	 *
+	 * The only downside is a broadcast MCE, but up to the point where
+	 * the kexec() kernel brought all APs online again an MCE will just
+	 * make HLT resume and handle the MCE. The machine crashs and burns
+	 * due to overwritten text, page tables and data. So there is a
+	 * choice between fire and frying pan. The result is pretty much
+	 * the same. Chose frying pan until x86 provides a sane mechanism
+	 * to park a CPU.
+	 */
+	if (smp_park_nonboot_cpus_in_init())
+		goto done;
+
+	/*
+	 * If park with INIT was not possible and the REBOOT_VECTOR didn't
+	 * take all secondary CPUs offline, try with the NMI.
+	 */
 	if (atomic_read(&stop_cpus_count) > 0) {
 		/*
 		 * If NMI IPI is enabled, try to register the stop handler
@@ -211,6 +230,7 @@ static void native_stop_other_cpus(int w
 			udelay(1);
 	}
 
+done:
 	local_irq_save(flags);
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1465,6 +1465,25 @@ void arch_thaw_secondary_cpus_end(void)
 	cache_aps_init();
 }
 
+bool smp_park_nonboot_cpus_in_init(void)
+{
+	unsigned int cpu, this_cpu = smp_processor_id();
+	unsigned int apicid;
+
+	if (apic->wakeup_secondary_cpu_64 || apic->wakeup_secondary_cpu)
+		return false;
+
+	for_each_present_cpu(cpu) {
+		if (cpu == this_cpu)
+			continue;
+		apicid = apic->cpu_present_to_apicid(cpu);
+		if (apicid == BAD_APICID)
+			continue;
+		send_init_sequence(apicid);
+	}
+	return true;
+}
+
 /*
  * Early setup to make printk work.
  */
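
The selection logic of smp_park_nonboot_cpus_in_init() can be read in
isolation from the APIC details. The sketch below is a userspace model
under stated assumptions: the CPU table, the 'custom_wakeup' flag (which
stands in for the apic->wakeup_secondary_cpu* check) and the
DEMO_BAD_APICID sentinel are hypothetical, and the APIC IDs which would
receive the INIT sequence are collected in an array and nothing is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BAD_APICID	0xFFFFu	/* hypothetical "no APIC ID" sentinel */

struct demo_cpu {
	bool present;		/* CPU is in the present mask */
	uint32_t apicid;	/* APIC ID, or DEMO_BAD_APICID */
};

/*
 * Collect the APIC IDs of all present CPUs except 'this_cpu' which have a
 * valid APIC ID. 'parked' must have room for 'ncpus' entries.
 *
 * Returns false without selecting any CPU when a custom wakeup mechanism
 * is installed, in which case the caller has to fall back to another way
 * of stopping the CPUs.
 */
static bool demo_park_nonboot_cpus(const struct demo_cpu *cpus, size_t ncpus,
				   size_t this_cpu, bool custom_wakeup,
				   uint32_t *parked, size_t *nparked)
{
	size_t cpu;

	*nparked = 0;

	if (custom_wakeup)
		return false;

	for (cpu = 0; cpu < ncpus; cpu++) {
		if (!cpus[cpu].present || cpu == this_cpu)
			continue;
		if (cpus[cpu].apicid == DEMO_BAD_APICID)
			continue;
		/* The kernel function would call send_init_sequence() here */
		parked[(*nparked)++] = cpus[cpu].apicid;
	}
	return true;
}
```

A true return value only says that INIT parking was attempted for every
eligible CPU. As in the patch, there is no acknowledgement from the
targets.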