Message ID | 20230202215625.3248306-8-usama.arif@bytedance.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp494208wrn; Thu, 2 Feb 2023 13:59:02 -0800 (PST) X-Google-Smtp-Source: AK7set9BGXWupUsoswfmsimPeWor+XPoCwGgF/gkNehygKVjIzWPftwtd4jLIeATi9Yp1urmLLW/ X-Received: by 2002:a17:906:e09:b0:885:fb8a:7c3f with SMTP id l9-20020a1709060e0900b00885fb8a7c3fmr7593918eji.65.1675375142064; Thu, 02 Feb 2023 13:59:02 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675375142; cv=none; d=google.com; s=arc-20160816; b=RoTXuYWmoYTOfJurqv6QItJolm2ovkC2N4xa1I0IZNJM2uW8jvAscE2cnigPRFvaNk g5dWV2FRuqtbV+Wup73LTu95WxGM6ze5Q3jn4+gYfAchDND5y8RKD72tQKEBGZLXoy2D VD/zOp6GpIo91KKumNx0kD4CulIXV+jU7F8hujD9Dv+nrV+jNL3QiMtvezI8k9HCM7hK gMBgP281nnegfb4axX7q8fU+AJfxLuaCCZIzj2a20wQ7+RJVnSqi2oyWQAWtDyj9BaeM o1OdLiDzpjkK8BcaI7+SAyBIrsyifCwYVLWwPSQSqIJLtJqyPRaAIkXNPj0MFnxBvDlO q5CQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=NSUUSSxTZ02F+r1LTVXNULWrMNdIZEEBIKeOIraHNX8=; b=zjpkSchsw6UebCo+SW2whOkAMZ3o4DqgCMlLCef+jvrt3Q+nKluDt9cX99fd5R8pan pQDJvVHTZqQIOWuvYHcF9JsA5YGCp6ojQcIQPACzgie8CQlfMllb6hAMjB/uV+3YU5XK 6H4EPveWP2Y6SJJaH+TuAjIYSkcduaW8NEvbZElNuBXdliBOFOLNh9j79sRAPwTeWr5F uF/3zqexRZ6hS98I9d6NPPpmM3fZy1sFPhSdsjGy52R9BloZrBaIikVXr6opteTMo/j3 NsEAHu23AJKvjFJooYo20KzlUIU3VNFe9RlC61j7V44kQpPSfSKtpQUadVLzLGUEuBOy NSEA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=jWrmnXct; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gc9-20020a1709072b0900b0087a3b3f01ccsi659958ejc.923.2023.02.02.13.58.37; Thu, 02 Feb 2023 13:59:02 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance-com.20210112.gappssmtp.com header.s=20210112 header.b=jWrmnXct; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233274AbjBBV5m (ORCPT <rfc822;il.mystafa@gmail.com> + 99 others); Thu, 2 Feb 2023 16:57:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54880 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233237AbjBBV5S (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Feb 2023 16:57:18 -0500 Received: from mail-wr1-x42e.google.com (mail-wr1-x42e.google.com [IPv6:2a00:1450:4864:20::42e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A79A728D5 for <linux-kernel@vger.kernel.org>; Thu, 2 Feb 2023 13:56:38 -0800 (PST) Received: by mail-wr1-x42e.google.com with SMTP id q5so3042154wrv.0 for <linux-kernel@vger.kernel.org>; Thu, 02 Feb 2023 13:56:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NSUUSSxTZ02F+r1LTVXNULWrMNdIZEEBIKeOIraHNX8=; b=jWrmnXctFoGJa9Wp7ZeU4q2i+NjqAJ2dYC96bkDbNz/V5jBzNy4cOuVK5EOPrnZJyq /vpwLO/s8e9/wVeW8Shg8jS//TmAwxBJVWnJJERRHdtvez1G1qiZ06wqhIYPy5MPxsSh 8Vb/DlK4m4XAqJokCHT4pGnu7NU/zeYRJsgzo6YI6RkYQqHTnluKdmWVlUNAvqYCG/lx XbgH1M3MTSsxPWLBlAh6scMY/xpQ9kQph0tL23tJYJQUphfb617YuhPngBrGVzw5SWak msODHrjIlagSjaREHtgTxusON6vTobYmR2bISUHElyoTNfwIjvzehB94J43pAdYcCSKS XpFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NSUUSSxTZ02F+r1LTVXNULWrMNdIZEEBIKeOIraHNX8=; b=aGFEfyXRBqPN8E6XZTTZAnH1iPp+KdiJIU+a70A0IJDoYA2jkg1zAMUcvgtk8fKy5S Ihk8K0lshnY6IJZWAJkO88dzUt+UeGad7Uo6WBjdQytXymlbvSHw/muPwX4/XLnMb3UP 3x1Fv/7xKpUwHGGD4N6/dGB55ETVEtanVFTBuymGpqyrskQRzS+qY9zN+OBwCE9ZR3HU G37OJrgkp24qKT/QcSKlMOsHy2nTDlRrB6F2zz696iPn+CF+EQAexWBGGuFVyXh1TeSs laEYDZ3RRjZ5zGej/wBXjjI79gTK35cZIrY/jCSV2PJr3Sex0arG0ySPv/xA2g6ui+24 GIhA== X-Gm-Message-State: AO0yUKW6nukoe2DjZ82Lhaqos2wss0uTIP9kfGOImNkk/jCDxp7l08QG QhreMaRjXSvkVaYJzzxkoINmSg== X-Received: by 2002:a5d:6da3:0:b0:2bf:ae11:c40c with SMTP id u3-20020a5d6da3000000b002bfae11c40cmr8326498wrs.32.1675374996905; Thu, 02 Feb 2023 13:56:36 -0800 (PST) Received: from usaari01.cust.communityfibre.co.uk ([2a02:6b6a:b566:0:98fe:e4ee:fc7e:cd71]) by smtp.gmail.com with ESMTPSA id e8-20020a5d6d08000000b00297dcfdc90fsm506078wrq.24.2023.02.02.13.56.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Feb 2023 13:56:36 -0800 (PST) From: Usama Arif <usama.arif@bytedance.com> To: dwmw2@infradead.org, tglx@linutronix.de, arjan@linux.intel.com Cc: mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, x86@kernel.org, pbonzini@redhat.com, paulmck@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, rcu@vger.kernel.org, mimoja@mimoja.de, hewenliang4@huawei.com, thomas.lendacky@amd.com, seanjc@google.com, pmenzel@molgen.mpg.de, fam.zheng@bytedance.com, punit.agrawal@bytedance.com, simon.evans@bytedance.com, liangma@liangbit.com, David Woodhouse <dwmw@amazon.co.uk> Subject: [PATCH v6 07/11] x86/smpboot: Disable parallel boot for AMD CPUs Date: Thu, 2 Feb 2023 21:56:21 +0000 Message-Id: <20230202215625.3248306-8-usama.arif@bytedance.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230202215625.3248306-1-usama.arif@bytedance.com> References: <20230202215625.3248306-1-usama.arif@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1756758165078612241?= X-GMAIL-MSGID: =?utf-8?q?1756758165078612241?= |
Series |
Parallel CPU bringup for x86_64
|
|
Commit Message
Usama Arif
Feb. 2, 2023, 9:56 p.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/kernel/cpu/amd.c | 11 +++++++++++ arch/x86/kernel/smpboot.c | 13 +++++++++++-- 3 files changed, 23 insertions(+), 2 deletions(-)
Comments
+Mario Hi, On 2/2/23 3:56 PM, Usama Arif wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- I'd like to nack this, but can't (and not because it doesn't have commit text): If I: - take dwmw2's parallel-6.2-rc6 branch (commit 459d1c46dbd1) - remove the set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP) line from amd.c Then: - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. - Zen 2,3,4-based servers boot fine - a Zen1-based server doesn't boot. This is what's left on its serial port: [ 3.199633] smp: Bringing up secondary CPUs ... [ 3.200732] x86: Booting SMP configuration: [ 3.204242] .... node #0, CPUs: #1 [ 3.204301] CPU 1 to 93/x86/cpu:kick in 63 21 -114014307645 0 . 0 0 0 0 . 0 114025055970 [ 3.204478] ------------[ cut here ]------------ [ 3.204481] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 [ 3.204490] Modules linked in: [ 3.204493] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc6+ #19 [ 3.204496] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 [ 3.204498] RIP: 0010:cpu_init+0x2d/0x1f0 [ 3.204502] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 39 55 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 [ 3.204504] RSP: 0000:ffffffffac803d70 EFLAGS: 00010083 [ 3.204506] RAX: ffff8d293eef6e40 RBX: ffff8d1d40010000 RCX: 0000000000000008 [ 3.204508] RDX: 0000000000000000 RSI: ffff8d1d1c40b048 RDI: ffffffffac566418 [ 3.204509] RBP: ffffffffac803d90 R08: 00000000fffffe14 R09: ffff8d1d1c406078 [ 3.204510] R10: ffffffffac803dc0 R11: 0000000000000000 R12: 0000000000000000 [ 3.204511] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 3.204512] FS: 0000000000000000(0000) GS:ffff8d1d1c400000(0000) knlGS:0000000000000000 [ 3.204514] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.204515] CR2: 0000000000000000 CR3: 0000800daec12000 CR4: 00000000003100a0 [ 3.204517] Call Trace: [ 3.204519] ---[ end trace 0000000000000000 ]--- [ 3.204580] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 2 [ 3.288686] #2 [ 3.288735] CPU 2 to 93/x86/cpu:kick in 210 42 -114355248756 0 . 0 0 0 0 . 0 114356192013 [ 3.288798] ------------[ cut here ]------------ [ 3.288804] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 [ 3.288815] Modules linked in: [ 3.288819] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc6+ #19 [ 3.288823] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 [ 3.288826] RIP: 0010:cpu_init+0x2d/0x1f0 [ 3.288831] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 39 55 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 [ 3.288835] RSP: 0000:ffffffffac803d70 EFLAGS: 00010083 [ 3.288838] RAX: ffff8d293eef6e40 RBX: ffff8d1d40010000 RCX: 0000000000000008 [ 3.288841] RDX: 0000000000000000 RSI: ffff8d1d1c40b048 RDI: ffffffffac566418 [ 3.288844] RBP: ffffffffac803d90 R08: 00000000fffffe14 R09: ffff8d1d1c406078 [ 3.288845] R10: ffffffffac803dc0 R11: 0000000000000000 R12: 0000000000000000 [ 3.288848] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 3.288850] FS: 0000000000000000(0000) GS:ffff8d1d1c400000(0000) knlGS:0000000000000000 [ 3.288852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.288855] CR2: 0000000000000000 CR3: 0000800daec12000 CR4: 00000000003100a0 [ 3.288857] Call Trace: [ 3.288859] ---[ end trace 0000000000000000 ]--- [ 3.288925] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 8 6.36[ [ 3. 68 33]3 [ #3[ [ # [ 3.368623[ 3 [ 3.368623] #3 [ 3.368662] ------------[ cut here ]------------ [ 3.368673] CPU 3 to 93/x86/cpu:kick in 504 315 -114684508974 0 . 0 0 0 0 . 0 114685353594 [ 3.368705] BUG: scheduling while atomic: swapper/0/1/0x00000003 [ 3.368708] 7 locks held by swapper/0/1: [ 3.368710] #0: ffffffffacbff920 (console_lock){....}-{0:0}, at: vprintk_emit+0x13a/0x2e0 [ 3.368721] #1: ffffffffacbffd48 (console_srcu){....}-{0:0}, at: console_flush_all+0x2d/0x250 [ 3.368728] #2: ffffffffac87f540 (console_owner){....}-{0:0}, at: console_emit_next_record.constprop.22+0x189/0x350 [ 3.368735] #3: ffffffffadaae838 (&port_lock_key){....}-{2:2}, at: serial8250_console_write+0x88/0x3c0 [ 3.368745] #4: ffffffffac86aa50 (cpu_add_remove_lock){....}-{3:3}, at: cpu_up+0x6a/0xd0 [ 3.368753] #5: ffffffffac86a9a0 (cpu_hotplug_lock){....}-{0:0}, at: _cpu_up+0x3d/0x2f0 [ 3.368760] #6: ffffffffac8763b0 (smpboot_threads_lock){....}-{3:3}, at: smpboot_create_threads+0x21/0x80 [ 3.368769] Modules linked in: [ 3.368770] Preemption disabled at: [ 3.368771] [<ffffffffaae717a4>] do_cpu_up+0x3e4/0x780 [ 3.368777] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc6+ #19 [ 3.368781] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 [ 3.368782] Call Trace: [ 3.368783] <TASK> [ 3.368789] dump_stack_lvl+0x49/0x63 [ 3.368795] ? do_cpu_up+0x3e4/0x780 [ 3.368799] dump_stack+0x10/0x16 [ 3.368802] __schedule_bug+0xad/0xd0 [ 3.368808] __schedule+0x76/0x8a0 [ 3.368812] ? sched_clock+0x9/0x10 [ 3.368817] ? sched_clock_local+0x17/0x90 [ 3.368826] ? sort_range+0x30/0x30 [ 3.368830] schedule+0x88/0xd0 [ 3.368833] schedule_timeout+0x40/0x320 [ 3.368840] ? __this_cpu_preempt_check+0x13/0x20 [ 3.368844] ? lock_release+0x353/0x3c0 [ 3.368852] ? sort_range+0x30/0x30 [ 3.368856] wait_for_completion_killable+0xe0/0x1c0 [ 3.368864] __kthread_create_on_node+0xfe/0x1e0 [ 3.368876] ? wait_for_completion_killable+0x38/0x1c0 [ 3.368884] kthread_create_on_node+0x46/0x70 [ 3.368894] kthread_create_on_cpu+0x2c/0x90 [ 3.368899] __smpboot_create_thread+0x87/0x140 [ 3.368905] smpboot_create_threads+0x3f/0x80 [ 3.368909] ? idle_thread_get+0x40/0x40 [ 3.368913] cpuhp_invoke_callback+0x13c/0x5d0 [ 3.368921] __cpuhp_invoke_callback_range+0x69/0xf0 [ 3.368929] _cpu_up+0x12a/0x2f0 [ 3.368937] cpu_up+0x8f/0xd0 [ 3.368942] bringup_nonboot_cpus+0x7c/0x160 [ 3.368950] smp_init+0x2a/0x83 [ 3.368957] kernel_init_freeable+0x1a1/0x309 [ 3.368961] ? lock_release+0x353/0x3c0 [ 3.368972] ? rest_init+0x140/0x140 [ 3.368977] kernel_init+0x1a/0x130 [ 3.368980] ret_from_fork+0x22/0x30 [ 3.368996] </TASK> [ 3.369419] [ 3.369420] .... node #1, CPUs: #4 [ 3.369466] ------------[ cut here ]------------ [ 3.369469] CPU 4 to 93/x86/cpu:kick in 378 42 -114685407543 0 . 0 0 0 0 . 0 114687022569 [ 3.369474] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 [ 3.369487] Modules linked in: [ 3.369491] ------------[ cut here ]------------ [ 3.369494] DEBUG_LOCKS_WARN_ON(val > preempt_count()) [ 3.369493] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc6+ #19 [ 3.369499] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 ...which points to the WARN_ON here: static void wait_for_master_cpu(int cpu) { #ifdef CONFIG_SMP /* * wait for ACK from master CPU before continuing * with AP initialization */ WARN_ON(cpumask_test_and_set_cpu(cpu, cpu_initialized_mask)); while (!cpumask_test_cpu(cpu, cpu_callout_mask)) cpu_relax(); #endif } Let me know if you'd like me to test any changes. Thanks, Kim
On 2/3/23 2:19 PM, Woodhouse, David wrote: > Would be interesting to know if that goes away if you test only the > first part of the tree. My first inclination is to suspect that's a bug > in the later patches... although the APIC ID mismatch is interesting. > That part (with each AP getting its own APIC ID from CPUID) is in the > *first* part of the series.... dwmw2/parallel-6.2-rc6-part1 (commit 942b3faa258c) re-enabled for AMD gets the same splat(s): [ 3.195498] smp: Bringing up secondary CPUs ... [ 3.196670] x86: Booting SMP configuration: [ 3.200189] .... node #0, CPUs: #1 [ 3.200247] CPU 1 to 93/x86/cpu:kick in 315 42 -155206575216 0 . 0 0 0 0 . 0 155217324423 [ 3.200418] ------------[ cut here ]------------ [ 3.200420] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 [ 3.200428] Modules linked in: [ 3.200430] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc6+ #19 [ 3.200433] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 [ 3.200435] RIP: 0010:cpu_init+0x2d/0x1f0 [ 3.200438] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 99 47 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 [ 3.200440] RSP: 0000:ffffffffba203d70 EFLAGS: 00010083 [ 3.200443] RAX: ffff8e027eef6e40 RBX: ffff8dfe80064b80 RCX: 0000000000000008 [ 3.200444] RDX: 0000000000000000 RSI: ffff8df65c40b048 RDI: ffffffffb9f66418 [ 3.200445] RBP: ffffffffba203d90 R08: 00000000fffffe4d R09: ffff8df65c406078 [ 3.200446] R10: ffffffffba203dc0 R11: 0000000000000000 R12: 0000000000000000 [ 3.200447] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 3.200448] FS: 0000000000000000(0000) GS:ffff8df65c400000(0000) knlGS:0000000000000000 [ 3.200450] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.200451] CR2: 0000000000000000 CR3: 0000800307e12000 CR4: 00000000003100a0 [ 3.200453] Call Trace: [ 3.200454] ---[ end trace 0000000000000000 ]--- [ 3.200509] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 2 [ 3.284620] #2 [ 3.284669] CPU 2 to 93/x86/cpu:kick in 252 42 -155547668514 0 . 0 0 0 0 . 0 155548597197 [ 3.284727] ------------[ cut here ]------------ [ 3.284732] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 [ 3.284741] Modules linked in: [ 3.284745] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc6+ #19 [ 3.284749] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 [ 3.284752] RIP: 0010:cpu_init+0x2d/0x1f0 [ 3.284756] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 99 47 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 [ 3.284760] RSP: 0000:ffffffffba203d70 EFLAGS: 00010083 [ 3.284764] RAX: ffff8e027eef6e40 RBX: ffff8dfe80064b80 RCX: 0000000000000008 [ 3.284766] RDX: 0000000000000000 RSI: ffff8df65c40b048 RDI: ffffffffb9f66418 [ 3.284769] RBP: ffffffffba203d90 R08: 00000000fffffe4d R09: ffff8df65c406078 [ 3.284771] R10: ffffffffba203dc0 R11: 0000000000000000 R12: 0000000000000000 [ 3.284773] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 3.284775] FS: 0000000000000000(0000) GS:ffff8df65c400000(0000) knlGS:0000000000000000 [ 3.284778] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.284779] CR2: 0000000000000000 CR3: 0000800307e12000 CR4: 00000000003100a0 [ 3.284781] Call Trace: [ 3.284783] ---[ end trace 0000000000000000 ]--- [ 3.284841] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 8 [ 3.364[[ [. 343 35 5 5] [ 3 364575 [ 3 [ [ [3 3 [ [ 3 Thanks, Kim
On Fri, 2023-02-03 at 15:45 -0600, Kim Phillips wrote: > On 2/3/23 2:19 PM, Woodhouse, David wrote: > > Would be interesting to know if that goes away if you test only the > > first part of the tree. My first inclination is to suspect that's a bug > > in the later patches... although the APIC ID mismatch is interesting. > > That part (with each AP getting its own APIC ID from CPUID) is in the > > *first* part of the series.... > > dwmw2/parallel-6.2-rc6-part1 (commit 942b3faa258c) re-enabled for > AMD gets the same splat(s): Sure that's the right image? Looks like it still has the timing debug patch from the last commit in the tree? > [ 3.195498] smp: Bringing up secondary CPUs ... > [ 3.196670] x86: Booting SMP configuration: > [ 3.200189] .... node #0, CPUs: #1 > [ 3.200247] CPU 1 to 93/x86/cpu:kick in 315 42 -155206575216 0 . 0 0 0 0 . 0 155217324423 > [ 3.200418] ------------[ cut here ]------------ > [ 3.200420] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 > [ 3.200428] Modules linked in: > [ 3.200430] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc6+ #19 > [ 3.200433] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 > [ 3.200435] RIP: 0010:cpu_init+0x2d/0x1f0 > [ 3.200438] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 99 47 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 > [ 3.200440] RSP: 0000:ffffffffba203d70 EFLAGS: 00010083 > [ 3.200443] RAX: ffff8e027eef6e40 RBX: ffff8dfe80064b80 RCX: 0000000000000008 > [ 3.200444] RDX: 0000000000000000 RSI: ffff8df65c40b048 RDI: ffffffffb9f66418 > [ 3.200445] RBP: ffffffffba203d90 R08: 00000000fffffe4d R09: ffff8df65c406078 > [ 3.200446] R10: ffffffffba203dc0 R11: 0000000000000000 R12: 0000000000000000 > [ 3.200447] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 3.200448] FS: 0000000000000000(0000) GS:ffff8df65c400000(0000) knlGS:0000000000000000 > [ 3.200450] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 3.200451] CR2: 0000000000000000 CR3: 0000800307e12000 CR4: 00000000003100a0 > [ 3.200453] Call Trace: > [ 3.200454] ---[ end trace 0000000000000000 ]--- > [ 3.200509] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 2 > [ 3.284620] #2 > [ 3.284669] CPU 2 to 93/x86/cpu:kick in 252 42 -155547668514 0 . 0 0 0 0 . 0 155548597197 > [ 3.284727] ------------[ cut here ]------------ > [ 3.284732] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 > [ 3.284741] Modules linked in: > [ 3.284745] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc6+ #19 > [ 3.284749] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 > [ 3.284752] RIP: 0010:cpu_init+0x2d/0x1f0 > [ 3.284756] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 99 47 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 > [ 3.284760] RSP: 0000:ffffffffba203d70 EFLAGS: 00010083 > [ 3.284764] RAX: ffff8e027eef6e40 RBX: ffff8dfe80064b80 RCX: 0000000000000008 > [ 3.284766] RDX: 0000000000000000 RSI: ffff8df65c40b048 RDI: ffffffffb9f66418 > [ 3.284769] RBP: ffffffffba203d90 R08: 00000000fffffe4d R09: ffff8df65c406078 > [ 3.284771] R10: ffffffffba203dc0 R11: 0000000000000000 R12: 0000000000000000 > [ 3.284773] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 3.284775] FS: 0000000000000000(0000) GS:ffff8df65c400000(0000) knlGS:0000000000000000 > [ 3.284778] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 3.284779] CR2: 0000000000000000 CR3: 0000800307e12000 CR4: 00000000003100a0 > [ 3.284781] Call Trace: > [ 3.284783] ---[ end trace 0000000000000000 ]--- > [ 3.284841] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 8 > [ 3.364[[ [. 343 35 5 5] [ 3 364575 [ > 3 [ [ [3 3 [ [ 3 > > Thanks, > > Kim
On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: > +Mario > > Hi, > > On 2/2/23 3:56 PM, Usama Arif wrote: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > --- > > I'd like to nack this, but can't (and not because it doesn't have > commit text): > > If I: > > - take dwmw2's parallel-6.2-rc6 branch (commit 459d1c46dbd1) > - remove the set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP) line from amd.c > > Then: > > - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. > - Zen 2,3,4-based servers boot fine > - a Zen1-based server doesn't boot. > > This is what's left on its serial port: > > [ 3.199633] smp: Bringing up secondary CPUs ... > [ 3.200732] x86: Booting SMP configuration: > [ 3.204242] .... node #0, CPUs: #1 > [ 3.204301] CPU 1 to 93/x86/cpu:kick in 63 21 -114014307645 0 . 0 0 0 0 . 0 114025055970 > [ 3.204478] ------------[ cut here ]------------ > [ 3.204481] WARNING: CPU: 0 PID: 1 at arch/x86/kernel/cpu/common.c:2122 cpu_init+0x2d/0x1f0 > [ 3.204490] Modules linked in: > [ 3.204493] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc6+ #19 > [ 3.204496] Hardware name: AMD Corporation Speedway/Speedway, BIOS RSW1009C 07/27/2018 > [ 3.204498] RIP: 0010:cpu_init+0x2d/0x1f0 > [ 3.204502] Code: e5 41 56 41 55 41 54 53 65 48 8b 1c 25 80 2e 1f 00 65 44 8b 35 20 e4 39 55 48 8b 05 5d f7 51 02 44 89 f2 f0 48 0f ab 10 73 06 <0f> 0b eb 02 f3 90 48 8b 05 3e f7 51 02 48 0f a3 10 73 f1 45 85 f6 > [ 3.204504] RSP: 0000:ffffffffac803d70 EFLAGS: 00010083 > [ 3.204506] RAX: ffff8d293eef6e40 RBX: ffff8d1d40010000 RCX: 0000000000000008 > [ 3.204508] RDX: 0000000000000000 RSI: ffff8d1d1c40b048 RDI: ffffffffac566418 > [ 3.204509] RBP: ffffffffac803d90 R08: 00000000fffffe14 R09: ffff8d1d1c406078 > [ 3.204510] R10: ffffffffac803dc0 R11: 0000000000000000 R12: 0000000000000000 > [ 3.204511] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 3.204512] FS: 0000000000000000(0000) GS:ffff8d1d1c400000(0000) knlGS:0000000000000000 > [ 3.204514] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 3.204515] CR2: 0000000000000000 CR3: 0000800daec12000 CR4: 00000000003100a0 > [ 3.204517] Call Trace: > [ 3.204519] ---[ end trace 0000000000000000 ]--- > [ 3.204580] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 APIC: 2 So this is where it all starts to go wrong, and we didn't really change this in the less-tested later part of the series. So even though I'd like to be sure you've tested with just '-part1' to be sure, I suspect that isn't the issue. The warning you highlighted the end of the end of the log is just complaining that a given AP is *already* marked as initialized when it's trying to come up, and that's just fallout of the fact that they don't know which CPU they are. They *all* come up thinking they're CPU#0. So that's weird. Now, what we do in this series is stop *telling* the AP which CPU# it is in a global variable as we bring them up one at a time, and instead we let them get their own APIC ID from CPUID leaf 0xb and look up their per-cpu data that way. The commit message for 'Support parallel startup of secondary CPUs' (currently commit 0f52d4eaaf0c) explains that in a bit more detail. Please could you try parallel-6.2-rc6-part1 with something like this? If I boot this in qemu with a weird topology to stop it being a 1:1 mapping, I do see a series of BbCcDdDeEeFfIgJh... showing the APIC ID and CPU# that each AP finds as it starts up. qemu-system-x86_64 -kernel arch/x86/boot/bzImage -smp 24,sockets=4,cores=3,threads=2 -append "console=ttyS0,115200 lpj=16528321 earlyprintk=ttyS0" -serial mon:stdio -display none -m 1G -accel kvm,kernel-irqchip=split ... [ 0.570022] x86: Booting SMP configuration: BbCc[ 0.570923] .... node #0, CPUs: #1 #2 [ 0.711468] Callback from call_rcu_tasks_rude() invoked. DdEe[ 0.713316] #3 #4 [ 0.854459] Callback from call_rcu_tasks() invoked. FfIgJhKiLjMkNlQmRnSoTpUqVrYsZt[u\v]w^x[ 0.856289] #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S index d07f694691d2..c3219dc2a201 100644 --- a/arch/x86/kernel/head_64.S +++ b/arch/x86/kernel/head_64.S @@ -247,8 +247,8 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) * Is this the boot CPU coming up? If so everything is available * in initial_gs, initial_stack and early_gdt_descr. */ - movl smpboot_control(%rip), %eax - testl %eax, %eax + movl smpboot_control(%rip), %edx + testl %edx, %edx jz .Lsetup_cpu /* @@ -259,30 +259,45 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL) * Bit 30 STARTUP_SECONDARY flag * Bit 31 STARTUP_PARALLEL flag (use CPUID 0x0b for APIC ID) */ - testl $STARTUP_PARALLEL, %eax + testl $STARTUP_PARALLEL, %edx jnz .Luse_cpuid_0b - andl $0x0FFFFFFF, %eax + andl $0x0FFFFFFF, %edx jmp .Lsetup_AP .Luse_cpuid_0b: mov $0x0B, %eax xorl %ecx, %ecx cpuid - mov %edx, %eax + +/* test hack: print 'a' + APICID */ .Lsetup_AP: - /* EAX contains the APICID of the current CPU */ + /* EDX contains the APICID of the current CPU */ + + /* Test hack: Print APIC ID and then CPU# when we find it. */ + mov %edx, %ecx + mov %edx, %eax + addb $'A', %al + mov $0x3f8, %dx + outb %al, %dx + mov %ecx, %edx + mov $'a', %al + xorl %ecx, %ecx leaq cpuid_to_apicid(%rip), %rbx .Lfind_cpunr: - cmpl (%rbx), %eax + cmpl (%rbx), %edx jz .Linit_cpu_data addq $4, %rbx addq $8, %rcx + addb $1, %al jmp .Lfind_cpunr .Linit_cpu_data: + mov $0x3f8, %dx + outb %al, %dx + /* Get the per cpu offset */ leaq __per_cpu_offset(%rip), %rbx addq %rcx, %rbx
On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: > [ 3.204580] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: 0 > APIC: 2 Could it just be that some processors have CPUID leaves above 0xb but don't actually support 0xb? What does this do? --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1578,6 +1578,18 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) do_parallel_bringup = false; + if (do_parallel_bringup) { + /* Check that CPUID 0x0B really does look sane. */ + unsigned int eax, ebx, ecx, edx; + + cpuid_count(0xb, 0, &eax, &ebx, &ecx, &edx); + printk("CPUID 0xB level 0: %x %x %x %x\n", eax, ebx, ecx, edx); + if (!eax && !ebx) { + pr_info("Disabling parallel bringup because CPUID 0xb looks untrustworthy\n"); + do_parallel_bringup = false; + } + } + if (do_parallel_bringup && boot_cpu_has_bug(X86_BUG_NO_PARALLEL_BRINGUP)) { pr_info("Disabling parallel bringup due to CPU bugs\n");
On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: > > I'd like to nack this, but can't (and not because it doesn't have > commit text) > So, I *think* that worked precisely as designed; thanks for letting it grab your attention :) > If I: > > - take dwmw2's parallel-6.2-rc6 branch (commit 459d1c46dbd1) > - remove the set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP) line from amd.c > > Then: > > - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. > - Zen 2,3,4-based servers boot fine > - a Zen1-based server doesn't boot. I've changed it to use CPUID 0xb only if we're actually in x2apic mode, which Boris tells me won't be the case on Zen1 because that doesn't support X2APIC. When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits of APIC ID we find there are perfectly sufficient. New tree in the same place as before, commit ce7e2d1e046a for the parallel-6.2-rc6-part1 tag and 17bbd12ee03 for parallel-6.2-rc6. However... Even though we *can* support non-X2APIC processors, we *might* want to play it safe and not go back that far; only enabling parallel bringup on machines with X2APIC which roughly correlates with "lots of CPUs" since that's where the benefit is.
> > However... > > Even though we *can* support non-X2APIC processors, we *might* want to > play it safe and not go back that far; only enabling parallel bringup > on machines with X2APIC which roughly correlates with "lots of CPUs" > since that's where the benefit is. I think that this is the right approach, at least on the initial patch series. KISS principle; do all the easy-but-important cases first, get it stable and working and in later series/kernels the range can be expanded.... if it matters. > > >
On 4 February 2023 18:18:55 GMT, Arjan van de Ven <arjan@linux.intel.com> wrote: >> >> However... >> >> Even though we *can* support non-X2APIC processors, we *might* want to >> play it safe and not go back that far; only enabling parallel bringup >> on machines with X2APIC which roughly correlates with "lots of CPUs" >> since that's where the benefit is. > >I think that this is the right approach, at least on the initial patch series. >KISS principle; do all the easy-but-important cases first, get it stable and working >and in later series/kernels the range can be expanded.... if it matters. Agreed. I'll split it to do it only with X2APIC for the initial series, and then hold the CPUID 0x1 part back for the next phase.
On 04/02/2023 22:31, David Woodhouse wrote: > > > On 4 February 2023 18:18:55 GMT, Arjan van de Ven <arjan@linux.intel.com> wrote: >>> >>> However... >>> >>> Even though we *can* support non-X2APIC processors, we *might* want to >>> play it safe and not go back that far; only enabling parallel bringup >>> on machines with X2APIC which roughly correlates with "lots of CPUs" >>> since that's where the benefit is. >> >> I think that this is the right approach, at least on the initial patch series. >> KISS principle; do all the easy-but-important cases first, get it stable and working >> and in later series/kernels the range can be expanded.... if it matters. > > Agreed. I'll split it to do it only with X2APIC for the initial series, and then hold the CPUID 0x1 part back for the next phase. This was an interesting find! I tested the latest branch parallel-6.2-rc6 and it works well. The numbers from Russ makes the patch series look so much better! :) If we do it with x2apic only and not support non-x2apic CPUID 0x1 case, maybe we apply the following diff to part 1? diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index f53a060a899b..f6b89cf40076 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1564,7 +1564,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) * sufficient). Otherwise it's too hard. And not for SEV-ES * guests because they can't use CPUID that early. */ - if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 1 || + if (IS_ENABLED(CONFIG_X86_32) || !x2apic_mode || (x2apic_mode && boot_cpu_data.cpuid_level < 0xb) || cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) do_parallel_bringup = false; For reusing timer calibration, calibrate_delay ends up being used in start_kernel, smp_callin, debug_calc_bogomips and check_cx686_slop. I think reusing timer calibration would be ok in the first 2 uses? but not really sure about the other 2. cx686 seems to be quite old so not sure if anyone will have it to test or will ever run 6.2 kernel on it :). I guess if unsure, better to leave out initially and try and get part1 merged? Thanks, Usama
On Sun, 2023-02-05 at 22:13 +0000, Usama Arif wrote: > > > On 04/02/2023 22:31, David Woodhouse wrote: > > > > > > On 4 February 2023 18:18:55 GMT, Arjan van de Ven <arjan@linux.intel.com> wrote: > > > > > > > > However... > > > > > > > > Even though we *can* support non-X2APIC processors, we *might* want to > > > > play it safe and not go back that far; only enabling parallel bringup > > > > on machines with X2APIC which roughly correlates with "lots of CPUs" > > > > since that's where the benefit is. > > > > > > I think that this is the right approach, at least on the initial patch series. > > > KISS principle; do all the easy-but-important cases first, get it stable and working > > > and in later series/kernels the range can be expanded.... if it matters. > > > > Agreed. I'll split it to do it only with X2APIC for the initial series, and then hold the CPUID 0x1 part back for the next phase. > > This was an interesting find! I tested the latest branch > parallel-6.2-rc6 and it works well. The numbers from Russ makes the > patch series look so much better! :) > > If we do it with x2apic only and not support non-x2apic CPUID 0x1 case, > maybe we apply the following diff to part 1? Using x2apic_mode would also disable parallel boot when the CPU *does* have X2APIC support but the kernel just isn't using it at the moment. I think boot_cpu_has(X86_FEATURE_X2APIC) is the better criterion? I was thinking I'd tweak the 'no_parallel_bringup' command line argument into something that also allows us to *enable* it without X2APIC being supported. But I've also been pondering the fact that this is all only for 64-bit anyway. It's not like we're doing it for the zoo of ancient i586 and even i486 machines where the APICs were hooked up with blue wire and duct tape. Maybe "64-bit only" is good enough, with a command line opt-out. And maybe a printk pointing out the existence of that command line option before the bringup, just in case? > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c > index f53a060a899b..f6b89cf40076 100644 > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -1564,7 +1564,7 @@ void __init native_smp_prepare_cpus(unsigned int > max_cpus) > * sufficient). Otherwise it's too hard. And not for SEV-ES > * guests because they can't use CPUID that early. > */ > - if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 1 || > + if (IS_ENABLED(CONFIG_X86_32) || !x2apic_mode || > (x2apic_mode && boot_cpu_data.cpuid_level < 0xb) || > cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) > do_parallel_bringup = false; > > > > For reusing timer calibration, calibrate_delay ends up being used in > start_kernel, smp_callin, debug_calc_bogomips and check_cx686_slop. I > think reusing timer calibration would be ok in the first 2 uses? > It already explicitly allows the calibration to be reused from another CPU in the same *core*, but no further. I think we'd need to be sure about the correctness of extending that further, and which of the mess of constant/invariant/we-really-mean-it-this-time TSC flags need to be set for that to be OK. > but not really sure about the other 2. cx686 seems to be quite old so not sure > if anyone will have it to test or will ever run 6.2 kernel on it :). I > guess if unsure, better to leave out initially and try and get part1 merged? I doubt cx686 advertises constant TSC; the early ones didn't even *have* a TSC. Does it even support SMP?
On 06/02/2023 08:05, David Woodhouse wrote: > On Sun, 2023-02-05 at 22:13 +0000, Usama Arif wrote: >> >> >> On 04/02/2023 22:31, David Woodhouse wrote: >>> >>> >>> On 4 February 2023 18:18:55 GMT, Arjan van de Ven <arjan@linux.intel.com> wrote: >>>>> >>>>> However... >>>>> >>>>> Even though we *can* support non-X2APIC processors, we *might* want to >>>>> play it safe and not go back that far; only enabling parallel bringup >>>>> on machines with X2APIC which roughly correlates with "lots of CPUs" >>>>> since that's where the benefit is. >>>> >>>> I think that this is the right approach, at least on the initial patch series. >>>> KISS principle; do all the easy-but-important cases first, get it stable and working >>>> and in later series/kernels the range can be expanded.... if it matters. >>> >>> Agreed. I'll split it to do it only with X2APIC for the initial series, and then hold the CPUID 0x1 part back for the next phase. >> >> This was an interesting find! I tested the latest branch >> parallel-6.2-rc6 and it works well. The numbers from Russ makes the >> patch series look so much better! :) >> >> If we do it with x2apic only and not support non-x2apic CPUID 0x1 case, >> maybe we apply the following diff to part 1? > > Using x2apic_mode would also disable parallel boot when the CPU *does* > have X2APIC support but the kernel just isn't using it at the moment. I > think boot_cpu_has(X86_FEATURE_X2APIC) is the better criterion? > x2apic_mode is set to 0 only in the case when nox2apic is specified in the kernel cmdline or if x2apic_setup fails. As 0xB leaf gets the "x2apic id" and not the "apic id", I thought it would be better to not use the x2apic id if the user doesnt want to use x2apic (cmdline), or the kernel fails to set it up. Another thing I noticed from the Intel Architecture Manual CPUID—CPU Identification section: "CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first checking for the existence of Leaf 1FH before using leaf 0BH." So I think we should switch from 0BH to using the 1FH leaf EDX register. > I was thinking I'd tweak the 'no_parallel_bringup' command line > argument into something that also allows us to *enable* it without > X2APIC being supported. > > But I've also been pondering the fact that this is all only for 64-bit > anyway. It's not like we're doing it for the zoo of ancient i586 and > even i486 machines where the APICs were hooked up with blue wire and > duct tape. > > Maybe "64-bit only" is good enough, with a command line opt-out. And > maybe a printk pointing out the existence of that command line option > before the bringup, just in case? > I think that makes sense, the patch only has a significant impact when the core count is high, and x2apic was made to support higher CPU count. >> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c >> index f53a060a899b..f6b89cf40076 100644 >> --- a/arch/x86/kernel/smpboot.c >> +++ b/arch/x86/kernel/smpboot.c >> @@ -1564,7 +1564,7 @@ void __init native_smp_prepare_cpus(unsigned int >> max_cpus) >> * sufficient). Otherwise it's too hard. And not for SEV-ES >> * guests because they can't use CPUID that early. >> */ >> - if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 1 || >> + if (IS_ENABLED(CONFIG_X86_32) || !x2apic_mode || >> (x2apic_mode && boot_cpu_data.cpuid_level < 0xb) || >> cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) >> do_parallel_bringup = false; >> >> >> >> For reusing timer calibration, calibrate_delay ends up being used in >> start_kernel, smp_callin, debug_calc_bogomips and check_cx686_slop. I >> think reusing timer calibration would be ok in the first 2 uses? >> > > It already explicitly allows the calibration to be reused from another > CPU in the same *core*, but no further. I think we'd need to be sure > about the correctness of extending that further, and which of the mess > of constant/invariant/we-really-mean-it-this-time TSC flags need to be > set for that to be OK. > >> but not really sure about the other 2. cx686 seems to be quite old so not sure >> if anyone will have it to test or will ever run 6.2 kernel on it :). I >> guess if unsure, better to leave out initially and try and get part1 merged? > > I doubt cx686 advertises constant TSC; the early ones didn't even *have* a TSC. > Does it even support SMP? Just checked the wiki page, and it says core count is 1 :)
On 2/4/23 9:40 AM, David Woodhouse wrote: > On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: >> If I: >> >> - take dwmw2's parallel-6.2-rc6 branch (commit 459d1c46dbd1) >> - remove the set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP) line from amd.c >> >> Then: >> >> - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. >> - Zen 2,3,4-based servers boot fine >> - a Zen1-based server doesn't boot. > > I've changed it to use CPUID 0xb only if we're actually in x2apic mode, > which Boris tells me won't be the case on Zen1 because that doesn't > support X2APIC. > > When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits > of APIC ID we find there are perfectly sufficient. > > New tree in the same place as before, commit ce7e2d1e046a for the > parallel-6.2-rc6-part1 tag and 17bbd12ee03 for parallel-6.2-rc6. Thanks, Zen 1 through 4 based servers all boot both those two tree commits successfully. I'll try that Ryzen again later. Kim
On Mon, Feb 06, 2023, Usama Arif wrote: > > > On 06/02/2023 08:05, David Woodhouse wrote: > > On Sun, 2023-02-05 at 22:13 +0000, Usama Arif wrote: > > > > > > > > > On 04/02/2023 22:31, David Woodhouse wrote: > > > > > > > > > > > > On 4 February 2023 18:18:55 GMT, Arjan van de Ven <arjan@linux.intel.com> wrote: > > > > > > > > > > > > However... > > > > > > > > > > > > Even though we *can* support non-X2APIC processors, we *might* want to > > > > > > play it safe and not go back that far; only enabling parallel bringup > > > > > > on machines with X2APIC which roughly correlates with "lots of CPUs" > > > > > > since that's where the benefit is. > > > > > > > > > > I think that this is the right approach, at least on the initial patch series. > > > > > KISS principle; do all the easy-but-important cases first, get it stable and working > > > > > and in later series/kernels the range can be expanded.... if it matters. > > > > > > > > Agreed. I'll split it to do it only with X2APIC for the initial series, And sanity check CPUID.0xB output even when x2APIC is supported, e.g. require CPUID.0xB.EBX to be non-zero. Odds are very good that there are VMs in the wild that support x2APIC but have an empty CPUID.0xB due to it being a topology leaf, i.e. may be suppressed when vCPUs aren't pinned. QEMU even has a knob to deliberately disable CPUID.0xB, e.g. booting a VM with cpu host,host-phys-bits,-cpuid-0xb,+x2apic works just fine. > > > > and then hold the CPUID 0x1 part back for the next phase. > > > This was an interesting find! I tested the latest branch > > > parallel-6.2-rc6 and it works well. The numbers from Russ makes the > > > patch series look so much better! :) > > > > > > If we do it with x2apic only and not support non-x2apic CPUID 0x1 case, > > > maybe we apply the following diff to part 1? > > > > Using x2apic_mode would also disable parallel boot when the CPU *does* > > have X2APIC support but the kernel just isn't using it at the moment. I > > think boot_cpu_has(X86_FEATURE_X2APIC) is the better criterion? > > > > x2apic_mode is set to 0 only in the case when nox2apic is specified in the > kernel cmdline or if x2apic_setup fails. As 0xB leaf gets the "x2apic id" > and not the "apic id", I thought it would be better to not use the x2apic id > if the user doesnt want to use x2apic (cmdline), or the kernel fails to set > it up. I agree with David that checking boot_cpu_has(X86_FEATURE_X2APIC) is preferred, x2APIC goes unused on a lot of platforms due to the kernel's interrupt remapping requirement. I would rather have a single explicit "no_parallel_bringup" option than try to infer the user's intentions based on tangentially related params. > Another thing I noticed from the Intel Architecture Manual CPUID—CPU > Identification section: > > "CPUID leaf 1FH is a preferred superset to leaf 0BH. Intel recommends first > checking for the existence of Leaf 1FH before using leaf 0BH." > > So I think we should switch from 0BH to using the 1FH leaf EDX register. I don't think using 0x1F will buy us anything except complexity. 0x1F provides more details on the topology, but its x2APIC ID enumeration isn't any more trustworthy than 0xB. > > I was thinking I'd tweak the 'no_parallel_bringup' command line > > argument into something that also allows us to *enable* it without > > X2APIC being supported. > > > > But I've also been pondering the fact that this is all only for 64-bit > > anyway. It's not like we're doing it for the zoo of ancient i586 and > > even i486 machines where the APICs were hooked up with blue wire and > > duct tape. The only thing I can see benefiting from parallel bringup without x2APIC is large VMs running on pre-x2AVIC hardware. E.g. IIRC, we (Google) hide x2APIC on AMD-based VMs so that VMs would take advantage of AVIC acceleration if AVIC were even to be enabled. But that's more of an argument for "try to use CPUID.0x1" than it is an argument for trying to use CPUID.0xB without x2APIC. > > Maybe "64-bit only" is good enough, with a command line opt-out. And > > maybe a printk pointing out the existence of that command line option > > before the bringup, just in case?
On Thu, Feb 02 2023 at 21:56, Usama Arif wrote: > From: David Woodhouse <dwmw@amazon.co.uk> -ENOCONTENT > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
On Sat, Feb 04 2023 at 15:40, David Woodhouse wrote: > On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: >> Then: >> >> - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. >> - Zen 2,3,4-based servers boot fine >> - a Zen1-based server doesn't boot. > > I've changed it to use CPUID 0xb only if we're actually in x2apic mode, > which Boris tells me won't be the case on Zen1 because that doesn't > support X2APIC. Correct. > When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits > of APIC ID we find there are perfectly sufficient. Is that worth the trouble? > Even though we *can* support non-X2APIC processors, we *might* want to > play it safe and not go back that far; only enabling parallel bringup > on machines with X2APIC which roughly correlates with "lots of CPUs" > since that's where the benefit is. The parallel bringup code is complex enough already, so please don't optimize for the non-interesting case in the first place. When this has stabilized then the CPUID 0x1 mechanism can be added if anyone thinks it's interesting. KISS is still the best engineering principle. Thanks, tglx
On Tue, 2023-02-07 at 01:23 +0100, Thomas Gleixner wrote: > On Sat, Feb 04 2023 at 15:40, David Woodhouse wrote: > > On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: > > > Then: > > > > > > - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. > > > - Zen 2,3,4-based servers boot fine > > > - a Zen1-based server doesn't boot. > > > > I've changed it to use CPUID 0xb only if we're actually in x2apic mode, > > which Boris tells me won't be the case on Zen1 because that doesn't > > support X2APIC. > > Correct. > > > When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits > > of APIC ID we find there are perfectly sufficient. > > Is that worth the trouble? Well, that's what was being debated. I think the conclusion that was bring reached was that it *is* worth the trouble, because there will be a number of physical and especially virtual machines which have a high CPU count but which don't actually use X2APIC mode. And which might not even *support* CPUID 0xb. So using CPUID 0x1 when there is no APIC ID greater than 254 does seem to make sense. > > Even though we *can* support non-X2APIC processors, we *might* want to > > play it safe and not go back that far; only enabling parallel bringup > > on machines with X2APIC which roughly correlates with "lots of CPUs" > > since that's where the benefit is. > > The parallel bringup code is complex enough already, so please don't > optimize for the non-interesting case in the first place. When this has > stabilized then the CPUID 0x1 mechanism can be added if anyone thinks > it's interesting. KISS is still the best engineering principle. Actually it ends up being trivial. It probably makes sense to keep it in there even if it can only be exercised by a deliberate opt-in on older CPUs. I reworked the register usage from your original anyway, which helps a little. testl $STARTUP_APICID_CPUID_0B, %edx jnz .Luse_cpuid_0b testl $STARTUP_APICID_CPUID_01, %edx jnz .Luse_cpuid_01 andl $0x0FFFFFFF, %edx jmp .Lsetup_AP .Luse_cpuid_01: mov $0x01, %eax cpuid mov %ebx, %edx shr $24, %edx jmp .Lsetup_AP .Luse_cpuid_0b: mov $0x0B, %eax xorl %ecx, %ecx cpuid .Lsetup_AP: /* EDX contains the APICID of the current CPU */
David! On Tue, Feb 07 2023 at 10:04, David Woodhouse wrote: > On Tue, 2023-02-07 at 01:23 +0100, Thomas Gleixner wrote: >> > When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits >> > of APIC ID we find there are perfectly sufficient. >> >> Is that worth the trouble? > > Well, that's what was being debated. I think the conclusion that was > bring reached was that it *is* worth the trouble, because there will be > a number of physical and especially virtual machines which have a high > CPU count but which don't actually use X2APIC mode. And which might not > even *support* CPUID 0xb. > > So using CPUID 0x1 when there is no APIC ID greater than 254 does seem > to make sense. Fair enough. >> > Even though we *can* support non-X2APIC processors, we *might* want to >> > play it safe and not go back that far; only enabling parallel bringup >> > on machines with X2APIC which roughly correlates with "lots of CPUs" >> > since that's where the benefit is. >> >> The parallel bringup code is complex enough already, so please don't >> optimize for the non-interesting case in the first place. When this has >> stabilized then the CPUID 0x1 mechanism can be added if anyone thinks >> it's interesting. KISS is still the best engineering principle. > > Actually it ends up being trivial. It probably makes sense to keep it > in there even if it can only be exercised by a deliberate opt-in on > older CPUs. I reworked the register usage from your original anyway, > which helps a little. > > testl $STARTUP_APICID_CPUID_0B, %edx > jnz .Luse_cpuid_0b > testl $STARTUP_APICID_CPUID_01, %edx > jnz .Luse_cpuid_01 > andl $0x0FFFFFFF, %edx > jmp .Lsetup_AP > > .Luse_cpuid_01: > mov $0x01, %eax > cpuid > mov %ebx, %edx > shr $24, %edx > jmp .Lsetup_AP > > .Luse_cpuid_0b: > mov $0x0B, %eax > xorl %ecx, %ecx > cpuid > > .Lsetup_AP: > /* EDX contains the APICID of the current CPU */ That looks trivial enough. So no objections from my side. Not sure whether this needs a special opt-in though. We probably want an opt-out for the parallel bringup mode for diagnosis purposes anyway and that should be good enough for a start. Thanks, tglx
On Tue, Feb 07 2023 at 09:52, David Woodhouse wrote: > On Tue, 2023-02-07 at 01:09 +0100, Thomas Gleixner wrote: >> On Thu, Feb 02 2023 at 21:56, Usama Arif wrote: >> >> -ENOCONTENT > > > Yeah, I'm aware of the nonsense here too but when I'm actually writing > a commit message, years of habit kick in and it doesn't occur to me to > pointlessly repeat the words that are already there, one line up where > it says "Disable parallel boot for AMD CPUs". > > I'm old and stupid, but eventually I'll start to remember that people > nowadays like to gratuitously refuse to read those words, and they want > them repeated in a different place. I'm not asking for repeating information from the subject line but to actually add a rationale. The WHY is the most important information of a changelog, no? > Bear with me... I do, but that doesn't mean I'm giving you special treatment :) Thanks, tglx
On 2/6/23 11:58 AM, Kim Phillips wrote: > On 2/4/23 9:40 AM, David Woodhouse wrote: >> On Fri, 2023-02-03 at 13:48 -0600, Kim Phillips wrote: >>> If I: >>> >>> - take dwmw2's parallel-6.2-rc6 branch (commit 459d1c46dbd1) >>> - remove the set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP) line from amd.c >>> >>> Then: >>> >>> - a Ryzen 3000 (Picasso A1/Zen+) notebook I have access to fails to boot. >>> - Zen 2,3,4-based servers boot fine >>> - a Zen1-based server doesn't boot. >> >> I've changed it to use CPUID 0xb only if we're actually in x2apic mode, >> which Boris tells me won't be the case on Zen1 because that doesn't >> support X2APIC. >> >> When we're not in x2apic mode, we can use CPUID 0x1 because the 8 bits >> of APIC ID we find there are perfectly sufficient. >> >> New tree in the same place as before, commit ce7e2d1e046a for the >> parallel-6.2-rc6-part1 tag and 17bbd12ee03 for parallel-6.2-rc6. > > Thanks, Zen 1 through 4 based servers all boot both those two tree > commits successfully. > > I'll try that Ryzen again later. The Ryzen 3000 also successfully boots those two branches now. Thanks, Kim
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 61012476d66e..ed7f32354edc 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -466,5 +466,6 @@ #define X86_BUG_MMIO_UNKNOWN X86_BUG(26) /* CPU is too old and its MMIO Stale Data status is unknown */ #define X86_BUG_RETBLEED X86_BUG(27) /* CPU is affected by RETBleed */ #define X86_BUG_EIBRS_PBRSB X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */ +#define X86_BUG_NO_PARALLEL_BRINGUP X86_BUG(29) /* CPU has hardware issues with parallel AP bringup */ #endif /* _ASM_X86_CPUFEATURES_H */ diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index f769d6d08b43..19b5c8342d7e 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -941,6 +941,17 @@ static void init_amd(struct cpuinfo_x86 *c) case 0x19: init_amd_zn(c); break; } + /* + * Various AMD CPUs appear to not to cope with APs being brought up + * in parallel. In debugging, the AP doesn't even seem to reach an + * outb to port 0x3f8 right at the top of the startup trampoline. + * We don't *think* there are any remaining software issues which + * may contribute to this, although it's possible. So far, attempts + * to get AMD to investigate this have been to no avail. So just + * disable parallel bring up for all AMD CPUs for now. + */ + set_cpu_bug(c, X86_BUG_NO_PARALLEL_BRINGUP); + /* * Enable workaround for FXSAVE leak on CPUs * without a XSaveErPtr feature diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 741da8d306a4..656897b055f5 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1534,13 +1534,22 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) * We can do 64-bit AP bringup in parallel if the CPU reports its * APIC ID in CPUID leaf 0x0B. Otherwise it's too hard. And not * for SEV-ES guests because they can't use CPUID that early. - * Also, some AMD CPUs crash when doing parallel cpu bringup, disable - * it for all AMD CPUs to be on the safe side. */ if (IS_ENABLED(CONFIG_X86_32) || boot_cpu_data.cpuid_level < 0x0B || cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) do_parallel_bringup = false; + /* + * Various AMD CPUs appear not to cope with APs being brought up + * in parallel, so just disable parallel bring up for all AMD CPUs + * for now. + */ + if (do_parallel_bringup && + boot_cpu_has_bug(X86_BUG_NO_PARALLEL_BRINGUP)) { + pr_info("Disabling parallel bringup due to CPU bugs\n"); + do_parallel_bringup = false; + } + snp_set_wakeup_secondary_cpu(); }