From patchwork Fri Feb 17 16:37:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Zhang, Rui" X-Patchwork-Id: 58715 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp993060wrn; Fri, 17 Feb 2023 08:51:13 -0800 (PST) X-Google-Smtp-Source: AK7set/ankNK9gEIBU8+ej7p5X+cVt7PT4JdngeydyeaVx2K5omxlSHTCbrGwWoD0cXZMa9DFJUL X-Received: by 2002:a17:906:10d8:b0:8b1:2dd3:cb45 with SMTP id v24-20020a17090610d800b008b12dd3cb45mr240991ejv.42.1676652673092; Fri, 17 Feb 2023 08:51:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1676652673; cv=none; d=google.com; s=arc-20160816; b=Z7Y1fCEAF39BYui+UzyOqK3Uw3pRboeS4JoyCoL53vAABL+vQ04qxPvrey8+ieL0bB rK1z4PkyKQ51p285xelm11fuiaLLgI4efhh4kwfIxWobEonZ0Iq7GAk9drgP487pJWX9 NhjPzuJA2XmvmqUvwakOL+eGF8mPKUcnL4yrTBFbeounFIBwB5ziAYWxQ+DGFyDeA3ML /qCjK0/mUsVR9CKVKI4KJFSOVMZFjakhAUImf/6QthyoOAt3AiXhVDtICZHpl9Vs/kYo mHRdPLQmvdRJVX4NEtqobXXAOWDcwntSlZaAoEkH7pL/1iskXG3KhGsKlGZetnfhs71Z XaUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ECKV+10S9lajrOXL645+G88N9ApUxS/N/wBcFvRJtsY=; b=Klf9QGXY4Uekg0BMEMHc9Fcul6sTsvKfjtVzQk1Yt2S/RI4otMUmhppO/+pzpRIfMN mM2K6ZaUc2TmBACkcdBxRaqbc1TIPdczGZbnVlnRhIn1Ah8S/Vkzh69fz1T++iiYSL1Y gjzCh00CL+WhaHzCaxeyl2WtLQAUwZ6WBRsyY2u2TOzrl007JoujmFcbLI+eyADu6Q0D vvGA83NLEHlqe+qW2spC6eLFQZM4JoLMok//pCmvRtbNH8+5TSJGhG37t9wCdTp+jAAm lIztr7OAXmLNyuCFiJl5PAnHsr91QtNZqrpL2Fwu9348q+YuZoOvUZIWevjPU6Zcqrsp T+4A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BNQyYA0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ev23-20020a17090729d700b008b1598f6a24si5048853ejc.518.2023.02.17.08.50.49; Fri, 17 Feb 2023 08:51:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BNQyYA0p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229991AbjBQQhr (ORCPT + 99 others); Fri, 17 Feb 2023 11:37:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230010AbjBQQhn (ORCPT ); Fri, 17 Feb 2023 11:37:43 -0500 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3F6A718ED for ; Fri, 17 Feb 2023 08:37:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1676651862; x=1708187862; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ble9Qvx//Q2xEWAbJvNGkPJLN/eT48lfTbbNkwkxdrU=; b=BNQyYA0pmDNNXNEUNl3lzKBbgrBts8tzwYAp1H9OR2rpjtv/+9xOb/2K RGhBbp1tgwGXDqFiPl1QtU71esBr+kERqFAGKKmDSg6Gh0PsCjo1JJWaL ttHca0OV2ztpbHU7BdVSpgxtffNP59BTvHBZZmN/yFmf6k8rcA0bDH9sJ k3yz34M7Xav5HdyWTfS1fgmG5riOTgEG6IdT83sMPtCZK9S0C1f5DdYHo VrN1rDiyxmyeMZaHebA5tnj4NsNCWoqhNdEBfuCol1EQVM/yvqqD94Hy4 cGxeIF41IRnDJwyVM006vMSdcxY3pMmSvU5on5l+SW/jzGjBwnFqYCfzF w==; X-IronPort-AV: E=McAfee;i="6500,9779,10624"; a="311655943" X-IronPort-AV: E=Sophos;i="5.97,306,1669104000"; d="scan'208";a="311655943" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Feb 2023 08:37:42 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10624"; a="670598347" X-IronPort-AV: E=Sophos;i="5.97,306,1669104000"; d="scan'208";a="670598347" Received: from hany-desk.ccr.corp.intel.com (HELO rzhang1-DESK.intel.com) ([10.254.214.18]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Feb 2023 08:37:39 -0800 From: Zhang Rui To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com Cc: x86@kernel.org, linux-kernel@vger.kernel.org, zhang.jia@linux.alibaba.com, len.brown@intel.com, rui.zhang@intel.com Subject: [PATCH 1/1] x86/topology: fix erroneous smp_num_siblings on Intel Hybrid platform Date: Sat, 18 Feb 2023 00:37:24 +0800 Message-Id: <20230217163724.581513-2-rui.zhang@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230217163724.581513-1-rui.zhang@intel.com> References: <20230217163724.581513-1-rui.zhang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758097753786096447?= X-GMAIL-MSGID: =?utf-8?q?1758097753786096447?= The SMT siblings value returned by CPUID.1F SMT level EBX differs among CPUs on Intel Hybrid platforms like AlderLake and MeteorLake. It returns 2 for Pcore CPUs which have SMT siblings and returns 1 for Ecore CPUs which do not have SMT siblings. Today, the CPU boot code sets the global variable smp_num_siblings when every CPU thread is brought up. The last thread to boot will overwrite it with the number of siblings of *that* thread. That last thread to boot will "win". If the thread is a Pcore, smp_num_siblings == 2. If it is an Ecore, smp_num_siblings == 1. smp_num_siblings describes if the *system* supports SMT. It should specify the maximum number of SMT threads among all cores. On AlderLake-P/S platforms, it does not cause any functional issues so far. But on MeteorLake-P platform, when probing an Ecore CPU, a). smp_num_siblings varies like AlderLake and it is set to 1 for Ecore. b). x86_max_cores is totally broken and it is set to 1 for the boot cpu. Altogether, these two issues make the system being treated as an UP system in set_cpu_sibling_map() when probing Ecore CPUs, and the Ecore CPUs are not updated in any cpu sibling maps erroneously. Below shows part of the CPU topology information before and after the fix, for both Pcore and Ecore CPU (cpu0 is Pcore, cpu 12 is Ecore). ... -/sys/devices/system/cpu/cpu0/topology/package_cpus:000fff -/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-11 +/sys/devices/system/cpu/cpu0/topology/package_cpus:3fffff +/sys/devices/system/cpu/cpu0/topology/package_cpus_list:0-21 ... -/sys/devices/system/cpu/cpu12/topology/package_cpus:001000 -/sys/devices/system/cpu/cpu12/topology/package_cpus_list:12 +/sys/devices/system/cpu/cpu12/topology/package_cpus:3fffff +/sys/devices/system/cpu/cpu12/topology/package_cpus_list:0-21 And this also breaks userspace tools like lscpu -Core(s) per socket: 1 -Socket(s): 11 +Core(s) per socket: 16 +Socket(s): 1 To fix the first issue, ensure that smp_num_siblings represents the system-wide maximum number of siblings by always increasing its value. Never allow it to decrease. Note that this fix is sufficient to make set_cpu_sibling_map() work correctly. And how to fix the bogus cpuinfo_x86.x86_max_cores will be addressed separately. CC: stable@kernel.org Suggested-by: Len Brown Signed-off-by: Zhang Rui --- arch/x86/kernel/cpu/topology.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/topology.c b/arch/x86/kernel/cpu/topology.c index 5e868b62a7c4..0270925fe013 100644 --- a/arch/x86/kernel/cpu/topology.c +++ b/arch/x86/kernel/cpu/topology.c @@ -79,7 +79,7 @@ int detect_extended_topology_early(struct cpuinfo_x86 *c) * initial apic id, which also represents 32-bit extended x2apic id. */ c->initial_apicid = edx; - smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx); + smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx)); #endif return 0; } @@ -109,7 +109,8 @@ int detect_extended_topology(struct cpuinfo_x86 *c) */ cpuid_count(leaf, SMT_LEVEL, &eax, &ebx, &ecx, &edx); c->initial_apicid = edx; - core_level_siblings = smp_num_siblings = LEVEL_MAX_SIBLINGS(ebx); + core_level_siblings = LEVEL_MAX_SIBLINGS(ebx); + smp_num_siblings = max_t(int, smp_num_siblings, LEVEL_MAX_SIBLINGS(ebx)); core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax); die_level_siblings = LEVEL_MAX_SIBLINGS(ebx); pkg_mask_width = die_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);