From patchwork Mon Oct 17 13:29:42 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Feng Tang
X-Patchwork-Id: 3461
From: Feng Tang
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H . Peter Anvin", Peter Zijlstra, x86@kernel.org,
 linux-kernel@vger.kernel.org
Cc: rui.zhang@intel.com, tim.c.chen@intel.com, liaoyu15@huawei.com,
 Feng Tang
Subject: [RFC PATCH] x86/tsc: use topology_max_packages() in tsc watchdog check
Date: Mon, 17 Oct 2022 21:29:42 +0800
Message-Id: <20221017132942.1646934-1-feng.tang@intel.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
on qualified platorms") was introduced to solve the problem that the
TSC clocksource is sometimes wrongly judged as unstable by watchdogs
such as 'jiffies', HPET, etc.

In it, the hardware socket count is a key factor in deciding whether
to disable the watchdog for TSC. 'nr_online_nodes' was chosen as an
estimate because the value is needed in the early boot phase, before
the 'tsc-early' clocksource is registered, when none of the non-boot
CPUs have been brought up yet.

In a recent patch review, Dave Hansen pointed out that there are many
cases where 'nr_online_nodes' can be wrong, such as:

* numa emulation (numa=fake=4 etc.)
* numa=off
* platforms with CPU+DRAM nodes, CPU-less HBM nodes, CPU-less
  persistent memory nodes.

Peter Zijlstra suggested using logical package ids, but they are only
usable after smp_init(), once all CPUs are initialized.

One solution is to skip the watchdog for the 'tsc-early' clocksource
and move the check to after smp_init() but before the 'tsc'
clocksource is registered, where topology_max_packages() can be used
as a much more accurate socket count.

Signed-off-by: Feng Tang
---
 arch/x86/kernel/tsc.c | 42 ++++++++++++++++--------------------------
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cafacb2e58cc..8dc7a0aeaf4d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1131,8 +1131,7 @@ static struct clocksource clocksource_tsc_early = {
 	.uncertainty_margin	= 32 * NSEC_PER_MSEC,
 	.read			= read_tsc,
 	.mask			= CLOCKSOURCE_MASK(64),
-	.flags			= CLOCK_SOURCE_IS_CONTINUOUS |
-				  CLOCK_SOURCE_MUST_VERIFY,
+	.flags			= CLOCK_SOURCE_IS_CONTINUOUS,
 	.vdso_clock_mode	= VDSO_CLOCKMODE_TSC,
 	.enable			= tsc_cs_enable,
 	.resume			= tsc_resume,
@@ -1180,12 +1179,6 @@ void mark_tsc_unstable(char *reason)
 
 EXPORT_SYMBOL_GPL(mark_tsc_unstable);
 
-static void __init tsc_disable_clocksource_watchdog(void)
-{
-	clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
-	clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
-}
-
 static void __init check_system_tsc_reliable(void)
 {
 #if defined(CONFIG_MGEODEGX1) || defined(CONFIG_MGEODE_LX) || defined(CONFIG_X86_GENERIC)
@@ -1202,23 +1195,6 @@ static void __init check_system_tsc_reliable(void)
 #endif
 	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
 		tsc_clocksource_reliable = 1;
-
-	/*
-	 * Disable the clocksource watchdog when the system has:
-	 *  - TSC running at constant frequency
-	 *  - TSC which does not stop in C-States
-	 *  - the TSC_ADJUST register which allows to detect even minimal
-	 *    modifications
-	 *  - not more than two sockets. As the number of sockets cannot be
-	 *    evaluated at the early boot stage where this has to be
-	 *    invoked, check the number of online memory nodes as a
-	 *    fallback solution which is an reasonable estimate.
-	 */
-	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
-	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
-	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
-	    nr_online_nodes <= 2)
-		tsc_disable_clocksource_watchdog();
 }
 
 /*
@@ -1413,6 +1389,20 @@ static int __init init_tsc_clocksource(void)
 	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
 		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
 
+	/*
+	 * Disable the clocksource watchdog when the system has:
+	 *  - TSC running at constant frequency
+	 *  - TSC which does not stop in C-States
+	 *  - the TSC_ADJUST register which allows to detect even minimal
+	 *    modifications
+	 *  - not more than two sockets.
+	 */
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
+	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
+	    topology_max_packages() <= 2)
+		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+
 	/*
 	 * When TSC frequency is known (retrieved via MSR or CPUID), we skip
 	 * the refined calibration and directly register it as a clocksource.
@@ -1547,7 +1537,7 @@ void __init tsc_init(void)
 	}
 
 	if (tsc_clocksource_reliable || no_tsc_watchdog)
-		tsc_disable_clocksource_watchdog();
+		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
 
 	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
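
Illustration only, not part of the patch: the difference between the
two socket estimates can be modelled in plain userspace C. This is a
rough sketch under stated assumptions. The struct, the function names
and the two example machines are hypothetical and do not exist in the
kernel; the node counts assume that numa=fake=4 yields four online
nodes and numa=off yields one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct tsc_platform {
	bool constant_tsc;	/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* models X86_FEATURE_TSC_ADJUST */
	int online_nodes;	/* models nr_online_nodes */
	int max_packages;	/* models topology_max_packages() */
};

static bool tsc_features_ok(const struct tsc_platform *p)
{
	return p->constant_tsc && p->nonstop_tsc && p->tsc_adjust;
}

/* Old check: socket count estimated from the online NUMA nodes. */
static bool skip_watchdog_by_nodes(const struct tsc_platform *p)
{
	return tsc_features_ok(p) && p->online_nodes <= 2;
}

/* New check: socket count taken from the package topology. */
static bool skip_watchdog_by_packages(const struct tsc_platform *p)
{
	return tsc_features_ok(p) && p->max_packages <= 2;
}

/* Two made-up machines matching cases from the commit message. */
static void model_selftest(void)
{
	/* Two sockets booted with numa=fake=4: four nodes, two packages. */
	struct tsc_platform fake = { true, true, true, 4, 2 };
	/* Four sockets booted with numa=off: one node, four packages. */
	struct tsc_platform off = { true, true, true, 1, 4 };

	assert(!skip_watchdog_by_nodes(&fake));	/* watchdog kept needlessly */
	assert(skip_watchdog_by_packages(&fake));
	assert(skip_watchdog_by_nodes(&off));	/* watchdog dropped wrongly */
	assert(!skip_watchdog_by_packages(&off));
}
```

In this model the node-based estimate errs in both directions, while
the package-based check follows the real socket count in both cases.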