From patchwork Wed Jan 25 00:27:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 47986
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, Yunying Sun, "Paul E. McKenney"
Subject: [PATCH v2 clocksource 1/7] clocksource: Print clocksource name when
 clocksource is tested unstable
Date: Tue, 24 Jan 2023 16:27:24 -0800
Message-Id: <20230125002730.1471349-1-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Yunying Sun

Some "TSC fall back to HPET" messages appear on systems having more than
2 NUMA nodes:

clocksource: timekeeping watchdog on CPU168: hpet read-back delay of 4296200ns, attempt 4, marking unstable

The "hpet" here is misleading because the clocksource watchdog is really
doing repeated reads of "hpet" in order to check for unrelated delays.
Therefore, print the name of the clocksource under test, prefixed by
"wd-" and suffixed by "-wd", for example, "wd-tsc-wd".

Signed-off-by: Yunying Sun
Signed-off-by: Paul E. McKenney
---
 kernel/time/clocksource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 9cf32ccda715d..4a2c3bb92e2e9 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -257,8 +257,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 			goto skip_test;
 	}

-	pr_warn("timekeeping watchdog on CPU%d: %s read-back delay of %lldns, attempt %d, marking unstable\n",
-		smp_processor_id(), watchdog->name, wd_delay, nretries);
+	pr_warn("timekeeping watchdog on CPU%d: wd-%s-wd read-back delay of %lldns, attempt %d, marking unstable\n",
+		smp_processor_id(), cs->name, wd_delay, nretries);
 	return WD_READ_UNSTABLE;

 skip_test:

From patchwork Wed Jan 25 00:27:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 47987
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, "Paul E. McKenney"
Subject: [PATCH v2 clocksource 2/7] clocksource: Loosen clocksource watchdog
 constraints
Date: Tue, 24 Jan 2023 16:27:25 -0800
Message-Id: <20230125002730.1471349-2-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, MAX_SKEW_USEC is set to 100 microseconds, which has worked
reasonably well. However, NTP is willing to tolerate 500 microseconds
of skew per second, and a clocksource that is good enough for NTP should
be good enough for the clocksource watchdog. The watchdog's skew is
controlled by MAX_SKEW_USEC and the CLOCKSOURCE_WATCHDOG_MAX_SKEW_US
Kconfig option.
However, these values are doubled before being associated
with a clocksource's ->uncertainty_margin, and the ->uncertainty_margin
values of the pair of clocksources being compared are summed before
checking against the skew.

Therefore, set both MAX_SKEW_USEC and the default for the
CLOCKSOURCE_WATCHDOG_MAX_SKEW_US Kconfig option to 125 microseconds of
skew per second, resulting in 500 microseconds of skew per second in
the clocksource watchdog's skew comparison.

Suggested-by: Rik van Riel
Signed-off-by: Paul E. McKenney
---
 kernel/time/Kconfig       |  6 +++++-
 kernel/time/clocksource.c | 15 +++++++++------
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index a41753be1a2bf..bae8f11070bef 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -200,10 +200,14 @@ config CLOCKSOURCE_WATCHDOG_MAX_SKEW_US
 	int "Clocksource watchdog maximum allowable skew (in μs)"
 	depends on CLOCKSOURCE_WATCHDOG
 	range 50 1000
-	default 100
+	default 125
 	help
 	  Specify the maximum amount of allowable watchdog skew in
 	  microseconds before reporting the clocksource to be unstable.
+	  The default is based on a half-second clocksource watchdog
+	  interval and NTP's maximum frequency drift of 500 parts
+	  per million. If the clocksource is good enough for NTP,
+	  it is good enough for the clocksource watchdog!

 endmenu
 endif
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 4a2c3bb92e2e9..a3d19f6660ac7 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -95,6 +95,11 @@ static char override_name[CS_NAME_LEN];
 static int finished_booting;
 static u64 suspend_start;

+/*
+ * Interval: 0.5sec.
+ */
+#define WATCHDOG_INTERVAL (HZ >> 1)
+
 /*
  * Threshold: 0.0312s, when doubled: 0.0625s.
  * Also a default for cs->uncertainty_margin when registering clocks.
@@ -106,11 +111,14 @@ static u64 suspend_start;
  * clocksource surrounding a read of the clocksource being validated.
  * This delay could be due to SMIs, NMIs, or to VCPU preemptions. Used as
  * a lower bound for cs->uncertainty_margin values when registering clocks.
+ *
+ * The default of 500 parts per million is based on NTP's limits.
+ * If a clocksource is good enough for NTP, it is good enough for us!
  */
 #ifdef CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US
 #define MAX_SKEW_USEC CONFIG_CLOCKSOURCE_WATCHDOG_MAX_SKEW_US
 #else
-#define MAX_SKEW_USEC 100
+#define MAX_SKEW_USEC (125 * WATCHDOG_INTERVAL / HZ)
 #endif

 #define WATCHDOG_MAX_SKEW (MAX_SKEW_USEC * NSEC_PER_USEC)
@@ -140,11 +148,6 @@ static inline void clocksource_watchdog_unlock(unsigned long *flags)
 static int clocksource_watchdog_kthread(void *data);
 static void __clocksource_change_rating(struct clocksource *cs, int rating);

-/*
- * Interval: 0.5sec.
- */
-#define WATCHDOG_INTERVAL (HZ >> 1)
-
 static void clocksource_watchdog_work(struct work_struct *work)
 {
 	/*

From patchwork Wed Jan 25 00:27:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 47982
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, "Paul E. McKenney", John Stultz
Subject: [PATCH v2 clocksource 3/7] clocksource: Improve read-back-delay
 message
Date: Tue, 24 Jan 2023 16:27:26 -0800
Message-Id: <20230125002730.1471349-3-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When cs_watchdog_read() is unable to get a qualifying clocksource read
within the limit set by max_cswd_read_retries, it prints a message
and marks the clocksource under test as unstable.
But that message is unclear to anyone unfamiliar with the code:

clocksource: timekeeping watchdog on CPU13: wd-tsc-wd read-back delay 1000614ns, attempt 3, marking unstable

Therefore, add some context so that the message appears as follows:

clocksource: timekeeping watchdog on CPU13: wd-tsc-wd excessive read-back delay of 1000614ns vs. limit of 125000ns, wd-wd read-back delay only 27ns, attempt 3, marking tsc unstable

Signed-off-by: Paul E. McKenney
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Feng Tang
---
 kernel/time/clocksource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index a3d19f6660ac7..b59914953809f 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -260,8 +260,8 @@ static enum wd_read_status cs_watchdog_read(struct clocksource *cs, u64 *csnow,
 			goto skip_test;
 	}

-	pr_warn("timekeeping watchdog on CPU%d: wd-%s-wd read-back delay of %lldns, attempt %d, marking unstable\n",
-		smp_processor_id(), cs->name, wd_delay, nretries);
+	pr_warn("timekeeping watchdog on CPU%d: wd-%s-wd excessive read-back delay of %lldns vs. limit of %ldns, wd-wd read-back delay only %lldns, attempt %d, marking %s unstable\n",
+		smp_processor_id(), cs->name, wd_delay, WATCHDOG_MAX_SKEW, wd_seq_delay, nretries, cs->name);
 	return WD_READ_UNSTABLE;

 skip_test:

From patchwork Wed Jan 25 00:27:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 47983
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, "Paul E. McKenney", John Stultz
Subject: [PATCH v2 clocksource 4/7] clocksource: Improve "skew is too large"
 messages
Date: Tue, 24 Jan 2023 16:27:27 -0800
Message-Id: <20230125002730.1471349-4-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When clocksource_watchdog() detects excessive clocksource skew compared
to the watchdog clocksource, it marks the clocksource under test as
unstable and prints several lines' worth of messages.
But that message is unclear to anyone unfamiliar with the code:

clocksource: timekeeping watchdog on CPU2: Marking clocksource 'wdtest-ktime' as unstable because the skew is too large:
clocksource: 'kvm-clock' wd_nsec: 400744390 wd_now: 612625c2c wd_last: 5fa7f7c66 mask: ffffffffffffffff
clocksource: 'wdtest-ktime' cs_nsec: 600744034 cs_now: 173081397a292d4f cs_last: 17308139565a8ced mask: ffffffffffffffff
clocksource: 'kvm-clock' (not 'wdtest-ktime') is current clocksource.

Therefore, add the following line near the end of that message:

Clocksource 'wdtest-ktime' skewed 199999644 ns (199 ms) over watchdog 'kvm-clock' interval of 400744390 ns (400 ms)

This new line clearly indicates the amount of skew between the two
clocksources, along with the duration of the time interval over which
the skew occurred, both in nanoseconds and milliseconds.

Signed-off-by: Paul E. McKenney
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Feng Tang
---
 kernel/time/clocksource.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index b59914953809f..fc486cd972635 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -446,12 +446,20 @@ static void clocksource_watchdog(struct timer_list *unused)
 		/* Check the deviation from the watchdog clocksource. */
 		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
 		if (abs(cs_nsec - wd_nsec) > md) {
+			u64 cs_wd_msec;
+			u64 wd_msec;
+			u32 wd_rem;
+
 			pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
 				smp_processor_id(), cs->name);
 			pr_warn(" '%s' wd_nsec: %lld wd_now: %llx wd_last: %llx mask: %llx\n",
 				watchdog->name, wd_nsec, wdnow, wdlast, watchdog->mask);
 			pr_warn(" '%s' cs_nsec: %lld cs_now: %llx cs_last: %llx mask: %llx\n",
 				cs->name, cs_nsec, csnow, cslast, cs->mask);
+			cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
+			wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
+			pr_warn(" Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
+				cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
 			if (curr_clocksource == cs)
 				pr_warn(" '%s' is current clocksource.\n", cs->name);
 			else if (curr_clocksource)

From patchwork Wed Jan 25 00:27:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 47984
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
 corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com,
 neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com,
 zhengjun.xing@intel.com, Waiman Long, John Stultz, "Paul E. McKenney"
Subject: [PATCH v2 clocksource 5/7] clocksource: Suspend the watchdog
 temporarily when high read latency detected
Date: Tue, 24 Jan 2023 16:27:28 -0800
Message-Id: <20230125002730.1471349-5-paulmck@kernel.org>
X-Mailer: git-send-email 2.31.1.189.g2e36527f23
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Feng Tang

Bugs have been reported on 8-socket x86 machines in which the TSC was
wrongly disabled when the system was under heavy workload.

[ 818.380354] clocksource: timekeeping watchdog on CPU336: hpet wd-wd read-back delay of 1203520ns
[ 818.436160] clocksource: wd-tsc-wd read-back delay of 181880ns, clock-skew test skipped!
[ 819.402962] clocksource: timekeeping watchdog on CPU338: hpet wd-wd read-back delay of 324000ns
[ 819.448036] clocksource: wd-tsc-wd read-back delay of 337240ns, clock-skew test skipped!
[ 819.880863] clocksource: timekeeping watchdog on CPU339: hpet read-back delay of 150280ns, attempt 3, marking unstable
[ 819.936243] tsc: Marking TSC unstable due to clocksource watchdog
[ 820.068173] TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
[ 820.092382] sched_clock: Marking unstable (818769414384, 1195404998)
[ 820.643627] clocksource: Checking clocksource tsc synchronization from CPU 267 to CPUs 0,4,25,70,126,430,557,564.
[ 821.067990] clocksource: Switched to clocksource hpet

This can be reproduced by running memory-intensive 'stream' tests, or some of the stress-ng subcases such as 'ioport'.

The reason for these issues is that when the system is under heavy load, the read latency of the clocksources can be very high. Even lightweight TSC reads can show high latencies, and latencies are much worse for external clocksources such as HPET or the ACPI PM timer. These latencies can result in false-positive clocksource-unstable determinations.

These issues were initially reported by a customer running on a production system, and this problem was reproduced on several generations of Xeon servers, especially when running the stress-ng test. These Xeon servers were not production systems, but they did have the latest steppings and firmware.

Given that the clocksource watchdog is a continual diagnostic check with a frequency of twice a second, there is no need to rush it when the system is under heavy load. Therefore, when high clocksource read latencies are detected, suspend the watchdog timer for 5 minutes.

Signed-off-by: Feng Tang
Acked-by: Waiman Long
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Stephen Boyd
Cc: Feng Tang
Signed-off-by: Paul E.
McKenney --- kernel/time/clocksource.c | 45 ++++++++++++++++++++++++++++----------- 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index fc486cd972635..91836b727cef5 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -387,6 +387,15 @@ void clocksource_verify_percpu(struct clocksource *cs) } EXPORT_SYMBOL_GPL(clocksource_verify_percpu); +static inline void clocksource_reset_watchdog(void) +{ + struct clocksource *cs; + + list_for_each_entry(cs, &watchdog_list, wd_list) + cs->flags &= ~CLOCK_SOURCE_WATCHDOG; +} + + static void clocksource_watchdog(struct timer_list *unused) { u64 csnow, wdnow, cslast, wdlast, delta; @@ -394,6 +403,7 @@ static void clocksource_watchdog(struct timer_list *unused) int64_t wd_nsec, cs_nsec; struct clocksource *cs; enum wd_read_status read_ret; + unsigned long extra_wait = 0; u32 md; spin_lock(&watchdog_lock); @@ -413,13 +423,30 @@ static void clocksource_watchdog(struct timer_list *unused) read_ret = cs_watchdog_read(cs, &csnow, &wdnow); - if (read_ret != WD_READ_SUCCESS) { - if (read_ret == WD_READ_UNSTABLE) - /* Clock readout unreliable, so give it up. */ - __clocksource_unstable(cs); + if (read_ret == WD_READ_UNSTABLE) { + /* Clock readout unreliable, so give it up. */ + __clocksource_unstable(cs); continue; } + /* + * When WD_READ_SKIP is returned, it means the system is likely + * under very heavy load, where the latency of reading + * watchdog/clocksource is very big, and affect the accuracy of + * watchdog check. So give system some space and suspend the + * watchdog check for 5 minutes. + */ + if (read_ret == WD_READ_SKIP) { + /* + * As the watchdog timer will be suspended, and + * cs->last could keep unchanged for 5 minutes, reset + * the counters. + */ + clocksource_reset_watchdog(); + extra_wait = HZ * 300; + break; + } + /* Clocksource initialized ? 
*/ if (!(cs->flags & CLOCK_SOURCE_WATCHDOG) || atomic_read(&watchdog_reset_pending)) { @@ -523,7 +550,7 @@ static void clocksource_watchdog(struct timer_list *unused) * pair clocksource_stop_watchdog() clocksource_start_watchdog(). */ if (!timer_pending(&watchdog_timer)) { - watchdog_timer.expires += WATCHDOG_INTERVAL; + watchdog_timer.expires += WATCHDOG_INTERVAL + extra_wait; add_timer_on(&watchdog_timer, next_cpu); } out: @@ -548,14 +575,6 @@ static inline void clocksource_stop_watchdog(void) watchdog_running = 0; } -static inline void clocksource_reset_watchdog(void) -{ - struct clocksource *cs; - - list_for_each_entry(cs, &watchdog_list, wd_list) - cs->flags &= ~CLOCK_SOURCE_WATCHDOG; -} - static void clocksource_resume_watchdog(void) { atomic_inc(&watchdog_reset_pending); From patchwork Wed Jan 25 00:27:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 47985 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp12411wrn; Tue, 24 Jan 2023 16:30:26 -0800 (PST) X-Google-Smtp-Source: AMrXdXt8f8EPtpB1QHjR1weQKb3s/LfSeQr3hpU4qVr6o8/6fZ4JUiq7YJ7e3BkWUGqKCYViY/59 X-Received: by 2002:a17:907:2129:b0:86f:ed46:c07f with SMTP id qo9-20020a170907212900b0086fed46c07fmr30587913ejb.75.1674606626396; Tue, 24 Jan 2023 16:30:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674606626; cv=none; d=google.com; s=arc-20160816; b=l7kqnGkkyhMhqKQ4lGlfv0Np3WXlP8shOt2NJ0UwbdcIUDVrKeTyHxDkJKAy+HrKGX roXyipW/yDXcdasATM1OGSCsa3fg/nCGRa8T5f2ofvjTISPi+5Se99Q31+M6FmCtKJiR 9GI5/658ls85/hPI/ywkm+2wQanxPElZrso0SY/kgS7beJOip3KpsPedyMjPZo2IEvZB zYm5GCsy/KICNbfbhSXtLPCFm4+uWzs1qTavoVkajhg6Ttr1DiN73eSMvfOzu6pmnUph 33eL8nLIycTnwHCnJ17R8Piw3GNun4TMAVKZple0UXWcVZCcq4Eyx97RRK8P/Bsuhlck ysHA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version 
:references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=25mF64haFbH/+4jnakI7eErLPLKssRORE9tlub0E9JE=; b=OOrv++tzcZu5OYgm1Ob61fpvMAzS0mqXrjlXIp3srZNhVH6vaLgRuYwoNqYaGKGtX6 /Js7lkJgL8uhOxCLMwDVl5UVB03Kl9W8ulYeEKRqg+ciDpOTJhwmJ4bInnZ3GLpgEzVV x1qnCZ/7QlfGqkpSYNNz7Bk+DsvT0Xfh2FER1CVBatlOcAE0/wUl0TYO7nkhQ7XxeFIV q59AsLLI2TxmVTPzWHmqPGLNmMmUM56cGYevFK70ucl3mhagZlgD9yQjB/1Ytt/4XwKy 8RP0fob5gzy5ARUGZuwzSgU3h4siUzxNPdhOip27KBLa/19eKkOj9YMhMiRnblygkbzv ysUw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=P890L5dM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gj38-20020a170907742600b00877da429e45si3599851ejc.922.2023.01.24.16.29.58; Tue, 24 Jan 2023 16:30:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=P890L5dM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234870AbjAYA2p (ORCPT + 99 others); Tue, 24 Jan 2023 19:28:45 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234845AbjAYA2m (ORCPT ); Tue, 24 Jan 2023 19:28:42 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net 
(Postfix) with ESMTPS id 76E5D65AF for ; Tue, 24 Jan 2023 16:28:11 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 089CF61425 for ; Wed, 25 Jan 2023 00:27:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 20CB2C43321; Wed, 25 Jan 2023 00:27:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674606453; bh=2niA6R2oYPuWrzIU0B6wVlNq4jSpbdluB2kfj2HTDEE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=P890L5dMPd0z14y217P4NJAsrRGhnNTPWJb+uAeG8aIx0xrLSURfo8KclMKBjrr/j QVpNEjrbBNAWCWVdYlzAXYuyhgIZfGFxc+VZ7iLzq+DDd/ty5W/VEE9aR3TGBwy3Eo TG2lF20+d1MZzl3woOiq2XdeWEo6b3P9ZzJJt7qrfMRvL2BnvkRHToVIZHpizGtF9S rgmcbOO7HCBGo4p7/AWWMq39sHof8LIUrc6OIw0LMExGdS8BDKMchdWlOmTU6QWUH3 RNgxElyg24b5zVZwYzBOvMfj2tGgxSmqPyjPDM0KYnRtYiFHI1mhuq2E/Mh88XW80N c5h9N7Nc69zBw== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7B5EF5C1CF4; Tue, 24 Jan 2023 16:27:32 -0800 (PST) From: "Paul E. McKenney" To: tglx@linutronix.de Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com, zhengjun.xing@intel.com, "Paul E. McKenney" , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. 
Peter Anvin" , Daniel Lezcano , Waiman Long , x86@kernel.org Subject: [PATCH v2 clocksource 6/7] clocksource: Verify HPET and PMTMR when TSC unverified Date: Tue, 24 Jan 2023 16:27:29 -0800 Message-Id: <20230125002730.1471349-6-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1> References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1755952317852444720?= X-GMAIL-MSGID: =?utf-8?q?1755952317852444720?= On systems with two or fewer sockets, when the boot CPU has CONSTANT_TSC, NONSTOP_TSC, and TSC_ADJUST, clocksource watchdog verification of the TSC is disabled. This works well much of the time, but there is the occasional production-level system that meets all of these criteria, but which still has a TSC that skews significantly from atomic-clock time. This is usually attributed to a firmware or hardware fault. Yes, the various NTP daemons do express their opinions of userspace-to-atomic-clock time skew, but they put them in various places, depending on the daemon and distro in question. It would therefore be good for the kernel to have some clue that there is a problem. The old behavior of marking the TSC unstable is a non-starter because a great many workloads simply cannot tolerate the overheads and latencies of the various non-TSC clocksources. In addition, NTP-corrected systems sometimes can tolerate significant kernel-space time skew as long as the userspace time sources are within epsilon of atomic-clock time. 
Therefore, when watchdog verification of TSC is disabled, enable it for HPET and PMTMR (AKA ACPI PM timer). This provides the needed in-kernel time-skew diagnostic without degrading the system's performance. Signed-off-by: Paul E. McKenney Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: Dave Hansen Cc: "H. Peter Anvin" Cc: Daniel Lezcano Cc: Waiman Long Cc: Tested-by: Feng Tang --- arch/x86/include/asm/time.h | 1 + arch/x86/kernel/hpet.c | 2 ++ arch/x86/kernel/tsc.c | 5 +++++ drivers/clocksource/acpi_pm.c | 6 ++++-- 4 files changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/time.h b/arch/x86/include/asm/time.h index 8ac563abb567b..a53961c64a567 100644 --- a/arch/x86/include/asm/time.h +++ b/arch/x86/include/asm/time.h @@ -8,6 +8,7 @@ extern void hpet_time_init(void); extern void time_init(void); extern bool pit_timer_init(void); +extern bool tsc_clocksource_watchdog_disabled(void); extern struct clock_event_device *global_clock_event; diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c index 71f336425e58a..c8eb1ac5125ab 100644 --- a/arch/x86/kernel/hpet.c +++ b/arch/x86/kernel/hpet.c @@ -1091,6 +1091,8 @@ int __init hpet_enable(void) if (!hpet_counting()) goto out_nohpet; + if (tsc_clocksource_watchdog_disabled()) + clocksource_hpet.flags |= CLOCK_SOURCE_MUST_VERIFY; clocksource_register_hz(&clocksource_hpet, (u32)hpet_freq); if (id & HPET_ID_LEGSUP) { diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index a78e73da4a74b..af3782fb6200c 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -1186,6 +1186,11 @@ static void __init tsc_disable_clocksource_watchdog(void) clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY; } +bool tsc_clocksource_watchdog_disabled(void) +{ + return !(clocksource_tsc.flags & CLOCK_SOURCE_MUST_VERIFY); +} + static void __init check_system_tsc_reliable(void) { #if defined(CONFIG_MGEODEGX1) || defined(CONFIG_MGEODE_LX) || defined(CONFIG_X86_GENERIC) diff --git 
a/drivers/clocksource/acpi_pm.c b/drivers/clocksource/acpi_pm.c index 279ddff81ab49..82338773602ca 100644 --- a/drivers/clocksource/acpi_pm.c +++ b/drivers/clocksource/acpi_pm.c @@ -23,6 +23,7 @@ #include #include #include +#include /* * The I/O port the PMTMR resides at. @@ -210,8 +211,9 @@ static int __init init_acpi_pm_clocksource(void) return -ENODEV; } - return clocksource_register_hz(&clocksource_acpi_pm, - PMTMR_TICKS_PER_SEC); + if (tsc_clocksource_watchdog_disabled()) + clocksource_acpi_pm.flags |= CLOCK_SOURCE_MUST_VERIFY; + return clocksource_register_hz(&clocksource_acpi_pm, PMTMR_TICKS_PER_SEC); } /* We use fs_initcall because we want the PCI fixups to have run From patchwork Wed Jan 25 00:27:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paul E. McKenney" X-Patchwork-Id: 47981 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp11858wrn; Tue, 24 Jan 2023 16:29:07 -0800 (PST) X-Google-Smtp-Source: AMrXdXumv+DqlhREaZSjzAXruVbS4m8ky6Sb5Sku3JZehjskOkvr4UespxUlGHagM/Ipt1Q2Fupq X-Received: by 2002:aa7:c150:0:b0:49e:5902:398d with SMTP id r16-20020aa7c150000000b0049e5902398dmr27012143edp.34.1674606547204; Tue, 24 Jan 2023 16:29:07 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674606547; cv=none; d=google.com; s=arc-20160816; b=hlPfgdflePnWb4DMqpY0jxtsvsgNVlBbU545UBo7pGwoTIwRv18dixlvB1PPpIsSAI 97hg7oNcxkp0znBzOhq73puyKdGKyjt8kzMxPBjNEc2U5w5cg4pX/a+pMwi8IFD8Ssfx 7bzZlY61rFglQqqsI3gE2zvGB60O5Nb0kgFVS8qhHJxg5VLXyJ/bkCldrVEkaDE3Tnnj O6yYB7ZfD4wXS+DQiaiT8CmBwh1tl2U4hAYwovyHESbr+DAqZZFFYX/o5DcR/DOOUtYx tGVVPKdBUP8inebu+j/n2V7Wj7WG4Nktl1rIkZf73SflBQSXhiavnP7qI/oIYXnhiL40 acvA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=DxdMq+d0Zmz/tzCnagp6o0mjIgCj93IfyKmJqumhEyY=; b=0k4dcbNPzlLwmnvQoe47w7BIlaazg1o0kEVT4wE9kk5900Tinf/XMsHtjPfDPb5tL9 6/FoSV+dkvBgbnT/Mczp47i11PgTpLw4ZN1EC0pgZaXKmI1JDd3tzwtY38MGKwfHhYMr usRUfJ1YgXIrXWvVUHTvm1cFGTOydzzgZE5F4bOG6aO9vWeBJm9g2eOM1U8yGMA1sqGo FgrI1aB6UOprU7Lb2IplKBVpBbq3ie3wkLsBciDVt+oRImGjbqEZh+gKjC8L8tX0D96K CzGCIHPYQYejQecAEIQyX+vzwwHrTHJxBGz7BZynZlGVcLPwRJVaYFx4dmRVpvyKoIhz EsPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=GsY2fk5z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b14-20020aa7df8e000000b004a0119daf48si3968269edy.262.2023.01.24.16.28.42; Tue, 24 Jan 2023 16:29:07 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=GsY2fk5z; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234775AbjAYA2O (ORCPT + 99 others); Tue, 24 Jan 2023 19:28:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234745AbjAYA2N (ORCPT ); Tue, 24 Jan 2023 19:28:13 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0C719518E0; Tue, 24 Jan 2023 16:27:39 
-0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id F33146141D; Wed, 25 Jan 2023 00:27:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2088BC43442; Wed, 25 Jan 2023 00:27:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674606453; bh=Hf6D8jaVh9N46w1G0EEZbriQYqxBI7NIDla18BtPWI4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GsY2fk5zT8DTcq3AwyfRvEy78M9JjdbFUe8YMnBV7hXK7/RIi5I8ZkMfNVaxnQfoi uT6pA7E7X2eN+KlCKaWkSG8pvviYwHmm2/ji9DqcUamRKwt0h/IKwmy1NPPekEvYXX i5JxsWEH6816qagZnEbQCN2dWOeNM+y87ChgdTEqHuOW10rYUruZxvIQu0x53ogep+ lcv7lypzWme4ZHo+TO/GX0SbaBrXR5srcqayHkvhPLyyLRt88zs3m+Dldxnzz+UEMn SIiAZLAhaqRx0i14Tfho4gcsTgBv0Bj7l7vAfSKoZoWJ4yoYGd4svMwYpbc9gTStQc FjFFY0BU+NGQg== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 7D74F5C1D0D; Tue, 24 Jan 2023 16:27:32 -0800 (PST) From: "Paul E. McKenney" To: tglx@linutronix.de Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com, zhengjun.xing@intel.com, Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" , x86@kernel.org, linux-doc@vger.kernel.org, "Paul E . 
McKenney" Subject: [PATCH v2 clocksource 7/7] x86/tsc: Add option to force frequency recalibration with HW timer Date: Tue, 24 Jan 2023 16:27:30 -0800 Message-Id: <20230125002730.1471349-7-paulmck@kernel.org> X-Mailer: git-send-email 2.31.1.189.g2e36527f23 In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1> References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1755952235000637797?= X-GMAIL-MSGID: =?utf-8?q?1755952235000637797?=

From: Feng Tang

The kernel assumes that the TSC frequency provided by the hardware / firmware via MSRs or CPUID(0x15) is correct after applying a few basic consistency checks. This disables the TSC recalibration against HPET or PM timer.

As a result, there is no mechanism to validate that frequency in cases where a firmware or hardware defect is suspected. In one case, a user measured the TSC frequency against an atomic clock and reported an inaccuracy, which was later fixed in firmware.

Add a 'recalibrate' option to the 'tsc' kernel parameter to force TSC frequency recalibration against the HPET or PM timer, and warn if the deviation from the previous value is more than about 500 PPM, which provides a way to verify the data from hardware / firmware.

There is no functional change to the existing workflow.

Recently there was a real-world case: "The 40ms/s divergence between TSC and HPET was observed on hardware that is quite recent" [1]. On that platform, the TSC frequency of 1896 MHz was obtained from CPUID(0x15), while forced recalibration against both HPET and the PM timer yielded 1975 MHz, which also matched the check from the 'chronyd' software, indicating a BIOS or firmware problem.

[Thanks tglx for helping improve the commit log]
[ paulmck: Wordsmith Kconfig help text. ]

[1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/

Signed-off-by: Feng Tang
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Jonathan Corbet
Cc:
Cc:
Signed-off-by: Paul E. McKenney
--- .../admin-guide/kernel-parameters.txt | 4 +++ arch/x86/kernel/tsc.c | 34 ++++++++++++++++--- 2 files changed, 34 insertions(+), 4 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 6cfa6e3996cf7..95f0d104c2322 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -6369,6 +6369,10 @@ in situations with strict latency requirements (where interruptions from clocksource watchdog are not acceptable). + [x86] recalibrate: force recalibration against a HW timer + (HPET or PM timer) on systems whose TSC frequency was + obtained from HW or FW using either an MSR or CPUID(0x15). + Warn if the difference is more than 500 ppm. tsc_early_khz= [X86] Skip early TSC calibration and use the given value instead.
Useful when the early TSC frequency discovery diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index af3782fb6200c..a5371c6d4b64b 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -48,6 +48,8 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc); int tsc_clocksource_reliable; +static int __read_mostly tsc_force_recalibrate; + static u32 art_to_tsc_numerator; static u32 art_to_tsc_denominator; static u64 art_to_tsc_offset; @@ -303,6 +305,8 @@ static int __init tsc_setup(char *str) mark_tsc_unstable("boot parameter"); if (!strcmp(str, "nowatchdog")) no_tsc_watchdog = 1; + if (!strcmp(str, "recalibrate")) + tsc_force_recalibrate = 1; return 1; } @@ -1379,6 +1383,25 @@ static void tsc_refine_calibration_work(struct work_struct *work) else freq = calc_pmtimer_ref(delta, ref_start, ref_stop); + /* Will hit this only if tsc_force_recalibrate has been set */ + if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) { + + /* Warn if the deviation exceeds 500 ppm */ + if (abs(tsc_khz - freq) > (tsc_khz >> 11)) { + pr_warn("Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!\n"); + pr_info("Previous calibrated TSC freq:\t %lu.%03lu MHz\n", + (unsigned long)tsc_khz / 1000, + (unsigned long)tsc_khz % 1000); + } + + pr_info("TSC freq recalibrated by [%s]:\t %lu.%03lu MHz\n", + hpet ? 
"HPET" : "PM_TIMER", + (unsigned long)freq / 1000, + (unsigned long)freq % 1000); + + return; + } + /* Make sure we're within 1% */ if (abs(tsc_khz - freq) > tsc_khz/100) goto out; @@ -1412,8 +1435,10 @@ static int __init init_tsc_clocksource(void) if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz) return 0; - if (tsc_unstable) - goto unreg; + if (tsc_unstable) { + clocksource_unregister(&clocksource_tsc_early); + return 0; + } if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3)) clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP; @@ -1426,9 +1451,10 @@ static int __init init_tsc_clocksource(void) if (boot_cpu_has(X86_FEATURE_ART)) art_related_clocksource = &clocksource_tsc; clocksource_register_khz(&clocksource_tsc, tsc_khz); -unreg: clocksource_unregister(&clocksource_tsc_early); - return 0; + + if (!tsc_force_recalibrate) + return 0; } schedule_delayed_work(&tsc_irqwork, 0);