[GIT,PULL,v2,clocksource] Clocksource watchdog commits for v6.3

Message ID 20230210193640.GA3325193@paulmck-ThinkPad-P17-Gen-1

Message

Paul E. McKenney Feb. 10, 2023, 7:36 p.m. UTC
Hello, Thomas,

The following changes since commit 1b929c02afd37871d5afb9d498426f83432e71c2:

  Linux 6.2-rc1 (2022-12-25 13:41:39 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git tags/clocksource.2023.02.06b

for you to fetch changes up to 0051293c533017e2a860e0a0a33517bc40240fff:

  clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested (2023-02-06 16:38:30 -0800)

This adds commit 0051293c5330 ("clocksource: Enable TSC watchdog checking
of HPET and PMTMR only when requested") to the previous pull request as
discussed here:

https://lore.kernel.org/lkml/20230131012440.GA1251465@paulmck-ThinkPad-P17-Gen-1/

----------------------------------------------------------------
Clocksource watchdog commits for v6.3

This pull request contains the following:

o	Improvements to clocksource-watchdog console messages.

o	Loosening of the clocksource-watchdog skew criteria to match
	those of NTP (500 parts per million, relaxed from 400 parts
	per million).  If it is good enough for NTP, it is good enough
	for the clocksource watchdog.

o	Suspend clocksource-watchdog checking temporarily when high
	memory latencies are detected.  This avoids the false-positive
	clock-skew events that have been seen on production systems
	running memory-intensive workloads.

o	On systems where the TSC is deemed trustworthy, use it as the
	watchdog timesource, but only when specifically requested using
	the tsc=watchdog kernel boot parameter.  This permits clock-skew
	events to be detected, but avoids forcing workloads to use the
	slow HPET and ACPI PM timers.  On the one hand, these last two
	timers are slow enough to cause systems to be needlessly marked
	bad; on the other, real skew does sometimes happen on production
	systems running production workloads.  And sometimes it is
	the fault of the TSC, or at least of the firmware that told the
	kernel to program the TSC with the wrong frequency.

o	Add a tsc=revalidate kernel boot parameter to allow the kernel
	to diagnose cases where the TSC hardware works fine, but was told
	by firmware to tick at the wrong frequency.  Such cases are rare,
	but they really have happened on production systems.
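
As a rough illustration of the skew criterion and the latency-based
suspension described above, here is a minimal user-space sketch in C.
It does not reproduce the code in kernel/time/clocksource.c: the names
skew_ppm(), check_interval(), MAX_SKEW_PPM, and the max_readback_ns
parameter are hypothetical, and only the 500 parts-per-million figure
is taken from this message.  The idea is to compare the interval
measured by the clocksource under test against the interval measured
by the watchdog, and to skip the comparison entirely when reading the
clocks took too long for the result to be trusted.

```c
#include <stdint.h>

/* Illustrative threshold from the text: 500 parts per million. */
#define MAX_SKEW_PPM 500

enum check_result { CHECK_OK, CHECK_SKEWED, CHECK_SKIPPED };

/*
 * Absolute difference between the interval seen by the clocksource
 * under test (cs_ns) and the one seen by the watchdog (wd_ns), scaled
 * to parts per million of the watchdog interval.  Both arguments are
 * in nanoseconds.  A zero watchdog interval yields UINT64_MAX.
 * Intended for short intervals: the multiplication can overflow if
 * the difference exceeds roughly five hours.
 */
static uint64_t skew_ppm(uint64_t cs_ns, uint64_t wd_ns)
{
	uint64_t delta;

	if (wd_ns == 0)
		return UINT64_MAX;
	delta = cs_ns > wd_ns ? cs_ns - wd_ns : wd_ns - cs_ns;
	return delta * 1000000ULL / wd_ns;
}

/*
 * Classify one watchdog check.  readback_ns is the time that elapsed
 * between two back-to-back watchdog reads bracketing the clocksource
 * read.  If that delay exceeds max_readback_ns (an assumed tunable),
 * the sample is discarded rather than reported as skew.
 */
static enum check_result check_interval(uint64_t cs_ns, uint64_t wd_ns,
					uint64_t readback_ns,
					uint64_t max_readback_ns)
{
	if (readback_ns > max_readback_ns)
		return CHECK_SKIPPED;
	if (skew_ppm(cs_ns, wd_ns) > MAX_SKEW_PPM)
		return CHECK_SKEWED;
	return CHECK_OK;
}
```

With a 500 ms watchdog interval, for example, a 250 us disagreement
is exactly 500 ppm and still passes this sketch's check, whereas a
300 us disagreement (600 ppm) does not.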

----------------------------------------------------------------
Feng Tang (2):
      clocksource: Suspend the watchdog temporarily when high read latency detected
      x86/tsc: Add option to force frequency recalibration with HW timer

Paul E. McKenney (5):
      clocksource: Loosen clocksource watchdog constraints
      clocksource: Improve read-back-delay message
      clocksource: Improve "skew is too large" messages
      clocksource: Verify HPET and PMTMR when TSC unverified
      clocksource: Enable TSC watchdog checking of HPET and PMTMR only when requested

Yunying Sun (1):
      clocksource: Print clocksource name when clocksource is tested unstable

 Documentation/admin-guide/kernel-parameters.txt | 10 ++++
 arch/x86/include/asm/time.h                     |  1 +
 arch/x86/kernel/hpet.c                          |  2 +
 arch/x86/kernel/tsc.c                           | 55 +++++++++++++++++--
 drivers/clocksource/acpi_pm.c                   |  6 ++-
 kernel/time/Kconfig                             |  6 ++-
 kernel/time/clocksource.c                       | 72 +++++++++++++++++--------
 7 files changed, 123 insertions(+), 29 deletions(-)
  

Comments

tip-bot2 for Thomas Gleixner Feb. 13, 2023, 6:48 p.m. UTC | #1
The following commit has been merged into the timers/core branch of tip:

Commit-ID:     ab407a1919d2676ddc5761ed459d4cc5c7be18ed
Gitweb:        https://git.kernel.org/tip/ab407a1919d2676ddc5761ed459d4cc5c7be18ed
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Mon, 13 Feb 2023 19:28:48 +01:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 13 Feb 2023 19:28:48 +01:00

Merge tag 'clocksource.2023.02.06b' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into timers/core

Pull clocksource watchdog changes from Paul McKenney:

     o	Improvements to clocksource-watchdog console messages.

     o	Loosening of the clocksource-watchdog skew criteria to match
     	those of NTP (500 parts per million, relaxed from 400 parts
     	per million).  If it is good enough for NTP, it is good enough
     	for the clocksource watchdog.

     o	Suspend clocksource-watchdog checking temporarily when high
     	memory latencies are detected.  This avoids the false-positive
     	clock-skew events that have been seen on production systems
     	running memory-intensive workloads.

     o	On systems where the TSC is deemed trustworthy, use it as the
     	watchdog timesource, but only when specifically requested using
     	the tsc=watchdog kernel boot parameter.  This permits clock-skew
     	events to be detected, but avoids forcing workloads to use the
     	slow HPET and ACPI PM timers.  On the one hand, these last two
     	timers are slow enough to cause systems to be needlessly marked
     	bad; on the other, real skew does sometimes happen on production
     	systems running production workloads.  And sometimes it is
     	the fault of the TSC, or at least of the firmware that told the
     	kernel to program the TSC with the wrong frequency.

     o	Add a tsc=revalidate kernel boot parameter to allow the kernel
     	to diagnose cases where the TSC hardware works fine, but was told
     	by firmware to tick at the wrong frequency.  Such cases are rare,
     	but they really have happened on production systems.

Link: https://lore.kernel.org/r/20230210193640.GA3325193@paulmck-ThinkPad-P17-Gen-1
---