[v2] Add hardcoded crystal clock for KabyLake

Message ID 20221018190124.v2.1.I918ccc908c5c836c1e21e01d2cf6f59b0157bcc3@changeid
State New
Headers
Series [v2] Add hardcoded crystal clock for KabyLake |

Commit Message

Rishabh Agrawal Oct. 18, 2022, 7:01 p.m. UTC
  Set KabyLake crystal clock manually since the TSC calibration is off
by 0.5%. All the tests that are based on the timer/clock will fail in
this case.

The HPET has been disabled by upstream due to PC10 issue causing the
3 hatch devices, dratini, jinlon, and kohaku to not calibrate the clock
precisely. These 3 devices are KabyLake devices. Intel team has verified
that all KBL devices have 24.0 MHz clock frequency, therefore this
change is valid.

Signed-off-by: Rishabh Agrawal <rishabhagr@chromium.org>
---

Changes in v2:
  - Adding Thomas Gleixner, Daniel Drake, Rafael Wysocki, Len brown and Ingo to review since you worked on this.

 arch/x86/kernel/tsc.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
  

Comments

Peter Zijlstra Oct. 20, 2022, 5:13 p.m. UTC | #1
On Tue, Oct 18, 2022 at 07:01:32PM +0000, Rishabh Agrawal wrote:
> Set KabyLake crystal clock manually since the TSC calibration is off
> by 0.5%. All the tests that are based on the timer/clock will fail in
> this case.
> 
> The HPET has been disabled by upstream due to PC10 issue causing the
> 3 hatch devices, dratini, jinlon, and kohaku to not calibrate the clock

I have no idea what a hatch device is, nor what pokemon have anything to
do with things.

> precisely. These 3 devices are KabyLake devices. Intel team has verified

But no actual Intel person with a Reviewed-by tag we can go pester if
this turns out to be wrong.

> that all KBL devices have 24.0 MHz clock frequency, therefore this
> change is valid.

And yet you only change KBL_L.

And why, pray *WHY* can't Intel simply write the correct information in
CPUID leaf 15h. I mean, they defined the leaf, might as well use it, no?

> Signed-off-by: Rishabh Agrawal <rishabhagr@chromium.org>
> ---
> 
> Changes in v2:
>   - Adding Thomas Gleixner, Daniel Drake, Rafael Wysocki, Len brown and Ingo to review since you worked on this.
> 
>  arch/x86/kernel/tsc.c | 17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index cafacb2e58cc..63a06224fa48 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -644,10 +644,21 @@ unsigned long native_calibrate_tsc(void)
>  	 * Denverton SoCs don't report crystal clock, and also don't support
>  	 * CPUID.0x16 for the calculation below, so hardcode the 25MHz crystal
>  	 * clock.
> +	 *
> +	 * Intel KabyLake devices' clocks are off by 0.5% when using the below
> +	 * calculation, so hardcode 24.0 MHz crystal clock.
>  	 */
> -	if (crystal_khz == 0 &&
> -			boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT_D)
> -		crystal_khz = 25000;
> +	if (crystal_khz == 0) {
> +		switch (boot_cpu_data.x86_model) {
> +		case INTEL_FAM6_ATOM_GOLDMONT_D:
> +			crystal_khz = 25000;
> +			break;
> +		case INTEL_FAM6_KABYLAKE_L:
> +			crystal_khz = 24000;
> +			break;
> +		}
> +
> +	}
>  
>  	/*
>  	 * TSC frequency reported directly by CPUID is a "hardware reported"
> -- 
> 2.38.0.413.g74048e4d9e-goog
>
  
Dave Hansen Oct. 20, 2022, 5:18 p.m. UTC | #2
On 10/20/22 10:13, Peter Zijlstra wrote:
> And why, pray *WHY* can't Intel simply write the correct information in
> CPUID leaf 15h. I mean, they defined the leaf, might as well use it, no?

Is the data that's in the leaf just wrong?  Doesn't that mean that the
CPUID leaf on these models is violating the architecture contract?  That
sounds like something that deserves an erratum.

Is there a documented erratum?
  
Thomas Gleixner Nov. 14, 2022, 10:58 p.m. UTC | #3
On Thu, Oct 20 2022 at 10:18, Dave Hansen wrote:
> On 10/20/22 10:13, Peter Zijlstra wrote:
>> And why, pray *WHY* can't Intel simply write the correct information in
>> CPUID leaf 15h. I mean, they defined the leaf, might as well use it, no?
>
> Is the data that's in the leaf just wrong?  Doesn't that mean that the
> CPUID leaf on these models is violating the architecture contract?  That
> sounds like something that deserves an erratum.
>
> Is there a documented erratum?

I don't know. The code has this comment:

	/*
	 * Some Intel SoCs like Skylake and Kabylake don't report the crystal
	 * clock, but we can easily calculate it to a high degree of accuracy
	 * by considering the crystal ratio and the CPU speed.
	 */

so those SoCs fail to expose clock in leaf 15h and then the information
in leaf 16h is so inaccurate that the calculation is off.

Sigh. It's 2022 and we are still relying on crystalball mechanisms to
figure out the damned crystal frequency.

The specification of leaf 15h is:

 15H Time Stamp Counter and Nominal Core Crystal Clock Information Leaf
    NOTES:
        If EBX[31:0] is 0, the TSC/”core crystal clock” ratio is not enumerated.
        If ECX is 0, the nominal core crystal clock frequency is not enumerated.

IOW, this CPUID leaf is defined to be useless and leaves it up to the
SoC integration to provide this information or not. It needs even two
fields to chose from to make it useless...

I'm sure this took 10+ draft versions and consumed a non-quantifiable
amount of work hours to come up with this joke.

Thanks,

        tglx
  
Christian A. Ehrhardt Dec. 21, 2022, 10:35 p.m. UTC | #4
Hi,

On Mon, Nov 14, 2022 at 11:58:54PM +0100, Thomas Gleixner wrote:
> On Thu, Oct 20 2022 at 10:18, Dave Hansen wrote:
> > On 10/20/22 10:13, Peter Zijlstra wrote:
> >> And why, pray *WHY* can't Intel simply write the correct information in
> >> CPUID leaf 15h. I mean, they defined the leaf, might as well use it, no?
> >
> > Is the data that's in the leaf just wrong?  Doesn't that mean that the
> > CPUID leaf on these models is violating the architecture contract?  That
> > sounds like something that deserves an erratum.
> >
> > Is there a documented erratum?
> 
> I don't know. The code has this comment:
> 
> 	/*
> 	 * Some Intel SoCs like Skylake and Kabylake don't report the crystal
> 	 * clock, but we can easily calculate it to a high degree of accuracy
> 	 * by considering the crystal ratio and the CPU speed.
> 	 */

Latest (April 2022) version of the SDM clearly states that the
above comment is wrong. CPUID.16h has the following note:
| Data is returned from this interface in accordance with the processor's
| specification and does not reflect actual values. Suitable use of this
| data includes the display of processor information in like manner to the
| processor brand string and for determining the appropriate range to use
| when displaying processor information e.g. frequency history graphs. The
| returned information should not be used for any other purpose as the
| returned information does not accurately correlate to information /
| counters returned by other processor interfaces.

Thus using CPUID.16h to determine the crystal clock frequency is wrong.
This difference is significant. I have one Kaby Lake latop where
the CPUID.16h reported frequency is 1900MHz but the real frequency is
only 1896MHz. This amounts to a time drift of about 8s/hour if the
wrong TSC frequency is used for time keeping.

Basically, I think this commit:
    604dc9170 (x86/tsc: Use CPUID.0x16 to calculate ...)
needs to be reverted.

> so those SoCs fail to expose clock in leaf 15h and then the information
> in leaf 16h is so inaccurate that the calculation is off.
> 
> Sigh. It's 2022 and we are still relying on crystalball mechanisms to
> figure out the damned crystal frequency.
> 
> The specification of leaf 15h is:
> 
>  15H Time Stamp Counter and Nominal Core Crystal Clock Information Leaf
>     NOTES:
>         If EBX[31:0] is 0, the TSC/”core crystal clock” ratio is not enumerated.
>         If ECX is 0, the nominal core crystal clock frequency is not enumerated.
> 
> IOW, this CPUID leaf is defined to be useless and leaves it up to the
> SoC integration to provide this information or not. It needs even two
> fields to chose from to make it useless...

The SDM (now?) has some hints on how to do this. This is hidden here:

    Vol.3 Chapter 19.7.3: Determining the Processor Base Frequency

This chapter contains a table that lists the correct crystal clock
frequencies for CPU models that do not enumerate it via CPUID.15h.

       regards   Christian
  

Patch

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cafacb2e58cc..63a06224fa48 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -644,10 +644,21 @@  unsigned long native_calibrate_tsc(void)
 	 * Denverton SoCs don't report crystal clock, and also don't support
 	 * CPUID.0x16 for the calculation below, so hardcode the 25MHz crystal
 	 * clock.
+	 *
+	 * Intel KabyLake devices' clocks are off by 0.5% when using the below
+	 * calculation, so hardcode 24.0 MHz crystal clock.
 	 */
-	if (crystal_khz == 0 &&
-			boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT_D)
-		crystal_khz = 25000;
+	if (crystal_khz == 0) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_ATOM_GOLDMONT_D:
+			crystal_khz = 25000;
+			break;
+		case INTEL_FAM6_KABYLAKE_L:
+			crystal_khz = 24000;
+			break;
+		}
+
+	}
 
 	/*
 	 * TSC frequency reported directly by CPUID is a "hardware reported"