x86/perf: Fixed kernel panic during boot on Nano processor.

Message ID 20221103032304.27753-1-silviazhao-oc@zhaoxin.com
State New
Series x86/perf: Fixed kernel panic during boot on Nano processor.

Commit Message

silviazhao Nov. 3, 2022, 3:23 a.m. UTC
  The Nano processor may not fully support the RDPMC instruction: it works
for reading the general-purpose PMC counters, but raises #GP (general
protection fault) when accessing a fixed PMC counter. Furthermore, the Nano
processor and the ZX-C processor report the same family/model information,
so the zhaoxin PMU driver is wrongly loaded on the Nano processor, which
makes the kernel fail to boot.

To solve this problem, check the stepping information as well, so that the
Nano processor can be distinguished from the ZX-C processor.

[https://bugzilla.kernel.org/show_bug.cgi?id=212389]

Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com>
Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com>
---
 arch/x86/events/zhaoxin/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
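
The stepping check described in the commit message can be sketched outside the kernel as follows. This is only an illustration, not the driver code: `struct cpu_sig`, `decode_cpu_sig()` and `family6_check_passes()` are hypothetical names, the decoding follows the usual CPUID leaf 1 EAX layout, and only the family 0x06 branch visible in the patch is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields decoded from CPUID.1:EAX. */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

/* Decode family/model/stepping from a raw CPUID leaf 1 EAX value. */
static struct cpu_sig decode_cpu_sig(uint32_t eax)
{
	struct cpu_sig s;

	s.stepping = eax & 0xf;
	s.model = (eax >> 4) & 0xf;
	s.family = (eax >> 8) & 0xf;

	/* The extended model field extends the model for family >= 6. */
	if (s.family >= 0x6)
		s.model += ((eax >> 16) & 0xf) << 4;
	/* The extended family field is added only when the base family is 0xf. */
	if (s.family == 0xf)
		s.family += (eax >> 20) & 0xff;

	return s;
}

/*
 * Mirrors the condition added by the patch for family 0x06:
 * model 0x0f is accepted only from stepping 0x0e upwards,
 * model 0x19 is accepted at any stepping.
 */
static bool family6_check_passes(struct cpu_sig s)
{
	if (s.family != 0x06)
		return false;

	return (s.model == 0x0f && s.stepping >= 0x0e) || s.model == 0x19;
}
```

With this sketch, a signature of 0x000006fe (family 6, model 0x0f, stepping 0x0e) passes the check, while the same family and model with a lower stepping, for example 0x000006f2, does not.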
  

Comments

Borislav Petkov Nov. 3, 2022, 10:07 a.m. UTC | #1
On Thu, Nov 03, 2022 at 11:23:04AM +0800, silviazhao-oc wrote:
> Nano processor may not fully support rdpmc instruction,

What does that even mean? Not fully support?

> it works well for reading general pmc counter, but will lead
> GP(general protection) when accessing fixed pmc counter.

RDPMC will #GP when the perf counter specified cannot be read.

AFAICT, that is RCX: 0000000040000001 which looks like perf counter
index 1 with INTEL_PMC_FIXED_RDPMC_BASE ORed in.
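
To make that decoding concrete, here is a small sketch of how such an RCX value splits apart. It assumes the fixed-counter selector is bit 30, which is what INTEL_PMC_FIXED_RDPMC_BASE is defined as in the kernel; the macro `FIXED_RDPMC_BASE` and the two helper names are made up for illustration, and any other selector bits are ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's INTEL_PMC_FIXED_RDPMC_BASE (bit 30). */
#define FIXED_RDPMC_BASE (UINT32_C(1) << 30)

/* True when the RDPMC selector in ECX asks for a fixed counter. */
static bool rdpmc_selects_fixed(uint32_t ecx)
{
	return (ecx & FIXED_RDPMC_BASE) != 0;
}

/* Counter index with the fixed-counter bit masked off. */
static uint32_t rdpmc_counter_index(uint32_t ecx)
{
	return ecx & ~FIXED_RDPMC_BASE;
}
```

For the value in the oops, 0x40000001, the first helper returns true and the second returns 1, that is, fixed counter 1.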

> Furthermore, family/mode information is same between Nano processor
> and ZX-C processor, it leads to zhaoxin pmu driver is wrongly loaded
> for Nano processor, which resulting boot kernal fail.

So *that* is the real problem - it tries to access perf counters
thinking it is running on architectural perf counters implementation but
nano doesn't have that.

> To solve this problem, stepping information will be checked to distinguish
> between Nano processor and ZX-C processor.

Why doesn't that ZXC thing have a CPUID flag to check instead of
looking at models and steppings and thus confusing it with a nano CPU?

> [https://bugzilla.kernel.org/show_bug.cgi?id=212389]
> 
> Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com>

Does Arjan have a last name?

> Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com>

I'm assuming your name is properly spelled "Silvia Zhao" and not in a
single word with a "-oc" string appended at the end, yes?

Thx.
  

Patch

diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c
index 949d845c922b..cef1de251613 100644
--- a/arch/x86/events/zhaoxin/core.c
+++ b/arch/x86/events/zhaoxin/core.c
@@ -541,7 +541,8 @@  __init int zhaoxin_pmu_init(void)
 
 	switch (boot_cpu_data.x86) {
 	case 0x06:
-		if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
+		if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) ||
+			boot_cpu_data.x86_model == 0x19) {
 
 			x86_pmu.max_period = x86_pmu.cntval_mask >> 1;