Message ID | 20230202091738.5947-1-silviazhao-oc@zhaoxin.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:2388:b0:96:219d:e725 with SMTP id i8csp192313dyf; Thu, 2 Feb 2023 01:42:01 -0800 (PST) X-Google-Smtp-Source: AK7set8N9L9P+8kaMkRL+Lmv9g1uGDNF3/yd+r+fh5A9TDkXJ5tp8E9Onljo5+WODwJ3o3iw8rS/ X-Received: by 2002:a62:f24e:0:b0:593:a33f:3acf with SMTP id y14-20020a62f24e000000b00593a33f3acfmr4891830pfl.1.1675330921187; Thu, 02 Feb 2023 01:42:01 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675330921; cv=none; d=google.com; s=arc-20160816; b=vOTPFmgmYzx6fvJbaAuA6cNbY4uzHh81k9Km0u2gHxZJEAs286fbC6iHNjNQiWDI7P vCVuaCtiEjofynUgmyXey6GIsrVqtGFZBAbH7lZB9ZBIE/g1KTFLlVht1XF+Fqb/5yLH oTzPJWX0nXQDGvYFEGwTkLPrkWlTTNZyPJ5cV8gJ5p+EJhYOTKnpb022SwS4LHPFlVUH JkrsjjXpVSZdeR4ynRvy4CrBj0MAgzT/NPkr/OYcgsHOQIOTQSmbVMqm6195VAidfqvS lj2X/aZ7kW4Sb6Qkjyei73ynXhNzxsCYlOQFj7n9GkrgbID2CcQD2dsfxhjCKEDvlVHJ d9Gw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=G8adD3xf+3oMvJfOL/G3tTAozq+FOJNDMB1NAEvGiGA=; b=JnFuWqLMzFvBdlNfz61wsUI0xWStYs1qxOUAkn4MYSHWMI5Q+wWXSguvSTzIGRjc7X 5iehsLi5//52b6VaQhEVFdXJ50I8CaV/X0A34TMIZ+M7IrvGJ9cpJUFpXpdTVX+DuN5I 9eU3YoEe6FvECw8A29TFE8DDAwdiHiKsTzSfIrmtgiq71lTk0rlIxBLvGAEMLoI9OObP v+pg69qMP//9WdcUuy1Bvjg81JE0B2lCFwow3vZ37xcg0q+xPcnUSagxUMhv1apcKfPj xKZKshFIC3kauDjjw94ueYmZ+MxCtcMeBOqxPgZDcUo/u99xA3Je9ZC9aHxFDzMV65Yo /4zQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y30-20020aa79e1e000000b005943e277f02si221767pfq.141.2023.02.02.01.41.49; Thu, 02 Feb 2023 01:42:01 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229801AbjBBJgl (ORCPT <rfc822;il.mystafa@gmail.com> + 99 others); Thu, 2 Feb 2023 04:36:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229721AbjBBJgg (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Feb 2023 04:36:36 -0500 X-Greylist: delayed 1111 seconds by postgrey-1.37 at lindbergh.monkeyblade.net; Thu, 02 Feb 2023 01:36:15 PST Received: from mx2.zhaoxin.com (mx2.zhaoxin.com [203.110.167.99]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBF79366AA for <linux-kernel@vger.kernel.org>; Thu, 2 Feb 2023 01:36:15 -0800 (PST) X-ASG-Debug-ID: 1675329459-1eb14e798049070001-xx1T2L Received: from ZXSHMBX3.zhaoxin.com (ZXSHMBX3.zhaoxin.com [10.28.252.165]) by mx2.zhaoxin.com with ESMTP id fIQu6BWVEJRrs4Go (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NO); Thu, 02 Feb 2023 17:17:39 +0800 (CST) X-Barracuda-Envelope-From: SilviaZhao-oc@zhaoxin.com X-Barracuda-RBL-Trusted-Forwarder: 10.28.252.165 Received: from ZXBJMBX02.zhaoxin.com (10.29.252.6) by ZXSHMBX3.zhaoxin.com (10.28.252.165) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.16; Thu, 2 Feb 2023 17:17:39 +0800 Received: from silvia-OptiPlex-3010.zhaoxin.com (10.29.8.47) by ZXBJMBX02.zhaoxin.com (10.29.252.6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.16; Thu, 2 Feb 2023 17:17:38 +0800 X-Barracuda-RBL-Trusted-Forwarder: 10.28.252.165 From: silviazhao-oc <silviazhao-oc@zhaoxin.com> X-Barracuda-RBL-Trusted-Forwarder: 10.29.252.6 To: <peterz@infradead.org>, <mingo@redhat.com>, <acme@kernel.org>, <mark.rutland@arm.com>, <alexander.shishkin@linux.intel.com>, <jolsa@kernel.org>, <namhyung@kernel.org>, <tglx@linutronix.de>, <bp@alien8.de>, <dave.hansen@linux.intel.com>, <x86@kernel.org>, <hpa@zytor.com>, <linux-perf-users@vger.kernel.org>, <linux-kernel@vger.kernel.org> CC: <cobechen@zhaoxin.com>, <louisqi@zhaoxin.com>, <silviazhao@zhaoxin.com>, <tonywwang@zhaoxin.com>, <kevinbrace@gmx.com>, <8vvbbqzo567a@nospam.xutrox.com> Subject: [PATCH] x86/perf/zhaoxin: Add stepping check for ZX-C Date: Thu, 2 Feb 2023 17:17:38 +0800 X-ASG-Orig-Subj: [PATCH] x86/perf/zhaoxin: Add stepping check for ZX-C Message-ID: <20230202091738.5947-1-silviazhao-oc@zhaoxin.com> X-Mailer: git-send-email 2.17.1 MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-Originating-IP: [10.29.8.47] X-ClientProxiedBy: zxbjmbx1.zhaoxin.com (10.29.252.163) To ZXBJMBX02.zhaoxin.com (10.29.252.6) X-Barracuda-Connect: ZXSHMBX3.zhaoxin.com[10.28.252.165] X-Barracuda-Start-Time: 1675329459 X-Barracuda-Encrypted: ECDHE-RSA-AES128-GCM-SHA256 X-Barracuda-URL: https://10.28.252.36:4443/cgi-mod/mark.cgi X-Virus-Scanned: by bsmtpd at zhaoxin.com X-Barracuda-Scan-Msg-Size: 1478 X-Barracuda-BRTS-Status: 1 X-Barracuda-Bayes: INNOCENT GLOBAL 0.4337 1.0000 0.0000 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using global scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.104216 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1756711796205038920?= X-GMAIL-MSGID: =?utf-8?q?1756711796205038920?= |
Series |
x86/perf/zhaoxin: Add stepping check for ZX-C
|
|
Commit Message
silviazhao
Feb. 2, 2023, 9:17 a.m. UTC
Nano processor may not fully support rdpmc instruction, it works well
for reading general pmc counter, but will lead GP(general protection)
when accessing fixed pmc counter. Furthermore, family/mode information
is same between Nano processor and ZX-C processor, it leads to zhaoxin
pmu driver is wrongly loaded for Nano processor, which resulting boot
kernal fail.
To solve this problem, stepping information will be checked to distinguish
between Nano processor and ZX-C processor.
Fixes: 3a4ac121c2ca (“x86/perf: Add hardware performance events support for Zhaoxin CPU”)
Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389
Reported-by: Kevin Brace <kevinbrace@gmx.com>
Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com>
---
arch/x86/events/zhaoxin/core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
Hi Silvia, Thank you very much for resending the VIA / Zhaoxin PMU patch and for keeping me in the loop. I observed this bug on ECS (Elitegroup Computer Systems) VX900-I mainboard. The mainboard contains one VIA Nano L2007 processor (1.6 GHz) soldered on the PCB. Although I have not independently verified it, CPUID steppings for VIA Nano 1000 / 2000 series (Centaur Technology code name: CNA) are supposedly 2 and 3. CPUID stepppings for VIA Nano 3000 series (Centaur Technology code name: CNB) is 8. CPUID stepppings for VIA Nano x2 (Centaur Technology code name: CNC) is 10. https://www.reddit.com/r/VIA/comments/dy71bn/via_centaurs_new_cpu_is_a_8core_x86_cpu_with_an/ I have not checked the actual CPUID steppings, but I have confirmed that the current code without the fix works okay with Nano 3000 series and Nano x2, but definitely not Nano 2000 series. For Nano 3000 series test, I used VIA Embedded EPIA M830 mainboard. For Nano x2 test, I used HP T510 thin client. Based on my observations, it appears that Centaur CNA contains a bug reading some performance counters, so not to cause inconveniences with users of Nano 1000 / 2000 series processors, the patch should limit / prevent reading performance counters on these processors. I think the code for the fix should reflect this the following way.
Hi Kevin, Thanks for your kindly reply. Since VIA Nano 1000/2000/3000 series are really very old CPU, and we can't find related mainboard for full verification. We suggest not to support all Nano series for PMC driver. On 2023/2/3 07:53, Kevin Brace wrote: > Hi Silvia, > > Thank you very much for resending the VIA / Zhaoxin PMU patch and for keeping me in the loop. > I observed this bug on ECS (Elitegroup Computer Systems) VX900-I mainboard. > The mainboard contains one VIA Nano L2007 processor (1.6 GHz) soldered on the PCB. > Although I have not independently verified it, CPUID steppings for VIA Nano 1000 / 2000 series (Centaur Technology code name: CNA) are supposedly 2 and 3. > CPUID stepppings for VIA Nano 3000 series (Centaur Technology code name: CNB) is 8. > CPUID stepppings for VIA Nano x2 (Centaur Technology code name: CNC) is 10. > > https://www.reddit.com/r/VIA/comments/dy71bn/via_centaurs_new_cpu_is_a_8core_x86_cpu_with_an/ > > I have not checked the actual CPUID steppings, but I have confirmed that the current code without the fix works okay with Nano 3000 series and Nano x2, but definitely not Nano 2000 series. > For Nano 3000 series test, I used VIA Embedded EPIA M830 mainboard. > For Nano x2 test, I used HP T510 thin client. > Based on my observations, it appears that Centaur CNA contains a bug reading some performance counters, so not to cause inconveniences with users of Nano 1000 / 2000 series processors, the patch should limit / prevent reading performance counters on these processors. > I think the code for the fix should reflect this the following way. > > _______________________________________________________________ > switch (boot_cpu_data.x86) { > case 0x06: > - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { > + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x08) || > + boot_cpu_data.x86_model == 0x19) { > > x86_pmu.max_period = x86_pmu.cntval_mask >> 1; > > _______________________________________________________________ > > The above code should exclude Nano 1000 / 2000 series processors properly. > I lost easy access to ECS VX900-I mainboard for now (I need to look around for it. I do own another copy of it.), so I cannot confirm if the fix is properly working. > I still have easy access to EPIA M830 mainboard. > I welcome anyone's feedback on this issue. > This fix should go into the current kernel in development since it is a show stopper for users of Nano 1000 / 2000 series processors. > If the fix is adopted, please backport it to previous releases of the kernel. > I wasted about 2 weeks on this issue, and this fix should have never been ignored for such a long time. > > Regards, > > Kevin Brace > Brace Computer Laboratory blog > https://bracecomputerlab.com > > >> Sent: Thursday, February 02, 2023 at 3:17 AM >> From: "silviazhao-oc" <silviazhao-oc@zhaoxin.com> >> To: peterz@infradead.org, mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com, alexander.shishkin@linux.intel.com, jolsa@kernel.org, namhyung@kernel.org, tglx@linutronix.de, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org >> Cc: cobechen@zhaoxin.com, louisqi@zhaoxin.com, silviazhao@zhaoxin.com, tonywwang@zhaoxin.com, kevinbrace@gmx.com, 8vvbbqzo567a@nospam.xutrox.com >> Subject: [PATCH] x86/perf/zhaoxin: Add stepping check for ZX-C >> >> Nano processor may not fully support rdpmc instruction, it works well >> for reading general pmc counter, but will lead GP(general protection) >> when accessing fixed pmc counter. Furthermore, family/mode information >> is same between Nano processor and ZX-C processor, it leads to zhaoxin >> pmu driver is wrongly loaded for Nano processor, which resulting boot >> kernal fail. >> >> To solve this problem, stepping information will be checked to distinguish >> between Nano processor and ZX-C processor. >> >> Fixes: 3a4ac121c2ca (“x86/perf: Add hardware performance events support for Zhaoxin CPU”) >> Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389 >> Reported-by: Kevin Brace <kevinbrace@gmx.com> >> >> Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com> >> --- >> arch/x86/events/zhaoxin/core.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c >> index 949d845c922b..cef1de251613 100644 >> --- a/arch/x86/events/zhaoxin/core.c >> +++ b/arch/x86/events/zhaoxin/core.c >> @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void) >> >> switch (boot_cpu_data.x86) { >> case 0x06: >> - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { >> + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) || >> + boot_cpu_data.x86_model == 0x19) { >> >> x86_pmu.max_period = x86_pmu.cntval_mask >> 1; >> >> -- >> 2.17.1 >> >>
On Thu, Feb 02, 2023 at 05:17:38PM +0800, silviazhao-oc wrote: > Nano processor may not fully support rdpmc instruction, it works well > for reading general pmc counter, but will lead GP(general protection) > when accessing fixed pmc counter. Furthermore, family/mode information > is same between Nano processor and ZX-C processor, it leads to zhaoxin > pmu driver is wrongly loaded for Nano processor, which resulting boot > kernal fail. > > To solve this problem, stepping information will be checked to distinguish > between Nano processor and ZX-C processor. > > Fixes: 3a4ac121c2ca (“x86/perf: Add hardware performance events support for Zhaoxin CPU”) > Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389 > Reported-by: Kevin Brace <kevinbrace@gmx.com> > > Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com> Please use your proper name in the Signed-off-by. > --- > arch/x86/events/zhaoxin/core.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c > index 949d845c922b..cef1de251613 100644 > --- a/arch/x86/events/zhaoxin/core.c > +++ b/arch/x86/events/zhaoxin/core.c > @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void) > > switch (boot_cpu_data.x86) { > case 0x06: > - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { > + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) || > + boot_cpu_data.x86_model == 0x19) { > > x86_pmu.max_period = x86_pmu.cntval_mask >> 1; Last time we talked: https://lore.kernel.org/r/3c7da7fd-402f-c74f-c96c-0e88828eab58@zhaoxin.com you said that Nano #GPs when trying to RDPMC the fixed counters. Which sounds like an erratum. We do those by adding a X86_BUG flag, set that flag for Nano and then test it where needed. Grep the source tree for examples. Please do that above too unstead of testing steppings. Also, I'd like to know why do steppings < 0xe mean Nano and why isn't there a more reliable way to detect it? Thx.
On 2023/2/4 21:44, Borislav Petkov wrote: > On Thu, Feb 02, 2023 at 05:17:38PM +0800, silviazhao-oc wrote: >> Nano processor may not fully support rdpmc instruction, it works well >> for reading general pmc counter, but will lead GP(general protection) >> when accessing fixed pmc counter. Furthermore, family/mode information >> is same between Nano processor and ZX-C processor, it leads to zhaoxin >> pmu driver is wrongly loaded for Nano processor, which resulting boot >> kernal fail. >> >> To solve this problem, stepping information will be checked to distinguish >> between Nano processor and ZX-C processor. >> >> Fixes: 3a4ac121c2ca (“x86/perf: Add hardware performance events support for Zhaoxin CPU”) >> Reported-by: Arjan <8vvbbqzo567a@nospam.xutrox.com> >> Link: https://bugzilla.kernel.org/show_bug.cgi?id=212389 >> Reported-by: Kevin Brace <kevinbrace@gmx.com> >> >> Signed-off-by: silviazhao-oc <silviazhao-oc@zhaoxin.com> > > Please use your proper name in the Signed-off-by. > Due to our company’s email policy, email address with oc suffix is used for sending email without confidentiality statement at the end of the mail body. I will remove –oc from my name later. >> --- >> arch/x86/events/zhaoxin/core.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c >> index 949d845c922b..cef1de251613 100644 >> --- a/arch/x86/events/zhaoxin/core.c >> +++ b/arch/x86/events/zhaoxin/core.c >> @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void) >> >> switch (boot_cpu_data.x86) { >> case 0x06: >> - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { >> + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) || >> + boot_cpu_data.x86_model == 0x19) { >> >> x86_pmu.max_period = x86_pmu.cntval_mask >> 1; > > Last time we talked: > > https://lore.kernel.org/r/3c7da7fd-402f-c74f-c96c-0e88828eab58@zhaoxin.com > > you said that Nano #GPs when trying to RDPMC the fixed counters. Which > sounds like an erratum. > > We do those by adding a X86_BUG flag, set that flag for Nano and then > test it where needed. Grep the source tree for examples. > > Please do that above too unstead of testing steppings. > Nano series CPUs are too old, we have not fully verified their PMC functions, furthermore our original development plan is to support PMC from ZXC, not including Nano series. In that way, I think it’s better to exclude Nano series CPU in the PMC driver. > Also, I'd like to know why do steppings < 0xe mean Nano and why isn't > there a more reliable way to detect it? > Generally, CPUs are identified by FMS(Family/Model/Stepping) information. As you said, it is not a good practice to use stepping information to distinguish different CPU series. But due to some unknown historical reasons, the FMS of Nano and ZXC are only different in stepping. I have considered about using the “Model name string” to distinguish them, but it doesn't seem to be a common way in Linux kernel. > Thx. >
On Mon, Feb 06, 2023 at 04:26:25PM +0800, silviazhaooc wrote: > Due to our company’s email policy, email address with oc suffix is used for > sending email without confidentiality statement at the end of the mail body. > > I will remove –oc from my name later. Yes, please. The email address is fine but the name doesn't have to have that funky "-oc" thing. > But due to some unknown historical reasons, the FMS of Nano and ZXC are only > different in stepping. > > I have considered about using the “Model name string” to distinguish them, > but it doesn't seem to be a common way in Linux kernel. I don't mind you using steppings to differentiate the two as long as this is not going to change all of a sudden and that differentiation is unambiguous. If not, you will have to use name strings as you don't have any other choice. Whatever you do, pls define a new X86_BUG_ flag, set it only on Nano and then test it in the PMU init code. Thx.
Hi Boris, Thanks for your reply. As I mentioned before, Nano has several series. We cannot test if all of them have the bug. Besides, AFAIK Nano's hardware support for PMC has not externally announced. So setting a new X86_BUG_ flag to Nano is inappropriate. I still think exclude PMC support in driver is more appropriate. Looking forward to your comments. On 2023/2/6 17:48, Borislav Petkov wrote: > On Mon, Feb 06, 2023 at 04:26:25PM +0800, silviazhaooc wrote: >> Due to our company’s email policy, email address with oc suffix is used for >> sending email without confidentiality statement at the end of the mail body. >> >> I will remove –oc from my name later. > > Yes, please. The email address is fine but the name doesn't have to have > that funky "-oc" thing. > >> But due to some unknown historical reasons, the FMS of Nano and ZXC are only >> different in stepping. >> >> I have considered about using the “Model name string” to distinguish them, >> but it doesn't seem to be a common way in Linux kernel. > > I don't mind you using steppings to differentiate the two as long as > this is not going to change all of a sudden and that differentiation is > unambiguous. > > If not, you will have to use name strings as you don't have any other > choice. > > Whatever you do, pls define a new X86_BUG_ flag, set it only on Nano and > then test it in the PMU init code. > > Thx. >
Hi Silvia, On Mon, Feb 06, 2023 at 06:55:21PM +0800, silviazhaooc wrote: > Thanks for your reply. You're welcome. First of all, please do not top-post when replying on a public mailing list but put your reply under the text you're replying to. Like the rest of us do. > As I mentioned before, Nano has several series. We cannot test if all of > them have the bug. If you cannot test if all of them have the bug, then testing the stepping as you do is wrong too. You need an unambiguous way to differentiate between ZXC and Nano CPUs. If steppings >= 0xe belong solely to ZXC, then state that in a comment above it so that you can exclude Nano. If Nano starts using those steppings later, though, then that check will become wrong too. So I need a statement: "this is how you detect a ZXC CPU unambiguously" and then use that method when enabling PMU support on it and *only* on it. Makes more sense? Thx.
On 2023/2/6 19:13, Borislav Petkov wrote: > Hi Silvia, > > On Mon, Feb 06, 2023 at 06:55:21PM +0800, silviazhaooc wrote: >> Thanks for your reply. > > You're welcome. > > First of all, please do not top-post when replying on a public mailing > list but put your reply under the text you're replying to. Like the rest > of us do. > Sorry, I'm a newbie in Linux. Thanks for your reminding. >> As I mentioned before, Nano has several series. We cannot test if all of >> them have the bug. > > If you cannot test if all of them have the bug, then testing the > stepping as you do is wrong too. > > You need an unambiguous way to differentiate between ZXC and Nano CPUs. > > If steppings >= 0xe belong solely to ZXC, then state that in a comment > above it so that you can exclude Nano. > > If Nano starts using those steppings later, though, then that check will > become wrong too. > > So I need a statement: "this is how you detect a ZXC CPU unambiguously" > and then use that method when enabling PMU support on it and *only* on > it. > > Makes more sense? > Yes, that makes sense. I have carefully checked our product manual for Nano and ZXC FMS. For ZXC, there are 2 kinds of FMS: 1. Family=6, Model=0x19, Stepping=0-3 2. Family=6, Model=F, Stepping=E-F For Nano, there is only one kind of FMS: Family=6, Model=F, Stepping=[0-A]/[C-D] So model = 0xf, steppings >= 0xe or model = 0x19 belong solely to ZXC. Nano is an old CPU series which has been stopped production several years ago. It will not use the steppings which belong to ZXC.This is an unambiguous way to differentiate between ZXC and Nano CPUs. Do I need to add the statements in the source code and re-commit the patch? Thx. > Thx. >
On Tue, Feb 07, 2023 at 04:42:26PM +0800, silviazhaooc wrote: > Sorry, I'm a newbie in Linux. Thanks for your reminding. No worries. :) > I have carefully checked our product manual for Nano and ZXC FMS. > > For ZXC, there are 2 kinds of FMS: > > 1. Family=6, Model=0x19, Stepping=0-3 > > 2. Family=6, Model=F, Stepping=E-F > > For Nano, there is only one kind of FMS: > > Family=6, Model=F, Stepping=[0-A]/[C-D] > > So model = 0xf, steppings >= 0xe or model = 0x19 belong solely to ZXC. > Nano is an old CPU series which has been stopped production several years > ago. Good, which sounds like there won't be any more Nano steppings. > It will not use the steppings which belong to ZXC.This is an > unambiguous way to differentiate between ZXC and Nano CPUs. > > Do I need to add the statements in the source code and re-commit the patch? Yes please. That would explain in a clear way why it is ok to test those models/steppings. If it turns out that we need this Nano - ZXC distinction more often, we can do something more involved later. Thx.
diff --git a/arch/x86/events/zhaoxin/core.c b/arch/x86/events/zhaoxin/core.c index 949d845c922b..cef1de251613 100644 --- a/arch/x86/events/zhaoxin/core.c +++ b/arch/x86/events/zhaoxin/core.c @@ -541,7 +541,8 @@ __init int zhaoxin_pmu_init(void) switch (boot_cpu_data.x86) { case 0x06: - if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) { + if ((boot_cpu_data.x86_model == 0x0f && boot_cpu_data.x86_stepping >= 0x0e) || + boot_cpu_data.x86_model == 0x19) { x86_pmu.max_period = x86_pmu.cntval_mask >> 1;