Message ID | 20230616115316.3652155-1-leitao@debian.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1288198vqr; Fri, 16 Jun 2023 05:03:57 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5iNTIHbhgSiRQmiUC5Evb3IRO8Jty570ewBHxohavmw+ufDaswttof490v54Jr1bg9OJw8 X-Received: by 2002:a6b:cf0f:0:b0:774:7f35:657a with SMTP id o15-20020a6bcf0f000000b007747f35657amr2365071ioa.10.1686917037457; Fri, 16 Jun 2023 05:03:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686917037; cv=none; d=google.com; s=arc-20160816; b=BzxmO/FPuIiQ0Du1dKoZDdP+ODrN3YLL5boJW/BdOGVS5/Kd21+Rw0ajeSOdB2Mvsa ZvKWSGbJ2ux3t04+GLTGdTmHY8sSc6VV/G4el3I2u/8CpVetPhDXlgW2084n4ecDi2QL F68YEFbHtXp6PWrMeuDfaKt8UBHOjdoIZgNDUEeHG6fwm0oWoJRGW8IFhYeQ2gk4m7An yq2TCNA5bPXgbjok7K85S3hr2ZbPlSwNX4uJZUWLdDHHhUt/J/GgOHFaUulqdHx+Idc1 bhas8Wjo2TkYEnIzUV7u2oLsUiXYmlOY2nnHTelshnt9dlwACqDe5H0dLf/v8wK5jFMU CwqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=K7xAAjbq425WmYDF4eeJ0EL9NFPdHLP90wUaRqnBZNg=; b=NNDAKqYI86rIqY1uEerb2V6fOsCArFrOUxLPEQolVC8j/i4uF+tvxM5OK/oxNVyE+P UiKiOBxue9THPMXvMYQyCU7NmXR7ci6L4Mujiwyw6gcmkcetoxDbzvYCvvWy711BQoVt hxGQGAWkaR9G3llL5ljlQHvZprbeHBzAokzqARKK3V9j608x5ye7/Fy206OmdptEUnLi SFKP2BmClrsjW6mEjaQjqHYgOM/qEngMBbeTz9GxpKFPbzUdLgOLhvfthOezOwP9u1v9 AowWPGvq0Jy4dCwYrjf+rhTU56y/ae1fI5bthrQ8woEcQFKr4yaYKzDGVfwKO1zqJVpf /HDQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l14-20020a17090a72ce00b0025c1f64f29dsi1399131pjk.171.2023.06.16.05.03.43; Fri, 16 Jun 2023 05:03:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344397AbjFPLxY (ORCPT <rfc822;maxin.john@gmail.com> + 99 others); Fri, 16 Jun 2023 07:53:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344895AbjFPLxX (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 16 Jun 2023 07:53:23 -0400 Received: from mail-wr1-f49.google.com (mail-wr1-f49.google.com [209.85.221.49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A74B30CF; Fri, 16 Jun 2023 04:53:20 -0700 (PDT) Received: by mail-wr1-f49.google.com with SMTP id ffacd0b85a97d-311153ec442so507936f8f.1; Fri, 16 Jun 2023 04:53:20 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686916399; x=1689508399; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=K7xAAjbq425WmYDF4eeJ0EL9NFPdHLP90wUaRqnBZNg=; b=VVKAefyzUkhucd2zTeHlApagVyGIzoaYmHllvLtwtLKqhsQdGadoO6/n/a3KL0XMJ3 O40hvTcKLHyGCuiljhOd/P+7q5A6n9XyzREKNwl4V0A8ph2ryGdFFmMtjhdjdBcW8TRg w+OhCNxrlruv8K8GoqhG/zxhpkJzLGl+d+akHhUBDMGkwN6nVpldTeHQZcohEr488oGD SBiA78KIMqTMe9bfFxzfzsyZUy78NnBryk7XrQ+oOgAzEtQWH7MXjB34r1MDyx1QMw+w uAUQH4bzKYmyevKeYLxSa3ZT3qe8WOMi1260ruPbyAOX5Qy/AuHCTyOqHSICEDXeiFbi Fysg== X-Gm-Message-State: AC+VfDy1gHNULGgDFRX+OLDobNKPap93wwe/W3aUqkgm2Y85wJ8chFnp YyQ3lmaWitGDohIFecK9KlI= X-Received: by 2002:adf:e4c1:0:b0:30f:c71a:1b28 with SMTP id v1-20020adfe4c1000000b0030fc71a1b28mr6558770wrm.28.1686916398618; Fri, 16 Jun 2023 04:53:18 -0700 (PDT) Received: from localhost (fwdproxy-cln-111.fbsv.net. [2a03:2880:31ff:6f::face:b00c]) by smtp.gmail.com with ESMTPSA id u24-20020a05600c00d800b003f7ead9be7fsm1954267wmm.38.2023.06.16.04.53.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 16 Jun 2023 04:53:17 -0700 (PDT) From: Breno Leitao <leitao@debian.org> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Ian Rogers <irogers@google.com>, Adrian Hunter <adrian.hunter@intel.com>, Thomas Gleixner <tglx@linutronix.de>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>, Sandipan Das <sandipan.das@amd.com> Cc: leit@fb.com, dcostantino@meta.com, linux-perf-users@vger.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM), linux-kernel@vger.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM) Subject: [PATCH] perf/x86/amd: Do not WARN on every IRQ Date: Fri, 16 Jun 2023 04:53:15 -0700 Message-Id: <20230616115316.3652155-1-leitao@debian.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768860719065222931?= X-GMAIL-MSGID: =?utf-8?q?1768860719065222931?= |
Series |
perf/x86/amd: Do not WARN on every IRQ
|
|
Commit Message
Breno Leitao
June 16, 2023, 11:53 a.m. UTC
On some systems, the Performance Counter Global Status Register is
coming with reserved bits set, which causes the system to be unusable
if a simple `perf top` runs. The system hits the WARN() thousands times
while perf runs.
WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0
This happens because the "Performance Counter Global Status Register"
(PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according
to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s
Manual, Volume 2: System Programming, 24593"[1]
WARN_ONCE if any reserved bit is set, and sanitize the value to what the
code is handling, so the overflow events continue to be handled for the
number of events that are known to be sane.
Signed-off-by: Breno Leitao <leitao@debian.org>
Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
[1] Link: https://www.amd.com/system/files/TechDocs/24593.pdf
---
arch/x86/events/amd/core.c | 4 ++++
1 file changed, 4 insertions(+)
Comments
On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: > On some systems, the Performance Counter Global Status Register is > coming with reserved bits set, which causes the system to be unusable > if a simple `perf top` runs. The system hits the WARN() thousands times > while perf runs. > > WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 > > This happens because the "Performance Counter Global Status Register" > (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according > to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s > Manual, Volume 2: System Programming, 24593"[1] Would it then not make more sense to mask out bit7 before: + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; if (!status) goto done; ? Aside from being reserved, why are these bits magically set all of a sudden?
On Fri, Jun 16, 2023 at 03:29:54PM +0200, Peter Zijlstra wrote: > On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: > > On some systems, the Performance Counter Global Status Register is > > coming with reserved bits set, which causes the system to be unusable > > if a simple `perf top` runs. The system hits the WARN() thousands times > > while perf runs. > > > > WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 > > > > This happens because the "Performance Counter Global Status Register" > > (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according > > to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s > > Manual, Volume 2: System Programming, 24593"[1] > > Would it then not make more sense to mask out bit7 before: It is more than bit 7. This is the register structure according to the document above: Bits Mnemonic Description Access type 63:60 Reserved RO 59 PMCF Performance Counter Freeze RO 58 LBRSF Last Branch Record Stack Freeze RO 57:6 Reserved RO 5:0 CNT_OF Counter overflow for PerfCnt[5:0] RO In the code, bit GLOBAL_STATUS_LBRS_FROZEN is handled and cleared before we reach my changes That said, your approach is almost similar to what I did, and I will be happy to change in order to make the code clearer. > + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; > if (!status) > goto done; > > ? > > Aside from being reserved, why are these bits magically set all of a > sudden? That is probably a question to AMD.
On 16/06/23 7:33 pm, Breno Leitao wrote: > On Fri, Jun 16, 2023 at 03:29:54PM +0200, Peter Zijlstra wrote: >> On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: >>> On some systems, the Performance Counter Global Status Register is >>> coming with reserved bits set, which causes the system to be unusable >>> if a simple `perf top` runs. The system hits the WARN() thousands times >>> while perf runs. >>> >>> WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 >>> >>> This happens because the "Performance Counter Global Status Register" >>> (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according >>> to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s >>> Manual, Volume 2: System Programming, 24593"[1] >> >> Would it then not make more sense to mask out bit7 before: > > It is more than bit 7. This is the register structure according to the document > above: > > Bits Mnemonic Description Access type > 63:60 Reserved RO > 59 PMCF Performance Counter Freeze RO > 58 LBRSF Last Branch Record Stack Freeze RO > 57:6 Reserved RO > 5:0 CNT_OF Counter overflow for PerfCnt[5:0] RO > > In the code, bit GLOBAL_STATUS_LBRS_FROZEN is handled and cleared before > we reach my changes > > That said, your approach is almost similar to what I did, and I will be happy > to change in order to make the code clearer. > >> + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; >> if (!status) >> goto done; >> >> ? >> >> Aside from being reserved, why are these bits magically set all of a >> sudden? > > That is probably a question to AMD. > The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. I am working out the details with Breno and will report back with a resolution. - Sandipan [sending from my personal email as I currently don't have access to my work laptop]
On Fri, Jun 16, 2023 at 08:13:22PM +0530, Sandipan Das (AMD) wrote: > The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. > I am working out the details with Breno and will report back with a resolution. Thanks!
Hi Sandipan, Is there any update on this issue? We have hit the issue, and it makes the server pretty unusable due to the thousands of Call Traces being logged. Thanks a lot! Jirka On Fri, Jun 16, 2023 at 5:34 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Fri, Jun 16, 2023 at 08:13:22PM +0530, Sandipan Das (AMD) wrote: > > The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. > > I am working out the details with Breno and will report back with a resolution. > > Thanks! >
On Wed, Sep 13, 2023 at 04:30:44PM +0200, Jirka Hladky wrote: > Hi Sandipan, > > Is there any update on this issue? I still think we want to have this patch in Linux, so, we protect the kernel for whatever the hardware/firmware is doing.
I agree. Are you planning v2 of the patch, addressing the points raised by Peter Zijlstra? On Wed, Sep 13, 2023 at 5:03 PM Breno Leitao <leitao@debian.org> wrote: > > On Wed, Sep 13, 2023 at 04:30:44PM +0200, Jirka Hladky wrote: > > Hi Sandipan, > > > > Is there any update on this issue? > > I still think we want to have this patch in Linux, so, we protect the > kernel for whatever the hardware/firmware is doing. >
Hi Jirka, Can you please share the following? 1. Family, Model and Stepping of the processor 2. Microcode patch level On 9/13/2023 8:00 PM, Jirka Hladky wrote: > Hi Sandipan, > > Is there any update on this issue? We have hit the issue, and it makes > the server pretty unusable due to the thousands of Call Traces being > logged. > > Thanks a lot! > Jirka > > > On Fri, Jun 16, 2023 at 5:34 PM Peter Zijlstra <peterz@infradead.org> wrote: >> >> On Fri, Jun 16, 2023 at 08:13:22PM +0530, Sandipan Das (AMD) wrote: >>> The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. >>> I am working out the details with Breno and will report back with a resolution. >> >> Thanks! >> > >
Hi Peter, On Fri, Jun 16, 2023 at 03:29:54PM +0200, Peter Zijlstra wrote: > On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: > > On some systems, the Performance Counter Global Status Register is > > coming with reserved bits set, which causes the system to be unusable > > if a simple `perf top` runs. The system hits the WARN() thousands times > > while perf runs. > > > > WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 > > > > This happens because the "Performance Counter Global Status Register" > > (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according > > to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s > > Manual, Volume 2: System Programming, 24593"[1] > > Would it then not make more sense to mask out bit7 before: > > + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; > if (!status) > goto done; Instead of masking `status` against AMD_PMU_V2_GLOBAL_STATUS_RESERVED (AMD64_NUM_COUNTERS?), I opted for using the `amd_pmu_global_cntr_mask` global variable because it seems to represent what the loop below is iterating over: /* PMC Enable and Overflow bits for PerfCntrGlobal* registers */ static u64 amd_pmu_global_cntr_mask __read_mostly; Also, I think we want to WARN_ON_ONCE() if we see this problem. Right now, it warns at every time we call this function, which makes the machine unusable, but, warning it once could be helpful to figure out there is something wrong with the machine/firmware. Anyway, please let me know whatever is your preferred way and I will submit a v2.
Hi Sandipan, here is the info from /proc/cpuinfo vendor_id : AuthenticAMD cpu family : 25 model : 160 model name : AMD EPYC 9754 128-Core Processor stepping : 2 microcode : 0xaa0020f >2. Microcode patch level Is it the microcode string from cpuinfo provided above, or are you looking for something else? Thanks! Jirka On Wed, Sep 13, 2023 at 6:19 PM Sandipan Das <sandipan.das@amd.com> wrote: > > Hi Jirka, > > Can you please share the following? > > 1. Family, Model and Stepping of the processor > 2. Microcode patch level > On 9/13/2023 8:00 PM, Jirka Hladky wrote: > > Hi Sandipan, > > > > Is there any update on this issue? We have hit the issue, and it makes > > the server pretty unusable due to the thousands of Call Traces being > > logged. > > > > Thanks a lot! > > Jirka > > > > > > On Fri, Jun 16, 2023 at 5:34 PM Peter Zijlstra <peterz@infradead.org> wrote: > >> > >> On Fri, Jun 16, 2023 at 08:13:22PM +0530, Sandipan Das (AMD) wrote: > >>> The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. > >>> I am working out the details with Breno and will report back with a resolution. > >> > >> Thanks! > >> > > > > >
Hi Breno, I'm definitively voting for using WARN_ON_ONCE - in the current implementation, we are getting thousands of the same warnings and Call Traces, causing the system to become unusable. >Anyway, please let me know whatever is your preferred way and I will submit a v2. @Peter Zijlstra and @Sandipan - could you please comment on the preferred implementation of the patch? THANK YOU Jirka On Wed, Sep 13, 2023 at 6:24 PM Breno Leitao <leitao@debian.org> wrote: > > Hi Peter, > > On Fri, Jun 16, 2023 at 03:29:54PM +0200, Peter Zijlstra wrote: > > On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: > > > On some systems, the Performance Counter Global Status Register is > > > coming with reserved bits set, which causes the system to be unusable > > > if a simple `perf top` runs. The system hits the WARN() thousands times > > > while perf runs. > > > > > > WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 > > > > > > This happens because the "Performance Counter Global Status Register" > > > (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according > > > to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s > > > Manual, Volume 2: System Programming, 24593"[1] > > > > Would it then not make more sense to mask out bit7 before: > > > > + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; > > if (!status) > > goto done; > > Instead of masking `status` against AMD_PMU_V2_GLOBAL_STATUS_RESERVED > (AMD64_NUM_COUNTERS?), I opted for using the `amd_pmu_global_cntr_mask` > global variable because it seems to represent what the loop below is > iterating over: > > /* PMC Enable and Overflow bits for PerfCntrGlobal* registers */ > static u64 amd_pmu_global_cntr_mask __read_mostly; > > Also, I think we want to WARN_ON_ONCE() if we see this problem. Right > now, it warns at every time we call this function, which makes the > machine unusable, but, warning it once could be helpful to figure out > there is something wrong with the machine/firmware. > > Anyway, please let me know whatever is your preferred way and I will > submit a v2. >
Hi Jirka, Thanks for reporting back. Moving to the latest Family 19h microcode (link below) will fix the problem. https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/amd-ucode/microcode_amd_fam19h.bin On 9/14/2023 2:03 PM, Jirka Hladky wrote: > Hi Sandipan, > > here is the info from /proc/cpuinfo > > vendor_id : AuthenticAMD > cpu family : 25 > model : 160 > model name : AMD EPYC 9754 128-Core Processor > stepping : 2 > microcode : 0xaa0020f > >>2. Microcode patch level > Is it the microcode string from cpuinfo provided above, or are you looking for something else? > > Thanks! > Jirka > > > On Wed, Sep 13, 2023 at 6:19 PM Sandipan Das <sandipan.das@amd.com <mailto:sandipan.das@amd.com>> wrote: > > Hi Jirka, > > Can you please share the following? > > 1. Family, Model and Stepping of the processor > 2. Microcode patch level > On 9/13/2023 8:00 PM, Jirka Hladky wrote: > > Hi Sandipan, > > > > Is there any update on this issue? We have hit the issue, and it makes > > the server pretty unusable due to the thousands of Call Traces being > > logged. > > > > Thanks a lot! > > Jirka > > > > > > On Fri, Jun 16, 2023 at 5:34 PM Peter Zijlstra <peterz@infradead.org <mailto:peterz@infradead.org>> wrote: > >> > >> On Fri, Jun 16, 2023 at 08:13:22PM +0530, Sandipan Das (AMD) wrote: > >>> The reserved bits should never be set. The purpose of the WARN_ON() is to catch such anomalies. > >>> I am working out the details with Breno and will report back with a resolution. > >> > >> Thanks! > >> > > > > > > > > -- > -Jirka
Hi Breno, Jirka, On 9/14/2023 2:15 PM, Jirka Hladky wrote: > Hi Breno, > > I'm definitively voting for using WARN_ON_ONCE - in the current > implementation, we are getting thousands of the same warnings and Call > Traces, causing the system to become unusable. > >> Anyway, please let me know whatever is your preferred way and I will submit a v2. > @Peter Zijlstra and @Sandipan - could you please comment on the > preferred implementation of the patch? > I agree with using WARN_ON_ONCE() to make this less intrusive. > > On Wed, Sep 13, 2023 at 6:24 PM Breno Leitao <leitao@debian.org> wrote: >> >> Hi Peter, >> >> On Fri, Jun 16, 2023 at 03:29:54PM +0200, Peter Zijlstra wrote: >>> On Fri, Jun 16, 2023 at 04:53:15AM -0700, Breno Leitao wrote: >>>> On some systems, the Performance Counter Global Status Register is >>>> coming with reserved bits set, which causes the system to be unusable >>>> if a simple `perf top` runs. The system hits the WARN() thousands times >>>> while perf runs. >>>> >>>> WARNING: CPU: 18 PID: 20608 at arch/x86/events/amd/core.c:944 amd_pmu_v2_handle_irq+0x1be/0x2b0 >>>> >>>> This happens because the "Performance Counter Global Status Register" >>>> (PerfCntGlobalStatus) MSR has bit 7 set. Bit 7 should be reserved according >>>> to the documentation (Figure 13-12 from "AMD64 Architecture Programmer’s >>>> Manual, Volume 2: System Programming, 24593"[1] >>> >>> Would it then not make more sense to mask out bit7 before: >>> >>> + status &= ~AMD_PMU_V2_GLOBAL_STATUS_RESERVED; >>> if (!status) >>> goto done; >> >> Instead of masking `status` against AMD_PMU_V2_GLOBAL_STATUS_RESERVED >> (AMD64_NUM_COUNTERS?), I opted for using the `amd_pmu_global_cntr_mask` >> global variable because it seems to represent what the loop below is >> iterating over: >> >> /* PMC Enable and Overflow bits for PerfCntrGlobal* registers */ >> static u64 amd_pmu_global_cntr_mask __read_mostly; >> >> Also, I think we want to WARN_ON_ONCE() if we see this problem. Right >> now, it warns at every time we call this function, which makes the >> machine unusable, but, warning it once could be helpful to figure out >> there is something wrong with the machine/firmware. >> >> Anyway, please let me know whatever is your preferred way and I will >> submit a v2. >> > >
On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > Hi Breno, Jirka, > > On 9/14/2023 2:15 PM, Jirka Hladky wrote: > > Hi Breno, > > > > I'm definitively voting for using WARN_ON_ONCE - in the current > > implementation, we are getting thousands of the same warnings and Call > > Traces, causing the system to become unusable. > > > >> Anyway, please let me know whatever is your preferred way and I will submit a v2. > > @Peter Zijlstra and @Sandipan - could you please comment on the > > preferred implementation of the patch? > > > > I agree with using WARN_ON_ONCE() to make this less intrusive. Could you send a patch that AMD is happy with? I think you wrote this was a firmware bug and is sorted with a new firmware, so perhaps make it WARN_ONCE() and tell the users to upgrade their firmware to $ver ?
On 9/14/2023 2:42 PM, Peter Zijlstra wrote: > On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: >> Hi Breno, Jirka, >> >> On 9/14/2023 2:15 PM, Jirka Hladky wrote: >>> Hi Breno, >>> >>> I'm definitively voting for using WARN_ON_ONCE - in the current >>> implementation, we are getting thousands of the same warnings and Call >>> Traces, causing the system to become unusable. >>> >>>> Anyway, please let me know whatever is your preferred way and I will submit a v2. >>> @Peter Zijlstra and @Sandipan - could you please comment on the >>> preferred implementation of the patch? >>> >> >> I agree with using WARN_ON_ONCE() to make this less intrusive. > > Could you send a patch that AMD is happy with? I think you wrote this > was a firmware bug and is sorted with a new firmware, so perhaps make it > WARN_ONCE() and tell the users to upgrade their firmware to $ver ? Sure. Will do.
On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: > On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > > I agree with using WARN_ON_ONCE() to make this less intrusive. > > Could you send a patch that AMD is happy with? Why the current patch is not good enough?
On Thu, Sep 14, 2023 at 02:30:43AM -0700, Breno Leitao wrote: > On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: > > On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > > > > I agree with using WARN_ON_ONCE() to make this less intrusive. > > > > Could you send a patch that AMD is happy with? > > Why the current patch is not good enough? Sandipan, can you answer this? I don't tihnk I'm qualified to speak for the AMD pmu and certainly I don't have insight into their design future.
On 9/14/2023 4:48 PM, Peter Zijlstra wrote: > On Thu, Sep 14, 2023 at 02:30:43AM -0700, Breno Leitao wrote: >> On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: >>> On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: >> >>>> I agree with using WARN_ON_ONCE() to make this less intrusive. >>> >>> Could you send a patch that AMD is happy with? >> >> Why the current patch is not good enough? > > Sandipan, can you answer this? I don't tihnk I'm qualified to speak for > the AMD pmu and certainly I don't have insight into their design future. Hi Breno, Functionally, the patch looks good to me and I will be reusing it without any change to the authorship. However, as Peter suggested, I wanted to add a message to prompt users to update the microcode and also call out the required patch levels in the commit message since different Zen 4 variants and steppings use different microcode. Here's what I plan to send. diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index abadd5f23425..186a124bb3c0 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -909,6 +909,13 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs) status &= ~GLOBAL_STATUS_LBRS_FROZEN; } + if (status & ~amd_pmu_global_cntr_mask) + pr_warn_once("Unknown status bits are set (0x%llx), please consider updating microcode\n", + status); + + /* Clear any reserved bits set by buggy microcode */ + status &= amd_pmu_global_cntr_mask; + for (idx = 0; idx < x86_pmu.num_counters; idx++) { if (!test_bit(idx, cpuc->active_mask)) continue; -- Hi Peter, There is another case where users will see warnings but the patch to fix it (link below) is yet to be reviewed. May I rebase and resend it along with the above? https://lore.kernel.org/all/20230613105809.524535-1-sandipan.das@amd.com/
On Thu, Sep 14, 2023 at 04:52:13PM +0530, Sandipan Das wrote: > On 9/14/2023 4:48 PM, Peter Zijlstra wrote: > > On Thu, Sep 14, 2023 at 02:30:43AM -0700, Breno Leitao wrote: > >> On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: > >>> On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > >> > >>>> I agree with using WARN_ON_ONCE() to make this less intrusive. > >>> > >>> Could you send a patch that AMD is happy with? > >> > >> Why the current patch is not good enough? > > > > Sandipan, can you answer this? I don't tihnk I'm qualified to speak for > > the AMD pmu and certainly I don't have insight into their design future. > > Hi Breno, > > Functionally, the patch looks good to me and I will be reusing it > without any change to the authorship. However, as Peter suggested, I > wanted to add a message to prompt users to update the microcode and > also call out the required patch levels in the commit message since > different Zen 4 variants and steppings use different microcode. > > Here's what I plan to send. > > diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c > index abadd5f23425..186a124bb3c0 100644 > --- a/arch/x86/events/amd/core.c > +++ b/arch/x86/events/amd/core.c > @@ -909,6 +909,13 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs) > status &= ~GLOBAL_STATUS_LBRS_FROZEN; > } > > + if (status & ~amd_pmu_global_cntr_mask) > + pr_warn_once("Unknown status bits are set (0x%llx), please consider updating microcode\n", > + status); > + > + /* Clear any reserved bits set by buggy microcode */ > + status &= amd_pmu_global_cntr_mask; > + > for (idx = 0; idx < x86_pmu.num_counters; idx++) { > if (!test_bit(idx, cpuc->active_mask)) > continue; > > -- > > Hi Peter, > > There is another case where users will see warnings but the patch > to fix it (link below) is yet to be reviewed. May I rebase and > resend it along with the above? > > https://lore.kernel.org/all/20230613105809.524535-1-sandipan.das@amd.com/ > Sure, sorry I seem to have missed that :-(
On Thu, Sep 14, 2023 at 04:52:13PM +0530, Sandipan Das wrote: > On 9/14/2023 4:48 PM, Peter Zijlstra wrote: > > On Thu, Sep 14, 2023 at 02:30:43AM -0700, Breno Leitao wrote: > >> On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: > >>> On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > >> > >>>> I agree with using WARN_ON_ONCE() to make this less intrusive. > >>> > >>> Could you send a patch that AMD is happy with? > >> > >> Why the current patch is not good enough? > > > > Sandipan, can you answer this? I don't tihnk I'm qualified to speak for > > the AMD pmu and certainly I don't have insight into their design future. > > Hi Breno, > > Functionally, the patch looks good to me and I will be reusing it > without any change to the authorship. However, as Peter suggested, I > wanted to add a message to prompt users to update the microcode and > also call out the required patch levels in the commit message since > different Zen 4 variants and steppings use different microcode. > > Here's what I plan to send. Awesome. Thanks for addressing it.
Thank you, all! I have confirmed that the latest microcode fixes the issue. ============================================= cat /proc/cpuinfo microcode : 0xaa00212 perf top runs fine, without any kernel warnings ============================================= Jirka On Thu, Sep 14, 2023 at 2:22 PM Breno Leitao <leitao@debian.org> wrote: > > On Thu, Sep 14, 2023 at 04:52:13PM +0530, Sandipan Das wrote: > > On 9/14/2023 4:48 PM, Peter Zijlstra wrote: > > > On Thu, Sep 14, 2023 at 02:30:43AM -0700, Breno Leitao wrote: > > >> On Thu, Sep 14, 2023 at 11:12:34AM +0200, Peter Zijlstra wrote: > > >>> On Thu, Sep 14, 2023 at 02:25:40PM +0530, Sandipan Das wrote: > > >> > > >>>> I agree with using WARN_ON_ONCE() to make this less intrusive. > > >>> > > >>> Could you send a patch that AMD is happy with? > > >> > > >> Why the current patch is not good enough? > > > > > > Sandipan, can you answer this? I don't tihnk I'm qualified to speak for > > > the AMD pmu and certainly I don't have insight into their design future. > > > > Hi Breno, > > > > Functionally, the patch looks good to me and I will be reusing it > > without any change to the authorship. However, as Peter suggested, I > > wanted to add a message to prompt users to update the microcode and > > also call out the required patch levels in the commit message since > > different Zen 4 variants and steppings use different microcode. > > > > Here's what I plan to send. > > Awesome. Thanks for addressing it. >
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c index bccea57dee81..809ddb15c122 100644 --- a/arch/x86/events/amd/core.c +++ b/arch/x86/events/amd/core.c @@ -909,6 +909,10 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs) status &= ~GLOBAL_STATUS_LBRS_FROZEN; } + amd_pmu_global_cntr_mask = (1ULL << x86_pmu.num_counters) - 1; + WARN_ON_ONCE(status & ~amd_pmu_global_cntr_mask); + status &= amd_pmu_global_cntr_mask; + for (idx = 0; idx < x86_pmu.num_counters; idx++) { if (!test_bit(idx, cpuc->active_mask)) continue;