Message ID | 20230425121829.61755-1-xueshuai@linux.alibaba.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3384731vqo; Tue, 25 Apr 2023 06:01:18 -0700 (PDT) X-Google-Smtp-Source: AKy350aUBIvAuV9Cvt7qixOaAPMtvrenWrpEVopR9ZikJHn/JRk+vbVJNIlc6V3tn9aacIXXv49H X-Received: by 2002:a25:2fc7:0:b0:b72:546:f553 with SMTP id v190-20020a252fc7000000b00b720546f553mr12306019ybv.21.1682427677882; Tue, 25 Apr 2023 06:01:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682427677; cv=none; d=google.com; s=arc-20160816; b=ETCtTb6lWQSPslFXLPYHFovFKyBHL2nMjV8v58LkQXDXCG+VjNCjLmJAmmXPxy2Aky 4S3atd7/t8yhk5/j38K+jMutbWBTwgn8tXXdCuh+is7I/rKevBx48MNi1m7wJhngCzKZ zg+8U2M2H9V8YGo6sltUSDpwr8+hntbnRHqHgqw4CsOvv5L8aAMm6vmt2ZaHPL1NKcqD JjdyYJ6UgC4IyxZtDLoMA/woHulT6hxhGo+rWmGgHQAJ+hIiI7dfGLvY3hkoro6MexSz DD98mXO+POimEf4HKYrp8Cz+K5F+ej3KeTn8zoQp6ckW7lKCXLQImq+CIti6WkdM2Y+Q /9fA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=R2usMS3+N9QcR2RoeoX2JrSgji8rhXr6n6smY5/D5CU=; b=GwUClW3vz9674pv1vpo8q2hxtfH04SFXL+pwQ2Spv28Onh9YkuO8IGA1PuuaFOMUW8 oIM5mDQ7CYHb354V5r5rntpAmRQNKZGFRDbUTr9Dy4yyfmf80jEFJSScqQnO87uyi3/D BR4VZitIE2qv4RoW4jnzAIY72R2WdcJMcoOfu0f2Otzr2FJ3dUS07srVmKXb/bL8AYAQ wF2pHNhltLXCthdvJXHtGlwMfMETlt0oChgrrxVSBtCxqa+TDW/aR9d1OJ3e2zNxHM2u pDVLyOxyAkuWCpFTRMkKVb9fCkwZCAjlCNeRrBUvga3VgFG+UQdTA+Td4ugx3gw6KqW8 h3Iw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x27-20020a25acdb000000b00b99e0ff64c1si1246157ybd.486.2023.04.25.06.00.57; Tue, 25 Apr 2023 06:01:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233984AbjDYMSu (ORCPT <rfc822;zxc52fgh@gmail.com> + 99 others); Tue, 25 Apr 2023 08:18:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233943AbjDYMSq (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 25 Apr 2023 08:18:46 -0400 Received: from out30-131.freemail.mail.aliyun.com (out30-131.freemail.mail.aliyun.com [115.124.30.131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 398BFCC27; Tue, 25 Apr 2023 05:18:43 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R401e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046056;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=10;SR=0;TI=SMTPD_---0Vh-RnH7_1682425110; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0Vh-RnH7_1682425110) by smtp.aliyun-inc.com; Tue, 25 Apr 2023 20:18:40 +0800 From: Shuai Xue <xueshuai@linux.alibaba.com> To: bp@alien8.de, tony.luck@intel.com Cc: tglx@linutronix.de, mingo@redhat.com, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, baolin.wang@linux.alibaba.com, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] x86/mce/amd: init mce severity to handle deferred memory failure Date: Tue, 25 Apr 2023 20:18:29 +0800 Message-Id: <20230425121829.61755-1-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY, URIBL_BLOCKED,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1764153285061066589?= X-GMAIL-MSGID: =?utf-8?q?1764153285061066589?= |
Series |
x86/mce/amd: init mce severity to handle deferred memory failure
|
|
Commit Message
Shuai Xue
April 25, 2023, 12:18 p.m. UTC
When a deferred UE error is detected, e.g by background patrol scruber, it
will be handled in APIC interrupt handler amd_deferred_error_interrupt().
The handler will collect MCA banks, init mce struct and process it by
nofitying the registered MCE decode chain.
The uc_decode_notifier, one of MCE decode chain, will process memory
failure but only limit to MCE_AO_SEVERITY and MCE_DEFERRED_SEVERITY.
However, APIC interrupt handler does not init mce severity and the
uninitialized severity is 0 (MCE_NO_SEVERITY).
To handle the deferred memory failure case, init mce severity when logging
MCA banks.
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
Steps to reproduce:
step 1: inject a patrol scrub error by ras-tools
#einj_mem_uc patrol
step 2: check dmesg, no memory failure log
#dmesg -c
[51295.686806] mce: [Hardware Error]: Machine check events logged
[51295.693566] mce->status: 0x942031000400011b
[51295.698248] mce->misc: 0x00000000
[51295.701952] mce->severity: 0x00000000 # Manually added printk
[51295.726640] [Hardware Error]: Deferred error, no action required.
[51295.733448] [Hardware Error]: CPU:65 (19:11:1) MC21_STATUS[-|-|-|AddrV|-|-|SyndV|UECC|Deferred|-|Scrub]: 0x942031000400011b
[51295.733452] [Hardware Error]: Error Addr: 0x0000000006350a00
[51295.733453] [Hardware Error]: PPIN: 0x02b69e294c148024
[51295.733453] [Hardware Error]: IPID: 0x0000109600250f00, Syndrome: 0x9a4a00000b800000
[51295.733455] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
[51295.733463] mce: umc_normaddr_to_sysaddr: Invalid DramBaseAddress range: 0x0.
[51295.733471] EDAC MC0: 1 UE Cannot decode normalized address on mc#0csrow#0channel#2 (csrow:0 channel:2 page:0x0 offset:0x0 grain:64)
[51295.733471] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
After this fix:
[ 514.966892] mce: [Hardware Error]: Machine check events logged
[ 514.966912] mce->status: 0x942031000400011b
[ 514.978093] mce->misc: 0x00000000
[ 514.981796] mce->severity: 0x00000001
[ 514.985885] <uc_decode_notifier> pre_handler: p->addr = 0x00000000e09e69e4, ip = ffffffff8104b955, flags = 0x282
[ 514.997253] <uc_decode_notifier> post_handler: p->addr = 0x00000000e09e69e4, flags = 0x282
[ 515.006501] Memory failure: 0x5dc2: recovery action for free buddy page: Recovered
[ 515.015188] [Hardware Error]: Deferred error, no action required.
[ 515.022006] [Hardware Error]: CPU:67 (19:11:1) MC21_STATUS[-|-|-|AddrV|-|-|SyndV|UECC|Deferred|-|Scrub]: 0x942031000400011b
[ 515.034440] [Hardware Error]: Error Addr: 0x0000000005dc2a00
[ 515.034442] [Hardware Error]: PPIN: 0x02b69e294c148024
[ 515.034443] [Hardware Error]: IPID: 0x0000109600650f00, Syndrome: 0x9a4a00000b800008
[ 515.034445] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
[ 515.034453] umc_normaddr_to_sysaddr: Invalid DramBaseAddress range: 0x0.
[ 515.034458] EDAC MC1: 1 UE Cannot decode normalized address on mc#1csrow#0channel#6 (csrow:0 channel:6 page:0x0 offset:0x0 grain:64)
[ 515.034461] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
Note, the memory_failure handles wrong physical address because
umc_normaddr_to_sysaddr fails. I don't figure out why it fails.
---
arch/x86/kernel/cpu/mce/amd.c | 1 +
1 file changed, 1 insertion(+)
Comments
On 4/25/23 8:18 AM, Shuai Xue wrote: > When a deferred UE error is detected, e.g by background patrol scruber, it > will be handled in APIC interrupt handler amd_deferred_error_interrupt(). > The handler will collect MCA banks, init mce struct and process it by > nofitying the registered MCE decode chain. > > The uc_decode_notifier, one of MCE decode chain, will process memory > failure but only limit to MCE_AO_SEVERITY and MCE_DEFERRED_SEVERITY. > However, APIC interrupt handler does not init mce severity and the > uninitialized severity is 0 (MCE_NO_SEVERITY). > > To handle the deferred memory failure case, init mce severity when logging > MCA banks. > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > Hi Shuai Xue, I think this patch is fair to do. But it won't have the intended effect in practice. The value in MCA_ADDR for DRAM ECC errors will be a memory controller "normalized address". This is not a system physical address that the OS can use to take action. The mce_usable_address() function needs to be updated to handle this. I'll send a patchset this week to do so. Afterwards, the uc_decode_notifier will not attempt to handle these errors. Thanks, Yazen
On 2023/5/9 22:25, Yazen Ghannam wrote: > On 4/25/23 8:18 AM, Shuai Xue wrote: >> When a deferred UE error is detected, e.g by background patrol scruber, it >> will be handled in APIC interrupt handler amd_deferred_error_interrupt(). >> The handler will collect MCA banks, init mce struct and process it by >> nofitying the registered MCE decode chain. >> >> The uc_decode_notifier, one of MCE decode chain, will process memory >> failure but only limit to MCE_AO_SEVERITY and MCE_DEFERRED_SEVERITY. >> However, APIC interrupt handler does not init mce severity and the >> uninitialized severity is 0 (MCE_NO_SEVERITY). >> >> To handle the deferred memory failure case, init mce severity when logging >> MCA banks. >> >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >> > > Hi Shuai Xue, > > I think this patch is fair to do. But it won't have the intended effect > in practice. > > The value in MCA_ADDR for DRAM ECC errors will be a memory controller > "normalized address". This is not a system physical address that the OS > can use to take action. > > The mce_usable_address() function needs to be updated to handle this. > I'll send a patchset this week to do so. Afterwards, the > uc_decode_notifier will not attempt to handle these errors. From the experience of other platforms (e.g. ARM64 RAS and Intel MCA), uc_decode_notifier should handle these error to hard offline the corrupted page. If the corrupted page is a free buddy page, we can isolate it and avoid using the page in the future. In my test case, the error is detected by patrol scrubber in memory controller. The scrubber may lack of system address space perspective, and only reports "normalized address". But we can decode the "normalized address" to system address by EDAC (umc_normaddr_to_sysaddr), right? (I am not quite familiar with AMD RAS, please correct me if I am wrong) > > Thanks, > Yazen Thank you. Best Regards, Shuai
On 5/9/23 10:17 PM, Shuai Xue wrote: > > > On 2023/5/9 22:25, Yazen Ghannam wrote: >> On 4/25/23 8:18 AM, Shuai Xue wrote: >>> When a deferred UE error is detected, e.g by background patrol scruber, it >>> will be handled in APIC interrupt handler amd_deferred_error_interrupt(). >>> The handler will collect MCA banks, init mce struct and process it by >>> nofitying the registered MCE decode chain. >>> >>> The uc_decode_notifier, one of MCE decode chain, will process memory >>> failure but only limit to MCE_AO_SEVERITY and MCE_DEFERRED_SEVERITY. >>> However, APIC interrupt handler does not init mce severity and the >>> uninitialized severity is 0 (MCE_NO_SEVERITY). >>> >>> To handle the deferred memory failure case, init mce severity when logging >>> MCA banks. >>> >>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> >>> >> >> Hi Shuai Xue, >> >> I think this patch is fair to do. But it won't have the intended effect >> in practice. >> >> The value in MCA_ADDR for DRAM ECC errors will be a memory controller >> "normalized address". This is not a system physical address that the OS >> can use to take action. >> >> The mce_usable_address() function needs to be updated to handle this. >> I'll send a patchset this week to do so. Afterwards, the >> uc_decode_notifier will not attempt to handle these errors. > > From the experience of other platforms (e.g. ARM64 RAS and Intel MCA), > uc_decode_notifier should handle these error to hard offline the corrupted > page. If the corrupted page is a free buddy page, we can isolate it and avoid > using the page in the future. > > In my test case, the error is detected by patrol scrubber in memory controller. > The scrubber may lack of system address space perspective, and only reports > "normalized address". But we can decode the "normalized address" to system address > by EDAC (umc_normaddr_to_sysaddr), right? > > (I am not quite familiar with AMD RAS, please correct me if I am wrong) > Yes, that's correct. The address translation requires some updates that are still in-review. Afterwards, we can investigate ways to use the translated address. It may require some rework in the MCE notifier chain or, more simply, calling memory_failure() from the EDAC module itself. Thanks, Yazen
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c index 23c5072fbbb7..b5e1a27b0881 100644 --- a/arch/x86/kernel/cpu/mce/amd.c +++ b/arch/x86/kernel/cpu/mce/amd.c @@ -734,6 +734,7 @@ static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc) m.misc = misc; m.bank = bank; m.tsc = rdtsc(); + m.severity = mce_severity(&m, NULL, NULL, false); if (m.status & MCI_STATUS_ADDRV) { m.addr = addr;