[03/20] x86/mce: Use mce_setup() helpers for apei_smca_report_x86_error()

Message ID 20231118193248.1296798-4-yazen.ghannam@amd.com
State New
Headers
Series MCA Updates |

Commit Message

Yazen Ghannam Nov. 18, 2023, 7:32 p.m. UTC
  Current AMD systems may report MCA errors using the ACPI Boot Error
Record Table (BERT). The BERT entries for MCA errors will be an x86
Common Platform Error Record (CPER) with an MSR register context that
matches the MCAX/SMCA register space.

However, the BERT will not necessarily be processed on the CPU that
reported the MCA errors. Therefore, the correct CPU number needs to be
determined and the information saved in struct mce.

The CPU number is determined by searching all possible CPUs for a Local
APIC ID matching the value in the x86 CPER.

Set up the MCA record after searching for a CPU number. If no possible
CPU was found, then return early.

Gather the global MCA information first, save the found CPU number, then
gather the per-CPU information.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
 arch/x86/kernel/cpu/mce/apei.c     | 18 ++++++++----------
 arch/x86/kernel/cpu/mce/internal.h |  2 ++
 2 files changed, 10 insertions(+), 10 deletions(-)
  

Comments

Borislav Petkov Nov. 22, 2023, 6:28 p.m. UTC | #1
On Sat, Nov 18, 2023 at 01:32:31PM -0600, Yazen Ghannam wrote:
> Current AMD systems may report MCA errors using the ACPI Boot Error
> Record Table (BERT). The BERT entries for MCA errors will be an x86
> Common Platform Error Record (CPER) with an MSR register context that
> matches the MCAX/SMCA register space.
> 
> However, the BERT will not necessarily be processed on the CPU that
> reported the MCA errors. Therefore, the correct CPU number needs to be
> determined and the information saved in struct mce.
> 
> The CPU number is determined by searching all possible CPUs for a Local
> APIC ID matching the value in the x86 CPER.

Those below are explaining what the patch does. Not needed here.

> Set up the MCA record after searching for a CPU number. If no possible
> CPU was found, then return early.
> 
> Gather the global MCA information first, save the found CPU number, then
> gather the per-CPU information.
  
Yazen Ghannam Nov. 27, 2023, 2:53 p.m. UTC | #2
On 11/22/2023 1:28 PM, Borislav Petkov wrote:
> On Sat, Nov 18, 2023 at 01:32:31PM -0600, Yazen Ghannam wrote:
>> Current AMD systems may report MCA errors using the ACPI Boot Error
>> Record Table (BERT). The BERT entries for MCA errors will be an x86
>> Common Platform Error Record (CPER) with an MSR register context that
>> matches the MCAX/SMCA register space.
>>
>> However, the BERT will not necessarily be processed on the CPU that
>> reported the MCA errors. Therefore, the correct CPU number needs to be
>> determined and the information saved in struct mce.
>>
>> The CPU number is determined by searching all possible CPUs for a Local
>> APIC ID matching the value in the x86 CPER.
> 
> Those below are explaining what the patch does. Not needed here.
> 

Okay, will remove them.

>> Set up the MCA record after searching for a CPU number. If no possible
>> CPU was found, then return early.
>>
>> Gather the global MCA information first, save the found CPU number, then
>> gather the per-CPU information.
> 

Thanks,
Yazen
  

Patch

diff --git a/arch/x86/kernel/cpu/mce/apei.c b/arch/x86/kernel/cpu/mce/apei.c
index 7f7309ff67d0..33cefe6157eb 100644
--- a/arch/x86/kernel/cpu/mce/apei.c
+++ b/arch/x86/kernel/cpu/mce/apei.c
@@ -97,20 +97,18 @@  int apei_smca_report_x86_error(struct cper_ia_proc_ctx *ctx_info, u64 lapic_id)
 	if (ctx_info->reg_arr_size < 48)
 		return -EINVAL;
 
-	mce_setup(&m);
-
-	m.extcpu = -1;
-	m.socketid = -1;
-
 	for_each_possible_cpu(cpu) {
-		if (cpu_data(cpu).topo.initial_apicid == lapic_id) {
-			m.extcpu = cpu;
-			m.socketid = cpu_data(m.extcpu).topo.pkg_id;
+		if (cpu_data(cpu).topo.initial_apicid == lapic_id)
 			break;
-		}
 	}
 
-	m.apicid = lapic_id;
+	if (!cpu_possible(cpu))
+		return -EINVAL;
+
+	mce_setup_global(&m);
+	m.cpu = m.extcpu = cpu;
+	mce_setup_per_cpu(&m);
+
 	m.bank = (ctx_info->msr_addr >> 4) & 0xFF;
 	m.status = *i_mce;
 	m.addr = *(i_mce + 1);
diff --git a/arch/x86/kernel/cpu/mce/internal.h b/arch/x86/kernel/cpu/mce/internal.h
index e13a26c9c0ac..424c7461dcf9 100644
--- a/arch/x86/kernel/cpu/mce/internal.h
+++ b/arch/x86/kernel/cpu/mce/internal.h
@@ -209,6 +209,8 @@  enum mca_msr {
 
 /* Decide whether to add MCE record to MCE event pool or filter it out. */
 extern bool filter_mce(struct mce *m);
+void mce_setup_global(struct mce *m);
+void mce_setup_per_cpu(struct mce *m);
 
 #ifdef CONFIG_X86_MCE_AMD
 extern bool amd_filter_mce(struct mce *m);