[3/6] x86/cpuid: Add smp helper

Message ID 827723d8f506411700c68bccc5072ec8d918d2de.1689748843.git.sandipan.das@amd.com
State New
Headers
Series perf/x86/amd: Add memory controller events |

Commit Message

Sandipan Das July 19, 2023, 6:55 a.m. UTC
  Depending on which CPU the CPUID instruction is executed, some leaves
can report different values. There are cases where it may be required
to know all possible values.

E.g. for AMD Zen 4 processors, the ActiveUmcMask field from leaf
0x80000022 ECX, which provides a way to determine the active memory
controllers, can have different masks on CPUs belonging to different
sockets as each socket can follow a different DIMM population scheme.
Each memory channel is assigned a memory controller (UMC) and if no
DIMMs are attached to a channel, the corresponding memory controller
is inactive. There are performance monitoring counters exclusive to
each memory controller which need to be represented under separate
PMUs. So, it will be necessary to know the active memory controllers
on each socket during the initialization of the UMC PMUs irrespective
of where the uncore driver's module init runs.

Add a new helper that executes CPUID on a particular CPU and returns
the EAX, EBX, ECX and EDX values.

Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---
 arch/x86/include/asm/cpuid.h | 14 ++++++++++++++
 arch/x86/lib/Makefile        |  2 +-
 arch/x86/lib/cpuid-smp.c     | 36 ++++++++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/lib/cpuid-smp.c
  

Comments

Peter Zijlstra July 19, 2023, 7:28 a.m. UTC | #1
On Wed, Jul 19, 2023 at 12:25:38PM +0530, Sandipan Das wrote:
> Depending on which CPU the CPUID instruction is executed, some leaves
> can report different values. There are cases where it may be required
> to know all possible values.
> 
> E.g. for AMD Zen 4 processors, the ActiveUmcMask field from leaf
> 0x80000022 ECX, which provides a way to determine the active memory
> controllers, can have different masks on CPUs belonging to different
> sockets as each socket can follow a different DIMM population scheme.
> Each memory channel is assigned a memory controller (UMC) and if no
> DIMMs are attached to a channel, the corresponding memory controller
> is inactive. There are performance monitoring counters exclusive to
> each memory controller which need to be represented under separate
> PMUs. So, it will be necessary to know the active memory controllers
> on each socket during the initialization of the UMC PMUs irrespective
> of where the uncore driver's module init runs.
> 
> Add a new helper that executes CPUID on a particular CPU and returns
> the EAX, EBX, ECX and EDX values.
> 

So I hate all this for multiple reasons:

 - the wohle foo_on_cpu() model generally leads to atrocious code that
   does multiple IPIs, I've seen rdmsr_on_cpu() followed by
   wrmsr_on_cpu() and worse things, just don't do this.

 - The whole CPUID thing is insane; we should read CPUID -- all of it --
   *ONCE* at bringup and thereafter never touch the instruction ever
   again. It could be people are already working on patches to this
   effect.

 - Different CPUID values for different CPUs is a pain :/
  
Thomas Gleixner July 19, 2023, 7:29 a.m. UTC | #2
On Wed, Jul 19 2023 at 12:25, Sandipan Das wrote:
> Depending on which CPU the CPUID instruction is executed, some leaves
> can report different values. There are cases where it may be required
> to know all possible values.
>
> E.g. for AMD Zen 4 processors, the ActiveUmcMask field from leaf
> 0x80000022 ECX, which provides a way to determine the active memory
> controllers, can have different masks on CPUs belonging to different
> sockets as each socket can follow a different DIMM population scheme.
> Each memory channel is assigned a memory controller (UMC) and if no
> DIMMs are attached to a channel, the corresponding memory controller
> is inactive. There are performance monitoring counters exclusive to
> each memory controller which need to be represented under separate
> PMUs. So, it will be necessary to know the active memory controllers
> on each socket during the initialization of the UMC PMUs irrespective
> of where the uncore driver's module init runs.
>
> Add a new helper that executes CPUID on a particular CPU and returns
> the EAX, EBX, ECX and EDX values.

NAK.

This madness has to stop. The correct thing is to parse the information
in CPUID at the point where the CPU comes online and store it for easy
consumption.

I'm in the process of reworking the CPUID and topology evaluation and
that's where these things need to be stored. I'm still fighting some
nightmares with the already existing mess.

Look at the mess people created over time here:

     https://lore.kernel.org/lkml/20230717223049.327865981@linutronix.de

No need to add more insanities to it. IOW, this has to wait for a week
or two until I settled the remaining issues.

Thanks

        tglx
  
Sandipan Das July 19, 2023, 7:35 a.m. UTC | #3
On 7/19/2023 12:59 PM, Thomas Gleixner wrote:
> On Wed, Jul 19 2023 at 12:25, Sandipan Das wrote:
>> Depending on which CPU the CPUID instruction is executed, some leaves
>> can report different values. There are cases where it may be required
>> to know all possible values.
>>
>> E.g. for AMD Zen 4 processors, the ActiveUmcMask field from leaf
>> 0x80000022 ECX, which provides a way to determine the active memory
>> controllers, can have different masks on CPUs belonging to different
>> sockets as each socket can follow a different DIMM population scheme.
>> Each memory channel is assigned a memory controller (UMC) and if no
>> DIMMs are attached to a channel, the corresponding memory controller
>> is inactive. There are performance monitoring counters exclusive to
>> each memory controller which need to be represented under separate
>> PMUs. So, it will be necessary to know the active memory controllers
>> on each socket during the initialization of the UMC PMUs irrespective
>> of where the uncore driver's module init runs.
>>
>> Add a new helper that executes CPUID on a particular CPU and returns
>> the EAX, EBX, ECX and EDX values.
> 
> NAK.
> 
> This madness has to stop. The correct thing is to parse the information
> in CPUID at the point where the CPU comes online and store it for easy
> consumption.
> 
> I'm in the process of reworking the CPUID and topology evaluation and
> that's where these things need to be stored. I'm still fighting some
> nightmares with the already existing mess.
> 
> Look at the mess people created over time here:
> 
>      https://lore.kernel.org/lkml/20230717223049.327865981@linutronix.de
> 
> No need to add more insanities to it. IOW, this has to wait for a week
> or two until I settled the remaining issues.
> 

Agreed. I'll rework the patches and remove this.

- Sandipan
  
Sandipan Das July 19, 2023, 7:37 a.m. UTC | #4
On 7/19/2023 12:58 PM, Peter Zijlstra wrote:
> On Wed, Jul 19, 2023 at 12:25:38PM +0530, Sandipan Das wrote:
>> Depending on which CPU the CPUID instruction is executed, some leaves
>> can report different values. There are cases where it may be required
>> to know all possible values.
>>
>> E.g. for AMD Zen 4 processors, the ActiveUmcMask field from leaf
>> 0x80000022 ECX, which provides a way to determine the active memory
>> controllers, can have different masks on CPUs belonging to different
>> sockets as each socket can follow a different DIMM population scheme.
>> Each memory channel is assigned a memory controller (UMC) and if no
>> DIMMs are attached to a channel, the corresponding memory controller
>> is inactive. There are performance monitoring counters exclusive to
>> each memory controller which need to be represented under separate
>> PMUs. So, it will be necessary to know the active memory controllers
>> on each socket during the initialization of the UMC PMUs irrespective
>> of where the uncore driver's module init runs.
>>
>> Add a new helper that executes CPUID on a particular CPU and returns
>> the EAX, EBX, ECX and EDX values.
>>
> 
> So I hate all this for multiple reasons:
> 
>  - the wohle foo_on_cpu() model generally leads to atrocious code that
>    does multiple IPIs, I've seen rdmsr_on_cpu() followed by
>    wrmsr_on_cpu() and worse things, just don't do this.
> 
>  - The whole CPUID thing is insane; we should read CPUID -- all of it --
>    *ONCE* at bringup and thereafter never touch the instruction ever
>    again. It could be people are already working on patches to this
>    effect.
> 
>  - Different CPUID values for different CPUs is a pain :/

Thanks for the clarification. Will remove this.

- Sandipan
  
Thomas Gleixner July 19, 2023, 11:50 a.m. UTC | #5
On Wed, Jul 19 2023 at 09:29, Thomas Gleixner wrote:
> On Wed, Jul 19 2023 at 12:25, Sandipan Das wrote:
>> Depending on which CPU the CPUID instruction is executed, some leaves
>> can report different values. There are cases where it may be required
>> to know all possible values.
>> Add a new helper that executes CPUID on a particular CPU and returns
>> the EAX, EBX, ECX and EDX values.
>
> NAK.
>
> This madness has to stop. The correct thing is to parse the information
> in CPUID at the point where the CPU comes online and store it for easy
> consumption.
>
> I'm in the process of reworking the CPUID and topology evaluation and
> that's where these things need to be stored. I'm still fighting some
> nightmares with the already existing mess.
>
> Look at the mess people created over time here:
>
>      https://lore.kernel.org/lkml/20230717223049.327865981@linutronix.de
>
> No need to add more insanities to it. IOW, this has to wait for a week
> or two until I settled the remaining issues.

Actually you can do that today already. Either make it part of the CPUID
evaluation or use the perf hotplug callback mechanism which is invoked
early on the upcoming CPU and read that CPUID leaf.
  
Sandipan Das July 19, 2023, 11:59 a.m. UTC | #6
On 7/19/2023 5:20 PM, Thomas Gleixner wrote:
> On Wed, Jul 19 2023 at 09:29, Thomas Gleixner wrote:
>> On Wed, Jul 19 2023 at 12:25, Sandipan Das wrote:
>>> Depending on which CPU the CPUID instruction is executed, some leaves
>>> can report different values. There are cases where it may be required
>>> to know all possible values.
>>> Add a new helper that executes CPUID on a particular CPU and returns
>>> the EAX, EBX, ECX and EDX values.
>>
>> NAK.
>>
>> This madness has to stop. The correct thing is to parse the information
>> in CPUID at the point where the CPU comes online and store it for easy
>> consumption.
>>
>> I'm in the process of reworking the CPUID and topology evaluation and
>> that's where these things need to be stored. I'm still fighting some
>> nightmares with the already existing mess.
>>
>> Look at the mess people created over time here:
>>
>>      https://lore.kernel.org/lkml/20230717223049.327865981@linutronix.de
>>
>> No need to add more insanities to it. IOW, this has to wait for a week
>> or two until I settled the remaining issues.
> 
> Actually you can do that today already. Either make it part of the CPUID
> evaluation or use the perf hotplug callback mechanism which is invoked
> early on the upcoming CPU and read that CPUID leaf.
> 

Thanks for the suggestions. I'll start with the hotplug callback mechanism.

- Sandipan
  

Patch

diff --git a/arch/x86/include/asm/cpuid.h b/arch/x86/include/asm/cpuid.h
index 9bee3e7bf973..17e74d4584f5 100644
--- a/arch/x86/include/asm/cpuid.h
+++ b/arch/x86/include/asm/cpuid.h
@@ -150,6 +150,20 @@  static __always_inline bool cpuid_function_is_indexed(u32 function)
 	return false;
 }
 
+#ifdef CONFIG_SMP
+int cpuid_on_cpu(unsigned int cpu, unsigned int op,
+		 unsigned int *eax, unsigned int *ebx,
+		 unsigned int *ecx, unsigned int *edx);
+#else	/* CONFIG_SMP */
+static inline int cpuid_on_cpu(unsigned int cpu, unsigned int op,
+			       unsigned int *eax, unsigned int *ebx,
+			       unsigned int *ecx, unsigned int *edx)
+{
+	cpuid(op, eax, ebx, ecx, edx);
+	return 0;
+}
+#endif	/* CONFIG_SMP */
+
 #define for_each_possible_hypervisor_cpuid_base(function) \
 	for (function = 0x40000000; function < 0x40010000; function += 0x100)
 
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index ea3a28e7b613..e0097ae55edf 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -39,7 +39,7 @@  $(obj)/inat.o: $(obj)/inat-tables.c
 
 clean-files := inat-tables.c
 
-obj-$(CONFIG_SMP) += msr-smp.o cache-smp.o
+obj-$(CONFIG_SMP) += msr-smp.o cache-smp.o cpuid-smp.o
 
 lib-y := delay.o misc.o cmdline.o cpu.o
 lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
diff --git a/arch/x86/lib/cpuid-smp.c b/arch/x86/lib/cpuid-smp.c
new file mode 100644
index 000000000000..87340893ff61
--- /dev/null
+++ b/arch/x86/lib/cpuid-smp.c
@@ -0,0 +1,36 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/export.h>
+#include <linux/smp.h>
+#include <asm/cpuid.h>
+
+struct cpuid_info {
+	u32 op;
+	struct cpuid_regs regs;
+};
+
+static void __cpuid_smp(void *info)
+{
+	struct cpuid_info *rv = info;
+
+	cpuid(rv->op, &rv->regs.eax, &rv->regs.ebx, &rv->regs.ecx, &rv->regs.edx);
+}
+
+int cpuid_on_cpu(unsigned int cpu, unsigned int op,
+		 unsigned int *eax, unsigned int *ebx,
+		 unsigned int *ecx, unsigned int *edx)
+{
+	struct cpuid_info rv;
+	int err;
+
+	memset(&rv, 0, sizeof(rv));
+
+	rv.op = op;
+	err = smp_call_function_single(cpu, __cpuid_smp, &rv, 1);
+	*eax = rv.regs.eax;
+	*ebx = rv.regs.ebx;
+	*ecx = rv.regs.ecx;
+	*edx = rv.regs.edx;
+
+	return err;
+}
+EXPORT_SYMBOL(cpuid_on_cpu);