[v9,03/13] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag

Message ID 166990897968.17806.16187773270520553795.stgit@bmoger-ubuntu
State New
Headers
Series Support for AMD QoS new features |

Commit Message

Moger, Babu Dec. 1, 2022, 3:36 p.m. UTC
  Newer AMD processors support the new feature Bandwidth Monitoring Event
Configuration (BMEC).

The feature support is identified via CPUID Fn8000_0020_EBX_x0 (ECX=0).
Bits    Field Name    Description
3       EVT_CFG       Bandwidth Monitoring Event Configuration (BMEC)

Currently, the bandwidth monitoring events mbm_total_bytes and
mbm_local_bytes are set to count all the total and local reads/writes
respectively. With the introduction of slow memory, the two counters
are not enough to count all the different types of memory events. With
the feature BMEC, the users have the option to configure
mbm_total_bytes and mbm_local_bytes to count the specific type of
events.

Each BMEC event has a configuration MSR, which contains one field for
each bandwidth type that can be used to configure the bandwidth event
to track any combination of supported bandwidth types. The event will
count requests from every bandwidth type bit that is set in the
corresponding configuration register.

Following are the types of events supported:

====    ========================================================
Bits    Description
====    ========================================================
6       Dirty Victims from the QOS domain to all types of memory
5       Reads to slow memory in the non-local NUMA domain
4       Reads to slow memory in the local NUMA domain
3       Non-temporal writes to non-local NUMA domain
2       Non-temporal writes to local NUMA domain
1       Reads to memory in the non-local NUMA domain
0       Reads to memory in the local NUMA domain
====    ========================================================

By default, the mbm_total_bytes configuration is set to 0x7F to count
all the event types and the mbm_local_bytes configuration is set to
0x15 to count all the local memory events.

Feature description is available in the specification, "AMD64
Technology Platform Quality of Service Extensions, Revision: 1.03
Publication

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 arch/x86/include/asm/cpufeatures.h |    1 +
 arch/x86/kernel/cpu/cpuid-deps.c   |    2 ++
 arch/x86/kernel/cpu/scattered.c    |    1 +
 3 files changed, 4 insertions(+)
  

Comments

Reinette Chatre Dec. 15, 2022, 5:11 p.m. UTC | #1
Hi Babu,

On 12/1/2022 7:36 AM, Babu Moger wrote:
> Newer AMD processors support the new feature Bandwidth Monitoring Event
> Configuration (BMEC).
> 
> The feature support is identified via CPUID Fn8000_0020_EBX_x0 (ECX=0).
> Bits    Field Name    Description
> 3       EVT_CFG       Bandwidth Monitoring Event Configuration (BMEC)
> 
> Currently, the bandwidth monitoring events mbm_total_bytes and

Please drop "Currently,".

> mbm_local_bytes are set to count all the total and local reads/writes
> respectively. With the introduction of slow memory, the two counters
> are not enough to count all the different types of memory events. With
> the feature BMEC, the users have the option to configure
> mbm_total_bytes and mbm_local_bytes to count the specific type of
> events.
> 
> Each BMEC event has a configuration MSR, which contains one field for
> each bandwidth type that can be used to configure the bandwidth event
> to track any combination of supported bandwidth types. The event will
> count requests from every bandwidth type bit that is set in the
> corresponding configuration register.
> 
> Following are the types of events supported:
> 
> ====    ========================================================
> Bits    Description
> ====    ========================================================
> 6       Dirty Victims from the QOS domain to all types of memory
> 5       Reads to slow memory in the non-local NUMA domain
> 4       Reads to slow memory in the local NUMA domain
> 3       Non-temporal writes to non-local NUMA domain
> 2       Non-temporal writes to local NUMA domain
> 1       Reads to memory in the non-local NUMA domain
> 0       Reads to memory in the local NUMA domain
> ====    ========================================================
> 
> By default, the mbm_total_bytes configuration is set to 0x7F to count
> all the event types and the mbm_local_bytes configuration is set to
> 0x15 to count all the local memory events.
> 
> Feature description is available in the specification, "AMD64
> Technology Platform Quality of Service Extensions, Revision: 1.03
> Publication

Missing end quote above.

> 
> Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>

Same comment about "Link:" ordering as for patch 1.

> ---
>  arch/x86/include/asm/cpufeatures.h |    1 +
>  arch/x86/kernel/cpu/cpuid-deps.c   |    2 ++
>  arch/x86/kernel/cpu/scattered.c    |    1 +
>  3 files changed, 4 insertions(+)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b6a45e56cd0c..415796d7b309 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -308,6 +308,7 @@
>  #define X86_FEATURE_CALL_DEPTH		(11*32+19) /* "" Call depth tracking for RSB stuffing */
>  #define X86_FEATURE_MSR_TSX_CTRL	(11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
>  #define X86_FEATURE_SMBA		(11*32+21) /* Slow Memory Bandwidth Allocation */
> +#define X86_FEATURE_BMEC		(11*32+22) /* Bandwidth Monitoring Event Configuration */
>  
>  /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
>  #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d95221117129..f6748c8bd647 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -68,6 +68,8 @@ static const struct cpuid_dep cpuid_deps[] = {
>  	{ X86_FEATURE_CQM_OCCUP_LLC,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
>  	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
> +	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_TOTAL   },
> +	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_LOCAL   },
>  	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
>  	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
>  	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
> diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
> index d925753084fb..0dad49a09b7a 100644
> --- a/arch/x86/kernel/cpu/scattered.c
> +++ b/arch/x86/kernel/cpu/scattered.c
> @@ -46,6 +46,7 @@ static const struct cpuid_bit cpuid_bits[] = {
>  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
>  	{ X86_FEATURE_MBA,		CPUID_EBX,  6, 0x80000008, 0 },
>  	{ X86_FEATURE_SMBA,		CPUID_EBX,  2, 0x80000020, 0 },
> +	{ X86_FEATURE_BMEC,		CPUID_EBX,  3, 0x80000020, 0 },
>  	{ X86_FEATURE_PERFMON_V2,	CPUID_EAX,  0, 0x80000022, 0 },
>  	{ X86_FEATURE_AMD_LBR_V2,	CPUID_EAX,  1, 0x80000022, 0 },
>  	{ 0, 0, 0, 0, 0 }
> 
> 

With changelog comments addressed:

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette
  
Moger, Babu Dec. 19, 2022, 3:31 p.m. UTC | #2
[AMD Official Use Only - General]

Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@intel.com>
> Sent: Thursday, December 15, 2022 11:11 AM
> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net;
> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de
> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org;
> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org;
> quic_neeraju@quicinc.com; rdunlap@infradead.org;
> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com;
> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com;
> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com;
> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan
> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com;
> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org;
> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu;
> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com;
> peternewman@google.com
> Subject: Re: [PATCH v9 03/13] x86/cpufeatures: Add Bandwidth Monitoring
> Event Configuration feature flag
> 
> Hi Babu,
> 
> On 12/1/2022 7:36 AM, Babu Moger wrote:
> > Newer AMD processors support the new feature Bandwidth Monitoring
> > Event Configuration (BMEC).
> >
> > The feature support is identified via CPUID Fn8000_0020_EBX_x0 (ECX=0).
> > Bits    Field Name    Description
> > 3       EVT_CFG       Bandwidth Monitoring Event Configuration (BMEC)
> >
> > Currently, the bandwidth monitoring events mbm_total_bytes and
> 
> Please drop "Currently,".
Sure.
> 
> > mbm_local_bytes are set to count all the total and local reads/writes
> > respectively. With the introduction of slow memory, the two counters
> > are not enough to count all the different types of memory events. With
> > the feature BMEC, the users have the option to configure
> > mbm_total_bytes and mbm_local_bytes to count the specific type of
> > events.
> >
> > Each BMEC event has a configuration MSR, which contains one field for
> > each bandwidth type that can be used to configure the bandwidth event
> > to track any combination of supported bandwidth types. The event will
> > count requests from every bandwidth type bit that is set in the
> > corresponding configuration register.
> >
> > Following are the types of events supported:
> >
> > ====    ========================================================
> > Bits    Description
> > ====    ========================================================
> > 6       Dirty Victims from the QOS domain to all types of memory
> > 5       Reads to slow memory in the non-local NUMA domain
> > 4       Reads to slow memory in the local NUMA domain
> > 3       Non-temporal writes to non-local NUMA domain
> > 2       Non-temporal writes to local NUMA domain
> > 1       Reads to memory in the non-local NUMA domain
> > 0       Reads to memory in the local NUMA domain
> > ====    ========================================================
> >
> > By default, the mbm_total_bytes configuration is set to 0x7F to count
> > all the event types and the mbm_local_bytes configuration is set to
> > 0x15 to count all the local memory events.
> >
> > Feature description is available in the specification, "AMD64
> > Technology Platform Quality of Service Extensions, Revision: 1.03
> > Publication
> 
> Missing end quote above.
Ok.
> 
> >
> > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.
> > amd.com%2Fen%2Fsupport%2Ftech-docs%2Famd64-technology-platform-
> quality
> > -service-
> extensions&amp;data=05%7C01%7Cbabu.moger%40amd.com%7Cbf8c3716
> >
> eed74809726508dadebfbee2%7C3dd8961fe4884e608e11a82d994e183d%7C0%
> 7C0%7C
> >
> 638067212372390994%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
> AiLCJQIjo
> >
> iV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdat
> a=nLkb
> >
> Wj%2FCed7gAf%2B8qfEec%2FMa1lKgi83EsPZs%2FU%2FOLWc%3D&amp;reserv
> ed=0
> > Link:
> > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugz
> >
> illa.kernel.org%2Fshow_bug.cgi%3Fid%3D206537&amp;data=05%7C01%7Cbab
> u.m
> >
> oger%40amd.com%7Cbf8c3716eed74809726508dadebfbee2%7C3dd8961fe488
> 4e608e
> >
> 11a82d994e183d%7C0%7C0%7C638067212372390994%7CUnknown%7CTWFpb
> GZsb3d8ey
> >
> JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7
> C300
> >
> 0%7C%7C%7C&amp;sdata=DTiTxeQYtjCz50emuxvoHFJbOPJUO0qcUHjwnQAxP3
> o%3D&am
> > p;reserved=0
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> 
> Same comment about "Link:" ordering as for patch 1.

Sure.
> 
> > ---
> >  arch/x86/include/asm/cpufeatures.h |    1 +
> >  arch/x86/kernel/cpu/cpuid-deps.c   |    2 ++
> >  arch/x86/kernel/cpu/scattered.c    |    1 +
> >  3 files changed, 4 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/cpufeatures.h
> > b/arch/x86/include/asm/cpufeatures.h
> > index b6a45e56cd0c..415796d7b309 100644
> > --- a/arch/x86/include/asm/cpufeatures.h
> > +++ b/arch/x86/include/asm/cpufeatures.h
> > @@ -308,6 +308,7 @@
> >  #define X86_FEATURE_CALL_DEPTH		(11*32+19) /* "" Call depth
> tracking for RSB stuffing */
> >  #define X86_FEATURE_MSR_TSX_CTRL	(11*32+20) /* "" MSR
> IA32_TSX_CTRL (Intel) implemented */
> >  #define X86_FEATURE_SMBA		(11*32+21) /* Slow Memory
> Bandwidth Allocation */
> > +#define X86_FEATURE_BMEC		(11*32+22) /* Bandwidth
> Monitoring Event Configuration */
> >
> >  /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
> >  #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI
> instructions */
> > diff --git a/arch/x86/kernel/cpu/cpuid-deps.c
> > b/arch/x86/kernel/cpu/cpuid-deps.c
> > index d95221117129..f6748c8bd647 100644
> > --- a/arch/x86/kernel/cpu/cpuid-deps.c
> > +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> > @@ -68,6 +68,8 @@ static const struct cpuid_dep cpuid_deps[] = {
> >  	{ X86_FEATURE_CQM_OCCUP_LLC,
> 	X86_FEATURE_CQM_LLC   },
> >  	{ X86_FEATURE_CQM_MBM_TOTAL,
> 	X86_FEATURE_CQM_LLC   },
> >  	{ X86_FEATURE_CQM_MBM_LOCAL,
> 	X86_FEATURE_CQM_LLC   },
> > +	{ X86_FEATURE_BMEC,
> 	X86_FEATURE_CQM_MBM_TOTAL   },
> > +	{ X86_FEATURE_BMEC,
> 	X86_FEATURE_CQM_MBM_LOCAL   },
> >  	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
> >  	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
> >  	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES
> },
> > diff --git a/arch/x86/kernel/cpu/scattered.c
> > b/arch/x86/kernel/cpu/scattered.c index d925753084fb..0dad49a09b7a
> > 100644
> > --- a/arch/x86/kernel/cpu/scattered.c
> > +++ b/arch/x86/kernel/cpu/scattered.c
> > @@ -46,6 +46,7 @@ static const struct cpuid_bit cpuid_bits[] = {
> >  	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
> >  	{ X86_FEATURE_MBA,		CPUID_EBX,  6, 0x80000008, 0 },
> >  	{ X86_FEATURE_SMBA,		CPUID_EBX,  2, 0x80000020, 0 },
> > +	{ X86_FEATURE_BMEC,		CPUID_EBX,  3, 0x80000020, 0 },
> >  	{ X86_FEATURE_PERFMON_V2,	CPUID_EAX,  0, 0x80000022, 0 },
> >  	{ X86_FEATURE_AMD_LBR_V2,	CPUID_EAX,  1, 0x80000022, 0 },
> >  	{ 0, 0, 0, 0, 0 }
> >
> >
> 
> With changelog comments addressed:
> 
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Thanks Babu
> 
> Reinette
  

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b6a45e56cd0c..415796d7b309 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -308,6 +308,7 @@ 
 #define X86_FEATURE_CALL_DEPTH		(11*32+19) /* "" Call depth tracking for RSB stuffing */
 #define X86_FEATURE_MSR_TSX_CTRL	(11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
 #define X86_FEATURE_SMBA		(11*32+21) /* Slow Memory Bandwidth Allocation */
+#define X86_FEATURE_BMEC		(11*32+22) /* Bandwidth Monitoring Event Configuration */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI		(12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index d95221117129..f6748c8bd647 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -68,6 +68,8 @@  static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_OCCUP_LLC,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
+	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_TOTAL   },
+	{ X86_FEATURE_BMEC,			X86_FEATURE_CQM_MBM_LOCAL   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
 	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index d925753084fb..0dad49a09b7a 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -46,6 +46,7 @@  static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_PROC_FEEDBACK,    CPUID_EDX, 11, 0x80000007, 0 },
 	{ X86_FEATURE_MBA,		CPUID_EBX,  6, 0x80000008, 0 },
 	{ X86_FEATURE_SMBA,		CPUID_EBX,  2, 0x80000020, 0 },
+	{ X86_FEATURE_BMEC,		CPUID_EBX,  3, 0x80000020, 0 },
 	{ X86_FEATURE_PERFMON_V2,	CPUID_EAX,  0, 0x80000022, 0 },
 	{ X86_FEATURE_AMD_LBR_V2,	CPUID_EAX,  1, 0x80000022, 0 },
 	{ 0, 0, 0, 0, 0 }