[v3,2/2] x86/resctrl: Remove hard-coded memory bandwidth event configuration

Message ID 20240104212130.209490-2-babu.moger@amd.com
State New
Headers
Series None |

Commit Message

Moger, Babu Jan. 4, 2024, 9:21 p.m. UTC
  If the BMEC (Bandwidth Monitoring Event Configuration) feature is
supported, the bandwidth events can be configured. The maximum supported
bandwidth bitmask can be determined by following CPUID command.

CPUID_Fn80000020_ECX_x03 [Platform QoS Monitoring Bandwidth Event
Configuration] Read-only. Reset: 0000_007Fh.
Bits    Description
31:7    Reserved
 6:0    Identifies the bandwidth sources that can be tracked.

The bandwidth sources can change with the processor generations.
Remove the hard-coded value and detect using CPUID command. Also,
print the valid bitmask when the user tries to configure invalid value.

The CPUID details are documentation in the PPR listed below [1].
[1] Processor Programming Reference (PPR) Vol 1.1 for AMD Family 19h Model
11h B1 - 55901 Rev 0.25.

Fixes: dc2a3e857981 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
v3: Changed the event_mask name to mbm_cfg_mask. Added comments about the field.
    Reverted the masking of event configuration to original code.
    Few minor comment changes.

v2: Earlier sent as a part of ABMC feature.
    https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
    But this is not related to ABMC. Sending it separate now.
    Removed the global resctrl_max_evt_bitmask. Added event_mask as part of
    the resource.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 6 ++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++--
 3 files changed, 13 insertions(+), 2 deletions(-)
  

Comments

Reinette Chatre Jan. 5, 2024, 9:18 p.m. UTC | #1
Hi Babu,

On 1/4/2024 1:21 PM, Babu Moger wrote:
> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
> supported, the bandwidth events can be configured. The maximum supported
> bandwidth bitmask can be determined by following CPUID command.
> 
> CPUID_Fn80000020_ECX_x03 [Platform QoS Monitoring Bandwidth Event
> Configuration] Read-only. Reset: 0000_007Fh.
> Bits    Description
> 31:7    Reserved
>  6:0    Identifies the bandwidth sources that can be tracked.
> 
> The bandwidth sources can change with the processor generations.
> Remove the hard-coded value and detect using CPUID command. Also,

I do not think "Remove the hard-coded value" is accurate anymore.

> print the valid bitmask when the user tries to configure invalid value.
> 
> The CPUID details are documentation in the PPR listed below [1].

"are documentation" -> "are documented"

> [1] Processor Programming Reference (PPR) Vol 1.1 for AMD Family 19h Model
> 11h B1 - 55901 Rev 0.25.
> 
> Fixes: dc2a3e857981 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
> Signed-off-by: Babu Moger <babu.moger@amd.com>

Same comment about "Link:" as for patch 1/2.

> ---
> v3: Changed the event_mask name to mbm_cfg_mask. Added comments about the field.
>     Reverted the masking of event configuration to original code.
>     Few minor comment changes.
> 
> v2: Earlier sent as a part of ABMC feature.
>     https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
>     But this is not related to ABMC. Sending it separate now.
>     Removed the global resctrl_max_evt_bitmask. Added event_mask as part of
>     the resource.
> ---
>  arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
>  arch/x86/kernel/cpu/resctrl/monitor.c  | 6 ++++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++--
>  3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index d2979748fae4..e3dc35a00a19 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -394,6 +394,8 @@ struct rdt_parse_data {
>   * @msr_update:		Function pointer to update QOS MSRs
>   * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
>   * @mbm_width:		Monitor width, to detect and correct for overflow.
> + * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
> + *			Monitoring Event Configuration (BMEC) is supported.
>   * @cdp_enabled:	CDP state of this resource
>   *
>   * Members of this structure are either private to the architecture
> @@ -408,6 +410,7 @@ struct rdt_hw_resource {
>  				 struct rdt_resource *r);
>  	unsigned int		mon_scale;
>  	unsigned int		mbm_width;
> +	unsigned int		mbm_cfg_mask;
>  	bool			cdp_enabled;
>  };
>  
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index f136ac046851..acca577e2b06 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -813,6 +813,12 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>  		return ret;
>  
>  	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
> +		u32 eax, ebx, ecx, edx;
> +
> +		/* Detect list of bandwidth sources that can be tracked */
> +		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
> +		hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
> +
>  		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
>  			mbm_total_event.configurable = true;
>  			mbm_config_rftype_init("mbm_total_bytes_config");
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 69a1de92384a..5b5a8f0ffb2f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -1617,12 +1617,14 @@ static void mon_event_config_write(void *info)
>  static int mbm_config_write_domain(struct rdt_resource *r,
>  				   struct rdt_domain *d, u32 evtid, u32 val)

Not specific to this patch, but since the valid mask is per resource I do not think
it is necessary to check user provided value for every domain. The user provided value
can be checked earlier and only once in mon_config_write() before iterating over all
domains to write the value.

>  {
> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>  	struct mon_config_info mon_info = {0};
>  	int ret = 0;
>  
>  	/* mon_config cannot be more than the supported set of events */
> -	if (val > MAX_EVT_CONFIG_BITS) {
> -		rdt_last_cmd_puts("Invalid event configuration\n");
> +	if ((val & hw_res->mbm_cfg_mask) != val) {
> +		rdt_last_cmd_printf("Invalid input: The maximum valid bitmask is 0x%02x\n",
> +				    hw_res->mbm_cfg_mask);

I think keeping "Invalid event configuration" is useful to create a detailed message of:
"Invalid event configuration: maximum valid bitmask is 0x%02x"

Reinette
  
Moger, Babu Jan. 6, 2024, 12:13 a.m. UTC | #2
Hi Reinette,

On 1/5/2024 3:18 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 1/4/2024 1:21 PM, Babu Moger wrote:
>> If the BMEC (Bandwidth Monitoring Event Configuration) feature is
>> supported, the bandwidth events can be configured. The maximum supported
>> bandwidth bitmask can be determined by following CPUID command.
>>
>> CPUID_Fn80000020_ECX_x03 [Platform QoS Monitoring Bandwidth Event
>> Configuration] Read-only. Reset: 0000_007Fh.
>> Bits    Description
>> 31:7    Reserved
>>   6:0    Identifies the bandwidth sources that can be tracked.
>>
>> The bandwidth sources can change with the processor generations.
>> Remove the hard-coded value and detect using CPUID command. Also,
> I do not think "Remove the hard-coded value" is accurate anymore.

Will change it to.

"Read the supported bandwidth sources using CPUID command. Also,"

Also I need to update the subject line.

>
>> print the valid bitmask when the user tries to configure invalid value.
>>
>> The CPUID details are documentation in the PPR listed below [1].
> "are documentation" -> "are documented"
Sure.
>
>> [1] Processor Programming Reference (PPR) Vol 1.1 for AMD Family 19h Model
>> 11h B1 - 55901 Rev 0.25.
>>
>> Fixes: dc2a3e857981 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Same comment about "Link:" as for patch 1/2.
Sure.
>
>> ---
>> v3: Changed the event_mask name to mbm_cfg_mask. Added comments about the field.
>>      Reverted the masking of event configuration to original code.
>>      Few minor comment changes.
>>
>> v2: Earlier sent as a part of ABMC feature.
>>      https://lore.kernel.org/lkml/20231201005720.235639-1-babu.moger@amd.com/
>>      But this is not related to ABMC. Sending it separate now.
>>      Removed the global resctrl_max_evt_bitmask. Added event_mask as part of
>>      the resource.
>> ---
>>   arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
>>   arch/x86/kernel/cpu/resctrl/monitor.c  | 6 ++++++
>>   arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++++--
>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>> index d2979748fae4..e3dc35a00a19 100644
>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>> @@ -394,6 +394,8 @@ struct rdt_parse_data {
>>    * @msr_update:		Function pointer to update QOS MSRs
>>    * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
>>    * @mbm_width:		Monitor width, to detect and correct for overflow.
>> + * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
>> + *			Monitoring Event Configuration (BMEC) is supported.
>>    * @cdp_enabled:	CDP state of this resource
>>    *
>>    * Members of this structure are either private to the architecture
>> @@ -408,6 +410,7 @@ struct rdt_hw_resource {
>>   				 struct rdt_resource *r);
>>   	unsigned int		mon_scale;
>>   	unsigned int		mbm_width;
>> +	unsigned int		mbm_cfg_mask;
>>   	bool			cdp_enabled;
>>   };
>>   
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index f136ac046851..acca577e2b06 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -813,6 +813,12 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>>   		return ret;
>>   
>>   	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
>> +		u32 eax, ebx, ecx, edx;
>> +
>> +		/* Detect list of bandwidth sources that can be tracked */
>> +		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
>> +		hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
>> +
>>   		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
>>   			mbm_total_event.configurable = true;
>>   			mbm_config_rftype_init("mbm_total_bytes_config");
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 69a1de92384a..5b5a8f0ffb2f 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1617,12 +1617,14 @@ static void mon_event_config_write(void *info)
>>   static int mbm_config_write_domain(struct rdt_resource *r,
>>   				   struct rdt_domain *d, u32 evtid, u32 val)
> Not specific to this patch, but since the valid mask is per resource I do not think
> it is necessary to check user provided value for every domain. The user provided value
> can be checked earlier and only once in mon_config_write() before iterating over all
> domains to write the value.
Yes. Agree.
>
>>   {
>> +	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>   	struct mon_config_info mon_info = {0};
>>   	int ret = 0;
>>   
>>   	/* mon_config cannot be more than the supported set of events */
>> -	if (val > MAX_EVT_CONFIG_BITS) {
>> -		rdt_last_cmd_puts("Invalid event configuration\n");
>> +	if ((val & hw_res->mbm_cfg_mask) != val) {
>> +		rdt_last_cmd_printf("Invalid input: The maximum valid bitmask is 0x%02x\n",
>> +				    hw_res->mbm_cfg_mask);
> I think keeping "Invalid event configuration" is useful to create a detailed message of:
> "Invalid event configuration: maximum valid bitmask is 0x%02x"

Sure.

Thanks

Babu
  

Patch

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index d2979748fae4..e3dc35a00a19 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -394,6 +394,8 @@  struct rdt_parse_data {
  * @msr_update:		Function pointer to update QOS MSRs
  * @mon_scale:		cqm counter * mon_scale = occupancy in bytes
  * @mbm_width:		Monitor width, to detect and correct for overflow.
+ * @mbm_cfg_mask:	Bandwidth sources that can be tracked when Bandwidth
+ *			Monitoring Event Configuration (BMEC) is supported.
  * @cdp_enabled:	CDP state of this resource
  *
  * Members of this structure are either private to the architecture
@@ -408,6 +410,7 @@  struct rdt_hw_resource {
 				 struct rdt_resource *r);
 	unsigned int		mon_scale;
 	unsigned int		mbm_width;
+	unsigned int		mbm_cfg_mask;
 	bool			cdp_enabled;
 };
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f136ac046851..acca577e2b06 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -813,6 +813,12 @@  int __init rdt_get_mon_l3_config(struct rdt_resource *r)
 		return ret;
 
 	if (rdt_cpu_has(X86_FEATURE_BMEC)) {
+		u32 eax, ebx, ecx, edx;
+
+		/* Detect list of bandwidth sources that can be tracked */
+		cpuid_count(0x80000020, 3, &eax, &ebx, &ecx, &edx);
+		hw_res->mbm_cfg_mask = ecx & MAX_EVT_CONFIG_BITS;
+
 		if (rdt_cpu_has(X86_FEATURE_CQM_MBM_TOTAL)) {
 			mbm_total_event.configurable = true;
 			mbm_config_rftype_init("mbm_total_bytes_config");
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 69a1de92384a..5b5a8f0ffb2f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1617,12 +1617,14 @@  static void mon_event_config_write(void *info)
 static int mbm_config_write_domain(struct rdt_resource *r,
 				   struct rdt_domain *d, u32 evtid, u32 val)
 {
+	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct mon_config_info mon_info = {0};
 	int ret = 0;
 
 	/* mon_config cannot be more than the supported set of events */
-	if (val > MAX_EVT_CONFIG_BITS) {
-		rdt_last_cmd_puts("Invalid event configuration\n");
+	if ((val & hw_res->mbm_cfg_mask) != val) {
+		rdt_last_cmd_printf("Invalid input: The maximum valid bitmask is 0x%02x\n",
+				    hw_res->mbm_cfg_mask);
 		return -EINVAL;
 	}