[v2,1/4] x86/resctrl: Enable non-contiguous bits in Intel CAT

Message ID 918e147601697b1d4b8f8589f5751e05d6ceccdf.1695371055.git.maciej.wieczor-retman@intel.com
Series x86/resctrl: Non-contiguous bitmasks in Intel CAT

Commit Message

Maciej Wieczor-Retman Sept. 22, 2023, 8:48 a.m. UTC
  The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file will fail before resctrl passes
the value to the hardware.

In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
being reserved and now carry information about non-contiguous 1s
value support for L3 and L2 cache respectively. The CAT
capacity bitmask (CBM) supports a non-contiguous 1s value if
the bit is set.

Replace the hardcoded non-contiguous support value with
the support learned from the hardware. Add hardcoded non-contiguous
support value to Haswell probe since it can't make use of CPUID for
Cache allocation.

Originally-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v2:
- Rewrite part of a comment concerning Haswell. (Reinette)

 arch/x86/kernel/cpu/resctrl/core.c        |  9 ++++++---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
 arch/x86/kernel/cpu/resctrl/internal.h    |  9 +++++++++
 3 files changed, 21 insertions(+), 7 deletions(-)
  
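(A standalone illustration, not part of the patch: the ECX bit-field layout the
patch introduces treats bit 3 as the non-contiguous CBM capability. The CPUID
value below is made up for the example.)

#include <stdio.h>

/* Mirrors the cpuid_0x10_x_ecx union added by the patch. */
union cpuid_0x10_x_ecx {
	struct {
		unsigned int reserved:3;
		unsigned int noncont:1;
	} split;
	unsigned int full;
};

int main(void)
{
	union cpuid_0x10_x_ecx ecx = { .full = 0x8 };	/* example value: bit 3 set */

	printf("non-contiguous CBM supported: %u\n", ecx.split.noncont);
	return 0;
}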

Comments

Maciej Wieczor-Retman Sept. 27, 2023, 9:20 a.m. UTC | #1
Hi, and thanks for the review!

On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote:
>Hi Maciej,
>
>On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote:
>> The setting for non-contiguous 1s support in Intel CAT is
>> hardcoded to false. On these systems, writing non-contiguous
>> 1s into the schemata file will fail before resctrl passes
>> the value to the hardware.
>> 
>> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
>> being reserved and now carry information about non-contiguous 1s
>> value support for L3 and L2 cache respectively. The CAT
>> capacity bitmask (CBM) supports a non-contiguous 1s value if
>> the bit is set.
>
>How new of an SDM do I need? The June 2023 revision I downloaded today didn't
>list it.

It's not currently in the SDM but in the Intel® Architecture
Instruction Set Extensions and Future Features (which I mentioned in the
second paragraph of the cover letter). My version of the ISA pdf was
from June 2023.

>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 030d3b409768..c783a873147c 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>  	r->cache.cbm_len = 20;
>>  	r->cache.shareable_bits = 0xc0000;
>>  	r->cache.min_cbm_bits = 2;
>> +	r->cache.arch_has_sparse_bitmaps = false;
>>  	r->alloc_capable = true;
>>  
>>  	rdt_alloc_capable = true;
>> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>>  {
>>  	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>  	union cpuid_0x10_1_eax eax;
>> +	union cpuid_0x10_x_ecx ecx;
>>  	union cpuid_0x10_x_edx edx;
>> -	u32 ebx, ecx;
>> +	u32 ebx;
>>  
>> -	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
>> +	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>>  	hw_res->num_closid = edx.split.cos_max + 1;
>>  	r->cache.cbm_len = eax.split.cbm_len + 1;
>>  	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>>  	r->cache.shareable_bits = ebx & r->default_ctrl;
>>  	r->data_width = (r->cache.cbm_len + 3) / 4;
>> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>> +		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>
>This seems to be called after the clearing of arch_has_sparse_bitmaps in
>cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell,
>is it safe to use its value here?

I believe the calls go like this for a Haswell system:
resctrl_late_init() -> check_quirks() -> __check_quirks_intel() ->
-> cache_alloc_hsw_probe()

There this line is executed:
	rdt_alloc_capable = true;
where rdt_alloc_capable is global in the file scope.

Then later in:
resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources()

this is executed at the function beginning:
	if (rdt_alloc_capable)
		return true;

So the rest of the get_rdt_alloc_resources() is skipped and calls to
rdt_get_cache_alloc_cfg() never get executed.
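
To make the ordering easier to follow, here is a minimal standalone C model
(not the kernel code; the function bodies are simplified stand-ins) of why the
CPUID read never happens on a Haswell system:

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-ins for the file-scope resctrl state. */
static bool rdt_alloc_capable;
static bool arch_has_sparse_bitmaps;

/* Haswell quirk probe: hardcodes the cache properties. */
static void cache_alloc_hsw_probe(void)
{
	arch_has_sparse_bitmaps = false;
	rdt_alloc_capable = true;
}

/* Would read CPUID.0x10.x:ECX[3] on CPUs that enumerate CAT. */
static void rdt_get_cache_alloc_cfg(void)
{
	arch_has_sparse_bitmaps = true;	/* pretend the bit is set */
}

static bool get_rdt_alloc_resources(void)
{
	if (rdt_alloc_capable)
		return true;	/* Haswell: CPUID path skipped */
	rdt_get_cache_alloc_cfg();
	return true;
}

int main(void)
{
	cache_alloc_hsw_probe();	/* __check_quirks_intel() path runs first */
	get_rdt_alloc_resources();	/* get_rdt_resources() path */
	printf("arch_has_sparse_bitmaps = %d\n", arch_has_sparse_bitmaps);
	return 0;	/* prints 0: the hardcoded Haswell value survives */
}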
  
Peter Newman Sept. 27, 2023, 10:08 a.m. UTC | #2
Hi Maciej,

On Wed, Sep 27, 2023 at 11:20 AM Maciej Wieczór-Retman
<maciej.wieczor-retman@intel.com> wrote:
> On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote:
> >On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote:
> >> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
> >> being reserved and now carry information about non-contiguous 1s
> >> value support for L3 and L2 cache respectively. The CAT
> >> capacity bitmask (CBM) supports a non-contiguous 1s value if
> >> the bit is set.
> >
> >How new of an SDM do I need? The June 2023 revision I downloaded today didn't
> >list it.
>
> It's not currently in the SDM but in the Intel® Architecture
> Instruction Set Extensions and Future Features (which I mentioned in the
> second paragraph of the cover letter). My version of the ISA pdf was
> from June 2023.
>

I see it now, thanks!

> >> -    cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
> >> +    cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
> >>      hw_res->num_closid = edx.split.cos_max + 1;
> >>      r->cache.cbm_len = eax.split.cbm_len + 1;
> >>      r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
> >>      r->cache.shareable_bits = ebx & r->default_ctrl;
> >>      r->data_width = (r->cache.cbm_len + 3) / 4;
> >> +    if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> >> +            r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
> >
> >This seems to be called after the clearing of arch_has_sparse_bitmaps in
> >cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell,
> >is it safe to use its value here?
>
> I believe the calls go like this for a Haswell system:
> resctrl_late_init() -> check_quirks() -> __check_quirks_intel() ->
> -> cache_alloc_hsw_probe()
>
> There this line is executed:
>         rdt_alloc_capable = true;
> where rdt_alloc_capable is global in the file scope.
>
> Then later in:
> resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources()
>
> this is executed at the function beginning:
>         if (rdt_alloc_capable)
>                 return true;
>
> So the rest of the get_rdt_alloc_resources() is skipped and calls to
> rdt_get_cache_alloc_cfg() never get executed.

Yuck. But it works I guess.

The series looks fine to me.

Reviewed-by: Peter Newman <peternewman@google.com>

I applied the series and was able to confirm the behavior was still
correct for contiguous-bitmap Intel hardware and that sparse_bitmaps
is true on AMD and continues to work as expected.

Tested-by: Peter Newman <peternewman@google.com>

I'm not sure if I have access to any Intel hardware with
non-contiguous bitmaps right now. Are you able to say where that would
be implemented?

Thanks!
-Peter
  
Maciej Wieczor-Retman Sept. 27, 2023, 10:44 a.m. UTC | #3
On 2023-09-27 at 12:08:33 +0200, Peter Newman wrote:
>On Wed, Sep 27, 2023 at 11:20 AM Maciej Wieczór-Retman
><maciej.wieczor-retman@intel.com> wrote:
>> On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote:
>> >On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote:
>> >> -    cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
>> >> +    cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>> >>      hw_res->num_closid = edx.split.cos_max + 1;
>> >>      r->cache.cbm_len = eax.split.cbm_len + 1;
>> >>      r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>> >>      r->cache.shareable_bits = ebx & r->default_ctrl;
>> >>      r->data_width = (r->cache.cbm_len + 3) / 4;
>> >> +    if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>> >> +            r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>> >
>> >This seems to be called after the clearing of arch_has_sparse_bitmaps in
>> >cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell,
>> >is it safe to use its value here?
>>
>> I believe the calls go like this for a Haswell system:
>> resctrl_late_init() -> check_quirks() -> __check_quirks_intel() ->
>> -> cache_alloc_hsw_probe()
>>
>> There this line is executed:
>>         rdt_alloc_capable = true;
>> where rdt_alloc_capable is global in the file scope.
>>
>> Then later in:
>> resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources()
>>
>> this is executed at the function beginning:
>>         if (rdt_alloc_capable)
>>                 return true;
>>
>> So the rest of the get_rdt_alloc_resources() is skipped and calls to
>> rdt_get_cache_alloc_cfg() never get executed.
>
>Yuck. But it works I guess.
>
>The series looks fine to me.
>
>Reviewed-by: Peter Newman <peternewman@google.com>
>
>I applied the series and was able to confirm the behavior was still
>correct for contiguous-bitmap Intel hardware and that sparse_bitmaps
>is true on AMD and continues to work as expected.
>
>Tested-by: Peter Newman <peternewman@google.com>
>
>I'm not sure if I have access to any Intel hardware with
>non-contiguous bitmaps right now. Are you able to say where that would
>be implemented?

Thanks for testing!

Writing non-contiguous bitmasks is supported starting from the upcoming
GNR microarchitecture.

That's also why the new CPUID bit meaning is in the ISA pdf and not in
the SDM one currently.
  
Luck, Tony Sept. 27, 2023, 3:03 p.m. UTC | #4
On Wed, Sep 27, 2023 at 12:44:39PM +0200, Maciej Wieczór-Retman wrote:
> Writing non-contiguous bitmasks is supported starting from the upcoming
> GNR microarchitecture.
> 
> That's also why the new CPUID bit meaning is in the ISA pdf and not in
> the SDM one currently.

The new SDM released today has the non-contiguous bit. See vol 3B Figure
18-33.

-Tony
  
Moger, Babu Sept. 27, 2023, 10:34 p.m. UTC | #5
Hi Maciej,

How about this subject line?

x86/resctrl: Enable non-contiguous CBMs on Intel CAT

On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
> The setting for non-contiguous 1s support in Intel CAT is

> hardcoded to false. On these systems, writing non-contiguous
> 1s into the schemata file will fail before resctrl passes
> the value to the hardware.
>
> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
> being reserved and now carry information about non-contiguous 1s
> value support for L3 and L2 cache respectively. The CAT
> capacity bitmask (CBM) supports a non-contiguous 1s value if
> the bit is set.
>
> Replace the hardcoded non-contiguous support value with
> the support learned from the hardware. Add hardcoded non-contiguous
> support value to Haswell probe since it can't make use of CPUID for
> Cache allocation.
>
> Originally-by: Fenghua Yu <fenghua.yu@intel.com>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
> Changelog v2:
> - Rewrite part of a comment concerning Haswell. (Reinette)
>
>   arch/x86/kernel/cpu/resctrl/core.c        |  9 ++++++---
>   arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
>   arch/x86/kernel/cpu/resctrl/internal.h    |  9 +++++++++
>   3 files changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 030d3b409768..c783a873147c 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>   	r->cache.cbm_len = 20;
>   	r->cache.shareable_bits = 0xc0000;
>   	r->cache.min_cbm_bits = 2;
> +	r->cache.arch_has_sparse_bitmaps = false;

Is this change required?

This is always set to false in rdt_init_res_defs_intel().

>   	r->alloc_capable = true;
>   
>   	rdt_alloc_capable = true;
> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>   {
>   	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>   	union cpuid_0x10_1_eax eax;
> +	union cpuid_0x10_x_ecx ecx;
>   	union cpuid_0x10_x_edx edx;
> -	u32 ebx, ecx;
> +	u32 ebx;
>   
> -	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
> +	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>   	hw_res->num_closid = edx.split.cos_max + 1;
>   	r->cache.cbm_len = eax.split.cbm_len + 1;
>   	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>   	r->cache.shareable_bits = ebx & r->default_ctrl;
>   	r->data_width = (r->cache.cbm_len + 3) / 4;
> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> +		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>   	r->alloc_capable = true;
>   }
>   
> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void)
>   
>   		if (r->rid == RDT_RESOURCE_L3 ||
>   		    r->rid == RDT_RESOURCE_L2) {
> -			r->cache.arch_has_sparse_bitmaps = false;

Why do you have to remove this one here?   This seems like a right place 
to initialize.

Thanks

Babu


>   			r->cache.arch_has_per_cpu_cfg = false;
>   			r->cache.min_cbm_bits = 1;
>   		} else if (r->rid == RDT_RESOURCE_MBA) {
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index b44c487727d4..f076f12cf8e8 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -87,10 +87,12 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
>   
>   /*
>    * Check whether a cache bit mask is valid.
> - * For Intel the SDM says:
> - *	Please note that all (and only) contiguous '1' combinations
> - *	are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.).
> - * Additionally Haswell requires at least two bits set.
> + * On Intel CPUs, non-contiguous 1s value support is indicated by CPUID:
> + *   - CPUID.0x10.1:ECX[3]: L3 non-contiguous 1s value supported if 1
> + *   - CPUID.0x10.2:ECX[3]: L2 non-contiguous 1s value supported if 1
> + *
> + * Haswell does not support a non-contiguous 1s value and additionally
> + * requires at least two bits set.
>    * AMD allows non-contiguous bitmasks.
>    */
>   static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 85ceaf9a31ac..c47ef2f13e8e 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -492,6 +492,15 @@ union cpuid_0x10_3_eax {
>   	unsigned int full;
>   };
>   
> +/* CPUID.(EAX=10H, ECX=ResID).ECX */
> +union cpuid_0x10_x_ecx {
> +	struct {
> +		unsigned int reserved:3;
> +		unsigned int noncont:1;
> +	} split;
> +	unsigned int full;
> +};
> +
>   /* CPUID.(EAX=10H, ECX=ResID).EDX */
>   union cpuid_0x10_x_edx {
>   	struct {
  
Maciej Wieczor-Retman Sept. 28, 2023, 7:06 a.m. UTC | #6
Hi, thanks for reviewing the series,

On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote:
>Hi Maciej,
>
>How about this subject line?
>
>x86/resctrl: Enable non-contiguous CBMs on Intel CAT

Changing "bits" to "CBMs" does indeed seem sensible so I'll do that if
there are no objections.

But I'm not sure the preposition collocation change from "in" to "on"
would be grammatical (at least from what I've read in docs about Intel
CAT so far).

>On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
>> The setting for non-contiguous 1s support in Intel CAT is
>
>> hardcoded to false. On these systems, writing non-contiguous
>> 1s into the schemata file will fail before resctrl passes
>> the value to the hardware.
>> 
>> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
>> being reserved and now carry information about non-contiguous 1s
>> value support for L3 and L2 cache respectively. The CAT
>> capacity bitmask (CBM) supports a non-contiguous 1s value if
>> the bit is set.
>> 
>> Replace the hardcoded non-contiguous support value with
>> the support learned from the hardware. Add hardcoded non-contiguous
>> support value to Haswell probe since it can't make use of CPUID for
>> Cache allocation.
>> 
>> Originally-by: Fenghua Yu <fenghua.yu@intel.com>
>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>> ---
>> Changelog v2:
>> - Rewrite part of a comment concerning Haswell. (Reinette)
>> 
>>   arch/x86/kernel/cpu/resctrl/core.c        |  9 ++++++---
>>   arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
>>   arch/x86/kernel/cpu/resctrl/internal.h    |  9 +++++++++
>>   3 files changed, 21 insertions(+), 7 deletions(-)
>> 
>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>> index 030d3b409768..c783a873147c 100644
>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>   	r->cache.cbm_len = 20;
>>   	r->cache.shareable_bits = 0xc0000;
>>   	r->cache.min_cbm_bits = 2;
>> +	r->cache.arch_has_sparse_bitmaps = false;
>
>Is this change required?
>
>This is always set to false in rdt_init_res_defs_intel().

The logic behind moving this variable initialization from
rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and
rdt_get_cache_alloc_cfg() is that the variable doesn't really have a
default value anymore. It used to when the CPUID.0x10.1:ECX[3] and
CPUID.0x10.2:ECX[3] bits were reserved.

Now for the general case the variable is dependent on CPUID output.
And only for Haswell case it needs to be hardcoded to "false", so the
assignment makes more sense in Haswell probe rather than in the default
section.

>>   	r->alloc_capable = true;
>>   	rdt_alloc_capable = true;
>> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>>   {
>>   	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>   	union cpuid_0x10_1_eax eax;
>> +	union cpuid_0x10_x_ecx ecx;
>>   	union cpuid_0x10_x_edx edx;
>> -	u32 ebx, ecx;
>> +	u32 ebx;
>> -	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
>> +	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>>   	hw_res->num_closid = edx.split.cos_max + 1;
>>   	r->cache.cbm_len = eax.split.cbm_len + 1;
>>   	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>>   	r->cache.shareable_bits = ebx & r->default_ctrl;
>>   	r->data_width = (r->cache.cbm_len + 3) / 4;
>> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>> +		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>>   	r->alloc_capable = true;
>>   }
>> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void)
>>   		if (r->rid == RDT_RESOURCE_L3 ||
>>   		    r->rid == RDT_RESOURCE_L2) {
>> -			r->cache.arch_has_sparse_bitmaps = false;
>
>Why do you have to remove this one here?   This seems like a right place to
>initialize.

Look at the previous comment.
  
Moger, Babu Sept. 28, 2023, 3:08 p.m. UTC | #7
Hi Maciej,

On 9/28/23 02:06, Maciej Wieczór-Retman wrote:
> Hi, thanks for reviewing the series,
> 
> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote:
>> Hi Maciej,
>>
>> How about this subject line?
>>
>> x86/resctrl: Enable non-contiguous CBMs on Intel CAT
> 
> Changing "bits" to "CBMs" does indeed seem sensible so I'll do that if
> there are no objections.
> 
> But I'm not sure the preposition collocation change from "in" to "on"
> would be grammatical (at least from what I've read in docs about Intel
> CAT so far).
> 
>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
>>> The setting for non-contiguous 1s support in Intel CAT is
>>
>>> hardcoded to false. On these systems, writing non-contiguous
>>> 1s into the schemata file will fail before resctrl passes
>>> the value to the hardware.
>>>
>>> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
>>> being reserved and now carry information about non-contiguous 1s
>>> value support for L3 and L2 cache respectively. The CAT
>>> capacity bitmask (CBM) supports a non-contiguous 1s value if
>>> the bit is set.
>>>
>>> Replace the hardcoded non-contiguous support value with
>>> the support learned from the hardware. Add hardcoded non-contiguous
>>> support value to Haswell probe since it can't make use of CPUID for
>>> Cache allocation.
>>>
>>> Originally-by: Fenghua Yu <fenghua.yu@intel.com>
>>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>>> ---
>>> Changelog v2:
>>> - Rewrite part of a comment concerning Haswell. (Reinette)
>>>
>>>   arch/x86/kernel/cpu/resctrl/core.c        |  9 ++++++---
>>>   arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
>>>   arch/x86/kernel/cpu/resctrl/internal.h    |  9 +++++++++
>>>   3 files changed, 21 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>> index 030d3b409768..c783a873147c 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>>   	r->cache.cbm_len = 20;
>>>   	r->cache.shareable_bits = 0xc0000;
>>>   	r->cache.min_cbm_bits = 2;
>>> +	r->cache.arch_has_sparse_bitmaps = false;
>>
>> Is this change required?
>>
>> This is always set to false in rdt_init_res_defs_intel().
> 
> The logic behind moving this variable initialization from
> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and
> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a
> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and
> CPUID.0x10.2:ECX[3] bits were reserved.
> 
> Now for the general case the variable is dependent on CPUID output.
> And only for Haswell case it needs to be hardcoded to "false", so the
> assignment makes more sense in Haswell probe rather than in the default
> section.

Here is the current sequence order with your change.

1.
resctrl_late_init -> check_quirks -> __check_quirks_intel ->
cache_alloc_hsw_probe
   r->cache.arch_has_sparse_bitmaps = false; (new code)

2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel
   r->cache.arch_has_sparse_bitmaps = false; (old code)

3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources ->
rdt_get_cache_alloc_cfg
   r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code)

The code in (3) is going to overwrite whatever is set in (1) or (2).

I would say you can just remove initialization in both (1) and (2). That
makes the code clearer to me. I assume reserved bits on Intel are always 0.

Thanks
Babu


> 
>>>   	r->alloc_capable = true;
>>>   	rdt_alloc_capable = true;
>>> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
>>>   {
>>>   	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>   	union cpuid_0x10_1_eax eax;
>>> +	union cpuid_0x10_x_ecx ecx;
>>>   	union cpuid_0x10_x_edx edx;
>>> -	u32 ebx, ecx;
>>> +	u32 ebx;
>>> -	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
>>> +	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
>>>   	hw_res->num_closid = edx.split.cos_max + 1;
>>>   	r->cache.cbm_len = eax.split.cbm_len + 1;
>>>   	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
>>>   	r->cache.shareable_bits = ebx & r->default_ctrl;
>>>   	r->data_width = (r->cache.cbm_len + 3) / 4;
>>> +	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>>> +		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
>>>   	r->alloc_capable = true;
>>>   }
>>> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void)
>>>   		if (r->rid == RDT_RESOURCE_L3 ||
>>>   		    r->rid == RDT_RESOURCE_L2) {
>>> -			r->cache.arch_has_sparse_bitmaps = false;
>>
>> Why do you have to remove this one here?   This seems like a right place to
>> initialize.
> 
> Look at the previous comment.
>
  
Reinette Chatre Sept. 28, 2023, 3:53 p.m. UTC | #8
Hi Babu,

On 9/28/2023 8:08 AM, Moger, Babu wrote:
> On 9/28/23 02:06, Maciej Wieczór-Retman wrote:
>> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote:
>>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
...

>>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>>> index 030d3b409768..c783a873147c 100644
>>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>>>   	r->cache.cbm_len = 20;
>>>>   	r->cache.shareable_bits = 0xc0000;
>>>>   	r->cache.min_cbm_bits = 2;
>>>> +	r->cache.arch_has_sparse_bitmaps = false;
>>>
>>> Is this change required?
>>>
>>> This is always set to false in rdt_init_res_defs_intel().
>>
>> The logic behind moving this variable initialization from
>> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and
>> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a
>> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and
>> CPUID.0x10.2:ECX[3] bits were reserved.
>>
>> Now for the general case the variable is dependent on CPUID output.
>> And only for Haswell case it needs to be hardcoded to "false", so the
>> assignment makes more sense in Haswell probe rather than in the default
>> section.
> 
> Here is the current sequence order with your change.
> 
> 1.
> resctrl_late_init -> check_quirks -> __check_quirks_intel ->
> cache_alloc_hsw_probe
>    r->cache.arch_has_sparse_bitmaps = false; (new code)
> 
> 2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel
>    r->cache.arch_has_sparse_bitmaps = false; (old code)
> 
> 3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources ->
> rdt_get_cache_alloc_cfg
>    r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code)
> 
> The code in (3) is going to overwrite whatever is set in (1) or (2).
> 
> I would say you can just remove initialization in both (1) and (2). That
> makes the code clearer to me. I assume reserved bits on Intel are always 0.
> 

I believe Maciej already addressed this in his response to a similar question
from Peter. Please see:
https://lore.kernel.org/lkml/xnjmmsj5pjskbqeynor2ztha5dmkhxa44j764ohtjhtywy7idb@soobjiql4liy/

Reinette
  
Moger, Babu Sept. 28, 2023, 4:28 p.m. UTC | #9
Hi Reinette,

On 9/28/23 10:53, Reinette Chatre wrote:
> Hi Babu,
> 
> On 9/28/2023 8:08 AM, Moger, Babu wrote:
>> On 9/28/23 02:06, Maciej Wieczór-Retman wrote:
>>> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote:
>>>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote:
> ...
> 
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
>>>>> index 030d3b409768..c783a873147c 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/core.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c
>>>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
>>>>>   	r->cache.cbm_len = 20;
>>>>>   	r->cache.shareable_bits = 0xc0000;
>>>>>   	r->cache.min_cbm_bits = 2;
>>>>> +	r->cache.arch_has_sparse_bitmaps = false;
>>>>
>>>> Is this change required?
>>>>
>>>> This is always set to false in rdt_init_res_defs_intel().
>>>
>>> The logic behind moving this variable initialization from
>>> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and
>>> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a
>>> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and
>>> CPUID.0x10.2:ECX[3] bits were reserved.
>>>
>>> Now for the general case the variable is dependent on CPUID output.
>>> And only for Haswell case it needs to be hardcoded to "false", so the
>>> assignment makes more sense in Haswell probe rather than in the default
>>> section.
>>
>> Here is the current sequence order with your change.
>>
>> 1.
>> resctrl_late_init -> check_quirks -> __check_quirks_intel ->
>> cache_alloc_hsw_probe
>>    r->cache.arch_has_sparse_bitmaps = false; (new code)
>>
>> 2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel
>>    r->cache.arch_has_sparse_bitmaps = false; (old code)
>>
>> 3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources ->
>> rdt_get_cache_alloc_cfg
>>    r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code)
>>
>> The code in (3) is going to overwrite whatever is set in (1) or (2).
>>
>> I would say you can just remove initialization in both (1) and (2). That
>> makes the code clearer to me. I assume reserved bits on Intel are always 0.
>>
> 
> I believe Maciej already addressed this in his response to a similar question
> from Peter. Please see:
> https://lore.kernel.org/lkml/xnjmmsj5pjskbqeynor2ztha5dmkhxa44j764ohtjhtywy7idb@soobjiql4liy/

The rdt_alloc_capable part is kind of hidden. Now it makes sense.
Thanks
Babu Moger
  

Patch

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..c783a873147c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -152,6 +152,7 @@  static inline void cache_alloc_hsw_probe(void)
 	r->cache.cbm_len = 20;
 	r->cache.shareable_bits = 0xc0000;
 	r->cache.min_cbm_bits = 2;
+	r->cache.arch_has_sparse_bitmaps = false;
 	r->alloc_capable = true;
 
 	rdt_alloc_capable = true;
@@ -267,15 +268,18 @@  static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	union cpuid_0x10_1_eax eax;
+	union cpuid_0x10_x_ecx ecx;
 	union cpuid_0x10_x_edx edx;
-	u32 ebx, ecx;
+	u32 ebx;
 
-	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
+	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
 	hw_res->num_closid = edx.split.cos_max + 1;
 	r->cache.cbm_len = eax.split.cbm_len + 1;
 	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
 	r->cache.shareable_bits = ebx & r->default_ctrl;
 	r->data_width = (r->cache.cbm_len + 3) / 4;
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
 	r->alloc_capable = true;
 }
 
@@ -872,7 +876,6 @@  static __init void rdt_init_res_defs_intel(void)
 
 		if (r->rid == RDT_RESOURCE_L3 ||
 		    r->rid == RDT_RESOURCE_L2) {
-			r->cache.arch_has_sparse_bitmaps = false;
 			r->cache.arch_has_per_cpu_cfg = false;
 			r->cache.min_cbm_bits = 1;
 		} else if (r->rid == RDT_RESOURCE_MBA) {
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index b44c487727d4..f076f12cf8e8 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -87,10 +87,12 @@  int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
 
 /*
  * Check whether a cache bit mask is valid.
- * For Intel the SDM says:
- *	Please note that all (and only) contiguous '1' combinations
- *	are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.).
- * Additionally Haswell requires at least two bits set.
+ * On Intel CPUs, non-contiguous 1s value support is indicated by CPUID:
+ *   - CPUID.0x10.1:ECX[3]: L3 non-contiguous 1s value supported if 1
+ *   - CPUID.0x10.2:ECX[3]: L2 non-contiguous 1s value supported if 1
+ *
+ * Haswell does not support a non-contiguous 1s value and additionally
+ * requires at least two bits set.
  * AMD allows non-contiguous bitmasks.
  */
 static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85ceaf9a31ac..c47ef2f13e8e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -492,6 +492,15 @@  union cpuid_0x10_3_eax {
 	unsigned int full;
 };
 
+/* CPUID.(EAX=10H, ECX=ResID).ECX */
+union cpuid_0x10_x_ecx {
+	struct {
+		unsigned int reserved:3;
+		unsigned int noncont:1;
+	} split;
+	unsigned int full;
+};
+
 /* CPUID.(EAX=10H, ECX=ResID).EDX */
 union cpuid_0x10_x_edx {
 	struct {