[v13,0/8] Add support for Sub-NUMA cluster (SNC) systems

Message ID 20231204185357.120501-1-tony.luck@intel.com

Luck, Tony Dec. 4, 2023, 6:53 p.m. UTC
  The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features.  Prior to this
patch series, Intel advised that SNC and RDT are incompatible.

Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows the monitoring features to be used, with the
caveat that users must be aware that Linux may migrate tasks more
frequently between SNC nodes than between "regular" NUMA nodes, so reading
counters from all SNC nodes may be needed to get a complete picture of
activity for tasks.
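
As a rough illustration of the caveat above, a user-space tool would add
up the per-node readings rather than trust any single SNC node. The
sketch below is not part of the series. The helper names read_counter()
and sum_snc_counters() are hypothetical, and it assumes each SNC node
exposes its counter as a plain decimal value in its own file.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one decimal counter value from a file.
 * Returns 0 on success, -1 if the file cannot be opened or does not
 * start with a number (for example, a textual "Unavailable" reading).
 */
int read_counter(const char *path, uint64_t *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%" SCNu64, val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: total activity for a task group is the sum of
 * the readings taken from every SNC node, since the task may have run
 * on any of them.
 */
uint64_t sum_snc_counters(const uint64_t *per_node, size_t nr_nodes)
{
	uint64_t total = 0;

	for (size_t i = 0; i < nr_nodes; i++)
		total += per_node[i];
	return total;
}
```

A caller would fill the per_node array by invoking read_counter() once
per SNC node and skipping (or reporting) any node whose read fails.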

Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.

Signed-off-by: Tony Luck <tony.luck@intel.com>

Changes since v12:

All:
	Reinette - put commit tags in right order for TIP (Tested-by before
	Reviewed-by)

Patch 7:
	Fam Zheng - Check for -1 return from get_cpu_cacheinfo_id() and
	increase size of bitmap tracking # of L3 instances.
	Reinette - Add extra sanity checks. Note that this patch has
	some additional tweaks beyond the e-mail discussion.
		1) "3" is a valid return in addition to 1, 2, 4
		2) Added a warning if the sanity checks fail that
		prints number of CPU nodes and number of L3 cache
		instances that were found.

Patch 8:
	Babu - Fix grammar with an additional comma.
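
The Patch 7 sanity checks listed above can be pictured with the
following user-space sketch. It is not the code from the patch. The
function names, the MAX_L3_IDS bound, and the choice to fall back to 1
(no SNC) when a check fails are all assumptions made for illustration.
Only the ideas come from the changelog: skip negative cache ids, accept
ratios 1 through 4 (including 3), and warn with both counts on failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_L3_IDS 256	/* hypothetical bound on L3 cache ids */

/*
 * Count distinct L3 cache ids across CPUs. Negative ids (a failed
 * lookup) and ids beyond the tracking array are ignored.
 */
int count_l3_instances(const int *l3_id_of_cpu, size_t nr_cpus)
{
	bool seen[MAX_L3_IDS] = { false };
	int count = 0;

	for (size_t i = 0; i < nr_cpus; i++) {
		int id = l3_id_of_cpu[i];

		if (id < 0 || id >= MAX_L3_IDS)
			continue;
		if (!seen[id]) {
			seen[id] = true;
			count++;
		}
	}
	return count;
}

/*
 * Derive SNC nodes per L3 cache from the two counts. Valid results
 * are 1, 2, 3 and 4. Anything else prints a warning and returns 1,
 * which is an assumed fallback rather than confirmed behavior.
 */
int snc_nodes_per_l3(int nr_cpu_nodes, int nr_l3_caches)
{
	if (nr_cpu_nodes > 0 && nr_l3_caches > 0 &&
	    nr_cpu_nodes % nr_l3_caches == 0) {
		int ratio = nr_cpu_nodes / nr_l3_caches;

		if (ratio >= 1 && ratio <= 4)
			return ratio;
	}
	fprintf(stderr, "SNC sanity check failed: %d CPU nodes, %d L3 caches\n",
		nr_cpu_nodes, nr_l3_caches);
	return 1;
}
```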


Tony Luck (8):
  x86/resctrl: Prepare for new domain scope
  x86/resctrl: Prepare to split rdt_domain structure
  x86/resctrl: Prepare for different scope for control/monitor
    operations
  x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
  x86/resctrl: Add node-scope to the options for feature scope
  x86/resctrl: Introduce snc_nodes_per_l3_cache
  x86/resctrl: Sub NUMA Cluster detection and enable
  x86/resctrl: Update documentation with Sub-NUMA cluster changes

 Documentation/arch/x86/resctrl.rst        |  25 +-
 include/linux/resctrl.h                   |  85 +++--
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  66 ++--
 arch/x86/kernel/cpu/resctrl/core.c        | 433 +++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  58 +--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  68 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  26 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 149 ++++----
 9 files changed, 629 insertions(+), 282 deletions(-)


base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
  

Comments

Moger, Babu Dec. 4, 2023, 9:20 p.m. UTC | #1
Hi Tony,

Tested the series on an AMD system. Just ran a few basic tests. Everything
looks good.

Thanks

Babu

  
Luck, Tony Dec. 4, 2023, 9:33 p.m. UTC | #2
> Tested the series on an AMD system. Just ran a few basic tests. Everything
> looks good.

Babu,

Thanks for testing. I'll add your Tested-by tag if[1] I make a v14.

-Tony

[1] realistically not if, but when :-(
  
Luck, Tony Dec. 4, 2023, 11:25 p.m. UTC | #3
On Mon, Dec 04, 2023 at 10:53:49AM -0800, Tony Luck wrote:

Boris: I've collected "Reviewed-by:" from Reinette for all patches. Babu
sent a Tested-by for the series, and Reviewed-by for each patch just
now.

So it's ready to go into your to-be-reviewed queue.

Thanks

-Tony
