[v8,00/13] Support for AMD QoS new features

Message ID 166759188265.3281208.11769277079826754455.stgit@bmoger-ubuntu

Moger, Babu Nov. 4, 2022, 7:59 p.m. UTC
New AMD processors can now support the following QoS features.

1. Slow Memory Bandwidth Allocation (SMBA)
   With this feature, the QOS enforcement policies can be applied
   to the external slow memory connected to the host. QOS enforcement
   is accomplished by assigning a Class Of Service (COS) to a processor
   and specifying allocations or limits for that COS for each resource
   to be allocated.

   Currently, CXL.memory is the only supported "slow" memory device. With
   SMBA support, the hardware enables bandwidth allocation on slow
   memory devices.

2. Bandwidth Monitoring Event Configuration (BMEC)
   The bandwidth monitoring events mbm_total_event and mbm_local_event
   are set to count all total and all local reads/writes, respectively.
   With the introduction of slow memory, the two counters are not enough
   to count all the different types of memory events. With the BMEC
   feature, users have the option to configure mbm_total_event and
   mbm_local_event to count specific types of events.

   The following event bits are supported.
   Bit     Description
     6       Dirty Victims from the QOS domain to all types of memory
     5       Reads to slow memory in the non-local NUMA domain
     4       Reads to slow memory in the local NUMA domain
     3       Non-temporal writes to non-local NUMA domain
     2       Non-temporal writes to local NUMA domain
     1       Reads to memory in the non-local NUMA domain
     0       Reads to memory in the local NUMA domain
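
The bit table above can be written down as C constants. The following is only a sketch: the identifier names (bmec_event, BMEC_*, bmec_all_events and so on) are made up for illustration and are not claimed to match the names used in the patches. Only the bit positions come from the table.

```c
#include <stdint.h>

/* Bit positions from the table above; names are hypothetical. */
enum bmec_event {
	BMEC_LOCAL_READS       = 1u << 0, /* reads to memory, local NUMA domain */
	BMEC_REMOTE_READS      = 1u << 1, /* reads to memory, non-local NUMA domain */
	BMEC_LOCAL_NT_WRITES   = 1u << 2, /* non-temporal writes, local NUMA domain */
	BMEC_REMOTE_NT_WRITES  = 1u << 3, /* non-temporal writes, non-local NUMA domain */
	BMEC_LOCAL_SLOW_READS  = 1u << 4, /* reads to slow memory, local NUMA domain */
	BMEC_REMOTE_SLOW_READS = 1u << 5, /* reads to slow memory, non-local NUMA domain */
	BMEC_DIRTY_VICTIMS     = 1u << 6, /* dirty victims to all types of memory */
};

/* Bits 0..6 are the only ones listed in the table. */
#define BMEC_VALID_MASK 0x7fu

/* Every event in the table. */
static inline uint32_t bmec_all_events(void)
{
	return BMEC_VALID_MASK;
}

/* Only the events the table marks as local (bits 0, 2 and 4). */
static inline uint32_t bmec_local_events(void)
{
	return BMEC_LOCAL_READS | BMEC_LOCAL_NT_WRITES | BMEC_LOCAL_SLOW_READS;
}

/* Returns 1 if cfg uses only bits listed in the table, 0 otherwise. */
static inline int bmec_config_is_valid(uint32_t cfg)
{
	return (cfg & ~BMEC_VALID_MASK) == 0;
}
```

For example, counting only reads to slow memory in both NUMA domains would be BMEC_LOCAL_SLOW_READS | BMEC_REMOTE_SLOW_READS, which is 0x30.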

This series adds support for these features.

Feature description is available in the specification, "AMD64 Technology
Platform Quality of Service Extensions", Publication # 56375,
Revision: 1.03, Issue Date: February 2022.

Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
---
v8:
 Changes:
 1. Removed init attribute for rdt_cpu_has to make it available for all the files.
 2. Updated the change log for mon_features to correct the names of config files.
 3. Changed configuration file name from mbm_total_config to mbm_total_bytes_config.
    This is more consistent with other changes.
 4. Added lock protection while reading/writing the config file.
 5. A few other minor text changes. I have been missing a few comments in the
    last couple of revisions. Hope I have addressed all of them this time.

v7:
 https://lore.kernel.org/lkml/166604543832.5345.9696970469830919982.stgit@bmoger-ubuntu/
 Changes:
 Not much of a change. Addressed one comment from Reinette on v5 that was missed earlier.
 A few format corrections from Sanjaya.

v6:
 https://lore.kernel.org/lkml/166543345606.23830.3120625408601531368.stgit@bmoger-ubuntu/
 Summary of changes:
 1. Rebased on top of latest tip tree. Fixed a few minor conflicts.
 2. Fixed format issue with scattered.c.
 3. Removed config_name from the structure mon_evt. It is not required.
 4. The read/write format for mbm_total_config and mbm_local_config will be the same
    as the schemata format "id0=val0;id1=val1;...". This is a comment from Fenghua.
 5. Added more comments for the MSR_IA32_EVT_CFG_BASE writing.
 6. A few text changes in resctrl.rst.
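
A user-space tool reading one of these config files would need to split the "id0=val0;id1=val1;..." string into per-domain pairs. The following is a minimal sketch of such a parser. The names dom_cfg and parse_dom_cfg are hypothetical, and it assumes the ids are decimal and the values are either decimal or 0x-prefixed hex, which this cover letter does not specify.

```c
#include <errno.h>
#include <stdlib.h>

/* One "id=val" pair; the struct name is hypothetical. */
struct dom_cfg {
	unsigned long id;
	unsigned long val;
};

/*
 * Parse "id0=val0;id1=val1;..." into out[].
 * Returns the number of pairs parsed, or -1 on malformed input or
 * when more than max pairs are present.
 */
static int parse_dom_cfg(const char *s, struct dom_cfg *out, int max)
{
	int n = 0;

	while (*s) {
		char *end;
		unsigned long id, val;

		if (n == max)
			return -1;

		errno = 0;
		id = strtoul(s, &end, 10);       /* domain id, decimal (assumed) */
		if (end == s || *end != '=' || errno)
			return -1;

		s = end + 1;
		val = strtoul(s, &end, 0);       /* base 0 accepts decimal or 0x hex */
		if (end == s || errno)
			return -1;

		out[n].id = id;
		out[n].val = val;
		n++;

		if (*end == ';')
			s = end + 1;             /* another pair may follow */
		else if (*end == '\0' || (*end == '\n' && end[1] == '\0'))
			break;                   /* end of string, optional newline */
		else
			return -1;
	}
	return n;
}
```

Writing a new configuration would use the same string format in the other direction.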
 
v5:
  https://lore.kernel.org/lkml/166431016617.373387.1968875281081252467.stgit@bmoger-ubuntu/
  Summary of changes.
  1. Split the series into two. The first two patches are bug fixes, so they were sent separately.
  2. The config files mbm_total_config and mbm_local_config are now under
     /sys/fs/resctrl/info/L3_MON/. Removed these config files from mon groups.
  3. Ran "checkpatch --strict --codespell" on all the patches. Looks good with few known exceptions.
  4. Few minor text changes in resctrl.rst file. 

v4:
  https://lore.kernel.org/lkml/166257348081.1043018.11227924488792315932.stgit@bmoger-ubuntu/
  Got numerous comments from Reinette Chatre. Addressed most of them.
  Summary of changes.
  1. Removed mon_configurable under /sys/fs/resctrl/info/L3_MON/.  
  2. Updated mon_features texts if the BMEC is supported.
  3. Added more explanation about the slow memory support.
  4. Replaced smp_call_function_many with on_each_cpu_mask call.
  5. Removed arch_has_empty_bitmaps
  6. Few other text changes.
  7. Removed Reviewed-by if the patch is modified.
  8. Rebased the patches to latest tip.

v3:
  https://lore.kernel.org/lkml/166117559756.6695.16047463526634290701.stgit@bmoger-ubuntu/
  a. Rebased the patches to latest tip. Resolved some conflicts.
     https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
  b. Taken care of feedback from Bagas Sanjaya.
  c. Added Reviewed-by from Mingo.
  Note: I am still looking for comments from Reinette or Fenghua.

v2:
  https://lore.kernel.org/lkml/165938717220.724959.10931629283087443782.stgit@bmoger-ubuntu/
  a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
  b. Added the patch to fix the CBM issue on AMD. This was originally discussed at
     https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/

v1:
  https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/

Babu Moger (13):
      x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
      x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
      x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
      x86/resctrl: Include new features in command line options
      x86/resctrl: Detect and configure Slow Memory Bandwidth Allocation
      x86/resctrl: Remove the init attribute for rdt_cpu_has()
      x86/resctrl: Introduce data structure to support monitor configuration
      x86/resctrl: Add sysfs interface to read mbm_total_bytes_config
      x86/resctrl: Add sysfs interface to read mbm_local_bytes_config
      x86/resctrl: Add sysfs interface to write mbm_total_bytes_config
      x86/resctrl: Add sysfs interface to write mbm_local_bytes_config
      x86/resctrl: Replace smp_call_function_many() with on_each_cpu_mask()
      Documentation/x86: Update resctrl.rst for new features


 .../admin-guide/kernel-parameters.txt         |   2 +-
 Documentation/x86/resctrl.rst                 | 139 +++++++-
 arch/x86/include/asm/cpufeatures.h            |   2 +
 arch/x86/kernel/cpu/cpuid-deps.c              |   1 +
 arch/x86/kernel/cpu/resctrl/core.c            |  56 +++-
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c     |   2 +-
 arch/x86/kernel/cpu/resctrl/internal.h        |  33 ++
 arch/x86/kernel/cpu/resctrl/monitor.c         |   7 +
 arch/x86/kernel/cpu/resctrl/rdtgroup.c        | 304 ++++++++++++++++--
 arch/x86/kernel/cpu/scattered.c               |   2 +
 10 files changed, 515 insertions(+), 33 deletions(-)

--
  

Comments

Moger, Babu Nov. 15, 2022, 8:50 p.m. UTC | #1
Hi Reinette and Others,

I was planning to refresh the series later this week. I have one comment
from Peter Newman.  Let me know if you have any comments.

Thanks

Babu


  
Reinette Chatre Nov. 15, 2022, 9:07 p.m. UTC | #2
Hi Babu,

On 11/15/2022 12:50 PM, Moger, Babu wrote:
> Hi Reinette and Others,
> 
> I was planning to refresh the series later this week. I have one comment
> from Peter Newman.  Let me know if you have any comments.
> 

I am behind on resctrl work and have not had a chance to look
at this series yet.

Reinette
  
Moger, Babu Nov. 15, 2022, 9:34 p.m. UTC | #3
Hi Reinette,

On 11/15/22 15:07, Reinette Chatre wrote:
> Hi Babu,
>
> On 11/15/2022 12:50 PM, Moger, Babu wrote:
>> Hi Reinette and Others,
>>
>> I was planning to refresh the series later this week. I have one comment
>> from Peter Newman.  Let me know if you have any comments.
>>
> I am behind on resctrl work and have not had a chance to look
> at this series yet.

Sure. Thanks for the update. I will wait.

Thanks

Babu