[v9,13/13] Documentation/x86: Update resctrl.rst for new features

Message ID 166990905693.17806.6942517971262471285.stgit@bmoger-ubuntu
State New
Series Support for AMD QoS new features

Commit Message

Moger, Babu Dec. 1, 2022, 3:37 p.m. UTC
Update the documentation for the new features:
1. Slow Memory Bandwidth Allocation (SMBA).
   With this feature, the QOS enforcement policies can be applied
   to the external slow memory connected to the host. QOS enforcement
   is accomplished by assigning a Class Of Service (COS) to a processor
   and specifying allocations or limits for that COS for each resource
   to be allocated.

2. Bandwidth Monitoring Event Configuration (BMEC).
   By default, the bandwidth monitoring events mbm_total_bytes and
   mbm_local_bytes count all total and all local reads/writes,
   respectively. With the introduction of slow memory, the two counters
   are not enough to count all the different types of memory events.
   With BMEC, users can configure mbm_total_bytes and mbm_local_bytes
   to count specific types of events.

Also add configuration instructions with examples.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
---
 Documentation/x86/resctrl.rst |  138 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 136 insertions(+), 2 deletions(-)
  

Comments

Reinette Chatre Dec. 15, 2022, 6:30 p.m. UTC | #1
Hi Babu,

On 12/1/2022 7:37 AM, Babu Moger wrote:
> Update the documentation for the new features:
> 1. Slow Memory Bandwidth allocation (SMBA).
>    With this feature, the QOS  enforcement policies can be applied
>    to the external slow memory connected to the host. QOS enforcement
>    is accomplished by assigning a Class Of Service (COS) to a processor
>    and specifying allocations or limits for that COS for each resource
>    to be allocated.
> 
> 2. Bandwidth Monitoring Event Configuration (BMEC).
>    The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>    are set to count all the total and local reads/writes respectively.
>    With the introduction of slow memory, the two counters are not
>    enough to count all the different types of memory events. With the
>    feature BMEC, the users have the option to configure mbm_total_bytes
>    and mbm_local_bytes to count the specific type of events.
> 
> Also add configuration instructions with examples.
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
> ---
>  Documentation/x86/resctrl.rst |  138 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 136 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 71a531061e4e..60761a6f9087 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -17,14 +17,16 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
>  This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
>  flag bits:
>  
> -=============================================	================================
> +===============================================	================================
>  RDT (Resource Director Technology) Allocation	"rdt_a"
>  CAT (Cache Allocation Technology)		"cat_l3", "cat_l2"
>  CDP (Code and Data Prioritization)		"cdp_l3", "cdp_l2"
>  CQM (Cache QoS Monitoring)			"cqm_llc", "cqm_occup_llc"
>  MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
>  MBA (Memory Bandwidth Allocation)		"mba"
> -=============================================	================================
> +SMBA (Slow Memory Bandwidth Allocation)         "smba"
> +BMEC (Bandwidth Monitoring Event Configuration) "bmec"
> +===============================================	================================
>  
>  To use the feature mount the file system::
>  
> @@ -161,6 +163,79 @@ with the following files:
>  "mon_features":
>  		Lists the monitoring events if
>  		monitoring is enabled for the resource.
> +                Example::
> +
> +                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
> +                   llc_occupancy
> +                   mbm_total_bytes
> +                   mbm_local_bytes
> +
> +                If the system supports Bandwidth Monitoring Event
> +                Configuration (BMEC), then the bandwidth events will
> +                be configurable. The output will be::
> +
> +                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
> +                   llc_occupancy
> +                   mbm_total_bytes
> +                   mbm_total_bytes_config
> +                   mbm_local_bytes
> +                   mbm_local_bytes_config
> +
> +"mbm_total_bytes_config", "mbm_local_bytes_config":
> +        These files contain the current event configuration for the events

"These files" is redundant. Note that this is already introduced with "the
following files:".
To match similar files it could read:
"Read/write files containing the configuration for the mbm_total_bytes and
mbm_local_bytes events, respectively, ..."

> +        mbm_total_bytes and mbm_local_bytes, respectively, when the
> +        Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
> +        The event configuration settings are domain specific and will affect

"will" can be dropped?

> +        all the CPUs in the domain.
> +
> +        Following are the types of events supported:
> +
> +        ====    ========================================================
> +        Bits    Description
> +        ====    ========================================================
> +        6       Dirty Victims from the QOS domain to all types of memory
> +        5       Reads to slow memory in the non-local NUMA domain
> +        4       Reads to slow memory in the local NUMA domain
> +        3       Non-temporal writes to non-local NUMA domain
> +        2       Non-temporal writes to local NUMA domain
> +        1       Reads to memory in the non-local NUMA domain
> +        0       Reads to memory in the local NUMA domain
> +        ====    ========================================================
> +
> +        By default, the mbm_total_bytes configuration is set to 0x7f to count
> +        all the event types and the mbm_local_bytes configuration is set to
> +        0x15 to count all the local memory events.
> +
> +        Examples:
> +
> +        * To view the current configuration::
> +          ::
> +
> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> +            0=0x7f;1=0x7f;2=0x7f;3=0x7f
> +
> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> +            0=0x15;1=0x15;3=0x15;4=0x15
> +
> +        * To change the mbm_total_bytes to count only reads on domain 0,
> +          the bits 0, 1, 4 and 5 needs to be set, which is 110011b in binary
> +          (in hexadecimal 0x33):
> +          ::
> +
> +            # echo  "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> +
> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> +            0=0x33;1=0x7f;2=0x7f;3=0x7f
> +
> +        * To change the mbm_local_bytes to count all the slow memory reads on
> +          domain 0 and 1, the bits 4 and 5 needs to be set, which is 110000b
> +          in binary (in hexadecimal 0x30):
> +          ::
> +
> +            # echo  "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> +
> +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> +            0=0x30;1=0x30;3=0x15;4=0x15
>  
>  "max_threshold_occupancy":
>  		Read/write file provides the largest value (in
> @@ -464,6 +539,25 @@ Memory bandwidth domain is L3 cache.
>  
>  	MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>  
> +Slow Memory Bandwidth Allocation (SMBA)
> +---------------------------------------
> +AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
> +CXL.memory is the only supported "slow" memory device. With the
> +support of SMBA, the hardware enables bandwidth allocation on
> +the slow memory devices. If there are multiple such devices in
> +the system, the throttling logic groups all the slow sources
> +together and applies the limit on them as a whole.
> +
> +The presence of SMBA (with CXL.memory) is independent of slow memory
> +devices presence. If there are no such devices on the system, then
> +configuring SMBA will have no impact on the performance of the system.
> +
> +The bandwidth domain for slow memory is L3 cache. Its schemata file
> +is formatted as:
> +::
> +
> +	SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
> +
>  Reading/writing the schemata file
>  ---------------------------------
>  Reading the schemata file will show the state of all resources
> @@ -479,6 +573,46 @@ which you wish to change.  E.g.
>    L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
>    L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>  
> +Reading/writing the schemata file (on AMD systems)
> +--------------------------------------------------
> +Reading the schemata file will show the current bandwidth limit on all
> +domains. The allocated resources are in multiples of one eighth GB/s.
> +When writing to the file, you need to specify what cache id you wish to
> +configure the bandwidth limit.
> +
> +For example, to allocate 2GB/s limit on the first cache id:
> +
> +::
> +
> +  # cat schemata
> +    MB:0=2048;1=2048;2=2048;3=2048
> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> +  # echo "MB:1=16" > schemata
> +  # cat schemata
> +    MB:0=2048;1=  16;2=2048;3=2048
> +    L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> +Reading/writing the schemata file (on AMD systems) with SMBA feature
> +--------------------------------------------------------------------
> +Reading and writing the schemata file is the same as without SMBA in
> +above section.
> +
> +For example, to allocate 8GB/s limit on the first cache id:
> +
> +::
> +
> +  # cat schemata
> +    SMBA:0=2048;1=2048;2=2048;3=2048
> +      MB:0=2048;1=2048;2=2048;3=2048
> +      L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> +  # echo "SMBA:1=64" > schemata
> +  # cat schemata
> +    SMBA:0=2048;1=  64;2=2048;3=2048
> +      MB:0=2048;1=2048;2=2048;3=2048
> +      L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
>  Cache Pseudo-Locking
>  ====================
>  CAT enables a user to specify the amount of cache space that an
> 
> 

Based on earlier comments I am awaiting information to understand if some
more detail/example is needed to describe to the user what can be expected
after a counter configuration is made.

Reinette
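
The mask values used in the quoted examples (0x33 and 0x30) follow directly from the bit table in the patch. A minimal sketch of that arithmetic, assuming illustrative event names that do not appear in the patch (only the bit positions are taken from it):

```python
# Bit positions come from the event table in the patch.
# The dictionary keys are hypothetical labels chosen for this sketch.
EVENT_BITS = {
    "local_reads": 0,
    "remote_reads": 1,
    "local_nt_writes": 2,
    "remote_nt_writes": 3,
    "local_slow_reads": 4,
    "remote_slow_reads": 5,
    "dirty_victims": 6,
}


def event_mask(*events):
    """OR together the bits for the named events."""
    mask = 0
    for name in events:
        mask |= 1 << EVENT_BITS[name]
    return mask


def format_config(domain_masks):
    """Build a 'domain=mask;domain=mask' string in the format the examples show."""
    return ";".join(f"{dom}={mask:#x}" for dom, mask in sorted(domain_masks.items()))


all_reads = event_mask("local_reads", "remote_reads",
                       "local_slow_reads", "remote_slow_reads")
slow_reads = event_mask("local_slow_reads", "remote_slow_reads")

print(format_config({0: all_reads}))                 # 0=0x33
print(format_config({0: slow_reads, 1: slow_reads}))  # 0=0x30;1=0x30
```

The same helper reproduces the documented defaults: all seven bits give 0x7f, and bits 0, 2 and 4 give 0x15.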
  
Moger, Babu Dec. 19, 2022, 8:05 p.m. UTC | #2

Hi Reinette,

> -----Original Message-----
> From: Reinette Chatre <reinette.chatre@intel.com>
> Sent: Thursday, December 15, 2022 12:31 PM
> Subject: Re: [PATCH v9 13/13] Documentation/x86: Update resctrl.rst for new
> features
> 
> Hi Babu,
> 
> On 12/1/2022 7:37 AM, Babu Moger wrote:
> > Update the documentation for the new features:
> > 1. Slow Memory Bandwidth allocation (SMBA).
> >    With this feature, the QOS  enforcement policies can be applied
> >    to the external slow memory connected to the host. QOS enforcement
> >    is accomplished by assigning a Class Of Service (COS) to a processor
> >    and specifying allocations or limits for that COS for each resource
> >    to be allocated.
> >
> > 2. Bandwidth Monitoring Event Configuration (BMEC).
> >    The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
> >    are set to count all the total and local reads/writes respectively.
> >    With the introduction of slow memory, the two counters are not
> >    enough to count all the different types of memory events. With the
> >    feature BMEC, the users have the option to configure mbm_total_bytes
> >    and mbm_local_bytes to count the specific type of events.
> >
> > Also add configuration instructions with examples.
> >
> > Signed-off-by: Babu Moger <babu.moger@amd.com>
> > Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
> > ---
> >  Documentation/x86/resctrl.rst |  138 ++++++++++++++++++++++++++++++++++++++++-
> >  1 file changed, 136 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> > index 71a531061e4e..60761a6f9087 100644
> > --- a/Documentation/x86/resctrl.rst
> > +++ b/Documentation/x86/resctrl.rst
> > @@ -17,14 +17,16 @@ AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
> >  This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
> >  flag bits:
> >
> > -=============================================	================================
> > +===============================================	================================
> >  RDT (Resource Director Technology) Allocation	"rdt_a"
> >  CAT (Cache Allocation Technology)		"cat_l3", "cat_l2"
> >  CDP (Code and Data Prioritization)		"cdp_l3", "cdp_l2"
> >  CQM (Cache QoS Monitoring)			"cqm_llc", "cqm_occup_llc"
> >  MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
> >  MBA (Memory Bandwidth Allocation)		"mba"
> > -=============================================	================================
> > +SMBA (Slow Memory Bandwidth Allocation)         "smba"
> > +BMEC (Bandwidth Monitoring Event Configuration) "bmec"
> > +===============================================	================================
> >
> >  To use the feature mount the file system::
> >
> > @@ -161,6 +163,79 @@ with the following files:
> >  "mon_features":
> >  		Lists the monitoring events if
> >  		monitoring is enabled for the resource.
> > +                Example::
> > +
> > +                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
> > +                   llc_occupancy
> > +                   mbm_total_bytes
> > +                   mbm_local_bytes
> > +
> > +                If the system supports Bandwidth Monitoring Event
> > +                Configuration (BMEC), then the bandwidth events will
> > +                be configurable. The output will be::
> > +
> > +                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
> > +                   llc_occupancy
> > +                   mbm_total_bytes
> > +                   mbm_total_bytes_config
> > +                   mbm_local_bytes
> > +                   mbm_local_bytes_config
> > +
> > +"mbm_total_bytes_config", "mbm_local_bytes_config":
> > +        These files contain the current event configuration for the events
> 
> "These files" is redundant. Note that this is already introduced with "the
> following files:".
> To match similar files it could read:
> "Read/write files containing the configuration for the mbm_total_bytes and
> mbm_local_bytes events, respectively, ..."

Sure.
> 
> > +        mbm_total_bytes and mbm_local_bytes, respectively, when the
> > +        Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
> > +        The event configuration settings are domain specific and will affect
> 
> "will" can be dropped?

Sure.
> 
> > +        all the CPUs in the domain.
> > +
> > +        Following are the types of events supported:
> > +
> > +        ====    ========================================================
> > +        Bits    Description
> > +        ====    ========================================================
> > +        6       Dirty Victims from the QOS domain to all types of memory
> > +        5       Reads to slow memory in the non-local NUMA domain
> > +        4       Reads to slow memory in the local NUMA domain
> > +        3       Non-temporal writes to non-local NUMA domain
> > +        2       Non-temporal writes to local NUMA domain
> > +        1       Reads to memory in the non-local NUMA domain
> > +        0       Reads to memory in the local NUMA domain
> > +        ====    ========================================================
> > +
> > +        By default, the mbm_total_bytes configuration is set to 0x7f to count
> > +        all the event types and the mbm_local_bytes configuration is set to
> > +        0x15 to count all the local memory events.
> > +
> > +        Examples:
> > +
> > +        * To view the current configuration::
> > +          ::
> > +
> > +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> > +            0=0x7f;1=0x7f;2=0x7f;3=0x7f
> > +
> > +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> > +            0=0x15;1=0x15;3=0x15;4=0x15
> > +
> > +        * To change the mbm_total_bytes to count only reads on domain 0,
> > +          the bits 0, 1, 4 and 5 needs to be set, which is 110011b in binary
> > +          (in hexadecimal 0x33):
> > +          ::
> > +
> > +            # echo  "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> > +
> > +            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
> > +            0=0x33;1=0x7f;2=0x7f;3=0x7f
> > +
> > +        * To change the mbm_local_bytes to count all the slow memory reads on
> > +          domain 0 and 1, the bits 4 and 5 needs to be set, which is 110000b
> > +          in binary (in hexadecimal 0x30):
> > +          ::
> > +
> > +            # echo  "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> > +
> > +            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
> > +            0=0x30;1=0x30;3=0x15;4=0x15

I am planning to add the following text here:

"When an event configuration is changed, the bandwidth counters for all
the RMIDs and the events will be cleared for that domain. The next read
for every RMID will report "Unavailable" and subsequent reads will
report the valid value."
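
From the reader's side, the behaviour in the proposed text means that one "Unavailable" result right after a configuration change is expected and a retry should return a number. A minimal sketch under that assumption; `read_counter`, its `retries` parameter and the simulated reads are hypothetical and only stand in for reading a real counter file:

```python
def read_counter(read_text, retries=1):
    """Return the counter as an int, or None if it is still 'Unavailable'.

    read_text is any callable that returns the file contents as a string,
    so the sketch can be exercised without touching resctrl.
    """
    for _ in range(retries + 1):
        text = read_text().strip()
        if text.isdigit():
            return int(text)
        if text != "Unavailable":
            # Anything else is outside what the proposed text describes.
            raise ValueError(f"unexpected counter value: {text!r}")
    return None


# Simulated reads: the first read after a configuration change reports
# "Unavailable", the next one reports a valid value.
fake_reads = iter(["Unavailable\n", "1234\n"])
print(read_counter(lambda: next(fake_reads)))  # 1234
```

On a real system the callable would open and read the counter file for the monitoring group of interest; the path is left out here because it depends on the group and domain.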

> >
> >  "max_threshold_occupancy":
> >  		Read/write file provides the largest value (in
> > @@ -464,6 +539,25 @@ Memory bandwidth domain is L3 cache.
> >
> >  	MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
> >
> > +Slow Memory Bandwidth Allocation (SMBA)
> > +---------------------------------------
> > +AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
> > +CXL.memory is the only supported "slow" memory device. With the
> > +support of SMBA, the hardware enables bandwidth allocation on the
> > +slow memory devices. If there are multiple such devices in the
> > +system, the throttling logic groups all the slow sources together and
> > +applies the limit on them as a whole.
> > +
> > +The presence of SMBA (with CXL.memory) is independent of slow memory
> > +devices presence. If there are no such devices on the system, then
> > +configuring SMBA will have no impact on the performance of the system.
> > +
> > +The bandwidth domain for slow memory is L3 cache. Its schemata file
> > +is formatted as:
> > +::
> > +
> > +	SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
> > +
> >  Reading/writing the schemata file
> >  ---------------------------------
> >  Reading the schemata file will show the state of all resources
> > @@ -479,6 +573,46 @@ which you wish to change.  E.g.
> >    L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
> >    L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
> >
> > +Reading/writing the schemata file (on AMD systems)
> > +--------------------------------------------------
> > +Reading the schemata file will show the current bandwidth limit on
> > +all domains. The allocated resources are in multiples of one eighth GB/s.
> > +When writing to the file, you need to specify what cache id you wish
> > +to configure the bandwidth limit.
> > +
> > +For example, to allocate 2GB/s limit on the first cache id:
> > +
> > +::
> > +
> > +  # cat schemata
> > +    MB:0=2048;1=2048;2=2048;3=2048
> > +    L3:0=ffff;1=ffff;2=ffff;3=ffff
> > +
> > +  # echo "MB:1=16" > schemata
> > +  # cat schemata
> > +    MB:0=2048;1=  16;2=2048;3=2048
> > +    L3:0=ffff;1=ffff;2=ffff;3=ffff
> > +
> > +Reading/writing the schemata file (on AMD systems) with SMBA feature
> > +--------------------------------------------------------------------
> > +Reading and writing the schemata file is the same as without SMBA in
> > +above section.
> > +
> > +For example, to allocate 8GB/s limit on the first cache id:
> > +
> > +::
> > +
> > +  # cat schemata
> > +    SMBA:0=2048;1=2048;2=2048;3=2048
> > +      MB:0=2048;1=2048;2=2048;3=2048
> > +      L3:0=ffff;1=ffff;2=ffff;3=ffff
> > +
> > +  # echo "SMBA:1=64" > schemata
> > +  # cat schemata
> > +    SMBA:0=2048;1=  64;2=2048;3=2048
> > +      MB:0=2048;1=2048;2=2048;3=2048
> > +      L3:0=ffff;1=ffff;2=ffff;3=ffff
> > +
> >  Cache Pseudo-Locking
> >  ====================
> >  CAT enables a user to specify the amount of cache space that an
> >
> >
> 
> Based on earlier comments I am awaiting information to understand if some
> more detail/example is needed to describe to the user what can be expected
> after a counter configuration is made.

I have proposed new text above. Please check.
Thanks
Babu
> 
> Reinette
  

Patch

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..60761a6f9087 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -17,14 +17,16 @@  AMD refers to this feature as AMD Platform Quality of Service(AMD QoS).
 This feature is enabled by the CONFIG_X86_CPU_RESCTRL and the x86 /proc/cpuinfo
 flag bits:
 
-=============================================	================================
+===============================================	================================
 RDT (Resource Director Technology) Allocation	"rdt_a"
 CAT (Cache Allocation Technology)		"cat_l3", "cat_l2"
 CDP (Code and Data Prioritization)		"cdp_l3", "cdp_l2"
 CQM (Cache QoS Monitoring)			"cqm_llc", "cqm_occup_llc"
 MBM (Memory Bandwidth Monitoring)		"cqm_mbm_total", "cqm_mbm_local"
 MBA (Memory Bandwidth Allocation)		"mba"
-=============================================	================================
+SMBA (Slow Memory Bandwidth Allocation)         "smba"
+BMEC (Bandwidth Monitoring Event Configuration) "bmec"
+===============================================	================================
 
 To use the feature mount the file system::
 
@@ -161,6 +163,79 @@  with the following files:
 "mon_features":
 		Lists the monitoring events if
 		monitoring is enabled for the resource.
+                Example::
+
+                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
+                   llc_occupancy
+                   mbm_total_bytes
+                   mbm_local_bytes
+
+                If the system supports Bandwidth Monitoring Event
+                Configuration (BMEC), then the bandwidth events will
+                be configurable. The output will be::
+
+                   # cat /sys/fs/resctrl/info/L3_MON/mon_features
+                   llc_occupancy
+                   mbm_total_bytes
+                   mbm_total_bytes_config
+                   mbm_local_bytes
+                   mbm_local_bytes_config
+
+"mbm_total_bytes_config", "mbm_local_bytes_config":
+        These files contain the current event configuration for the events
+        mbm_total_bytes and mbm_local_bytes, respectively, when the
+        Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
+        The event configuration settings are domain specific and will affect
+        all the CPUs in the domain.
+
+        Following are the types of events supported:
+
+        ====    ========================================================
+        Bits    Description
+        ====    ========================================================
+        6       Dirty Victims from the QOS domain to all types of memory
+        5       Reads to slow memory in the non-local NUMA domain
+        4       Reads to slow memory in the local NUMA domain
+        3       Non-temporal writes to non-local NUMA domain
+        2       Non-temporal writes to local NUMA domain
+        1       Reads to memory in the non-local NUMA domain
+        0       Reads to memory in the local NUMA domain
+        ====    ========================================================
+
+        By default, the mbm_total_bytes configuration is set to 0x7f to count
+        all the event types and the mbm_local_bytes configuration is set to
+        0x15 to count all the local memory events.
+
+        Examples:
+
+        * To view the current configuration:
+          ::
+
+            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+            0=0x7f;1=0x7f;2=0x7f;3=0x7f
+
+            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+            0=0x15;1=0x15;3=0x15;4=0x15
+
+        * To change the mbm_total_bytes to count only reads on domain 0,
+          the bits 0, 1, 4 and 5 need to be set, which is 110011b in binary
+          (in hexadecimal 0x33):
+          ::
+
+            # echo  "0=0x33" > /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+
+            # cat /sys/fs/resctrl/info/L3_MON/mbm_total_bytes_config
+            0=0x33;1=0x7f;2=0x7f;3=0x7f
+
+        * To change the mbm_local_bytes to count all the slow memory reads on
+          domains 0 and 1, the bits 4 and 5 need to be set, which is 110000b
+          in binary (in hexadecimal 0x30):
+          ::
+
+            # echo  "0=0x30;1=0x30" > /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+
+            # cat /sys/fs/resctrl/info/L3_MON/mbm_local_bytes_config
+            0=0x30;1=0x30;3=0x15;4=0x15
 
 "max_threshold_occupancy":
 		Read/write file provides the largest value (in
@@ -464,6 +539,25 @@  Memory bandwidth domain is L3 cache.
 
 	MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
 
+Slow Memory Bandwidth Allocation (SMBA)
+---------------------------------------
+AMD hardware supports Slow Memory Bandwidth Allocation (SMBA).
+CXL.memory is the only supported "slow" memory device. With the
+support of SMBA, the hardware enables bandwidth allocation on
+the slow memory devices. If there are multiple such devices in
+the system, the throttling logic groups all the slow sources
+together and applies the limit on them as a whole.
+
+The presence of SMBA (with CXL.memory) is independent of the presence of
+slow memory devices. If there are no such devices on the system, then
+configuring SMBA will have no impact on the performance of the system.
+
+The bandwidth domain for slow memory is L3 cache. Its schemata file
+is formatted as:
+::
+
+	SMBA:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
 Reading/writing the schemata file
 ---------------------------------
 Reading the schemata file will show the state of all resources
@@ -479,6 +573,46 @@  which you wish to change.  E.g.
   L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
   L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 
+Reading/writing the schemata file (on AMD systems)
+--------------------------------------------------
+Reading the schemata file will show the current bandwidth limit on all
+domains. The allocated resources are in multiples of one eighth GB/s.
+When writing to the file, you need to specify the cache id for which you
+wish to configure the bandwidth limit.
+
+For example, to allocate a 2GB/s limit on cache id 1:
+
+::
+
+  # cat schemata
+    MB:0=2048;1=2048;2=2048;3=2048
+    L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+  # echo "MB:1=16" > schemata
+  # cat schemata
+    MB:0=2048;1=  16;2=2048;3=2048
+    L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with SMBA feature
+--------------------------------------------------------------------
+Reading and writing the schemata file is the same as without SMBA, as
+described in the section above.
+
+For example, to allocate an 8GB/s limit on cache id 1:
+
+::
+
+  # cat schemata
+    SMBA:0=2048;1=2048;2=2048;3=2048
+      MB:0=2048;1=2048;2=2048;3=2048
+      L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+  # echo "SMBA:1=64" > schemata
+  # cat schemata
+    SMBA:0=2048;1=  64;2=2048;3=2048
+      MB:0=2048;1=2048;2=2048;3=2048
+      L3:0=ffff;1=ffff;2=ffff;3=ffff
+
 Cache Pseudo-Locking
 ====================
 CAT enables a user to specify the amount of cache space that an
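
The schemata examples in the patch use units of one eighth GB/s, so `MB:1=16` is 2GB/s and `SMBA:1=64` is 8GB/s. A minimal sketch of parsing a bandwidth line and converting units, assuming the `RESOURCE:id=value;...` layout shown in the patch; the function names are hypothetical, and the parser is only meant for the decimal MB and SMBA lines, not the hexadecimal L3 masks:

```python
UNITS_PER_GBPS = 8  # the patch states values are multiples of one eighth GB/s


def parse_schemata_line(line):
    """Split 'SMBA:0=2048;1=  64' into ('SMBA', {0: 2048, 1: 64})."""
    resource, _, domains = line.strip().partition(":")
    values = {}
    for entry in domains.split(";"):
        cache_id, _, value = entry.partition("=")
        values[int(cache_id)] = int(value)  # int() tolerates the padding spaces
    return resource.strip(), values


def to_gbps(units):
    """Convert a schemata value to GB/s."""
    return units / UNITS_PER_GBPS


def from_gbps(gbps):
    """Convert a GB/s limit to the nearest schemata value."""
    return round(gbps * UNITS_PER_GBPS)


resource, limits = parse_schemata_line("    SMBA:0=2048;1=  64;2=2048;3=2048")
print(resource, to_gbps(limits[1]))  # SMBA 8.0
print(from_gbps(2))                  # 16
```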