[v3,5/7] x86/resctrl: Display the RMID and COSID for resctrl groups

Message ID 167778869402.1053859.6094569492538617564.stgit@bmoger-ubuntu
State New
Headers
Series x86/resctrl: Miscellaneous resctrl features |

Commit Message

Moger, Babu March 2, 2023, 8:24 p.m. UTC
  When a user creates a control or monitor group, the CLOSID or RMID
are not visible to the user. These are architecturally defined entities.
There is no harm in displaying these in resctrl groups. Sometimes it
can help to debug the issues.

Add CLOSID and RMID to the control/monitor groups display in resctrl
interface.

$cat /sys/fs/resctrl/clos1/closid
1
$cat /sys/fs/resctrl/mon_groups/mon1/rmid
3

Signed-off-by: Babu Moger <babu.moger@amd.com>
---
 Documentation/x86/resctrl.rst          |   17 ++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |   44 ++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)
  

Comments

Reinette Chatre March 15, 2023, 6:42 p.m. UTC | #1
Hi Babu,

Please note "COSID" (CLOSID?) in the subject. Even so, you already
know and have been reminded when you submitted an earlier version
that resctrl needs to support Arm. Adding x86 specific bits to the
user interface complicates this enabling.

What happened to:
https://lore.kernel.org/lkml/9ca06669-7826-b3b7-0977-02185be7ce08@amd.com/

On 3/2/2023 12:24 PM, Babu Moger wrote:
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.
> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.
> 
> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
> 
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3
> 
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  Documentation/x86/resctrl.rst          |   17 ++++++++++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |   44 ++++++++++++++++++++++++++++++++
>  2 files changed, 61 insertions(+)
> 
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 25203f20002d..67eae74fe40c 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -321,6 +321,15 @@ All groups contain the following files:
>  	Just like "cpus", only using ranges of CPUs instead of bitmasks.
>  
>  
> +"rmid":
> +	Available only with debug option.Reading this file shows the

Where can the user find documentation about "debug option"? 

Also please watch spacing.

> +	Resource Monitoring ID (RMID) for monitoring the resource
> +	utilization. Monitoring is performed by tagging each core (or
> +	thread) or process via a RMID. Kernel assigns a new RMID when
> +	a group is created depending on the available RMIDs. Multiple
> +	cores (or threads) or processes can share a same RMID in a resctrl
> +	domain.

Please make this not to be x86 specific. You can surely document the
x86 details but that should start with something like "on x86 this ..."

What does "a resctrl domain" mean to user space? Did you mean "resource
group"?

> +
>  When control is enabled all CTRL_MON groups will also contain:
>  
>  "schemata":
> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
>  	file. On successful pseudo-locked region creation the mode will
>  	automatically change to "pseudo-locked".
>  
> +"closid":
> +	Available only with debug option. Reading this file shows the
> +	Class of Service (CLOS) id which acts as a resource control tag
> +	on which the resources can be throttled. Kernel assigns a new
> +	CLOSID a control group is created depending on the available
> +	CLOSIDs. Multiple cores(or threads) or processes can share a
> +	same CLOSID in a resctrl domain.
> +

This also should not be x86 specific. Also, please check the language
and watch spacing. 
for example, "Kernel assigns a new CLOSID a control group" -> "Kernel
assigns a new CLOSID to a control group", "can share a same" -> "can
share the same".
What does "a resctrl domain" mean to user space?


>  When monitoring is enabled all MON groups will also contain:
>  

Shouldn't the "rmid" (to be renamed) entry be in this section of the
documentation?

>  "mon_data":

Reinette
  
Moger, Babu March 20, 2023, 4:52 p.m. UTC | #2
Hi Reinette,

On 3/15/23 13:42, Reinette Chatre wrote:
> Hi Babu,
> 
> Please note "COSID" (CLOSID?) in the subject. Even so, you already
> know and have been reminded when you submitted an earlier version
> that resctrl needs to support Arm. Adding x86 specific bits to the
> user interface complicates this enabling.
> 
> What happened to:
> https://lore.kernel.org/lkml/9ca06669-7826-b3b7-0977-02185be7ce08@amd.com/

Yes. It kind of loses meaning if is renamed differently. Kept it same.
I will change it to mon_hw_id and ctrl_hw_id respectively. Will change
subject line accordingly.

> 
> On 3/2/2023 12:24 PM, Babu Moger wrote:
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>>  Documentation/x86/resctrl.rst          |   17 ++++++++++++
>>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |   44 ++++++++++++++++++++++++++++++++
>>  2 files changed, 61 insertions(+)
>>
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index 25203f20002d..67eae74fe40c 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -321,6 +321,15 @@ All groups contain the following files:
>>  	Just like "cpus", only using ranges of CPUs instead of bitmasks.
>>  
>>  
>> +"rmid":
>> +	Available only with debug option.Reading this file shows the
> 
> Where can the user find documentation about "debug option"? 

I can change the patch order. I can bring patch 7 before this.

> 
> Also please watch spacing.

Ok sure.

> 
>> +	Resource Monitoring ID (RMID) for monitoring the resource
>> +	utilization. Monitoring is performed by tagging each core (or
>> +	thread) or process via a RMID. Kernel assigns a new RMID when
>> +	a group is created depending on the available RMIDs. Multiple
>> +	cores (or threads) or processes can share a same RMID in a resctrl
>> +	domain.
> 
> Please make this not to be x86 specific. You can surely document the
> x86 details but that should start with something like "on x86 this ..."

ok. Sure
> 
> What does "a resctrl domain" mean to user space? Did you mean "resource
> group"?

Yes. it should have been "resource group".
> 
>> +
>>  When control is enabled all CTRL_MON groups will also contain:
>>  
>>  "schemata":
>> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
>>  	file. On successful pseudo-locked region creation the mode will
>>  	automatically change to "pseudo-locked".
>>  
>> +"closid":
>> +	Available only with debug option. Reading this file shows the
>> +	Class of Service (CLOS) id which acts as a resource control tag
>> +	on which the resources can be throttled. Kernel assigns a new
>> +	CLOSID a control group is created depending on the available
>> +	CLOSIDs. Multiple cores(or threads) or processes can share a
>> +	same CLOSID in a resctrl domain.
>> +
> 
> This also should not be x86 specific. Also, please check the language
> and watch spacing. 

ok
> for example, "Kernel assigns a new CLOSID a control group" -> "Kernel
> assigns a new CLOSID to a control group", "can share a same" -> "can
> share the same".

Sure
> What does "a resctrl domain" mean to user space?

It should have been "resource group".

> 
> 
>>  When monitoring is enabled all MON groups will also contain:
>>  
> 
> Shouldn't the "rmid" (to be renamed) entry be in this section of the
> documentation?

Not sure about this comment. Did you mean to move the rmid (to be renamed
mon_hw_id) documentation here?

Thanks
Babu

> 
>>  "mon_data":
> 
> Reinette
>
  
Reinette Chatre March 20, 2023, 5 p.m. UTC | #3
Hi Babu,

On 3/20/2023 9:52 AM, Moger, Babu wrote:
> On 3/15/23 13:42, Reinette Chatre wrote:

...

>>
>>
>>>  When monitoring is enabled all MON groups will also contain:
>>>  
>>
>> Shouldn't the "rmid" (to be renamed) entry be in this section of the
>> documentation?
> 
> Not sure about this comment. Did you mean to move the rmid (to be renamed
> mon_hw_id) documentation here?


Yes. Note the header reads "When monitoring is enabled all MON groups
will also contain:". The new file is only relevant to monitoring groups, so
it seems appropriate that it falls under this section. 

Reinette
  
James Morse March 20, 2023, 5:10 p.m. UTC | #4
Hi Babu,

On 02/03/2023 20:24, Babu Moger wrote:
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.

On x86. Any other architecture is going to have a hard time supporting this.


> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.

By comparing it with what? Unless user-space can see into the hardware, resctrl is the
only gateway to this stuff. What difference does the allocated value here make?

Could you elaborate on what issues this can help debug?


> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
> 
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3

Er. Please don't expose this to user-space!
MPAM has no equivalent value to RMID, so whatever this is for, can't work on MPAM.


Where I have needed this value for MPAM is to pass the closid/rmid to another kernel
interface. Because the user-space interface needs to be architecture agnostic, I proposed
it as a u64 called 'id' that each architecture can encode/decode as appropriate. [0]

To prevent user-space trying to base anything on the raw closid/rmid values, I went as far
as obfuscating them with a random value picked at boot, to ensure scripts always read the
current value when passing the control/monitor group.


I'm curious what the raw value is useful for.


Thanks,

James

[0]
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.2&id=d568cf2ba58b7c4970ce41a8d4d6224e285a177e
  
Moger, Babu March 20, 2023, 7:53 p.m. UTC | #5
Hi James,

On 3/20/23 12:10, James Morse wrote:
> Hi Babu,
> 
> On 02/03/2023 20:24, Babu Moger wrote:
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
> 
> On x86. Any other architecture is going to have a hard time supporting this.
> 
> 
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
> 
> By comparing it with what? Unless user-space can see into the hardware, resctrl is the
> only gateway to this stuff. What difference does the allocated value here make?
> 
> Could you elaborate on what issues this can help debug?

While ago, we had an issue with one of the RMID's event reporting. There
were numerous active RMIDs on the system. As a kernel developer, we
couldn't pinpoint which RMID was reporting wrong information. That
information was important for hardware guys to debug further. We had to
patch the kernel to print that information for debugging. This is one of
the cases.

> 
> 
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
> 
> Er. Please don't expose this to user-space!
> MPAM has no equivalent value to RMID, so whatever this is for, can't work on MPAM.
> 
> 
> Where I have needed this value for MPAM is to pass the closid/rmid to another kernel
> interface. Because the user-space interface needs to be architecture agnostic, I proposed
> it as a u64 called 'id' that each architecture can encode/decode as appropriate. [0]
> 
> To prevent user-space trying to base anything on the raw closid/rmid values, I went as far
> as obfuscating them with a random value picked at boot, to ensure scripts always read the
> current value when passing the control/monitor group.
> 
> 
> I'm curious what the raw value is useful for.

It is mostly for debugging when there are issues.

I think we need to have a way to print generic as well as architecture
specific details.

Thanks
Babu
> 
> 
> Thanks,
> 
> James
> 
> [0]
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/commit/?h=mpam/snapshot/v6.2&id=d568cf2ba58b7c4970ce41a8d4d6224e285a177e
> 
> 
> 
>
  
Stephane Eranian March 20, 2023, 8:59 p.m. UTC | #6
On Thu, Mar 2, 2023 at 3:25 PM Babu Moger <babu.moger@amd.com> wrote:
>
> When a user creates a control or monitor group, the CLOSID or RMID
> are not visible to the user. These are architecturally defined entities.
> There is no harm in displaying these in resctrl groups. Sometimes it
> can help to debug the issues.
>
> Add CLOSID and RMID to the control/monitor groups display in resctrl
> interface.
>
> $cat /sys/fs/resctrl/clos1/closid
> 1
> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
> 3
>
Is the intent here to be x86 specific?
How would that work on ARM?
Shouldn't we be using more generic names such as monitoring_id,   control_id?


> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
>  Documentation/x86/resctrl.rst          |   17 ++++++++++++
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |   44 ++++++++++++++++++++++++++++++++
>  2 files changed, 61 insertions(+)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 25203f20002d..67eae74fe40c 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -321,6 +321,15 @@ All groups contain the following files:
>         Just like "cpus", only using ranges of CPUs instead of bitmasks.
>
>
> +"rmid":
> +       Available only with debug option.Reading this file shows the
> +       Resource Monitoring ID (RMID) for monitoring the resource
> +       utilization. Monitoring is performed by tagging each core (or
> +       thread) or process via a RMID. Kernel assigns a new RMID when
> +       a group is created depending on the available RMIDs. Multiple
> +       cores (or threads) or processes can share a same RMID in a resctrl
> +       domain.
> +
>  When control is enabled all CTRL_MON groups will also contain:
>
>  "schemata":
> @@ -342,6 +351,14 @@ When control is enabled all CTRL_MON groups will also contain:
>         file. On successful pseudo-locked region creation the mode will
>         automatically change to "pseudo-locked".
>
> +"closid":
> +       Available only with debug option. Reading this file shows the
> +       Class of Service (CLOS) id which acts as a resource control tag
> +       on which the resources can be throttled. Kernel assigns a new
> +       CLOSID a control group is created depending on the available
> +       CLOSIDs. Multiple cores(or threads) or processes can share a
> +       same CLOSID in a resctrl domain.
> +
>  When monitoring is enabled all MON groups will also contain:
>
>  "mon_data":
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index 1eb538965bd3..389d64b42704 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -760,6 +760,38 @@ static int rdtgroup_tasks_show(struct kernfs_open_file *of,
>         return ret;
>  }
>
> +static int rdtgroup_closid_show(struct kernfs_open_file *of,
> +                               struct seq_file *s, void *v)
> +{
> +       struct rdtgroup *rdtgrp;
> +       int ret = 0;
> +
> +       rdtgrp = rdtgroup_kn_lock_live(of->kn);
> +       if (rdtgrp)
> +               seq_printf(s, "%u\n", rdtgrp->closid);
> +       else
> +               ret = -ENOENT;
> +       rdtgroup_kn_unlock(of->kn);
> +
> +       return ret;
> +}
> +
> +static int rdtgroup_rmid_show(struct kernfs_open_file *of,
> +                             struct seq_file *s, void *v)
> +{
> +       struct rdtgroup *rdtgrp;
> +       int ret = 0;
> +
> +       rdtgrp = rdtgroup_kn_lock_live(of->kn);
> +       if (rdtgrp)
> +               seq_printf(s, "%u\n", rdtgrp->mon.rmid);
> +       else
> +               ret = -ENOENT;
> +       rdtgroup_kn_unlock(of->kn);
> +
> +       return ret;
> +}
> +
>  #ifdef CONFIG_PROC_CPU_RESCTRL
>
>  /*
> @@ -1821,6 +1853,12 @@ static struct rftype res_common_files[] = {
>                 .seq_show       = rdtgroup_tasks_show,
>                 .fflags         = RFTYPE_BASE,
>         },
> +       {
> +               .name           = "rmid",
> +               .mode           = 0444,
> +               .kf_ops         = &rdtgroup_kf_single_ops,
> +               .seq_show       = rdtgroup_rmid_show,
> +       },
>         {
>                 .name           = "schemata",
>                 .mode           = 0644,
> @@ -1844,6 +1882,12 @@ static struct rftype res_common_files[] = {
>                 .seq_show       = rdtgroup_size_show,
>                 .fflags         = RFTYPE_BASE_CTRL,
>         },
> +       {
> +               .name           = "closid",
> +               .mode           = 0444,
> +               .kf_ops         = &rdtgroup_kf_single_ops,
> +               .seq_show       = rdtgroup_closid_show,
> +       },
>
>  };
>
>
>
  
Moger, Babu March 22, 2023, 1:49 p.m. UTC | #7
Hi Stephane,

On 3/20/23 15:59, Stephane Eranian wrote:
> On Thu, Mar 2, 2023 at 3:25 PM Babu Moger <babu.moger@amd.com> wrote:
>>
>> When a user creates a control or monitor group, the CLOSID or RMID
>> are not visible to the user. These are architecturally defined entities.
>> There is no harm in displaying these in resctrl groups. Sometimes it
>> can help to debug the issues.
>>
>> Add CLOSID and RMID to the control/monitor groups display in resctrl
>> interface.
>>
>> $cat /sys/fs/resctrl/clos1/closid
>> 1
>> $cat /sys/fs/resctrl/mon_groups/mon1/rmid
>> 3
>>
> Is the intent here to be x86 specific?
> How would that work on ARM?
> Shouldn't we be using more generic names such as monitoring_id,   control_id?

Yes. We can do that.
Thanks
Babu
  

Patch

diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 25203f20002d..67eae74fe40c 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -321,6 +321,15 @@  All groups contain the following files:
 	Just like "cpus", only using ranges of CPUs instead of bitmasks.
 
 
+"rmid":
+	Available only with debug option.Reading this file shows the
+	Resource Monitoring ID (RMID) for monitoring the resource
+	utilization. Monitoring is performed by tagging each core (or
+	thread) or process via a RMID. Kernel assigns a new RMID when
+	a group is created depending on the available RMIDs. Multiple
+	cores (or threads) or processes can share a same RMID in a resctrl
+	domain.
+
 When control is enabled all CTRL_MON groups will also contain:
 
 "schemata":
@@ -342,6 +351,14 @@  When control is enabled all CTRL_MON groups will also contain:
 	file. On successful pseudo-locked region creation the mode will
 	automatically change to "pseudo-locked".
 
+"closid":
+	Available only with debug option. Reading this file shows the
+	Class of Service (CLOS) id which acts as a resource control tag
+	on which the resources can be throttled. Kernel assigns a new
+	CLOSID a control group is created depending on the available
+	CLOSIDs. Multiple cores(or threads) or processes can share a
+	same CLOSID in a resctrl domain.
+
 When monitoring is enabled all MON groups will also contain:
 
 "mon_data":
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1eb538965bd3..389d64b42704 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -760,6 +760,38 @@  static int rdtgroup_tasks_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int rdtgroup_closid_show(struct kernfs_open_file *of,
+				struct seq_file *s, void *v)
+{
+	struct rdtgroup *rdtgrp;
+	int ret = 0;
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (rdtgrp)
+		seq_printf(s, "%u\n", rdtgrp->closid);
+	else
+		ret = -ENOENT;
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret;
+}
+
+static int rdtgroup_rmid_show(struct kernfs_open_file *of,
+			      struct seq_file *s, void *v)
+{
+	struct rdtgroup *rdtgrp;
+	int ret = 0;
+
+	rdtgrp = rdtgroup_kn_lock_live(of->kn);
+	if (rdtgrp)
+		seq_printf(s, "%u\n", rdtgrp->mon.rmid);
+	else
+		ret = -ENOENT;
+	rdtgroup_kn_unlock(of->kn);
+
+	return ret;
+}
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 /*
@@ -1821,6 +1853,12 @@  static struct rftype res_common_files[] = {
 		.seq_show	= rdtgroup_tasks_show,
 		.fflags		= RFTYPE_BASE,
 	},
+	{
+		.name		= "rmid",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= rdtgroup_rmid_show,
+	},
 	{
 		.name		= "schemata",
 		.mode		= 0644,
@@ -1844,6 +1882,12 @@  static struct rftype res_common_files[] = {
 		.seq_show	= rdtgroup_size_show,
 		.fflags		= RFTYPE_BASE_CTRL,
 	},
+	{
+		.name		= "closid",
+		.mode		= 0444,
+		.kf_ops		= &rdtgroup_kf_single_ops,
+		.seq_show	= rdtgroup_closid_show,
+	},
 
 };