[23/24] selftests/resctrl: Add L2 CAT test

Message ID	20231024092634.7122-24-ilpo.jarvinen@linux.intel.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35
	From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
	To: linux-kselftest@vger.kernel.org, Reinette Chatre <reinette.chatre@intel.com>, Shuah Khan <shuah@kernel.org>, Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>, Maciej Wieczór-Retman <maciej.wieczor-retman@intel.com>, Fenghua Yu <fenghua.yu@intel.com>
	Cc: linux-kernel@vger.kernel.org, Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
	Subject: [PATCH 23/24] selftests/resctrl: Add L2 CAT test
	Date: Tue, 24 Oct 2023 12:26:33 +0300
	Message-Id: <20231024092634.7122-24-ilpo.jarvinen@linux.intel.com>
	In-Reply-To: <20231024092634.7122-1-ilpo.jarvinen@linux.intel.com>
	References: <20231024092634.7122-1-ilpo.jarvinen@linux.intel.com>
	MIME-Version: 1.0
	Content-Type: text/plain; charset=UTF-8
	Content-Transfer-Encoding: 8bit
	Precedence: bulk
Series	selftests/resctrl: CAT test improvements & generalized test framework
	[00/24] selftests/resctrl: CAT test improvements & generalized test framework
	[01/24] selftests/resctrl: Split fill_buf to allow tests finer-grained control
	[02/24] selftests/resctrl: Refactor fill_buf functions
	[03/24] selftests/resctrl: Refactor get_cbm_mask()
	[04/24] selftests/resctrl: Mark get_cache_size() cache_type const
	[05/24] selftests/resctrl: Create cache_size() helper
	[06/24] selftests/resctrl: Exclude shareable bits from schemata in CAT test
	[07/24] selftests/resctrl: Split measure_cache_vals() function
	[08/24] selftests/resctrl: Split show_cache_info() to test specific and generic parts
	[09/24] selftests/resctrl: Remove unnecessary __u64 -> unsigned long conversion
	[10/24] selftests/resctrl: Remove nested calls in perf event handling
	[11/24] selftests/resctrl: Consolidate naming of perf event related things
	[12/24] selftests/resctrl: Improve perf init
	[13/24] selftests/resctrl: Convert perf related globals to locals
	[14/24] selftests/resctrl: Move cat_val() to cat_test.c and rename to cat_test()
	[15/24] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations
	[16/24] selftests/resctrl: Rewrite Cache Allocation Technology (CAT) test
	[17/24] selftests/resctrl: Create struct for input parameter
	[18/24] selftests/resctrl: Introduce generalized test framework
	[19/24] selftests/resctrl: Pass write_schemata() resource instead of test name
	[20/24] selftests/resctrl: Add helper to convert L2/3 to integer
	[21/24] selftests/resctrl: Get resource id from cache id
	[22/24] selftests/resctrl: Add test groups and name L3 CAT test L3_CAT
	[23/24] selftests/resctrl: Add L2 CAT test
	[24/24] selftests/resctrl: Ignore failures from L2 CAT test with <= 2 bits

Commit Message

Ilpo Järvinen Oct. 24, 2023, 9:26 a.m. UTC

  CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
support.

Add L2 CAT selftest. As measuring L2 misses is not easily available
with perf, use L3 accesses as a proxy for L2 CAT working or not.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
 tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
 tools/testing/selftests/resctrl/resctrl.h     |  1 +
 .../testing/selftests/resctrl/resctrl_tests.c |  1 +
 3 files changed, 63 insertions(+), 7 deletions(-)

Comments

Maciej Wieczor-Retman Oct. 27, 2023, 12:46 p.m. UTC | #1

On 2023-10-24 at 12:26:33 +0300, Ilpo Järvinen wrote:
>CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
>support.

Is there some defined cut-off, i.e. a CPU model from which L2 CAT is supported?

In my opinion, from the perspective of someone digging up this commit a couple
years from now it could be handy to have something more specific instead of
"some newer CPUs".

>
>Add L2 CAT selftest. As measuring L2 misses is not easily available
>with perf, use L3 accesses as a proxy for L2 CAT working or not.
>
>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

Reinette Chatre Nov. 2, 2023, 5:57 p.m. UTC | #2

Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
> support.

No need to use "new" language. L2 CAT has been available for a long time
... since Apollo Lake. Which systems actually support it is a different
topic. This is an architectural feature that has been available for a
long time. Whether a system supports it will be detected and the test
run based on that. 

> 
> Add L2 CAT selftest. As measuring L2 misses is not easily available
> with perf, use L3 accesses as a proxy for L2 CAT working or not.

I understand the exact measurement is not available but I do notice some
L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
looks promising.

L3 cannot be relied on for those systems, like Apollo Lake, that do
not have an L3.
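
One way a test could find out up front whether a CPU has an L3 at all is to walk the sysfs cache directories. The following is only a sketch under the assumption that the standard /sys/devices/system/cpu/cpuN/cache/indexM/level layout is present; cpu_has_cache_level() is a hypothetical helper and not part of the resctrl selftests.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the selftests): report whether @cpu
 * exposes a cache of the given @level below @sysfs_root, which would
 * normally be "/sys/devices/system/cpu". The walk stops at the first
 * missing cache/indexN directory or after 16 indexes.
 */
static bool cpu_has_cache_level(const char *sysfs_root, int cpu, int level)
{
	char path[256];
	int idx, found;
	FILE *fp;

	for (idx = 0; idx < 16; idx++) {
		snprintf(path, sizeof(path), "%s/cpu%d/cache/index%d/level",
			 sysfs_root, cpu, idx);
		fp = fopen(path, "r");
		if (!fp)
			return false;	/* ran out of cache indexes */
		if (fscanf(fp, "%d", &found) != 1)
			found = -1;	/* unreadable entry, keep looking */
		fclose(fp);
		if (found == level)
			return true;
	}
	return false;
}
```

A caller could use cpu_has_cache_level("/sys/devices/system/cpu", cpu_no, 3) to decide whether an L3-based proxy measurement makes sense on the running system.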

> 
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
>  tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
>  tools/testing/selftests/resctrl/resctrl.h     |  1 +
>  .../testing/selftests/resctrl/resctrl_tests.c |  1 +
>  3 files changed, 63 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 48a96acd9e31..a9c72022bb5a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>  	remove(RESULT_FILE_NAME);
>  }
>  
> +/*
> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> + * because perf cannot directly provide the number of L2 misses (there are
> + * only platform specific ways to get the number of L2 misses).
> + *
> + * This function sets up L3 CAT to reduce noise from other processes during
> + * L2 CAT test.

This motivation is not clear to me. Does the same isolation used during L3 CAT
testing not work? I expected it to follow the same idea with the L2 cache split
in two, the test using one part and the rest of the system using the other.
Is that not enough isolation?

Reinette

Ilpo Järvinen Nov. 3, 2023, 10:39 a.m. UTC | #3

On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

> > Add L2 CAT selftest. As measuring L2 misses is not easily available
> > with perf, use L3 accesses as a proxy for L2 CAT working or not.
> 
> I understand the exact measurement is not available but I do notice some
> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> looks promising.

Okay, I was under the impression that L2 misses are not available, based 
both on what you mentioned to me half a year ago and on the flags I 
found in the header. But I'll take another look into it.

> L3 cannot be relied on for those systems, like Apollo lake, that do
> not have an L3.

Do you happen to know what perf will report for such CPUs? Will it return 
L2 as LLC?

> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > ---
> >  tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
> >  tools/testing/selftests/resctrl/resctrl.h     |  1 +
> >  .../testing/selftests/resctrl/resctrl_tests.c |  1 +
> >  3 files changed, 63 insertions(+), 7 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 48a96acd9e31..a9c72022bb5a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> >  	remove(RESULT_FILE_NAME);
> >  }
> >  
> > +/*
> > + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> > + * because perf cannot directly provide the number of L2 misses (there are
> > + * only platform specific ways to get the number of L2 misses).
> > + *
> > + * This function sets up L3 CAT to reduce noise from other processes during
> > + * L2 CAT test.
> 
> This motivation is not clear to me. Does the same isolation used during 
> L3 CAT testing not work? I expected it to follow the same idea with the 
> L2 cache split in two, the test using one part and the rest of the 
> system using the other. Is that not enough isolation?

Isolation for L2 is done the very same way as with L3 and I think that by 
itself works just fine.

However, because the L2 CAT selftest as is measures L3 accesses, which in 
an ideal world equal L2 misses, isolating selftest-related L3 accesses from 
the rest of the system should reduce noise in the number of L3 accesses. 
It's not mandatory though, so if L3 CAT is not available the function just 
prints a warning about the potential noise and sets up nothing for L3.

But I'll see if I can make it use L2 misses directly so this wouldn't 
matter.
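
The L3 split that l3_proxy_prepare() performs in the patch can be sketched in isolation as below. The two helper bodies are stand-ins written for illustration; they are assumptions and not copies of the selftest helpers with the same names, and split_l3_mask() is a hypothetical wrapper.

```c
/*
 * Stand-in for the selftest helper of the same name (body is an
 * assumption): length of the lowest run of set bits in @mask, with
 * *@start set to the position of the run's first bit.
 */
static int count_contiguous_bits(unsigned long mask, unsigned int *start)
{
	unsigned int pos = 0;
	int count = 0;

	*start = 0;
	if (!mask)
		return 0;

	while (!(mask & 1UL)) {
		mask >>= 1;
		pos++;
	}
	*start = pos;

	while (mask & 1UL) {
		mask >>= 1;
		count++;
	}
	return count;
}

/* Stand-in (assumption): mask with @len bits set starting at bit @start. */
static unsigned long create_bit_mask(unsigned int start, unsigned int len)
{
	if (len >= sizeof(unsigned long) * 8)
		return ~0UL << start;
	return ((1UL << len) - 1UL) << start;
}

/*
 * Hypothetical wrapper mirroring the patch: the lower half of the usable
 * L3 bits goes to the test's control group, the remaining bits to the
 * default group used by the rest of the system.
 */
static void split_l3_mask(unsigned long full, unsigned long *test_mask,
			  unsigned long *rest_mask)
{
	unsigned int start;
	int bits = count_contiguous_bits(full, &start);

	*test_mask = create_bit_mask(start, bits / 2);
	*rest_mask = full & ~*test_mask;
}
```

With a usable mask of 0xff0, this gives 0xf0 to the test's control group and 0xf00 to the default group. Note that a mask with a single usable bit yields an empty test mask, so a real caller would need to handle that case.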

Reinette Chatre Nov. 3, 2023, 10:53 p.m. UTC | #4

Hi Ilpo,

On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> 
>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>
>> I understand the exact measurement is not available but I do notice some
>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>> looks promising.
> 
> Okay, I was under impression that L2 misses are not available. Both based 
> on what you mentioned to me half an year ago and because of what flags I 
> found from the header. But I'll take another look into it.

You are correct that when I did L2 testing a long time ago I used
the model specific L2 miss counts. I was hoping that things have improved
so that model specific counters are not needed, as you have tried here.
I found the l2_rqsts symbol while looking for alternatives but I am not
familiar enough with perf to know how these symbolic names are mapped.
I was hoping that they could be a simple drop-in replacement to
experiment with.

> 
>> L3 cannot be relied on for those systems, like Apollo lake, that do
>> not have an L3.
> 
> Do you happen know what perf will report for such CPUs, will it return 
> L2 as LLC?

I don't know.

> 
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> ---
>>>  tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
>>>  tools/testing/selftests/resctrl/resctrl.h     |  1 +
>>>  .../testing/selftests/resctrl/resctrl_tests.c |  1 +
>>>  3 files changed, 63 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 48a96acd9e31..a9c72022bb5a 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>>  	remove(RESULT_FILE_NAME);
>>>  }
>>>  
>>> +/*
>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>> + * because perf cannot directly provide the number of L2 misses (there are
>>> + * only platform specific ways to get the number of L2 misses).
>>> + *
>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>> + * L2 CAT test.
>>
>> This motivation is not clear to me. Does the same isolation used during 
>> L3 CAT testing not work? I expected it to follow the same idea with the 
>> L2 cache split in two, the test using one part and the rest of the 
>> system using the other. Is that not enough isolation?
> 
> Isolation for L2 is done very same way as with L3 and I think it itself 
> works just fine.
> 
> However, because L2 CAT selftest as is measures L3 accesses that in ideal 
> world equals to L2 misses, isolating selftest related L3 accesses from the 
> rest of the system should reduce noise in the # of L3 accesses. It's not 
> mandatory though so if L3 CAT is not available the function just prints a 
> warning about the potential noise and does setup nothing for L3.

This is not clear to me. If the read misses L2 and then accesses L3 then
it should not matter which part of L3 cache the work is isolated to. What noise
do you have in mind?

Reinette

Ilpo Järvinen Nov. 6, 2023, 9:53 a.m. UTC | #5

On Fri, 3 Nov 2023, Reinette Chatre wrote:
> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> > On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> > 
> >>> Add L2 CAT selftest. As measuring L2 misses is not easily available
> >>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
> >>
> >> I understand the exact measurement is not available but I do notice some
> >> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> >> looks promising.
> > 
> > Okay, I was under impression that L2 misses are not available. Both based 
> > on what you mentioned to me half an year ago and because of what flags I 
> > found from the header. But I'll take another look into it.
> 
> You are correct that when I did L2 testing a long time ago I used
> the model specific L2 miss counts. I was hoping that things have improved
> so that model specific counters are not needed, as you have tried here.
> I found the l2_rqsts symbol while looking for alternatives but I am not
> familiar enough with perf to know how these symbolic names are mapped.
> I was hoping that they could be a simple drop-in replacement to
> experiment with.

According to the perf_event_open() manpage, mapping those symbolic names 
requires libpfm, so this would add a library dependency?

> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> ---
> >>>  tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
> >>>  tools/testing/selftests/resctrl/resctrl.h     |  1 +
> >>>  .../testing/selftests/resctrl/resctrl_tests.c |  1 +
> >>>  3 files changed, 63 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >>> index 48a96acd9e31..a9c72022bb5a 100644
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> >>>  	remove(RESULT_FILE_NAME);
> >>>  }
> >>>  
> >>> +/*
> >>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> >>> + * because perf cannot directly provide the number of L2 misses (there are
> >>> + * only platform specific ways to get the number of L2 misses).
> >>> + *
> >>> + * This function sets up L3 CAT to reduce noise from other processes during
> >>> + * L2 CAT test.
> >>
> >> This motivation is not clear to me. Does the same isolation used during 
> >> L3 CAT testing not work? I expected it to follow the same idea with the 
> >> L2 cache split in two, the test using one part and the rest of the 
> >> system using the other. Is that not enough isolation?
> > 
> > Isolation for L2 is done very same way as with L3 and I think it itself 
> > works just fine.
> > 
> > However, because L2 CAT selftest as is measures L3 accesses that in ideal 
> > world equals to L2 misses, isolating selftest related L3 accesses from the 
> > rest of the system should reduce noise in the # of L3 accesses. It's not 
> > mandatory though so if L3 CAT is not available the function just prints a 
> > warning about the potential noise and does setup nothing for L3.
> 
> This is not clear to me. If the read misses L2 and then accesses L3 then
> it should not matter which part of L3 cache the work is isolated to. 
> What noise do you have in mind?

The way it is currently done is to measure L3 accesses. If something else 
runs at the same time as the CAT selftest, it can do memory accesses that 
cause L3 accesses, which show up as noise in the number of L3 accesses 
since those accesses are unrelated to the L2 CAT selftest.

Reinette Chatre Nov. 6, 2023, 5:03 p.m. UTC | #6

Hi Ilpo,

On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>
>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>
>>>> I understand the exact measurement is not available but I do notice some
>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>> looks promising.
>>>
>>> Okay, I was under impression that L2 misses are not available. Both based 
>>> on what you mentioned to me half an year ago and because of what flags I 
>>> found from the header. But I'll take another look into it.
>>
>> You are correct that when I did L2 testing a long time ago I used
>> the model specific L2 miss counts. I was hoping that things have improved
>> so that model specific counters are not needed, as you have tried here.
>> I found the l2_rqsts symbol while looking for alternatives but I am not
>> familiar enough with perf to know how these symbolic names are mapped.
>> I was hoping that they could be a simple drop-in replacement to
>> experiment with.
> 
> According to perf_event_open() manpage, mapping those symbolic names 
> requires libpfm so this would add a library dependency?

I do not see perf list using this library to determine the event and
umask but I am in unfamiliar territory. I'll have to spend some more
time here to determine options.

> 
>>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>> ---
>>>>>  tools/testing/selftests/resctrl/cat_test.c    | 68 +++++++++++++++++--
>>>>>  tools/testing/selftests/resctrl/resctrl.h     |  1 +
>>>>>  .../testing/selftests/resctrl/resctrl_tests.c |  1 +
>>>>>  3 files changed, 63 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>>>> index 48a96acd9e31..a9c72022bb5a 100644
>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>>>>  	remove(RESULT_FILE_NAME);
>>>>>  }
>>>>>  
>>>>> +/*
>>>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>>>> + * because perf cannot directly provide the number of L2 misses (there are
>>>>> + * only platform specific ways to get the number of L2 misses).
>>>>> + *
>>>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>>>> + * L2 CAT test.
>>>>
>>>> This motivation is not clear to me. Does the same isolation used during 
>>>> L3 CAT testing not work? I expected it to follow the same idea with the 
>>>> L2 cache split in two, the test using one part and the rest of the 
>>>> system using the other. Is that not enough isolation?
>>>
>>> Isolation for L2 is done very same way as with L3 and I think it itself 
>>> works just fine.
>>>
>>> However, because L2 CAT selftest as is measures L3 accesses that in ideal 
>>> world equals to L2 misses, isolating selftest related L3 accesses from the 
>>> rest of the system should reduce noise in the # of L3 accesses. It's not 
>>> mandatory though so if L3 CAT is not available the function just prints a 
>>> warning about the potential noise and does setup nothing for L3.
>>
>> This is not clear to me. If the read misses L2 and then accesses L3 then
>> it should not matter which part of L3 cache the work is isolated to. 
>> What noise do you have in mind?
> 
> The way it is currently done is to measure L3 accesses. If something else 
> runs at the same time as the CAT selftest, it can do mem accesses that 
> cause L3 accesses which is noise in the # of L3 accesses number since 
> those accesses were unrelated to the L2 CAT selftest.
> 

Creating a CAT allocation sets aside a portion of cache into which a
task/CPU can allocate; it does not prevent one task from accessing
the cache concurrently with another.

Reinette

Reinette Chatre Nov. 6, 2023, 9:22 p.m. UTC | #7

Hi Ilpo,

On 11/6/2023 9:03 AM, Reinette Chatre wrote:
> On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
>> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>>
>>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>>
>>>>> I understand the exact measurement is not available but I do notice some
>>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>>> looks promising.
>>>>
>>>> Okay, I was under impression that L2 misses are not available. Both based 
>>>> on what you mentioned to me half an year ago and because of what flags I 
>>>> found from the header. But I'll take another look into it.
>>>
>>> You are correct that when I did L2 testing a long time ago I used
>>> the model specific L2 miss counts. I was hoping that things have improved
>>> so that model specific counters are not needed, as you have tried here.
>>> I found the l2_rqsts symbol while looking for alternatives but I am not
>>> familiar enough with perf to know how these symbolic names are mapped.
>>> I was hoping that they could be a simple drop-in replacement to
>>> experiment with.
>>
>> According to perf_event_open() manpage, mapping those symbolic names 
>> requires libpfm so this would add a library dependency?
> 
> I do not see perf list using this library to determine the event and
> umask but I am in unfamiliar territory. I'll have to spend some more
> time here to determine options.

tools/perf/pmu-events/README cleared it up for me. The architecture specific
tables are included in the perf binary. Potentially pmu-events.h could be
included or the test could just stick with the architectural events.
A quick look at the various cache.json files created the impression that
the events of interest may actually have the same event code and umask across
platforms.
I am not familiar with libpfm. This can surely be considered if it supports
this testing. Several selftests have library dependencies.

Reinette

Ilpo Järvinen Nov. 7, 2023, 9:33 a.m. UTC | #8

On Mon, 6 Nov 2023, Reinette Chatre wrote:
> On 11/6/2023 9:03 AM, Reinette Chatre wrote:
> > On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
> >> On Fri, 3 Nov 2023, Reinette Chatre wrote:
> >>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> >>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> >>>>
> >>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
> >>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
> >>>>>
> >>>>> I understand the exact measurement is not available but I do notice some
> >>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> >>>>> looks promising.
> >>>>
> >>>> Okay, I was under impression that L2 misses are not available. Both based 
> >>>> on what you mentioned to me half an year ago and because of what flags I 
> >>>> found from the header. But I'll take another look into it.
> >>>
> >>> You are correct that when I did L2 testing a long time ago I used
> >>> the model specific L2 miss counts. I was hoping that things have improved
> >>> so that model specific counters are not needed, as you have tried here.
> >>> I found the l2_rqsts symbol while looking for alternatives but I am not
> >>> familiar enough with perf to know how these symbolic names are mapped.
> >>> I was hoping that they could be a simple drop-in replacement to
> >>> experiment with.
> >>
> >> According to perf_event_open() manpage, mapping those symbolic names 
> >> requires libpfm so this would add a library dependency?
> > 
> > I do not see perf list using this library to determine the event and
> > umask but I am in unfamiliar territory. I'll have to spend some more
> > time here to determine options.
> 
> tools/perf/pmu-events/README cleared it up for me. The architecture specific
> tables are included in the perf binary. Potentially pmu-events.h could be
> included or the test could just stick with the architectural events.
> A quick look at the various cache.json files created the impression that
> the events of interest may actually have the same event code and umask across
> platforms.
> I am not familiar with libpfm. This can surely be considered if it supports
> this testing. Several selftests have library dependencies.

man perf_event_open() says this:

"If type is PERF_TYPE_RAW, then a custom "raw" config value is needed.
Most CPUs support events that are not covered by the "generalized"
events. These are implementation defined; see your CPU manual (for
example the Intel Volume 3B documentation or the AMD BIOS and Kernel
Developer Guide). The libpfm4 library can be used to translate from the
name in the architectural manuals to the raw hex value perf_event_open()
expects in this field."

...I've not come across libpfm myself either but to me it looks like libpfm 
bridges between those architecture specific tables and perf_event_open(). 
That is, it could provide the binary value necessary for constructing the 
perf_event_attr struct.

I think this is probably the function which maps string -> 
perf_event_attr:

https://man7.org/linux/man-pages/man3/pfm_get_os_event_encoding.3.html
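
For comparison, if the event code and umask really are stable across the platforms of interest, the raw value could also be built by hand without any library. A minimal sketch, assuming the Intel core PMU layout (event select in config bits 0-7, unit mask in bits 8-15); both function names are hypothetical, and the attribute setup is deliberately simpler than the selftest's own perf_event_attr_initialize().

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: build a raw config value for an Intel core PMU
 * event, with the event select in bits 0-7 and the unit mask in bits 8-15.
 */
static uint64_t intel_raw_config(uint8_t event, uint8_t umask)
{
	return (uint64_t)event | ((uint64_t)umask << 8);
}

/*
 * Hypothetical helper: prepare @pea for counting a raw, model-specific
 * event in user space only. The counter starts disabled so the caller
 * can enable it around the measured section.
 */
static void perf_event_attr_init_raw(struct perf_event_attr *pea,
				     uint64_t config)
{
	memset(pea, 0, sizeof(*pea));
	pea->type = PERF_TYPE_RAW;
	pea->size = sizeof(*pea);
	pea->config = config;
	pea->disabled = 1;
	pea->exclude_kernel = 1;
	pea->exclude_hv = 1;
}
```

As an example, event 0x24 with umask 0x27 is what some Intel cache.json files list for l2_rqsts.all_demand_miss; treat those two numbers as an assumption to verify against the pmu-events table for the target CPU. They would be passed as perf_event_attr_init_raw(&pea, intel_raw_config(0x24, 0x27)).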

Reinette Chatre Nov. 8, 2023, 4:31 p.m. UTC | #9

Hi Ilpo,

On 11/7/2023 1:33 AM, Ilpo Järvinen wrote:
> man perf_event_open() says this:
> 
> "If type is PERF_TYPE_RAW, then a custom "raw" config  value  is  needed.
> Most  CPUs  support  events  that  are  not covered by the "generalized"
> events.  These are implementation defined; see your CPU manual (for  ex-
> ample  the  Intel Volume 3B documentation or the AMD BIOS and Kernel De-
> veloper Guide).  The libpfm4 library can be used to translate  from  the
> name in the architectural manuals to the raw hex value perf_event_open()
> expects in this field."
> 
> ...I've not come across libpfm myself either but to me it looks libpfm 
> bridges between those architecture specific tables and perf_event_open(). 
> That is, it could provide the binary value necessary in constructing the 
> perf_event_attr struct.
> 
> I think this is probably the function which maps string -> 
> perf_event_attr:
> 
> https://man7.org/linux/man-pages/man3/pfm_get_os_event_encoding.3.html
> 

This sounds promising. If this works out I think that it would be ideal if
the L2 CAT test is not blocked by absence of libpfm. That is, the resctrl
tests should not fail to build if libpfm is not present but instead
L2 CAT just turns into a simple functional test. To accomplish this it looks
like tools/build/Makefile.feature can be helpful and already has a check
for libpfm.
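
The optional dependency described above could look roughly like the sketch below. It assumes the build defines HAVE_LIBPFM when the feature check finds libpfm4 (the macro name is an assumption for the selftests) and that the event string is given in libpfm4's own naming; l2_miss_event_resolve() is a hypothetical helper.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>

#ifdef HAVE_LIBPFM	/* assumed to be defined by the build when libpfm4 is found */
#include <perfmon/pfm_perf_event.h>
#endif

/*
 * Hypothetical helper: translate the event string @name into @pea.
 * Returns false when no translation is possible, in which case the
 * caller would skip the miss measurement and run the L2 CAT test as a
 * functional test only.
 */
static bool l2_miss_event_resolve(const char *name, struct perf_event_attr *pea)
{
	memset(pea, 0, sizeof(*pea));
	pea->size = sizeof(*pea);

#ifdef HAVE_LIBPFM
	pfm_perf_encode_arg_t arg;

	if (pfm_initialize() != PFM_SUCCESS)
		return false;

	memset(&arg, 0, sizeof(arg));
	arg.attr = pea;
	arg.size = sizeof(arg);

	/* PFM_PLM3 restricts counting to user level. */
	return pfm_get_os_event_encoding(name, PFM_PLM3, PFM_OS_PERF_EVENT,
					 &arg) == PFM_SUCCESS;
#else
	(void)name;	/* built without libpfm4: nothing to translate with */
	return false;
#endif
}
```

Built without HAVE_LIBPFM, the helper always returns false, so the test binary still builds and the measurement part is simply skipped.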

Reinette

Patch

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 48a96acd9e31..a9c72022bb5a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -131,8 +131,47 @@  void cat_test_cleanup(void)
 	remove(RESULT_FILE_NAME);
 }
 
+/*
+ * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
+ * because perf cannot directly provide the number of L2 misses (there are
+ * only platform specific ways to get the number of L2 misses).
+ *
+ * This function sets up L3 CAT to reduce noise from other processes during
+ * L2 CAT test.
+ */
+int l3_proxy_prepare(const struct resctrl_test *test, struct resctrl_val_param *param, int cpu)
+{
+	unsigned long l3_mask, split_mask;
+	unsigned int start;
+	int count_of_bits;
+	char schemata[64];
+	int n, ret;
+
+	if (!validate_resctrl_feature_request("L3", NULL)) {
+		ksft_print_msg("%s test results may contain noise because L3 CAT is not available!\n",
+			       test->name);
+		return 0;
+	}
+
+	ret = get_mask_no_shareable("L3", &l3_mask);
+	if (ret)
+		return ret;
+	count_of_bits = count_contiguous_bits(l3_mask, &start);
+	n = count_of_bits / 2;
+	split_mask = create_bit_mask(start, n);
+
+	snprintf(schemata, sizeof(schemata), "%lx", l3_mask & ~split_mask);
+	ret = write_schemata("", schemata, cpu, "L3");
+	if (ret)
+		return ret;
+
+	snprintf(schemata, sizeof(schemata), "%lx", split_mask);
+	return write_schemata(param->ctrlgrp, schemata, cpu, "L3");
+}
+
 /*
  * cat_test:	execute CAT benchmark and measure LLC cache misses
+ * @test:	test information structure
  * @param:	parameters passed to cat_test()
  * @span:	buffer size for the benchmark
  * @current_mask	start mask for the first iteration
@@ -142,9 +181,10 @@  void cat_test_cleanup(void)
  *
  * Return:		0 on success. non-zero on failure.
  */
-static int cat_test(struct resctrl_val_param *param, const char *resource,
+static int cat_test(const struct resctrl_test *test, struct resctrl_val_param *param,
 		    size_t span, unsigned long current_mask)
 {
+	__u64 pea_config = PERF_COUNT_HW_CACHE_MISSES;
 	char *resctrl_val = param->resctrl_val;
 	static struct perf_event_read pe_read;
 	struct perf_event_attr pea;
@@ -169,20 +209,26 @@  static int cat_test(struct resctrl_val_param *param, const char *resource,
 	if (ret)
 		return ret;
 
+	if (!strcmp(test->resource, "L2")) {
+		ret = l3_proxy_prepare(test, param, param->cpu_no);
+		if (ret)
+			return ret;
+		pea_config = PERF_COUNT_HW_CACHE_REFERENCES;
+	}
+	perf_event_attr_initialize(&pea, pea_config);
+	perf_event_initialize_read_format(&pe_read);
+
 	buf = alloc_buffer(span, 1);
 	if (buf == NULL)
 		return -1;
 
-	perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
-	perf_event_initialize_read_format(&pe_read);
-
 	while (current_mask) {
 		snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
-		ret = write_schemata("", schemata, param->cpu_no, resource);
+		ret = write_schemata("", schemata, param->cpu_no, test->resource);
 		if (ret)
 			goto free_buf;
 		snprintf(schemata, sizeof(schemata), "%lx", current_mask);
-		ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, resource);
+		ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, test->resource);
 		if (ret)
 			goto free_buf;
 
@@ -269,7 +315,7 @@  static int cat_run_test(const struct resctrl_test *test, const struct user_param
 
 	remove(param.filename);
 
-	ret = cat_test(&param, test->resource, span, start_mask);
+	ret = cat_test(test, &param, span, start_mask);
 	if (ret)
 		goto out;
 
@@ -288,3 +334,11 @@  struct resctrl_test l3_cat_test = {
 	.feature_check = test_resource_feature_check,
 	.run_test = cat_run_test,
 };
+
+struct resctrl_test l2_cat_test = {
+	.name = "L2_CAT",
+	.group = "CAT",
+	.resource = "L2",
+	.feature_check = test_resource_feature_check,
+	.run_test = cat_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index f9a4cfd981f8..fffeb442c173 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -183,5 +183,6 @@  extern struct resctrl_test mbm_test;
 extern struct resctrl_test mba_test;
 extern struct resctrl_test cmt_test;
 extern struct resctrl_test l3_cat_test;
+extern struct resctrl_test l2_cat_test;
 
 #endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index d89179541d7b..9e254bca6c25 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -15,6 +15,7 @@  static struct resctrl_test *resctrl_tests[] = {
 	&mba_test,
 	&cmt_test,
 	&l3_cat_test,
+	&l2_cat_test,
 };
 
 static int detect_vendor(void)