Message ID | 20231024092634.7122-24-ilpo.jarvinen@linux.intel.com |
---|---|
State | New |
Headers | From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> To: linux-kselftest@vger.kernel.org, Reinette Chatre <reinette.chatre@intel.com>, Shuah Khan <shuah@kernel.org>, Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>, Maciej Wieczór-Retman <maciej.wieczor-retman@intel.com>, Fenghua Yu <fenghua.yu@intel.com> Cc: linux-kernel@vger.kernel.org, Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Subject: [PATCH 23/24] selftests/resctrl: Add L2 CAT test Date: Tue, 24 Oct 2023 12:26:33 +0300 Message-Id: <20231024092634.7122-24-ilpo.jarvinen@linux.intel.com> In-Reply-To: <20231024092634.7122-1-ilpo.jarvinen@linux.intel.com> List-ID: <linux-kernel.vger.kernel.org> |
Series | selftests/resctrl: CAT test improvements & generalized test framework |
Commit Message
Ilpo Järvinen
Oct. 24, 2023, 9:26 a.m. UTC
CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
support.
Add L2 CAT selftest. As measuring L2 misses is not easily available
with perf, use L3 accesses as a proxy for L2 CAT working or not.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
tools/testing/selftests/resctrl/resctrl.h | 1 +
.../testing/selftests/resctrl/resctrl_tests.c | 1 +
3 files changed, 63 insertions(+), 7 deletions(-)
Comments
On 2023-10-24 at 12:26:33 +0300, Ilpo Järvinen wrote:
>CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
>support.

Is there some defined point, i.e. a CPU model, from which L2 CAT is supported? In my opinion, for someone digging up this commit a couple of years from now, it would be handy to have something more specific than "some newer CPUs".

>
>Add L2 CAT selftest. As measuring L2 misses is not easily available
>with perf, use L3 accesses as a proxy for L2 CAT working or not.
>
>Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Hi Ilpo,

On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> CAT selftests only cover L3 but some newer CPUs come also with L2 CAT
> support.

No need to use "new" language. L2 CAT has been available for a long time ... since Apollo Lake. Which systems actually support it is a different topic. This is an architectural feature that has been available for a long time. Whether a system supports it will be detected and the test run based on that.

>
> Add L2 CAT selftest. As measuring L2 misses is not easily available
> with perf, use L3 accesses as a proxy for L2 CAT working or not.

I understand the exact measurement is not available but I do notice some L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss looks promising.

L3 cannot be relied on for those systems, like Apollo lake, that do not have an L3.

>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> tools/testing/selftests/resctrl/resctrl.h | 1 +
> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> 3 files changed, 63 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 48a96acd9e31..a9c72022bb5a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> remove(RESULT_FILE_NAME);
> }
>
> +/*
> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> + * because perf cannot directly provide the number of L2 misses (there are
> + * only platform specific ways to get the number of L2 misses).
> + *
> + * This function sets up L3 CAT to reduce noise from other processes during
> + * L2 CAT test.

This motivation is not clear to me. Does the same isolation used during L3 CAT testing not work? I expected it to follow the same idea with the L2 cache split in two, the test using one part and the rest of the system using the other. Is that not enough isolation?

Reinette
On Thu, 2 Nov 2023, Reinette Chatre wrote:
> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:

> > Add L2 CAT selftest. As measuring L2 misses is not easily available
> > with perf, use L3 accesses as a proxy for L2 CAT working or not.
>
> I understand the exact measurement is not available but I do notice some
> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> looks promising.

Okay, I was under the impression that L2 misses are not available, both based on what you mentioned to me half a year ago and because of the flags I found in the header. But I'll take another look into it.

> L3 cannot be relied on for those systems, like Apollo lake, that do
> not have an L3.

Do you happen to know what perf will report for such CPUs? Will it return L2 as LLC?

> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> > tools/testing/selftests/resctrl/resctrl.h | 1 +
> > .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> > 3 files changed, 63 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 48a96acd9e31..a9c72022bb5a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> > remove(RESULT_FILE_NAME);
> > }
> >
> > +/*
> > + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> > + * because perf cannot directly provide the number of L2 misses (there are
> > + * only platform specific ways to get the number of L2 misses).
> > + *
> > + * This function sets up L3 CAT to reduce noise from other processes during
> > + * L2 CAT test.
>
> This motivation is not clear to me. Does the same isolation used during
> L3 CAT testing not work? I expected it to follow the same idea with the
> L2 cache split in two, the test using one part and the rest of the
> system using the other. Is that not enough isolation?

Isolation for L2 is done the very same way as with L3, and I think that in itself works just fine.

However, because the L2 CAT selftest as is measures L3 accesses, which in an ideal world equal L2 misses, isolating selftest related L3 accesses from the rest of the system should reduce noise in the # of L3 accesses. It's not mandatory though, so if L3 CAT is not available the function just prints a warning about the potential noise and sets up nothing for L3.

But I'll see if I can make it use L2 misses directly so this wouldn't matter.
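The mask split that this isolation relies on, and that the patch's l3_proxy_prepare() repeats for L3, can be sketched in standalone form. This is only an illustration under stated assumptions: contiguous_bits(), bit_mask() and split_cbm() are hypothetical stand-ins for the selftest's count_contiguous_bits() and create_bit_mask() helpers, and the capacity bitmask is assumed to contain a single contiguous run of set bits.

```c
#include <stddef.h>

/* Hypothetical helper: length and start of the lowest contiguous run of set bits. */
static unsigned int contiguous_bits(unsigned long mask, unsigned int *start)
{
	unsigned int count = 0;
	unsigned int pos = 0;

	*start = 0;
	if (!mask)
		return 0;

	/* Skip the zero bits below the run. */
	while (!(mask & 1UL)) {
		mask >>= 1;
		pos++;
	}
	*start = pos;

	/* Count the set bits of the run. */
	while (mask & 1UL) {
		mask >>= 1;
		count++;
	}
	return count;
}

/* Hypothetical helper: mask of len set bits beginning at bit start. */
static unsigned long bit_mask(unsigned int start, unsigned int len)
{
	if (len == 0)
		return 0;
	if (len >= sizeof(unsigned long) * 8)
		return ~0UL << start;
	return ((1UL << len) - 1) << start;
}

/*
 * Split a capacity bitmask in two: the lower half of the run goes to the
 * test's control group, the remaining bits stay with the default group.
 */
static void split_cbm(unsigned long full, unsigned long *test_mask,
		      unsigned long *rest_mask)
{
	unsigned int start, bits;

	bits = contiguous_bits(full, &start);
	*test_mask = bit_mask(start, bits / 2);
	*rest_mask = full & ~*test_mask;
}
```

For a 12-bit mask 0xfff this gives 0x3f for the test group and 0xfc0 for everything else; each value would then be written to the schemata file of the corresponding group.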
Hi Ilpo,

On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>
>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>
>> I understand the exact measurement is not available but I do notice some
>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>> looks promising.
>
> Okay, I was under impression that L2 misses are not available. Both based
> on what you mentioned to me half an year ago and because of what flags I
> found from the header. But I'll take another look into it.

You are correct that when I did L2 testing a long time ago I used the model specific L2 miss counts. I was hoping that things have improved so that model specific counters are not needed, as you have tried here. I found the l2_rqsts symbol while looking for alternatives but I am not familiar enough with perf to know how these symbolic names are mapped. I was hoping that they could be a simple drop-in replacement to experiment with.

>
>> L3 cannot be relied on for those systems, like Apollo lake, that do
>> not have an L3.
>
> Do you happen know what perf will report for such CPUs, will it return
> L2 as LLC?

I don't know.

>
>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>> ---
>>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
>>> tools/testing/selftests/resctrl/resctrl.h | 1 +
>>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
>>> 3 files changed, 63 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 48a96acd9e31..a9c72022bb5a 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>> remove(RESULT_FILE_NAME);
>>> }
>>>
>>> +/*
>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>> + * because perf cannot directly provide the number of L2 misses (there are
>>> + * only platform specific ways to get the number of L2 misses).
>>> + *
>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>> + * L2 CAT test.
>>
>> This motivation is not clear to me. Does the same isolation used during
>> L3 CAT testing not work? I expected it to follow the same idea with the
>> L2 cache split in two, the test using one part and the rest of the
>> system using the other. Is that not enough isolation?
>
> Isolation for L2 is done very same way as with L3 and I think it itself
> works just fine.
>
> However, because L2 CAT selftest as is measures L3 accesses that in ideal
> world equals to L2 misses, isolating selftest related L3 accesses from the
> rest of the system should reduce noise in the # of L3 accesses. It's not
> mandatory though so if L3 CAT is not available the function just prints a
> warning about the potential noise and does setup nothing for L3.

This is not clear to me. If the read misses L2 and then accesses L3 then it should not matter which part of L3 cache the work is isolated to. What noise do you have in mind?

Reinette
On Fri, 3 Nov 2023, Reinette Chatre wrote:
> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> > On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> >
> >>> Add L2 CAT selftest. As measuring L2 misses is not easily available
> >>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
> >>
> >> I understand the exact measurement is not available but I do notice some
> >> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> >> looks promising.
> >
> > Okay, I was under impression that L2 misses are not available. Both based
> > on what you mentioned to me half an year ago and because of what flags I
> > found from the header. But I'll take another look into it.
>
> You are correct that when I did L2 testing a long time ago I used
> the model specific L2 miss counts. I was hoping that things have improved
> so that model specific counters are not needed, as you have tried here.
> I found the l2_rqsts symbol while looking for alternatives but I am not
> familiar enough with perf to know how these symbolic names are mapped.
> I was hoping that they could be a simple drop-in replacement to
> experiment with.

According to the perf_event_open() manpage, mapping those symbolic names requires libpfm, so this would add a library dependency?

> >>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> >>> ---
> >>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
> >>> tools/testing/selftests/resctrl/resctrl.h | 1 +
> >>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
> >>> 3 files changed, 63 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> >>> index 48a96acd9e31..a9c72022bb5a 100644
> >>> --- a/tools/testing/selftests/resctrl/cat_test.c
> >>> +++ b/tools/testing/selftests/resctrl/cat_test.c
> >>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
> >>> remove(RESULT_FILE_NAME);
> >>> }
> >>>
> >>> +/*
> >>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
> >>> + * because perf cannot directly provide the number of L2 misses (there are
> >>> + * only platform specific ways to get the number of L2 misses).
> >>> + *
> >>> + * This function sets up L3 CAT to reduce noise from other processes during
> >>> + * L2 CAT test.
> >>
> >> This motivation is not clear to me. Does the same isolation used during
> >> L3 CAT testing not work? I expected it to follow the same idea with the
> >> L2 cache split in two, the test using one part and the rest of the
> >> system using the other. Is that not enough isolation?
> >
> > Isolation for L2 is done very same way as with L3 and I think it itself
> > works just fine.
> >
> > However, because L2 CAT selftest as is measures L3 accesses that in ideal
> > world equals to L2 misses, isolating selftest related L3 accesses from the
> > rest of the system should reduce noise in the # of L3 accesses. It's not
> > mandatory though so if L3 CAT is not available the function just prints a
> > warning about the potential noise and does setup nothing for L3.
>
> This is not clear to me. If the read misses L2 and then accesses L3 then
> it should not matter which part of L3 cache the work is isolated to.
> What noise do you have in mind?

The way it is currently done is to measure L3 accesses. If something else runs at the same time as the CAT selftest, it can do memory accesses that cause L3 accesses, which show up as noise in the # of L3 accesses since those accesses are unrelated to the L2 CAT selftest.
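The proxy measurement under discussion boils down to which generic perf event the test counts. A minimal sketch of that choice, mirroring the patch's switch from PERF_COUNT_HW_CACHE_MISSES to PERF_COUNT_HW_CACHE_REFERENCES for the L2 resource, might look like the following. The function names are hypothetical, and every attribute field other than type and config is an assumption of this sketch, not necessarily what the selftest's perf_event_attr_initialize() sets.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: pick the generic hardware event per resource.
 * L3 CAT counts last-level cache misses; the L2 CAT proxy counts
 * last-level cache references (L3 accesses) instead.
 */
static __u64 cat_event_for_resource(const char *resource)
{
	if (!strcmp(resource, "L2"))
		return PERF_COUNT_HW_CACHE_REFERENCES;
	return PERF_COUNT_HW_CACHE_MISSES;
}

/* Hypothetical helper: fill in a counter description for perf_event_open(). */
static void init_cat_counter(struct perf_event_attr *pea, const char *resource)
{
	memset(pea, 0, sizeof(*pea));
	pea->type = PERF_TYPE_HARDWARE;
	pea->size = sizeof(*pea);
	pea->config = cat_event_for_resource(resource);
	pea->disabled = 1;		/* assumption: enabled only around the measured loop */
	pea->exclude_kernel = 1;	/* assumption: count user space only */
	pea->exclude_hv = 1;		/* assumption: ignore hypervisor activity */
}
```

The resulting attribute would be handed to the perf_event_open() system call together with a pid and a CPU number; how much of the noise discussed here reaches the count depends on that scoping.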
Hi Ilpo,

On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>
>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>
>>>> I understand the exact measurement is not available but I do notice some
>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>> looks promising.
>>>
>>> Okay, I was under impression that L2 misses are not available. Both based
>>> on what you mentioned to me half an year ago and because of what flags I
>>> found from the header. But I'll take another look into it.
>>
>> You are correct that when I did L2 testing a long time ago I used
>> the model specific L2 miss counts. I was hoping that things have improved
>> so that model specific counters are not needed, as you have tried here.
>> I found the l2_rqsts symbol while looking for alternatives but I am not
>> familiar enough with perf to know how these symbolic names are mapped.
>> I was hoping that they could be a simple drop-in replacement to
>> experiment with.
>
> According to perf_event_open() manpage, mapping those symbolic names
> requires libpfm so this would add a library dependency?

I do not see perf list using this library to determine the event and umask but I am in unfamiliar territory. I'll have to spend some more time here to determine options.

>
>>>>> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>>>>> ---
>>>>> tools/testing/selftests/resctrl/cat_test.c | 68 +++++++++++++++++--
>>>>> tools/testing/selftests/resctrl/resctrl.h | 1 +
>>>>> .../testing/selftests/resctrl/resctrl_tests.c | 1 +
>>>>> 3 files changed, 63 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>>>> index 48a96acd9e31..a9c72022bb5a 100644
>>>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>>>> @@ -131,8 +131,47 @@ void cat_test_cleanup(void)
>>>>> remove(RESULT_FILE_NAME);
>>>>> }
>>>>>
>>>>> +/*
>>>>> + * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
>>>>> + * because perf cannot directly provide the number of L2 misses (there are
>>>>> + * only platform specific ways to get the number of L2 misses).
>>>>> + *
>>>>> + * This function sets up L3 CAT to reduce noise from other processes during
>>>>> + * L2 CAT test.
>>>>
>>>> This motivation is not clear to me. Does the same isolation used during
>>>> L3 CAT testing not work? I expected it to follow the same idea with the
>>>> L2 cache split in two, the test using one part and the rest of the
>>>> system using the other. Is that not enough isolation?
>>>
>>> Isolation for L2 is done very same way as with L3 and I think it itself
>>> works just fine.
>>>
>>> However, because L2 CAT selftest as is measures L3 accesses that in ideal
>>> world equals to L2 misses, isolating selftest related L3 accesses from the
>>> rest of the system should reduce noise in the # of L3 accesses. It's not
>>> mandatory though so if L3 CAT is not available the function just prints a
>>> warning about the potential noise and does setup nothing for L3.
>>
>> This is not clear to me. If the read misses L2 and then accesses L3 then
>> it should not matter which part of L3 cache the work is isolated to.
>> What noise do you have in mind?
>
> The way it is currently done is to measure L3 accesses. If something else
> runs at the same time as the CAT selftest, it can do mem accesses that
> cause L3 accesses which is noise in the # of L3 accesses number since
> those accesses were unrelated to the L2 CAT selftest.
>

Creating a CAT allocation sets aside a portion of cache into which a task/cpu can allocate. It does not prevent one task from accessing the cache concurrently with another.

Reinette
Hi Ilpo,

On 11/6/2023 9:03 AM, Reinette Chatre wrote:
> On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
>> On Fri, 3 Nov 2023, Reinette Chatre wrote:
>>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
>>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
>>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
>>>>
>>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
>>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
>>>>>
>>>>> I understand the exact measurement is not available but I do notice some
>>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
>>>>> looks promising.
>>>>
>>>> Okay, I was under impression that L2 misses are not available. Both based
>>>> on what you mentioned to me half an year ago and because of what flags I
>>>> found from the header. But I'll take another look into it.
>>>
>>> You are correct that when I did L2 testing a long time ago I used
>>> the model specific L2 miss counts. I was hoping that things have improved
>>> so that model specific counters are not needed, as you have tried here.
>>> I found the l2_rqsts symbol while looking for alternatives but I am not
>>> familiar enough with perf to know how these symbolic names are mapped.
>>> I was hoping that they could be a simple drop-in replacement to
>>> experiment with.
>>
>> According to perf_event_open() manpage, mapping those symbolic names
>> requires libpfm so this would add a library dependency?
>
> I do not see perf list using this library to determine the event and
> umask but I am in unfamiliar territory. I'll have to spend some more
> time here to determine options.

tools/perf/pmu-events/README cleared it up for me. The architecture specific tables are included in the perf binary. Potentially pmu-events.h could be included or the test could just stick with the architectural events. A quick look at the various cache.json files created the impression that the events of interest may actually have the same event code and umask across platforms.

I am not familiar with libpfm. This can surely be considered if it supports this testing. Several selftests have library dependencies.

Reinette
On Mon, 6 Nov 2023, Reinette Chatre wrote:
> On 11/6/2023 9:03 AM, Reinette Chatre wrote:
> > On 11/6/2023 1:53 AM, Ilpo Järvinen wrote:
> >> On Fri, 3 Nov 2023, Reinette Chatre wrote:
> >>> On 11/3/2023 3:39 AM, Ilpo Järvinen wrote:
> >>>> On Thu, 2 Nov 2023, Reinette Chatre wrote:
> >>>>> On 10/24/2023 2:26 AM, Ilpo Järvinen wrote:
> >>>>
> >>>>>> Add L2 CAT selftest. As measuring L2 misses is not easily available
> >>>>>> with perf, use L3 accesses as a proxy for L2 CAT working or not.
> >>>>>
> >>>>> I understand the exact measurement is not available but I do notice some
> >>>>> L2 related symbolic counters when I run "perf list". l2_rqsts.all_demand_miss
> >>>>> looks promising.
> >>>>
> >>>> Okay, I was under impression that L2 misses are not available. Both based
> >>>> on what you mentioned to me half an year ago and because of what flags I
> >>>> found from the header. But I'll take another look into it.
> >>>
> >>> You are correct that when I did L2 testing a long time ago I used
> >>> the model specific L2 miss counts. I was hoping that things have improved
> >>> so that model specific counters are not needed, as you have tried here.
> >>> I found the l2_rqsts symbol while looking for alternatives but I am not
> >>> familiar enough with perf to know how these symbolic names are mapped.
> >>> I was hoping that they could be a simple drop-in replacement to
> >>> experiment with.
> >>
> >> According to perf_event_open() manpage, mapping those symbolic names
> >> requires libpfm so this would add a library dependency?
> >
> > I do not see perf list using this library to determine the event and
> > umask but I am in unfamiliar territory. I'll have to spend some more
> > time here to determine options.
>
> tools/perf/pmu-events/README cleared it up for me. The architecture specific
> tables are included in the perf binary. Potentially pmu-events.h could be
> included or the test could just stick with the architectural events.
> A quick look at the various cache.json files created the impression that
> the events of interest may actually have the same event code and umask across
> platforms.
> I am not familiar with libpfm. This can surely be considered if it supports
> this testing. Several selftests have library dependencies.

man perf_event_open() says this:

"If type is PERF_TYPE_RAW, then a custom "raw" config value is needed. Most CPUs support events that are not covered by the "generalized" events. These are implementation defined; see your CPU manual (for example the Intel Volume 3B documentation or the AMD BIOS and Kernel Developer Guide). The libpfm4 library can be used to translate from the name in the architectural manuals to the raw hex value perf_event_open() expects in this field."

...I've not come across libpfm myself either, but to me it looks like libpfm bridges between those architecture specific tables and perf_event_open(). That is, it could provide the binary value necessary for constructing the perf_event_attr struct.

I think this is probably the function which maps string -> perf_event_attr:

https://man7.org/linux/man-pages/man3/pfm_get_os_event_encoding.3.html
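To make the "raw hex value" from the man page concrete: on x86 core PMUs the raw config places the event select in bits 0-7 and the unit mask in bits 8-15, so an event code and umask taken from a platform's cache.json can be combined without any library. The helper below is a hypothetical sketch; the values 0x24 and 0x27 used in the note that follows are an assumption for illustration and must be checked against the cache.json of the CPU model actually under test.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build an x86 PERF_TYPE_RAW config value from an
 * event select code (bits 0-7) and a unit mask (bits 8-15).
 */
static uint64_t x86_raw_config(uint8_t event, uint8_t umask)
{
	return ((uint64_t)umask << 8) | event;
}
```

With the assumed values, x86_raw_config(0x24, 0x27) evaluates to 0x2724, which is the same number perf accepts in its rNNNN raw event syntax. The result would go into perf_event_attr.config with perf_event_attr.type set to PERF_TYPE_RAW.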
Hi Ilpo,

On 11/7/2023 1:33 AM, Ilpo Järvinen wrote:
> man perf_event_open() says this:
>
> "If type is PERF_TYPE_RAW, then a custom "raw" config value is needed.
> Most CPUs support events that are not covered by the "generalized"
> events. These are implementation defined; see your CPU manual (for ex-
> ample the Intel Volume 3B documentation or the AMD BIOS and Kernel De-
> veloper Guide). The libpfm4 library can be used to translate from the
> name in the architectural manuals to the raw hex value perf_event_open()
> expects in this field."
>
> ...I've not come across libpfm myself either but to me it looks libpfm
> bridges between those architecture specific tables and perf_event_open().
> That is, it could provide the binary value necessary in constructing the
> perf_event_attr struct.
>
> I think this is probably the function which maps string ->
> perf_event_attr:
>
> https://man7.org/linux/man-pages/man3/pfm_get_os_event_encoding.3.html
>

This sounds promising. If this works out I think that it would be ideal if the L2 CAT test is not blocked by absence of libpfm. That is, the resctrl tests should not fail to build if libpfm is not present but instead L2 CAT just turns into a simple functional test. To accomplish this it looks like tools/build/Makefile.feature can be helpful and already has a check for libpfm.

Reinette
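One way the "do not fail to build without libpfm" idea could look in code is a lookup helper that compiles either way and simply reports failure when the library is absent. This is an untested sketch under several assumptions: HAVE_LIBPFM is a hypothetical define that the build system would set when its libpfm feature check succeeds, resolve_event_name() is a hypothetical function name, and the libpfm branch follows the pfm_get_os_event_encoding() man page linked in the thread and has not been tried against the resctrl selftests.

```c
#include <linux/perf_event.h>
#include <stdbool.h>
#include <string.h>

#ifdef HAVE_LIBPFM
#include <perfmon/pfmlib.h>
#endif

/*
 * Hypothetical helper: translate a symbolic event name into *pea.
 * Returns true when the attribute was filled in, false when libpfm is
 * not built in or does not know the event, in which case the caller
 * could fall back to a purely functional test.
 */
static bool resolve_event_name(const char *name, struct perf_event_attr *pea)
{
#ifdef HAVE_LIBPFM
	pfm_perf_encode_arg_t arg;

	if (pfm_initialize() != PFM_SUCCESS)
		return false;

	memset(pea, 0, sizeof(*pea));
	memset(&arg, 0, sizeof(arg));
	arg.attr = pea;
	arg.size = sizeof(arg);

	/* PFM_PLM3 restricts counting to user level. */
	return pfm_get_os_event_encoding(name, PFM_PLM3, PFM_OS_PERF_EVENT,
					 &arg) == PFM_SUCCESS;
#else
	/* Without libpfm nothing is resolved and the attribute is left untouched. */
	(void)name;
	(void)pea;
	return false;
#endif
}
```

A caller might try resolve_event_name("l2_rqsts.all_demand_miss", &pea) and, on false, skip the miss-count comparison instead of failing the test.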
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 48a96acd9e31..a9c72022bb5a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -131,8 +131,47 @@ void cat_test_cleanup(void)
 	remove(RESULT_FILE_NAME);
 }
 
+/*
+ * L2 CAT test measures L2 misses indirectly using L3 accesses as a proxy
+ * because perf cannot directly provide the number of L2 misses (there are
+ * only platform specific ways to get the number of L2 misses).
+ *
+ * This function sets up L3 CAT to reduce noise from other processes during
+ * L2 CAT test.
+ */
+int l3_proxy_prepare(const struct resctrl_test *test, struct resctrl_val_param *param, int cpu)
+{
+	unsigned long l3_mask, split_mask;
+	unsigned int start;
+	int count_of_bits;
+	char schemata[64];
+	int n, ret;
+
+	if (!validate_resctrl_feature_request("L3", NULL)) {
+		ksft_print_msg("%s test results may contain noise because L3 CAT is not available!\n",
+			       test->name);
+		return 0;
+	}
+
+	ret = get_mask_no_shareable("L3", &l3_mask);
+	if (ret)
+		return ret;
+	count_of_bits = count_contiguous_bits(l3_mask, &start);
+	n = count_of_bits / 2;
+	split_mask = create_bit_mask(start, n);
+
+	snprintf(schemata, sizeof(schemata), "%lx", l3_mask & ~split_mask);
+	ret = write_schemata("", schemata, cpu, "L3");
+	if (ret)
+		return ret;
+
+	snprintf(schemata, sizeof(schemata), "%lx", split_mask);
+	return write_schemata(param->ctrlgrp, schemata, cpu, "L3");
+}
+
 /*
  * cat_test: execute CAT benchmark and measure LLC cache misses
+ * @test: test information structure
  * @param: parameters passed to cat_test()
  * @span: buffer size for the benchmark
  * @current_mask start mask for the first iteration
@@ -142,9 +181,10 @@ void cat_test_cleanup(void)
  *
  * Return: 0 on success. non-zero on failure.
  */
-static int cat_test(struct resctrl_val_param *param, const char *resource,
+static int cat_test(const struct resctrl_test *test, struct resctrl_val_param *param,
 		    size_t span, unsigned long current_mask)
 {
+	__u64 pea_config = PERF_COUNT_HW_CACHE_MISSES;
 	char *resctrl_val = param->resctrl_val;
 	static struct perf_event_read pe_read;
 	struct perf_event_attr pea;
@@ -169,20 +209,26 @@ static int cat_test(struct resctrl_val_param *param, const char *resource,
 	if (ret)
 		return ret;
 
+	if (!strcmp(test->resource, "L2")) {
+		ret = l3_proxy_prepare(test, param, param->cpu_no);
+		if (ret)
+			return ret;
+		pea_config = PERF_COUNT_HW_CACHE_REFERENCES;
+	}
+	perf_event_attr_initialize(&pea, pea_config);
+	perf_event_initialize_read_format(&pe_read);
+
 	buf = alloc_buffer(span, 1);
 	if (buf == NULL)
 		return -1;
 
-	perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
-	perf_event_initialize_read_format(&pe_read);
-
 	while (current_mask) {
 		snprintf(schemata, sizeof(schemata), "%lx", param->mask & ~current_mask);
-		ret = write_schemata("", schemata, param->cpu_no, resource);
+		ret = write_schemata("", schemata, param->cpu_no, test->resource);
 		if (ret)
 			goto free_buf;
 		snprintf(schemata, sizeof(schemata), "%lx", current_mask);
-		ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, resource);
+		ret = write_schemata(param->ctrlgrp, schemata, param->cpu_no, test->resource);
 		if (ret)
 			goto free_buf;
 
@@ -269,7 +315,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
 
 	remove(param.filename);
 
-	ret = cat_test(&param, test->resource, span, start_mask);
+	ret = cat_test(test, &param, span, start_mask);
 	if (ret)
 		goto out;
 
@@ -288,3 +334,11 @@ struct resctrl_test l3_cat_test = {
 	.feature_check = test_resource_feature_check,
 	.run_test = cat_run_test,
 };
+
+struct resctrl_test l2_cat_test = {
+	.name = "L2_CAT",
+	.group = "CAT",
+	.resource = "L2",
+	.feature_check = test_resource_feature_check,
+	.run_test = cat_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index f9a4cfd981f8..fffeb442c173 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -183,5 +183,6 @@ extern struct resctrl_test mbm_test;
 extern struct resctrl_test mba_test;
 extern struct resctrl_test cmt_test;
 extern struct resctrl_test l3_cat_test;
+extern struct resctrl_test l2_cat_test;
 
 #endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index d89179541d7b..9e254bca6c25 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -15,6 +15,7 @@ static struct resctrl_test *resctrl_tests[] = {
 	&mba_test,
 	&cmt_test,
 	&l3_cat_test,
+	&l2_cat_test,
 };
 
 static int detect_vendor(void)