Message ID | 10c3afd7f62c63db31a3d4af86529144a5d7bbf9.1702392177.git.maciej.wieczor-retman@intel.com |
---|---|
State | New |
Headers | show |
Series | selftests/resctrl: Add non-contiguous CBMs in Intel CAT selftest |
Commit Message
Maciej Wieczor-Retman
Dec. 12, 2023, 2:52 p.m. UTC
Add tests for both L2 and L3 CAT to verify the return values
generated by writing non-contiguous CBMs don't contradict the
reported non-contiguous support information.
Use a logical XOR to confirm return value of write_schemata() and
non-contiguous CBMs support information match.
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v2:
- Redo the patch message. (Ilpo)
- Tidy up __cpuid_count calls. (Ilpo)
- Remove redundant AND in noncont_mask calculations (Ilpo)
- Fix bit_center offset.
- Add newline before function return. (Ilpo)
- Group non-contiguous tests with CAT tests. (Ilpo)
- Use a helper for reading sparse_masks file. (Ilpo)
- Make get_cache_level() available in other source files. (Ilpo)
tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++
tools/testing/selftests/resctrl/resctrl.h | 3 +
.../testing/selftests/resctrl/resctrl_tests.c | 2 +
tools/testing/selftests/resctrl/resctrlfs.c | 2 +-
4 files changed, 81 insertions(+), 1 deletion(-)
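
A minimal sketch (illustrative names, not the exact patch code) of the pass/fail rule the commit message describes: a write of a non-contiguous CBM must succeed exactly when the resctrl "sparse_masks" file reports support, and the two contradictory combinations are the test failures. This is equivalent to the logical-XOR check `!ret == !sparse_masks` used in the patch.

```c
/* Sketch only: encode the agreement between write outcome and support info. */
static int noncont_result(int write_ret, int sparse_masks)
{
	if (sparse_masks)
		return write_ret ? 1 : 0;	/* supported: the write must succeed */

	return write_ret ? 0 : 1;		/* unsupported: the write must fail */
}
```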
Comments
Hi Maciej, On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: > Add tests for both L2 and L3 CAT to verify the return values > generated by writing non-contiguous CBMs don't contradict the > reported non-contiguous support information. > > Use a logical XOR to confirm return value of write_schemata() and > non-contiguous CBMs support information match. > > Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> > --- > Changelog v2: > - Redo the patch message. (Ilpo) > - Tidy up __cpuid_count calls. (Ilpo) > - Remove redundant AND in noncont_mask calculations (Ilpo) > - Fix bit_center offset. > - Add newline before function return. (Ilpo) > - Group non-contiguous tests with CAT tests. (Ilpo) > - Use a helper for reading sparse_masks file. (Ilpo) > - Make get_cache_level() available in other source files. (Ilpo) > > tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ > tools/testing/selftests/resctrl/resctrl.h | 3 + > .../testing/selftests/resctrl/resctrl_tests.c | 2 + > tools/testing/selftests/resctrl/resctrlfs.c | 2 +- > 4 files changed, 81 insertions(+), 1 deletion(-) > > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c > index 7dc7206b3b99..ecf553a89aae 100644 > --- a/tools/testing/selftests/resctrl/cat_test.c > +++ b/tools/testing/selftests/resctrl/cat_test.c > @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param > return ret; > } > > +static int noncont_cat_run_test(const struct resctrl_test *test, > + const struct user_params *uparams) > +{ > + unsigned long full_cache_mask, cont_mask, noncont_mask; > + unsigned int eax, ebx, ecx, edx, ret; > + int level, bit_center, sparse_masks; > + char schemata[64]; > + > + /* Check to compare sparse_masks content to cpuid output. */ "cpuid" -> "CPUID" (to note it is an instruction) > + sparse_masks = read_info_res_file(test->resource, "sparse_masks"); > + if (sparse_masks < 0) > + return sparse_masks; > + > + level = get_cache_level(test->resource); > + if (level < 0) > + return -EINVAL; > + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx); Please do not invent relationships. Please replace the "4 - level" with specific index used that depends on particular cache. The cache level may not even be needed, just use the resource to determine the correct index. > + > + if (sparse_masks != ((ecx >> 3) & 1)) > + return -1; Can a message be displayed to support the debugging this test failure? > + > + /* Write checks initialization. */ > + ret = get_full_cbm(test->resource, &full_cache_mask); > + if (ret < 0) > + return ret; I assume this test failure relies on the error message from get_bit_mask() that is called via get_full_cbm()? > + bit_center = count_bits(full_cache_mask) / 2; > + cont_mask = full_cache_mask >> bit_center; > + > + /* Contiguous mask write check. */ > + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); > + ret = write_schemata("", schemata, uparams->cpu, test->resource); > + if (ret) > + return ret; How will user know what failed? I am seeing this single test exercise a few scenarios and it is not obvious to me if the issue will be clear if this test, noncont_cat_run_test(), fails. > + > + /* > + * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle. > + * Output is compared with support information to catch any edge case errors. 
> + */ > + noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask; > + snprintf(schemata, sizeof(schemata), "%lx", noncont_mask); > + ret = write_schemata("", schemata, uparams->cpu, test->resource); > + if (ret && sparse_masks) > + ksft_print_msg("Non-contiguous CBMs supported but write failed\n"); > + else if (ret && !sparse_masks) > + ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n"); > + else if (!ret && !sparse_masks) > + ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n"); Can these messages be made more specific with a "write" changed to "write of non-contiguous CBM" > + > + return !ret == !sparse_masks; Please return negative on error. Ilpo just did a big cleanup to address this. > +} > + > +static bool noncont_cat_feature_check(const struct resctrl_test *test) > +{ > + if (!resctrl_resource_exists(test->resource)) > + return false; > + > + return resctrl_cache_feature_exists(test->resource, "sparse_masks"); > +} > + > struct resctrl_test l3_cat_test = { > .name = "L3_CAT", > .group = "CAT", > @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = { > .feature_check = test_resource_feature_check, > .run_test = cat_run_test, > }; > + > +struct resctrl_test l3_noncont_cat_test = { > + .name = "L3_NONCONT_CAT", > + .group = "CAT", > + .resource = "L3", > + .feature_check = noncont_cat_feature_check, > + .run_test = noncont_cat_run_test, > +}; > + > +struct resctrl_test l2_noncont_cat_test = { > + .name = "L2_NONCONT_CAT", > + .group = "CAT", > + .resource = "L2", > + .feature_check = noncont_cat_feature_check, > + .run_test = noncont_cat_run_test, > +}; > diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h > index 74041a35d4ba..7b7a48d1fddd 100644 > --- a/tools/testing/selftests/resctrl/resctrl.h > +++ b/tools/testing/selftests/resctrl/resctrl.h > @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); > int get_full_cbm(const char *cache_type, unsigned long *mask); > int get_mask_no_shareable(const char *cache_type, unsigned long *mask); > int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); > +int get_cache_level(const char *cache_type); > int read_info_res_file(const char *resource, const char *filename); > void ctrlc_handler(int signum, siginfo_t *info, void *ptr); > int signal_handler_register(void); > @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test; > extern struct resctrl_test mba_test; > extern struct resctrl_test cmt_test; > extern struct resctrl_test l3_cat_test; > +extern struct resctrl_test l3_noncont_cat_test; > +extern struct resctrl_test l2_noncont_cat_test; > > #endif /* RESCTRL_H */ > diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c > index 3044179ee6e9..f3dc1b9696e7 100644 > --- a/tools/testing/selftests/resctrl/resctrl_tests.c > +++ b/tools/testing/selftests/resctrl/resctrl_tests.c > @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = { > &mba_test, > &cmt_test, > &l3_cat_test, > + &l3_noncont_cat_test, > + &l2_noncont_cat_test, > }; > > static int detect_vendor(void) > diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c > index 8546421f0940..8bd30973fec3 100644 > --- a/tools/testing/selftests/resctrl/resctrlfs.c > +++ b/tools/testing/selftests/resctrl/resctrlfs.c > @@ -100,7 +100,7 @@ int umount_resctrlfs(void) > * > * Return: cache level as integer or -1 if 
@cache_type is invalid. > */ > -static int get_cache_level(const char *cache_type) > +int get_cache_level(const char *cache_type) > { > if (!strcmp(cache_type, "L3")) > return 3; Reinette
On Mon, 8 Jan 2024, Reinette Chatre wrote: > Hi Maciej, > > On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: > > Add tests for both L2 and L3 CAT to verify the return values > > generated by writing non-contiguous CBMs don't contradict the > > reported non-contiguous support information. > > > > Use a logical XOR to confirm return value of write_schemata() and > > non-contiguous CBMs support information match. > > > > Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> > > --- > > Changelog v2: > > - Redo the patch message. (Ilpo) > > - Tidy up __cpuid_count calls. (Ilpo) > > - Remove redundant AND in noncont_mask calculations (Ilpo) > > - Fix bit_center offset. > > - Add newline before function return. (Ilpo) > > - Group non-contiguous tests with CAT tests. (Ilpo) > > - Use a helper for reading sparse_masks file. (Ilpo) > > - Make get_cache_level() available in other source files. (Ilpo) > > > > tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ > > tools/testing/selftests/resctrl/resctrl.h | 3 + > > .../testing/selftests/resctrl/resctrl_tests.c | 2 + > > tools/testing/selftests/resctrl/resctrlfs.c | 2 +- > > 4 files changed, 81 insertions(+), 1 deletion(-) > > > > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c > > index 7dc7206b3b99..ecf553a89aae 100644 > > --- a/tools/testing/selftests/resctrl/cat_test.c > > +++ b/tools/testing/selftests/resctrl/cat_test.c > > @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param > > return ret; > > } > > > > +static int noncont_cat_run_test(const struct resctrl_test *test, > > + const struct user_params *uparams) > > +{ > > + unsigned long full_cache_mask, cont_mask, noncont_mask; > > + unsigned int eax, ebx, ecx, edx, ret; > > + int level, bit_center, sparse_masks; > > + char schemata[64]; > > + > > + /* Check to compare sparse_masks content to cpuid output. */ > > "cpuid" -> "CPUID" (to note it is an instruction) > > > + sparse_masks = read_info_res_file(test->resource, "sparse_masks"); > > + if (sparse_masks < 0) > > + return sparse_masks; > > + > > + level = get_cache_level(test->resource); > > + if (level < 0) > > + return -EINVAL; > > + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx); > > Please do not invent relationships. Please replace the "4 - level" with > specific index used that depends on particular cache. The cache level > may not even be needed, just use the resource to determine the correct > index. This is actually my fault, I suggested Maciej could use arithmetics there. > > + > > + return !ret == !sparse_masks; > > Please return negative on error. Ilpo just did a big cleanup to address this. Test failure is not same as an error. So tests should return negative for errors which prevent even running test at all, and 0/1 for test success/fail.
Hi Ilpo, On 1/9/2024 1:13 AM, Ilpo Järvinen wrote: > On Mon, 8 Jan 2024, Reinette Chatre wrote: > >> Hi Maciej, >> >> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>> Add tests for both L2 and L3 CAT to verify the return values >>> generated by writing non-contiguous CBMs don't contradict the >>> reported non-contiguous support information. >>> >>> Use a logical XOR to confirm return value of write_schemata() and >>> non-contiguous CBMs support information match. >>> >>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> >>> --- >>> Changelog v2: >>> - Redo the patch message. (Ilpo) >>> - Tidy up __cpuid_count calls. (Ilpo) >>> - Remove redundant AND in noncont_mask calculations (Ilpo) >>> - Fix bit_center offset. >>> - Add newline before function return. (Ilpo) >>> - Group non-contiguous tests with CAT tests. (Ilpo) >>> - Use a helper for reading sparse_masks file. (Ilpo) >>> - Make get_cache_level() available in other source files. (Ilpo) >>> >>> tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ >>> tools/testing/selftests/resctrl/resctrl.h | 3 + >>> .../testing/selftests/resctrl/resctrl_tests.c | 2 + >>> tools/testing/selftests/resctrl/resctrlfs.c | 2 +- >>> 4 files changed, 81 insertions(+), 1 deletion(-) >>> >>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c >>> index 7dc7206b3b99..ecf553a89aae 100644 >>> --- a/tools/testing/selftests/resctrl/cat_test.c >>> +++ b/tools/testing/selftests/resctrl/cat_test.c >>> @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param >>> return ret; >>> } >>> >>> +static int noncont_cat_run_test(const struct resctrl_test *test, >>> + const struct user_params *uparams) >>> +{ >>> + unsigned long full_cache_mask, cont_mask, noncont_mask; >>> + unsigned int eax, ebx, ecx, edx, ret; >>> + int level, bit_center, sparse_masks; >>> + char schemata[64]; >>> + >>> + /* Check to compare sparse_masks content to cpuid output. */ >> >> "cpuid" -> "CPUID" (to note it is an instruction) >> >>> + sparse_masks = read_info_res_file(test->resource, "sparse_masks"); >>> + if (sparse_masks < 0) >>> + return sparse_masks; >>> + >>> + level = get_cache_level(test->resource); >>> + if (level < 0) >>> + return -EINVAL; >>> + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx); >> >> Please do not invent relationships. Please replace the "4 - level" with >> specific index used that depends on particular cache. The cache level >> may not even be needed, just use the resource to determine the correct >> index. > > This is actually my fault, I suggested Maciej could use arithmetics there. No problem. The math works for the current values but there is no such relationship. If hypothetically a new cache level needs to be supported then this computation cannot be relied upon to continue to be correct. >>> + return !ret == !sparse_masks; >> >> Please return negative on error. Ilpo just did a big cleanup to address this. > > Test failure is not same as an error. So tests should return negative for > errors which prevent even running test at all, and 0/1 for test > success/fail. > Thanks for catching this. I missed this subtlety in the framework. Reinette
Hi, thanks for the review! On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >> Add tests for both L2 and L3 CAT to verify the return values >> generated by writing non-contiguous CBMs don't contradict the >> reported non-contiguous support information. >> >> Use a logical XOR to confirm return value of write_schemata() and >> non-contiguous CBMs support information match. >> >> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> >> --- >> Changelog v2: >> - Redo the patch message. (Ilpo) >> - Tidy up __cpuid_count calls. (Ilpo) >> - Remove redundant AND in noncont_mask calculations (Ilpo) >> - Fix bit_center offset. >> - Add newline before function return. (Ilpo) >> - Group non-contiguous tests with CAT tests. (Ilpo) >> - Use a helper for reading sparse_masks file. (Ilpo) >> - Make get_cache_level() available in other source files. (Ilpo) >> >> tools/testing/selftests/resctrl/cat_test.c | 75 +++++++++++++++++++ >> tools/testing/selftests/resctrl/resctrl.h | 3 + >> .../testing/selftests/resctrl/resctrl_tests.c | 2 + >> tools/testing/selftests/resctrl/resctrlfs.c | 2 +- >> 4 files changed, 81 insertions(+), 1 deletion(-) >> >> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c >> index 7dc7206b3b99..ecf553a89aae 100644 >> --- a/tools/testing/selftests/resctrl/cat_test.c >> +++ b/tools/testing/selftests/resctrl/cat_test.c >> @@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param >> return ret; >> } >> >> +static int noncont_cat_run_test(const struct resctrl_test *test, >> + const struct user_params *uparams) >> +{ >> + unsigned long full_cache_mask, cont_mask, noncont_mask; >> + unsigned int eax, ebx, ecx, edx, ret; >> + int level, bit_center, sparse_masks; >> + char schemata[64]; >> + >> + /* Check to compare sparse_masks content to cpuid output. */ > >"cpuid" -> "CPUID" (to note it is an instruction) > Thanks, will change >> + sparse_masks = read_info_res_file(test->resource, "sparse_masks"); >> + if (sparse_masks < 0) >> + return sparse_masks; >> + >> + level = get_cache_level(test->resource); >> + if (level < 0) >> + return -EINVAL; >> + __cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx); > >Please do not invent relationships. Please replace the "4 - level" with >specific index used that depends on particular cache. The cache level >may not even be needed, just use the resource to determine the correct >index. I'll move this back to 'if (!strcmp(test->resource, "L3") ... ' structure then. >> + >> + if (sparse_masks != ((ecx >> 3) & 1)) >> + return -1; > >Can a message be displayed to support the debugging this test failure? Sure, that is a very good idea. I'll add ksft_perror() here. >> + >> + /* Write checks initialization. */ >> + ret = get_full_cbm(test->resource, &full_cache_mask); >> + if (ret < 0) >> + return ret; > >I assume this test failure relies on the error message from get_bit_mask() >that is called via get_full_cbm()? Yes, I thought adding more prints here might look redundant. >> + bit_center = count_bits(full_cache_mask) / 2; >> + cont_mask = full_cache_mask >> bit_center; >> + >> + /* Contiguous mask write check. */ >> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >> + if (ret) >> + return ret; > >How will user know what failed? 
I am seeing this single test exercise a few scenarios >and it is not obvious to me if the issue will be clear if this test, >noncont_cat_run_test(), fails. write_schemata() either succeeds with '0' or errors out with a negative value. If the contiguous mask write fails, write_schemata should print out what was wrong and I believe that the test will report an error rather than failure. >> + >> + /* >> + * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle. >> + * Output is compared with support information to catch any edge case errors. >> + */ >> + noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask; >> + snprintf(schemata, sizeof(schemata), "%lx", noncont_mask); >> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >> + if (ret && sparse_masks) >> + ksft_print_msg("Non-contiguous CBMs supported but write failed\n"); >> + else if (ret && !sparse_masks) >> + ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n"); >> + else if (!ret && !sparse_masks) >> + ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n"); > >Can these messages be made more specific with a "write" changed to "write of >non-contiguous CBM" Sure, will fix it. >> + >> + return !ret == !sparse_masks; > >Please return negative on error. Ilpo just did a big cleanup to address this. I believe this is resolved now. >> +} >> + >> +static bool noncont_cat_feature_check(const struct resctrl_test *test) >> +{ >> + if (!resctrl_resource_exists(test->resource)) >> + return false; >> + >> + return resctrl_cache_feature_exists(test->resource, "sparse_masks"); >> +} >> + >> struct resctrl_test l3_cat_test = { >> .name = "L3_CAT", >> .group = "CAT", >> @@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = { >> .feature_check = test_resource_feature_check, >> .run_test = cat_run_test, >> }; >> + >> +struct resctrl_test l3_noncont_cat_test = { >> + .name = "L3_NONCONT_CAT", >> + .group = "CAT", >> + .resource = "L3", >> + .feature_check = noncont_cat_feature_check, >> + .run_test = noncont_cat_run_test, >> +}; >> + >> +struct resctrl_test l2_noncont_cat_test = { >> + .name = "L2_NONCONT_CAT", >> + .group = "CAT", >> + .resource = "L2", >> + .feature_check = noncont_cat_feature_check, >> + .run_test = noncont_cat_run_test, >> +}; >> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h >> index 74041a35d4ba..7b7a48d1fddd 100644 >> --- a/tools/testing/selftests/resctrl/resctrl.h >> +++ b/tools/testing/selftests/resctrl/resctrl.h >> @@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start); >> int get_full_cbm(const char *cache_type, unsigned long *mask); >> int get_mask_no_shareable(const char *cache_type, unsigned long *mask); >> int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size); >> +int get_cache_level(const char *cache_type); >> int read_info_res_file(const char *resource, const char *filename); >> void ctrlc_handler(int signum, siginfo_t *info, void *ptr); >> int signal_handler_register(void); >> @@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test; >> extern struct resctrl_test mba_test; >> extern struct resctrl_test cmt_test; >> extern struct resctrl_test l3_cat_test; >> +extern struct resctrl_test l3_noncont_cat_test; >> +extern struct resctrl_test l2_noncont_cat_test; >> >> #endif /* RESCTRL_H */ >> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c >> index 
3044179ee6e9..f3dc1b9696e7 100644 >> --- a/tools/testing/selftests/resctrl/resctrl_tests.c >> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c >> @@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = { >> &mba_test, >> &cmt_test, >> &l3_cat_test, >> + &l3_noncont_cat_test, >> + &l2_noncont_cat_test, >> }; >> >> static int detect_vendor(void) >> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c >> index 8546421f0940..8bd30973fec3 100644 >> --- a/tools/testing/selftests/resctrl/resctrlfs.c >> +++ b/tools/testing/selftests/resctrl/resctrlfs.c >> @@ -100,7 +100,7 @@ int umount_resctrlfs(void) >> * >> * Return: cache level as integer or -1 if @cache_type is invalid. >> */ >> -static int get_cache_level(const char *cache_type) >> +int get_cache_level(const char *cache_type) >> { >> if (!strcmp(cache_type, "L3")) >> return 3; > > >Reinette
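
A sketch of the strcmp()-based restructuring mentioned in the reply above, assuming the CPUID leaf 0x10 subleaf numbers implied by the discussion (1 for L3 CAT, 2 for L2 CAT) rather than deriving them from the cache level; the helper name is illustrative.

```c
#include <string.h>

/* Map a resctrl resource name directly to its CPUID leaf 0x10 subleaf. */
static int resource_to_cpuid_subleaf(const char *resource)
{
	if (!strcmp(resource, "L3"))
		return 1;
	if (!strcmp(resource, "L2"))
		return 2;

	return -1;	/* unknown resource; caller treats this as an error */
}
```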
Hi Maciej, On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: > On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>> + >>> + if (sparse_masks != ((ecx >> 3) & 1)) >>> + return -1; >> >> Can a message be displayed to support the debugging this test failure? > > Sure, that is a very good idea. I'll add ksft_perror() here. I do not think ksft_perror() is appropriate since perror() is for system error messages (something that sets errno). Perhaps just ksft_print_msg(). >>> + bit_center = count_bits(full_cache_mask) / 2; >>> + cont_mask = full_cache_mask >> bit_center; >>> + >>> + /* Contiguous mask write check. */ >>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>> + if (ret) >>> + return ret; >> >> How will user know what failed? I am seeing this single test exercise a few scenarios >> and it is not obvious to me if the issue will be clear if this test, >> noncont_cat_run_test(), fails. > > write_schemata() either succeeds with '0' or errors out with a negative value. If > the contiguous mask write fails, write_schemata should print out what was wrong > and I believe that the test will report an error rather than failure. Right. I am trying to understand whether the user will be able to decipher what failed in case there is an error. Seems like in this case the user is expected to look at the source code of the test to understand what the test was trying to do at the time it encountered the failure. In this case user may be "lucky" that this test only has one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that reasoning to figure out which write_schemata() failed to further dig what test was trying to do. Reinette
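
One possible shape of the suggested diagnostic, reusing the variables from the quoted hunk (illustrative fragment, not the final patch): nothing here sets errno, so ksft_print_msg() fits better than ksft_perror() for reporting the mismatch before failing.

```c
	if (sparse_masks != ((ecx >> 3) & 1)) {
		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
		return -1;
	}
```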
On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: > > >>>> + >>>> + if (sparse_masks != ((ecx >> 3) & 1)) >>>> + return -1; >>> >>> Can a message be displayed to support the debugging this test failure? >> >> Sure, that is a very good idea. I'll add ksft_perror() here. > >I do not think ksft_perror() is appropriate since perror() is for >system error messages (something that sets errno). Perhaps just >ksft_print_msg(). Thanks for the suggestion! > >>>> + bit_center = count_bits(full_cache_mask) / 2; >>>> + cont_mask = full_cache_mask >> bit_center; >>>> + >>>> + /* Contiguous mask write check. */ >>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>> + if (ret) >>>> + return ret; >>> >>> How will user know what failed? I am seeing this single test exercise a few scenarios >>> and it is not obvious to me if the issue will be clear if this test, >>> noncont_cat_run_test(), fails. >> >> write_schemata() either succeeds with '0' or errors out with a negative value. If >> the contiguous mask write fails, write_schemata should print out what was wrong >> and I believe that the test will report an error rather than failure. > >Right. I am trying to understand whether the user will be able to decipher what failed >in case there is an error. Seems like in this case the user is expected to look at the >source code of the test to understand what the test was trying to do at the time it >encountered the failure. In this case user may be "lucky" that this test only has >one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >reasoning to figure out which write_schemata() failed to further dig what test was >trying to do. When a write_schemata() is executed the string that is being written gets printed. If there are multiple calls in a single tests and one fails I'd imagine it would be easy for the user to figure out which one failed. On a side note I'm not sure if that's true but I'm getting a feeling that the harder errors (not just test failures) are more of a clue for developers working on the tests. Would you agree that it seems like users probably won't see write_schemata() fail here (if the test execution managed to get to this point without erroring out due to bad parameters or kernel compiled without required options)? > >Reinette
Hi Maciej, On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: > On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>> + cont_mask = full_cache_mask >> bit_center; >>>>> + >>>>> + /* Contiguous mask write check. */ >>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>> + if (ret) >>>>> + return ret; >>>> >>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>> and it is not obvious to me if the issue will be clear if this test, >>>> noncont_cat_run_test(), fails. >>> >>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>> the contiguous mask write fails, write_schemata should print out what was wrong >>> and I believe that the test will report an error rather than failure. >> >> Right. I am trying to understand whether the user will be able to decipher what failed >> in case there is an error. Seems like in this case the user is expected to look at the >> source code of the test to understand what the test was trying to do at the time it >> encountered the failure. In this case user may be "lucky" that this test only has >> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >> reasoning to figure out which write_schemata() failed to further dig what test was >> trying to do. > > When a write_schemata() is executed the string that is being written gets > printed. If there are multiple calls in a single tests and one fails I'd imagine > it would be easy for the user to figure out which one failed. It would be easy for the user the figure out if (a) it is obvious to the user what schema a particular write_schema() call attempted to write and (b) all the write_schema() calls attempt to write different schema. > On a side note I'm not sure if that's true but I'm getting a feeling that the > harder errors (not just test failures) are more of a clue for developers working > on the tests. Would you agree that it seems like users probably won't see > write_schemata() fail here (if the test execution managed to get to this point > without erroring out due to bad parameters or kernel compiled without required > options)? I do agree that users probably won't see such failures. I do not think these errors are clues to developers working on the tests though, but instead clues to resctrl developers or kernel development CI systems. Reinette
Hello! On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: > >>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>> + >>>>>> + /* Contiguous mask write check. */ >>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>> + if (ret) >>>>>> + return ret; >>>>> >>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>> and it is not obvious to me if the issue will be clear if this test, >>>>> noncont_cat_run_test(), fails. >>>> >>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>> and I believe that the test will report an error rather than failure. >>> >>> Right. I am trying to understand whether the user will be able to decipher what failed >>> in case there is an error. Seems like in this case the user is expected to look at the >>> source code of the test to understand what the test was trying to do at the time it >>> encountered the failure. In this case user may be "lucky" that this test only has >>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>> reasoning to figure out which write_schemata() failed to further dig what test was >>> trying to do. >> >> When a write_schemata() is executed the string that is being written gets >> printed. If there are multiple calls in a single tests and one fails I'd imagine >> it would be easy for the user to figure out which one failed. > >It would be easy for the user the figure out if (a) it is obvious to the user >what schema a particular write_schema() call attempted to write and (b) all the >write_schema() calls attempt to write different schema. Okay, your comment made me wonder if on error the schemata still is printed. I double checked in the code and whether write_schemata() fails or not it has a goto path where before returning it will print out the schema. So I believe that satisfies your (a) condition. As for (b) depends on what you meant. Other tests that run more than one write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest that the non-contiguous test should attempt more schematas? For example shift the bit hole from one side to the other? I assumed one CBM with a centered bit hole would be enough to check if non-contiguous CBM feature works properly and more CBMs would be redundant. > >> On a side note I'm not sure if that's true but I'm getting a feeling that the >> harder errors (not just test failures) are more of a clue for developers working >> on the tests. Would you agree that it seems like users probably won't see >> write_schemata() fail here (if the test execution managed to get to this point >> without erroring out due to bad parameters or kernel compiled without required >> options)? > >I do agree that users probably won't see such failures. I do not think these >errors are clues to developers working on the tests though, but instead clues >to resctrl developers or kernel development CI systems. Right, I agree, the target group I mentioned was too narrow. 
> >Reinette >
Hi Maciej, On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: > On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >> >>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>> + >>>>>>> + /* Contiguous mask write check. */ >>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>> + if (ret) >>>>>>> + return ret; >>>>>> >>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>> noncont_cat_run_test(), fails. >>>>> >>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>> and I believe that the test will report an error rather than failure. >>>> >>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>> in case there is an error. Seems like in this case the user is expected to look at the >>>> source code of the test to understand what the test was trying to do at the time it >>>> encountered the failure. In this case user may be "lucky" that this test only has >>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>> trying to do. >>> >>> When a write_schemata() is executed the string that is being written gets >>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>> it would be easy for the user to figure out which one failed. >> >> It would be easy for the user the figure out if (a) it is obvious to the user >> what schema a particular write_schema() call attempted to write and (b) all the >> write_schema() calls attempt to write different schema. > > Okay, your comment made me wonder if on error the schemata still is printed. I > double checked in the code and whether write_schemata() fails or not it has a > goto path where before returning it will print out the schema. So I believe that > satisfies your (a) condition. Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ... Scenario 2: The test has the following code: ... write_schemata(..., schemata, ...); ... write_schemata(..., schemata, ...); ... Any failure of write_schemata() in scenario 1 will be easy to trace. As you state, write_schemata() prints the schemata attempted and it will thus be easy to look at the code to see which write_schemata() call failed since it is obvious from the code which schemata was attempted. A failure of one of the write_schemata() in scenario 2 will not be as easy to trace since the user first needs to determine what the value of "schemata" is at each call and that may depend on the platform, bit shifting done in test, and state of system state at time of test. > As for (b) depends on what you meant. Other tests that run more than one > write_schemata() use different ones every time (CAT, MBM, MBA). 
Do you suggest > that the non-contiguous test should attempt more schematas? For example shift > the bit hole from one side to the other? I assumed one CBM with a centered bit > hole would be enough to check if non-contiguous CBM feature works properly and > more CBMs would be redundant. Let me try with an example. Scenario 1: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xf0f", ...); ... Scenario 2: The test has the following code: ... write_schemata(..., "0xfff", ...); ... write_schemata(..., "0xfff", ...); ... A failure of either write_schemata() in scenario 1 will be easy to trace since the schemata attempted is different in each case. The schemata printed by the write_schemata() error message can thus easily be connected to the specific write_schemata() call. A failure of either write_schemata() in scenario 2 is not so obvious since they both attempted the same schemata so the error message printed by write_schemata() could belong to either. Reinette
Hi! On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: >> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>> >>>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>>> + >>>>>>>> + /* Contiguous mask write check. */ >>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>>> + if (ret) >>>>>>>> + return ret; >>>>>>> >>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>>> noncont_cat_run_test(), fails. >>>>>> >>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>>> and I believe that the test will report an error rather than failure. >>>>> >>>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>>> in case there is an error. Seems like in this case the user is expected to look at the >>>>> source code of the test to understand what the test was trying to do at the time it >>>>> encountered the failure. In this case user may be "lucky" that this test only has >>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>>> trying to do. >>>> >>>> When a write_schemata() is executed the string that is being written gets >>>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>>> it would be easy for the user to figure out which one failed. >>> >>> It would be easy for the user the figure out if (a) it is obvious to the user >>> what schema a particular write_schema() call attempted to write and (b) all the >>> write_schema() calls attempt to write different schema. >> >> Okay, your comment made me wonder if on error the schemata still is printed. I >> double checked in the code and whether write_schemata() fails or not it has a >> goto path where before returning it will print out the schema. So I believe that >> satisfies your (a) condition. > >Let me try with an example. >Scenario 1: >The test has the following code: > ... > write_schemata(..., "0xfff", ...); > ... > write_schemata(..., "0xf0f", ...); > ... > >Scenario 2: >The test has the following code: > ... > write_schemata(..., schemata, ...); > ... > write_schemata(..., schemata, ...); > ... > >Any failure of write_schemata() in scenario 1 will be easy to trace. As you >state, write_schemata() prints the schemata attempted and it will thus be >easy to look at the code to see which write_schemata() call failed since it >is obvious from the code which schemata was attempted. >A failure of one of the write_schemata() in scenario 2 will not be as easy >to trace since the user first needs to determine what the value of "schemata" >is at each call and that may depend on the platform, bit shifting done in test, >and state of system state at time of test. 
Doing things similar to scenario 1 would be great from a debugging perspective but since the masks can have different sizes putting literals there seems impossible. Maybe the code could be improved by putting an example CBM in the comment above a write_schemata() call? "For a 12 bit maximum CBM value, the contiguous schemata will look like '0x3f'" and "For a 12 bit maximum CBM value, the non-contiguous schemata will look like '0xf0f'" This seems like the closest I could get to what you're showing in scenario 1 (which I assume would be the best). >> As for (b) depends on what you meant. Other tests that run more than one >> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest >> that the non-contiguous test should attempt more schematas? For example shift >> the bit hole from one side to the other? I assumed one CBM with a centered bit >> hole would be enough to check if non-contiguous CBM feature works properly and >> more CBMs would be redundant. > >Let me try with an example. >Scenario 1: >The test has the following code: > ... > write_schemata(..., "0xfff", ...); > ... > write_schemata(..., "0xf0f", ...); > ... > >Scenario 2: >The test has the following code: > ... > write_schemata(..., "0xfff", ...); > ... > write_schemata(..., "0xfff", ...); > ... > >A failure of either write_schemata() in scenario 1 will be easy to trace since >the schemata attempted is different in each case. The schemata printed by the >write_schemata() error message can thus easily be connected to the specific >write_schemata() call. >A failure of either write_schemata() in scenario 2 is not so obvious since they >both attempted the same schemata so the error message printed by write_schemata() >could belong to either. I believe my code follows the first scenario example (since one schemata is half the full CBM, and the other one is the full CBM with a hole in the middle). I'm sorry to drag this thread out but I want to be sure if I'm right or are you suggesting something and I missed it? > >Reinette
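
A worked example of the mask arithmetic for the hypothetical 12-bit CBM mentioned above, following the expressions in the quoted patch; it reproduces the '0x3f' contiguous and '0xf0f' non-contiguous values.

```c
#include <stdio.h>

int main(void)
{
	unsigned long full_cache_mask = 0xfff;			/* 12 contiguous bits */
	int bit_center = 12 / 2;				/* count_bits(0xfff) / 2 == 6 */
	unsigned long cont_mask = full_cache_mask >> bit_center;			/* 0x3f  */
	unsigned long noncont_mask = ~(0xfUL << (bit_center - 2)) & full_cache_mask;	/* 0xf0f */

	printf("cont=%#lx noncont=%#lx\n", cont_mask, noncont_mask);
	return 0;
}
```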
Hi Maciej, On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote: > Hi! > > On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote: >> Hi Maciej, >> >> On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: >>> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >>>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>>> >>>>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>>>> + >>>>>>>>> + /* Contiguous mask write check. */ >>>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>>>> + if (ret) >>>>>>>>> + return ret; >>>>>>>> >>>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>>>> noncont_cat_run_test(), fails. >>>>>>> >>>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>>>> and I believe that the test will report an error rather than failure. >>>>>> >>>>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>>>> in case there is an error. Seems like in this case the user is expected to look at the >>>>>> source code of the test to understand what the test was trying to do at the time it >>>>>> encountered the failure. In this case user may be "lucky" that this test only has >>>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>>>> trying to do. >>>>> >>>>> When a write_schemata() is executed the string that is being written gets >>>>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>>>> it would be easy for the user to figure out which one failed. >>>> >>>> It would be easy for the user the figure out if (a) it is obvious to the user >>>> what schema a particular write_schema() call attempted to write and (b) all the >>>> write_schema() calls attempt to write different schema. >>> >>> Okay, your comment made me wonder if on error the schemata still is printed. I >>> double checked in the code and whether write_schemata() fails or not it has a >>> goto path where before returning it will print out the schema. So I believe that >>> satisfies your (a) condition. >> >> Let me try with an example. >> Scenario 1: >> The test has the following code: >> ... >> write_schemata(..., "0xfff", ...); >> ... >> write_schemata(..., "0xf0f", ...); >> ... >> >> Scenario 2: >> The test has the following code: >> ... >> write_schemata(..., schemata, ...); >> ... >> write_schemata(..., schemata, ...); >> ... >> >> Any failure of write_schemata() in scenario 1 will be easy to trace. As you >> state, write_schemata() prints the schemata attempted and it will thus be >> easy to look at the code to see which write_schemata() call failed since it >> is obvious from the code which schemata was attempted. 
>> A failure of one of the write_schemata() in scenario 2 will not be as easy >> to trace since the user first needs to determine what the value of "schemata" >> is at each call and that may depend on the platform, bit shifting done in test, >> and state of system state at time of test. > > Doing things similar to scenario 1 would be great from a debugging perspective > but since the masks can have different sizes putting literals there seems > impossible. > > Maybe the code could be improved by putting an example CBM in the comment above > a write_schemata() call? "For a 12 bit maximum CBM value, the contiguous > schemata will look like '0x3f'" and "For a 12 bit maximum CBM value, the > non-contiguous schemata will look like '0xf0f'" > > This seems like the closest I could get to what you're > showing in scenario 1 (which I assume would be the best). I am not asking you to use literals. I am trying to demonstrate that the only way it would be obvious to the user where a failure is is when the test uses literals. I continue to try to motivate for clear indication to user/developer what failed when this test failed ... this could just be a ksft_print_msg() when the write_schemata() call we are talking about fails. > >>> As for (b) depends on what you meant. Other tests that run more than one >>> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest >>> that the non-contiguous test should attempt more schematas? For example shift >>> the bit hole from one side to the other? I assumed one CBM with a centered bit >>> hole would be enough to check if non-contiguous CBM feature works properly and >>> more CBMs would be redundant. >> >> Let me try with an example. >> Scenario 1: >> The test has the following code: >> ... >> write_schemata(..., "0xfff", ...); >> ... >> write_schemata(..., "0xf0f", ...); >> ... >> >> Scenario 2: >> The test has the following code: >> ... >> write_schemata(..., "0xfff", ...); >> ... >> write_schemata(..., "0xfff", ...); >> ... >> >> A failure of either write_schemata() in scenario 1 will be easy to trace since >> the schemata attempted is different in each case. The schemata printed by the >> write_schemata() error message can thus easily be connected to the specific >> write_schemata() call. >> A failure of either write_schemata() in scenario 2 is not so obvious since they >> both attempted the same schemata so the error message printed by write_schemata() >> could belong to either. > > I believe my code follows the first scenario example (since one schemata is half > the full CBM, and the other one is the full CBM with a hole in the middle). I know. This thread digressed into discussion about when it would be ok to omit error message from caller of write_schemata(). > I'm sorry to drag this thread out but I want to be sure if I'm right or are you > suggesting something and I missed it? Please just add a ksft_print_msg() to noncont_cat_run_test() when this write_schemata() fails. Reinette
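
An illustrative fragment of the change requested above (variables as in the quoted hunk): print which write failed instead of returning silently when the contiguous-CBM write errors out.

```c
	ret = write_schemata("", schemata, uparams->cpu, test->resource);
	if (ret) {
		ksft_print_msg("Write of contiguous CBM failed\n");
		return ret;
	}
```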
On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote: >> Hi! >> >> On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote: >>> Hi Maciej, >>> >>> On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: >>>> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >>>>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>>>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>>>> >>>>>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>>>>> + >>>>>>>>>> + /* Contiguous mask write check. */ >>>>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>>>>> + if (ret) >>>>>>>>>> + return ret; >>>>>>>>> >>>>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>>>>> noncont_cat_run_test(), fails. >>>>>>>> >>>>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>>>>> and I believe that the test will report an error rather than failure. >>>>>>> >>>>>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>>>>> in case there is an error. Seems like in this case the user is expected to look at the >>>>>>> source code of the test to understand what the test was trying to do at the time it >>>>>>> encountered the failure. In this case user may be "lucky" that this test only has >>>>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>>>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>>>>> trying to do. >>>>>> >>>>>> When a write_schemata() is executed the string that is being written gets >>>>>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>>>>> it would be easy for the user to figure out which one failed. >>>>> >>>>> It would be easy for the user the figure out if (a) it is obvious to the user >>>>> what schema a particular write_schema() call attempted to write and (b) all the >>>>> write_schema() calls attempt to write different schema. >> >>>> As for (b) depends on what you meant. Other tests that run more than one >>>> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest >>>> that the non-contiguous test should attempt more schematas? For example shift >>>> the bit hole from one side to the other? I assumed one CBM with a centered bit >>>> hole would be enough to check if non-contiguous CBM feature works properly and >>>> more CBMs would be redundant. >>> >>> Let me try with an example. >>> Scenario 1: >>> The test has the following code: >>> ... >>> write_schemata(..., "0xfff", ...); >>> ... >>> write_schemata(..., "0xf0f", ...); >>> ... >>> >>> Scenario 2: >>> The test has the following code: >>> ... >>> write_schemata(..., "0xfff", ...); >>> ... >>> write_schemata(..., "0xfff", ...); >>> ... >>> >>> A failure of either write_schemata() in scenario 1 will be easy to trace since >>> the schemata attempted is different in each case. 
The schemata printed by the
>>> write_schemata() error message can thus easily be connected to the specific
>>> write_schemata() call.
>>> A failure of either write_schemata() in scenario 2 is not so obvious since they
>>> both attempted the same schemata so the error message printed by write_schemata()
>>> could belong to either.
>
>> I'm sorry to drag this thread out but I want to be sure if I'm right or are you
>> suggesting something and I missed it?
>
>Please just add a ksft_print_msg() to noncont_cat_run_test() when this
>write_schemata() fails.

My point all along was that if write_schemata() fails it already prints out all
the necessary information. I'd like to avoid adding redundant messages, so please
take a look at how it looks now:

I injected an error into write_schemata() so that it takes the same path as if
write() had failed with 'Permission denied' as the reason. Here is the output for
the L3 non-contiguous CAT test:

[root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT
TAP version 13
# Pass: Check kernel supports resctrl filesystem
# Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
# resctrl filesystem not mounted
# dmesg: [ 18.579861] resctrl: L3 allocation detected
# dmesg: [ 18.590395] resctrl: L2 allocation detected
# dmesg: [ 18.595181] resctrl: MB allocation detected
# dmesg: [ 18.599963] resctrl: L3 monitoring detected
1..1
# Starting L3_NONCONT_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied
not ok 1 L3_NONCONT_CAT: test
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0

Of course, if you still think adding a ksft_print_msg() there would be meaningful
I'll try to write a sensible message. But I hope you can see what I meant when I
wrote that the user could already easily see what failed.

>
>Reinette
>
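Incidentally, the "L3:0=ff" in the failing line above is consistent with the
contiguous-mask write being the one that failed: assuming the full L3 CBM reported
on this machine is 0x7fff or 0xffff (an assumption, not stated in the thread), the
contiguous mask the test computes is 0xff either way. A rough check of that:

#include <stdio.h>

int main(void)
{
	unsigned long full_cbm[] = { 0x7fff, 0xffff };	/* assumed candidate CBM widths */

	for (int i = 0; i < 2; i++) {
		int bit_center = __builtin_popcountl(full_cbm[i]) / 2;

		/* Both widths yield "ff", matching the schema string in the log. */
		printf("full CBM %lx -> contiguous schema %lx\n",
		       full_cbm[i], full_cbm[i] >> bit_center);
	}
	return 0;
}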
Hi Maciej, On 1/22/2024 11:58 PM, Maciej Wieczór-Retman wrote: > On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote: >> Hi Maciej, >> >> On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote: >>> Hi! >>> >>> On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote: >>>> Hi Maciej, >>>> >>>> On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: >>>>> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >>>>>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>>>>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>>>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>>>>> >>>>>>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>>>>>> + >>>>>>>>>>> + /* Contiguous mask write check. */ >>>>>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>>>>>> + if (ret) >>>>>>>>>>> + return ret; >>>>>>>>>> >>>>>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>>>>>> noncont_cat_run_test(), fails. >>>>>>>>> >>>>>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>>>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>>>>>> and I believe that the test will report an error rather than failure. >>>>>>>> >>>>>>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>>>>>> in case there is an error. Seems like in this case the user is expected to look at the >>>>>>>> source code of the test to understand what the test was trying to do at the time it >>>>>>>> encountered the failure. In this case user may be "lucky" that this test only has >>>>>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>>>>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>>>>>> trying to do. >>>>>>> >>>>>>> When a write_schemata() is executed the string that is being written gets >>>>>>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>>>>>> it would be easy for the user to figure out which one failed. >>>>>> >>>>>> It would be easy for the user the figure out if (a) it is obvious to the user >>>>>> what schema a particular write_schema() call attempted to write and (b) all the >>>>>> write_schema() calls attempt to write different schema. >>> >>>>> As for (b) depends on what you meant. Other tests that run more than one >>>>> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest >>>>> that the non-contiguous test should attempt more schematas? For example shift >>>>> the bit hole from one side to the other? I assumed one CBM with a centered bit >>>>> hole would be enough to check if non-contiguous CBM feature works properly and >>>>> more CBMs would be redundant. >>>> >>>> Let me try with an example. >>>> Scenario 1: >>>> The test has the following code: >>>> ... >>>> write_schemata(..., "0xfff", ...); >>>> ... >>>> write_schemata(..., "0xf0f", ...); >>>> ... >>>> >>>> Scenario 2: >>>> The test has the following code: >>>> ... >>>> write_schemata(..., "0xfff", ...); >>>> ... >>>> write_schemata(..., "0xfff", ...); >>>> ... 
>>>> >>>> A failure of either write_schemata() in scenario 1 will be easy to trace since >>>> the schemata attempted is different in each case. The schemata printed by the >>>> write_schemata() error message can thus easily be connected to the specific >>>> write_schemata() call. >>>> A failure of either write_schemata() in scenario 2 is not so obvious since they >>>> both attempted the same schemata so the error message printed by write_schemata() >>>> could belong to either. >> >>> I'm sorry to drag this thread out but I want to be sure if I'm right or are you >>> suggesting something and I missed it? >> >> Please just add a ksft_print_msg() to noncont_cat_run_test() when this >> write_schemata() fails. > > My point all along was that if write_schemata() fails it already prints out all > the necessary information. I'd like to avoid adding redundant messages so please > take a look at how it looks now: Please consider that there may be different perspectives of "necessary information". > I injected write_schemata() with an error so it will take a path as if write() > failed with 'Permission denied' as a reason. Here is the output for L3 > non-contiguous CAT test: > > [root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT > TAP version 13 > # Pass: Check kernel supports resctrl filesystem > # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists > # resctrl filesystem not mounted > # dmesg: [ 18.579861] resctrl: L3 allocation detected > # dmesg: [ 18.590395] resctrl: L2 allocation detected > # dmesg: [ 18.595181] resctrl: MB allocation detected > # dmesg: [ 18.599963] resctrl: L3 monitoring detected > 1..1 > # Starting L3_NONCONT_CAT test ... > # Mounting resctrl to "/sys/fs/resctrl" > # Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied > not ok 1 L3_NONCONT_CAT: test > # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0 Understood. > Of course if you still think adding a ksft_print_msg() there would be meaningful > I'll try to write a sensible message. But I hope you can see what I meant when I > wrote that the user could already easily see what failed. I do still believe that it will be helpful if there is a ksft_print_msg() with something like "Unable to write contiguous CBM" or "Write of contiguous CBM failed" or ... ? Reinette
Hi Reinette! On 2024-01-23 at 09:42:07 -0800, Reinette Chatre wrote: >Hi Maciej, > >On 1/22/2024 11:58 PM, Maciej Wieczór-Retman wrote: >> On 2024-01-22 at 08:32:36 -0800, Reinette Chatre wrote: >>> Hi Maciej, >>> >>> On 1/21/2024 11:56 PM, Maciej Wieczór-Retman wrote: >>>> Hi! >>>> >>>> On 2024-01-19 at 08:39:31 -0800, Reinette Chatre wrote: >>>>> Hi Maciej, >>>>> >>>>> On 1/18/2024 11:37 PM, Maciej Wieczór-Retman wrote: >>>>>> On 2024-01-18 at 09:15:46 -0800, Reinette Chatre wrote: >>>>>>> On 1/18/2024 4:02 AM, Maciej Wieczór-Retman wrote: >>>>>>>> On 2024-01-17 at 10:49:06 -0800, Reinette Chatre wrote: >>>>>>>>> On 1/17/2024 12:26 AM, Maciej Wieczór-Retman wrote: >>>>>>>>>> On 2024-01-08 at 14:42:11 -0800, Reinette Chatre wrote: >>>>>>>>>>> On 12/12/2023 6:52 AM, Maciej Wieczor-Retman wrote: >>>>>>> >>>>>>>>>>>> + bit_center = count_bits(full_cache_mask) / 2; >>>>>>>>>>>> + cont_mask = full_cache_mask >> bit_center; >>>>>>>>>>>> + >>>>>>>>>>>> + /* Contiguous mask write check. */ >>>>>>>>>>>> + snprintf(schemata, sizeof(schemata), "%lx", cont_mask); >>>>>>>>>>>> + ret = write_schemata("", schemata, uparams->cpu, test->resource); >>>>>>>>>>>> + if (ret) >>>>>>>>>>>> + return ret; >>>>>>>>>>> >>>>>>>>>>> How will user know what failed? I am seeing this single test exercise a few scenarios >>>>>>>>>>> and it is not obvious to me if the issue will be clear if this test, >>>>>>>>>>> noncont_cat_run_test(), fails. >>>>>>>>>> >>>>>>>>>> write_schemata() either succeeds with '0' or errors out with a negative value. If >>>>>>>>>> the contiguous mask write fails, write_schemata should print out what was wrong >>>>>>>>>> and I believe that the test will report an error rather than failure. >>>>>>>>> >>>>>>>>> Right. I am trying to understand whether the user will be able to decipher what failed >>>>>>>>> in case there is an error. Seems like in this case the user is expected to look at the >>>>>>>>> source code of the test to understand what the test was trying to do at the time it >>>>>>>>> encountered the failure. In this case user may be "lucky" that this test only has >>>>>>>>> one write_schemata() call _not_ followed by a ksft_print_msg() so user can use that >>>>>>>>> reasoning to figure out which write_schemata() failed to further dig what test was >>>>>>>>> trying to do. >>>>>>>> >>>>>>>> When a write_schemata() is executed the string that is being written gets >>>>>>>> printed. If there are multiple calls in a single tests and one fails I'd imagine >>>>>>>> it would be easy for the user to figure out which one failed. >>>>>>> >>>>>>> It would be easy for the user the figure out if (a) it is obvious to the user >>>>>>> what schema a particular write_schema() call attempted to write and (b) all the >>>>>>> write_schema() calls attempt to write different schema. >>>> >>>>>> As for (b) depends on what you meant. Other tests that run more than one >>>>>> write_schemata() use different ones every time (CAT, MBM, MBA). Do you suggest >>>>>> that the non-contiguous test should attempt more schematas? For example shift >>>>>> the bit hole from one side to the other? I assumed one CBM with a centered bit >>>>>> hole would be enough to check if non-contiguous CBM feature works properly and >>>>>> more CBMs would be redundant. >>>>> >>>>> Let me try with an example. >>>>> Scenario 1: >>>>> The test has the following code: >>>>> ... >>>>> write_schemata(..., "0xfff", ...); >>>>> ... >>>>> write_schemata(..., "0xf0f", ...); >>>>> ... 
>>>>> >>>>> Scenario 2: >>>>> The test has the following code: >>>>> ... >>>>> write_schemata(..., "0xfff", ...); >>>>> ... >>>>> write_schemata(..., "0xfff", ...); >>>>> ... >>>>> >>>>> A failure of either write_schemata() in scenario 1 will be easy to trace since >>>>> the schemata attempted is different in each case. The schemata printed by the >>>>> write_schemata() error message can thus easily be connected to the specific >>>>> write_schemata() call. >>>>> A failure of either write_schemata() in scenario 2 is not so obvious since they >>>>> both attempted the same schemata so the error message printed by write_schemata() >>>>> could belong to either. >>> >>>> I'm sorry to drag this thread out but I want to be sure if I'm right or are you >>>> suggesting something and I missed it? >>> >>> Please just add a ksft_print_msg() to noncont_cat_run_test() when this >>> write_schemata() fails. >> >> My point all along was that if write_schemata() fails it already prints out all >> the necessary information. I'd like to avoid adding redundant messages so please >> take a look at how it looks now: > >Please consider that there may be different perspectives of "necessary information". Oh of course. By that I meant the failed schemata which I assumed is what you were looking for in this error handling here. > >> I injected write_schemata() with an error so it will take a path as if write() >> failed with 'Permission denied' as a reason. Here is the output for L3 >> non-contiguous CAT test: >> >> [root@spr1 ~]# ./resctrl_tests -t L3_NONCONT_CAT >> TAP version 13 >> # Pass: Check kernel supports resctrl filesystem >> # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists >> # resctrl filesystem not mounted >> # dmesg: [ 18.579861] resctrl: L3 allocation detected >> # dmesg: [ 18.590395] resctrl: L2 allocation detected >> # dmesg: [ 18.595181] resctrl: MB allocation detected >> # dmesg: [ 18.599963] resctrl: L3 monitoring detected >> 1..1 >> # Starting L3_NONCONT_CAT test ... >> # Mounting resctrl to "/sys/fs/resctrl" >> # Write schema "L3:0=ff" to resctrl FS # write() failed : Permission denied >> not ok 1 L3_NONCONT_CAT: test >> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0 > >Understood. > >> Of course if you still think adding a ksft_print_msg() there would be meaningful >> I'll try to write a sensible message. But I hope you can see what I meant when I >> wrote that the user could already easily see what failed. > >I do still believe that it will be helpful if there is a ksft_print_msg() with >something like "Unable to write contiguous CBM" or "Write of contiguous CBM failed" >or ... ? Sure, I can see how that can be helpful, I'll add "Write of contiguous CBM failed", thanks! > >Reinette >
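For the next revision, the agreed-on message presumably lands next to the
contiguous-mask check shown in the diff below. A sketch of what that could look
like; only the message text is quoted from the discussion, the exact placement and
wording are up to the updated patch:

	/* Contiguous mask write check. */
	snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
	ret = write_schemata("", schemata, uparams->cpu, test->resource);
	if (ret) {
		/* Caller-side context; write_schemata() already printed the schema. */
		ksft_print_msg("Write of contiguous CBM failed\n");
		return ret;
	}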
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 7dc7206b3b99..ecf553a89aae 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -292,6 +292,65 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
 	return ret;
 }
 
+static int noncont_cat_run_test(const struct resctrl_test *test,
+				const struct user_params *uparams)
+{
+	unsigned long full_cache_mask, cont_mask, noncont_mask;
+	unsigned int eax, ebx, ecx, edx, ret;
+	int level, bit_center, sparse_masks;
+	char schemata[64];
+
+	/* Check to compare sparse_masks content to cpuid output. */
+	sparse_masks = read_info_res_file(test->resource, "sparse_masks");
+	if (sparse_masks < 0)
+		return sparse_masks;
+
+	level = get_cache_level(test->resource);
+	if (level < 0)
+		return -EINVAL;
+	__cpuid_count(0x10, 4 - level, eax, ebx, ecx, edx);
+
+	if (sparse_masks != ((ecx >> 3) & 1))
+		return -1;
+
+	/* Write checks initialization. */
+	ret = get_full_cbm(test->resource, &full_cache_mask);
+	if (ret < 0)
+		return ret;
+	bit_center = count_bits(full_cache_mask) / 2;
+	cont_mask = full_cache_mask >> bit_center;
+
+	/* Contiguous mask write check. */
+	snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
+	ret = write_schemata("", schemata, uparams->cpu, test->resource);
+	if (ret)
+		return ret;
+
+	/*
+	 * Non-contiguous mask write check. CBM has a 0xf hole approximately in the middle.
+	 * Output is compared with support information to catch any edge case errors.
+	 */
+	noncont_mask = ~(0xf << (bit_center - 2)) & full_cache_mask;
+	snprintf(schemata, sizeof(schemata), "%lx", noncont_mask);
+	ret = write_schemata("", schemata, uparams->cpu, test->resource);
+	if (ret && sparse_masks)
+		ksft_print_msg("Non-contiguous CBMs supported but write failed\n");
+	else if (ret && !sparse_masks)
+		ksft_print_msg("Non-contiguous CBMs not supported and write failed as expected\n");
+	else if (!ret && !sparse_masks)
+		ksft_print_msg("Non-contiguous CBMs not supported but write succeeded\n");
+
+	return !ret == !sparse_masks;
+}
+
+static bool noncont_cat_feature_check(const struct resctrl_test *test)
+{
+	if (!resctrl_resource_exists(test->resource))
+		return false;
+
+	return resctrl_cache_feature_exists(test->resource, "sparse_masks");
+}
+
 struct resctrl_test l3_cat_test = {
 	.name = "L3_CAT",
 	.group = "CAT",
@@ -299,3 +358,19 @@ struct resctrl_test l3_cat_test = {
 	.feature_check = test_resource_feature_check,
 	.run_test = cat_run_test,
 };
+
+struct resctrl_test l3_noncont_cat_test = {
+	.name = "L3_NONCONT_CAT",
+	.group = "CAT",
+	.resource = "L3",
+	.feature_check = noncont_cat_feature_check,
+	.run_test = noncont_cat_run_test,
+};
+
+struct resctrl_test l2_noncont_cat_test = {
+	.name = "L2_NONCONT_CAT",
+	.group = "CAT",
+	.resource = "L2",
+	.feature_check = noncont_cat_feature_check,
+	.run_test = noncont_cat_run_test,
+};
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 74041a35d4ba..7b7a48d1fddd 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -165,6 +165,7 @@ unsigned int count_contiguous_bits(unsigned long val, unsigned int *start);
 int get_full_cbm(const char *cache_type, unsigned long *mask);
 int get_mask_no_shareable(const char *cache_type, unsigned long *mask);
 int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size);
+int get_cache_level(const char *cache_type);
 int read_info_res_file(const char *resource, const char *filename);
 void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
 int signal_handler_register(void);
@@ -201,5 +202,7 @@ extern struct resctrl_test mbm_test;
 extern struct resctrl_test mba_test;
 extern struct resctrl_test cmt_test;
 extern struct resctrl_test l3_cat_test;
+extern struct resctrl_test l3_noncont_cat_test;
+extern struct resctrl_test l2_noncont_cat_test;
 
 #endif /* RESCTRL_H */
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 3044179ee6e9..f3dc1b9696e7 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -19,6 +19,8 @@ static struct resctrl_test *resctrl_tests[] = {
 	&mba_test,
 	&cmt_test,
 	&l3_cat_test,
+	&l3_noncont_cat_test,
+	&l2_noncont_cat_test,
 };
 
 static int detect_vendor(void)
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 8546421f0940..8bd30973fec3 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -100,7 +100,7 @@ int umount_resctrlfs(void)
  *
  * Return: cache level as integer or -1 if @cache_type is invalid.
  */
-static int get_cache_level(const char *cache_type)
+int get_cache_level(const char *cache_type)
 {
 	if (!strcmp(cache_type, "L3"))
 		return 3;
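Two details of noncont_cat_run_test() above are worth spelling out. CPUID leaf 0x10
enumerates L3 CAT in subleaf 1 and L2 CAT in subleaf 2, so "4 - level" maps cache
level 3 to subleaf 1 and level 2 to subleaf 2, and bit 3 of ECX in that subleaf
advertises non-contiguous CBM support, which the test cross-checks against resctrl's
"sparse_masks" info file. The final "return !ret == !sparse_masks;" then reports a
failure exactly when the write outcome contradicts the advertised support (the write
succeeded without support, or failed despite support). A minimal standalone sketch of
the CPUID side of that check (not part of the patch; gcc's <cpuid.h> provides
__cpuid_count()):

#include <stdio.h>
#include <cpuid.h>

/* Query non-contiguous CBM support for L3 (level 3) or L2 (level 2) CAT. */
static int cpuid_sparse_masks(int cache_level)
{
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 0x10: subleaf 1 describes L3 CAT, subleaf 2 describes L2 CAT. */
	__cpuid_count(0x10, 4 - cache_level, eax, ebx, ecx, edx);

	/* ECX bit 3: non-contiguous (sparse) CBMs supported. */
	return (ecx >> 3) & 1;
}

int main(void)
{
	printf("L3 non-contiguous CBMs: %d\n", cpuid_sparse_masks(3));
	printf("L2 non-contiguous CBMs: %d\n", cpuid_sparse_masks(2));
	return 0;
}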