Message ID | 918e147601697b1d4b8f8589f5751e05d6ceccdf.1695371055.git.maciej.wieczor-retman@intel.com |
---|---|
State | New |
Headers |
From: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
To: Fenghua Yu <fenghua.yu@intel.com>, Reinette Chatre <reinette.chatre@intel.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH v2 1/4] x86/resctrl: Enable non-contiguous bits in Intel CAT
Date: Fri, 22 Sep 2023 10:48:23 +0200
Message-ID: <918e147601697b1d4b8f8589f5751e05d6ceccdf.1695371055.git.maciej.wieczor-retman@intel.com>
In-Reply-To: <cover.1695371055.git.maciej.wieczor-retman@intel.com>
References: <cover.1695371055.git.maciej.wieczor-retman@intel.com> |
Series | x86/resctrl: Non-contiguous bitmasks in Intel CAT |
Commit Message
Maciej Wieczor-Retman
Sept. 22, 2023, 8:48 a.m. UTC
The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file will fail before resctrl passes
the value to the hardware.
In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
being reserved and now carry information about non-contiguous 1s
value support for L3 and L2 cache respectively. The CAT
capacity bitmask (CBM) supports a non-contiguous 1s value if
the bit is set.
Replace the hardcoded non-contiguous support value with
the support learned from the hardware. Add a hardcoded non-contiguous
support value to the Haswell probe since it can't make use of CPUID for
cache allocation.
Originally-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v2:
- Rewrite part of a comment concerning Haswell. (Reinette)
arch/x86/kernel/cpu/resctrl/core.c | 9 ++++++---
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++----
arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++
3 files changed, 21 insertions(+), 7 deletions(-)
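
As an aside, the CPUID bits described in the commit message can also be inspected from user space. Below is a minimal illustrative sketch (not part of the patch) using GCC's <cpuid.h> helper __get_cpuid_count(); inside the kernel the equivalent read is done with cpuid_count(), as in the diff further down.

#include <cpuid.h>
#include <stdio.h>

int main(void)
{
	unsigned int eax, ebx, ecx, edx;
	unsigned int subleaf;

	/* Leaf 0x10: subleaf 1 = L3 CAT, subleaf 2 = L2 CAT. */
	for (subleaf = 1; subleaf <= 2; subleaf++) {
		if (!__get_cpuid_count(0x10, subleaf, &eax, &ebx, &ecx, &edx))
			continue;
		/* ECX[3] set => non-contiguous 1s value supported. */
		printf("CPUID.0x10.%u:ECX[3] = %u\n", subleaf, (ecx >> 3) & 1);
	}
	return 0;
}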
Comments
Hi, and thanks for the review! On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote: >Hi Maciej, > >On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote: >> The setting for non-contiguous 1s support in Intel CAT is >> hardcoded to false. On these systems, writing non-contiguous >> 1s into the schemata file will fail before resctrl passes >> the value to the hardware. >> >> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped >> being reserved and now carry information about non-contiguous 1s >> value support for L3 and L2 cache respectively. The CAT >> capacity bitmask (CBM) supports a non-contiguous 1s value if >> the bit is set. > >How new of an SDM do I need? The June 2023 revision I downloaded today didn't >list it. It's not currently in the SDM but in the Intel® Architecture Instruction Set Extensions and Future Features (which I mentioned in the second paragraph of the cover letter). My version of the ISA pdf was from June 2023. >> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c >> index 030d3b409768..c783a873147c 100644 >> --- a/arch/x86/kernel/cpu/resctrl/core.c >> +++ b/arch/x86/kernel/cpu/resctrl/core.c >> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) >> r->cache.cbm_len = 20; >> r->cache.shareable_bits = 0xc0000; >> r->cache.min_cbm_bits = 2; >> + r->cache.arch_has_sparse_bitmaps = false; >> r->alloc_capable = true; >> >> rdt_alloc_capable = true; >> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r) >> { >> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); >> union cpuid_0x10_1_eax eax; >> + union cpuid_0x10_x_ecx ecx; >> union cpuid_0x10_x_edx edx; >> - u32 ebx, ecx; >> + u32 ebx; >> >> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); >> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); >> hw_res->num_closid = edx.split.cos_max + 1; >> r->cache.cbm_len = eax.split.cbm_len + 1; >> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; >> r->cache.shareable_bits = ebx & r->default_ctrl; >> r->data_width = (r->cache.cbm_len + 3) / 4; >> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) >> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; > >This seems to be called after the clearing of arch_has_sparse_bitmaps in >cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell, >is it safe to use its value here? I believe the calls go like this for a haswell system: resctrl_late_init() -> check_quirks() -> __check_quirks_intel() -> -> cache_alloc_hsw_probe() There this line is executed: rdt_alloc_capable = true; where rdt_alloc_capable is global in the file scope. Then later in: resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources() this is executed at the function beginning: if (rdt_alloc_capable) return true; So the rest of the get_rdt_alloc_resources() is skipped and calls to rdt_get_cache_alloc_cfg() never get executed.
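
To make the call flow described above concrete, here is a simplified sketch of the early-return guard (a paraphrase for illustration, not the actual kernel code) showing why the CPUID-based rdt_get_cache_alloc_cfg() is never reached once the Haswell probe has run:

#include <stdbool.h>

static bool rdt_alloc_capable;	/* set to true by cache_alloc_hsw_probe() */

static bool get_rdt_alloc_resources_sketch(void)
{
	if (rdt_alloc_capable)	/* Haswell: probe already configured the resource */
		return true;	/* rdt_get_cache_alloc_cfg() is skipped entirely */

	/* Generic path: CPUID probing via rdt_get_cache_alloc_cfg() would run here. */
	return false;
}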
Hi Maciej, On Wed, Sep 27, 2023 at 11:20 AM Maciej Wieczór-Retman <maciej.wieczor-retman@intel.com> wrote: > On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote: > >On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote: > >> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped > >> being reserved and now carry information about non-contiguous 1s > >> value support for L3 and L2 cache respectively. The CAT > >> capacity bitmask (CBM) supports a non-contiguous 1s value if > >> the bit is set. > > > >How new of an SDM do I need? The June 2023 revision I downloaded today didn't > >list it. > > It's not currently in the SDM but in the Intel® Architecture > Instruction Set Extensions and Future Features (which I mentioned in the > second paragraph of the cover letter). My version of the ISA pdf was > from June 2023. > I see it now, thanks! > >> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); > >> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); > >> hw_res->num_closid = edx.split.cos_max + 1; > >> r->cache.cbm_len = eax.split.cbm_len + 1; > >> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; > >> r->cache.shareable_bits = ebx & r->default_ctrl; > >> r->data_width = (r->cache.cbm_len + 3) / 4; > >> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) > >> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; > > > >This seems to be called after the clearing of arch_has_sparse_bitmaps in > >cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell, > >is it safe to use its value here? > > I believe the calls go like this for a haswell system: > resctrl_late_init() -> check_quirks() -> __check_quirks_intel() -> > -> cache_alloc_hsw_probe() > > There this line is executed: > rdt_alloc_capable = true; > where rdt_alloc_capable is global in the file scope. > > Then later in: > resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources() > > this is executed at the function beginning: > if (rdt_alloc_capable) > return true; > > So the rest of the get_rdt_alloc_resources() is skipped and calls to > rdt_get_cache_alloc_cfg() never get executed. Yuck. But it works I guess. The series looks fine to me. Reviewed-by: Peter Newman <peternewman@google.com> I applied the series and was able to confirm the behavior was still correct for contiguous-bitmap Intel hardware and that sprase_bitmaps is true on AMD and continues to work as expected. Tested-by: Peter Newman <peternewman@google.com> I'm not sure if I have access to any Intel hardware with non-contiguous bitmaps right now. Are you able to say where that would be implemented? Thanks! -Peter
On 2023-09-27 at 12:08:33 +0200, Peter Newman wrote: >On Wed, Sep 27, 2023 at 11:20 AM Maciej Wieczór-Retman ><maciej.wieczor-retman@intel.com> wrote: >> On 2023-09-22 at 16:14:41 +0200, Peter Newman wrote: >> >On Fri, Sep 22, 2023 at 10:48:23AM +0200, Maciej Wieczor-Retman wrote: >> >> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); >> >> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); >> >> hw_res->num_closid = edx.split.cos_max + 1; >> >> r->cache.cbm_len = eax.split.cbm_len + 1; >> >> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; >> >> r->cache.shareable_bits = ebx & r->default_ctrl; >> >> r->data_width = (r->cache.cbm_len + 3) / 4; >> >> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) >> >> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; >> > >> >This seems to be called after the clearing of arch_has_sparse_bitmaps in >> >cache_alloc_hsw_probe(). If we can't make use of the CPUID bit on Haswell, >> >is it safe to use its value here? >> >> I believe the calls go like this for a haswell system: >> resctrl_late_init() -> check_quirks() -> __check_quirks_intel() -> >> -> cache_alloc_hsw_probe() >> >> There this line is executed: >> rdt_alloc_capable = true; >> where rdt_alloc_capable is global in the file scope. >> >> Then later in: >> resctrl_late_init() -> get_rdt_resources() -> get_rdt_alloc_resources() >> >> this is executed at the function beginning: >> if (rdt_alloc_capable) >> return true; >> >> So the rest of the get_rdt_alloc_resources() is skipped and calls to >> rdt_get_cache_alloc_cfg() never get executed. > >Yuck. But it works I guess. > >The series looks fine to me. > >Reviewed-by: Peter Newman <peternewman@google.com> > >I applied the series and was able to confirm the behavior was still >correct for contiguous-bitmap Intel hardware and that sprase_bitmaps >is true on AMD and continues to work as expected. > >Tested-by: Peter Newman <peternewman@google.com> > >I'm not sure if I have access to any Intel hardware with >non-contiguous bitmaps right now. Are you able to say where that would >be implemented? Thanks for testing! Writing non-contiguous bitmasks is supported starting from the upcoming GNR microarchitecture forward. That's also why the new CPUID bit meaning is in the ISA pdf and not in the SDM one currently.
On Wed, Sep 27, 2023 at 12:44:39PM +0200, Maciej Wieczór-Retman wrote: > Writing non-contiguous bitmasks is supported starting from the upcoming > GNR microarchitecture forward. > > That's also why the new CPUID bit meaning is in the ISA pdf and not in > the SDM one currently. New SDM released today has the non-contiguous bit. See vol 3B Figure 18-33. -Tony
Hi Maciej, How about this subject line? x86/resctrl: Enable non-contiguous CBMs on Intel CAT On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote: > The setting for non-contiguous 1s support in Intel CAT is > hardcoded to false. On these systems, writing non-contiguous > 1s into the schemata file will fail before resctrl passes > the value to the hardware. > > In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped > being reserved and now carry information about non-contiguous 1s > value support for L3 and L2 cache respectively. The CAT > capacity bitmask (CBM) supports a non-contiguous 1s value if > the bit is set. > > Replace the hardcoded non-contiguous support value with > the support learned from the hardware. Add hardcoded non-contiguous > support value to Haswell probe since it can't make use of CPUID for > Cache allocation. > > Originally-by: Fenghua Yu <fenghua.yu@intel.com> > Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> > --- > Changelog v2: > - Rewrite part of a comment concerning Haswell. (Reinette) > > arch/x86/kernel/cpu/resctrl/core.c | 9 ++++++--- > arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++---- > arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++ > 3 files changed, 21 insertions(+), 7 deletions(-) > > diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c > index 030d3b409768..c783a873147c 100644 > --- a/arch/x86/kernel/cpu/resctrl/core.c > +++ b/arch/x86/kernel/cpu/resctrl/core.c > @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) > r->cache.cbm_len = 20; > r->cache.shareable_bits = 0xc0000; > r->cache.min_cbm_bits = 2; > + r->cache.arch_has_sparse_bitmaps = false; Is this change required? This is always set to false in rdt_init_res_defs_intel(). > r->alloc_capable = true; > > rdt_alloc_capable = true; > @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r) > { > struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); > union cpuid_0x10_1_eax eax; > + union cpuid_0x10_x_ecx ecx; > union cpuid_0x10_x_edx edx; > - u32 ebx, ecx; > + u32 ebx; > > - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); > + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); > hw_res->num_closid = edx.split.cos_max + 1; > r->cache.cbm_len = eax.split.cbm_len + 1; > r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; > r->cache.shareable_bits = ebx & r->default_ctrl; > r->data_width = (r->cache.cbm_len + 3) / 4; > + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) > + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; > r->alloc_capable = true; > } > > @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void) > > if (r->rid == RDT_RESOURCE_L3 || > r->rid == RDT_RESOURCE_L2) { > - r->cache.arch_has_sparse_bitmaps = false; Why do you have to remove this one here? This seems like a right place to initialize. Thanks Babu > r->cache.arch_has_per_cpu_cfg = false; > r->cache.min_cbm_bits = 1; > } else if (r->rid == RDT_RESOURCE_MBA) { > diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c > index b44c487727d4..f076f12cf8e8 100644 > --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c > +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c > @@ -87,10 +87,12 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s, > > /* > * Check whether a cache bit mask is valid. > - * For Intel the SDM says: > - * Please note that all (and only) contiguous '1' combinations > - * are allowed (e.g. 
FFFFH, 0FF0H, 003CH, etc.). > - * Additionally Haswell requires at least two bits set. > + * On Intel CPUs, non-contiguous 1s value support is indicated by CPUID: > + * - CPUID.0x10.1:ECX[3]: L3 non-contiguous 1s value supported if 1 > + * - CPUID.0x10.2:ECX[3]: L2 non-contiguous 1s value supported if 1 > + * > + * Haswell does not support a non-contiguous 1s value and additionally > + * requires at least two bits set. > * AMD allows non-contiguous bitmasks. > */ > static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r) > diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h > index 85ceaf9a31ac..c47ef2f13e8e 100644 > --- a/arch/x86/kernel/cpu/resctrl/internal.h > +++ b/arch/x86/kernel/cpu/resctrl/internal.h > @@ -492,6 +492,15 @@ union cpuid_0x10_3_eax { > unsigned int full; > }; > > +/* CPUID.(EAX=10H, ECX=ResID).ECX */ > +union cpuid_0x10_x_ecx { > + struct { > + unsigned int reserved:3; > + unsigned int noncont:1; > + } split; > + unsigned int full; > +}; > + > /* CPUID.(EAX=10H, ECX=ResID).EDX */ > union cpuid_0x10_x_edx { > struct {
Hi, thanks for reviewing the series, On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote: >Hi Maciej, > >How about this subject line? > >x86/resctrl: Enable non-contiguous CBMs on Intel CAT Changing "bits" to "CBMs" does indeed seem sensible so I'll do that if there are no objections. But I'm not sure the preposition collocation change from "in" to "on" would be grammatical (at least from what I've read in docs about Intel CAT so far). >On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote: >> The setting for non-contiguous 1s support in Intel CAT is > >> hardcoded to false. On these systems, writing non-contiguous >> 1s into the schemata file will fail before resctrl passes >> the value to the hardware. >> >> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped >> being reserved and now carry information about non-contiguous 1s >> value support for L3 and L2 cache respectively. The CAT >> capacity bitmask (CBM) supports a non-contiguous 1s value if >> the bit is set. >> >> Replace the hardcoded non-contiguous support value with >> the support learned from the hardware. Add hardcoded non-contiguous >> support value to Haswell probe since it can't make use of CPUID for >> Cache allocation. >> >> Originally-by: Fenghua Yu <fenghua.yu@intel.com> >> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> >> --- >> Changelog v2: >> - Rewrite part of a comment concerning Haswell. (Reinette) >> >> arch/x86/kernel/cpu/resctrl/core.c | 9 ++++++--- >> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++---- >> arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++ >> 3 files changed, 21 insertions(+), 7 deletions(-) >> >> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c >> index 030d3b409768..c783a873147c 100644 >> --- a/arch/x86/kernel/cpu/resctrl/core.c >> +++ b/arch/x86/kernel/cpu/resctrl/core.c >> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) >> r->cache.cbm_len = 20; >> r->cache.shareable_bits = 0xc0000; >> r->cache.min_cbm_bits = 2; >> + r->cache.arch_has_sparse_bitmaps = false; > >Is this change required? > >This is always set to false in rdt_init_res_defs_intel(). The logic behind moving this variable initialization from rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and rdt_get_cache_alloc_cfg() is that the variable doesn't really have a default value anymore. It used to when the CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] bits were reserved. Now for the general case the variable is dependent on CPUID output. And only for Haswell case it needs to be hardcoded to "false", so the assignment makes more sense in Haswell probe rather than in the default section. 
>> r->alloc_capable = true; >> rdt_alloc_capable = true; >> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r) >> { >> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); >> union cpuid_0x10_1_eax eax; >> + union cpuid_0x10_x_ecx ecx; >> union cpuid_0x10_x_edx edx; >> - u32 ebx, ecx; >> + u32 ebx; >> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); >> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); >> hw_res->num_closid = edx.split.cos_max + 1; >> r->cache.cbm_len = eax.split.cbm_len + 1; >> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; >> r->cache.shareable_bits = ebx & r->default_ctrl; >> r->data_width = (r->cache.cbm_len + 3) / 4; >> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) >> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; >> r->alloc_capable = true; >> } >> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void) >> if (r->rid == RDT_RESOURCE_L3 || >> r->rid == RDT_RESOURCE_L2) { >> - r->cache.arch_has_sparse_bitmaps = false; > >Why do you have to remove this one here? This seems like a right place to >initialize. Look at the previous comment.
Hi Maciej, On 9/28/23 02:06, Maciej Wieczór-Retman wrote: > Hi, thanks for reviewing the series, > > On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote: >> Hi Maciej, >> >> How about this subject line? >> >> x86/resctrl: Enable non-contiguous CBMs on Intel CAT > > Changing "bits" to "CBMs" does indeed seem sensible so I'll do that if > there are no objections. > > But I'm not sure the preposition collocation change from "in" to "on" > would be grammatical (at least from what I've read in docs about Intel > CAT so far). > >> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote: >>> The setting for non-contiguous 1s support in Intel CAT is >> >>> hardcoded to false. On these systems, writing non-contiguous >>> 1s into the schemata file will fail before resctrl passes >>> the value to the hardware. >>> >>> In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped >>> being reserved and now carry information about non-contiguous 1s >>> value support for L3 and L2 cache respectively. The CAT >>> capacity bitmask (CBM) supports a non-contiguous 1s value if >>> the bit is set. >>> >>> Replace the hardcoded non-contiguous support value with >>> the support learned from the hardware. Add hardcoded non-contiguous >>> support value to Haswell probe since it can't make use of CPUID for >>> Cache allocation. >>> >>> Originally-by: Fenghua Yu <fenghua.yu@intel.com> >>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> >>> --- >>> Changelog v2: >>> - Rewrite part of a comment concerning Haswell. (Reinette) >>> >>> arch/x86/kernel/cpu/resctrl/core.c | 9 ++++++--- >>> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 10 ++++++---- >>> arch/x86/kernel/cpu/resctrl/internal.h | 9 +++++++++ >>> 3 files changed, 21 insertions(+), 7 deletions(-) >>> >>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c >>> index 030d3b409768..c783a873147c 100644 >>> --- a/arch/x86/kernel/cpu/resctrl/core.c >>> +++ b/arch/x86/kernel/cpu/resctrl/core.c >>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) >>> r->cache.cbm_len = 20; >>> r->cache.shareable_bits = 0xc0000; >>> r->cache.min_cbm_bits = 2; >>> + r->cache.arch_has_sparse_bitmaps = false; >> >> Is this change required? >> >> This is always set to false in rdt_init_res_defs_intel(). > > The logic behind moving this variable initialization from > rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and > rdt_get_cache_alloc_cfg() is that the variable doesn't really have a > default value anymore. It used to when the CPUID.0x10.1:ECX[3] and > CPUID.0x10.2:ECX[3] bits were reserved. > > Now for the general case the variable is dependent on CPUID output. > And only for Haswell case it needs to be hardcoded to "false", so the > assignment makes more sense in Haswell probe rather than in the default > section. Here is the current sequence order with your change. 1. resctrl_late_init -> check_quirks -> __check_quirks_intel -> cache_alloc_hsw_probe r->cache.arch_has_sparse_bitmaps = false; (new code) 2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel r->cache.arch_has_sparse_bitmaps = false; (old code) 3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources -> rdt_get_cache_alloc_cfg r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code) The code in (3) is going to overwrite whatever is set in (1) or (2). I would say you can just remove initialization in both (1) and (2). That makes the code clearer to me. I assume reserved bits in Intel is always 0. 
Thanks Babu > >>> r->alloc_capable = true; >>> rdt_alloc_capable = true; >>> @@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r) >>> { >>> struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); >>> union cpuid_0x10_1_eax eax; >>> + union cpuid_0x10_x_ecx ecx; >>> union cpuid_0x10_x_edx edx; >>> - u32 ebx, ecx; >>> + u32 ebx; >>> - cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full); >>> + cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full); >>> hw_res->num_closid = edx.split.cos_max + 1; >>> r->cache.cbm_len = eax.split.cbm_len + 1; >>> r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1; >>> r->cache.shareable_bits = ebx & r->default_ctrl; >>> r->data_width = (r->cache.cbm_len + 3) / 4; >>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) >>> + r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; >>> r->alloc_capable = true; >>> } >>> @@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void) >>> if (r->rid == RDT_RESOURCE_L3 || >>> r->rid == RDT_RESOURCE_L2) { >>> - r->cache.arch_has_sparse_bitmaps = false; >> >> Why do you have to remove this one here? This seems like a right place to >> initialize. > > Look at the previous comment. >
Hi Babu, On 9/28/2023 8:08 AM, Moger, Babu wrote: > On 9/28/23 02:06, Maciej Wieczór-Retman wrote: >> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote: >>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote: ... >>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c >>>> index 030d3b409768..c783a873147c 100644 >>>> --- a/arch/x86/kernel/cpu/resctrl/core.c >>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c >>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) >>>> r->cache.cbm_len = 20; >>>> r->cache.shareable_bits = 0xc0000; >>>> r->cache.min_cbm_bits = 2; >>>> + r->cache.arch_has_sparse_bitmaps = false; >>> >>> Is this change required? >>> >>> This is always set to false in rdt_init_res_defs_intel(). >> >> The logic behind moving this variable initialization from >> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and >> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a >> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and >> CPUID.0x10.2:ECX[3] bits were reserved. >> >> Now for the general case the variable is dependent on CPUID output. >> And only for Haswell case it needs to be hardcoded to "false", so the >> assignment makes more sense in Haswell probe rather than in the default >> section. > > Here is the current sequence order with your change. > > 1. > resctrl_late_init -> check_quirks -> __check_quirks_intel -> > cache_alloc_hsw_probe > r->cache.arch_has_sparse_bitmaps = false; (new code) > > 2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel > r->cache.arch_has_sparse_bitmaps = false; (old code) > > 3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources -> > rdt_get_cache_alloc_cfg > r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code) > > The code in (3) is going to overwrite whatever is set in (1) or (2). > > I would say you can just remove initialization in both (1) and (2). That > makes the code clearer to me. I assume reserved bits in Intel is always 0. > I believe Maciej already addressed this in his response to a similar question from Peter. Please see: https://lore.kernel.org/lkml/xnjmmsj5pjskbqeynor2ztha5dmkhxa44j764ohtjhtywy7idb@soobjiql4liy/ Reinette
Hi Reinette, On 9/28/23 10:53, Reinette Chatre wrote: > Hi Babu, > > On 9/28/2023 8:08 AM, Moger, Babu wrote: >> On 9/28/23 02:06, Maciej Wieczór-Retman wrote: >>> On 2023-09-27 at 17:34:27 -0500, Moger, Babu wrote: >>>> On 9/22/2023 3:48 AM, Maciej Wieczor-Retman wrote: > ... > >>>>> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c >>>>> index 030d3b409768..c783a873147c 100644 >>>>> --- a/arch/x86/kernel/cpu/resctrl/core.c >>>>> +++ b/arch/x86/kernel/cpu/resctrl/core.c >>>>> @@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void) >>>>> r->cache.cbm_len = 20; >>>>> r->cache.shareable_bits = 0xc0000; >>>>> r->cache.min_cbm_bits = 2; >>>>> + r->cache.arch_has_sparse_bitmaps = false; >>>> >>>> Is this change required? >>>> >>>> This is always set to false in rdt_init_res_defs_intel(). >>> >>> The logic behind moving this variable initialization from >>> rdt_init_res_defs_intel() into both cache_alloc_hsw_probe() and >>> rdt_get_cache_alloc_cfg() is that the variable doesn't really have a >>> default value anymore. It used to when the CPUID.0x10.1:ECX[3] and >>> CPUID.0x10.2:ECX[3] bits were reserved. >>> >>> Now for the general case the variable is dependent on CPUID output. >>> And only for Haswell case it needs to be hardcoded to "false", so the >>> assignment makes more sense in Haswell probe rather than in the default >>> section. >> >> Here is the current sequence order with your change. >> >> 1. >> resctrl_late_init -> check_quirks -> __check_quirks_intel -> >> cache_alloc_hsw_probe >> r->cache.arch_has_sparse_bitmaps = false; (new code) >> >> 2. resctrl_late_init -> rdt_init_res_defs -> rdt_init_res_defs_intel >> r->cache.arch_has_sparse_bitmaps = false; (old code) >> >> 3. resctrl_late_init -> get_rdt_resources -> get_rdt_alloc_resources -> >> rdt_get_cache_alloc_cfg >> r->cache.arch_has_sparse_bitmaps = ecx.split.noncont; (new code) >> >> The code in (3) is going to overwrite whatever is set in (1) or (2). >> >> I would say you can just remove initialization in both (1) and (2). That >> makes the code clearer to me. I assume reserved bits in Intel is always 0. >> > > I believe Maciej already addressed this in his response to a similar question > from Peter. Please see: > https://lore.kernel.org/lkml/xnjmmsj5pjskbqeynor2ztha5dmkhxa44j764ohtjhtywy7idb@soobjiql4liy/ The rdt_alloc_capable part is kind of hidden. Now it makes sense. Thanks Babu Moger
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..c783a873147c 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -152,6 +152,7 @@ static inline void cache_alloc_hsw_probe(void)
 	r->cache.cbm_len = 20;
 	r->cache.shareable_bits = 0xc0000;
 	r->cache.min_cbm_bits = 2;
+	r->cache.arch_has_sparse_bitmaps = false;
 	r->alloc_capable = true;
 
 	rdt_alloc_capable = true;
@@ -267,15 +268,18 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	union cpuid_0x10_1_eax eax;
+	union cpuid_0x10_x_ecx ecx;
 	union cpuid_0x10_x_edx edx;
-	u32 ebx, ecx;
+	u32 ebx;
 
-	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx, &edx.full);
+	cpuid_count(0x00000010, idx, &eax.full, &ebx, &ecx.full, &edx.full);
 	hw_res->num_closid = edx.split.cos_max + 1;
 	r->cache.cbm_len = eax.split.cbm_len + 1;
 	r->default_ctrl = BIT_MASK(eax.split.cbm_len + 1) - 1;
 	r->cache.shareable_bits = ebx & r->default_ctrl;
 	r->data_width = (r->cache.cbm_len + 3) / 4;
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
+		r->cache.arch_has_sparse_bitmaps = ecx.split.noncont;
 	r->alloc_capable = true;
 }
 
@@ -872,7 +876,6 @@ static __init void rdt_init_res_defs_intel(void)
 
 	if (r->rid == RDT_RESOURCE_L3 ||
 	    r->rid == RDT_RESOURCE_L2) {
-		r->cache.arch_has_sparse_bitmaps = false;
 		r->cache.arch_has_per_cpu_cfg = false;
 		r->cache.min_cbm_bits = 1;
 	} else if (r->rid == RDT_RESOURCE_MBA) {
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index b44c487727d4..f076f12cf8e8 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -87,10 +87,12 @@ int parse_bw(struct rdt_parse_data *data, struct resctrl_schema *s,
 
 /*
  * Check whether a cache bit mask is valid.
- * For Intel the SDM says:
- *	Please note that all (and only) contiguous '1' combinations
- *	are allowed (e.g. FFFFH, 0FF0H, 003CH, etc.).
- * Additionally Haswell requires at least two bits set.
+ * On Intel CPUs, non-contiguous 1s value support is indicated by CPUID:
+ *   - CPUID.0x10.1:ECX[3]: L3 non-contiguous 1s value supported if 1
+ *   - CPUID.0x10.2:ECX[3]: L2 non-contiguous 1s value supported if 1
+ *
+ * Haswell does not support a non-contiguous 1s value and additionally
+ * requires at least two bits set.
  * AMD allows non-contiguous bitmasks.
  */
 static bool cbm_validate(char *buf, u32 *data, struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85ceaf9a31ac..c47ef2f13e8e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -492,6 +492,15 @@ union cpuid_0x10_3_eax {
 	unsigned int full;
 };
 
+/* CPUID.(EAX=10H, ECX=ResID).ECX */
+union cpuid_0x10_x_ecx {
+	struct {
+		unsigned int reserved:3;
+		unsigned int noncont:1;
+	} split;
+	unsigned int full;
+};
+
 /* CPUID.(EAX=10H, ECX=ResID).EDX */
 union cpuid_0x10_x_edx {
 	struct {
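
As a footnote to the diff, the sketch below shows how a union laid out like the new cpuid_0x10_x_ecx decodes ECX[3], together with the kind of contiguity test that cbm_validate() enforces when sparse bitmaps are unsupported. The names ending in _sketch are hypothetical, the bitfield decode assumes the little-endian bitfield layout used on x86, and the kernel's real check walks the bitmap with find_first_bit()/find_next_zero_bit() rather than this arithmetic shortcut.

#include <stdbool.h>
#include <stdio.h>

/* Same layout as the union added to internal.h above. */
union cpuid_0x10_x_ecx_sketch {
	struct {
		unsigned int reserved:3;
		unsigned int noncont:1;
	} split;
	unsigned int full;
};

/* True if 'val' is zero or a single contiguous run of 1s. */
static bool cbm_is_contiguous(unsigned int val)
{
	unsigned int lsb = val & -val;	/* isolate the lowest set bit */

	return ((val + lsb) & val) == 0;
}

int main(void)
{
	union cpuid_0x10_x_ecx_sketch ecx = { .full = 0x8 };	/* ECX[3] set */

	printf("noncont = %u\n", ecx.split.noncont);			/* prints 1 */
	printf("0x0FF0 contiguous: %d\n", cbm_is_contiguous(0x0FF0));	/* 1 */
	printf("0x0F0F contiguous: %d\n", cbm_is_contiguous(0x0F0F));	/* 0 */
	return 0;
}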