Message ID | 20231204185357.120501-1-tony.luck@intel.com |
---|---|
Headers |
From: Tony Luck <tony.luck@intel.com>
To: Fenghua Yu <fenghua.yu@intel.com>, Reinette Chatre <reinette.chatre@intel.com>, Peter Newman <peternewman@google.com>, Jonathan Corbet <corbet@lwn.net>, Shuah Khan <skhan@linuxfoundation.org>, x86@kernel.org
Cc: Shaopeng Tan <tan.shaopeng@fujitsu.com>, James Morse <james.morse@arm.com>, Jamie Iles <quic_jiles@quicinc.com>, Babu Moger <babu.moger@amd.com>, Randy Dunlap <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev, Tony Luck <tony.luck@intel.com>
Subject: [PATCH v13 0/8] Add support for Sub-NUMA cluster (SNC) systems
Date: Mon, 4 Dec 2023 10:53:49 -0800
Message-ID: <20231204185357.120501-1-tony.luck@intel.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231130003418.89964-1-tony.luck@intel.com>
References: <20231130003418.89964-1-tony.luck@intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | Add support for Sub-NUMA cluster (SNC) systems |
Message
Luck, Tony
Dec. 4, 2023, 6:53 p.m. UTC
The Sub-NUMA cluster feature on some Intel processors partitions the CPUs
that share an L3 cache into two or more sets. This plays havoc with the
Resource Director Technology (RDT) monitoring features. Prior to this
patch Intel has advised that SNC and RDT are incompatible.
Some of these CPUs support an MSR that can partition the RMID counters in
the same way. This allows monitoring features to be used, with the caveat
that users must be aware that Linux may migrate tasks more frequently
between SNC nodes than between "regular" NUMA nodes, so reading counters
from all SNC nodes may be needed to get a complete picture of activity
for tasks.
Cache and memory bandwidth allocation features continue to operate at
the scope of the L3 cache.
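To illustrate the caveat about task migration, here is a minimal userspace sketch that is not part of this series. It assumes the per-SNC-node counter values for one monitoring group have already been read (this cover letter does not say how they are exposed) and that the caller knows which SNC nodes share each L3 cache. The names `sum_by_l3`, `node_counts` and `node_to_l3` are hypothetical, and the numbers are made up.

```python
from collections import defaultdict


def sum_by_l3(node_counts, node_to_l3):
    """Add up per-SNC-node counter values for each L3 cache.

    node_counts: {snc_node_id: counter_value} for one monitoring group
    node_to_l3:  {snc_node_id: l3_cache_id}, assumed known to the caller
    """
    totals = defaultdict(int)
    for node, value in node_counts.items():
        totals[node_to_l3[node]] += value
    return dict(totals)


# Hypothetical system: two L3 caches, each split into two SNC nodes.
node_to_l3 = {0: 0, 1: 0, 2: 1, 3: 1}

# Made-up byte counts for a task that migrated between SNC nodes 0 and 1.
# Reading only node 0 would miss the activity recorded against node 1.
node_counts = {0: 4096, 1: 1024, 2: 0, 3: 0}

totals = sum_by_l3(node_counts, node_to_l3)
print(totals)
```

Summing over every SNC node that shares an L3 gives the same L3-scoped view that the allocation features still use.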
Signed-off-by: Tony Luck <tony.luck@intel.com>
Changes since v12:
All:
  Reinette  - put commit tags in right order for TIP (Tested-by before
              Reviewed-by)
Patch 7:
  Fam Zheng - Check for -1 return from get_cpu_cacheinfo_id() and
              increase size of bitmap tracking # of L3 instances.
  Reinette  - Add extra sanity checks. Note that this patch has
              some additional tweaks beyond the e-mail discussion.
              1) "3" is a valid return in addition to 1, 2, 4
              2) Added a warning if the sanity checks fail that
                 prints number of CPU nodes and number of L3 cache
                 instances that were found.
Patch 8:
  Babu      - Fix grammar with an additional comma.
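The Patch 7 checks listed above can be pictured with a rough sketch. This is a Python model for illustration only, not the kernel code. It assumes the nodes-per-L3 value is the number of CPU nodes divided by the number of L3 cache instances, and that a failed check falls back to 1 (no SNC); neither detail is stated in this cover letter. The function names are hypothetical.

```python
import warnings

# Per the changelog: 3 is a valid result in addition to 1, 2 and 4.
VALID_SNC_RATIOS = (1, 2, 3, 4)


def count_l3_instances(cache_ids):
    """Count distinct L3 cache ids, skipping CPUs whose lookup returned -1."""
    return len({cid for cid in cache_ids if cid != -1})


def snc_nodes_per_l3(num_cpu_nodes, cache_ids):
    """Return SNC nodes per L3 cache, or 1 if the sanity checks fail.

    ASSUMPTION: ratio = CPU nodes / L3 instances, fallback value is 1.
    """
    l3_count = count_l3_instances(cache_ids)
    ok = l3_count > 0 and num_cpu_nodes > 0 and num_cpu_nodes % l3_count == 0
    if ok:
        ratio = num_cpu_nodes // l3_count
        ok = ratio in VALID_SNC_RATIOS
    if not ok:
        # Mirror the changelog: report both counts when the checks fail.
        warnings.warn(
            f"SNC check failed: {num_cpu_nodes} CPU nodes, "
            f"{l3_count} L3 cache instances"
        )
        return 1
    return ratio


# Hypothetical inputs: one L3 id per CPU, -1 marks a failed lookup.
print(snc_nodes_per_l3(4, [0, 0, 1, 1]))   # four nodes over two L3 caches
print(snc_nodes_per_l3(3, [7, 7, -1, 7]))  # three nodes over one L3 cache
```

Skipping the -1 entries keeps a failed cache-id lookup from being counted as an extra L3 instance.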
Tony Luck (8):
x86/resctrl: Prepare for new domain scope
x86/resctrl: Prepare to split rdt_domain structure
x86/resctrl: Prepare for different scope for control/monitor
operations
x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
x86/resctrl: Add node-scope to the options for feature scope
x86/resctrl: Introduce snc_nodes_per_l3_cache
x86/resctrl: Sub NUMA Cluster detection and enable
x86/resctrl: Update documentation with Sub-NUMA cluster changes
 Documentation/arch/x86/resctrl.rst        |  25 +-
 include/linux/resctrl.h                   |  85 +++--
 arch/x86/include/asm/msr-index.h          |   1 +
 arch/x86/kernel/cpu/resctrl/internal.h    |  66 ++--
 arch/x86/kernel/cpu/resctrl/core.c        | 433 +++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  58 +--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  68 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  26 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 149 ++++----
 9 files changed, 629 insertions(+), 282 deletions(-)
base-commit: 2cc14f52aeb78ce3f29677c2de1f06c0e91471ab
Comments
Hi Tony,
Tested the series on AMD system. Just ran few basic tests. Everything looking good.
Thanks
Babu
> Tested the series on AMD system. Just ran few basic tests. Everything
> looking good.
Babu,
Thanks for testing. I'll add your Tested-by tag if[1] I make a v14.
-Tony
[1] realistically not if, but when :-(
Boris:
I've collected "Reviewed-by:" from Reinette for all patches. Babu sent a Tested-by for the series, and Reviewed-by for each patch just now. So it's ready to go into your to-be-reviewed queue.
Thanks
-Tony