Message ID | cover.1686712819.git.alison.schofield@intel.com |
---|---|
Headers |
From: alison.schofield@intel.com
To: "Rafael J. Wysocki" <rafael@kernel.org>, Len Brown <lenb@kernel.org>, Dan Williams <dan.j.williams@intel.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>, Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Andrew Morton <akpm@linux-foundation.org>, Jonathan Cameron <Jonathan.Cameron@huawei.com>, Dave Jiang <dave.jiang@intel.com>, Mike Rapoport <rppt@kernel.org>
Cc: Alison Schofield <alison.schofield@intel.com>, x86@kernel.org, linux-cxl@vger.kernel.org, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 0/2] CXL: Apply SRAT defined PXM to entire CFMWS window
Date: Tue, 13 Jun 2023 21:35:23 -0700
Message-Id: <cover.1686712819.git.alison.schofield@intel.com>
X-Mailer: git-send-email 2.39.2
List-ID: <linux-kernel.vger.kernel.org> |
Series | CXL: Apply SRAT defined PXM to entire CFMWS window |
Message
Alison Schofield
June 14, 2023, 4:35 a.m. UTC
From: Alison Schofield <alison.schofield@intel.com>
Along with the changes in v2 listed below, Dan questioned the maintenance
burden of x86 not switching to use the memblock API. See Dan Williams &
Mike Rapoport discuss the issue in the v1 link. [1]
IIUC, switching the existing x86 meminfo usage to memblock is pre-existing
outstanding work; per Mike, 'that's quite some work needed to make
that happen'. Since the memblock API doesn't support something like
numa_fill_memblks(), that support would have to be added on top.
So, with that open awaiting feedback from x86 maintainers, here's v2.
Changes in v2:
Patch 1/2: x86/numa: Introduce numa_fill_memblks()
- Update commit log with policy description. (Dan)
- Collect memblks with any HPA range intersect. (Dan)
- Adjust head or tail memblk to include, not align to, HPA range.
- Let the case of a single memblk simply fall through.
- Simplify the sort compare function to use start only.
- Rewrite and simplify the fill loop.
- Add code comment for exclusive memblk->end. (Dan)
- Add code comment for memblks being adjusted in place. (Dan)
- Add Tags: Reported-by, Suggested-by, Tested-by
Patch 2/2: ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window
- Add Tags: Reported-by, Suggested-by, Tested-by
- No changes in patch body.
[1] v1: https://lore.kernel.org/linux-cxl/cover.1684448934.git.alison.schofield@intel.com/
Cover Letter:
The CXL subsystem requires the creation of NUMA nodes for CFMWS
Windows not described in the SRAT. The existing implementation
only addresses windows that the SRAT describes completely or
not at all. This work addresses the case of partially described
CFMWS Windows by extending proximity domains in a portion of
a CFMWS window to the entire window.
Introduce a NUMA helper, numa_fill_memblks(), to fill gaps in
a numa_meminfo memblk address range. Update the CFMWS parsing
in the ACPI driver to use numa_fill_memblks() to extend SRAT
defined proximity domains to entire CXL windows.
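
For readers who want a picture of what "fill gaps" means here, the following is a minimal userspace sketch of the idea described in the v2 change list: collect the blocks that intersect the window, sort them by start, stretch the head and tail to include the window, then backfill gaps. It is not the code from patch 1/2. The names struct blk, fill_blks(), cmp_start() and MAX_BLKS are hypothetical, and the sketch assumes the blocks do not overlap each other and that end addresses are exclusive.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLKS 8	/* hypothetical limit, for this sketch only */

/* Hypothetical stand-in for a memblk; end is exclusive. */
struct blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Compare two pointers-to-blk by start address only. */
static int cmp_start(const void *a, const void *b)
{
	const struct blk *x = *(const struct blk *const *)a;
	const struct blk *y = *(const struct blk *const *)b;

	return (x->start > y->start) - (x->start < y->start);
}

/*
 * Extend the blocks that intersect [start, end) so that together they
 * cover the whole window. Blocks are adjusted in place.
 * Returns 0 on success, -1 if no block intersects the window.
 */
int fill_blks(struct blk *blks, int nr, uint64_t start, uint64_t end)
{
	struct blk *hit[MAX_BLKS];
	uint64_t prev_end;
	int count = 0;

	assert(start < end && nr <= MAX_BLKS);

	/* Collect every block with any intersect; both ends are exclusive. */
	for (int i = 0; i < nr; i++)
		if (start < blks[i].end && end > blks[i].start)
			hit[count++] = &blks[i];
	if (!count)
		return -1;

	qsort(hit, count, sizeof(hit[0]), cmp_start);

	/* Head and tail grow to include the window; they never shrink. */
	if (hit[0]->start > start)
		hit[0]->start = start;
	if (hit[count - 1]->end < end)
		hit[count - 1]->end = end;

	/* Backfill each gap by pulling the next block's start down. */
	prev_end = hit[0]->end;
	for (int i = 1; i < count; i++) {
		if (hit[i]->start > prev_end)
			hit[i]->start = prev_end;
		if (hit[i]->end > prev_end)
			prev_end = hit[i]->end;
	}
	return 0;
}
```

With blocks [0x1000, 0x2000) and [0x3000, 0x4000) and a window of [0x0800, 0x5000), the sketch leaves the first block as [0x0800, 0x2000) and the second as [0x2000, 0x5000); a block outside the window is untouched. In this sketch the gap is handed to the block that follows it, which is one possible policy and should be checked against the patch 1/2 commit log.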
An RFC of this patchset was previously posted for CXL folks'
review. [2] The RFC feedback led to the implementation here,
extending existing memblks (Dan). Also, both Jonathan and
Dan influenced the changelog comments in the ACPI patch, with
regard to setting expectations on this evolving heuristic.
Repeating here to set reviewer expectations:
*Note that this heuristic will evolve when CFMWS Windows present
a wider range of characteristics. The extension of the proximity
domain, implemented here, is likely a step in developing a more
sophisticated performance profile in the future.
[2] https://lore.kernel.org/linux-cxl/cover.1683742429.git.alison.schofield@intel.com/
Alison Schofield (2):
  x86/numa: Introduce numa_fill_memblks()
  ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window

 arch/x86/include/asm/sparsemem.h |  2 +
 arch/x86/mm/numa.c               | 87 ++++++++++++++++++++++++++++++++
 drivers/acpi/numa/srat.c         | 11 ++--
 include/linux/numa.h             |  7 +++
 4 files changed, 104 insertions(+), 3 deletions(-)

base-commit: 6e2e1e779912345f0b5f86ef01facc2802bd97cc
Comments
On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> The CXL subsystem requires the creation of NUMA nodes for CFMWS
The thing is CXL some persistent memory thing, right? But what is this
CFMWS thing? I don't think I've ever seen that particular combination of
letters together.
On Wed, 14 Jun 2023 10:32:40 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > The CXL subsystem requires the creation of NUMA nodes for CFMWS
>
> The thing is CXL some persistent memory thing, right? But what is this
> CFMWS thing? I don't think I've ever seen that particular combination of
> letters together.
>

Hi Peter,

To save time before the US based folk wake up.

Both persistent and volatile memory found on CXL devices (mostly volatile on
early devices).

CXL Fixed Memory Window (structure) (CFMWS - defined in 9.17.1.3 of CXL r3.0 -
via an ACPI table (CEDT).

CFMWS, as a term, is sometimes abused in the kernel (and here) for the window
rather than the structure describing the window (the S on the end).

CFMWS - A region of Host Physical Address (HPA) Space which routes accesses to
CXL Host bridges. A CFMWS describes interleaving as well (so multiple target
host bridges). If multiple interleave setups are available, then you'll see
multiple CFMWS entries - so different statically regions of HPA can route to
same host bridges with different interleave setups (decoding via the
configurable part to hit different actual memory on the downstream devices).

Where accesses are routed after that depends on the configurable parts of the
CXL topology (Host-Managed Device Memory (HDM) decoders in host bridges,
switches etc). Note that a CFMWS address may route to nowhere if downstream
devices aren't available / configured yet.

CFMWS is the CXL specification avoiding defining interfaces for controlling
the host address space to CXL host bridge mapping as those vary so much across
host implementations + not always configurable at runtime anyway. Also
includes a bunch of other details about the region (too many details perhaps!)

Who does the configuration (BIOS / kernel) varies across implementations and
we have OS managed hotplug so the OS always has to do some of it (personally I
prefer the kernel doing everything :) It's made messier by CXL 1.1 hosts where
a lot less was discoverable so generally the BIOS has to do the heavy lifting.
For CXL 2.0 onwards the OS 'might' do everything except whatever is needed on
the host to configure the CXL Fixed Memory Windows it is advertising.

Note there is no requirement that the access characteristics of memory mapped
into a given CFMWS should be remotely consistent across the whole window
- some of the window may route through switches, and to directly connected
devices.
That's a simplifying assumption made today as we don't yet know the full
scope of what people are building.

Hope that helps (rather than causing confusion!)

Jonathan
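
As a rough illustration of "a region of HPA space which routes accesses to CXL host bridges" with interleaving, here is a small sketch of how a window could pick a target for a given address. It assumes plain modulo interleave only, and the names struct fixed_window and window_target() are hypothetical. It does not parse the CEDT and ignores everything downstream of the host bridge (HDM decoders, switches).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified view of one fixed memory window. */
struct fixed_window {
	uint64_t base;		/* first HPA in the window */
	uint64_t size;		/* window length in bytes */
	unsigned int ways;	/* number of target host bridges */
	uint64_t granularity;	/* bytes routed to one target before moving on */
};

/*
 * Return the index of the target host bridge for hpa, or -1 if hpa is
 * outside the window. Assumes simple modulo interleave.
 */
int window_target(const struct fixed_window *w, uint64_t hpa)
{
	assert(w->ways > 0 && w->granularity > 0);

	if (hpa < w->base || hpa - w->base >= w->size)
		return -1;

	return (int)(((hpa - w->base) / w->granularity) % w->ways);
}
```

Under these assumptions, a 2-way window with 256 byte granularity sends the first 256 bytes to target 0, the next 256 bytes to target 1, and so on. Two windows with different interleave settings can list the same host bridges, which matches the "multiple CFMWS entries" case described above.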
Jonathan Cameron wrote:
> On Wed, 14 Jun 2023 10:32:40 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > > The CXL subsystem requires the creation of NUMA nodes for CFMWS
> >
> > The thing is CXL some persistent memory thing, right? But what is this
> > CFMWS thing? I don't think I've ever seen that particular combination of
> > letters together.
> >
> Hi Peter,
>
> To save time before the US based folk wake up.
>
[..]
> Note there is no requirement that the access characteristics of memory mapped
> into a given CFMWS should be remotely consistent across the whole window
> - some of the window may route through switches, and to directly connected
> devices.
> That's a simplifying assumption made today as we don't yet know the full
> scope of what people are building.
>
> Hope that helps (rather than causing confusion!)

Thanks Jonathan! Patch 1 changelog also goes into more detail.