[v2,0/2] CXL: Apply SRAT defined PXM to entire CFMWS window

Message ID cover.1686712819.git.alison.schofield@intel.com

Message

Alison Schofield June 14, 2023, 4:35 a.m. UTC
  From: Alison Schofield <alison.schofield@intel.com>

Along with the changes in v2 listed below, Dan questioned the maintenance
burden of x86 not switching to use the memblock API. See Dan Williams &
Mike Rapoport discuss the issue in the v1 link. [1]

IIUC, switching existing x86 meminfo usage to memblock is pre-existing
outstanding work, and per Mike 'that's quite some work needed to make
that happen'. Since the memblock API doesn't support something like
numa_fill_memblks(), supporting it would be additional work on top.

So, with that open awaiting feedback from x86 maintainers, here's v2.


Changes in v2:

Patch 1/2: x86/numa: Introduce numa_fill_memblks()
- Update commit log with policy description. (Dan)
- Collect memblks with any HPA range intersect. (Dan)
- Adjust head or tail memblk to include, not align to, HPA range.
- Let the case of a single memblk simply fall through.
- Simplify the sort compare function to use start only.
- Rewrite and simplify the fill loop.
- Add code comment for exclusive memblk->end. (Dan)
- Add code comment for memblks being adjusted in place. (Dan)
- Add Tags: Reported-by, Suggested-by, Tested-by

Patch 2/2: ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window
- Add Tags: Reported-by, Suggested-by, Tested-by
- No changes in patch body.

[1] v1: https://lore.kernel.org/linux-cxl/cover.1684448934.git.alison.schofield@intel.com/

Cover Letter:

The CXL subsystem requires the creation of NUMA nodes for CFMWS
Windows not described in the SRAT. The existing implementation
only addresses windows that the SRAT describes completely or
not at all. This work addresses the case of partially described
CFMWS Windows by extending proximity domains in a portion of
a CFMWS window to the entire window.
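
As an illustration only (not code from this series), the three cases
above can be told apart by measuring how much of a window the described
ranges cover. The type and function names below are hypothetical, and
the sketch assumes the described ranges do not overlap each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified range type; end is exclusive. */
struct range_ex {
	uint64_t start;
	uint64_t end;
};

enum coverage { COVER_NONE, COVER_PARTIAL, COVER_FULL };

/*
 * Classify how much of [win_start, win_end) is covered by the given
 * ranges. Assumes the ranges do not overlap each other.
 */
static enum coverage classify_window(const struct range_ex *r, size_t n,
				     uint64_t win_start, uint64_t win_end)
{
	uint64_t covered = 0;

	assert(win_start < win_end);

	for (size_t i = 0; i < n; i++) {
		/* Clamp each range to the window before counting it. */
		uint64_t s = r[i].start > win_start ? r[i].start : win_start;
		uint64_t e = r[i].end < win_end ? r[i].end : win_end;

		if (s < e)
			covered += e - s;
	}
	if (covered == 0)
		return COVER_NONE;
	return covered == win_end - win_start ? COVER_FULL : COVER_PARTIAL;
}
```

In these terms, COVER_PARTIAL is the case this series adds handling for.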

Introduce a NUMA helper, numa_fill_memblks(), to fill gaps in
a numa_meminfo memblk address range. Update the CFMWS parsing
in the ACPI driver to use numa_fill_memblks() to extend SRAT
defined proximity domains to entire CXL windows.
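
For readers without the patch at hand, here is a minimal userspace
sketch of the gap-filling idea, following the v2 change list: collect
the memblks that intersect the range, sort by start, grow the head and
tail to include the range, then close the gaps in place. It is not the
code from patch 1. struct blk, fill_blks(), MAX_BLKS and the -1 return
value are assumptions made for this example, and the strict overlap
test is a choice of the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_BLKS 16

/* Hypothetical stand-in for a NUMA memblk; end is exclusive. */
struct blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/* Compare two memblk pointers by start address only. */
static int cmp_blk(const void *a, const void *b)
{
	const struct blk *x = *(const struct blk *const *)a;
	const struct blk *y = *(const struct blk *const *)b;

	return (x->start > y->start) - (x->start < y->start);
}

/*
 * Extend the blocks that intersect [start, end) so that together they
 * cover the whole range. Blocks are modified in place. Returns 0 on
 * success, -1 if no block intersects the range.
 */
static int fill_blks(struct blk *blks, int nr, uint64_t start, uint64_t end)
{
	struct blk *list[MAX_BLKS];
	uint64_t prev_end;
	int count = 0;

	assert(start < end && nr <= MAX_BLKS);

	/* Collect every block that intersects the range (ends are exclusive). */
	for (int i = 0; i < nr; i++)
		if (start < blks[i].end && end > blks[i].start)
			list[count++] = &blks[i];
	if (!count)
		return -1;

	qsort(list, count, sizeof(list[0]), cmp_blk);

	/* Head and tail grow to include the range; they never shrink. */
	if (list[0]->start > start)
		list[0]->start = start;
	if (list[count - 1]->end < end)
		list[count - 1]->end = end;

	/* Close each gap by pulling the later block's start back. */
	prev_end = list[0]->end;
	for (int i = 1; i < count; i++) {
		if (list[i]->start > prev_end)
			list[i]->start = prev_end;
		if (list[i]->end > prev_end)
			prev_end = list[i]->end;
	}
	return 0;
}
```

A single intersecting block needs no special case: the head and tail
adjustments both apply to it and the gap loop is skipped.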

An RFC of this patchset was previously posted for CXL folks'
review. [2] The RFC feedback led to the implementation here,
extending existing memblks (Dan). Also, both Jonathan and
Dan influenced the changelog comments in the ACPI patch, with
regard to setting expectations on this evolving heuristic.

Repeating here to set reviewer expectations:
*Note that this heuristic will evolve when CFMWS Windows present
a wider range of characteristics. The extension of the proximity
domain, implemented here, is likely a step in developing a more
sophisticated performance profile in the future.

[2] https://lore.kernel.org/linux-cxl/cover.1683742429.git.alison.schofield@intel.com/

Alison Schofield (2):
  x86/numa: Introduce numa_fill_memblks()
  ACPI: NUMA: Apply SRAT proximity domain to entire CFMWS window

 arch/x86/include/asm/sparsemem.h |  2 +
 arch/x86/mm/numa.c               | 87 ++++++++++++++++++++++++++++++++
 drivers/acpi/numa/srat.c         | 11 ++--
 include/linux/numa.h             |  7 +++
 4 files changed, 104 insertions(+), 3 deletions(-)


base-commit: 6e2e1e779912345f0b5f86ef01facc2802bd97cc
  

Comments

Peter Zijlstra June 14, 2023, 8:32 a.m. UTC | #1
On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> The CXL subsystem requires the creation of NUMA nodes for CFMWS

The thing is CXL some persistent memory thing, right? But what is this
CFMWS thing? I don't think I've ever seen that particular combination of
letters together.
  
Jonathan Cameron June 14, 2023, 9:10 a.m. UTC | #2
On Wed, 14 Jun 2023 10:32:40 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > The CXL subsystem requires the creation of NUMA nodes for CFMWS  
> 
> The thing is CXL some persistent memory thing, right? But what is this
> CFMWS thing? I don't think I've ever seen that particular combination of
> letters together.
> 
Hi Peter,

To save time before the US based folk wake up.

Both persistent and volatile memory are found on CXL devices (mostly
volatile on early devices).

CFMWS is the CXL Fixed Memory Window Structure, defined in 9.17.1.3 of CXL r3.0
and provided via an ACPI table (CEDT).  CFMWS, as a term, is sometimes abused in the kernel
(and here) for the window rather than the structure describing the window
(the S on the end).

CFMWS - A region of Host Physical Address (HPA) Space which routes accesses to CXL Host
bridges. A CFMWS describes interleaving as well (so multiple target host bridges).
If multiple interleave setups are available, then you'll see multiple CFMWS entries
- so different static regions of HPA can route to the same host bridges with different
interleave setups (decoding via the configurable part to hit different actual memory
on the downstream devices).
Where accesses are routed after that depends on the configurable parts
of the CXL topology (Host-Managed Device Memory (HDM) decoders in host bridges,
switches etc).  Note that a CFMWS address may route to nowhere if downstream
devices aren't available / configured yet.
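
A rough sketch of the fixed part of that routing, for illustration only:
each window is a static HPA range with its own interleave setup, and two
windows may name the same host bridges. The struct below is a
hypothetical simplification, not the CEDT layout, and target_bridge()
assumes plain modulo interleave with the granularity given in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one fixed memory window. */
struct fixed_window {
	uint64_t base_hpa;      /* first HPA decoded by the window */
	uint64_t size;          /* window length in bytes */
	unsigned int ways;      /* number of interleaved host bridges */
	uint64_t granularity;   /* bytes sent to one bridge before moving on */
	int targets[4];         /* made-up host bridge identifiers */
};

/* Return the window that decodes hpa, or NULL if none does. */
static const struct fixed_window *find_window(const struct fixed_window *w,
					      size_t n, uint64_t hpa)
{
	for (size_t i = 0; i < n; i++)
		if (hpa >= w[i].base_hpa && hpa - w[i].base_hpa < w[i].size)
			return &w[i];
	return NULL;
}

/* Pick the host bridge for hpa, assuming plain modulo interleave. */
static int target_bridge(const struct fixed_window *w, uint64_t hpa)
{
	uint64_t offset = hpa - w->base_hpa;

	return w->targets[(offset / w->granularity) % w->ways];
}
```

Everything past the host bridge (HDM decoders, switches) is the
configurable part and is deliberately left out of the sketch.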

CFMWS is how the CXL specification avoids defining interfaces for controlling
the mapping from host address space to CXL host bridges, as those vary so much
across host implementations and are not always configurable at runtime anyway.
It also includes a bunch of other details about the region (too many details
perhaps!)

Who does the configuration (BIOS / kernel) varies across implementations
and we have OS managed hotplug so the OS always has to do some of it
(personally I prefer the kernel doing everything :)
It's made messier by CXL 1.1 hosts where a lot less was discoverable so
generally the BIOS has to do the heavy lifting. For CXL 2.0 onwards the OS
'might' do everything except whatever is needed on the host to configure
the CXL Fixed Memory Windows it is advertising.

Note there is no requirement that the access characteristics of memory mapped
into a given CFMWS should be remotely consistent across the whole window
 - some of the window may route through switches, and to directly connected
   devices.
That's a simplifying assumption made today as we don't yet know the full
scope of what people are building.

Hope that helps (rather than causing confusion!)

Jonathan
  
Dan Williams June 14, 2023, 1:33 p.m. UTC | #3
Jonathan Cameron wrote:
> On Wed, 14 Jun 2023 10:32:40 +0200
> Peter Zijlstra <peterz@infradead.org> wrote:
> 
> > On Tue, Jun 13, 2023 at 09:35:23PM -0700, alison.schofield@intel.com wrote:
> > > The CXL subsystem requires the creation of NUMA nodes for CFMWS  
> > 
> > The thing is CXL some persistent memory thing, right? But what is this
> > CFMWS thing? I don't think I've ever seen that particular combination of
> > letters together.
> > 
> Hi Peter,
> 
> To save time before the US based folk wake up.
> 
[..]
> Note there is no requirement that the access characteristics of memory mapped
> into a given CFMWS should be remotely consistent across the whole window
>  - some of the window may route through switches, and to directly connected
>    devices.
> That's a simplifying assumption made today as we don't yet know the full
> scope of what people are building.
> 
> Hope that helps (rather than causing confusion!)

Thanks Jonathan! Patch 1 changelog also goes into more detail.