[v2,0/2] NUMA aware page table allocation

Message ID 20221201195718.1409782-1-vipinsh@google.com
Series: NUMA aware page table allocation

Message

Vipin Sharma Dec. 1, 2022, 7:57 p.m. UTC
  Hi,

This series improves page table accesses by allocating page tables on
the same NUMA node where the underlying physical page resides.

Currently, page tables are allocated during page faults and page splits.
In both cases, the page table's location depends on the current thread's
mempolicy. This can lead to suboptimal placement of page tables on NUMA
nodes, for example, when the thread doing an eager page split is on a
different NUMA node than the page it is splitting.
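
Roughly, the idea is to look up the NUMA node of the underlying physical
page and allocate the page table page from that node. A minimal sketch in
kernel-style C (not the code in this series; kvm_pfn_to_refcounted_page()
is the helper named in the v2 changelog, the other names below are
illustrative only):

/* Pick the NUMA node of the pfn's underlying page, if it has one. */
static int pgtable_nid_for_pfn(kvm_pfn_t pfn)
{
        struct page *page = kvm_pfn_to_refcounted_page(pfn);

        /* Non-refcounted pfns keep the current allocation policy. */
        if (!page)
                return NUMA_NO_NODE;

        return page_to_nid(page);
}

/* Allocate a zeroed page table page on the requested node. */
static void *alloc_pgtable_page_on_node(int nid)
{
        /* NUMA_NO_NODE falls back to the local node. */
        struct page *page = alloc_pages_node(nid, GFP_KERNEL_ACCOUNT |
                                             __GFP_ZERO, 0);

        return page ? page_address(page) : NULL;
}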

Reviewers, please provide suggestions on the following:

1. The module parameter is true by default, which means this feature is
   enabled by default. Is this okay, or should I set it to false? (A
   sketch of the parameter wiring is included after this list.)

2. I haven't reduced KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE, as the impact
   should be small: only online nodes are filled during the topup phase,
   and in many cases some of these nodes will never be refilled again.
   Please let me know if you want this to be reduced.

3. I have tried to keep everything in x86/mmu except for some changes in
   virt/kvm/kvm_main.c. I used a __weak function so that only x86/mmu
   sees the change and other architectures are unaffected (see the
   sketch after this list). I hope this is the right approach.

4. I am not sure what the right way to split patch 2 is. If you think
   it is too big for a single patch, please let me know what you would
   prefer.
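
For point 1, a sketch of how such a parameter is typically wired up (the
name comes from the v2 changelog; the 0644 permissions and __read_mostly
placement are assumptions, not necessarily what the series does):

static bool __read_mostly numa_aware_pagetable = true;
module_param(numa_aware_pagetable, bool, 0644);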
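
For point 3, the __weak-override pattern being described looks roughly
like the sketch below; the hook name kvm_arch_mmu_alloc_page() is
hypothetical, not the function added by the series:

/* virt/kvm/kvm_main.c: weak default that all other architectures keep. */
void * __weak kvm_arch_mmu_alloc_page(int nid)
{
        return (void *)__get_free_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
}

/*
 * arch/x86/kvm/mmu/mmu.c: strong definition that overrides the weak one
 * and honors the NUMA node hint.
 */
void *kvm_arch_mmu_alloc_page(int nid)
{
        struct page *page = alloc_pages_node(nid, GFP_KERNEL_ACCOUNT |
                                             __GFP_ZERO, 0);

        return page ? page_address(page) : NULL;
}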

Thanks
Vipin

v2:
- All page table pages are now allocated on the underlying physical
  page's NUMA node.
- Introduced a module parameter, numa_aware_pagetable, to disable this
  feature.
- Use kvm_pfn_to_refcounted_page() to get the page from a pfn.

v1: https://lore.kernel.org/all/20220801151928.270380-1-vipinsh@google.com/

Vipin Sharma (2):
  KVM: x86/mmu: Allocate page table pages on TDP splits during dirty log
    enable on the underlying page's numa node
  KVM: x86/mmu: Allocate page table pages on NUMA node of underlying
    pages

 arch/x86/include/asm/kvm_host.h |   4 +-
 arch/x86/kvm/mmu/mmu.c          | 126 ++++++++++++++++++++++++--------
 arch/x86/kvm/mmu/paging_tmpl.h  |   4 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  26 ++++---
 include/linux/kvm_host.h        |  17 +++++
 include/linux/kvm_types.h       |   2 +
 virt/kvm/kvm_main.c             |   7 +-
 7 files changed, 141 insertions(+), 45 deletions(-)


base-commit: df0bb47baa95aad133820b149851d5b94cbc6790
  

Comments

David Matlack Dec. 9, 2022, 12:21 a.m. UTC | #1
On Thu, Dec 01, 2022 at 11:57:16AM -0800, Vipin Sharma wrote:
> 4. I am not sure what the right way to split patch 2 is. If you think
>    it is too big for a single patch, please let me know what you would
>    prefer.

I agree it's too big. The split_shadow_page_cache changes can easily be
split into a separate commit.