[v2,0/2] NUMA aware page table allocation

Message ID 20221201195718.1409782-1-vipinsh@google.com
Series: NUMA aware page table allocation

Message

Vipin Sharma Dec. 1, 2022, 7:57 p.m. UTC
  Hi,

This series improves page table accesses by allocating page tables on
the same NUMA node where the underlying physical page resides.

Currently, page tables are allocated during page faults and page splits.
In both cases, the page table's location depends on the current thread's
mempolicy. This can lead to suboptimal placement of page tables on NUMA
nodes, for example, when the thread doing an eager page split is on a
different NUMA node than the page it is splitting.
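
Roughly, the idea is to look up the NUMA node of the underlying physical
page and allocate the page table page from that node. A minimal sketch in
kernel-style C (not the code in this series; kvm_pfn_to_refcounted_page()
is the helper named in the v2 changelog, the other names below are
illustrative only):

/* Pick the NUMA node of the pfn's underlying page, if it has one. */
static int pgtable_nid_for_pfn(kvm_pfn_t pfn)
{
        struct page *page = kvm_pfn_to_refcounted_page(pfn);

        /* Non-refcounted pfns keep the current allocation policy. */
        if (!page)
                return NUMA_NO_NODE;

        return page_to_nid(page);
}

/* Allocate a zeroed page table page on the requested node. */
static void *alloc_pgtable_page_on_node(int nid)
{
        /* NUMA_NO_NODE falls back to the local node. */
        struct page *page = alloc_pages_node(nid, GFP_KERNEL_ACCOUNT |
                                             __GFP_ZERO, 0);

        return page ? page_address(page) : NULL;
}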

Reviewers, please provide suggestions on the following:

1. The module parameter is true by default, which means this feature is
   enabled by default. Is this okay, or should I set it to false? (A
   sketch of the parameter wiring is included after this list.)

2. I haven't reduced KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE, as the impact
   should be small: only online nodes are filled during the topup phase,
   and in many cases some of these nodes will never be refilled again.
   Please let me know if you want this to be reduced.

3. I have tried to keep everything in x86/mmu except for some changes in
   virt/kvm/kvm_main.c. I used a __weak function so that only x86/mmu
   sees the change and other architectures are unaffected (see the
   sketch after this list). I hope this is the right approach.

4. I am not sure what the right way to split patch 2 is. If you think
   it is too big for a single patch, please let me know what you would
   prefer.
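
For point 1, a sketch of how such a parameter is typically wired up (the
name comes from the v2 changelog; the 0644 permissions and __read_mostly
placement are assumptions, not necessarily what the series does):

static bool __read_mostly numa_aware_pagetable = true;
module_param(numa_aware_pagetable, bool, 0644);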
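
For point 3, the __weak-override pattern being described looks roughly
like the sketch below; the hook name kvm_arch_mmu_alloc_page() is
hypothetical, not the function added by the series:

/* virt/kvm/kvm_main.c: weak default that all other architectures keep. */
void * __weak kvm_arch_mmu_alloc_page(int nid)
{
        return (void *)__get_free_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
}

/*
 * arch/x86/kvm/mmu/mmu.c: strong definition that overrides the weak one
 * and honors the NUMA node hint.
 */
void *kvm_arch_mmu_alloc_page(int nid)
{
        struct page *page = alloc_pages_node(nid, GFP_KERNEL_ACCOUNT |
                                             __GFP_ZERO, 0);

        return page ? page_address(page) : NULL;
}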

Thanks
Vipin

v2:
- All page table pages are now allocated on the underlying physical
  page's NUMA node.
- Introduced a module parameter, numa_aware_pagetable, to disable this
  feature.
- Use kvm_pfn_to_refcounted_page() to get the page from a pfn.

v1: https://lore.kernel.org/all/20220801151928.270380-1-vipinsh@google.com/

Vipin Sharma (2):
  KVM: x86/mmu: Allocate page table pages on TDP splits during dirty log
    enable on the underlying page's numa node
  KVM: x86/mmu: Allocate page table pages on NUMA node of underlying
    pages

 arch/x86/include/asm/kvm_host.h |   4 +-
 arch/x86/kvm/mmu/mmu.c          | 126 ++++++++++++++++++++++++--------
 arch/x86/kvm/mmu/paging_tmpl.h  |   4 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  26 ++++---
 include/linux/kvm_host.h        |  17 +++++
 include/linux/kvm_types.h       |   2 +
 virt/kvm/kvm_main.c             |   7 +-
 7 files changed, 141 insertions(+), 45 deletions(-)


base-commit: df0bb47baa95aad133820b149851d5b94cbc6790
  

Comments

David Matlack Dec. 9, 2022, 12:21 a.m. UTC | #1
On Thu, Dec 01, 2022 at 11:57:16AM -0800, Vipin Sharma wrote:
> 4. I am not sure what the right way to split patch 2 is. If you think
>    it is too big for a single patch, please let me know what you would
>    prefer.

I agree it's too big. The split_shadow_page_cache changes can easily be
split into a separate commit.