[v4,05/18] KVM: x86/mmu: Add split_shadow_page_cache pages to global count of MMU cache pages

Message ID 20230306224127.1689967-6-vipinsh@google.com
State New
Headers
Series NUMA aware page table allocation |

Commit Message

Vipin Sharma March 6, 2023, 10:41 p.m. UTC
  Add pages in split_shadow_page_cache to the global counter
kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker
in future commit.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
---
 arch/x86/kvm/mmu/mmu.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
  

Comments

Zhi Wang March 9, 2023, 3:58 p.m. UTC | #1
On Mon,  6 Mar 2023 14:41:14 -0800
Vipin Sharma <vipinsh@google.com> wrote:

> Add pages in split_shadow_page_cache to the global counter
> kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker
> in future commit.
> 
> Signed-off-by: Vipin Sharma <vipinsh@google.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index df8dcb7e5de7..0ebb8a2eaf47 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm)
>  {
>  	kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
>  	kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
> -	kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
> +	mutex_lock(&kvm->slots_lock);
> +	mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache);
> +	mutex_unlock(&kvm->slots_lock);

Taking the lock of the calling path in the layer of cache topping/free layer
seems off.

My vote goes to have a lock for each cache and take the lock of the cache when
topping/free the cache. It is more self-contained and architecturally nice.

>  }
>  
>  void kvm_mmu_uninit_vm(struct kvm *kvm)
> @@ -6303,7 +6305,7 @@ static int topup_split_caches(struct kvm *kvm)
>  	if (r)
>  		return r;
>  
> -	return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
> +	return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
>  }
>  
>  static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u64 *huge_sptep)
> @@ -6328,6 +6330,7 @@ static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u64 *hu
>  	/* Direct SPs do not require a shadowed_info_cache. */
>  	caches.page_header_cache = &kvm->arch.split_page_header_cache;
>  	caches.shadow_page_cache = &kvm->arch.split_shadow_page_cache;
> +	caches.count_shadow_page_allocation = true;
>  
>  	/* Safe to pass NULL for vCPU since requesting a direct SP. */
>  	return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role);
  
Vipin Sharma March 9, 2023, 7:59 p.m. UTC | #2
On Thu, Mar 9, 2023 at 7:58 AM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
>
> On Mon,  6 Mar 2023 14:41:14 -0800
> Vipin Sharma <vipinsh@google.com> wrote:
>
> > Add pages in split_shadow_page_cache to the global counter
> > kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker
> > in future commit.
> >
> > Signed-off-by: Vipin Sharma <vipinsh@google.com>
> > ---
> >  arch/x86/kvm/mmu/mmu.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index df8dcb7e5de7..0ebb8a2eaf47 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm)
> >  {
> >       kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
> >       kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
> > -     kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
> > +     mutex_lock(&kvm->slots_lock);
> > +     mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache);
> > +     mutex_unlock(&kvm->slots_lock);
>
> Taking the lock of the calling path in the layer of cache topping/free layer
> seems off.
>
> My vote goes to have a lock for each cache and take the lock of the cache when
> topping/free the cache. It is more self-contained and architecturally nice.
>

Yeah, this can be one way. However, in future patches when I am adding
per NUMA node cache, it will add up a lot of locks for the same code
path before a topup. In split huge page case we know what NUMA node we
need to allocate from so we can fine tune which lock to take but  in
fault path code we don't know what NUMA node the page will be coming
from so we need to topup all of the NUMA caches. Having a single lock
simplifies code a little bit.

I agree with you on being more self-contained. I will wait for others
to also chime in on this and go from there.
  
David Matlack March 10, 2023, 12:05 a.m. UTC | #3
On Thu, Mar 09, 2023 at 11:59:00AM -0800, Vipin Sharma wrote:
> On Thu, Mar 9, 2023 at 7:58 AM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
> >
> > On Mon,  6 Mar 2023 14:41:14 -0800
> > Vipin Sharma <vipinsh@google.com> wrote:
> >
> > > Add pages in split_shadow_page_cache to the global counter
> > > kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker
> > > in future commit.
> > >
> > > Signed-off-by: Vipin Sharma <vipinsh@google.com>
> > > ---
> > >  arch/x86/kvm/mmu/mmu.c | 7 +++++--
> > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index df8dcb7e5de7..0ebb8a2eaf47 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm)
> > >  {
> > >       kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
> > >       kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
> > > -     kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
> > > +     mutex_lock(&kvm->slots_lock);
> > > +     mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache);
> > > +     mutex_unlock(&kvm->slots_lock);
> >
> > Taking the lock of the calling path in the layer of cache topping/free layer
> > seems off.
> >
> > My vote goes to have a lock for each cache and take the lock of the cache when
> > topping/free the cache. It is more self-contained and architecturally nice.
> >
> 
> Yeah, this can be one way. However, in future patches when I am adding
> per NUMA node cache, it will add up a lot of locks for the same code
> path before a topup. In split huge page case we know what NUMA node we
> need to allocate from so we can fine tune which lock to take but  in
> fault path code we don't know what NUMA node the page will be coming
> from so we need to topup all of the NUMA caches. Having a single lock
> simplifies code a little bit.
> 
> I agree with you on being more self-contained. I will wait for others
> to also chime in on this and go from there.

As a general rule, please only added locking when it's needed. Adding
the lock in this commit is just confusing.

But that aside, I don't think acquiring the slots lock is even needed in
this commit.  mmu_free_vm_memory_caches() is never called while the the
VM is on vm_list. i.e. This can never race with the shrinker.

If you want to be paranoid you can add a WARN to ensure that stays true
going forward:

  /* ... comment ... */
  WARN_ON_ONCE(!list_empty(&kvm->vm_list));
  
David Matlack March 10, 2023, 12:06 a.m. UTC | #4
On Thu, Mar 9, 2023 at 4:05 PM David Matlack <dmatlack@google.com> wrote:
>
> On Thu, Mar 09, 2023 at 11:59:00AM -0800, Vipin Sharma wrote:
> > On Thu, Mar 9, 2023 at 7:58 AM Zhi Wang <zhi.wang.linux@gmail.com> wrote:
> > >
> > > On Mon,  6 Mar 2023 14:41:14 -0800
> > > Vipin Sharma <vipinsh@google.com> wrote:
> > >
> > > > Add pages in split_shadow_page_cache to the global counter
> > > > kvm_total_unused_cached_pages. These pages will be freed by MMU shrinker
> > > > in future commit.
> > > >
> > > > Signed-off-by: Vipin Sharma <vipinsh@google.com>
> > > > ---
> > > >  arch/x86/kvm/mmu/mmu.c | 7 +++++--
> > > >  1 file changed, 5 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index df8dcb7e5de7..0ebb8a2eaf47 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -6149,7 +6149,9 @@ static void mmu_free_vm_memory_caches(struct kvm *kvm)
> > > >  {
> > > >       kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
> > > >       kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
> > > > -     kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
> > > > +     mutex_lock(&kvm->slots_lock);
> > > > +     mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache);
> > > > +     mutex_unlock(&kvm->slots_lock);
> > >
> > > Taking the lock of the calling path in the layer of cache topping/free layer
> > > seems off.
> > >
> > > My vote goes to have a lock for each cache and take the lock of the cache when
> > > topping/free the cache. It is more self-contained and architecturally nice.
> > >
> >
> > Yeah, this can be one way. However, in future patches when I am adding
> > per NUMA node cache, it will add up a lot of locks for the same code
> > path before a topup. In split huge page case we know what NUMA node we
> > need to allocate from so we can fine tune which lock to take but  in
> > fault path code we don't know what NUMA node the page will be coming
> > from so we need to topup all of the NUMA caches. Having a single lock
> > simplifies code a little bit.
> >
> > I agree with you on being more self-contained. I will wait for others
> > to also chime in on this and go from there.
>
> As a general rule, please only added locking when it's needed. Adding
> the lock in this commit is just confusing.
>
> But that aside, I don't think acquiring the slots lock is even needed in
> this commit.

Correction: even needed in the *next* commit

> mmu_free_vm_memory_caches() is never called while the the
> VM is on vm_list. i.e. This can never race with the shrinker.
>
> If you want to be paranoid you can add a WARN to ensure that stays true
> going forward:
>
>   /* ... comment ... */
>   WARN_ON_ONCE(!list_empty(&kvm->vm_list));
  

Patch

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index df8dcb7e5de7..0ebb8a2eaf47 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6149,7 +6149,9 @@  static void mmu_free_vm_memory_caches(struct kvm *kvm)
 {
 	kvm_mmu_free_memory_cache(&kvm->arch.split_desc_cache);
 	kvm_mmu_free_memory_cache(&kvm->arch.split_page_header_cache);
-	kvm_mmu_free_memory_cache(&kvm->arch.split_shadow_page_cache);
+	mutex_lock(&kvm->slots_lock);
+	mmu_free_sp_memory_cache(&kvm->arch.split_shadow_page_cache);
+	mutex_unlock(&kvm->slots_lock);
 }
 
 void kvm_mmu_uninit_vm(struct kvm *kvm)
@@ -6303,7 +6305,7 @@  static int topup_split_caches(struct kvm *kvm)
 	if (r)
 		return r;
 
-	return kvm_mmu_topup_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
+	return mmu_topup_sp_memory_cache(&kvm->arch.split_shadow_page_cache, 1);
 }
 
 static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u64 *huge_sptep)
@@ -6328,6 +6330,7 @@  static struct kvm_mmu_page *shadow_mmu_get_sp_for_split(struct kvm *kvm, u64 *hu
 	/* Direct SPs do not require a shadowed_info_cache. */
 	caches.page_header_cache = &kvm->arch.split_page_header_cache;
 	caches.shadow_page_cache = &kvm->arch.split_shadow_page_cache;
+	caches.count_shadow_page_allocation = true;
 
 	/* Safe to pass NULL for vCPU since requesting a direct SP. */
 	return __kvm_mmu_get_shadow_page(kvm, NULL, &caches, gfn, role);