[v3,0/7] Optimize clear dirty log

Message ID	20230211014626.3659152-1-vipinsh@google.com
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Date: Fri, 10 Feb 2023 17:46:19 -0800 Mime-Version: 1.0 Message-ID: <20230211014626.3659152-1-vipinsh@google.com> Subject: [Patch v3 0/7] Optimize clear dirty log From: Vipin Sharma <vipinsh@google.com> To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com, dmatlack@google.com Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Vipin Sharma <vipinsh@google.com> Content-Type: text/plain; charset="UTF-8" Precedence: bulk
Series	Optimize clear dirty log \| [v3,0/7] Optimize clear dirty log [v3,1/7] KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic write [v3,2/7] KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log flow [v3,3/7] KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte() [v3,4/7] KVM: x86/mmu: Optimize SPTE change for aging gfn range [v3,5/7] KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte() [v3,6/7] KVM: x86/mmu: Remove handle_changed_spte_dirty_log() [v3,7/7] KVM: x86/mmu: Merge all handle_changed_pte* functions.

Message ID

20230211014626.3659152-1-vipinsh@google.com

Headers

Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org
 designates 2620:137:e000::1:20 as permitted sender)
 client-ip=2620:137:e000::1:20;
Date: Fri, 10 Feb 2023 17:46:19 -0800
Mime-Version: 1.0
Message-ID: <20230211014626.3659152-1-vipinsh@google.com>
Subject: [Patch v3 0/7] Optimize clear dirty log
From: Vipin Sharma <vipinsh@google.com>
To: seanjc@google.com, pbonzini@redhat.com, bgardon@google.com,
        dmatlack@google.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
        Vipin Sharma <vipinsh@google.com>
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk

Series

Optimize clear dirty log |

Message

Vipin Sharma Feb. 11, 2023, 1:46 a.m. UTC

  Hi,


This patch series has optimized control flow of clearing dirty log and
improved its performance by ~40% (2% more than v2).

It also got rid of many variants of the handle_changed_spte family of
functions and converged logic to one handle_changed_spte() function. It
also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
booleans for controlling them.

Thanks,
Vipin

v3:
- Tried to do better job at writing commit messages.
- Made kvm_tdp_mmu_clear_spte_bits() similar to the kvm_tdp_mmu_write_spte().
- clear_dirty_pt_masked() evaluates mask for the bit to be cleared outside the
  loop and use that for all of the SPTEs instead of calculating for each SPTE.
- Some naming changes based on the feedbacks.
- Split out the dead code clean from the optimization code.


v2: https://lore.kernel.org/lkml/20230203192822.106773-1-vipinsh@google.com/
- Clear dirty log and age gfn range does not go through
  handle_changed_spte, they handle their SPTE changes locally to improve
  their speed.
- Clear only specific bits atomically when updating SPTEs in clearing
  dirty log and aging gfn range functions.
- Removed tdp_mmu_set_spte_no_[acc_track|dirty_log] APIs.
- Converged all handle_changed_spte related functions to one place.

v1: https://lore.kernel.org/lkml/20230125213857.824959-1-vipinsh@google.com/


Vipin Sharma (7):
  KVM: x86/mmu: Add a helper function to check if an SPTE needs atomic
    write
  KVM: x86/mmu: Atomically clear SPTE dirty state in the clear-dirty-log
    flow
  KVM: x86/mmu: Remove "record_dirty_log" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Optimize SPTE change for aging gfn range
  KVM: x86/mmu: Remove "record_acc_track" in __tdp_mmu_set_spte()
  KVM: x86/mmu: Remove handle_changed_spte_dirty_log()
  KVM: x86/mmu: Merge all handle_changed_pte* functions.

 arch/x86/kvm/mmu/tdp_iter.h |  48 ++++++---
 arch/x86/kvm/mmu/tdp_mmu.c  | 190 ++++++++++++------------------------
 2 files changed, 96 insertions(+), 142 deletions(-)

Comments

Sean Christopherson March 17, 2023, 10:57 p.m. UTC | #1

On Fri, Feb 10, 2023, Vipin Sharma wrote:
> This patch series has optimized control flow of clearing dirty log and
> improved its performance by ~40% (2% more than v2).
> 
> It also got rid of many variants of the handle_changed_spte family of
> functions and converged logic to one handle_changed_spte() function. It
> also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> booleans for controlling them.
>
> v3:
> - Tried to do better job at writing commit messages.

LOL, that's the spirit!

Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
various nits when applying.  I'll also add a link in patch 2 to the discussion
about why we determined that bypassing __tdp_mmu_set_spte() is safe; that's critical
information that isn't captured in the changelog.

Vipin Sharma March 17, 2023, 11:51 p.m. UTC | #2

On Fri, Mar 17, 2023 at 3:57 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > This patch series has optimized control flow of clearing dirty log and
> > improved its performance by ~40% (2% more than v2).
> >
> > It also got rid of many variants of the handle_changed_spte family of
> > functions and converged logic to one handle_changed_spte() function. It
> > also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> > booleans for controlling them.
> >
> > v3:
> > - Tried to do better job at writing commit messages.
>
> LOL, that's the spirit!
>
> Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> various nits when applying.  I'll also add a link in patch 2 to the discussion

Yeah, he is too demanding! :p

> about why we determined that bypassing __tdp_mmu_set_spte() is safe; that's critical
> information that isn't captured in the changelog.

Thanks!

Sean Christopherson March 21, 2023, 12:41 a.m. UTC | #3

On Fri, Mar 17, 2023, Sean Christopherson wrote:
> On Fri, Feb 10, 2023, Vipin Sharma wrote:
> > This patch series has optimized control flow of clearing dirty log and
> > improved its performance by ~40% (2% more than v2).
> > 
> > It also got rid of many variants of the handle_changed_spte family of
> > functions and converged logic to one handle_changed_spte() function. It
> > also remove tdp_mmu_set_spte_no_[acc_track|dirty_log] and various
> > booleans for controlling them.
> >
> > v3:
> > - Tried to do better job at writing commit messages.
> 
> LOL, that's the spirit!
> 
> Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> various nits when applying.

Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
but damage done, and it's my fault for throwing you a big blob of code in the
first place.

I ended up splitting the "interesting" patches into three each:

  1. Switch to the atomic-AND
  2. Drop the access-tracking / dirty-logging (as appropriate)
  3. Drop the call to __handle_changed_spte()

because logically they are three different things (although obviously related).

I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
can take a look in the meantime to make sure I didn't do something completely
boneheaded, it'd be much appreciated.

Vipin Sharma March 21, 2023, 6:11 p.m. UTC | #4

On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > various nits when applying.
>
> Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> but damage done, and it's my fault for throwing you a big blob of code in the
> first place.
>
> I ended up splitting the "interesting" patches into three each:
>
>   1. Switch to the atomic-AND
>   2. Drop the access-tracking / dirty-logging (as appropriate)
>   3. Drop the call to __handle_changed_spte()
>
> because logically they are three different things (although obviously related).
>
> I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> can take a look in the meantime to make sure I didn't do something completely
> boneheaded, it'd be much appreciated.


Thanks for refactoring the patches. I reviewed the commits, no obvious
red flags from my side. Few small nits I found:

commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
if TDP MMU SPTEs need wrprot")
 - kvm_ad_enabled() should be outside the loop.

commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
in the clear-dirty-log flow")
 - MMU_WARN_ON(kvm_ad_enabled() &&
spte_ad_need_write_protect(iter.old_spte) should be after
if(iter.level > PG_LEVEL_4k...)

commit 93c375bb6aea ("KVM: x86/mmu: Bypass __handle_changed_spte()
when clearing TDP MMU dirty bits")
 - Needs new performance numbers. Adding MMU_WARN_ON() might change
numbers. I will run a perf test on your mmu branch and see if
something changes a lot.

Sean Christopherson March 21, 2023, 7:57 p.m. UTC | #5

On Tue, Mar 21, 2023, Vipin Sharma wrote:
> On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > > various nits when applying.
> >
> > Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> > but damage done, and it's my fault for throwing you a big blob of code in the
> > first place.
> >
> > I ended up splitting the "interesting" patches into three each:
> >
> >   1. Switch to the atomic-AND
> >   2. Drop the access-tracking / dirty-logging (as appropriate)
> >   3. Drop the call to __handle_changed_spte()
> >
> > because logically they are three different things (although obviously related).
> >
> > I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> > sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> > can take a look in the meantime to make sure I didn't do something completely
> > boneheaded, it'd be much appreciated.
> 
> 
> Thanks for refactoring the patches. I reviewed the commits, no obvious
> red flags from my side. Few small nits I found:
> 
> commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
> if TDP MMU SPTEs need wrprot")
>  - kvm_ad_enabled() should be outside the loop.

Hmm, I deliberately left it inside the loop, but I agree that it would be better
to hoist it out in that commit.

> commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
> in the clear-dirty-log flow")
>  - MMU_WARN_ON(kvm_ad_enabled() &&
> spte_ad_need_write_protect(iter.old_spte) should be after
> if(iter.level > PG_LEVEL_4k...)

Ah, hrm.  This was also deliberate, but looking at the diff I agree that relative
to the diff, it's an unnecessary/unrelated change.  I think what I'll do is
land the assertion above the "if (iter.level > PG_LEVEL_4K ||" in the above
commit that switches to kvm_ad_enabled().  That way there shouldn't be any change
for the assertion in this commit.

> commit 93c375bb6aea ("KVM: x86/mmu: Bypass __handle_changed_spte()
> when clearing TDP MMU dirty bits")
>  - Needs new performance numbers. Adding MMU_WARN_ON() might change
> numbers. I will run a perf test on your mmu branch and see if
> something changes a lot.

It won't.  MMU_WARN_ON() is dead code without manual modification to define MMU_DEBUG.
Part of the reason I used MMU_WARN_ON() was to remind myself to send a patch/series
to overhaul MMU_WARN_ON[*].  My thought/hope is that a Kconfig will allow developers
and testers to run with a pile of assertions and sanity checks without impacting
the runtime overhead for production builds.

[*] https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com/

Sean Christopherson March 21, 2023, 8:40 p.m. UTC | #6

On Tue, Mar 21, 2023, Sean Christopherson wrote:
> On Tue, Mar 21, 2023, Vipin Sharma wrote:
> > On Mon, Mar 20, 2023 at 5:41 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Fri, Mar 17, 2023, Sean Christopherson wrote:
> > > > Did a cursory glance, looks good.  I'll do a more thorough pass next week and get
> > > > it queued up if all goes well.  No need for a v4 at this point, I'll fixup David's
> > > > various nits when applying.
> > >
> > > Ooof, that ended up being painful.  In hindsight, I should have asked for a v4,
> > > but damage done, and it's my fault for throwing you a big blob of code in the
> > > first place.
> > >
> > > I ended up splitting the "interesting" patches into three each:
> > >
> > >   1. Switch to the atomic-AND
> > >   2. Drop the access-tracking / dirty-logging (as appropriate)
> > >   3. Drop the call to __handle_changed_spte()
> > >
> > > because logically they are three different things (although obviously related).
> > >
> > > I have pushed the result to kvm-x86/mmu, but haven't merged to kvm-x86/next or
> > > sent thanks because it's not yet tested.  I'll do testing tomorrow, but if you
> > > can take a look in the meantime to make sure I didn't do something completely
> > > boneheaded, it'd be much appreciated.
> > 
> > 
> > Thanks for refactoring the patches. I reviewed the commits, no obvious
> > red flags from my side. Few small nits I found:
> > 
> > commit e534a94eac07 ("KVM: x86/mmu: Use kvm_ad_enabled() to determine
> > if TDP MMU SPTEs need wrprot")
> >  - kvm_ad_enabled() should be outside the loop.
> 
> Hmm, I deliberately left it inside the loop, but I agree that it would be better
> to hoist it out in that commit.
> 
> > commit 69032b5d71ef (" KVM: x86/mmu: Atomically clear SPTE dirty state
> > in the clear-dirty-log flow")
> >  - MMU_WARN_ON(kvm_ad_enabled() &&
> > spte_ad_need_write_protect(iter.old_spte) should be after
> > if(iter.level > PG_LEVEL_4k...)
> 
> Ah, hrm.  This was also deliberate, but looking at the diff I agree that relative
> to the diff, it's an unnecessary/unrelated change.  I think what I'll do is
> land the assertion above the "if (iter.level > PG_LEVEL_4K ||" in the above
> commit that switches to kvm_ad_enabled().  That way there shouldn't be any change
> for the assertion in this commit.

Aha!  Even better, split this into yet one more patch to dedup the guts before
switching to the atomic-AND, and give clear_dirty_gfn_range() the same treatment.
That further isolates the changes, provides solid justification for hoisting the
kvm_ad_enabled() check out of the loop (it's basically guaranteed to be a single
memory read that hits the L1), and keeps clear_dirty_gfn_range() and
clear_dirty_pt_masked() as similar as is reasonably possible.

Speaking of which, I'll send a patch to remove the redundant is_shadow_present_pte()
check in clear_dirty_gfn_range(), that's already handled by tdp_root_for_each_leaf_pte().

Sean Christopherson March 21, 2023, 9:38 p.m. UTC | #7

On Tue, Mar 21, 2023, Sean Christopherson wrote:
> It won't.  MMU_WARN_ON() is dead code without manual modification to define MMU_DEBUG.
> Part of the reason I used MMU_WARN_ON() was to remind myself to send a patch/series
> to overhaul MMU_WARN_ON[*].  My thought/hope is that a Kconfig will allow developers
> and testers to run with a pile of assertions and sanity checks without impacting
> the runtime overhead for production builds.
> 
> [*] https://lore.kernel.org/all/Yz4Qi7cn7TWTWQjj@google.com/

Ugh, I'm definitely sending that patch, MMU_DEBUG has bitrotted and broken the
build yet again.

arch/x86/kvm/mmu/mmu.c: In function ‘kvm_mmu_free_shadow_page’:
arch/x86/kvm/mmu/mmu.c:1738:15: error: implicit declaration of function ‘is_empty_shadow_page’; did you mean ‘to_shadow_page’? [-Werror=implicit-function-declaration]
 1738 |  MMU_WARN_ON(!is_empty_shadow_page(sp->spt));
      |               ^~~~~~~~~~~~~~~~~~~~
include/asm-generic/bug.h:110:25: note: in definition of macro ‘WARN_ON_ONCE’
  110 |  int __ret_warn_on = !!(condition);   \
      |                         ^~~~~~~~~
arch/x86/kvm/mmu/mmu.c:1738:2: note: in expansion of macro ‘MMU_WARN_ON’
 1738 |  MMU_WARN_ON(!is_empty_shadow_page(sp->spt));
      |  ^~~~~~~~~~~