[v2] KVM: x86: avoid memslot check in NX hugepage recovery if it cannot succeed
Commit Message
Since gfn_to_memslot() is relatively expensive, it helps to
skip it if the memslot cannot possibly have dirty logging
enabled. In order to do this, add to struct kvm a counter
of the number of memslots that have dirty logging enabled.
While the correct value can only be read with slots_lock taken,
the NX recovery thread is content with using an approximate
value. Therefore, the counter is an atomic_t.
Based on https://lore.kernel.org/kvm/20221027200316.2221027-2-dmatlack@google.com/
by David Matlack.
Supersedes: <20221117173109.3126912-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
v1->v2: actually works, using ideas from David's v1
arch/x86/kvm/mmu/mmu.c | 22 +++++++++++++++++++---
include/linux/kvm_host.h | 5 +++++
virt/kvm/kvm_main.c | 7 +++++++
3 files changed, 31 insertions(+), 3 deletions(-)
Comments
On Fri, Nov 18, 2022, Paolo Bonzini wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 43bbe4fde078..5d85f1a61793 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1603,6 +1603,8 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
> struct kvm_memory_slot *new,
> enum kvm_mr_change change)
> {
> + int old_flags = old ? old->flags : 0;
> + int new_flags = new ? new->flags : 0;
> int r;
>
> /*
> @@ -1627,6 +1629,11 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
> }
> }
>
> + if ((old_flags ^ new_flags) & KVM_MEM_LOG_DIRTY_PAGES) {
> + int change = (new_flags & KVM_MEM_LOG_DIRTY_PAGES) ? 1 : -1;
> + atomic_set(&kvm->nr_memslots_dirty_logging,
> + atomic_read(&kvm->nr_memslots_dirty_logging) + change);
Again, this needs to be done in the "commit" stage, and IMO should be x86-only.
https://lore.kernel.org/all/Y3bTu4%2FnUfpX+Enm@google.com
> + }
> r = kvm_arch_prepare_memory_region(kvm, old, new, change);
>
> /* Free the bitmap on failure if it was allocated above. */
> --
> 2.31.1
>
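The review comment above asks for the counter update to move to the "commit" stage. One likely reason, visible in the quoted hunk, is that kvm_arch_prepare_memory_region() runs after the update and can still fail, which would leave the counter changed for a memslot update that never took effect. The following is a minimal userspace sketch of that ordering, not kernel code: struct model_kvm, model_prepare(), model_commit(), model_set_region() and MODEL_LOG_DIRTY_PAGES are hypothetical stand-ins, and C11 <stdatomic.h> replaces the kernel's atomic_t.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_LOG_DIRTY_PAGES 1u /* stand-in for KVM_MEM_LOG_DIRTY_PAGES */

/* Hypothetical model of the one field of struct kvm that matters here. */
struct model_kvm {
	atomic_int nr_dirty_logging;
};

/* Prepare step: may fail, so it must not touch the counter. */
static int model_prepare(bool arch_prepare_fails)
{
	return arch_prepare_fails ? -1 : 0;
}

/* Commit step: cannot fail, so the counter always matches the slots. */
static void model_commit(struct model_kvm *kvm, unsigned int old_flags,
			 unsigned int new_flags)
{
	if ((old_flags ^ new_flags) & MODEL_LOG_DIRTY_PAGES) {
		int delta = (new_flags & MODEL_LOG_DIRTY_PAGES) ? 1 : -1;

		/*
		 * Read-then-set as in the patch; this assumes a single
		 * writer (slots_lock in the kernel), with lockless readers.
		 */
		atomic_store(&kvm->nr_dirty_logging,
			     atomic_load(&kvm->nr_dirty_logging) + delta);
	}
}

/* Models one memslot flag change: prepare first, commit only on success. */
static int model_set_region(struct model_kvm *kvm, unsigned int old_flags,
			    unsigned int new_flags, bool arch_prepare_fails)
{
	int r = model_prepare(arch_prepare_fails);

	if (r)
		return r;

	model_commit(kvm, old_flags, new_flags);
	return 0;
}
```

In this model, a call to model_set_region() whose prepare step fails leaves nr_dirty_logging untouched, while a successful call that sets or clears the flag moves it by exactly one.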
@@ -6878,16 +6878,32 @@ static void kvm_recover_nx_huge_pages(struct kvm *kvm)
WARN_ON_ONCE(!sp->nx_huge_page_disallowed);
WARN_ON_ONCE(!sp->role.direct);
- slot = gfn_to_memslot(kvm, sp->gfn);
- WARN_ON_ONCE(!slot);
-
/*
* Unaccount and do not attempt to recover any NX Huge Pages
* that are being dirty tracked, as they would just be faulted
* back in as 4KiB pages. The NX Huge Pages in this slot will be
* recovered, along with all the other huge pages in the slot,
* when dirty logging is disabled.
+ *
+ * Since gfn_to_memslot() is relatively expensive, it helps to
+ * skip it if the test cannot possibly return true. On the
+ * other hand, if any memslot has logging enabled, chances are
+ * good that all of them do, in which case unaccount_nx_huge_page()
+ * is much cheaper than zapping the page.
+ *
+ * If a memslot update is in progress, reading an incorrect value
+ * of kvm->nr_memslots_dirty_logging is not a problem: if it is
+ * becoming zero, gfn_to_memslot() will be done unnecessarily; if
+ * it is becoming nonzero, the page will be zapped unnecessarily.
+ * Either way, this only affects efficiency in racy situations,
+ * and not correctness.
*/
+ slot = NULL;
+ if (atomic_read(&kvm->nr_memslots_dirty_logging)) {
+ slot = gfn_to_memslot(kvm, sp->gfn);
+ WARN_ON_ONCE(!slot);
+ }
+
if (slot && kvm_slot_dirty_track_enabled(slot))
unaccount_nx_huge_page(kvm, sp);
else if (is_tdp_mmu_page(sp))
@@ -722,6 +722,11 @@ struct kvm {
/* The current active memslot set for each address space */
struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM];
struct xarray vcpu_array;
+ /*
+ * Protected by slots_lock, but can be read outside if an
+ * incorrect answer is acceptable.
+ */
+ atomic_t nr_memslots_dirty_logging;
/* Used to wait for completion of MMU notifiers. */
spinlock_t mn_invalidate_lock;
@@ -1603,6 +1603,8 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
struct kvm_memory_slot *new,
enum kvm_mr_change change)
{
+ int old_flags = old ? old->flags : 0;
+ int new_flags = new ? new->flags : 0;
int r;
/*
@@ -1627,6 +1629,11 @@ static int kvm_prepare_memory_region(struct kvm *kvm,
}
}
+ if ((old_flags ^ new_flags) & KVM_MEM_LOG_DIRTY_PAGES) {
+ int change = (new_flags & KVM_MEM_LOG_DIRTY_PAGES) ? 1 : -1;
+ atomic_set(&kvm->nr_memslots_dirty_logging,
+ atomic_read(&kvm->nr_memslots_dirty_logging) + change);
+ }
r = kvm_arch_prepare_memory_region(kvm, old, new, change);
/* Free the bitmap on failure if it was allocated above. */
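
The mmu.c hunk above only calls gfn_to_memslot() when the counter is nonzero. The following is a minimal userspace sketch of that gating, not kernel code: struct model_slot, struct model_vm, model_lookup() and model_should_unaccount() are hypothetical names, the linear search merely stands in for the real memslot lookup, and the lookups field exists only to make the skipped work observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct model_slot {
	unsigned long base_gfn;
	unsigned long npages;
	bool dirty_logging;
};

struct model_vm {
	struct model_slot slots[4];
	size_t nr_slots;
	atomic_int nr_slots_dirty_logging; /* models nr_memslots_dirty_logging */
	unsigned long lookups;             /* counts lookups, for illustration */
};

/* Stand-in for gfn_to_memslot(): a counted linear search. */
static struct model_slot *model_lookup(struct model_vm *vm, unsigned long gfn)
{
	vm->lookups++;
	for (size_t i = 0; i < vm->nr_slots; i++) {
		struct model_slot *s = &vm->slots[i];

		if (gfn >= s->base_gfn && gfn - s->base_gfn < s->npages)
			return s;
	}
	return NULL;
}

/*
 * Returns true if the page at gfn should be unaccounted rather than
 * zapped. The lookup is skipped when no slot can have logging enabled.
 */
static bool model_should_unaccount(struct model_vm *vm, unsigned long gfn)
{
	struct model_slot *slot = NULL;

	if (atomic_load(&vm->nr_slots_dirty_logging))
		slot = model_lookup(vm, gfn);

	return slot && slot->dirty_logging;
}
```

With the counter at zero, model_should_unaccount() returns false without performing a lookup. Once a slot enables logging and the counter is raised, every call pays for the lookup, including calls for pages in slots that are not being logged.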