[v8,05/15] KVM: pfncache: remove KVM_GUEST_USES_PFN usage
Commit Message
From: Paul Durrant <pdurrant@amazon.com>
As noted in [1], the KVM_GUEST_USES_PFN usage flag is never set by any
caller of kvm_gpc_init(), which also makes the 'vcpu' argument redundant.
Moreover, all existing callers specify KVM_HOST_USES_PFN, so the usage
check in hva_to_pfn_retry() and hence the 'usage' argument to
kvm_gpc_init() are also redundant.
Remove the pfn_cache_usage enumeration, the redundant kvm_gpc_init()
arguments, the corresponding fields of struct gfn_to_pfn_cache, and all
the related code.
[1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com/
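For illustration, this is what the change looks like at a call site (taken
from the arch/x86/kvm/x86.c hunk below); the vCPU and usage arguments simply
go away:

	/* Before: a usage flag and a (never-used) vCPU had to be supplied. */
	kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN);

	/* After: only the cache and the owning VM are needed. */
	kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm);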
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: x86@kernel.org
v8:
- New in this version.
---
arch/x86/kvm/x86.c | 2 +-
arch/x86/kvm/xen.c | 14 ++++-----
include/linux/kvm_host.h | 11 +------
include/linux/kvm_types.h | 8 -----
virt/kvm/pfncache.c | 61 ++++++---------------------------------
5 files changed, 16 insertions(+), 80 deletions(-)
Comments
On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
>
> As noted in [1] the KVM_GUEST_USES_PFN usage flag is never set by any
> callers of kvm_gpc_init(), which also makes the 'vcpu' argument redundant.
> Moreover, all existing callers specify KVM_HOST_USES_PFN so the usage
> check in hva_to_pfn_retry() and hence the 'usage' argument to
> kvm_gpc_init() are also redundant.
> Remove the pfn_cache_usage enumeration and remove the redundant arguments,
> fields of struct gfn_to_hva_cache, and all the related code.
>
> [1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com/
>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
I think it's https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/
which is the key reference. I'm not sure I'm 100% on board, but I never
got round to replying to Sean's email because it was one of those "put
up or shut up" situations and I didn't have the bandwidth to actually
write the code to prove my point.
I think it *is* important to support non-pinned pages. There's a reason
we even made the vAPIC page migratable. We want to support memory
hotplug, and we want to cope with machine checks telling us to move certain
pages (which I suppose is memory hotplug). See commit 38b9917350cb
("kvm: vmx: Implement set_apic_access_page_addr") for example.
I agree that in the first round of the nVMX code there were bugs. And
sure, of *course* it isn't sufficient to wire up the invalidation
without either a KVM_REQ_SOMETHING to put it back, or just a *check* on
the corresponding gpc on the way back into the guest. We'd have worked
that out.
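Something along these lines, reusing the kvm_gpc_check()/kvm_gpc_refresh()
helpers that the host-side users already follow, would do for the "check on
the way back into the guest" variant. A rough, untested sketch only; the
function name and the PAGE_SIZE length are placeholders, not code from this
series:

	#include <linux/kvm_host.h>

	/*
	 * Sketch: re-validate a pfncache before (re-)entering the guest
	 * and refresh it if an MMU notifier invalidated it in the meantime.
	 */
	static int revalidate_gpc_before_entry(struct gfn_to_pfn_cache *gpc)
	{
		unsigned long flags;
		int ret;

		read_lock_irqsave(&gpc->lock, flags);
		while (!kvm_gpc_check(gpc, PAGE_SIZE)) {
			read_unlock_irqrestore(&gpc->lock, flags);

			/* May sleep: re-pins and re-maps the backing page. */
			ret = kvm_gpc_refresh(gpc, PAGE_SIZE);
			if (ret)
				return ret;

			read_lock_irqsave(&gpc->lock, flags);
		}
		read_unlock_irqrestore(&gpc->lock, flags);

		return 0;
	}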
And yes, the gpc has had bugs as we implemented it, but the point was
that we got to something which *is* working, and forms a usable
building block.
So I'm not really sold on the idea of ditching KVM_GUEST_USES_PFN. I
think we could get it working, and I think it's worth it. But my
opinion is worth very little unless I express it in 'diff -up' form
instead of prose, and reverting this particular patch is the least of
my barriers to doing so, so reluctantly...
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
On Tue, Nov 21, 2023, David Woodhouse wrote:
> On Tue, 2023-11-21 at 18:02 +0000, Paul Durrant wrote:
> > From: Paul Durrant <pdurrant@amazon.com>
> >
> > As noted in [1] the KVM_GUEST_USES_PFN usage flag is never set by any
> > callers of kvm_gpc_init(), which also makes the 'vcpu' argument redundant.
> > Moreover, all existing callers specify KVM_HOST_USES_PFN so the usage
> > check in hva_to_pfn_retry() and hence the 'usage' argument to
> > kvm_gpc_init() are also redundant.
> > Remove the pfn_cache_usage enumeration and remove the redundant arguments,
> > fields of struct gfn_to_hva_cache, and all the related code.
> >
> > [1] https://lore.kernel.org/all/ZQiR8IpqOZrOpzHC@google.com/
> >
> > Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>
> I think it's https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/
Yeah, that's the more important link.
> which is the key reference. I'm not sure I'm 100% on board, but I never
> got round to replying to Sean's email because it was one of those "put
> up or shut up situations" and I didn't have the bandwidth to actually
> write the code to prove my point.
>
> I think it *is* important to support non-pinned pages. There's a reason
> we even made the vapic page migratable. We want to support memory
> hotplug, we want to cope with machine checks telling us to move certain
> pages (which I suppose is memory hotplug). See commit 38b9917350cb
> ("kvm: vmx: Implement set_apic_access_page_addr") for example.
The vAPIC page is slightly different in that it effectively never opened a window
for page migration, i.e. once a vCPU was created that page was stuck. For nested
virtualization pages, the probability of being able to migrate a page at any given
time might be relatively low, but it's extremely unlikely for a page to be pinned
for the entire lifetime of a (L1) VM.
> I agree that in the first round of the nVMX code there were bugs. And
> sure, of *course* it isn't sufficient to wire up the invalidation
> without either a KVM_REQ_SOMETHIMG to put it back, or just a *check* on
> the corresponding gpc on the way back into the guest. We'd have worked
> that out.
Maybe. I spent most of a day, maybe longer, hacking at the nVMX code and was
unable to get line of sight to an end result that I felt would be worth pursuing.
I'm definitely not saying it's impossible, and I'm not dead set against
re-introducing KVM_GUEST_USES_PFN or similar, but a complete solution crosses the
threshold where it's unreasonable to ask/expect someone to pick up the work in
order to get their code/series merged.
Which is effectively what you said below; I just wanted to explain why I'm pushing
to remove KVM_GUEST_USES_PFN, and to say that if you or someone else were to write
the code it wouldn't be an automatic nak.
> And yes, the gpc has had bugs as we implemented it, but the point was
> that we got to something which *is* working, and forms a usable
> building block.
>
> So I'm not really sold on the idea of ditching KVM_GUEST_USES_PFN. I
> think we could get it working, and I think it's worth it. But my
> opinion is worth very little unless I express it in 'diff -up' form
> instead of prose, and reverting this particular patch is the least of
> my barriers to doing so, so reluctantly...
@@ -11976,7 +11976,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
vcpu->arch.regs_avail = ~0;
vcpu->arch.regs_dirty = ~0;
- kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm, vcpu, KVM_HOST_USES_PFN);
+ kvm_gpc_init(&vcpu->arch.pv_time, vcpu->kvm);
if (!irqchip_in_kernel(vcpu->kvm) || kvm_vcpu_is_reset_bsp(vcpu))
vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
@@ -2101,14 +2101,10 @@ void kvm_xen_init_vcpu(struct kvm_vcpu *vcpu)
timer_setup(&vcpu->arch.xen.poll_timer, cancel_evtchn_poll, 0);
- kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm, NULL,
- KVM_HOST_USES_PFN);
- kvm_gpc_init(&vcpu->arch.xen.runstate2_cache, vcpu->kvm, NULL,
- KVM_HOST_USES_PFN);
- kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm, NULL,
- KVM_HOST_USES_PFN);
- kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm, NULL,
- KVM_HOST_USES_PFN);
+ kvm_gpc_init(&vcpu->arch.xen.runstate_cache, vcpu->kvm);
+ kvm_gpc_init(&vcpu->arch.xen.runstate2_cache, vcpu->kvm);
+ kvm_gpc_init(&vcpu->arch.xen.vcpu_info_cache, vcpu->kvm);
+ kvm_gpc_init(&vcpu->arch.xen.vcpu_time_info_cache, vcpu->kvm);
}
void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
@@ -2151,7 +2147,7 @@ void kvm_xen_init_vm(struct kvm *kvm)
{
mutex_init(&kvm->arch.xen.xen_lock);
idr_init(&kvm->arch.xen.evtchn_ports);
- kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN);
+ kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm);
}
void kvm_xen_destroy_vm(struct kvm *kvm)
@@ -1287,21 +1287,12 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
*
* @gpc: struct gfn_to_pfn_cache object.
* @kvm: pointer to kvm instance.
- * @vcpu: vCPU to be used for marking pages dirty and to be woken on
- * invalidation.
- * @usage: indicates if the resulting host physical PFN is used while
- * the @vcpu is IN_GUEST_MODE (in which case invalidation of
- * the cache from MMU notifiers---but not for KVM memslot
- * changes!---will also force @vcpu to exit the guest and
- * refresh the cache); and/or if the PFN used directly
- * by KVM (and thus needs a kernel virtual mapping).
*
* This sets up a gfn_to_pfn_cache by initializing locks and assigning the
* immutable attributes. Note, the cache must be zero-allocated (or zeroed by
* the caller before init).
*/
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
- struct kvm_vcpu *vcpu, enum pfn_cache_usage usage);
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm);
/**
* kvm_gpc_activate - prepare a cached kernel mapping and HPA for a given guest
@@ -48,12 +48,6 @@ typedef u64 hfn_t;
typedef hfn_t kvm_pfn_t;
-enum pfn_cache_usage {
- KVM_GUEST_USES_PFN = BIT(0),
- KVM_HOST_USES_PFN = BIT(1),
- KVM_GUEST_AND_HOST_USE_PFN = KVM_GUEST_USES_PFN | KVM_HOST_USES_PFN,
-};
-
struct gfn_to_hva_cache {
u64 generation;
gpa_t gpa;
@@ -68,13 +62,11 @@ struct gfn_to_pfn_cache {
unsigned long uhva;
struct kvm_memory_slot *memslot;
struct kvm *kvm;
- struct kvm_vcpu *vcpu;
struct list_head list;
rwlock_t lock;
struct mutex refresh_lock;
void *khva;
kvm_pfn_t pfn;
- enum pfn_cache_usage usage;
bool active;
bool valid;
};
@@ -25,9 +25,7 @@
void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
unsigned long end, bool may_block)
{
- DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
struct gfn_to_pfn_cache *gpc;
- bool evict_vcpus = false;
spin_lock(&kvm->gpc_lock);
list_for_each_entry(gpc, &kvm->gpc_list, list) {
@@ -37,43 +35,10 @@ void gfn_to_pfn_cache_invalidate_start(struct kvm *kvm, unsigned long start,
if (gpc->valid && !is_error_noslot_pfn(gpc->pfn) &&
gpc->uhva >= start && gpc->uhva < end) {
gpc->valid = false;
-
- /*
- * If a guest vCPU could be using the physical address,
- * it needs to be forced out of guest mode.
- */
- if (gpc->usage & KVM_GUEST_USES_PFN) {
- if (!evict_vcpus) {
- evict_vcpus = true;
- bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS);
- }
- __set_bit(gpc->vcpu->vcpu_idx, vcpu_bitmap);
- }
}
write_unlock_irq(&gpc->lock);
}
spin_unlock(&kvm->gpc_lock);
-
- if (evict_vcpus) {
- /*
- * KVM needs to ensure the vCPU is fully out of guest context
- * before allowing the invalidation to continue.
- */
- unsigned int req = KVM_REQ_OUTSIDE_GUEST_MODE;
- bool called;
-
- /*
- * If the OOM reaper is active, then all vCPUs should have
- * been stopped already, so perform the request without
- * KVM_REQUEST_WAIT and be sad if any needed to be IPI'd.
- */
- if (!may_block)
- req &= ~KVM_REQUEST_WAIT;
-
- called = kvm_make_vcpus_request_mask(kvm, req, vcpu_bitmap);
-
- WARN_ON_ONCE(called && !may_block);
- }
}
bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
@@ -206,16 +171,14 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
* pfn. Note, kmap() and memremap() can both sleep, so this
* too must be done outside of gpc->lock!
*/
- if (gpc->usage & KVM_HOST_USES_PFN) {
- if (new_pfn == gpc->pfn)
- new_khva = old_khva;
- else
- new_khva = gpc_map(new_pfn);
-
- if (!new_khva) {
- kvm_release_pfn_clean(new_pfn);
- goto out_error;
- }
+ if (new_pfn == gpc->pfn)
+ new_khva = old_khva;
+ else
+ new_khva = gpc_map(new_pfn);
+
+ if (!new_khva) {
+ kvm_release_pfn_clean(new_pfn);
+ goto out_error;
}
write_lock_irq(&gpc->lock);
@@ -346,18 +309,12 @@ int kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, unsigned long len)
return __kvm_gpc_refresh(gpc, gpc->gpa, len);
}
-void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm,
- struct kvm_vcpu *vcpu, enum pfn_cache_usage usage)
+void kvm_gpc_init(struct gfn_to_pfn_cache *gpc, struct kvm *kvm)
{
- WARN_ON_ONCE(!usage || (usage & KVM_GUEST_AND_HOST_USE_PFN) != usage);
- WARN_ON_ONCE((usage & KVM_GUEST_USES_PFN) && !vcpu);
-
rwlock_init(&gpc->lock);
mutex_init(&gpc->refresh_lock);
gpc->kvm = kvm;
- gpc->vcpu = vcpu;
- gpc->usage = usage;
gpc->pfn = KVM_PFN_ERR_FAULT;
gpc->uhva = KVM_HVA_ERR_BAD;
}