[v6,4/4] KVM: mmu: remove over-aggressive warnings
Commit Message
From: David Stevens <stevensd@chromium.org>
Remove two warnings that require ref counts for pages to be non-zero, as
mapped pfns from follow_pfn may not have an initialized ref count.
Signed-off-by: David Stevens <stevensd@chromium.org>
---
arch/x86/kvm/mmu/mmu.c | 10 ----------
virt/kvm/kvm_main.c | 5 ++---
2 files changed, 2 insertions(+), 13 deletions(-)
Comments
On Thu, Mar 30, 2023, David Stevens wrote:
> From: David Stevens <stevensd@chromium.org>
>
> Remove two warnings that require ref counts for pages to be non-zero, as
> mapped pfns from follow_pfn may not have an initialized ref count.
This patch needs to be moved earlier in the series, e.g. if just this patch is reverted, these
WARNs will fire on a guest with non-refcounted memory.
The shortlog and changelog also need to be reworded. The shortlog in particular
is misleading, as the WARNs aren't overly aggressive _in the current code base_,
but rather are invalidated by KVM allowing non-refcounted struct page memory to
be mapped into the guest.
Lastly, as I mentioned in previous versions, I would like to keep the sanity
check if possible. But this time, I have a concrete idea :-)
When installing a SPTE that points at a refcounted page, set a flag stating as
much. Then use the flag to assert that the page has an elevated refcount whenever
KVM is operating on the page. It'll require some additional plumbing changes,
e.g. to tell make_spte() that the pfn is refcounted, but the actual code should be
straightforward.
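Something like the following is what I have in mind (completely untested, and the
bit position and names are purely illustrative, any free software-available SPTE
bit would do):

	/* Illustrative: a free software-available bit in the SPTE. */
	#define SPTE_MMU_PAGE_REFCOUNTED	BIT_ULL(59)

	/* In make_spte(), with a new @refcounted param plumbed in: */
	if (refcounted)
		spte |= SPTE_MMU_PAGE_REFCOUNTED;

	/* In mmu_spte_clear_track_bits(), keeping the sanity check: */
	if (old_spte & SPTE_MMU_PAGE_REFCOUNTED) {
		struct page *page = kvm_pfn_to_refcounted_page(pfn);

		WARN_ON(!page || !page_count(page));
	}

That ties the assert to what KVM actually knows about the mapping, instead of
guessing based on the pfn alone.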
Actually, we should make that a requirement to allow an arch to get non-refcounted
struct page memory: the arch must be able to keep track of which pages are/aren't
refcounted. That'll disallow your GPU use case with 32-bit x86 host kernels (we're
out of software bits in PAE SPTEs), but I can't imagine anyone cares. Then I
believe we can make that support mutually exclusive with kvm_pfn_to_refcounted_page(),
because all of the kvm_follow_pfn() users will know (and remember) that the pfn
is backed by a refcounted page.
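Rough sketch of the "remember" part (struct layout and field name are placeholders,
not an existing API):

	struct kvm_follow_pfn {
		const struct kvm_memory_slot *slot;
		gfn_t gfn;
		unsigned int flags;

		/*
		 * Out: true if the resolved pfn is backed by a refcounted
		 * struct page.  Callers forward this to make_spte() instead
		 * of calling kvm_pfn_to_refcounted_page() after the fact.
		 */
		bool is_refcounted_page;
	};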
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -555,7 +555,6 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
 	kvm_pfn_t pfn;
 	u64 old_spte = *sptep;
 	int level = sptep_to_sp(sptep)->role.level;
-	struct page *page;
 
 	if (!is_shadow_present_pte(old_spte) ||
 	    !spte_has_volatile_bits(old_spte))
@@ -570,15 +569,6 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
 
 	pfn = spte_to_pfn(old_spte);
 
-	/*
-	 * KVM doesn't hold a reference to any pages mapped into the guest, and
-	 * instead uses the mmu_notifier to ensure that KVM unmaps any pages
-	 * before they are reclaimed.  Sanity check that, if the pfn is backed
-	 * by a refcounted page, the refcount is elevated.
-	 */
-	page = kvm_pfn_to_refcounted_page(pfn);
-	WARN_ON(page && !page_count(page));
-
 	if (is_accessed_spte(old_spte))
 		kvm_set_pfn_accessed(pfn);
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -165,10 +165,9 @@ bool kvm_is_zone_device_page(struct page *page)
 	/*
 	 * The metadata used by is_zone_device_page() to determine whether or
	 * not a page is ZONE_DEVICE is guaranteed to be valid if and only if
-	 * the device has been pinned, e.g. by get_user_pages().  WARN if the
-	 * page_count() is zero to help detect bad usage of this helper.
+	 * the device has been pinned, e.g. by get_user_pages().
 	 */
-	if (WARN_ON_ONCE(!page_count(page)))
+	if (!page_count(page))
 		return false;
 
 	return is_zone_device_page(page);