KVM: allow mapping of compound tail pages for IO or PFNMAP mapping

Message ID 20230719083332.4584-1-yan.y.zhao@intel.com
State New
Series KVM: allow mapping of compound tail pages for IO or PFNMAP mapping

Commit Message

Yan Zhao July 19, 2023, 8:33 a.m. UTC
Allow mapping of tail pages of compound pages for IO or PFNMAP mappings
by trying to take a reference on the head page.

IO or PFNMAP mappings are sometimes backed by compound pages.
KVM currently returns an error when mapping a tail page of such a compound
page, because the ref count of a tail page is always 0.

So, rather than checking and incrementing the ref count of the tail page,
check and increment the ref count of its folio (head page), so that
compound tail pages can be mapped.

This does not break the original intention to disallow mapping of tail pages
of non-compound higher order allocations, as the folio of a non-compound
tail page is the same as the page itself.

On the release side, put_page() already converts the page to its folio
before dropping the page ref.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 24ff4c08e5bbdd7399d45f940f10fed030dfadda
  

Comments

Sean Christopherson July 19, 2023, 3:42 p.m. UTC | #1
On Wed, Jul 19, 2023, Yan Zhao wrote:
> Allow mapping of tail pages of compound pages for IO or PFNMAP mapping
> by trying and getting ref count of its head page.
> 
> For IO or PFNMAP mapping, sometimes it's backed by compound pages.
> KVM will just return error on mapping of tail pages of the compound pages,
> as ref count of the tail pages are always 0.
> 
> So, rather than check and add ref count of a tail page, check and add ref
> count of its folio (head page) to allow mapping of the compound tail pages.
> 
> This will not break the original intention to disallow mapping of tail pages
> of non-compound higher order allocations as the folio of a non-compound
> tail page is the same as the page itself.
> 
> On the other side, put_page() has already converted page to folio before
> putting page ref.

Is there an actual use case for this?  It's not necessarily a strict requirement,
but it would be helpful to know if KVM supports this for a specific use case, or
just because it can.

Either way, this needs a selftest, KVM has had way too many bugs in this area.

> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  virt/kvm/kvm_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 138292a86174..6f2b51ef20f7 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2551,7 +2551,7 @@ static int kvm_try_get_pfn(kvm_pfn_t pfn)
>  	if (!page)
>  		return 1;
>  
> -	return get_page_unless_zero(page);
> +	return folio_try_get(page_folio(page));
>  }
>  
>  static int hva_to_pfn_remapped(struct vm_area_struct *vma,
> 
> base-commit: 24ff4c08e5bbdd7399d45f940f10fed030dfadda
> -- 
> 2.17.1
>
  
Yan Zhao July 20, 2023, 2:03 a.m. UTC | #2
On Wed, Jul 19, 2023 at 08:42:37AM -0700, Sean Christopherson wrote:
> On Wed, Jul 19, 2023, Yan Zhao wrote:
> > Allow mapping of tail pages of compound pages for IO or PFNMAP mapping
> > by trying and getting ref count of its head page.
> > 
> > For IO or PFNMAP mapping, sometimes it's backed by compound pages.
> > KVM will just return error on mapping of tail pages of the compound pages,
> > as ref count of the tail pages are always 0.
> > 
> > So, rather than check and add ref count of a tail page, check and add ref
> > count of its folio (head page) to allow mapping of the compound tail pages.
> > 
> > This will not break the original intention to disallow mapping of tail pages
> > of non-compound higher order allocations as the folio of a non-compound
> > tail page is the same as the page itself.
> > 
> > On the other side, put_page() has already converted page to folio before
> > putting page ref.
> 
> Is there an actual use case for this?  It's not necessarily a strict requirement,
> but it would be helpful to know if KVM supports this for a specific use case, or
> just because it can.
Well, the answer on an actual use case is kind of "yes and no".
In VFIO we now have the concept of "variant drivers", which work with
specific PCI IDs. The variant drivers can inject device-specific
knowledge into VFIO. I tested this patch by writing a variant driver for
my e1000e NIC and composing a BAR backed by compound pages 4M in
length.

I guess there might be a real use case when the backing pages are in
ZONE_DEVICE, but I haven't spent enough time to set up that kind of
environment for verification.

> Either way, this needs a selftest, KVM has had way too many bugs in this area.
Sure, I'll try to provide a selftest.
Thanks for bringing it up!
  

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 138292a86174..6f2b51ef20f7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2551,7 +2551,7 @@  static int kvm_try_get_pfn(kvm_pfn_t pfn)
 	if (!page)
 		return 1;
 
-	return get_page_unless_zero(page);
+	return folio_try_get(page_folio(page));
 }
 
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,