[RFC,v2,1/3] KVM: allow mapping of compound tail pages for IO or PFNMAP mapping

Message ID 20240103084424.20014-1-yan.y.zhao@intel.com
State New
Headers
Series KVM: allow mapping of compound tail pages for IO or PFNMAP mapping |

Commit Message

Yan Zhao Jan. 3, 2024, 8:44 a.m. UTC
  Allow mapping of tail pages of compound pages for IO or PFNMAP mapping
by trying and getting ref count of its head page.

For IO or PFNMAP mapping, sometimes it's backed by compound pages.
KVM will just return error on mapping of tail pages of the compound pages,
as ref count of the tail pages are always 0.

So, rather than check and add ref count of a tail page, check and add ref
count of its folio (head page) to allow mapping of the compound tail pages.

This will not break the origial intention to disallow mapping of tail pages
of non-compound higher order allocations as the folio of a non-compound
tail page is the same as the page itself.

On the other side, put_page() has already converted page to folio before
putting page ref.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Sean Christopherson Feb. 13, 2024, 3:17 a.m. UTC | #1
On Wed, Jan 03, 2024, Yan Zhao wrote:
> Allow mapping of tail pages of compound pages for IO or PFNMAP mapping
> by trying and getting ref count of its head page.
> 
> For IO or PFNMAP mapping, sometimes it's backed by compound pages.
> KVM will just return error on mapping of tail pages of the compound pages,
> as ref count of the tail pages are always 0.
> 
> So, rather than check and add ref count of a tail page, check and add ref
> count of its folio (head page) to allow mapping of the compound tail pages.

Can you add a blurb to call out that this is effectively what gup() does in
try_get_folio()?  That knowledge give me a _lot_ more confidence that this is
correct (I didn't think too deeply about what this patch was doing when I looked
at v1).

> This will not break the origial intention to disallow mapping of tail pages
> of non-compound higher order allocations as the folio of a non-compound
> tail page is the same as the page itself.
> 
> On the other side, put_page() has already converted page to folio before
> putting page ref.
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  virt/kvm/kvm_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index acd67fb40183..f53b58446ac7 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2892,7 +2892,7 @@ static int kvm_try_get_pfn(kvm_pfn_t pfn)
>  	if (!page)
>  		return 1;
>  
> -	return get_page_unless_zero(page);
> +	return folio_try_get(page_folio(page));

This seems like it needs retry logic, a la try_get_folio(), to guard against a
race with the folio being split.  From page_folio():
  
 If the caller* does not hold a reference, this call may race with a folio split,
 so it should re-check the folio still contains this page after gaining a
 reference on the folio.

I assume that splitting one of these folios is extremely unlikely, but I don't
see any harm in being paranoid (unless this really truly cannot race).
  
Yan Zhao Feb. 20, 2024, 8:52 a.m. UTC | #2
On Mon, Feb 12, 2024 at 07:17:21PM -0800, Sean Christopherson wrote:
> On Wed, Jan 03, 2024, Yan Zhao wrote:
> > Allow mapping of tail pages of compound pages for IO or PFNMAP mapping
> > by trying and getting ref count of its head page.
> > 
> > For IO or PFNMAP mapping, sometimes it's backed by compound pages.
> > KVM will just return error on mapping of tail pages of the compound pages,
> > as ref count of the tail pages are always 0.
> > 
> > So, rather than check and add ref count of a tail page, check and add ref
> > count of its folio (head page) to allow mapping of the compound tail pages.
> 
> Can you add a blurb to call out that this is effectively what gup() does in
> try_get_folio()?  That knowledge give me a _lot_ more confidence that this is
> correct (I didn't think too deeply about what this patch was doing when I looked
> at v1).
Sure.

> 
> > This will not break the origial intention to disallow mapping of tail pages
> > of non-compound higher order allocations as the folio of a non-compound
> > tail page is the same as the page itself.
> > 
> > On the other side, put_page() has already converted page to folio before
> > putting page ref.
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  virt/kvm/kvm_main.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index acd67fb40183..f53b58446ac7 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -2892,7 +2892,7 @@ static int kvm_try_get_pfn(kvm_pfn_t pfn)
> >  	if (!page)
> >  		return 1;
> >  
> > -	return get_page_unless_zero(page);
> > +	return folio_try_get(page_folio(page));
> 
> This seems like it needs retry logic, a la try_get_folio(), to guard against a
> race with the folio being split.  From page_folio():
>   
>  If the caller* does not hold a reference, this call may race with a folio split,
>  so it should re-check the folio still contains this page after gaining a
>  reference on the folio.
> 
> I assume that splitting one of these folios is extremely unlikely, but I don't
> see any harm in being paranoid (unless this really truly cannot race).
Yes, you are right!
Will do the retry. Thanks!
  

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index acd67fb40183..f53b58446ac7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2892,7 +2892,7 @@  static int kvm_try_get_pfn(kvm_pfn_t pfn)
 	if (!page)
 		return 1;
 
-	return get_page_unless_zero(page);
+	return folio_try_get(page_folio(page));
 }
 
 static int hva_to_pfn_remapped(struct vm_area_struct *vma,