[4/7] mm/hugetlb: Prepare hugetlb_follow_page_mask() for FOLL_PIN

Message ID 20230613215346.1022773-5-peterx@redhat.com
State New
Headers
Series mm/gup: Unify hugetlb, speed up thp |

Commit Message

Peter Xu June 13, 2023, 9:53 p.m. UTC
  It's coming, not yet, but soon.  Loose the restriction.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hugetlb.c | 7 -------
 1 file changed, 7 deletions(-)
  

Comments

David Hildenbrand June 14, 2023, 2:57 p.m. UTC | #1
On 13.06.23 23:53, Peter Xu wrote:
> It's coming, not yet, but soon.  Loose the restriction.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   mm/hugetlb.c | 7 -------
>   1 file changed, 7 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f037eaf9d819..31d8f18bc2e4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6467,13 +6467,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
>   	spinlock_t *ptl;
>   	pte_t *pte, entry;
>   
> -	/*
> -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
> -	 * follow_hugetlb_page().
> -	 */
> -	if (WARN_ON_ONCE(flags & FOLL_PIN))
> -		return NULL;
> -
>   	hugetlb_vma_lock_read(vma);
>   	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
>   	if (!pte)
Did you fix why the warning was placed there in the first place? (IIRC, 
at least unsharing support needs to be added, maybe more)
  
Peter Xu June 14, 2023, 3:11 p.m. UTC | #2
On Wed, Jun 14, 2023 at 04:57:37PM +0200, David Hildenbrand wrote:
> On 13.06.23 23:53, Peter Xu wrote:
> > It's coming, not yet, but soon.  Loose the restriction.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   mm/hugetlb.c | 7 -------
> >   1 file changed, 7 deletions(-)
> > 
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index f037eaf9d819..31d8f18bc2e4 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6467,13 +6467,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> >   	spinlock_t *ptl;
> >   	pte_t *pte, entry;
> > -	/*
> > -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
> > -	 * follow_hugetlb_page().
> > -	 */
> > -	if (WARN_ON_ONCE(flags & FOLL_PIN))
> > -		return NULL;
> > -
> >   	hugetlb_vma_lock_read(vma);
> >   	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
> >   	if (!pte)
> Did you fix why the warning was placed there in the first place? (IIRC, at
> least unsharing support needs to be added, maybe more)

Feel free to have a look at patch 2 - it should be done there, hopefully in
the right way.  And IIUC it could be a bug to not do that before (besides
CoR there was also the pgtable permission checks that was missing).  More
details in patch 2's commit message.  Thanks,
  
David Hildenbrand June 14, 2023, 3:17 p.m. UTC | #3
On 14.06.23 17:11, Peter Xu wrote:
> On Wed, Jun 14, 2023 at 04:57:37PM +0200, David Hildenbrand wrote:
>> On 13.06.23 23:53, Peter Xu wrote:
>>> It's coming, not yet, but soon.  Loose the restriction.
>>>
>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>> ---
>>>    mm/hugetlb.c | 7 -------
>>>    1 file changed, 7 deletions(-)
>>>
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index f037eaf9d819..31d8f18bc2e4 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -6467,13 +6467,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
>>>    	spinlock_t *ptl;
>>>    	pte_t *pte, entry;
>>> -	/*
>>> -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
>>> -	 * follow_hugetlb_page().
>>> -	 */
>>> -	if (WARN_ON_ONCE(flags & FOLL_PIN))
>>> -		return NULL;
>>> -
>>>    	hugetlb_vma_lock_read(vma);
>>>    	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
>>>    	if (!pte)
>> Did you fix why the warning was placed there in the first place? (IIRC, at
>> least unsharing support needs to be added, maybe more)
> 
> Feel free to have a look at patch 2 - it should be done there, hopefully in
> the right way.  And IIUC it could be a bug to not do that before (besides
> CoR there was also the pgtable permission checks that was missing).  More
> details in patch 2's commit message.  Thanks,

Oh, that slipped my eyes (unsharing is not really a permission check) -- 
and the patch description could have been more explicit about why we can 
now lift the restrictions.

For the records: we don't use CoR terminology upstream. As suggested by 
John, we use "GUP-triggered unsharing".

As unsharing only applies to FOLL_PIN, it doesn't quite fit into patch 
#2. Either move that to this patch or squash both.
  
Peter Xu June 14, 2023, 3:31 p.m. UTC | #4
On Wed, Jun 14, 2023 at 05:17:13PM +0200, David Hildenbrand wrote:
> On 14.06.23 17:11, Peter Xu wrote:
> > On Wed, Jun 14, 2023 at 04:57:37PM +0200, David Hildenbrand wrote:
> > > On 13.06.23 23:53, Peter Xu wrote:
> > > > It's coming, not yet, but soon.  Loose the restriction.
> > > > 
> > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > ---
> > > >    mm/hugetlb.c | 7 -------
> > > >    1 file changed, 7 deletions(-)
> > > > 
> > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > > index f037eaf9d819..31d8f18bc2e4 100644
> > > > --- a/mm/hugetlb.c
> > > > +++ b/mm/hugetlb.c
> > > > @@ -6467,13 +6467,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
> > > >    	spinlock_t *ptl;
> > > >    	pte_t *pte, entry;
> > > > -	/*
> > > > -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
> > > > -	 * follow_hugetlb_page().
> > > > -	 */
> > > > -	if (WARN_ON_ONCE(flags & FOLL_PIN))
> > > > -		return NULL;
> > > > -
> > > >    	hugetlb_vma_lock_read(vma);
> > > >    	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
> > > >    	if (!pte)
> > > Did you fix why the warning was placed there in the first place? (IIRC, at
> > > least unsharing support needs to be added, maybe more)
> > 
> > Feel free to have a look at patch 2 - it should be done there, hopefully in
> > the right way.  And IIUC it could be a bug to not do that before (besides
> > CoR there was also the pgtable permission checks that was missing).  More
> > details in patch 2's commit message.  Thanks,
> 
> Oh, that slipped my eyes (unsharing is not really a permission check) -- and

I think it is still a "permission check"?  It means, we forbid anyone R/O
taking the page if it's not exclusively owned, just like we forbid anyone
RW taking the page if it's not writable?

It's just that the permission check only applies to PIN which follow_page()
doesn't yet care, so it won't ever trigger.

> the patch description could have been more explicit about why we can now
> lift the restrictions.
> 
> For the records: we don't use CoR terminology upstream. As suggested by
> John, we use "GUP-triggered unsharing".

Sure.

> 
> As unsharing only applies to FOLL_PIN, it doesn't quite fit into patch #2.
> Either move that to this patch or squash both.

Sure, no strong opinions here.

The plan is _if_ someone wants to backport patch 2, this patch should not
be part of it.  But then maybe it makes more sense to move the CoR change
there into this one, not because "it's not permission check", but because
CoR is not relevant in follow_page(), so not relevant to a backport.
  
David Hildenbrand June 14, 2023, 3:47 p.m. UTC | #5
On 14.06.23 17:31, Peter Xu wrote:
> On Wed, Jun 14, 2023 at 05:17:13PM +0200, David Hildenbrand wrote:
>> On 14.06.23 17:11, Peter Xu wrote:
>>> On Wed, Jun 14, 2023 at 04:57:37PM +0200, David Hildenbrand wrote:
>>>> On 13.06.23 23:53, Peter Xu wrote:
>>>>> It's coming, not yet, but soon.  Loose the restriction.
>>>>>
>>>>> Signed-off-by: Peter Xu <peterx@redhat.com>
>>>>> ---
>>>>>     mm/hugetlb.c | 7 -------
>>>>>     1 file changed, 7 deletions(-)
>>>>>
>>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>>> index f037eaf9d819..31d8f18bc2e4 100644
>>>>> --- a/mm/hugetlb.c
>>>>> +++ b/mm/hugetlb.c
>>>>> @@ -6467,13 +6467,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
>>>>>     	spinlock_t *ptl;
>>>>>     	pte_t *pte, entry;
>>>>> -	/*
>>>>> -	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
>>>>> -	 * follow_hugetlb_page().
>>>>> -	 */
>>>>> -	if (WARN_ON_ONCE(flags & FOLL_PIN))
>>>>> -		return NULL;
>>>>> -
>>>>>     	hugetlb_vma_lock_read(vma);
>>>>>     	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
>>>>>     	if (!pte)
>>>> Did you fix why the warning was placed there in the first place? (IIRC, at
>>>> least unsharing support needs to be added, maybe more)
>>>
>>> Feel free to have a look at patch 2 - it should be done there, hopefully in
>>> the right way.  And IIUC it could be a bug to not do that before (besides
>>> CoR there was also the pgtable permission checks that was missing).  More
>>> details in patch 2's commit message.  Thanks,
>>
>> Oh, that slipped my eyes (unsharing is not really a permission check) -- and
> 
> I think it is still a "permission check"?  It means, we forbid anyone R/O
> taking the page if it's not exclusively owned, just like we forbid anyone
> RW taking the page if it's not writable?

Agreed, just not in the traditional PTE-protection case.

> 
> It's just that the permission check only applies to PIN which follow_page()
> doesn't yet care, so it won't ever trigger.
> 
>> the patch description could have been more explicit about why we can now
>> lift the restrictions.
>>
>> For the records: we don't use CoR terminology upstream. As suggested by
>> John, we use "GUP-triggered unsharing".
> 
> Sure.
> 
>>
>> As unsharing only applies to FOLL_PIN, it doesn't quite fit into patch #2.
>> Either move that to this patch or squash both.
> 
> Sure, no strong opinions here.
> 
> The plan is _if_ someone wants to backport patch 2, this patch should not
> be part of it.  But then maybe it makes more sense to move the CoR change
> there into this one, not because "it's not permission check", but because
> CoR is not relevant in follow_page(), so not relevant to a backport.

Right. Then just call patch #2 "Add missing write-permission check" and 
this patch "Support FOLL_PIN in hugetlb_follow_page_mask()" or sth. like 
that.

Regarding the backport, I really wonder if patch #2 is required at all, 
because I didn't sport any applicable FOLL_WRITE users. Maybe there were 
some? Hm. If it's not applicable, a single "Support FOLL_PIN in 
hugetlb_follow_page_mask()" patch might be cleanest.
  
Peter Xu June 14, 2023, 3:51 p.m. UTC | #6
On Wed, Jun 14, 2023 at 05:47:31PM +0200, David Hildenbrand wrote:
> Right. Then just call patch #2 "Add missing write-permission check" and this
> patch "Support FOLL_PIN in hugetlb_follow_page_mask()" or sth. like that.
> 
> Regarding the backport, I really wonder if patch #2 is required at all,
> because I didn't sport any applicable FOLL_WRITE users. Maybe there were
> some? Hm. If it's not applicable, a single "Support FOLL_PIN in
> hugetlb_follow_page_mask()" patch might be cleanest.

Yeah, I agree.  The code is definitely needed, not the split of patches if
no need for a backport.  Let me merge then.
  
Mike Kravetz June 15, 2023, 12:25 a.m. UTC | #7
On 06/14/23 11:51, Peter Xu wrote:
> On Wed, Jun 14, 2023 at 05:47:31PM +0200, David Hildenbrand wrote:
> > Right. Then just call patch #2 "Add missing write-permission check" and this
> > patch "Support FOLL_PIN in hugetlb_follow_page_mask()" or sth. like that.
> > 
> > Regarding the backport, I really wonder if patch #2 is required at all,
> > because I didn't sport any applicable FOLL_WRITE users. Maybe there were
> > some? Hm. If it's not applicable, a single "Support FOLL_PIN in
> > hugetlb_follow_page_mask()" patch might be cleanest.
> 
> Yeah, I agree.  The code is definitely needed, not the split of patches if
> no need for a backport.  Let me merge then.
> 

Should have read this before adding my RB to patch 2.  I assumed no
backport.  Agree, than merging the gup_must_unshare here makes more sense.
  
Peter Xu June 15, 2023, 7:42 p.m. UTC | #8
On Wed, Jun 14, 2023 at 05:25:25PM -0700, Mike Kravetz wrote:
> On 06/14/23 11:51, Peter Xu wrote:
> > On Wed, Jun 14, 2023 at 05:47:31PM +0200, David Hildenbrand wrote:
> > > Right. Then just call patch #2 "Add missing write-permission check" and this
> > > patch "Support FOLL_PIN in hugetlb_follow_page_mask()" or sth. like that.
> > > 
> > > Regarding the backport, I really wonder if patch #2 is required at all,
> > > because I didn't sport any applicable FOLL_WRITE users. Maybe there were
> > > some? Hm. If it's not applicable, a single "Support FOLL_PIN in
> > > hugetlb_follow_page_mask()" patch might be cleanest.
> > 
> > Yeah, I agree.  The code is definitely needed, not the split of patches if
> > no need for a backport.  Let me merge then.
> > 
> 
> Should have read this before adding my RB to patch 2.  I assumed no
> backport.  Agree, than merging the gup_must_unshare here makes more sense.

Thanks for taking a look!

No worries, I'll make bold to just take your R-b over the merged patch when
I repost, according to your R-b in patch 2 and the comment here.  I hope
it's fine to you.
  

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f037eaf9d819..31d8f18bc2e4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6467,13 +6467,6 @@  struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	spinlock_t *ptl;
 	pte_t *pte, entry;
 
-	/*
-	 * FOLL_PIN is not supported for follow_page(). Ordinary GUP goes via
-	 * follow_hugetlb_page().
-	 */
-	if (WARN_ON_ONCE(flags & FOLL_PIN))
-		return NULL;
-
 	hugetlb_vma_lock_read(vma);
 	pte = hugetlb_walk(vma, haddr, huge_page_size(h));
 	if (!pte)