[1/4] mm: migrate: use a folio in add_page_for_migration()

Message ID 20230802095346.87449-2-wangkefeng.wang@huawei.com
State New
Headers
Series mm: migrate: more folio conversion |

Commit Message

Kefeng Wang Aug. 2, 2023, 9:53 a.m. UTC
  Use a folio in add_page_for_migration() to save compound_head() calls.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 mm/migrate.c | 44 +++++++++++++++++++++-----------------------
 1 file changed, 21 insertions(+), 23 deletions(-)
  

Comments

Matthew Wilcox Aug. 2, 2023, 12:21 p.m. UTC | #1
On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
>  	err = -EACCES;
> -	if (page_mapcount(page) > 1 && !migrate_all)
> -		goto out_putpage;
> +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
> +		goto out_putfolio;

I do not think this is the correct change.  Maybe leave this line
alone.

> -	if (PageHuge(page)) {
> -		if (PageHead(page)) {
> -			isolated = isolate_hugetlb(page_folio(page), pagelist);
> +	if (folio_test_hugetlb(folio)) {
> +		if (folio_test_large(folio)) {

This makes no sense when you read it.  All hugetlb folios are large,
by definition.  Think about what this code used to do, and what it
should be changed to.

> +			isolated = isolate_hugetlb(folio, pagelist);
>  			err = isolated ? 1 : -EBUSY;
>  		}
>  	} else {
> -		struct page *head;
> -
> -		head = compound_head(page);
> -		isolated = isolate_lru_page(head);
> +		isolated = folio_isolate_lru(folio);
>  		if (!isolated) {
>  			err = -EBUSY;
> -			goto out_putpage;
> +			goto out_putfolio;
>  		}
>  
>  		err = 1;
> -		list_add_tail(&head->lru, pagelist);
> -		mod_node_page_state(page_pgdat(head),
> -			NR_ISOLATED_ANON + page_is_file_lru(head),
> -			thp_nr_pages(head));
> +		list_add_tail(&folio->lru, pagelist);
> +		node_stat_mod_folio(folio,
> +			NR_ISOLATED_ANON + folio_is_file_lru(folio),
> +			folio_nr_pages(folio));
>  	}
  
Kefeng Wang Aug. 3, 2023, 7:13 a.m. UTC | #2
On 2023/8/2 20:21, Matthew Wilcox wrote:
> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
>>   	err = -EACCES;
>> -	if (page_mapcount(page) > 1 && !migrate_all)
>> -		goto out_putpage;
>> +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
>> +		goto out_putfolio;
> 
> I do not think this is the correct change.  Maybe leave this line
> alone.

Ok, I am aware of the discussion about this in other mail, will not
change it(also the next two patch about this function), or wait the
new work of David.
> 
>> -	if (PageHuge(page)) {
>> -		if (PageHead(page)) {
>> -			isolated = isolate_hugetlb(page_folio(page), pagelist);
>> +	if (folio_test_hugetlb(folio)) {
>> +		if (folio_test_large(folio)) {
> 
> This makes no sense when you read it.  All hugetlb folios are large,
> by definition.  Think about what this code used to do, and what it
> should be changed to.

hugetlb folio is self large folio, will drop redundant check
  
Matthew Wilcox Aug. 3, 2023, 12:30 p.m. UTC | #3
On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
> 
> 
> On 2023/8/2 20:21, Matthew Wilcox wrote:
> > On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
> > >   	err = -EACCES;
> > > -	if (page_mapcount(page) > 1 && !migrate_all)
> > > -		goto out_putpage;
> > > +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
> > > +		goto out_putfolio;
> > 
> > I do not think this is the correct change.  Maybe leave this line
> > alone.
> 
> Ok, I am aware of the discussion about this in other mail, will not
> change it(also the next two patch about this function), or wait the
> new work of David.
> > 
> > > -	if (PageHuge(page)) {
> > > -		if (PageHead(page)) {
> > > -			isolated = isolate_hugetlb(page_folio(page), pagelist);
> > > +	if (folio_test_hugetlb(folio)) {
> > > +		if (folio_test_large(folio)) {
> > 
> > This makes no sense when you read it.  All hugetlb folios are large,
> > by definition.  Think about what this code used to do, and what it
> > should be changed to.
> 
> hugetlb folio is self large folio, will drop redundant check

No, that's not the difference.  Keep thinking about it.  This is not
a mechanical translation!
  
Kefeng Wang Aug. 4, 2023, 1:45 a.m. UTC | #4
On 2023/8/3 20:30, Matthew Wilcox wrote:
> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2023/8/2 20:21, Matthew Wilcox wrote:
>>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
>>>>    	err = -EACCES;
>>>> -	if (page_mapcount(page) > 1 && !migrate_all)
>>>> -		goto out_putpage;
>>>> +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
>>>> +		goto out_putfolio;
>>>
>>> I do not think this is the correct change.  Maybe leave this line
>>> alone.
>>
>> Ok, I am aware of the discussion about this in other mail, will not
>> change it(also the next two patch about this function), or wait the
>> new work of David.
>>>
>>>> -	if (PageHuge(page)) {
>>>> -		if (PageHead(page)) {
>>>> -			isolated = isolate_hugetlb(page_folio(page), pagelist);
>>>> +	if (folio_test_hugetlb(folio)) {
>>>> +		if (folio_test_large(folio)) {
>>>
>>> This makes no sense when you read it.  All hugetlb folios are large,
>>> by definition.  Think about what this code used to do, and what it
>>> should be changed to.
>>
>> hugetlb folio is self large folio, will drop redundant check
> 
> No, that's not the difference.  Keep thinking about it.  This is not
> a mechanical translation!


   if (PageHuge(page))  // page must be a hugetlb page
	if (PageHead(page)) // page must be a head page, not tail
              isolate_hugetlb() // isolate the hugetlb page if head

After using folio,

   if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not

I don't check the page is head or not, since the follow_page could
return a sub-page, so the check PageHead need be retained, right?
  
Zi Yan Aug. 4, 2023, 2:42 a.m. UTC | #5
On 3 Aug 2023, at 21:45, Kefeng Wang wrote:

> On 2023/8/3 20:30, Matthew Wilcox wrote:
>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>>
>>>
>>> On 2023/8/2 20:21, Matthew Wilcox wrote:
>>>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
>>>>>    	err = -EACCES;
>>>>> -	if (page_mapcount(page) > 1 && !migrate_all)
>>>>> -		goto out_putpage;
>>>>> +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
>>>>> +		goto out_putfolio;
>>>>
>>>> I do not think this is the correct change.  Maybe leave this line
>>>> alone.
>>>
>>> Ok, I am aware of the discussion about this in other mail, will not
>>> change it(also the next two patch about this function), or wait the
>>> new work of David.
>>>>
>>>>> -	if (PageHuge(page)) {
>>>>> -		if (PageHead(page)) {
>>>>> -			isolated = isolate_hugetlb(page_folio(page), pagelist);
>>>>> +	if (folio_test_hugetlb(folio)) {
>>>>> +		if (folio_test_large(folio)) {
>>>>
>>>> This makes no sense when you read it.  All hugetlb folios are large,
>>>> by definition.  Think about what this code used to do, and what it
>>>> should be changed to.
>>>
>>> hugetlb folio is self large folio, will drop redundant check
>>
>> No, that's not the difference.  Keep thinking about it.  This is not
>> a mechanical translation!
>
>
>   if (PageHuge(page))  // page must be a hugetlb page
> 	if (PageHead(page)) // page must be a head page, not tail
>              isolate_hugetlb() // isolate the hugetlb page if head
>
> After using folio,
>
>   if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
>
> I don't check the page is head or not, since the follow_page could
> return a sub-page, so the check PageHead need be retained, right?

Right. It will prevent the kernel from trying to isolate the same hugetlb page
twice when two pages are in the same hugetlb folio. But looking at the
code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb()
would return false, no error would show up. But it changes err value
from -EACCES to -EBUSY and user will see a different page status than before.

I wonder why we do not have follow_folio() and returns -ENOENT error pointer
when addr points to a non head page. It would make this patch more folio if
follow_folio() can be used in place of follow_page(). One caveat is that
user will see -ENOENT instead of -EACCES after this change.


--
Best Regards,
Yan, Zi
  
Kefeng Wang Aug. 4, 2023, 5:54 a.m. UTC | #6
On 2023/8/4 10:42, Zi Yan wrote:
> On 3 Aug 2023, at 21:45, Kefeng Wang wrote:
> 
>> On 2023/8/3 20:30, Matthew Wilcox wrote:
>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2023/8/2 20:21, Matthew Wilcox wrote:
>>>>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote:
>>>>>>     	err = -EACCES;
>>>>>> -	if (page_mapcount(page) > 1 && !migrate_all)
>>>>>> -		goto out_putpage;
>>>>>> +	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
>>>>>> +		goto out_putfolio;
>>>>>
>>>>> I do not think this is the correct change.  Maybe leave this line
>>>>> alone.
>>>>
>>>> Ok, I am aware of the discussion about this in other mail, will not
>>>> change it(also the next two patch about this function), or wait the
>>>> new work of David.
>>>>>
>>>>>> -	if (PageHuge(page)) {
>>>>>> -		if (PageHead(page)) {
>>>>>> -			isolated = isolate_hugetlb(page_folio(page), pagelist);
>>>>>> +	if (folio_test_hugetlb(folio)) {
>>>>>> +		if (folio_test_large(folio)) {
>>>>>
>>>>> This makes no sense when you read it.  All hugetlb folios are large,
>>>>> by definition.  Think about what this code used to do, and what it
>>>>> should be changed to.
>>>>
>>>> hugetlb folio is self large folio, will drop redundant check
>>>
>>> No, that's not the difference.  Keep thinking about it.  This is not
>>> a mechanical translation!
>>
>>
>>    if (PageHuge(page))  // page must be a hugetlb page
>> 	if (PageHead(page)) // page must be a head page, not tail
>>               isolate_hugetlb() // isolate the hugetlb page if head
>>
>> After using folio,
>>
>>    if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
>>
>> I don't check the page is head or not, since the follow_page could
>> return a sub-page, so the check PageHead need be retained, right?
> 
> Right. It will prevent the kernel from trying to isolate the same hugetlb page
> twice when two pages are in the same hugetlb folio. But looking at the
> code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb()
> would return false, no error would show up. But it changes err value
> from -EACCES to -EBUSY and user will see a different page status than before.


When check man[1], the current -EACCES is not right, -EBUSY is not
precise but more suitable for this scenario,

  	-EACCES
               The page is mapped by multiple processes and can be moved
               only if MPOL_MF_MOVE_ALL is specified.

        -EBUSY The page is currently busy and cannot be moved.  Try again
               later.  This occurs if a page is undergoing I/O or another
               kernel subsystem is holding a reference to the page.
	-ENOENT
               The page is not present.

> 
> I wonder why we do not have follow_folio() and returns -ENOENT error pointer
> when addr points to a non head page. It would make this patch more folio if
> follow_folio() can be used in place of follow_page(). One caveat is that
> user will see -ENOENT instead of -EACCES after this change.
> 

-ENOENT is ok, but maybe the man need to be updated too.


	
[1] https://man7.org/linux/man-pages/man2/move_pages.2.html






> 
> --
> Best Regards,
> Yan, Zi
  
Kefeng Wang Aug. 7, 2023, 12:20 p.m. UTC | #7
Hi Zi Yan and Matthew and Naoya,

On 2023/8/4 13:54, Kefeng Wang wrote:
> 
> 
> On 2023/8/4 10:42, Zi Yan wrote:
>> On 3 Aug 2023, at 21:45, Kefeng Wang wrote:
>>
>>> On 2023/8/3 20:30, Matthew Wilcox wrote:
>>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>>>>

...

>>>
>>>
>>>    if (PageHuge(page))  // page must be a hugetlb page
>>>     if (PageHead(page)) // page must be a head page, not tail
>>>               isolate_hugetlb() // isolate the hugetlb page if head
>>>
>>> After using folio,
>>>
>>>    if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
>>>
>>> I don't check the page is head or not, since the follow_page could
>>> return a sub-page, so the check PageHead need be retained, right?
>>
>> Right. It will prevent the kernel from trying to isolate the same 
>> hugetlb page
>> twice when two pages are in the same hugetlb folio. But looking at the
>> code, if you try to isolate an already-isolated hugetlb folio, 
>> isolate_hugetlb()
>> would return false, no error would show up. But it changes err value
>> from -EACCES to -EBUSY and user will see a different page status than 
>> before.
> 

Before e66f17ff7177 ("mm/hugetlb: take page table lock in 
follow_huge_pmd()")
in v4.0, follow_page() will return NULL on tail page for Huagetlb page,
and move_pages() will return -ENOENT errno,but after that commit,
-EACCES is returned, which not match the manual,

> 
> When check man[1], the current -EACCES is not right, -EBUSY is not
> precise but more suitable for this scenario,
> 
>       -EACCES
>                The page is mapped by multiple processes and can be moved
>                only if MPOL_MF_MOVE_ALL is specified.
> 
>       -EBUSY The page is currently busy and cannot be moved.  Try again
>                later.  This occurs if a page is undergoing I/O or another
>                kernel subsystem is holding a reference to the page.
>      -ENOENT
>                The page is not present.
> 
>>
>> I wonder why we do not have follow_folio() and returns -ENOENT error 
>> pointer
>> when addr points to a non head page. It would make this patch more 
>> folio if
>> follow_folio() can be used in place of follow_page(). One caveat is that
>> user will see -ENOENT instead of -EACCES after this change.
>>
> 
> -ENOENT is ok, but maybe the man need to be updated too.

According to above analysis, -ENOENT is suitable when introduce the
follow_folio(), but when THP migrate support is introduced by
e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in
v4.14, the tail page will be turned into head page and return -EBUSY,

So should we unify errno(maybe use -ENOENT) about the tail page?


> 
> 
> 
> [1] https://man7.org/linux/man-pages/man2/move_pages.2.html
  
Zi Yan Aug. 7, 2023, 6:45 p.m. UTC | #8
On 7 Aug 2023, at 8:20, Kefeng Wang wrote:

> Hi Zi Yan and Matthew and Naoya,
>
> On 2023/8/4 13:54, Kefeng Wang wrote:
>>
>>
>> On 2023/8/4 10:42, Zi Yan wrote:
>>> On 3 Aug 2023, at 21:45, Kefeng Wang wrote:
>>>
>>>> On 2023/8/3 20:30, Matthew Wilcox wrote:
>>>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>>>>>
>
> ...
>
>>>>
>>>>
>>>>    if (PageHuge(page))  // page must be a hugetlb page
>>>>     if (PageHead(page)) // page must be a head page, not tail
>>>>               isolate_hugetlb() // isolate the hugetlb page if head
>>>>
>>>> After using folio,
>>>>
>>>>    if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
>>>>
>>>> I don't check the page is head or not, since the follow_page could
>>>> return a sub-page, so the check PageHead need be retained, right?
>>>
>>> Right. It will prevent the kernel from trying to isolate the same hugetlb page
>>> twice when two pages are in the same hugetlb folio. But looking at the
>>> code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb()
>>> would return false, no error would show up. But it changes err value
>>> from -EACCES to -EBUSY and user will see a different page status than before.
>>
>
> Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()")
> in v4.0, follow_page() will return NULL on tail page for Huagetlb page,
> and move_pages() will return -ENOENT errno,but after that commit,
> -EACCES is returned, which not match the manual,
>
>>
>> When check man[1], the current -EACCES is not right, -EBUSY is not
>> precise but more suitable for this scenario,
>>
>>       -EACCES
>>                The page is mapped by multiple processes and can be moved
>>                only if MPOL_MF_MOVE_ALL is specified.
>>
>>       -EBUSY The page is currently busy and cannot be moved.  Try again
>>                later.  This occurs if a page is undergoing I/O or another
>>                kernel subsystem is holding a reference to the page.
>>      -ENOENT
>>                The page is not present.
>>
>>>
>>> I wonder why we do not have follow_folio() and returns -ENOENT error pointer
>>> when addr points to a non head page. It would make this patch more folio if
>>> follow_folio() can be used in place of follow_page(). One caveat is that
>>> user will see -ENOENT instead of -EACCES after this change.
>>>
>>
>> -ENOENT is ok, but maybe the man need to be updated too.
>
> According to above analysis, -ENOENT is suitable when introduce the
> follow_folio(), but when THP migrate support is introduced by
> e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in
> v4.14, the tail page will be turned into head page and return -EBUSY,
>
> So should we unify errno(maybe use -ENOENT) about the tail page?
>
>
>>
>>
>>
>> [1] https://man7.org/linux/man-pages/man2/move_pages.2.html

I think so. I think -EBUSY is more reasonable for tail pages. But there is
some subtle difference between THP and hugetlb from current code:

For THP, compound_head() is used to get the head page for isolation, this means
if user specifies a tail page address in move_pages(), the whole THP can be
migrated.

For hugetlb, only if user specifies the head page address of a hugetlb page,
the hugetlb page will be migrated. Otherwise, an error would show up.

Cc Mike to help us clarify the expected behavior of hugetlb.

Hi Mike, what is the expected behavior, if a user tries to use move_pages()
to migrate a non head page of a hugetlb page?

--
Best Regards,
Yan, Zi
  
Kefeng Wang Aug. 9, 2023, 12:37 p.m. UTC | #9
Hi Mike

On 2023/8/8 2:45, Zi Yan wrote:
> On 7 Aug 2023, at 8:20, Kefeng Wang wrote:
> 
>> Hi Zi Yan and Matthew and Naoya,
>>
>> On 2023/8/4 13:54, Kefeng Wang wrote:
>>>
>>>
>>> On 2023/8/4 10:42, Zi Yan wrote:
>>>> On 3 Aug 2023, at 21:45, Kefeng Wang wrote:
>>>>
>>>>> On 2023/8/3 20:30, Matthew Wilcox wrote:
>>>>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
>>>>>>>
>>
>> ...
>>
>>>>>
>>>>>
>>>>>     if (PageHuge(page))  // page must be a hugetlb page
>>>>>      if (PageHead(page)) // page must be a head page, not tail
>>>>>                isolate_hugetlb() // isolate the hugetlb page if head
>>>>>
>>>>> After using folio,
>>>>>
>>>>>     if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
>>>>>
>>>>> I don't check the page is head or not, since the follow_page could
>>>>> return a sub-page, so the check PageHead need be retained, right?
>>>>
>>>> Right. It will prevent the kernel from trying to isolate the same hugetlb page
>>>> twice when two pages are in the same hugetlb folio. But looking at the
>>>> code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb()
>>>> would return false, no error would show up. But it changes err value
>>>> from -EACCES to -EBUSY and user will see a different page status than before.
>>>
>>
>> Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()")
>> in v4.0, follow_page() will return NULL on tail page for Huagetlb page,
>> and move_pages() will return -ENOENT errno,but after that commit,
>> -EACCES is returned, which not match the manual,
>>
>>>
>>> When check man[1], the current -EACCES is not right, -EBUSY is not
>>> precise but more suitable for this scenario,
>>>
>>>        -EACCES
>>>                 The page is mapped by multiple processes and can be moved
>>>                 only if MPOL_MF_MOVE_ALL is specified.
>>>
>>>        -EBUSY The page is currently busy and cannot be moved.  Try again
>>>                 later.  This occurs if a page is undergoing I/O or another
>>>                 kernel subsystem is holding a reference to the page.
>>>       -ENOENT
>>>                 The page is not present.
>>>
>>>>
>>>> I wonder why we do not have follow_folio() and returns -ENOENT error pointer
>>>> when addr points to a non head page. It would make this patch more folio if
>>>> follow_folio() can be used in place of follow_page(). One caveat is that
>>>> user will see -ENOENT instead of -EACCES after this change.
>>>>
>>>
>>> -ENOENT is ok, but maybe the man need to be updated too.
>>
>> According to above analysis, -ENOENT is suitable when introduce the
>> follow_folio(), but when THP migrate support is introduced by
>> e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in
>> v4.14, the tail page will be turned into head page and return -EBUSY,
>>
>> So should we unify errno(maybe use -ENOENT) about the tail page?
>>
>>
>>>
>>>
>>>
>>> [1] https://man7.org/linux/man-pages/man2/move_pages.2.html
> 
> I think so. I think -EBUSY is more reasonable for tail pages. But there is
> some subtle difference between THP and hugetlb from current code:
> 
> For THP, compound_head() is used to get the head page for isolation, this means
> if user specifies a tail page address in move_pages(), the whole THP can be
> migrated.
> 
> For hugetlb, only if user specifies the head page address of a hugetlb page,
> the hugetlb page will be migrated. Otherwise, an error would show up.
> 
> Cc Mike to help us clarify the expected behavior of hugetlb.
> 
> Hi Mike, what is the expected behavior, if a user tries to use move_pages()
> to migrate a non head page of a hugetlb page?

Could you give some advise, thanks

> 
> --
> Best Regards,
> Yan, Zi
  
Mike Kravetz Aug. 9, 2023, 8:53 p.m. UTC | #10
On 08/09/23 20:37, Kefeng Wang wrote:
> Hi Mike
> 
> On 2023/8/8 2:45, Zi Yan wrote:
> > On 7 Aug 2023, at 8:20, Kefeng Wang wrote:
> > 
> > > Hi Zi Yan and Matthew and Naoya,
> > > 
> > > On 2023/8/4 13:54, Kefeng Wang wrote:
> > > > 
> > > > 
> > > > On 2023/8/4 10:42, Zi Yan wrote:
> > > > > On 3 Aug 2023, at 21:45, Kefeng Wang wrote:
> > > > > 
> > > > > > On 2023/8/3 20:30, Matthew Wilcox wrote:
> > > > > > > On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote:
> > > > > > > > 
> > > 
> > > ...
> > > 
> > > > > > 
> > > > > > 
> > > > > >     if (PageHuge(page))  // page must be a hugetlb page
> > > > > >      if (PageHead(page)) // page must be a head page, not tail
> > > > > >                isolate_hugetlb() // isolate the hugetlb page if head
> > > > > > 
> > > > > > After using folio,
> > > > > > 
> > > > > >     if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not
> > > > > > 
> > > > > > I don't check the page is head or not, since the follow_page could
> > > > > > return a sub-page, so the check PageHead need be retained, right?
> > > > > 
> > > > > Right. It will prevent the kernel from trying to isolate the same hugetlb page
> > > > > twice when two pages are in the same hugetlb folio. But looking at the
> > > > > code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb()
> > > > > would return false, no error would show up. But it changes err value
> > > > > from -EACCES to -EBUSY and user will see a different page status than before.
> > > > 
> > > 
> > > Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()")
> > > in v4.0, follow_page() will return NULL on tail page for Huagetlb page,
> > > and move_pages() will return -ENOENT errno,but after that commit,
> > > -EACCES is returned, which not match the manual,
> > > 
> > > > 
> > > > When check man[1], the current -EACCES is not right, -EBUSY is not
> > > > precise but more suitable for this scenario,
> > > > 
> > > >        -EACCES
> > > >                 The page is mapped by multiple processes and can be moved
> > > >                 only if MPOL_MF_MOVE_ALL is specified.
> > > > 
> > > >        -EBUSY The page is currently busy and cannot be moved.  Try again
> > > >                 later.  This occurs if a page is undergoing I/O or another
> > > >                 kernel subsystem is holding a reference to the page.
> > > >       -ENOENT
> > > >                 The page is not present.
> > > > 
> > > > > 
> > > > > I wonder why we do not have follow_folio() and returns -ENOENT error pointer
> > > > > when addr points to a non head page. It would make this patch more folio if
> > > > > follow_folio() can be used in place of follow_page(). One caveat is that
> > > > > user will see -ENOENT instead of -EACCES after this change.
> > > > > 
> > > > 
> > > > -ENOENT is ok, but maybe the man need to be updated too.
> > > 
> > > According to above analysis, -ENOENT is suitable when introduce the
> > > follow_folio(), but when THP migrate support is introduced by
> > > e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in
> > > v4.14, the tail page will be turned into head page and return -EBUSY,
> > > 
> > > So should we unify errno(maybe use -ENOENT) about the tail page?
> > > 
> > > 
> > > > 
> > > > 
> > > > 
> > > > [1] https://man7.org/linux/man-pages/man2/move_pages.2.html
> > 
> > I think so. I think -EBUSY is more reasonable for tail pages. But there is
> > some subtle difference between THP and hugetlb from current code:
> > 
> > For THP, compound_head() is used to get the head page for isolation, this means
> > if user specifies a tail page address in move_pages(), the whole THP can be
> > migrated.
> > 
> > For hugetlb, only if user specifies the head page address of a hugetlb page,
> > the hugetlb page will be migrated. Otherwise, an error would show up.
> > 
> > Cc Mike to help us clarify the expected behavior of hugetlb.
> > 
> > Hi Mike, what is the expected behavior, if a user tries to use move_pages()
> > to migrate a non head page of a hugetlb page?
> 
> Could you give some advise, thanks
> 

Sorry, I was away for a while.

It seems unfortunate that move_pages says the passed user addresses
should be aligned to page boundaries.  However, IIUC this is not checked
or enforced.  Otherwise, passing a hugetlb page should return the same
error.

One thought would be that hugetlb mappings should behave the same
non-hugetlb mappings.  If passed the address of a hugetlb tail page, align
the address to a hugetlb boundary and migrate the page.  This changes the
existing behavior.  However, it would be hard to imagine anyone depending
on this.

After taking a closer look at the add_page_for_migration(), it seems to
just ignore passed tail pages and do nothing for such passed addresses.
Correct?  Or, am I missing something?  Perhaps that is behavior we want/
need to preserve?
  
Mike Kravetz Aug. 9, 2023, 10:44 p.m. UTC | #11
On 08/09/23 13:53, Mike Kravetz wrote:
> On 08/09/23 20:37, Kefeng Wang wrote:
> > > 
> > > Cc Mike to help us clarify the expected behavior of hugetlb.
> > > 
> > > Hi Mike, what is the expected behavior, if a user tries to use move_pages()
> > > to migrate a non head page of a hugetlb page?
> > 
> > Could you give some advise, thanks
> > 
> 
> Sorry, I was away for a while.
> 
> It seems unfortunate that move_pages says the passed user addresses
> should be aligned to page boundaries.  However, IIUC this is not checked
> or enforced.  Otherwise, passing a hugetlb page should return the same
> error.
> 
> One thought would be that hugetlb mappings should behave the same
> non-hugetlb mappings.  If passed the address of a hugetlb tail page, align
> the address to a hugetlb boundary and migrate the page.  This changes the
> existing behavior.  However, it would be hard to imagine anyone depending
> on this.
> 
> After taking a closer look at the add_page_for_migration(), it seems to
> just ignore passed tail pages and do nothing for such passed addresses.
> Correct?  Or, am I missing something?  Perhaps that is behavior we want/
> need to preserve?

My mistake, status -EACCES is returned when passing a tail page of a
hugetlb page.

Back to the question of 'What is the expected behavior if a tail page is
passed?'.  I do not think we have defined an expected behavior.  If
anything is 'expected' I would say it is -EACCES as returned today.

BTW - hugetlb pages not migrated due to passing a tail page does not
seem to contribute to a 'Positive return value' indicating the number of
non-migrated pages.
  
Kefeng Wang Aug. 10, 2023, 1:49 a.m. UTC | #12
On 2023/8/10 6:44, Mike Kravetz wrote:
> On 08/09/23 13:53, Mike Kravetz wrote:
>> On 08/09/23 20:37, Kefeng Wang wrote:
>>>>
>>>> Cc Mike to help us clarify the expected behavior of hugetlb.
>>>>
>>>> Hi Mike, what is the expected behavior, if a user tries to use move_pages()
>>>> to migrate a non head page of a hugetlb page?
>>>
>>> Could you give some advise, thanks
>>>
>>
>> Sorry, I was away for a while.
>>
>> It seems unfortunate that move_pages says the passed user addresses
>> should be aligned to page boundaries.  However, IIUC this is not checked
>> or enforced.  Otherwise, passing a hugetlb page should return the same
>> error.
>>
>> One thought would be that hugetlb mappings should behave the same
>> non-hugetlb mappings.  If passed the address of a hugetlb tail page, align
>> the address to a hugetlb boundary and migrate the page.  This changes the
>> existing behavior.  However, it would be hard to imagine anyone depending
>> on this.
>>
>> After taking a closer look at the add_page_for_migration(), it seems to
>> just ignore passed tail pages and do nothing for such passed addresses.
>> Correct?  Or, am I missing something?  Perhaps that is behavior we want/
>> need to preserve?
> 
> My mistake, status -EACCES is returned when passing a tail page of a
> hugetlb page.
> 

As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take
page table lock in follow_huge_pmd()") in v4.0, follow_page() will
return NULL on tail page for Huagetlb page, so move_pages() will return
-ENOENT errno, but after that commit, -EACCES is returned.

Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will 
be migrated on a tail page, but HUGETLB will return -EACCES(after v4.0)
or -ENOENT(before v4.0) on tail page.

> Back to the question of 'What is the expected behavior if a tail page is
> passed?'.  I do not think we have defined an expected behavior.  If
> anything is 'expected' I would say it is -EACCES as returned today.
> 

My question is,

Should we keep seem behavior between HUGETLB and THP, or only change the
errno from -EACCES to -ENOENT/-EBUSY.

I would like to drop PageHead() check for Hugetlb to keep seem behavior,
which will keep seem error code if isolate fail or success on head/tail
page.

Thanks.

> BTW - hugetlb pages not migrated due to passing a tail page does not
> seem to contribute to a 'Positive return value' indicating the number of
> non-migrated pages.
  
Mike Kravetz Aug. 10, 2023, 4:29 p.m. UTC | #13
On 08/10/23 09:49, Kefeng Wang wrote:
> 
> 
> On 2023/8/10 6:44, Mike Kravetz wrote:
> > On 08/09/23 13:53, Mike Kravetz wrote:
> > > On 08/09/23 20:37, Kefeng Wang wrote:
> > > > > 
> > > > > Cc Mike to help us clarify the expected behavior of hugetlb.
> > > > > 
> > > > > Hi Mike, what is the expected behavior, if a user tries to use move_pages()
> > > > > to migrate a non head page of a hugetlb page?
> > > > 
> > > > Could you give some advise, thanks
> > > > 
> > > 
> > > Sorry, I was away for a while.
> > > 
> > > It seems unfortunate that move_pages says the passed user addresses
> > > should be aligned to page boundaries.  However, IIUC this is not checked
> > > or enforced.  Otherwise, passing a hugetlb page should return the same
> > > error.
> > > 
> > > One thought would be that hugetlb mappings should behave the same
> > > non-hugetlb mappings.  If passed the address of a hugetlb tail page, align
> > > the address to a hugetlb boundary and migrate the page.  This changes the
> > > existing behavior.  However, it would be hard to imagine anyone depending
> > > on this.
> > > 
> > > After taking a closer look at the add_page_for_migration(), it seems to
> > > just ignore passed tail pages and do nothing for such passed addresses.
> > > Correct?  Or, am I missing something?  Perhaps that is behavior we want/
> > > need to preserve?
> > 
> > My mistake, status -EACCES is returned when passing a tail page of a
> > hugetlb page.
> > 
> 
> As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take
> page table lock in follow_huge_pmd()") in v4.0, follow_page() will
> return NULL on tail page for Huagetlb page, so move_pages() will return
> -ENOENT errno, but after that commit, -EACCES is returned.
> 
> Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will be
> migrated on a tail page, but HUGETLB will return -EACCES(after v4.0)
> or -ENOENT(before v4.0) on tail page.
> 
> > Back to the question of 'What is the expected behavior if a tail page is
> > passed?'.  I do not think we have defined an expected behavior.  If
> > anything is 'expected' I would say it is -EACCES as returned today.
> > 
> 
> My question is,
> 
> Should we keep seem behavior between HUGETLB and THP, or only change the
> errno from -EACCES to -ENOENT/-EBUSY.

Just to be clear.  When you say "keep seem behavior between HUGETLB and THP",
are you saying that you would like hugetlb to perform migration of the entire
hugetlb page if a tail page is passed?

IMO, this would be ideal as it would mean that hugetlb and THP behave the same
when passed the address of a tail page.  The fewer places where hugetlb
behavior diverges, the better.  However, this does change behavior.

As mentioned above, I have a hard time imagining someone depending on the
behavior that passing the address of a hugetlb tail page returns error.  But,
this is almost impossible to predict.

Thoughts from others?
  
Kefeng Wang Aug. 16, 2023, 12:50 a.m. UTC | #14
On 2023/8/16 5:12, Mike Kravetz wrote:
> On 08/15/23 11:58, Huang, Ying wrote:
>> Mike Kravetz <mike.kravetz@oracle.com> writes:
>>
>>> On 08/10/23 09:49, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2023/8/10 6:44, Mike Kravetz wrote:
>>>>> On 08/09/23 13:53, Mike Kravetz wrote:
>>>>>> On 08/09/23 20:37, Kefeng Wang wrote:
>>>>>>>>
>>>>>>>> Cc Mike to help us clarify the expected behavior of hugetlb.
>>>>>>>>
>>>>>>>> Hi Mike, what is the expected behavior, if a user tries to use move_pages()
>>>>>>>> to migrate a non head page of a hugetlb page?
>>>>>>>
>>>>>>> Could you give some advise, thanks
>>>>>>>
>>>>>>
>>>>>> Sorry, I was away for a while.
>>>>>>
>>>>>> It seems unfortunate that move_pages says the passed user addresses
>>>>>> should be aligned to page boundaries.  However, IIUC this is not checked
>>>>>> or enforced.  Otherwise, passing a hugetlb page should return the same
>>>>>> error.
>>>>>>
>>>>>> One thought would be that hugetlb mappings should behave the same
>>>>>> non-hugetlb mappings.  If passed the address of a hugetlb tail page, align
>>>>>> the address to a hugetlb boundary and migrate the page.  This changes the
>>>>>> existing behavior.  However, it would be hard to imagine anyone depending
>>>>>> on this.
>>>>>>
>>>>>> After taking a closer look at the add_page_for_migration(), it seems to
>>>>>> just ignore passed tail pages and do nothing for such passed addresses.
>>>>>> Correct?  Or, am I missing something?  Perhaps that is behavior we want/
>>>>>> need to preserve?
>>>>>
>>>>> My mistake, status -EACCES is returned when passing a tail page of a
>>>>> hugetlb page.
>>>>>
>>>>
>>>> As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take
>>>> page table lock in follow_huge_pmd()") in v4.0, follow_page() will
>>>> return NULL on tail page for Huagetlb page, so move_pages() will return
>>>> -ENOENT errno, but after that commit, -EACCES is returned.
>>>>
>>>> Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will be
>>>> migrated on a tail page, but HUGETLB will return -EACCES(after v4.0)
>>>> or -ENOENT(before v4.0) on tail page.
>>>>
>>>>> Back to the question of 'What is the expected behavior if a tail page is
>>>>> passed?'.  I do not think we have defined an expected behavior.  If
>>>>> anything is 'expected' I would say it is -EACCES as returned today.
>>>>>
>>>>
>>>> My question is,
>>>>
>>>> Should we keep seem behavior between HUGETLB and THP, or only change the
>>>> errno from -EACCES to -ENOENT/-EBUSY.
>>>
>>> Just to be clear.  When you say "keep seem behavior between HUGETLB and THP",
>>> are you saying that you would like hugetlb to perform migration of the entire
>>> hugetlb page if a tail page is passed?
>>>
>>> IMO, this would be ideal as it would mean that hugetlb and THP behave the same
>>> when passed the address of a tail page.  The fewer places where hugetlb
>>> behavior diverges, the better.  However, this does change behavior.
>>
>> A separate patch will be needed for behavior change.
>>
> 
> Correct.
> 
> Since the goal of this series is to convert to folios, we should maintain the
> existing behavior and errno (-EACCES).  In a subsequent patch, we can
> change behavior.
> 
> That would be my suggestion.


Thanks all, will following the suggestion and re-post.
  

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index e21d5a7e7447..b0c318bc531c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2057,6 +2057,7 @@  static int add_page_for_migration(struct mm_struct *mm, const void __user *p,
 	struct vm_area_struct *vma;
 	unsigned long addr;
 	struct page *page;
+	struct folio *folio;
 	int err;
 	bool isolated;
 
@@ -2079,45 +2080,42 @@  static int add_page_for_migration(struct mm_struct *mm, const void __user *p,
 	if (!page)
 		goto out;
 
-	if (is_zone_device_page(page))
-		goto out_putpage;
+	folio = page_folio(page);
+	if (folio_is_zone_device(folio))
+		goto out_putfolio;
 
 	err = 0;
-	if (page_to_nid(page) == node)
-		goto out_putpage;
+	if (folio_nid(folio) == node)
+		goto out_putfolio;
 
 	err = -EACCES;
-	if (page_mapcount(page) > 1 && !migrate_all)
-		goto out_putpage;
+	if (folio_estimated_sharers(folio) > 1 && !migrate_all)
+		goto out_putfolio;
 
-	if (PageHuge(page)) {
-		if (PageHead(page)) {
-			isolated = isolate_hugetlb(page_folio(page), pagelist);
+	if (folio_test_hugetlb(folio)) {
+		if (folio_test_large(folio)) {
+			isolated = isolate_hugetlb(folio, pagelist);
 			err = isolated ? 1 : -EBUSY;
 		}
 	} else {
-		struct page *head;
-
-		head = compound_head(page);
-		isolated = isolate_lru_page(head);
+		isolated = folio_isolate_lru(folio);
 		if (!isolated) {
 			err = -EBUSY;
-			goto out_putpage;
+			goto out_putfolio;
 		}
 
 		err = 1;
-		list_add_tail(&head->lru, pagelist);
-		mod_node_page_state(page_pgdat(head),
-			NR_ISOLATED_ANON + page_is_file_lru(head),
-			thp_nr_pages(head));
+		list_add_tail(&folio->lru, pagelist);
+		node_stat_mod_folio(folio,
+			NR_ISOLATED_ANON + folio_is_file_lru(folio),
+			folio_nr_pages(folio));
 	}
-out_putpage:
+out_putfolio:
 	/*
-	 * Either remove the duplicate refcount from
-	 * isolate_lru_page() or drop the page ref if it was
-	 * not isolated.
+	 * Either remove the duplicate refcount from folio_isolate_lru()
+	 * or drop the folio ref if it was not isolated.
 	 */
-	put_page(page);
+	folio_put(folio);
 out:
 	mmap_read_unlock(mm);
 	return err;