[v2,1/2] mm: migrate: Fix return value if all subpages of THPs are migrated successfully

Message ID fca6bb0bd48a0292a0ace2fadd0f44579a060cbb.1666335603.git.baolin.wang@linux.alibaba.com
State New

Commit Message

Baolin Wang Oct. 21, 2022, 10:16 a.m. UTC
During THP migration, if THPs are split and all subpages are migrated
successfully, migrate_pages() will still return the number of THPs that were
not migrated. That will confuse the callers of migrate_pages(); for example,
it will make longterm pinning fail even though all pages were migrated
successfully.

Thus we should return 0 to indicate all pages are migrated in this case.
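
To make the caller-visible effect concrete, here is a minimal user-space sketch of the behaviour described above. It is not kernel code: the struct, its fields, and the helper names are all hypothetical, and it assumes only what the commit message states, namely that an unmigrated THP is reported in the return value even when its subpages were all migrated, and that a caller treats any non-zero return as failure.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model_result {
	int nr_thp_failed;  /* THPs that could not be migrated as a whole */
	int nr_thp_split;   /* of those, how many were split into subpages */
	bool from_empty;    /* true when no page is left on the 'from' list */
};

/* Behaviour described in the commit message: unmigrated THPs are reported. */
static int model_rc_unpatched(const struct model_result *r)
{
	return r->nr_thp_failed;
}

/* Behaviour with the check added by this patch (v2 form). */
static int model_rc_patched(const struct model_result *r)
{
	int rc = r->nr_thp_failed;

	if (r->nr_thp_split && r->from_empty)
		rc = 0;
	return rc;
}

/* Hypothetical caller that treats any non-zero return value as failure. */
static int model_longterm_pin(int migrate_rc)
{
	return migrate_rc ? -ENOMEM : 0;
}
```

In this model, a run where one THP was split and nothing is left on the list makes the hypothetical caller fail without the check and succeed with it.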

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Changes from v1:
- Fix the return value of migrate_pages() instead of fixing the
  callers' validation.
---
 mm/migrate.c | 7 +++++++
 1 file changed, 7 insertions(+)
  

Comments

Andrew Morton Oct. 21, 2022, 6:41 p.m. UTC | #1
On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> When THP migration, if THPs are split and all subpages are migrated successfully
> , the migrate_pages() will still return the number of THP that were not migrated.
> That will confuse the callers of migrate_pages(), for example, which will make
> the longterm pinning failed though all pages are migrated successfully.
> 
> Thus we should return 0 to indicate all pages are migrated in this case.
> 

This had me puzzled for a while.  I think this wording is clearer?

: During THP migration, if THPs are not migrated but they are split and all
: subpages are migrated successfully, migrate_pages() will still return the
: number of THP pages that were not migrated.  This will confuse the callers
: of migrate_pages().  For example, the longterm pinning will fail even though
: all pages are migrated successfully.
:
: Thus we should return 0 to indicate that all pages are migrated in this
: case.

This is a fairly longstanding problem?  No Fixes: we can identify?

Did you consider the desirability of a -stable backport?
  
Yang Shi Oct. 21, 2022, 8:55 p.m. UTC | #2
On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
<akpm@linux-foundation.org> wrote:
>
> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
> > When THP migration, if THPs are split and all subpages are migrated successfully
> > , the migrate_pages() will still return the number of THP that were not migrated.
> > That will confuse the callers of migrate_pages(), for example, which will make
> > the longterm pinning failed though all pages are migrated successfully.
> >
> > Thus we should return 0 to indicate all pages are migrated in this case.
> >
>
> This had me puzzled for a while.  I think this wording is clearer?
>
> : During THP migration, if THPs are not migrated but they are split and all
> : subpages are migrated successfully, migrate_pages() will still return the
> : number of THP pages that were not migrated.  This will confuse the callers
> : of migrate_pages().  For example, the longterm pinning will failed though
> : all pages are migrated successfully.
> :
> : Thus we should return 0 to indicate that all pages are migrated in this
> : case.
>
> This is a fairly longstanding problem?  No Fixes: we can identify?

It doesn't seem like a long standing issue. It seems like commit
b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
fixed one problem, but introduced this new one IIUC.

Before this commit, the code did:

nr_failed += retry + thp_retry;
rc = nr_failed;

But retry and thp_retry were actually reset for each retry until the
last one. So as long as there is no permanent migration failure and
THP split failure, nr_failed should be 0 IIUC. TBH the code is a
little bit hard to follow, please correct me if I'm wrong.
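
The accounting described above could be sketched roughly as follows. This is a simplified user-space model and an assumption about the shape of the older loop, not the actual mm/migrate.c code: the function name and the per-pass input arrays are hypothetical, and only the "reset on every pass, add what is left after the last pass" behaviour is taken from the explanation above.

```c
/*
 * Hypothetical model of the older accounting. retry_seen[i] and
 * thp_retry_seen[i] are the numbers of base pages and THPs that pass i
 * could not finish and would like to retry.
 */
static int old_style_rc(const int *retry_seen, const int *thp_retry_seen,
			int max_passes, int nr_permanent_failures)
{
	int nr_failed = nr_permanent_failures;
	int retry = 1;      /* non-zero so that the first pass runs */
	int thp_retry = 1;

	for (int pass = 0; pass < max_passes && (retry || thp_retry); pass++) {
		/* The counters are reset at the start of every pass ... */
		retry = 0;
		thp_retry = 0;
		/* ... so they only reflect what this pass left unfinished. */
		retry += retry_seen[pass];
		thp_retry += thp_retry_seen[pass];
	}

	/* Only the leftovers of the last pass end up in the return value. */
	nr_failed += retry + thp_retry;
	return nr_failed;
}
```

Under this model, pages that needed a retry in an early pass but were migrated in a later one do not contribute to the result, which matches the claim that nr_failed stays 0 when there is no permanent failure.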

>
> Did you consider the desirability of a -stable backport?
>
  
Huang, Ying Oct. 24, 2022, 1:56 a.m. UTC | #3
Yang Shi <shy828301@gmail.com> writes:

> On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
> <akpm@linux-foundation.org> wrote:
>>
>> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>> > When THP migration, if THPs are split and all subpages are migrated successfully
>> > , the migrate_pages() will still return the number of THP that were not migrated.
>> > That will confuse the callers of migrate_pages(), for example, which will make
>> > the longterm pinning failed though all pages are migrated successfully.
>> >
>> > Thus we should return 0 to indicate all pages are migrated in this case.
>> >
>>
>> This had me puzzled for a while.  I think this wording is clearer?
>>
>> : During THP migration, if THPs are not migrated but they are split and all
>> : subpages are migrated successfully, migrate_pages() will still return the
>> : number of THP pages that were not migrated.  This will confuse the callers
>> : of migrate_pages().  For example, the longterm pinning will failed though
>> : all pages are migrated successfully.
>> :
>> : Thus we should return 0 to indicate that all pages are migrated in this
>> : case.
>>
>> This is a fairly longstanding problem?  No Fixes: we can identify?
>
> It doesn't seem like a long standing issue. It seems like commit
> b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
> fixed one problem, but introduced this new one IIUC.
>
> Before this commit, the code did:
>
> nr_failed += retry + thp_retry;
> rc = nr_failed;
>
> But retry and thp_retry were actually reset for each retry until the
> last one. So as long as there is no permanent migration failure and
> THP split failure, nr_failed should be 0 IIUC. TBH the code is a
> little bit hard to follow, please correct me if I'm wrong.

I think that you are correct.  We can add

Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")

>> Did you consider the desirability of a -stable backport?

I think this can be backported to -stable.

Best Regards,
Huang, Ying
  
Alistair Popple Oct. 24, 2022, 2:36 a.m. UTC | #4
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> When THP migration, if THPs are split and all subpages are migrated successfully
> , the migrate_pages() will still return the number of THP that were not migrated.
> That will confuse the callers of migrate_pages(), for example, which will make
> the longterm pinning failed though all pages are migrated successfully.
>
> Thus we should return 0 to indicate all pages are migrated in this case.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> Changes from v1:
> - Fix the return value of migrate_pages() instead of fixing the
>   callers' validation.
> ---
>  mm/migrate.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8e5eb6e..1da0dbc 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  	 */
>  	list_splice(&ret_pages, from);
>
> +	/*
> +	 * Return 0 in case all subpages of fail-to-migrate THPs are
> +	 * migrated successfully.
> +	 */
> +	if (nr_thp_split && list_empty(from))
> +		rc = 0;

Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
imply success? And if it doesn't imply success wouldn't it be possible
to end up with nr_thp_split && list_empty(from) whilst still having
pages that failed to migrate?

The list management and return code logic from unmap_and_move() has
gotten pretty difficult to follow and could do with some rework IMHO.

>  	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
>  	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
>  	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);
  
Baolin Wang Oct. 24, 2022, 6:03 a.m. UTC | #5
On 10/24/2022 9:56 AM, Huang, Ying wrote:
> Yang Shi <shy828301@gmail.com> writes:
> 
>> On Fri, Oct 21, 2022 at 11:41 AM Andrew Morton
>> <akpm@linux-foundation.org> wrote:
>>>
>>> On Fri, 21 Oct 2022 18:16:23 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>
>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>
>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>
>>>
>>> This had me puzzled for a while.  I think this wording is clearer?
>>>
>>> : During THP migration, if THPs are not migrated but they are split and all
>>> : subpages are migrated successfully, migrate_pages() will still return the
>>> : number of THP pages that were not migrated.  This will confuse the callers
>>> : of migrate_pages().  For example, the longterm pinning will failed though
>>> : all pages are migrated successfully.
>>> :
>>> : Thus we should return 0 to indicate that all pages are migrated in this
>>> : case.
>>>
>>> This is a fairly longstanding problem?  No Fixes: we can identify?
>>
>> It doesn't seem like a long standing issue. It seems like commit
>> b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")
>> fixed one problem, but introduced this new one IIUC.
>>
>> Before this commit, the code did:
>>
>> nr_failed += retry + thp_retry;
>> rc = nr_failed;
>>
>> But retry and thp_retry were actually reset for each retry until the
>> last one. So as long as there is no permanent migration failure and
>> THP split failure, nr_failed should be 0 IIUC. TBH the code is a
>> little bit hard to follow, please correct me if I'm wrong.
> 
> I think that you are correct.  We can added
> 
> Fixes: b5bade978e9b ("mm: migrate: fix the return value of migrate_pages()")

I think so too. Thanks Yang and Ying for pointing it out.

> 
>>> Did you consider the desirability of a -stable backport?
> 
> I think this can be backport to -stable.

Agree.

Andrew, could you help to add the Fixes tag and cc -stable? Thanks.
  
Baolin Wang Oct. 24, 2022, 6:41 a.m. UTC | #6
On 10/24/2022 10:36 AM, Alistair Popple wrote:
> 
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> 
>> When THP migration, if THPs are split and all subpages are migrated successfully
>> , the migrate_pages() will still return the number of THP that were not migrated.
>> That will confuse the callers of migrate_pages(), for example, which will make
>> the longterm pinning failed though all pages are migrated successfully.
>>
>> Thus we should return 0 to indicate all pages are migrated in this case.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>> Changes from v1:
>> - Fix the return value of migrate_pages() instead of fixing the
>>    callers' validation.
>> ---
>>   mm/migrate.c | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 8e5eb6e..1da0dbc 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>   	 */
>>   	list_splice(&ret_pages, from);
>>
>> +	/*
>> +	 * Return 0 in case all subpages of fail-to-migrate THPs are
>> +	 * migrated successfully.
>> +	 */
>> +	if (nr_thp_split && list_empty(from))
>> +		rc = 0;
> 
> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True

We can only hit this abnormal case when a THP has been split. So if no
THP was split, we just return the original 'rc' without checking the list,
since the 'nr_thp_split' check is cheaper than the list_empty() check
IMHO.

> imply success? And if it doesn't imply success wouldn't it be possible
> to end up with nr_thp_split && list_empty(from) whilst still having
> pages that failed to migrate?
> 
> The list management and return code logic from unmap_and_move() has
> gotten pretty difficult to follow and could do with some rework IMHO.

Yes, Huang Ying has sent an RFC patchset[1] that does some code refactoring,
which seems like a good start.

[1] https://lore.kernel.org/all/20220921060616.73086-1-ying.huang@intel.com/
  
Alistair Popple Oct. 24, 2022, 7:24 a.m. UTC | #7
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>
>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>> That will confuse the callers of migrate_pages(), for example, which will make
>>> the longterm pinning failed though all pages are migrated successfully.
>>>
>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> ---
>>> Changes from v1:
>>> - Fix the return value of migrate_pages() instead of fixing the
>>>    callers' validation.
>>> ---
>>>   mm/migrate.c | 7 +++++++
>>>   1 file changed, 7 insertions(+)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index 8e5eb6e..1da0dbc 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>   	 */
>>>   	list_splice(&ret_pages, from);
>>>
>>> +	/*
>>> +	 * Return 0 in case all subpages of fail-to-migrate THPs are
>>> +	 * migrated successfully.
>>> +	 */
>>> +	if (nr_thp_split && list_empty(from))
>>> +		rc = 0;
>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>
> Only in the case of THP split, we can meet this abnormal case. So if no THP
> split, just return the original 'rc' instead of validating the list, since the
> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.

Is it really that much cheaper? We're already retrying migrations
multiple times, etc. so surely the difference here would be marginal at
best, and IMHO the code would be much clearer if we always set rc = 0
when list_empty(from) = True.
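
As a rough illustration of the two variants being compared, here is a self-contained user-space sketch. The list type below is a minimal stand-in for the kernel's circular doubly linked list, and rc_v2() and rc_simplified() are hypothetical helper names used only for this comparison; the point is that the emptiness test is a single pointer comparison in this model, so both forms cost about the same.

```c
#include <stdbool.h>

/* Minimal stand-in for a circular doubly linked list head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Emptiness is one pointer comparison, independent of the list length. */
static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Check as posted in v2 of the patch. */
static int rc_v2(int rc, int nr_thp_split, const struct list_head *from)
{
	if (nr_thp_split && list_empty(from))
		rc = 0;
	return rc;
}

/* Simplified check suggested in this review. */
static int rc_simplified(int rc, const struct list_head *from)
{
	if (list_empty(from))
		rc = 0;
	return rc;
}
```

The two helpers differ only when the list is empty and no THP was split: the v2 form keeps the original rc there, while the simplified form reports 0.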

>> imply success? And if it doesn't imply success wouldn't it be possible
>> to end up with nr_thp_split && list_empty(from) whilst still having
>> pages that failed to migrate?
>> The list management and return code logic from unmap_and_move() has
>> gotten pretty difficult to follow and could do with some rework IMHO.
>
> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
> a good start.

Thanks for pointing that out, I looked at it a while back but missed the
clean ups. I was kind of waiting for the non-RFC version before taking
another closer look.

> [1] https://lore.kernel.org/all/20220921060616.73086-1-ying.huang@intel.com/
  
Baolin Wang Oct. 24, 2022, 8:01 a.m. UTC | #8
On 10/24/2022 3:24 PM, Alistair Popple wrote:
> 
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
> 
>> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>>
>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>
>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>
>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>> ---
>>>> Changes from v1:
>>>> - Fix the return value of migrate_pages() instead of fixing the
>>>>     callers' validation.
>>>> ---
>>>>    mm/migrate.c | 7 +++++++
>>>>    1 file changed, 7 insertions(+)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index 8e5eb6e..1da0dbc 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>>    	 */
>>>>    	list_splice(&ret_pages, from);
>>>>
>>>> +	/*
>>>> +	 * Return 0 in case all subpages of fail-to-migrate THPs are
>>>> +	 * migrated successfully.
>>>> +	 */
>>>> +	if (nr_thp_split && list_empty(from))
>>>> +		rc = 0;
>>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>>
>> Only in the case of THP split, we can meet this abnormal case. So if no THP
>> split, just return the original 'rc' instead of validating the list, since the
>> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.
> 
> Is it really that much cheaper? We're already retrying migrations
> multiple times, etc. so surely the difference here would be marginal at
> best, and IMHO the code would be much clearer if we always set rc = 0
> when list_empty(from) = True.

Yeah, the difference is marginal and I have no strong preference. OK, I
will drop the 'nr_thp_split' check in the next version. Thanks.

>>> imply success? And if it doesn't imply success wouldn't it be possible
>>> to end up with nr_thp_split && list_empty(from) whilst still having
>>> pages that failed to migrate?
>>> The list management and return code logic from unmap_and_move() has
>>> gotten pretty difficult to follow and could do with some rework IMHO.
>>
>> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
>> a good start.
> 
> Thanks for pointing that out, I looked at it a while back but missed the
> clean ups. I was kind of waiting for the non-RFC version before taking
> another closer look.
> 
>> [1] https://lore.kernel.org/all/20220921060616.73086-1-ying.huang@intel.com/
  
Alistair Popple Oct. 24, 2022, 8:06 a.m. UTC | #9
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> On 10/24/2022 3:24 PM, Alistair Popple wrote:
>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>
>>> On 10/24/2022 10:36 AM, Alistair Popple wrote:
>>>> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>>>>
>>>>> When THP migration, if THPs are split and all subpages are migrated successfully
>>>>> , the migrate_pages() will still return the number of THP that were not migrated.
>>>>> That will confuse the callers of migrate_pages(), for example, which will make
>>>>> the longterm pinning failed though all pages are migrated successfully.
>>>>>
>>>>> Thus we should return 0 to indicate all pages are migrated in this case.
>>>>>
>>>>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>>>>> ---
>>>>> Changes from v1:
>>>>> - Fix the return value of migrate_pages() instead of fixing the
>>>>>     callers' validation.
>>>>> ---
>>>>>    mm/migrate.c | 7 +++++++
>>>>>    1 file changed, 7 insertions(+)
>>>>>
>>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>>> index 8e5eb6e..1da0dbc 100644
>>>>> --- a/mm/migrate.c
>>>>> +++ b/mm/migrate.c
>>>>> @@ -1582,6 +1582,13 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>>>>>    	 */
>>>>>    	list_splice(&ret_pages, from);
>>>>>
>>>>> +	/*
>>>>> +	 * Return 0 in case all subpages of fail-to-migrate THPs are
>>>>> +	 * migrated successfully.
>>>>> +	 */
>>>>> +	if (nr_thp_split && list_empty(from))
>>>>> +		rc = 0;
>>>> Why do you need to check nr_thp_split? Wouldn't list_empty(from) == True
>>>
>>> Only in the case of THP split, we can meet this abnormal case. So if no THP
>>> split, just return the original 'rc' instead of validating the list, since the
>>> 'nr_thp_split' validation is cheaper than the list_empty() validation IMHO.
>> Is it really that much cheaper? We're already retrying migrations
>> multiple times, etc. so surely the difference here would be marginal at
>> best, and IMHO the code would be much clearer if we always set rc = 0
>> when list_empty(from) = True.
>
> Yeah, the difference is marginal and I have no strong preference. OK, will drop
> the 'nr_thp_split' in next version. Thanks.

Thanks. With that change feel free to add:

Reviewed-by: Alistair Popple <apopple@nvidia.com>

>>>> imply success? And if it doesn't imply success wouldn't it be possible
>>>> to end up with nr_thp_split && list_empty(from) whilst still having
>>>> pages that failed to migrate?
>>>> The list management and return code logic from unmap_and_move() has
>>>> gotten pretty difficult to follow and could do with some rework IMHO.
>>>
>>> Yes, Huang Ying has sent a RFC patchset[1] doing some code refactor, which seems
>>> a good start.
>> Thanks for pointing that out, I looked at it a while back but missed the
>> clean ups. I was kind of waiting for the non-RFC version before taking
>> another closer look.
>>
>>> [1] https://lore.kernel.org/all/20220921060616.73086-1-ying.huang@intel.com/
  

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index 8e5eb6e..1da0dbc 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1582,6 +1582,13 @@  int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	 */
 	list_splice(&ret_pages, from);
 
+	/*
+	 * Return 0 in case all subpages of fail-to-migrate THPs are
+	 * migrated successfully.
+	 */
+	if (nr_thp_split && list_empty(from))
+		rc = 0;
+
 	count_vm_events(PGMIGRATE_SUCCESS, nr_succeeded);
 	count_vm_events(PGMIGRATE_FAIL, nr_failed_pages);
 	count_vm_events(THP_MIGRATION_SUCCESS, nr_thp_succeeded);