mm: memory-failure: fix unexpected return value in soft_offline_page()

Message ID 20230627112808.1275241-1-linmiaohe@huawei.com
State New

Commit Message

Miaohe Lin June 27, 2023, 11:28 a.m. UTC
When page_handle_poison() fails to handle the hugepage or free page in
the retry path, soft_offline_page() will return 0 while -EBUSY is expected
in this case.

Fixes: b94e02822deb ("mm,hwpoison: try to narrow window race for free pages")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/memory-failure.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
  

Comments

Andrew Morton June 27, 2023, 7:30 p.m. UTC | #1
On Tue, 27 Jun 2023 19:28:08 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:

> When page_handle_poison() fails to handle the hugepage or free page in
> retry path, soft_offline_page() will return 0 while -EBUSY is expected
> in this case.

What are the user visible effects of the bug?

> Fixes: b94e02822deb ("mm,hwpoison: try to narrow window race for free pages")
>
> ...
>
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -2737,10 +2737,13 @@ int soft_offline_page(unsigned long pfn, int flags)
>  	if (ret > 0) {
>  		ret = soft_offline_in_use_page(page);
>  	} else if (ret == 0) {
> -		if (!page_handle_poison(page, true, false) && try_again) {
> -			try_again = false;
> -			flags &= ~MF_COUNT_INCREASED;
> -			goto retry;
> +		if (!page_handle_poison(page, true, false)) {
> +			if (try_again) {
> +				try_again = false;
> +				flags &= ~MF_COUNT_INCREASED;
> +				goto retry;
> +			}
> +			ret = -EBUSY;
>  		}
>  	}
  
Miaohe Lin June 28, 2023, 1:56 a.m. UTC | #2
On 2023/6/28 3:30, Andrew Morton wrote:
> On Tue, 27 Jun 2023 19:28:08 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> 
>> When page_handle_poison() fails to handle the hugepage or free page in
>> retry path, soft_offline_page() will return 0 while -EBUSY is expected
>> in this case.
> 
> What are the user visible effects of the bug?

The user will think soft_offline_page() succeeded when it in fact failed, so the
user will not try again later in this case.

  
Naoya Horiguchi June 28, 2023, 11:06 a.m. UTC | #3
On Wed, Jun 28, 2023 at 09:56:38AM +0800, Miaohe Lin wrote:
> On 2023/6/28 3:30, Andrew Morton wrote:
> > On Tue, 27 Jun 2023 19:28:08 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
> > 
> >> When page_handle_poison() fails to handle the hugepage or free page in
> >> retry path, soft_offline_page() will return 0 while -EBUSY is expected
> >> in this case.
> > 
> > What are the user visible effects of the bug?
> 
> The user will think soft_offline_page succeeds while it failed in fact. So user
> will not try again later in this case.

I think that it's helpful to put this in the patch description so that maintainers
can easily guess the impact of this patch.

Anyway, the patch looks good to me, thank you.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

  
Miaohe Lin June 29, 2023, 1:48 a.m. UTC | #4
On 2023/6/28 19:06, Naoya Horiguchi wrote:
> On Wed, Jun 28, 2023 at 09:56:38AM +0800, Miaohe Lin wrote:
>> On 2023/6/28 3:30, Andrew Morton wrote:
>>> On Tue, 27 Jun 2023 19:28:08 +0800 Miaohe Lin <linmiaohe@huawei.com> wrote:
>>>
>>>> When page_handle_poison() fails to handle the hugepage or free page in
>>>> retry path, soft_offline_page() will return 0 while -EBUSY is expected
>>>> in this case.
>>>
>>> What are the user visible effects of the bug?
>>
>> The user will think soft_offline_page succeeds while it failed in fact. So user
>> will not try again later in this case.
> 
> I think that it's helpful to put this in the patch description so that maintainers
> can easily guess the impact of this patch.

Thanks for your review and advice. Will add it if v2 is needed.

> 
> Anyway, the patch looks good to me, thank you.
> 
> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

Thanks Naoya.
  

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index aada6ac72fe5..dc1572818b7d 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -2737,10 +2737,13 @@  int soft_offline_page(unsigned long pfn, int flags)
 	if (ret > 0) {
 		ret = soft_offline_in_use_page(page);
 	} else if (ret == 0) {
-		if (!page_handle_poison(page, true, false) && try_again) {
-			try_again = false;
-			flags &= ~MF_COUNT_INCREASED;
-			goto retry;
+		if (!page_handle_poison(page, true, false)) {
+			if (try_again) {
+				try_again = false;
+				flags &= ~MF_COUNT_INCREASED;
+				goto retry;
+			}
+			ret = -EBUSY;
 		}
 	}