mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb

Message ID b2e6ce111400670d8021baf4d7ac524ae78a40d5.1707105047.git.baolin.wang@linux.alibaba.com
State New
Series mm: hugetlb: fix hugetlb allocation failure when handling freed or in-use hugetlb

Commit Message

Baolin Wang Feb. 5, 2024, 3:54 a.m. UTC
  When handling a freed or in-use hugetlb, we should ignore the failure of
alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
successfully, since the newly allocated hugetlb is not used in these two cases.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/hugetlb.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)
  

Comments

Muchun Song Feb. 5, 2024, 6:56 a.m. UTC | #1
> On Feb 5, 2024, at 11:54, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
> When handling a freed or in-use hugetlb, we should ignore the failure of
> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
> successfully, since the newly allocated hugetlb is not used in these two cases.
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

OK. This is not a fix (I see the "fix" keyword in the subject) but an
optimization for the cases where the allocation is unnecessary. Thanks.

Reviewed-by: Muchun Song <muchun.song@linux.dev>
  
Baolin Wang Feb. 5, 2024, 8:23 a.m. UTC | #2
On 2/5/2024 2:56 PM, Muchun Song wrote:
> 
> 
>> On Feb 5, 2024, at 11:54, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>> When handling a freed or in-use hugetlb, we should ignore the failure of
>> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
>> successfully, since the newly allocated hugetlb is not used in these two cases.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> 
> OK. It is not a fix (I see a fix keyword in subject) but an
> optimization for unnecessary-allocation cases. Thanks.

Yes, it would be better to change the subject to 'mm: hugetlb: improve the
handling of hugetlb allocation failure for freed or in-use hugetlb'.

Andrew, could you change the subject line when you apply it? (Or, if you
want a new version, please let me know.) Thanks.

> Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks for reviewing.
  
Michal Hocko Feb. 5, 2024, 9:31 a.m. UTC | #3
On Mon 05-02-24 11:54:17, Baolin Wang wrote:
> When handling a freed or in-use hugetlb, we should ignore the failure of
> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
> successfully, since the newly allocated hugetlb is not used in these two cases.
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/hugetlb.c | 18 ++++++++++++------
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 9d996fe4ecd9..212ab331d355 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>  	 * under the lock.
>  	 */
>  	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> -	if (!new_folio)
> -		return -ENOMEM;
> -	__prep_new_hugetlb_folio(h, new_folio);
> +	if (new_folio)
> +		__prep_new_hugetlb_folio(h, new_folio);

Is there any reason why you haven't moved the allocation to the only
branch that actually needs it? I know that we hold the hugetlb lock there,
but you could easily drop the lock, allocate a page and then goto retry.
This would actually save an allocation in the other cases.

Something like this:

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ed1581b670d4..db5f72b94422 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3029,21 +3029,9 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 	int nid = folio_nid(old_folio);
-	struct folio *new_folio;
+	struct folio *new_folio = NULL;
 	int ret = 0;
 
-	/*
-	 * Before dissolving the folio, we need to allocate a new one for the
-	 * pool to remain stable.  Here, we allocate the folio and 'prep' it
-	 * by doing everything but actually updating counters and adding to
-	 * the pool.  This simplifies and let us do most of the processing
-	 * under the lock.
-	 */
-	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
-
 retry:
 	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(old_folio)) {
@@ -3073,6 +3061,15 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+
+		if (!new_folio) {
+			spin_unlock_irq(&hugetlb_lock);
+			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
+			if (!new_folio)
+				return -ENOMEM;
+			__prep_new_hugetlb_folio(h, new_folio);
+			goto retry;
+		}
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3100,9 +3097,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 
 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}
 
 	return ret;
 }
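
For readers who want to try the "drop the lock, allocate, then retry"
idea outside the kernel, here is a minimal user-space sketch. It is only
a model under stated assumptions: struct pool, enum old_state,
pool_init(), alloc_replacement() and replace_old_page() are hypothetical
names invented for this example, a pthread mutex stands in for
hugetlb_lock, malloc() stands in for alloc_buddy_hugetlb_folio(), and the
in-use case is simplified to returning -EBUSY rather than isolating the
page.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the three states the old page can be found in. */
enum old_state { OLD_GONE, OLD_IN_USE, OLD_FREE };

/* Hypothetical stand-in for the hugetlb pool; not a kernel structure. */
struct pool {
	pthread_mutex_t lock;   /* plays the role of hugetlb_lock */
	enum old_state old;     /* state of the page we want to replace */
	void *replacement;      /* page that took the old page's place */
	bool alloc_fails;       /* simulate an allocation failure */
	atomic_int allocs;      /* instrumentation: allocation attempts */
};

static void pool_init(struct pool *p, enum old_state old, bool alloc_fails)
{
	pthread_mutex_init(&p->lock, NULL);
	p->old = old;
	p->replacement = NULL;
	p->alloc_fails = alloc_fails;
	atomic_init(&p->allocs, 0);
}

/* Must be called without the lock held, like a sleeping allocation. */
static void *alloc_replacement(struct pool *p)
{
	atomic_fetch_add(&p->allocs, 1);
	return p->alloc_fails ? NULL : malloc(64);
}

static int replace_old_page(struct pool *p)
{
	void *new_page = NULL;
	int ret = 0;

retry:
	pthread_mutex_lock(&p->lock);
	if (p->old == OLD_GONE) {
		/* Freed from under us: nothing to replace. */
		goto out;
	} else if (p->old == OLD_IN_USE) {
		/* Simplification: the kernel would try to isolate the page. */
		ret = -EBUSY;
		goto out;
	} else if (!new_page) {
		/* Only this branch needs a page: unlock, allocate, recheck. */
		pthread_mutex_unlock(&p->lock);
		new_page = alloc_replacement(p);
		if (!new_page)
			return -ENOMEM;
		goto retry;
	}

	/* Old page is still free and a replacement is ready: swap them. */
	assert(new_page != NULL);
	p->replacement = new_page;
	p->old = OLD_GONE;
	new_page = NULL;
out:
	pthread_mutex_unlock(&p->lock);
	free(new_page);         /* drops an unused replacement, if any */
	return ret;
}
```

In this model the allocator is never called for the OLD_GONE and
OLD_IN_USE cases, and the state is rechecked after the lock is retaken
because it may have changed while the allocation was in progress.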
  
Baolin Wang Feb. 5, 2024, 12:38 p.m. UTC | #4
On 2/5/2024 5:31 PM, Michal Hocko wrote:
> On Mon 05-02-24 11:54:17, Baolin Wang wrote:
>> When handling a freed or in-use hugetlb, we should ignore the failure of
>> alloc_buddy_hugetlb_folio() so that the old hugetlb can still be dissolved
>> successfully, since the newly allocated hugetlb is not used in these two cases.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>>   mm/hugetlb.c | 18 ++++++++++++------
>>   1 file changed, 12 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 9d996fe4ecd9..212ab331d355 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>>   	 * under the lock.
>>   	 */
>>   	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
>> -	if (!new_folio)
>> -		return -ENOMEM;
>> -	__prep_new_hugetlb_folio(h, new_folio);
>> +	if (new_folio)
>> +		__prep_new_hugetlb_folio(h, new_folio);
> 
> Is there any reason why you haven't moved the allocation to the only
> branch that actually needs it? I know that we hold hugetlb lock but you

Nope, I just wrote a simple patch to ignore the allocation failure.

> could have easily dropped the lock, allocate a page and then goto retry.
> This would actually save an allocation.

Yes, will do. Thanks.

> Something like this:
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index ed1581b670d4..db5f72b94422 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3029,21 +3029,9 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>   {
>   	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
>   	int nid = folio_nid(old_folio);
> -	struct folio *new_folio;
> +	struct folio *new_folio = NULL;
>   	int ret = 0;
>   
> -	/*
> -	 * Before dissolving the folio, we need to allocate a new one for the
> -	 * pool to remain stable.  Here, we allocate the folio and 'prep' it
> -	 * by doing everything but actually updating counters and adding to
> -	 * the pool.  This simplifies and let us do most of the processing
> -	 * under the lock.
> -	 */
> -	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> -	if (!new_folio)
> -		return -ENOMEM;
> -	__prep_new_hugetlb_folio(h, new_folio);
> -
>   retry:
>   	spin_lock_irq(&hugetlb_lock);
>   	if (!folio_test_hugetlb(old_folio)) {
> @@ -3073,6 +3061,15 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>   		cond_resched();
>   		goto retry;
>   	} else {
> +
> +		if (!new_folio) {
> +			spin_unlock_irq(&hugetlb_lock);
> +			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
> +			if (!new_folio)
> +				return -ENOMEM;
> +			__prep_new_hugetlb_folio(h, new_folio);
> +			goto retry;
> +		}
>   		/*
>   		 * Ok, old_folio is still a genuine free hugepage. Remove it from
>   		 * the freelist and decrease the counters. These will be
> @@ -3100,9 +3097,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>   
>   free_new:
>   	spin_unlock_irq(&hugetlb_lock);
> -	/* Folio has a zero ref count, but needs a ref to be freed */
> -	folio_ref_unfreeze(new_folio, 1);
> -	update_and_free_hugetlb_folio(h, new_folio, false);
> +	if (new_folio) {
> +		/* Folio has a zero ref count, but needs a ref to be freed */
> +		folio_ref_unfreeze(new_folio, 1);
> +		update_and_free_hugetlb_folio(h, new_folio, false);
> +	}
>   
>   	return ret;
>   }
  

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d996fe4ecd9..212ab331d355 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3042,9 +3042,8 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 	 * under the lock.
 	 */
 	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
+	if (new_folio)
+		__prep_new_hugetlb_folio(h, new_folio);
 
 retry:
 	spin_lock_irq(&hugetlb_lock);
@@ -3075,6 +3074,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+		if (!new_folio) {
+			ret = -ENOMEM;
+			goto free_new;
+		}
+
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3102,9 +3106,11 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 
 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}
 
 	return ret;
 }
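
The control flow of the patch above can be illustrated with a small
user-space model. This is a sketch under assumptions, not kernel code:
struct model, enum old_state and alloc_and_dissolve_model() are
hypothetical names made up for the example, malloc() stands in for
alloc_buddy_hugetlb_folio(), locking and the retry path are left out, and
the in-use case is reduced to returning -EBUSY. It shows the one point
the patch is about: the allocation is still attempted up front, but its
failure only becomes -ENOMEM in the branch that actually needs the new
page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the states the old page can be found in. */
enum old_state { OLD_NOT_HUGETLB, OLD_IN_USE, OLD_FREE };

/* Hypothetical model object; not a kernel structure. */
struct model {
	enum old_state old;     /* state of the page to be dissolved */
	bool alloc_fails;       /* simulate an allocation failure */
	void *pool_page;        /* replacement page added to the pool */
};

static int alloc_and_dissolve_model(struct model *m)
{
	int ret = 0;
	/* Eager allocation as in the posted patch; failure is tolerated. */
	void *new_page = m->alloc_fails ? NULL : malloc(64);

	switch (m->old) {
	case OLD_NOT_HUGETLB:
		/* Already freed: the new page is not needed, return success. */
		break;
	case OLD_IN_USE:
		/* Simplification: the kernel would try to isolate the page. */
		ret = -EBUSY;
		break;
	case OLD_FREE:
		/* The only case that consumes the new page. */
		if (!new_page) {
			ret = -ENOMEM;
			break;
		}
		m->pool_page = new_page;
		m->old = OLD_NOT_HUGETLB;
		new_page = NULL;
		break;
	}

	/* Mirrors the guarded free_new path: free only what was allocated. */
	free(new_page);
	return ret;
}
```

Compared with allocating only in the branch that needs the page, this
model still pays for one allocation attempt in every case; it merely
stops that attempt from failing the freed and in-use cases.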