[v2] mm: hugetlb: improve the handling of hugetlb allocation failure for freed or in-use hugetlb

Message ID 23814ccce5dd3cd30fd67aa692fd0bf3514b0166.1707137359.git.baolin.wang@linux.alibaba.com
State New
Series [v2] mm: hugetlb: improve the handling of hugetlb allocation failure for freed or in-use hugetlb

Commit Message

Baolin Wang Feb. 5, 2024, 12:50 p.m. UTC
When handling a freed or in-use hugetlb page, we should ignore a failure of
alloc_buddy_hugetlb_folio() so that the old hugetlb page can still be handled
successfully, since the newly allocated hugetlb page is not used in these two
cases. Moreover, move the allocation into the free hugetlb handling branch.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Changes from v1:
 - Update the subject line per Muchun.
 - Move the allocation into the free hugetlb handling branch per Michal.
---
 mm/hugetlb.c | 40 ++++++++++++++++++++++++----------------
 1 file changed, 24 insertions(+), 16 deletions(-)
  

Comments

Michal Hocko Feb. 5, 2024, 2:47 p.m. UTC | #1
On Mon 05-02-24 20:50:51, Baolin Wang wrote:
> When handling the freed hugetlb or in-use hugetlb, we should ignore the
> failure of alloc_buddy_hugetlb_folio() to dissolve the old hugetlb successfully,
> since we did not use the new allocated hugetlb in this 2 cases. Moreover,
> moving the allocation into the free hugetlb handling branch.

The changelog is a bit hard for me to understand. What about the
following instead?
alloc_and_dissolve_hugetlb_folio() preallocates a new huge page before it
takes hugetlb_lock. In 3 out of 4 cases the page is not really used and
therefore the newly allocated page is just freed right away. This is
wasteful and it might cause premature failures in those cases.

Address that by moving the allocation down to the only case that needs it
(the hugetlb page is really in the free pages pool). We need to drop hugetlb_lock
to do so and therefore need to recheck the page state after regaining
it.

The patch is more of a cleanup than an actual fix to an existing
problem. There are no known reports about premature failures.
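
To make the control flow described above concrete, here is a minimal
userspace sketch of the "allocate only when needed, drop the lock to do so,
then recheck" pattern. It is not the kernel code: struct pool, enum
page_state, pool_lock()/pool_unlock(), alloc_replacement() and
replace_free_page() are all hypothetical names, the lock is modelled by a
plain flag in a single-threaded program, and the -EBUSY return for the in-use
case is a simplification. The transient case that simply retries (the
cond_resched(); goto retry; branch visible in the diff) is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */
enum page_state { PAGE_DISSOLVED, PAGE_IN_USE, PAGE_FREE_IN_POOL };

struct pool {
	bool locked;               /* models hugetlb_lock being held */
	enum page_state old_state; /* state of the page we want to replace */
	void *pool_page;           /* replacement page owned by the pool */
	int allocations;           /* number of replacement allocations made */
};

static void pool_init(struct pool *p, enum page_state state)
{
	p->locked = false;
	p->old_state = state;
	p->pool_page = NULL;
	p->allocations = 0;
}

static void pool_lock(struct pool *p)
{
	assert(!p->locked);
	p->locked = true;
}

static void pool_unlock(struct pool *p)
{
	assert(p->locked);
	p->locked = false;
}

/* Stands in for an allocation that may sleep: the lock must be dropped. */
static void *alloc_replacement(struct pool *p)
{
	assert(!p->locked);
	p->allocations++;
	return malloc(64);
}

static int replace_free_page(struct pool *p)
{
	void *new_page = NULL;
	int ret = 0;

retry:
	pool_lock(p);
	if (p->old_state == PAGE_DISSOLVED) {
		/* Nothing left to replace, so no allocation is needed. */
		goto free_new;
	} else if (p->old_state == PAGE_IN_USE) {
		/* Simplified: report busy; no replacement is needed either. */
		ret = -EBUSY;
		goto free_new;
	} else {
		if (!new_page) {
			/* Only this case needs a new page: allocate unlocked. */
			pool_unlock(p);
			new_page = alloc_replacement(p);
			if (!new_page)
				return -ENOMEM;
			/* The state may have changed meanwhile: recheck. */
			goto retry;
		}
		/* Still free: swap the old page out and the new page in. */
		p->old_state = PAGE_DISSOLVED;
		p->pool_page = new_page;
		pool_unlock(p);
		return 0;
	}

free_new:
	pool_unlock(p);
	if (new_page) /* may be NULL now that allocation is lazy */
		free(new_page);
	return ret;
}
```

Calling replace_free_page() on a pool initialised with PAGE_IN_USE or
PAGE_DISSOLVED performs no allocation at all in this sketch, while
PAGE_FREE_IN_POOL performs exactly one. Because the recheck after the retry
can land in one of the other branches, the cleanup path has to free a page
that was allocated but ended up unused, and tolerate a NULL new_page.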

[...]

> @@ -3075,6 +3063,24 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>  		cond_resched();
>  		goto retry;
>  	} else {
> +		if (!new_folio) {
> +			spin_unlock_irq(&hugetlb_lock);
> +			/*
> +			 * Before dissolving the free hugetlb, we need to allocate
> +			 * a new one for the pool to remain stable.  Here, we
> +			 * allocate the folio and 'prep' it by doing everything
> +			 * but actually updating counters and adding to the pool.
> +			 * This simplifies and let us do most of the processing
> +			 * under the lock.
> +			 */

This comment is not really needed anymore IMHO.

> +			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid,
> +							      NULL, NULL);
> +			if (!new_folio)
> +				return -ENOMEM;
> +			__prep_new_hugetlb_folio(h, new_folio);
> +			goto retry;
> +		}
> +
>  		/*
>  		 * Ok, old_folio is still a genuine free hugepage. Remove it from
>  		 * the freelist and decrease the counters. These will be
  
Baolin Wang Feb. 6, 2024, 1:01 a.m. UTC | #2
On 2/5/2024 10:47 PM, Michal Hocko wrote:
> On Mon 05-02-24 20:50:51, Baolin Wang wrote:
>> When handling the freed hugetlb or in-use hugetlb, we should ignore the
>> failure of alloc_buddy_hugetlb_folio() to dissolve the old hugetlb successfully,
>> since we did not use the new allocated hugetlb in this 2 cases. Moreover,
>> moving the allocation into the free hugetlb handling branch.
> 
> The changelog is a bit hard for me to understand. What about the
> following instead?
> alloc_and_dissolve_hugetlb_folio preallocates a new huge page before it
> takes hugetlb_lock. In 3 out of 4 cases the page is not really used and
> therefore the newly allocated page is just freed right away. This is
> wasteful and it might cause pre-mature failures in those cases.
> 
> Address that by moving the allocation down to the only case (hugetlb
> page is really in the free pages pool). We need to drop hugetlb_lock
> to do so and therefore need to recheck the page state after regaining
> it.
> 
> The patch is more of a cleanup than an actual fix to an existing
> problem. There are no known reports about pre-mature failures.

Looks better. Thanks.

>> @@ -3075,6 +3063,24 @@ static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
>>   		cond_resched();
>>   		goto retry;
>>   	} else {
>> +		if (!new_folio) {
>> +			spin_unlock_irq(&hugetlb_lock);
>> +			/*
>> +			 * Before dissolving the free hugetlb, we need to allocate
>> +			 * a new one for the pool to remain stable.  Here, we
>> +			 * allocate the folio and 'prep' it by doing everything
>> +			 * but actually updating counters and adding to the pool.
>> +			 * This simplifies and let us do most of the processing
>> +			 * under the lock.
>> +			 */
> 
> This comment is not really needed anymore IMHO.

Acked.

> 
>> +			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid,
>> +							      NULL, NULL);
>> +			if (!new_folio)
>> +				return -ENOMEM;
>> +			__prep_new_hugetlb_folio(h, new_folio);
>> +			goto retry;
>> +		}
>> +
>>   		/*
>>   		 * Ok, old_folio is still a genuine free hugepage. Remove it from
>>   		 * the freelist and decrease the counters. These will be
>
  

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d996fe4ecd9..4899167d3652 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3031,21 +3031,9 @@  static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 {
 	gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
 	int nid = folio_nid(old_folio);
-	struct folio *new_folio;
+	struct folio *new_folio = NULL;
 	int ret = 0;
 
-	/*
-	 * Before dissolving the folio, we need to allocate a new one for the
-	 * pool to remain stable.  Here, we allocate the folio and 'prep' it
-	 * by doing everything but actually updating counters and adding to
-	 * the pool.  This simplifies and let us do most of the processing
-	 * under the lock.
-	 */
-	new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid, NULL, NULL);
-	if (!new_folio)
-		return -ENOMEM;
-	__prep_new_hugetlb_folio(h, new_folio);
-
 retry:
 	spin_lock_irq(&hugetlb_lock);
 	if (!folio_test_hugetlb(old_folio)) {
@@ -3075,6 +3063,24 @@  static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 		cond_resched();
 		goto retry;
 	} else {
+		if (!new_folio) {
+			spin_unlock_irq(&hugetlb_lock);
+			/*
+			 * Before dissolving the free hugetlb, we need to allocate
+			 * a new one for the pool to remain stable.  Here, we
+			 * allocate the folio and 'prep' it by doing everything
+			 * but actually updating counters and adding to the pool.
+			 * This simplifies and let us do most of the processing
+			 * under the lock.
+			 */
+			new_folio = alloc_buddy_hugetlb_folio(h, gfp_mask, nid,
+							      NULL, NULL);
+			if (!new_folio)
+				return -ENOMEM;
+			__prep_new_hugetlb_folio(h, new_folio);
+			goto retry;
+		}
+
 		/*
 		 * Ok, old_folio is still a genuine free hugepage. Remove it from
 		 * the freelist and decrease the counters. These will be
@@ -3102,9 +3108,11 @@  static int alloc_and_dissolve_hugetlb_folio(struct hstate *h,
 
 free_new:
 	spin_unlock_irq(&hugetlb_lock);
-	/* Folio has a zero ref count, but needs a ref to be freed */
-	folio_ref_unfreeze(new_folio, 1);
-	update_and_free_hugetlb_folio(h, new_folio, false);
+	if (new_folio) {
+		/* Folio has a zero ref count, but needs a ref to be freed */
+		folio_ref_unfreeze(new_folio, 1);
+		update_and_free_hugetlb_folio(h, new_folio, false);
+	}
 
 	return ret;
 }