[2/5] mm: compaction: simplify should_compact_retry()

Message ID 20230519123959.77335-3-hannes@cmpxchg.org
State New
Headers
Series mm: compaction: cleanups & simplifications |

Commit Message

Johannes Weiner May 19, 2023, 12:39 p.m. UTC
  The different branches for retry are unnecessarily complicated. There
are really only three outcomes: progress (retry n times), skipped
(retry if reclaim can help), failed (retry with higher priority).

Rearrange the branches and the retry counter to make it simpler.

v2:
- fix trace point build (Mel)
- fix max_retries logic for costly allocs (Huang)

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 53 +++++++++++++++----------------------------------
 1 file changed, 16 insertions(+), 37 deletions(-)
  

Comments

Vlastimil Babka May 29, 2023, 1:03 p.m. UTC | #1
On 5/19/23 14:39, Johannes Weiner wrote:
> The different branches for retry are unnecessarily complicated. There
> are really only three outcomes: progress (retry n times), skipped
> (retry if reclaim can help), failed (retry with higher priority).
> 
> Rearrange the branches and the retry counter to make it simpler.
> 
> v2:
> - fix trace point build (Mel)
> - fix max_retries logic for costly allocs (Huang)
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/page_alloc.c | 53 +++++++++++++++----------------------------------
>  1 file changed, 16 insertions(+), 37 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5a84a0bebc37..72660e924b95 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3772,16 +3772,22 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>  	 * Compaction managed to coalesce some page blocks, but the
>  	 * allocation failed presumably due to a race. Retry some.
>  	 */
> -	if (compact_result == COMPACT_SUCCESS)
> -		(*compaction_retries)++;
> +	if (compact_result == COMPACT_SUCCESS) {
> +		/*
> +		 * !costly requests are much more important than
> +		 * __GFP_RETRY_MAYFAIL costly ones because they are de
> +		 * facto nofail and invoke OOM killer to move on while
> +		 * costly can fail and users are ready to cope with
> +		 * that. 1/4 retries is rather arbitrary but we would
> +		 * need much more detailed feedback from compaction to
> +		 * make a better decision.
> +		 */
> +		if (order > PAGE_ALLOC_COSTLY_ORDER)
> +			max_retries /= 4;
>  
> -	/*
> -	 * All zones were scanned completely and still no result. It
> -	 * doesn't really make much sense to retry except when the
> -	 * failure could be caused by insufficient priority
> -	 */
> -	if (compact_result == COMPACT_COMPLETE)
> -		goto check_priority;
> +		ret = ++(*compaction_retries) <= max_retries;
> +		goto out;

I think you simplified this part too much, so now once it runs out of
retries, it will return false, while previously it would increase the priority.

> +	}
>  
>  	/*
>  	 * Compaction was skipped due to a lack of free order-0
> @@ -3793,35 +3799,8 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>  	}
>  
>  	/*
> -	 * If compaction backed due to being deferred, due to
> -	 * contended locks in async mode, or due to scanners meeting
> -	 * after a partial scan, retry with increased priority.
> -	 */
> -	if (compact_result == COMPACT_DEFERRED ||
> -	    compact_result == COMPACT_CONTENDED ||
> -	    compact_result == COMPACT_PARTIAL_SKIPPED)
> -		goto check_priority;
> -
> -	/*
> -	 * !costly requests are much more important than __GFP_RETRY_MAYFAIL
> -	 * costly ones because they are de facto nofail and invoke OOM
> -	 * killer to move on while costly can fail and users are ready
> -	 * to cope with that. 1/4 retries is rather arbitrary but we
> -	 * would need much more detailed feedback from compaction to
> -	 * make a better decision.
> -	 */
> -	if (order > PAGE_ALLOC_COSTLY_ORDER)
> -		max_retries /= 4;
> -	if (*compaction_retries <= max_retries) {
> -		ret = true;
> -		goto out;
> -	}
> -
> -	/*
> -	 * Make sure there are attempts at the highest priority if we exhausted
> -	 * all retries or failed at the lower priorities.
> +	 * Compaction failed. Retry with increasing priority.
>  	 */
> -check_priority:
>  	min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
>  			MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;
>
  
Johannes Weiner May 29, 2023, 4:38 p.m. UTC | #2
On Mon, May 29, 2023 at 03:03:52PM +0200, Vlastimil Babka wrote:
> On 5/19/23 14:39, Johannes Weiner wrote:
> > The different branches for retry are unnecessarily complicated. There
> > are really only three outcomes: progress (retry n times), skipped
> > (retry if reclaim can help), failed (retry with higher priority).
> > 
> > Rearrange the branches and the retry counter to make it simpler.
> > 
> > v2:
> > - fix trace point build (Mel)
> > - fix max_retries logic for costly allocs (Huang)
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  mm/page_alloc.c | 53 +++++++++++++++----------------------------------
> >  1 file changed, 16 insertions(+), 37 deletions(-)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 5a84a0bebc37..72660e924b95 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -3772,16 +3772,22 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> >  	 * Compaction managed to coalesce some page blocks, but the
> >  	 * allocation failed presumably due to a race. Retry some.
> >  	 */
> > -	if (compact_result == COMPACT_SUCCESS)
> > -		(*compaction_retries)++;
> > +	if (compact_result == COMPACT_SUCCESS) {
> > +		/*
> > +		 * !costly requests are much more important than
> > +		 * __GFP_RETRY_MAYFAIL costly ones because they are de
> > +		 * facto nofail and invoke OOM killer to move on while
> > +		 * costly can fail and users are ready to cope with
> > +		 * that. 1/4 retries is rather arbitrary but we would
> > +		 * need much more detailed feedback from compaction to
> > +		 * make a better decision.
> > +		 */
> > +		if (order > PAGE_ALLOC_COSTLY_ORDER)
> > +			max_retries /= 4;
> >  
> > -	/*
> > -	 * All zones were scanned completely and still no result. It
> > -	 * doesn't really make much sense to retry except when the
> > -	 * failure could be caused by insufficient priority
> > -	 */
> > -	if (compact_result == COMPACT_COMPLETE)
> > -		goto check_priority;
> > +		ret = ++(*compaction_retries) <= max_retries;
> > +		goto out;
> 
> I think you simplified this part too much, so now once it runs out of
> retries, it will return false, while previously it would increase the priority.

Oops, I'll send a delta fix to Andrew tomorrow. Thanks!
  
Johannes Weiner June 2, 2023, 2:47 p.m. UTC | #3
On Mon, May 29, 2023 at 12:38:07PM -0400, Johannes Weiner wrote:
> On Mon, May 29, 2023 at 03:03:52PM +0200, Vlastimil Babka wrote:
> > I think you simplified this part too much, so now once it runs out of
> > retries, it will return false, while previously it would increase the priority.

Here is the delta fix. If this looks good to everybody, can you please
fold this into the patch you have in tree? Thanks!

---
From 4b9429f9ef04fcb7bb5ffae0db8ea113b26d097b Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 2 Jun 2023 16:02:37 +0200
Subject: [PATCH] mm: compaction: simplify should_compact_retry() fix

Vlastimil points out an unintended change. Previously when hitting
max_retries we'd bump the priority level and restart the loop. Now we
bail out and fail instead. Restore the original behavior.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 72660e924b95..e7d7db36582b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3768,6 +3768,15 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	if (fatal_signal_pending(current))
 		return false;
 
+	/*
+	 * Compaction was skipped due to a lack of free order-0
+	 * migration targets. Continue if reclaim can help.
+	 */
+	if (compact_result == COMPACT_SKIPPED) {
+		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
+		goto out;
+	}
+
 	/*
 	 * Compaction managed to coalesce some page blocks, but the
 	 * allocation failed presumably due to a race. Retry some.
@@ -3785,17 +3794,10 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 		if (order > PAGE_ALLOC_COSTLY_ORDER)
 			max_retries /= 4;
 
-		ret = ++(*compaction_retries) <= max_retries;
-		goto out;
-	}
-
-	/*
-	 * Compaction was skipped due to a lack of free order-0
-	 * migration targets. Continue if reclaim can help.
-	 */
-	if (compact_result == COMPACT_SKIPPED) {
-		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
-		goto out;
+		if (++(*compaction_retries) <= max_retries) {
+			ret = true;
+			goto out;
+		}
 	}
 
 	/*
  
Vlastimil Babka June 6, 2023, 12:58 p.m. UTC | #4
On 6/2/23 16:47, Johannes Weiner wrote:
> On Mon, May 29, 2023 at 12:38:07PM -0400, Johannes Weiner wrote:
>> On Mon, May 29, 2023 at 03:03:52PM +0200, Vlastimil Babka wrote:
>> > I think you simplified this part too much, so now once it runs out of
>> > retries, it will return false, while previously it would increase the priority.
> 
> Here is the delta fix. If this looks good to everybody, can you please
> fold this into the patch you have in tree? Thanks!
> 
> ---
> From 4b9429f9ef04fcb7bb5ffae0db8ea113b26d097b Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Fri, 2 Jun 2023 16:02:37 +0200
> Subject: [PATCH] mm: compaction: simplify should_compact_retry() fix
> 
> Vlastimil points out an unintended change. Previously when hitting
> max_retries we'd bump the priority level and restart the loop. Now we
> bail out and fail instead. Restore the original behavior.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

For the 2/5 +fix

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 72660e924b95..e7d7db36582b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3768,6 +3768,15 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>  	if (fatal_signal_pending(current))
>  		return false;
>  
> +	/*
> +	 * Compaction was skipped due to a lack of free order-0
> +	 * migration targets. Continue if reclaim can help.
> +	 */
> +	if (compact_result == COMPACT_SKIPPED) {
> +		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> +		goto out;
> +	}
> +
>  	/*
>  	 * Compaction managed to coalesce some page blocks, but the
>  	 * allocation failed presumably due to a race. Retry some.
> @@ -3785,17 +3794,10 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>  		if (order > PAGE_ALLOC_COSTLY_ORDER)
>  			max_retries /= 4;
>  
> -		ret = ++(*compaction_retries) <= max_retries;
> -		goto out;
> -	}
> -
> -	/*
> -	 * Compaction was skipped due to a lack of free order-0
> -	 * migration targets. Continue if reclaim can help.
> -	 */
> -	if (compact_result == COMPACT_SKIPPED) {
> -		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> -		goto out;
> +		if (++(*compaction_retries) <= max_retries) {
> +			ret = true;
> +			goto out;
> +		}
>  	}
>  
>  	/*
  

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5a84a0bebc37..72660e924b95 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3772,16 +3772,22 @@  should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	 * Compaction managed to coalesce some page blocks, but the
 	 * allocation failed presumably due to a race. Retry some.
 	 */
-	if (compact_result == COMPACT_SUCCESS)
-		(*compaction_retries)++;
+	if (compact_result == COMPACT_SUCCESS) {
+		/*
+		 * !costly requests are much more important than
+		 * __GFP_RETRY_MAYFAIL costly ones because they are de
+		 * facto nofail and invoke OOM killer to move on while
+		 * costly can fail and users are ready to cope with
+		 * that. 1/4 retries is rather arbitrary but we would
+		 * need much more detailed feedback from compaction to
+		 * make a better decision.
+		 */
+		if (order > PAGE_ALLOC_COSTLY_ORDER)
+			max_retries /= 4;
 
-	/*
-	 * All zones were scanned completely and still no result. It
-	 * doesn't really make much sense to retry except when the
-	 * failure could be caused by insufficient priority
-	 */
-	if (compact_result == COMPACT_COMPLETE)
-		goto check_priority;
+		ret = ++(*compaction_retries) <= max_retries;
+		goto out;
+	}
 
 	/*
 	 * Compaction was skipped due to a lack of free order-0
@@ -3793,35 +3799,8 @@  should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
 	}
 
 	/*
-	 * If compaction backed due to being deferred, due to
-	 * contended locks in async mode, or due to scanners meeting
-	 * after a partial scan, retry with increased priority.
-	 */
-	if (compact_result == COMPACT_DEFERRED ||
-	    compact_result == COMPACT_CONTENDED ||
-	    compact_result == COMPACT_PARTIAL_SKIPPED)
-		goto check_priority;
-
-	/*
-	 * !costly requests are much more important than __GFP_RETRY_MAYFAIL
-	 * costly ones because they are de facto nofail and invoke OOM
-	 * killer to move on while costly can fail and users are ready
-	 * to cope with that. 1/4 retries is rather arbitrary but we
-	 * would need much more detailed feedback from compaction to
-	 * make a better decision.
-	 */
-	if (order > PAGE_ALLOC_COSTLY_ORDER)
-		max_retries /= 4;
-	if (*compaction_retries <= max_retries) {
-		ret = true;
-		goto out;
-	}
-
-	/*
-	 * Make sure there are attempts at the highest priority if we exhausted
-	 * all retries or failed at the lower priorities.
+	 * Compaction failed. Retry with increasing priority.
 	 */
-check_priority:
 	min_priority = (order > PAGE_ALLOC_COSTLY_ORDER) ?
 			MIN_COMPACT_COSTLY_PRIORITY : MIN_COMPACT_PRIORITY;