mm: ALLOC_HIGHATOMIC flag allocation issue
Commit Message
When alloc_flags contains ALLOC_HIGHATOMIC and the allocation order is
1/2/3/10 in rmqueue(), a successful allocation from the pcplist still
causes a free pageblock to be moved from the allocation's migratetype
freelist to the MIGRATE_HIGHATOMIC freelist, instead of the allocation
being served from the MIGRATE_HIGHATOMIC freelist first. As a result,
the number of pages on the MIGRATE_HIGHATOMIC freelist keeps growing,
while the other migratetype freelists shrink and allocations from them
become more likely to fail.
Currently the sequence of ALLOC_HIGHATOMIC allocation is:
pcplist --> rmqueue_bulk() --> rmqueue_buddy() MIGRATE_HIGHATOMIC
--> rmqueue_buddy() allocation migratetype.
Because requesting pages from the pcplist is faster than from buddy,
the pcplist stays first and the modified ALLOC_HIGHATOMIC allocation
sequence is:
pcplist --> rmqueue_buddy() MIGRATE_HIGHATOMIC --> rmqueue_buddy()
allocation migratetype.
This patch addresses allocation failures for other page types that are
caused by excessive MIGRATE_HIGHATOMIC freelist reservations.
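The two sequences above can be compared with a small user-space toy model. This is only a sketch, not kernel code: `struct toy_zone`, `enum toy_source` and `toy_alloc_highatomic()` are hypothetical names invented for illustration, each freelist is reduced to a single page counter, and the pcplist refill is modelled as taking one page from the requested migratetype.

```c
#include <stdbool.h>

/* Which stage of the sequence served the request (hypothetical names). */
enum toy_source {
    SRC_NONE,
    SRC_PCPLIST,
    SRC_BULK_REFILL,       /* stands in for rmqueue_bulk() refilling the pcplist */
    SRC_BUDDY_HIGHATOMIC,  /* stands in for rmqueue_buddy() on MIGRATE_HIGHATOMIC */
    SRC_BUDDY_MIGRATETYPE  /* stands in for rmqueue_buddy() on the requested type */
};

/* Toy zone: every list is just a page counter. */
struct toy_zone {
    int pcp_pages;
    int highatomic_pages;
    int migratetype_pages;
};

/* Walk the stages in order for one ALLOC_HIGHATOMIC request. */
enum toy_source toy_alloc_highatomic(struct toy_zone *z, bool patched)
{
    if (z->pcp_pages > 0) {
        z->pcp_pages--;
        return SRC_PCPLIST;
    }
    /* Unpatched only: an empty pcplist is refilled from the requested type. */
    if (!patched && z->migratetype_pages > 0) {
        z->migratetype_pages--;
        return SRC_BULK_REFILL;
    }
    if (z->highatomic_pages > 0) {
        z->highatomic_pages--;
        return SRC_BUDDY_HIGHATOMIC;
    }
    if (z->migratetype_pages > 0) {
        z->migratetype_pages--;
        return SRC_BUDDY_MIGRATETYPE;
    }
    return SRC_NONE;
}
```

With an empty pcplist and one page on each freelist, the unpatched order in this model consumes the requested migratetype page, while the patched order consumes the reserved MIGRATE_HIGHATOMIC page.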
In comparative testing, the HighAtomic freelist sizes reported by
cat /proc/pagetypeinfo are:
Test without this patch:
Node 0, zone Normal, type HighAtomic 2369 771 138 15 0 0 0 0 0 0 0
Test with this patch:
Node 0, zone Normal, type HighAtomic 206 82 4 2 1 0 0 0 0 0 0
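To put the two rows on a common scale, the per-order counts can be converted to a page total: each column is the number of free blocks at orders 0 through 10, and a block of order n holds 2^n pages. The helper below is an illustrative sketch; `total_free_pages()` and the two arrays are names made up here, and the values are copied from the rows above.

```c
#include <stddef.h>

/* Sum free pages across orders: a free block of order n holds 2^n pages. */
unsigned long total_free_pages(const unsigned long *counts, size_t nr_orders)
{
    unsigned long total = 0;

    for (size_t order = 0; order < nr_orders; order++)
        total += counts[order] << order;
    return total;
}

/* HighAtomic rows from the test output, orders 0..10. */
const unsigned long without_patch[11] = {2369, 771, 138, 15, 0, 0, 0, 0, 0, 0, 0};
const unsigned long with_patch[11]    = {206, 82, 4, 2, 1, 0, 0, 0, 0, 0, 0};
```

For these rows the totals work out to 4583 pages without the patch and 418 pages with it.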
Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
---
mm/page_alloc.c | 33 ++++++++++++++++++++++++++++++---
1 file changed, 30 insertions(+), 3 deletions(-)
mode change 100644 => 100755 mm/page_alloc.c
Comments
On Mon, Nov 20, 2023 at 10:35:36AM +0800, Zhiguo Jiang wrote:
> + /*
> + * If pcplist is empty and alloc_flags is with ALLOC_HIGHATOMIC,
> + * it should alloc from buddy highatomic migrate freelist firstly
> + * to ensure quick and successful allocation.
Assuming that all the serious questions have been dealt with, let's
fix the less important problems ...
* If pcplist is empty and alloc_flags contains
* ALLOC_HIGHATOMIC, alloc from buddy highatomic
* freelist first.
> @@ -2918,7 +2927,7 @@ static inline
> struct page *rmqueue(struct zone *preferred_zone,
> struct zone *zone, unsigned int order,
> gfp_t gfp_flags, unsigned int alloc_flags,
> - int migratetype)
> + int migratetype, bool *highatomc_allocation)
bool *highatomic
> + /*
> + * The high-order atomic allocation pageblock reserved conditions:
> + *
> + * If the high-order atomic allocation page is alloced from pcplist,
> + * the highatomic pageblock does not need to be reserved, which can
> + * void to migrate an increasing number of pages into buddy
* avoid migrating an increasing number of pages into buddy
> + * MIGRATE_HIGHATOMIC freelist and lead to an increasing risk of
"increased"
> + * allocation failure on other buddy migrate freelists.
> + *
> + * If the high-order atomic allocation page is alloced from buddy
"allocated"
> @@ -3208,6 +3234,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
> struct pglist_data *last_pgdat = NULL;
> bool last_pgdat_dirty_ok = false;
> bool no_fallback;
> + bool highatomc_allocation = false;
Again, just call this 'highatomic'.
Thanks, I have updated the related modifications according to your
suggestions in patch v2.
On 2023/11/21 1:29, Matthew Wilcox wrote:
> On Mon, Nov 20, 2023 at 10:35:36AM +0800, Zhiguo Jiang wrote:
>> + /*
>> + * If pcplist is empty and alloc_flags is with ALLOC_HIGHATOMIC,
>> + * it should alloc from buddy highatomic migrate freelist firstly
>> + * to ensure quick and successful allocation.
> Assuming that all the serious questions have been dealt with, let's
> fix the less important problems ...
>
> * If pcplist is empty and alloc_flags contains
> * ALLOC_HIGHATOMIC, alloc from buddy highatomic
> * freelist first.
>
>> @@ -2918,7 +2927,7 @@ static inline
>> struct page *rmqueue(struct zone *preferred_zone,
>> struct zone *zone, unsigned int order,
>> gfp_t gfp_flags, unsigned int alloc_flags,
>> - int migratetype)
>> + int migratetype, bool *highatomc_allocation)
> bool *highatomic
>
>> + /*
>> + * The high-order atomic allocation pageblock reserved conditions:
>> + *
>> + * If the high-order atomic allocation page is alloced from pcplist,
>> + * the highatomic pageblock does not need to be reserved, which can
>> + * void to migrate an increasing number of pages into buddy
> * avoid migrating an increasing number of pages into buddy
>
>> + * MIGRATE_HIGHATOMIC freelist and lead to an increasing risk of
> "increased"
>
>> + * allocation failure on other buddy migrate freelists.
>> + *
>> + * If the high-order atomic allocation page is alloced from buddy
> "allocated"
>
>> @@ -3208,6 +3234,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>> struct pglist_data *last_pgdat = NULL;
>> bool last_pgdat_dirty_ok = false;
>> bool no_fallback;
>> + bool highatomc_allocation = false;
> Again, just call this 'highatomic'.
>
@@ -2850,11 +2850,20 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
int batch = nr_pcp_alloc(pcp, zone, order);
int alloced;
+ /*
+ * If pcplist is empty and alloc_flags is with ALLOC_HIGHATOMIC,
+ * it should alloc from buddy highatomic migrate freelist firstly
+ * to ensure quick and successful allocation.
+ */
+ if (alloc_flags & ALLOC_HIGHATOMIC)
+ goto out;
+
alloced = rmqueue_bulk(zone, order,
batch, list,
migratetype, alloc_flags);
pcp->count += alloced << order;
+out:
if (unlikely(list_empty(list)))
return NULL;
}
@@ -2918,7 +2927,7 @@ static inline
struct page *rmqueue(struct zone *preferred_zone,
struct zone *zone, unsigned int order,
gfp_t gfp_flags, unsigned int alloc_flags,
- int migratetype)
+ int migratetype, bool *highatomc_allocation)
{
struct page *page;
@@ -2938,6 +2947,23 @@ struct page *rmqueue(struct zone *preferred_zone,
page = rmqueue_buddy(preferred_zone, zone, order, alloc_flags,
migratetype);
+ /*
+ * The high-order atomic allocation pageblock reserved conditions:
+ *
+ * If the high-order atomic allocation page is alloced from pcplist,
+ * the highatomic pageblock does not need to be reserved, which can
+ * void to migrate an increasing number of pages into buddy
+ * MIGRATE_HIGHATOMIC freelist and lead to an increasing risk of
+ * allocation failure on other buddy migrate freelists.
+ *
+ * If the high-order atomic allocation page is alloced from buddy
+ * highatomic migrate freelist, regardless of whether the allocation
+ * is successful or not, the highatomic pageblock can try to be
+ * reserved.
+ */
+ if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
+ *highatomc_allocation = true;
+
out:
/* Separate test+clear to avoid unnecessary atomics */
if ((alloc_flags & ALLOC_KSWAPD) &&
@@ -3208,6 +3234,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
struct pglist_data *last_pgdat = NULL;
bool last_pgdat_dirty_ok = false;
bool no_fallback;
+ bool highatomc_allocation = false;
retry:
/*
@@ -3339,7 +3366,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
try_this_zone:
page = rmqueue(ac->preferred_zoneref->zone, zone, order,
- gfp_mask, alloc_flags, ac->migratetype);
+ gfp_mask, alloc_flags, ac->migratetype, &highatomc_allocation);
if (page) {
prep_new_page(page, order, gfp_mask, alloc_flags);
@@ -3347,7 +3374,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
* If this is a high-order atomic allocation then check
* if the pageblock should be reserved for the future
*/
- if (unlikely(alloc_flags & ALLOC_HIGHATOMIC))
+ if (unlikely(highatomc_allocation))
reserve_highatomic_pageblock(page, zone);
return page;
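The reservation decision in the diff above can be summarised with a small user-space sketch. It is a simplification under stated assumptions, not the kernel implementation: `TOY_ALLOC_HIGHATOMIC`, `toy_rmqueue()` and `toy_should_reserve()` are hypothetical stand-ins, the pcplist and buddy outcomes are passed in as booleans, and the order check that gates the pcplist path is left out.

```c
#include <stdbool.h>

#define TOY_ALLOC_HIGHATOMIC 0x1u /* hypothetical flag value for this sketch */

/*
 * Stand-in for rmqueue(): returns true if a page was obtained.
 * *highatomic is set only when the request falls through to the buddy
 * path with the high-atomic flag, whether or not buddy then succeeds.
 */
bool toy_rmqueue(bool pcp_has_page, bool buddy_has_page,
                 unsigned int alloc_flags, bool *highatomic)
{
    if (pcp_has_page)
        return true; /* served from the pcplist: flag left untouched */

    if (alloc_flags & TOY_ALLOC_HIGHATOMIC)
        *highatomic = true;

    return buddy_has_page;
}

/* Stand-in for the caller: reserve only if a page came back and the flag is set. */
bool toy_should_reserve(bool pcp_has_page, bool buddy_has_page,
                        unsigned int alloc_flags)
{
    bool highatomic = false;
    bool got_page = toy_rmqueue(pcp_has_page, buddy_has_page,
                                alloc_flags, &highatomic);

    return got_page && highatomic;
}
```

In this model a high-atomic request served from the pcplist never triggers a reservation, which is the behaviour change the patch is after; a request served from buddy still does.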