[V2,2/3] mm: page_alloc: correct high atomic reserve calculations

Message ID 905d99651423ee85aeb7a71982b95ee9bb05ee99.1699104759.git.quic_charante@quicinc.com
State New
Headers
Series mm: page_alloc: fixes for early oom kills |

Commit Message

Charan Teja Kalla Nov. 5, 2023, 12:50 p.m. UTC
  reserve_highatomic_pageblock() aims to reserve the 1% of the managed
pages of a zone, which is used for the high order atomic allocations.

It uses the below calculation to reserve:
static void reserve_highatomic_pageblock(struct page *page, ....) {

   .......
   max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;

   if (zone->nr_reserved_highatomic >= max_managed)
       goto out;

   zone->nr_reserved_highatomic += pageblock_nr_pages;
   set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
   move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);

out:
   ....
}

Since we are always appending the 1% of zone managed pages count to
pageblock_nr_pages, the minimum it is turning into 2 pageblocks as the
nr_reserved_highatomic is incremented/decremented in pageblock sizes.

Encountered a system(actually a VM running on the Linux kernel) with the
below zone configuration:
Normal free:7728kB boost:0kB min:804kB low:1004kB high:1204kB
reserved_highatomic:8192KB managed:49224kB

The existing calculations making it to reserve the 8MB(with pageblock
size of 4MB) i.e. 16% of the zone managed memory.  Reserving such high
amount of memory can easily exert memory pressure in the system thus may
lead into unnecessary reclaims till unreserving of high atomic reserves.

Since high atomic reserves are managed in pageblock size granules, as
MIGRATE_HIGHATOMIC is set for such pageblock, fix the calculations for
high atomic reserves as,  minimum is pageblock size , maximum is
approximately 1% of the zone managed pages.

Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
 mm/page_alloc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
  

Comments

Mel Gorman Nov. 16, 2023, 9:59 a.m. UTC | #1
On Sun, Nov 05, 2023 at 06:20:49PM +0530, Charan Teja Kalla wrote:
> reserve_highatomic_pageblock() aims to reserve the 1% of the managed
> pages of a zone, which is used for the high order atomic allocations.
> 
> It uses the below calculation to reserve:
> static void reserve_highatomic_pageblock(struct page *page, ....) {
> 
>    .......
>    max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;
> 
>    if (zone->nr_reserved_highatomic >= max_managed)
>        goto out;
> 
>    zone->nr_reserved_highatomic += pageblock_nr_pages;
>    set_pageblock_migratetype(page, MIGRATE_HIGHATOMIC);
>    move_freepages_block(zone, page, MIGRATE_HIGHATOMIC, NULL);
> 
> out:
>    ....
> }
> 
> Since we are always appending the 1% of zone managed pages count to
> pageblock_nr_pages, the minimum it is turning into 2 pageblocks as the
> nr_reserved_highatomic is incremented/decremented in pageblock sizes.
> 
> Encountered a system(actually a VM running on the Linux kernel) with the
> below zone configuration:
> Normal free:7728kB boost:0kB min:804kB low:1004kB high:1204kB
> reserved_highatomic:8192KB managed:49224kB
> 
> The existing calculations making it to reserve the 8MB(with pageblock
> size of 4MB) i.e. 16% of the zone managed memory.  Reserving such high
> amount of memory can easily exert memory pressure in the system thus may
> lead into unnecessary reclaims till unreserving of high atomic reserves.
> 
> Since high atomic reserves are managed in pageblock size granules, as
> MIGRATE_HIGHATOMIC is set for such pageblock, fix the calculations for
> high atomic reserves as,  minimum is pageblock size , maximum is
> approximately 1% of the zone managed pages.
> 
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

This patch in isolation seems fine with the caveat that such a small
system may find the atomic reserves to be borderline useless.

Acked-by: Mel Gorman <mgorman@techsingularity.net>
  
Michal Hocko Nov. 16, 2023, 12:52 p.m. UTC | #2
On Thu 16-11-23 09:59:01, Mel Gorman wrote:
[...]
> This patch in isolation seems fine with the caveat that such a small
> system may find the atomic reserves to be borderline useless.

Yes, exactly what I had in mind. Would it make sense to reserve the the
pageblock only if it really is less than 1% of available memory?
  
Mel Gorman Nov. 17, 2023, 4:19 p.m. UTC | #3
On Thu, Nov 16, 2023 at 01:52:26PM +0100, Michal Hocko wrote:
> On Thu 16-11-23 09:59:01, Mel Gorman wrote:
> [...]
> > This patch in isolation seems fine with the caveat that such a small
> > system may find the atomic reserves to be borderline useless.
> 
> Yes, exactly what I had in mind. Would it make sense to reserve the the
> pageblock only if it really is less than 1% of available memory?

I'd have no objection with the caveat that we need to watch out for new
bugs related to high-order atomic failures. If ths risk is noted in the
changelog and that it's a revert candidate then any bisection should
trivially find it.
  

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e07a38f..b91c99e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1883,10 +1883,11 @@  static void reserve_highatomic_pageblock(struct page *page, struct zone *zone)
 	unsigned long max_managed, flags;
 
 	/*
-	 * Limit the number reserved to 1 pageblock or roughly 1% of a zone.
+	 * The number reserved as: minimum is 1 pageblock, maximum is
+	 * roughly 1% of a zone.
 	 * Check is race-prone but harmless.
 	 */
-	max_managed = (zone_managed_pages(zone) / 100) + pageblock_nr_pages;
+	max_managed = ALIGN((zone_managed_pages(zone) / 100), pageblock_nr_pages);
 	if (zone->nr_reserved_highatomic >= max_managed)
 		return;