Message ID | d2ba7e41ee566309b594311207ffca736375fc16.1688715750.git.baolin.wang@linux.alibaba.com |
---|---|
State | New |
Headers |
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org
Cc: mgorman@techsingularity.net, vbabka@suse.cz, david@redhat.com, ying.huang@intel.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] mm: compaction: skip the memory hole rapidly when isolating free pages
Date: Fri, 7 Jul 2023 16:51:47 +0800
Message-Id: <d2ba7e41ee566309b594311207ffca736375fc16.1688715750.git.baolin.wang@linux.alibaba.com>
In-Reply-To: <b21cd8e2e32b9a1d9bc9e43ebf8acaf35e87f8df.1688715750.git.baolin.wang@linux.alibaba.com>
References: <b21cd8e2e32b9a1d9bc9e43ebf8acaf35e87f8df.1688715750.git.baolin.wang@linux.alibaba.com> |
Series |
[1/2] mm: compaction: use the correct type of list for free pages
Commit Message
Baolin Wang
July 7, 2023, 8:51 a.m. UTC
On my machine with the below memory layout, I can see that it takes more
time to skip the larger memory hole (range: 0x100000000 - 0x1800000000)
when isolating free pages. So add a new helper to skip the memory hole
rapidly, which can reduce the time consumed from about 70us to less
than 1us.
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000100000000-0x0000001fa7ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000040000000-0x0000000fffffffff]
[ 0.000000] node 0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
[ 0.000000] node 0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
[ 0.000000] node 0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
[ 0.000000] node 0: [mem 0x0000001fa4030000-0x0000001fa40effff]
[ 0.000000] node 0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
[ 0.000000] node 0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
[ 0.000000] node 0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
[ 0.000000] node 0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
[ 0.000000] node 0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
mm/compaction.c | 30 +++++++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
Comments
On 07.07.23 10:51, Baolin Wang wrote:
> On my machine with below memory layout, and I can see it will take more
> time to skip the larger memory hole (range: 0x100000000 - 0x1800000000)
> when isolating free pages. So adding a new helper to skip the memory
> hole rapidly, which can reduce the time consumed from about 70us to less
> than 1us.

Can you clarify how this relates to the previous commit and mention that
commit?

[...]

LGTM

Acked-by: David Hildenbrand <david@redhat.com>
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> On my machine with below memory layout, and I can see it will take more
> time to skip the larger memory hole (range: 0x100000000 - 0x1800000000)
> when isolating free pages. So adding a new helper to skip the memory
> hole rapidly, which can reduce the time consumed from about 70us to less
> than 1us.
>
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
> [ 0.000000] DMA32 empty
> [ 0.000000] Normal [mem 0x0000000100000000-0x0000001fa7ffffff]

The memory hole is at the beginning of zone NORMAL? If so, should zone
NORMAL start at 0x1800000000? And, the free pages will not be scanned
there? Or my understanding were wrong?

Best Regards,
Huang, Ying

[...]
On 7/10/2023 2:11 PM, Huang, Ying wrote:
> Baolin Wang <baolin.wang@linux.alibaba.com> writes:
>
>> On my machine with below memory layout, and I can see it will take more
>> time to skip the larger memory hole (range: 0x100000000 - 0x1800000000)
>> when isolating free pages. So adding a new helper to skip the memory
>> hole rapidly, which can reduce the time consumed from about 70us to less
>> than 1us.
>
> The memory hole is at the beginning of zone NORMAL? If so, should zone
> NORMAL start at 0x1800000000? And, the free pages will not be scanned
> there? Or my understanding were wrong?

No, the memory hole range is 0x1000000000 - 0x1800000000, and the normal
zone starts from 0x100000000.

I'm sorry I made a typo in the commit message, which confused you. The
memory hole range should be: 0x1000000000 - 0x1800000000. I updated the
commit message to the following and addressed David's comment:

"
Just like commit 9721fd82351d ("mm: compaction: skip memory hole rapidly
when isolating migratable pages"), I can see it will also take more time
to skip the larger memory hole (range: 0x1000000000 - 0x1800000000) when
isolating free pages on my machine with below memory layout. So like
commit 9721fd82351d, adding a new helper to skip the memory hole rapidly,
which can reduce the time consumed from about 70us to less than 1us.

[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000100000000-0x0000001fa7ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000040000000-0x0000000fffffffff]
[ 0.000000] node 0: [mem 0x0000001800000000-0x0000001fa3c7ffff]
[ 0.000000] node 0: [mem 0x0000001fa3c80000-0x0000001fa3ffffff]
[ 0.000000] node 0: [mem 0x0000001fa4000000-0x0000001fa402ffff]
[ 0.000000] node 0: [mem 0x0000001fa4030000-0x0000001fa40effff]
[ 0.000000] node 0: [mem 0x0000001fa40f0000-0x0000001fa73cffff]
[ 0.000000] node 0: [mem 0x0000001fa73d0000-0x0000001fa745ffff]
[ 0.000000] node 0: [mem 0x0000001fa7460000-0x0000001fa746ffff]
[ 0.000000] node 0: [mem 0x0000001fa7470000-0x0000001fa758ffff]
[ 0.000000] node 0: [mem 0x0000001fa7590000-0x0000001fa7ffffff]
"

[...]
Baolin Wang <baolin.wang@linux.alibaba.com> writes:

> On 7/10/2023 2:11 PM, Huang, Ying wrote:
>> The memory hole is at the beginning of zone NORMAL? If so, should zone
>> NORMAL start at 0x1800000000? And, the free pages will not be scanned
>> there? Or my understanding were wrong?
>
> No, the memory hole range is 0x1000000000 - 0x1800000000, and the
> normal zone is start from 0x100000000.
>
> I'm sorry I made a typo in the commit message, which confuses you. The
> memory hole range should be: 0x1000000000 - 0x1800000000. I updated
> the commit message to the following and addressed David's comment:

Got it! Thanks for explanation!

> "
> Just like commit 9721fd82351d ("mm: compaction: skip memory hole rapidly
> when isolating migratable pages"), I can see it will also take more
> time to skip the larger memory hole (range: 0x1000000000 - 0x1800000000)
> when isolating free pages on my machine with below memory layout. So
> like commit 9721fd82351d, adding a new helper to skip the memory hole
> rapidly, which can reduce the time consumed from about 70us to less
> than 1us.

LGTM.

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

[...]
diff --git a/mm/compaction.c b/mm/compaction.c
index 43358efdbdc2..9641e2131901 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -249,11 +249,31 @@ static unsigned long skip_offline_sections(unsigned long start_pfn)
 
 	return 0;
 }
+
+static unsigned long skip_offline_sections_reverse(unsigned long start_pfn)
+{
+	unsigned long start_nr = pfn_to_section_nr(start_pfn);
+
+	if (!start_nr || online_section_nr(start_nr))
+		return 0;
+
+	while (start_nr-- > 0) {
+		if (online_section_nr(start_nr))
+			return section_nr_to_pfn(start_nr) + PAGES_PER_SECTION - 1;
+	}
+
+	return 0;
+}
 #else
 static unsigned long skip_offline_sections(unsigned long start_pfn)
 {
 	return 0;
 }
+
+static unsigned long skip_offline_sections_reverse(unsigned long start_pfn)
+{
+	return 0;
+}
 #endif
 
 /*
@@ -1668,8 +1688,16 @@ static void isolate_freepages(struct compact_control *cc)
 
 		page = pageblock_pfn_to_page(block_start_pfn, block_end_pfn,
 									zone);
-		if (!page)
+		if (!page) {
+			unsigned long next_pfn;
+
+			next_pfn = skip_offline_sections_reverse(block_start_pfn);
+			if (next_pfn)
+				block_start_pfn = max(pageblock_start_pfn(next_pfn),
+						      low_pfn);
+
 			continue;
+		}
 
 		/* Check the block is suitable for migration */
 		if (!suitable_migration_target(cc, page))