[v6,0/4] Enable >0 order folio memory compaction

Message ID 20240216170432.1268753-1-zi.yan@sent.com
State New

Commit Message

Zi Yan Feb. 16, 2024, 5:04 p.m. UTC
  From: Zi Yan <ziy@nvidia.com>

Hi all,

This patchset enables >0 order folio memory compaction, which is one of
the prerequisites for large folio support[1]. It is on top of
mm-everything-2024-02-16-01-35.

I am aware that splitting free pages is necessary for folio
migration in compaction: if >0 order free pages are never split
and no order-0 free page is scanned, compaction ends prematurely because
migration returns -ENOMEM. Free page splitting is therefore a requirement
rather than an optimization.
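
To illustrate the point above, here is a minimal userspace sketch, not
kernel code. It models the isolated free pages as per-order counters;
struct free_model, MODEL_NR_ORDERS and take_free_page() are hypothetical
names invented for this example. With splitting disabled, an order-0
request fails with -ENOMEM even though a larger free page is available.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_NR_ORDERS 4	/* hypothetical: this model covers orders 0..3 */

/* Hypothetical model: nr_free[o] counts isolated free pages of order o. */
struct free_model {
	unsigned int nr_free[MODEL_NR_ORDERS];
};

/*
 * Take one free page of the requested order.  If none is available and
 * allow_split is true, split the smallest larger free page instead.
 * Returns 0 on success or -ENOMEM if the request cannot be satisfied.
 */
int take_free_page(struct free_model *m, unsigned int order, bool allow_split)
{
	unsigned int o;

	assert(order < MODEL_NR_ORDERS);

	if (m->nr_free[order]) {
		m->nr_free[order]--;
		return 0;
	}
	if (!allow_split)
		return -ENOMEM;

	for (o = order + 1; o < MODEL_NR_ORDERS; o++) {
		if (!m->nr_free[o])
			continue;
		m->nr_free[o]--;
		/*
		 * Halve the page repeatedly.  Each halving keeps one half
		 * on the list of the lower order; the last remaining half
		 * of the requested order goes to the caller.
		 */
		while (o > order) {
			o--;
			m->nr_free[o]++;
		}
		return 0;
	}
	return -ENOMEM;
}
```

Starting from a single order-3 free page, an order-0 request without
splitting returns -ENOMEM; with splitting it succeeds and leaves one free
page each at orders 0, 1 and 2.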

lkp ncompare results (on an 8-CPU (Intel Xeon E5-2650 v4 @2.20GHz) 16G VM)
for default LRU (-no-mglru) and CONFIG_LRU_GEN are shown at the bottom,
copied from V3[4].
In sum, most vm-scalability applications see no performance change,
while the others see a ~4% to ~26% performance boost under default LRU
and a ~2% to ~6% performance boost under CONFIG_LRU_GEN.


Changelog
===

From V5 [6]
1. Removed unused parameter in free_pages_prepare() and used it instead
of my old free_pages_prepare_fpi_none() (per Vlastimil Babka).

2. Removed unnecessary INIT_LIST_HEAD() in compaction_free()
(per Vlastimil Babka).

3. Fixed cc->nr_migratepages update in compaction_free()
(per Vlastimil Babka).


From V4 [5]:
1. Refactored code in compaction_alloc() in Patch 3 (per Yu Zhao).


From V3 [4]:
1. Restructured isolate_migratepages_block() to minimize PageHuge() use
in Patch 1 (per Vlastimil Babka).

2. Used folio_put_testzero() instead of folio_set_count() to properly
handle free pages in compaction_free() (per Vlastimil Babka).

3. Simplified code to use struct list_head instead of a new struct page_list
(per Vlastimil Babka).

4. Restructured compaction_alloc() code to reduce indentation and
increase readability (per Vlastimil Babka).


From V2 [3]:
1. Added missing free page count in fast isolation path. This fixed the
weird performance outcome.


From V1 [2]:
1. Used folio_test_large() instead of folio_order() > 0. (per Matthew
Wilcox)

2. Fixed code rebase error. (per Baolin Wang)

3. Used list_split_init() instead of list_split(). (per Ryan Roberts)

4. Added free_pages_prepare_fpi_none() to avoid duplicate free page code
in compaction_free().

5. Dropped source page order sorting patch.


From RFC [1]:
1. Enabled >0 order folio compaction in the first patch by splitting all
to-be-migrated folios. (per Huang, Ying)

2. Stopped isolating compound pages with order greater than cc->order
to avoid wasting effort, since cc->order gives a hint that no free pages
with order greater than it exist, thus migrating the compound pages will fail.
(per Baolin Wang)

3. Retained the folio check within lru lock. (per Baolin Wang)

4. Made isolate_freepages_block() generate order-sorted multi lists.
(per Johannes Weiner)

Overview
===

To support >0 order folio compaction, the patchset changes how free pages used
for migration are kept during compaction. Free pages used to be split into
order-0 pages that were post-allocation processed (i.e., PageBuddy flag cleared,
page order stored in page->private zeroed, and page reference count set to 1).
Now all free pages are kept in an NR_PAGE_ORDERS array of page lists, indexed
by their order, without post-allocation processing. When migrate_pages() asks
for a new page, one of the free pages, chosen based on the requested page
order, is then processed and handed out. THPs smaller than 2MB would need this
feature.
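
The scheme described above can be sketched in userspace C as follows. This
is only a model under stated assumptions, not the code in the patches:
struct model_page, struct model_cc, model_isolate() and model_alloc() are
hypothetical names, and the three fields merely stand in for the PageBuddy
flag, the order kept in page->private and the page reference count. The
sketch shows the deferral only; it does not split larger pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_ORDERS 4	/* hypothetical stand-in for the order count */

/* Hypothetical model of a free page; not the kernel's struct page. */
struct model_page {
	struct model_page *next;
	unsigned int order;	/* models the order kept in page->private */
	bool buddy;		/* models the PageBuddy flag */
	int refcount;		/* models the page reference count */
};

/* One singly linked list of unprocessed free pages per order. */
struct model_cc {
	struct model_page *freepages[MODEL_NR_ORDERS];
};

/* Isolation: park the free page on the list for its order, unprocessed. */
void model_isolate(struct model_cc *cc, struct model_page *page)
{
	assert(page->order < MODEL_NR_ORDERS);
	page->next = cc->freepages[page->order];
	cc->freepages[page->order] = page;
}

/*
 * Allocation: pop a page of exactly the requested order and only then
 * apply the post-allocation processing.  Returns NULL when the list for
 * that order is empty.
 */
struct model_page *model_alloc(struct model_cc *cc, unsigned int order)
{
	struct model_page *page;

	assert(order < MODEL_NR_ORDERS);
	page = cc->freepages[order];
	if (!page)
		return NULL;
	cc->freepages[order] = page->next;
	page->next = NULL;

	/* Deferred post-allocation processing. */
	page->buddy = false;
	page->order = 0;
	page->refcount = 1;
	return page;
}
```

A page isolated at order 2 keeps its buddy flag and zero reference count
while it sits on the list, and is processed only when an order-2 request
pops it.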


Feel free to give comments and ask questions.

Thanks.

[1] https://lore.kernel.org/linux-mm/f8d47176-03a8-99bf-a813-b5942830fd73@arm.com/
[2] https://lore.kernel.org/linux-mm/20231113170157.280181-1-zi.yan@sent.com/
[3] https://lore.kernel.org/linux-mm/20240123034636.1095672-1-zi.yan@sent.com/
[4] https://lore.kernel.org/linux-mm/20240202161554.565023-1-zi.yan@sent.com/
[5] https://lore.kernel.org/linux-mm/20240212163510.859822-1-zi.yan@sent.com/
[6] https://lore.kernel.org/linux-mm/20240214220420.1229173-1-zi.yan@sent.com/


Hi Andrew,

Baolin's patch on nr_migratepages was based on this one, a better fixup
for it might be below. Since before my patchset, compaction only deals with
order-0 pages.

vm-scalability results on CONFIG_LRU_GEN
===

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
  15107616            +3.2%   15590339            +1.3%   15297619            +3.0%   15567998        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
  12611785            +1.8%   12832919            +0.9%   12724223            +1.6%   12812682        vm-scalability.throughput


=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
   9833393            +5.7%   10390190            +3.0%   10126606            +5.9%   10408804        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19+
  6.8.0-rc1-split-folio-in-compaction+
  6.8.0-rc1-folio-migration-in-compaction+
  6.8.0-rc1-folio-migration-free-page-split+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
   7034709 ±  3%      +2.9%    7241429            +3.2%    7256680 ±  2%      +3.9%    7308375        vm-scalability.throughput



vm-scalability results on default LRU (with -no-mglru suffix)
===

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-xread-seq-mt/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
  14401491            +3.7%   14940270            +2.4%   14748626            +4.0%   14975716        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
  11407497            +5.1%   11989632            -0.5%   11349272            +4.8%   11957423        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/mmap-pread-seq-mt/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
  11348474            +3.3%   11719453            -1.2%   11208759            +3.7%   11771926        vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-readtwice/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
   8065614 ±  3%      +7.7%    8686626 ±  2%      +5.0%    8467577 ±  4%     +11.8%    9016077 ±  2%  vm-scalability.throughput

=========================================================================================
compiler/kconfig/rootfs/runtime/tbox_group/test/testcase:
  gcc-13/defconfig/debian/300s/qemu-vm/lru-file-mmap-read/vm-scalability

commit: 
  6.8.0-rc1-mm-everything-2024-01-29-07-19-no-mglru+
  6.8.0-rc1-split-folio-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-in-compaction-no-mglru+
  6.8.0-rc1-folio-migration-free-page-split-no-mglru+

6.8.0-rc1-mm-eve 6.8.0-rc1-split-folio-in-co 6.8.0-rc1-folio-migration-i 6.8.0-rc1-folio-migration-f 
---------------- --------------------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \          |                \  
   6438422 ±  2%     +27.5%    8206734 ±  2%     +10.6%    7118390           +26.2%    8127192 ±  4%  vm-scalability.throughput

Zi Yan (4):
  mm/page_alloc: remove unused fpi_flags in free_pages_prepare()
  mm/compaction: enable compacting >0 order folios.
  mm/compaction: add support for >0 order folio memory compaction.
  mm/compaction: optimize >0 order folio compaction with free page
    split.

 mm/compaction.c | 225 ++++++++++++++++++++++++++++++++----------------
 mm/internal.h   |   4 +-
 mm/page_alloc.c |  12 +--
 3 files changed, 162 insertions(+), 79 deletions(-)
  

Comments

Andrew Morton Feb. 20, 2024, 2:06 a.m. UTC | #1
On Fri, 16 Feb 2024 12:04:28 -0500 Zi Yan <zi.yan@sent.com> wrote:

> Baolin's patch

Baolin writes many patches and patches have names, please use them!

> on nr_migratepages was based on this one, a better fixup
> for it might be below. Since before my patchset, compaction only deals with
> order-0 pages.

I don't understand what this means.  The patchset you sent applies OK
to mm-unstable so what else is there to do?

> diff --git a/mm/compaction.c b/mm/compaction.c
> index 01ec85cfd623f..e60135e2019d6 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1798,7 +1798,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>  	dst = list_entry(cc->freepages.next, struct folio, lru);
>  	list_del(&dst->lru);
>  	cc->nr_freepages--;
> -	cc->nr_migratepages -= 1 << order;
> +	cc->nr_migratepages--;
>  
>  	return dst;
>  }
> @@ -1814,7 +1814,7 @@ static void compaction_free(struct folio *dst, unsigned long data)
>  
>  	list_add(&dst->lru, &cc->freepages);
>  	cc->nr_freepages++;
> -	cc->nr_migratepages += 1 << order;
> +	cc->nr_migratepages++;
>  }
  
Zi Yan Feb. 20, 2024, 2:31 a.m. UTC | #2
On 19 Feb 2024, at 21:06, Andrew Morton wrote:

> On Fri, 16 Feb 2024 12:04:28 -0500 Zi Yan <zi.yan@sent.com> wrote:
>
>> Baolin's patch
>
> Baolin writes many patches and patches have names, please use them!
Sorry for not being specific. I mean this fixup:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=97f749c7c82f677f89bbf4f10de7816ce9b071fe

to this patch:
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=ea87b0558293a5ad597bea606fe261f7b2650cda


The patch was based on top of my early version of this patchset, thus
uses "cc->nr_migratepages -= 1 << order;" and
"cc->nr_migratepages += 1 << order;", but now it is applied before
mine. The change should be "cc->nr_migratepages--;" and
"cc->nr_migratepages++;", respectively.
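
As a rough illustration of the difference (a userspace sketch, not part of
any posted patch; struct counter_model and both helpers are hypothetical
names), the two forms of accounting agree only when every destination is
an order-0 page:

```c
#include <assert.h>

/* Hypothetical model of the single counter discussed here. */
struct counter_model {
	unsigned long nr_migratepages;
};

/* Accounting when only order-0 pages are handled: one page per call. */
void account_alloc_order0(struct counter_model *cc)
{
	assert(cc->nr_migratepages >= 1);
	cc->nr_migratepages--;
}

/* Accounting when the destination may be a folio of the given order. */
void account_alloc_folio(struct counter_model *cc, unsigned int order)
{
	assert(cc->nr_migratepages >= (1UL << order));
	cc->nr_migratepages -= 1UL << order;
}
```

For order 0, "1 << order" is 1 and both helpers change the counter by the
same amount; for an order-2 folio the second helper subtracts 4.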

>
>> on nr_migratepages was based on this one, a better fixup
>> for it might be below. Since before my patchset, compaction only deals with
>> order-0 pages.
>
> I don't understand what this means.  The patchset you sent applies OK
> to mm-unstable so what else is there to do?

Your above fixup to Baolin's patch needs to be changed to the patch below
and my "mm/compaction: add support for >0 order folio memory compaction" will
need to be adjusted accordingly to be applied on top.

Let me know if anything is unclear.

>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 01ec85cfd623f..e60135e2019d6 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1798,7 +1798,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>  	dst = list_entry(cc->freepages.next, struct folio, lru);
>>  	list_del(&dst->lru);
>>  	cc->nr_freepages--;
>> -	cc->nr_migratepages -= 1 << order;
>> +	cc->nr_migratepages--;
>>
>>  	return dst;
>>  }
>> @@ -1814,7 +1814,7 @@ static void compaction_free(struct folio *dst, unsigned long data)
>>
>>  	list_add(&dst->lru, &cc->freepages);
>>  	cc->nr_freepages++;
>> -	cc->nr_migratepages += 1 << order;
>> +	cc->nr_migratepages++;
>>  }


--
Best Regards,
Yan, Zi
  
Baolin Wang Feb. 20, 2024, 3 a.m. UTC | #3
On 2024/2/20 10:31, Zi Yan wrote:
> On 19 Feb 2024, at 21:06, Andrew Morton wrote:
> 
>> On Fri, 16 Feb 2024 12:04:28 -0500 Zi Yan <zi.yan@sent.com> wrote:
>>
>>> Baolin's patch
>>
>> Baolin writes many patches and patches have names, please use them!
> Sorry for not being specific. I mean this fixup:
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=97f749c7c82f677f89bbf4f10de7816ce9b071fe
> 
> to this patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=ea87b0558293a5ad597bea606fe261f7b2650cda
> 
> 
> The patch was based on top of my early version of this patchset, thus
> uses "cc->nr_migratepages -= 1 << order;" and
> "cc->nr_migratepages += 1 << order;", but now it is applied before
> mine. The change should be "cc->nr_migratepages--;" and
> "cc->nr_migratepages++;", respectively.
> 
>>
>>> on nr_migratepages was based on this one, a better fixup
>>> for it might be below. Since before my patchset, compaction only deals with
>>> order-0 pages.
>>
>> I don't understand what this means.  The patchset you sent applies OK
>> to mm-unstable so what else is there to do?
> 
> Your above fixup to Baolin's patch needs to be changed to the patch below
> and my "mm/compaction: add support for >0 order folio memory compaction" will
> need to be adjusted accordingly to be applied on top.
> 
> Let me know if anything is unclear.

Hi Andrew,

To avoid conflicts, you can drop these two patches, and I will send a
new version that fixes the issue pointed out by Vlastimil on top of
"mm/compaction: add support for >0 order folio memory compaction".

https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=97f749c7c82f677f89bbf4f10de7816ce9b071fe

https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=ea87b0558293a5ad597bea606fe261f7b2650cda
  
Andrew Morton Feb. 20, 2024, 3:30 a.m. UTC | #4
On Tue, 20 Feb 2024 11:00:39 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:

> > The patch was based on top of my early version of this patchset, thus
> > uses "cc->nr_migratepages -= 1 << order;" and
> > "cc->nr_migratepages += 1 << order;", but now it is applied before
> > mine. The change should be "cc->nr_migratepages--;" and
> > "cc->nr_migratepages++;", respectively.
> > 
> >>
> >>> on nr_migratepages was based on this one, a better fixup
> >>> for it might be below. Since before my patchset, compaction only deals with
> >>> order-0 pages.
> >>
> >> I don't understand what this means.  The patchset you sent applies OK
> >> to mm-unstable so what else is there to do?
> > 
> > Your above fixup to Baolin's patch needs to be changed to the patch below
> > and my "mm/compaction: add support for >0 order folio memory compaction" will
> > need to be adjusted accordingly to be applied on top.
> > 
> > Let me know if anything is unclear.
> 
> Hi Andrew,
> 
> > To avoid conflicts, you can drop these two patches, and I will send a
> > new version that fixes the issue pointed out by Vlastimil on top of
> > "mm/compaction: add support for >0 order folio memory compaction".
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=97f749c7c82f677f89bbf4f10de7816ce9b071fe
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=ea87b0558293a5ad597bea606fe261f7b2650cda

Well I thought I'd fixed everything up 10 minutes ago.  Please take a
look at next mm-unstable.
  
Baolin Wang Feb. 20, 2024, 6:28 a.m. UTC | #5
On 2024/2/20 11:30, Andrew Morton wrote:
> On Tue, 20 Feb 2024 11:00:39 +0800 Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> 
>>> The patch was based on top of my early version of this patchset, thus
>>> uses "cc->nr_migratepages -= 1 << order;" and
>>> "cc->nr_migratepages += 1 << order;", but now it is applied before
>>> mine. The change should be "cc->nr_migratepages--;" and
>>> "cc->nr_migratepages++;", respectively.
>>>
>>>>
>>>>> on nr_migratepages was based on this one, a better fixup
>>>>> for it might be below. Since before my patchset, compaction only deals with
>>>>> order-0 pages.
>>>>
>>>> I don't understand what this means.  The patchset you sent applies OK
>>>> to mm-unstable so what else is there to do?
>>>
>>> Your above fixup to Baolin's patch needs to be changed to the patch below
>>> and my "mm/compaction: add support for >0 order folio memory compaction" will
>>> need to be adjusted accordingly to be applied on top.
>>>
>>> Let me know if anything is unclear.
>>
>> Hi Andrew,
>>
>> To avoid conflicts, you can drop these two patches, and I will send a
>> new version that fixes the issue pointed out by Vlastimil on top of
>> "mm/compaction: add support for >0 order folio memory compaction".
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=97f749c7c82f677f89bbf4f10de7816ce9b071fe
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything-2024-02-16-01-35&id=ea87b0558293a5ad597bea606fe261f7b2650cda
> 
> Well I thought I'd fixed everything up 10 minutes ago.  Please take a
> look at next mm-unstable.

Sure. I found a minor rebase error in the compaction_alloc()
function; Zi Yan's original patch is correct.

+	cc->nr_freepages -= 1 << order;
  	cc->nr_migratepages--;
-
-	return dst;
+	return page_rmappable_folio(&dst->page);
  }

The cc->nr_migratepages line should be changed to:
cc->nr_migratepages -= 1 << order;
  

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index 01ec85cfd623f..e60135e2019d6 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1798,7 +1798,7 @@  static struct folio *compaction_alloc(struct folio *src, unsigned long data)
 	dst = list_entry(cc->freepages.next, struct folio, lru);
 	list_del(&dst->lru);
 	cc->nr_freepages--;
-	cc->nr_migratepages -= 1 << order;
+	cc->nr_migratepages--;
 
 	return dst;
 }
@@ -1814,7 +1814,7 @@  static void compaction_free(struct folio *dst, unsigned long data)
 
 	list_add(&dst->lru, &cc->freepages);
 	cc->nr_freepages++;
-	cc->nr_migratepages += 1 << order;
+	cc->nr_migratepages++;
 }