[mm-unstable,v3,0/4] mm/mglru: Kconfig cleanup

Message ID 20231220040037.883811-1-kinseyho@google.com
Series mm/mglru: Kconfig cleanup

Message

Kinsey Ho Dec. 20, 2023, 4 a.m. UTC
  This series is the result of the following discussion:
https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/

It mainly avoids building the code that walks page tables on CPUs that
do not use it, i.e., those that don't support the hardware accessed bit.
Specifically, it introduces a new Kconfig option to guard some of the
functions added by
commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
on CPUs like POWER9, on which the series was tested.


Kinsey Ho (4):
  mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNG
  mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU
  mm/mglru: remove CONFIG_MEMCG
  mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE

 arch/Kconfig                   |   8 +
 arch/arm64/Kconfig             |   1 +
 arch/x86/Kconfig               |   1 +
 arch/x86/include/asm/pgtable.h |   6 -
 include/linux/memcontrol.h     |   2 +-
 include/linux/mm_types.h       |  16 +-
 include/linux/mmzone.h         |  28 +---
 include/linux/pgtable.h        |   2 +-
 kernel/fork.c                  |   2 +-
 mm/Kconfig                     |   4 +
 mm/vmscan.c                    | 271 ++++++++++++++++++---------------
 11 files changed, 174 insertions(+), 167 deletions(-)
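
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of how a Kconfig symbol can keep code out of a build. Only the
symbol name CONFIG_LRU_GEN_WALKS_MMU comes from this series; the demo_*
functions are hypothetical and are not the functions in mm/vmscan.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Kconfig normally generates CONFIG_* macros at build time. In this
 * standalone sketch the symbol is simply left undefined, or defined by hand
 * with -DCONFIG_LRU_GEN_WALKS_MMU, to select one of the two variants below.
 */
#ifdef CONFIG_LRU_GEN_WALKS_MMU
/* Hypothetical stand-in for the page table walk; built only when enabled. */
static bool demo_walk_page_tables(int *nr_young)
{
	assert(nr_young != NULL);
	*nr_young += 1;		/* pretend the walk found one accessed PTE */
	return true;
}
#else
/* Stub variant: no walker code is built, and callers need no #ifdef. */
static bool demo_walk_page_tables(int *nr_young)
{
	assert(nr_young != NULL);
	return false;		/* report that no walk took place */
}
#endif

/* Hypothetical caller: returns true if aging used the page table walk. */
static bool demo_aging_used_walk(void)
{
	int nr_young = 0;

	return demo_walk_page_tables(&nr_young);
}
```

Because both variants share one signature, the caller compiles unchanged
either way, and the walker body is present in the object file only when
the symbol is defined.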
  

Comments

Yu Zhao Dec. 20, 2023, 4:16 a.m. UTC | #1
On Tue, Dec 19, 2023 at 9:01 PM Kinsey Ho <kinseyho@google.com> wrote:
>
> This series is the result of the following discussion:
> https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/
>
> It mainly avoids building the code that walks page tables on CPUs that
> do not use it, i.e., those that don't support the hardware accessed bit.
> Specifically, it introduces a new Kconfig option to guard some of the
> functions added by
> commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> on CPUs like POWER9, on which the series was tested.
>
>
> Kinsey Ho (4):
>   mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNG
>   mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU
>   mm/mglru: remove CONFIG_MEMCG
>   mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
>
>  arch/Kconfig                   |   8 +
>  arch/arm64/Kconfig             |   1 +
>  arch/x86/Kconfig               |   1 +
>  arch/x86/include/asm/pgtable.h |   6 -
>  include/linux/memcontrol.h     |   2 +-
>  include/linux/mm_types.h       |  16 +-
>  include/linux/mmzone.h         |  28 +---
>  include/linux/pgtable.h        |   2 +-
>  kernel/fork.c                  |   2 +-
>  mm/Kconfig                     |   4 +
>  mm/vmscan.c                    | 271 ++++++++++++++++++---------------
>  11 files changed, 174 insertions(+), 167 deletions(-)

+Donet Tom <donettom@linux.vnet.ibm.com>
who is also working on this.

Donet, could you try this latest version instead? If it works as well as
the old one you've been using, could you please provide your Tested-by tag?
Thanks.
  
Donet Tom Dec. 20, 2023, 1:45 p.m. UTC | #2
On 12/20/23 09:46, Yu Zhao wrote:
> On Tue, Dec 19, 2023 at 9:01 PM Kinsey Ho <kinseyho@google.com> wrote:
>> This series is the result of the following discussion:
>> https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/
>>
>> It mainly avoids building the code that walks page tables on CPUs that
>> do not use it, i.e., those that don't support the hardware accessed bit.
>> Specifically, it introduces a new Kconfig option to guard some of the
>> functions added by
>> commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
>> on CPUs like POWER9, on which the series was tested.
>>
>>
>> Kinsey Ho (4):
>>    mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNG
>>    mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU
>>    mm/mglru: remove CONFIG_MEMCG
>>    mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
>>
>>   arch/Kconfig                   |   8 +
>>   arch/arm64/Kconfig             |   1 +
>>   arch/x86/Kconfig               |   1 +
>>   arch/x86/include/asm/pgtable.h |   6 -
>>   include/linux/memcontrol.h     |   2 +-
>>   include/linux/mm_types.h       |  16 +-
>>   include/linux/mmzone.h         |  28 +---
>>   include/linux/pgtable.h        |   2 +-
>>   kernel/fork.c                  |   2 +-
>>   mm/Kconfig                     |   4 +
>>   mm/vmscan.c                    | 271 ++++++++++++++++++---------------
>>   11 files changed, 174 insertions(+), 167 deletions(-)
> +Donet Tom <donettom@linux.vnet.ibm.com>
> who is also working on this.
>
> Donet, could you try this latest version instead? If it works as well as
> the old one you've been using, could you please provide your Tested-by tag?
> Thanks.

Hi Yu Zhao,

This patch set looks promising.

I have conducted tests on PowerPC and x86.

The old patch set had a cleanup patch that removes the
struct scan_control *sc argument from try_to_inc_max_seq() and
run_aging(). Do we need to include that patch?

=>Here are some test results from PowerPC.

# ls -l vmscan.o
-rw-r--r--. 1 root root 3600080 Dec 19 22:35 vmscan.o

# size vmscan.o
   text       data           bss      dec         hex filename
   95086      27412          0        122498      1de82 vmscan.o

# ./scripts/bloat-o-meter vmscan.o.old vmscan.o
add/remove: 4/8 grow/shrink: 7/9 up/down: 860/-2524 (-1664)
Function                              old       new     delta
should_abort_scan                      -        472     +472
inc_max_seq.isra                      1472      1612    +140
shrink_one                            680       760     +80
lru_gen_release_memcg                 508       556     +48
lru_gen_init_pgdat                    92        132     +40
shrink_node                           4040      4064    +24
lru_gen_online_memcg                  680       696     +16
lru_gen_change_state                  3968      3984    +16
------
shrink_lruvec                         2168      2152    -16
lru_gen_seq_write                     1980      1964    -16
isolate_folios                        6904      6888    -16
lru_gen_init_memcg                    32        12      -20
mm_list                               24        -       -24
lru_gen_exit_memcg                    388       344     -44
try_to_shrink_lruvec                  904       816     -88
lru_gen_rotate_memcg                  832       700     -132
lru_gen_migrate_mm                    132       -       -132
lru_gen_seq_show                      1484      1308    -176
iterate_mm_list_nowalk                288       -       -288
lru_gen_look_around                   2284      1984    -300
lru_gen_add_mm                        528       -       -528
lru_gen_del_mm                        720       -       -720
Total: Before=116213, After=114549, chg -1.43%

=>Here are some test results from x86.

$ ls -l vmscan.o
-rw-r--r--. 1 donettom donettom 2545792 Dec 20 15:16 vmscan.o

$ size vmscan.o
   text          data          bss    dec        hex filename
   109751        32189         0      141940     22a74 vmscan.o
$

$ ./scripts/bloat-o-meter vmscan.o.old vmscan.o
add/remove: 7/3 grow/shrink: 14/4 up/down: 2307/-1534 (773)
Function                                old       new      delta
inc_max_seq                             -         1470     +1470
should_abort_scan                       -         229      +229
isolate_folios                          4469      4562     +93
lru_gen_rotate_memcg                    641       731      +90
lru_gen_init_memcg                      41        99       +58
lru_gen_release_memcg                   282       336      +54
lru_gen_exit_memcg                      306       350      +44
walk_pud_range                          2502      2543     +41
shrink_node                             2912      2951     +39
lru_gen_online_memcg                    402       434      +32
lru_gen_seq_show                        1112      1140     +28
lru_gen_add_folio                       740       757      +17
lru_gen_look_around                     1217      1233     +16
__pfx_should_abort_scan                 -         16       +16
__pfx_inc_max_seq                       -         16       +16
iterate_mm_list_nowalk                  277       292      +15
shrink_one                              413       426      +13
lru_gen_init_lruvec                     190       202      +12
-----
try_to_shrink_lruvec                    717       643      -74
lru_gen_init_pgdat                      196       82       -114
try_to_inc_max_seq.isra                 2897      1578     -1319
Total: Before=101095, After=101868, chg +0.76%
$
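
The totals and percentages printed by bloat-o-meter above can be
cross-checked by hand. A minimal sketch, assuming the "chg" field is the
net change relative to the "Before" total; the helper names are
hypothetical:

```c
#include <assert.h>

/* Net size change in bytes: positive means the object grew. */
static long net_delta(long before, long after)
{
	return after - before;
}

/* Relative change in percent, measured against the "Before" total. */
static double pct_change(long before, long after)
{
	assert(before > 0);	/* a zero baseline has no meaningful percentage */
	return 100.0 * (double)(after - before) / (double)before;
}
```

With the PowerPC totals, net_delta(116213, 114549) gives -1664 and
pct_change() gives about -1.43. With the x86 totals (101095, 101868) the
results are 773 and about +0.76, matching both reports.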


Tested-by: Donet Tom <donettom@linux.vnet.ibm.com>

Thanks
Donet Tom
  
Yu Zhao Dec. 20, 2023, 3:16 p.m. UTC | #3
On Wed, Dec 20, 2023 at 6:45 AM Donet Tom <donettom@linux.vnet.ibm.com> wrote:
>
>
> On 12/20/23 09:46, Yu Zhao wrote:
> > On Tue, Dec 19, 2023 at 9:01 PM Kinsey Ho <kinseyho@google.com> wrote:
> >> This series is the result of the following discussion:
> >> https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/
> >>
> >> It mainly avoids building the code that walks page tables on CPUs that
> >> do not use it, i.e., those that don't support the hardware accessed bit.
> >> Specifically, it introduces a new Kconfig option to guard some of the
> >> functions added by
> >> commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> >> on CPUs like POWER9, on which the series was tested.
> >>
> >>
> >> Kinsey Ho (4):
> >>    mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNG
> >>    mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU
> >>    mm/mglru: remove CONFIG_MEMCG
> >>    mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
> >>
> >>   arch/Kconfig                   |   8 +
> >>   arch/arm64/Kconfig             |   1 +
> >>   arch/x86/Kconfig               |   1 +
> >>   arch/x86/include/asm/pgtable.h |   6 -
> >>   include/linux/memcontrol.h     |   2 +-
> >>   include/linux/mm_types.h       |  16 +-
> >>   include/linux/mmzone.h         |  28 +---
> >>   include/linux/pgtable.h        |   2 +-
> >>   kernel/fork.c                  |   2 +-
> >>   mm/Kconfig                     |   4 +
> >>   mm/vmscan.c                    | 271 ++++++++++++++++++---------------
> >>   11 files changed, 174 insertions(+), 167 deletions(-)
> > +Donet Tom <donettom@linux.vnet.ibm.com>
> > who is also working on this.
> >
> > Donet, could you try this latest version instead? If it works as well as
> > the old one you've been using, could you please provide your Tested-by tag?
> > Thanks.
>
> Hi Yu Zhao,
>
> This patch set looks promising.
>
> I have conducted tests on PowerPC and x86.
>
> The old patch set had a cleanup patch that removes the
> struct scan_control *sc argument from try_to_inc_max_seq() and
> run_aging(). Do we need to include that patch?

Sorry for not including that patch in this series.

It's the first patch in the next cleanup series, which we haven't
fully tested yet. It'll be the first order of business after the
holiday season (mid-January). Does that schedule work for you?

> =>Here are some test results from PowerPC.
>
> # ls -l vmscan.o
> -rw-r--r--. 1 root root 3600080 Dec 19 22:35 vmscan.o
>
> # size vmscan.o
>    text       data           bss      dec         hex filename
>    95086      27412          0        122498      1de82 vmscan.o
>
> # ./scripts/bloat-o-meter vmscan.o.old vmscan.o
> add/remove: 4/8 grow/shrink: 7/9 up/down: 860/-2524 (-1664)
> Function                              old       new     delta
> should_abort_scan                      -        472     +472
> inc_max_seq.isra                      1472      1612    +140
> shrink_one                            680       760     +80
> lru_gen_release_memcg                 508       556     +48
> lru_gen_init_pgdat                    92        132     +40
> shrink_node                           4040      4064    +24
> lru_gen_online_memcg                  680       696     +16
> lru_gen_change_state                  3968      3984    +16
> ------
> shrink_lruvec                         2168      2152    -16
> lru_gen_seq_write                     1980      1964    -16
> isolate_folios                        6904      6888    -16
> lru_gen_init_memcg                    32        12      -20
> mm_list                               24        -       -24
> lru_gen_exit_memcg                    388       344     -44
> try_to_shrink_lruvec                  904       816     -88
> lru_gen_rotate_memcg                  832       700     -132
> lru_gen_migrate_mm                    132       -       -132
> lru_gen_seq_show                      1484      1308    -176
> iterate_mm_list_nowalk                288       -       -288
> lru_gen_look_around                   2284      1984    -300
> lru_gen_add_mm                        528       -       -528
> lru_gen_del_mm                        720       -       -720
> Total: Before=116213, After=114549, chg -1.43%
>
> =>Here are some test results from x86.
>
> $ ls -l vmscan.o
> -rw-r--r--. 1 donettom donettom 2545792 Dec 20 15:16 vmscan.o
>
> $ size vmscan.o
>    text          data          bss    dec        hex filename
>    109751        32189         0      141940     22a74 vmscan.o
> $
>
> $ ./scripts/bloat-o-meter vmscan.o.old vmscan.o
> add/remove: 7/3 grow/shrink: 14/4 up/down: 2307/-1534 (773)
> Function                                old       new      delta
> inc_max_seq                             -         1470     +1470
> should_abort_scan                       -         229      +229
> isolate_folios                          4469      4562     +93
> lru_gen_rotate_memcg                    641       731      +90
> lru_gen_init_memcg                      41        99       +58
> lru_gen_release_memcg                   282       336      +54
> lru_gen_exit_memcg                      306       350      +44
> walk_pud_range                          2502      2543     +41
> shrink_node                             2912      2951     +39
> lru_gen_online_memcg                    402       434      +32
> lru_gen_seq_show                        1112      1140     +28
> lru_gen_add_folio                       740       757      +17
> lru_gen_look_around                     1217      1233     +16
> __pfx_should_abort_scan                 -         16       +16
> __pfx_inc_max_seq                       -         16       +16
> iterate_mm_list_nowalk                  277       292      +15
> shrink_one                              413       426      +13
> lru_gen_init_lruvec                     190       202      +12
> -----
> try_to_shrink_lruvec                    717       643      -74
> lru_gen_init_pgdat                      196       82       -114
> try_to_inc_max_seq.isra                 2897      1578     -1319
> Total: Before=101095, After=101868, chg +0.76%
> $
>
>
> Tested-by: Donet Tom <donettom@linux.vnet.ibm.com>

Thanks!

Acked-by: Yu Zhao <yuzhao@google.com>
  
Donet Tom Dec. 21, 2023, 5:08 a.m. UTC | #4
On 12/20/23 20:46, Yu Zhao wrote:
> On Wed, Dec 20, 2023 at 6:45 AM Donet Tom <donettom@linux.vnet.ibm.com> wrote:
>>
>> On 12/20/23 09:46, Yu Zhao wrote:
>>> On Tue, Dec 19, 2023 at 9:01 PM Kinsey Ho <kinseyho@google.com> wrote:
>>>> This series is the result of the following discussion:
>>>> https://lore.kernel.org/47066176-bd93-55dd-c2fa-002299d9e034@linux.ibm.com/
>>>>
>>>> It mainly avoids building the code that walks page tables on CPUs that
>>>> do not use it, i.e., those that don't support the hardware accessed bit.
>>>> Specifically, it introduces a new Kconfig option to guard some of the
>>>> functions added by
>>>> commit bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
>>>> on CPUs like POWER9, on which the series was tested.
>>>>
>>>>
>>>> Kinsey Ho (4):
>>>>     mm/mglru: add CONFIG_ARCH_HAS_HW_PTE_YOUNG
>>>>     mm/mglru: add CONFIG_LRU_GEN_WALKS_MMU
>>>>     mm/mglru: remove CONFIG_MEMCG
>>>>     mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE
>>>>
>>>>    arch/Kconfig                   |   8 +
>>>>    arch/arm64/Kconfig             |   1 +
>>>>    arch/x86/Kconfig               |   1 +
>>>>    arch/x86/include/asm/pgtable.h |   6 -
>>>>    include/linux/memcontrol.h     |   2 +-
>>>>    include/linux/mm_types.h       |  16 +-
>>>>    include/linux/mmzone.h         |  28 +---
>>>>    include/linux/pgtable.h        |   2 +-
>>>>    kernel/fork.c                  |   2 +-
>>>>    mm/Kconfig                     |   4 +
>>>>    mm/vmscan.c                    | 271 ++++++++++++++++++---------------
>>>>    11 files changed, 174 insertions(+), 167 deletions(-)
>>> +Donet Tom <donettom@linux.vnet.ibm.com>
>>> who is also working on this.
>>>
>>> Donet, could you try this latest version instead? If it works as well as
>>> the old one you've been using, could you please provide your Tested-by tag?
>>> Thanks.
>> Hi Yu Zhao,
>>
>> This patch set looks promising.
>>
>> I have conducted tests on PowerPC and x86.
>>
>> The old patch set had a cleanup patch that removes the
>> struct scan_control *sc argument from try_to_inc_max_seq() and
>> run_aging(). Do we need to include that patch?
> Sorry for not including that patch in this series.
>
> It's the first patch in the next cleanup series, which we haven't
> fully tested yet. It'll be the first order of business after the
> holiday season (mid-January). Does that schedule work for you?
>
Yes, no problem.

Thank you very much.

Donet Tom

