mm, vmscan: Don't turn on cache_trim_mode at the highest scan priority

Message ID 20240208061825.36640-1-byungchul@sk.com
State New

Commit Message

Byungchul Park Feb. 8, 2024, 6:18 a.m. UTC
  With cache_trim_mode on, the reclaim logic doesn't bother reclaiming anon
pages. However, the mode should be turned on more carefully, because it
prevents anon pages from being reclaimed even if there is a huge amount
of anon pages that are very cold and so should be reclaimed. Even worse,
that can lead kswapd_failures to reach MAX_RECLAIM_RETRIES and kswapd to
stop until direct reclaim eventually works to resume kswapd.

Signed-off-by: Byungchul Park <byungchul@sk.com>
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Yu Zhao Feb. 16, 2024, 5:55 a.m. UTC | #1
On Thu, Feb 8, 2024 at 1:18 AM Byungchul Park <byungchul@sk.com> wrote:
>
> With cache_trim_mode on, reclaim logic doesn't bother reclaiming anon
> pages. However, it should be more careful to turn on the mode because
> it's going to prevent anon pages from reclaimed even if there are huge
> ammount of anon pages that are very cold so should be reclaimed. Even
> worse, that can lead kswapd_failures to be MAX_RECLAIM_RETRIES and stop
> until direct reclaim eventually works to resume kswapd.

Is this a theory or something observed in the real world? If it's the
former, would this change risk breaking existing use cases? If it's the
latter, where are the performance numbers to show what it looks like
before and after this patch?

> Signed-off-by: Byungchul Park <byungchul@sk.com>
> ---
>  mm/vmscan.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index bba207f41b14..25b55fdc0d41 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2268,7 +2268,8 @@ static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
>          * anonymous pages.
>          */
>         file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> -       if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> +       if (sc->priority != 1 && file >> sc->priority &&

Why 1?

> +           !(sc->may_deactivate & DEACTIVATE_FILE))
>                 sc->cache_trim_mode = 1;
>         else
>                 sc->cache_trim_mode = 0;
  
Byungchul Park Feb. 16, 2024, 7:24 a.m. UTC | #2
On Fri, Feb 16, 2024 at 12:55:17AM -0500, Yu Zhao wrote:
> On Thu, Feb 8, 2024 at 1:18 AM Byungchul Park <byungchul@sk.com> wrote:
> >
> > With cache_trim_mode on, reclaim logic doesn't bother reclaiming anon
> > pages. However, it should be more careful to turn on the mode because
> > it's going to prevent anon pages from reclaimed even if there are huge
> > ammount of anon pages that are very cold so should be reclaimed. Even
> > worse, that can lead kswapd_failures to be MAX_RECLAIM_RETRIES and stop
> > until direct reclaim eventually works to resume kswapd.
> 
> Is a theory or something observed in the real world? If it's the
> former, would this change risk breaking existing use cases? It's the

I faced the latter case.

> latter, where are the performance numbers to show what it looks like
> before and after this patch?

Before:

Whenever the system meets the condition to turn on cache_trim_mode but
has few cache pages to trim, kswapd fails without scanning anon pages,
even though they are plentiful and certainly cold. It retries 8 times
and then looks *stopped for ever*.

After:

When the system meets the condition to turn on cache_trim_mode but has
few cache pages to trim, kswapd finally works at the highest scan
priority. So kswapd appears to work well even under the same condition.

> > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > ---
> >  mm/vmscan.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index bba207f41b14..25b55fdc0d41 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -2268,7 +2268,8 @@ static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
> >          * anonymous pages.
> >          */
> >         file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> > -       if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> > +       if (sc->priority != 1 && file >> sc->priority &&
> 
> Why 1?

It means the highest scan priority. The priority goes from DEF_PRIORITY
to 1.

	Byungchul

> > +           !(sc->may_deactivate & DEACTIVATE_FILE))
> >                 sc->cache_trim_mode = 1;
> >         else
> >                 sc->cache_trim_mode = 0;
  
Yu Zhao Feb. 17, 2024, 5:11 a.m. UTC | #3
On Fri, Feb 16, 2024 at 2:24 AM Byungchul Park <byungchul@sk.com> wrote:
>
> On Fri, Feb 16, 2024 at 12:55:17AM -0500, Yu Zhao wrote:
> > On Thu, Feb 8, 2024 at 1:18 AM Byungchul Park <byungchul@sk.com> wrote:
> > >
> > > With cache_trim_mode on, reclaim logic doesn't bother reclaiming anon
> > > pages. However, it should be more careful to turn on the mode because
> > > it's going to prevent anon pages from reclaimed even if there are huge
> > > ammount of anon pages that are very cold so should be reclaimed. Even
> > > worse, that can lead kswapd_failures to be MAX_RECLAIM_RETRIES and stop
> > > until direct reclaim eventually works to resume kswapd.
> >
> > Is a theory or something observed in the real world? If it's the
> > former, would this change risk breaking existing use cases? It's the
>
> I faced the latter case.
>
> > latter, where are the performance numbers to show what it looks like
> > before and after this patch?

Let me ask again: where are the performance numbers to show what it
looks like before and after this patch?

> Before:
>
> Whenever the system meets the condition to turn on cache_trim_mode but
> few cache pages to trim, kswapd fails without scanning anon pages that
> are plenty and cold for sure and it retries 8 times and looks *stopped
> for ever*.
>
> After:
>
> When the system meets the condition to turn on cache_trim_mode but few
> cache pages to trim, kswapd finally works at the highest scan priority.
> So kswap looks working well even in the same condition.

These are not performance numbers -- what test cases can prove what's
described here?

> > > Signed-off-by: Byungchul Park <byungchul@sk.com>
> > > ---
> > >  mm/vmscan.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index bba207f41b14..25b55fdc0d41 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -2268,7 +2268,8 @@ static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
> > >          * anonymous pages.
> > >          */
> > >         file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
> > > -       if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
> > > +       if (sc->priority != 1 && file >> sc->priority &&
> >
> > Why 1?
>
> It means the highest scan priority. The priority goes from DEF_PRIORITY
> to 1.

This is not true -- sc->priority can go all the way to zero.
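
To illustrate the point about the range, here is a small user-space
sketch (not kernel code). It assumes DEF_PRIORITY is 12 and a reclaim
loop that counts the priority all the way down to zero; the function
name is made up for illustration. It records the priorities at which
the patched condition `priority != 1 && file >> priority` still holds.

```c
/* Assumed value of the kernel's DEF_PRIORITY; illustration only. */
#define DEF_PRIORITY 12

/*
 * Hypothetical helper: returns a bitmask with bit N set when the
 * patched check would still allow cache_trim_mode at priority N,
 * ignoring the may_deactivate part of the condition.
 */
static unsigned int priorities_allowing_trim(unsigned long file)
{
	unsigned int mask = 0;

	/* Count down to 0, not 1, as described in this reply. */
	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		if (priority != 1 && (file >> priority))
			mask |= 1u << priority;
	}
	return mask;
}
```

With any non-zero `file`, bit 0 of the result stays set, so under these
assumptions a check against 1 alone would not cover a pass that runs at
priority 0.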

>         Byungchul
>
> > > +           !(sc->may_deactivate & DEACTIVATE_FILE))
> > >                 sc->cache_trim_mode = 1;
> > >         else
> > >                 sc->cache_trim_mode = 0;
  
Andrew Morton Feb. 21, 2024, 10:30 p.m. UTC | #4
On Sat, 17 Feb 2024 00:11:25 -0500 Yu Zhao <yuzhao@google.com> wrote:

> On Fri, Feb 16, 2024 at 2:24 AM Byungchul Park <byungchul@sk.com> wrote:
> >
> > On Fri, Feb 16, 2024 at 12:55:17AM -0500, Yu Zhao wrote:
> > > On Thu, Feb 8, 2024 at 1:18 AM Byungchul Park <byungchul@sk.com> wrote:
> > > >
> > > > With cache_trim_mode on, reclaim logic doesn't bother reclaiming anon
> > > > pages. However, it should be more careful to turn on the mode because
> > > > it's going to prevent anon pages from reclaimed even if there are huge
> > > > ammount of anon pages that are very cold so should be reclaimed. Even
> > > > worse, that can lead kswapd_failures to be MAX_RECLAIM_RETRIES and stop
> > > > until direct reclaim eventually works to resume kswapd.
> > >
> > > Is a theory or something observed in the real world? If it's the
> > > former, would this change risk breaking existing use cases? It's the
> >
> > I faced the latter case.
> >
> > > latter, where are the performance numbers to show what it looks like
> > > before and after this patch?
> 
> Let me ask again: where are the performance numbers to show what it
> looks like before and after this patch?
> 
> > Before:
> >
> > Whenever the system meets the condition to turn on cache_trim_mode but
> > few cache pages to trim, kswapd fails without scanning anon pages that
> > are plenty and cold for sure and it retries 8 times and looks *stopped
> > for ever*.

Does "stopped for ever" mean that kswapd simply stops functioning?

If so, that's a pretty serious issue.  Please fully describe all of
this in the changelog.  Please also address Yu Zhao's review comments
and send us a v2 patch?  Thanks.
  
Byungchul Park Feb. 22, 2024, 3:27 a.m. UTC | #5
On Wed, Feb 21, 2024 at 02:30:13PM -0800, Andrew Morton wrote:
> On Sat, 17 Feb 2024 00:11:25 -0500 Yu Zhao <yuzhao@google.com> wrote:
> 
> > On Fri, Feb 16, 2024 at 2:24 AM Byungchul Park <byungchul@sk.com> wrote:
> > >
> > > On Fri, Feb 16, 2024 at 12:55:17AM -0500, Yu Zhao wrote:
> > > > On Thu, Feb 8, 2024 at 1:18 AM Byungchul Park <byungchul@sk.com> wrote:
> > > > >
> > > > > With cache_trim_mode on, reclaim logic doesn't bother reclaiming anon
> > > > > pages. However, it should be more careful to turn on the mode because
> > > > > it's going to prevent anon pages from reclaimed even if there are huge
> > > > > ammount of anon pages that are very cold so should be reclaimed. Even
> > > > > worse, that can lead kswapd_failures to be MAX_RECLAIM_RETRIES and stop
> > > > > until direct reclaim eventually works to resume kswapd.
> > > >
> > > > Is a theory or something observed in the real world? If it's the
> > > > former, would this change risk breaking existing use cases? It's the
> > >
> > > I faced the latter case.
> > >
> > > > latter, where are the performance numbers to show what it looks like
> > > > before and after this patch?
> > 
> > Let me ask again: where are the performance numbers to show what it
> > looks like before and after this patch?
> > 
> > > Before:
> > >
> > > Whenever the system meets the condition to turn on cache_trim_mode but
> > > few cache pages to trim, kswapd fails without scanning anon pages that
> > > are plenty and cold for sure and it retries 8 times and looks *stopped
> > > for ever*.
> 
> Does "stopped for ever" mean that kswapd simply stops functioning?

Yes. kswapd stops functioning. Even worse, after it has stopped, any
request to wake up kswapd fails until ->kswapd_failures gets reset to 0
by direct reclaim or something.

It's more like a bug fix than a performance improvement.
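
The failure mode described above can be modeled roughly as follows.
This is a user-space sketch under stated assumptions, not the kernel
implementation: the struct and function names are hypothetical, and
`max_retries` stands in for MAX_RECLAIM_RETRIES without assuming its
value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-node state discussed here. */
struct node_model {
	int kswapd_failures;	/* consecutive kswapd runs with no progress */
	int max_retries;	/* stand-in for MAX_RECLAIM_RETRIES */
};

/* Progress from any reclaim path resets the failure counter. */
static void reclaim_made_progress(struct node_model *n)
{
	n->kswapd_failures = 0;
}

/* One kswapd run: count a failure when nothing was reclaimed. */
static void kswapd_run(struct node_model *n, unsigned long nr_reclaimed)
{
	if (nr_reclaimed == 0)
		n->kswapd_failures++;
	else
		reclaim_made_progress(n);
}

/* Wakeup requests are ignored once the failure limit is reached. */
static bool kswapd_wakeup_allowed(const struct node_model *n)
{
	return n->kswapd_failures < n->max_retries;
}
```

In this model, once kswapd_run() has failed max_retries times in a row,
kswapd_wakeup_allowed() stays false until something else calls
reclaim_made_progress(), which plays the role of direct reclaim here.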

> If so, that's a pretty serious issue.  Please fully describe all of
> this in the changelog.  Please also address Yu Zhao's review comments
> and send us a v2 patch?  Thanks.

I will post v2 with vmstat numbers from before and after the patch.

	Byungchul
  

Patch

diff --git a/mm/vmscan.c b/mm/vmscan.c
index bba207f41b14..25b55fdc0d41 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2268,7 +2268,8 @@  static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
 	 * anonymous pages.
 	 */
 	file = lruvec_page_state(target_lruvec, NR_INACTIVE_FILE);
-	if (file >> sc->priority && !(sc->may_deactivate & DEACTIVATE_FILE))
+	if (sc->priority != 1 && file >> sc->priority &&
+	    !(sc->may_deactivate & DEACTIVATE_FILE))
 		sc->cache_trim_mode = 1;
 	else
 		sc->cache_trim_mode = 0;
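

The condition this hunk changes can be compared before and after in a
small user-space model. This is only a sketch: the function names are
made up, the DEACTIVATE_FILE value is an assumed placeholder for the
kernel's flag bit, and `file` stands for the NR_INACTIVE_FILE count
read in the hunk above.

```c
#include <stdbool.h>

/* Assumed placeholder for the kernel's DEACTIVATE_FILE flag bit. */
#define DEACTIVATE_FILE 2u

/*
 * Condition before the patch: some inactive file pages remain after
 * scaling by the scan priority, and file deactivation is not requested.
 */
static bool trim_mode_before(unsigned long file, int priority,
			     unsigned int may_deactivate)
{
	return (file >> priority) && !(may_deactivate & DEACTIVATE_FILE);
}

/* Condition after the patch: the same, but never at priority 1. */
static bool trim_mode_after(unsigned long file, int priority,
			    unsigned int may_deactivate)
{
	return priority != 1 &&
	       trim_mode_before(file, priority, may_deactivate);
}
```

For example, with only two inactive file pages at priority 1, the old
condition still selects cache_trim_mode while the new one does not, so
anon pages become eligible for scanning on that pass.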