[v1] mm: Fix for negative counter: nr_file_hugepages

Message ID 20231106181918.1091043-1-shr@devkernel.io
State New

Commit Message

Stefan Roesch Nov. 6, 2023, 6:19 p.m. UTC
While qualifying the 6.4 release, the following warning was detected in
messages:

vmstat_refresh: nr_file_hugepages -15664

The warning is caused by the incorrect updating of the NR_FILE_THPS
counter in the function split_huge_page_to_list. The if branch checks
folio_test_swapbacked, but the else branch is missing the check for
folio_test_pmd_mappable, so the counter is decremented for folios that
were never added to it. The other functions that manipulate the
counter, like __filemap_add_folio and filemap_unaccount_folio, have the
corresponding check.

I have a test case that reproduces the problem. It can be found here:
  https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c

The test case reproduces on an XFS filesystem. Running the same test
case on a BTRFS filesystem does not reproduce the problem.

AFAIK versions 6.1 through 6.6 are affected by this problem.

Signed-off-by: Stefan Roesch <shr@devkernel.io>
Co-debugged-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/huge_memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


base-commit: ffc253263a1375a65fa6c9f62a893e9767fbebfa
  

Comments

Andrew Morton Nov. 6, 2023, 6:26 p.m. UTC | #1
On Mon,  6 Nov 2023 10:19:18 -0800 Stefan Roesch <shr@devkernel.io> wrote:

> While qualifying the 6.4 release, the following warning was detected in
> messages:
> 
> vmstat_refresh: nr_file_hugepages -15664
> 
> The warning is caused by the incorrect updating of the NR_FILE_THPS
> counter in the function split_huge_page_to_list. The if case is checking
> for folio_test_swapbacked, but the else case is missing the check for
> folio_test_pmd_mappable. The other functions that manipulate the counter
> like __filemap_add_folio and filemap_unaccount_folio have the
> corresponding check.
> 
> I have a test case, which reproduces the problem. It can be found here:
>   https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c
> 
> The test case reproduces on an XFS filesystem. Running the same test
> case on a BTRFS filesystem does not reproduce the problem.
> 
> AFAIK versions 6.1 through 6.6 are affected by this problem.

I'm thinking a cc:stable is justified.

> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2740,7 +2740,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  			if (folio_test_swapbacked(folio)) {
>  				__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS,
>  							-nr);
> -			} else {
> +			} else if (folio_test_pmd_mappable(folio)) {
> +
>  				__lruvec_stat_mod_folio(folio, NR_FILE_THPS,
>  							-nr);
>  				filemap_nr_thps_dec(mapping);
> 

I expect this will backport OK until it hits 3e9a13daa ("huge_memory:
convert split_huge_page_to_list() to use a folio") at which point the
-stable maintainers might request a reworked version.
  
Matthew Wilcox Nov. 6, 2023, 6:59 p.m. UTC | #2
On Mon, Nov 06, 2023 at 10:19:18AM -0800, Stefan Roesch wrote:
> +++ b/mm/huge_memory.c
> @@ -2740,7 +2740,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
>  			if (folio_test_swapbacked(folio)) {
>  				__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS,
>  							-nr);
> -			} else {
> +			} else if (folio_test_pmd_mappable(folio)) {
> +
>  				__lruvec_stat_mod_folio(folio, NR_FILE_THPS,
>  							-nr);
>  				filemap_nr_thps_dec(mapping);

Good catch.  Two things:

1. No blank line after the 'else if'.

2. We're leaving a bit of a landmine for shmem when it gets support for
arbitrary folio sizes.  Really all of this should be under a
test_pmd_mappable.
  
Johannes Weiner Nov. 6, 2023, 7:28 p.m. UTC | #3
On Mon, Nov 06, 2023 at 10:19:18AM -0800, Stefan Roesch wrote:
> While qualifying the 6.4 release, the following warning was detected in
> messages:
> 
> vmstat_refresh: nr_file_hugepages -15664
> 
> The warning is caused by the incorrect updating of the NR_FILE_THPS
> counter in the function split_huge_page_to_list. The if case is checking
> for folio_test_swapbacked, but the else case is missing the check for
> folio_test_pmd_mappable. The other functions that manipulate the counter
> like __filemap_add_folio and filemap_unaccount_folio have the
> corresponding check.
> 
> I have a test case, which reproduces the problem. It can be found here:
>   https://github.com/sroeschus/testcase/blob/main/vmstat_refresh/madv.c
> 
> The test case reproduces on an XFS filesystem. Running the same test
> case on a BTRFS filesystem does not reproduce the problem.
> 
> AFAIK versions 6.1 through 6.6 are affected by this problem.
> 
> Signed-off-by: Stefan Roesch <shr@devkernel.io>
> Co-debugged-by: Johannes Weiner <hannes@cmpxchg.org>

With the newline fix Willy pointed out, and CC: stable:

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
  
Johannes Weiner Nov. 6, 2023, 7:33 p.m. UTC | #4
On Mon, Nov 06, 2023 at 06:59:55PM +0000, Matthew Wilcox wrote:
> On Mon, Nov 06, 2023 at 10:19:18AM -0800, Stefan Roesch wrote:
> > +++ b/mm/huge_memory.c
> > @@ -2740,7 +2740,8 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
> >  			if (folio_test_swapbacked(folio)) {
> >  				__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS,
> >  							-nr);
> > -			} else {
> > +			} else if (folio_test_pmd_mappable(folio)) {
> > +
> >  				__lruvec_stat_mod_folio(folio, NR_FILE_THPS,
> >  							-nr);
> >  				filemap_nr_thps_dec(mapping);
> 
> Good catch.  Two things:
> 
> 1. No blank line after the 'else if'.
> 
> 2. We're leaving a bit of a landmine for shmem when it gets support for
> arbitrary folio sizes.  Really all of this should be under a
> test_pmd_mappable.

I was wondering whether we want to keep NR_FILE_THPS permanently for
original flavor 512 basepage THPs, or whether it should account for
large folios as well. Same for NR_ANON_THPS and NR_SHMEM_THPS.

If we keep them for PMD-sized THPs only, then I agree this should all
be conditional on pmdmapped. I suppose the same applies in
filemap_unaccount_folio().
  

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 064fbd90822b4..ea6bee675c4d3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2740,7 +2740,8 @@  int split_huge_page_to_list(struct page *page, struct list_head *list)
 			if (folio_test_swapbacked(folio)) {
 				__lruvec_stat_mod_folio(folio, NR_SHMEM_THPS,
 							-nr);
-			} else {
+			} else if (folio_test_pmd_mappable(folio)) {
+
 				__lruvec_stat_mod_folio(folio, NR_FILE_THPS,
 							-nr);
 				filemap_nr_thps_dec(mapping);