[5/5] swap: avoid ra statistic lost when swapin races

Message ID 20221208180209.50845-6-ryncsn@gmail.com
State New
Headers
Series Clean up and fixes for swap |

Commit Message

Kairui Song Dec. 8, 2022, 6:02 p.m. UTC
  From: Kairui Song <kasong@tencent.com>

__read_swap_cache_async should just call swap_cache_get_folio for trying
to look up the swap cache. Because swap_cache_get_folio handles the
readahead statistic, and clears the RA flag, looking up the cache
directly will skip these parts.

And the comment no longer applies after commit 442701e7058b
("mm/swap: remove swap_cache_info statistics"), just remove them.

Fixes: 442701e7058b ("mm/swap: remove swap_cache_info statistics")
Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/swap_state.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
  

Comments

Matthew Wilcox Dec. 8, 2022, 7:14 p.m. UTC | #1
On Fri, Dec 09, 2022 at 02:02:09AM +0800, Kairui Song wrote:
> From: Kairui Song <kasong@tencent.com>
> 
> __read_swap_cache_async should just call swap_cache_get_folio for trying
> to look up the swap cache. Because swap_cache_get_folio handles the
> readahead statistic, and clears the RA flag, looking up the cache
> directly will skip these parts.
> 
> And the comment no longer applies after commit 442701e7058b
> ("mm/swap: remove swap_cache_info statistics"), just remove them.

But what about the readahead stats?

> Fixes: 442701e7058b ("mm/swap: remove swap_cache_info statistics")
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
>  mm/swap_state.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index eba388f67741..f39cfb62551d 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -418,15 +418,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>  	for (;;) {
>  		int err;
>  		/*
> -		 * First check the swap cache.  Since this is normally
> -		 * called after swap_cache_get_folio() failed, re-calling
> -		 * that would confuse statistics.
> +		 * First check the swap cache in case of race.
>  		 */
>  		si = get_swap_device(entry);
>  		if (!si)
>  			return NULL;
> -		folio = filemap_get_folio(swap_address_space(entry),
> -						swp_offset(entry));
> +		folio = swap_cache_get_folio(entry, vma, addr);
>  		put_swap_device(si);
>  		if (folio)
>  			return folio_file_page(folio, swp_offset(entry));
> -- 
> 2.35.2
> 
>
  
Kairui Song Dec. 9, 2022, 1:54 a.m. UTC | #2
Matthew Wilcox <willy@infradead.org> 于2022年12月9日周五 03:14写道:
>

Hi, thanks for the review.

> On Fri, Dec 09, 2022 at 02:02:09AM +0800, Kairui Song wrote:
> > From: Kairui Song <kasong@tencent.com>
> >
> > __read_swap_cache_async should just call swap_cache_get_folio for trying
> > to look up the swap cache. Because swap_cache_get_folio handles the
> > readahead statistic, and clears the RA flag, looking up the cache
> > directly will skip these parts.
> >
> > And the comment no longer applies after commit 442701e7058b
> > ("mm/swap: remove swap_cache_info statistics"), just remove them.
>
> But what about the readahead stats?
>

Shouldn't readahead stats be accounted here? __read_swap_cache_async
is called by swap read in path, if it hits the swap cache, and the
page have readahead page flag set, then accounting that readahead
should be just the right thing todo. And the readahead flag is checked
with folio_test_clear_readahead, so there should be no issue about
repeated accounting.

Only the addr info of the swap_readahead_info could be updated for
multiple times by racing readers, but I think that seems fine, since
we don't know which swap read comes later in case of race, just let
the last reader that hits the swap cache update the address info of
readahead makes sense to me.

Or do you mean I should update the comment about the readahead stat
instead of just drop the commnet?
  
Huang, Ying Dec. 11, 2022, 12:02 p.m. UTC | #3
Kairui Song <ryncsn@gmail.com> writes:

> Matthew Wilcox <willy@infradead.org> 于2022年12月9日周五 03:14写道:
>>
>
> Hi, thanks for the review.
>
>> On Fri, Dec 09, 2022 at 02:02:09AM +0800, Kairui Song wrote:
>> > From: Kairui Song <kasong@tencent.com>
>> >
>> > __read_swap_cache_async should just call swap_cache_get_folio for trying
>> > to look up the swap cache. Because swap_cache_get_folio handles the
>> > readahead statistic, and clears the RA flag, looking up the cache
>> > directly will skip these parts.
>> >
>> > And the comment no longer applies after commit 442701e7058b
>> > ("mm/swap: remove swap_cache_info statistics"), just remove them.
>>
>> But what about the readahead stats?
>>
>
> Shouldn't readahead stats be accounted here? __read_swap_cache_async
> is called by swap read in path, if it hits the swap cache, and the
> page have readahead page flag set, then accounting that readahead
> should be just the right thing todo. And the readahead flag is checked
> with folio_test_clear_readahead, so there should be no issue about
> repeated accounting.
>
> Only the addr info of the swap_readahead_info could be updated for
> multiple times by racing readers, but I think that seems fine, since
> we don't know which swap read comes later in case of race, just let
> the last reader that hits the swap cache update the address info of
> readahead makes sense to me.
>
> Or do you mean I should update the comment about the readahead stat
> instead of just drop the commnet?

__read_swap_cache_async() is called by readahead too
(swap_vma_readahead/__read_swap_cache_async).  I don't think that it's a
good idea to do swap readahead operation in this function.

Best Regards,
Huang, Ying
  
Kairui Song Dec. 11, 2022, 12:15 p.m. UTC | #4
Huang, Ying <ying.huang@intel.com> 于2022年12月11日周日 20:03写道:
>
> Kairui Song <ryncsn@gmail.com> writes:
>
> > Matthew Wilcox <willy@infradead.org> 于2022年12月9日周五 03:14写道:
> >>
> >
> > Hi, thanks for the review.
> >
> >> On Fri, Dec 09, 2022 at 02:02:09AM +0800, Kairui Song wrote:
> >> > From: Kairui Song <kasong@tencent.com>
> >> >
> >> > __read_swap_cache_async should just call swap_cache_get_folio for trying
> >> > to look up the swap cache. Because swap_cache_get_folio handles the
> >> > readahead statistic, and clears the RA flag, looking up the cache
> >> > directly will skip these parts.
> >> >
> >> > And the comment no longer applies after commit 442701e7058b
> >> > ("mm/swap: remove swap_cache_info statistics"), just remove them.
> >>
> >> But what about the readahead stats?
> >>
> >
> > Shouldn't readahead stats be accounted here? __read_swap_cache_async
> > is called by swap read in path, if it hits the swap cache, and the
> > page have readahead page flag set, then accounting that readahead
> > should be just the right thing todo. And the readahead flag is checked
> > with folio_test_clear_readahead, so there should be no issue about
> > repeated accounting.
> >
> > Only the addr info of the swap_readahead_info could be updated for
> > multiple times by racing readers, but I think that seems fine, since
> > we don't know which swap read comes later in case of race, just let
> > the last reader that hits the swap cache update the address info of
> > readahead makes sense to me.
> >
> > Or do you mean I should update the comment about the readahead stat
> > instead of just drop the commnet?
>
> __read_swap_cache_async() is called by readahead too
> (swap_vma_readahead/__read_swap_cache_async).  I don't think that it's a
> good idea to do swap readahead operation in this function.

Ah, I got it.
Thanks for pointing out the issue, I'll drop this patch.

>
> Best Regards,
> Huang, Ying
  

Patch

diff --git a/mm/swap_state.c b/mm/swap_state.c
index eba388f67741..f39cfb62551d 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -418,15 +418,12 @@  struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	for (;;) {
 		int err;
 		/*
-		 * First check the swap cache.  Since this is normally
-		 * called after swap_cache_get_folio() failed, re-calling
-		 * that would confuse statistics.
+		 * First check the swap cache in case of race.
 		 */
 		si = get_swap_device(entry);
 		if (!si)
 			return NULL;
-		folio = filemap_get_folio(swap_address_space(entry),
-						swp_offset(entry));
+		folio = swap_cache_get_folio(entry, vma, addr);
 		put_swap_device(si);
 		if (folio)
 			return folio_file_page(folio, swp_offset(entry));