[16/24] mm/swap: reduce scope of get_swap_device in swapin path
Commit Message
From: Kairui Song <kasong@tencent.com>
Move get_swap_device into swapin_readahead, simplify the code
and prepare for follow up commits.
For the later part in do_swap_page, using swp_swap_info directly is fine
since in that context, the swap device is pinned by swapcache reference.
Signed-off-by: Kairui Song <kasong@tencent.com>
---
mm/memory.c | 16 ++++------------
mm/swap_state.c | 8 ++++++--
mm/swapfile.c | 4 +++-
3 files changed, 13 insertions(+), 15 deletions(-)
Comments
On Mon, Nov 20, 2023 at 03:47:32AM +0800, Kairui Song wrote:
> page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> vmf, &cache_result);
> - if (page) {
> + if (PTR_ERR(page) == -EBUSY) {
if (page == ERR_PTR(-EBUSY)) {
Matthew Wilcox <willy@infradead.org> 于2023年11月20日周一 05:12写道:
>
> On Mon, Nov 20, 2023 at 03:47:32AM +0800, Kairui Song wrote:
> > page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> > vmf, &cache_result);
> > - if (page) {
> > + if (PTR_ERR(page) == -EBUSY) {
>
> if (page == ERR_PTR(-EBUSY)) {
>
Thanks, I'll fix this.
On Sun, Nov 19, 2023 at 11:48 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> From: Kairui Song <kasong@tencent.com>
>
> Move get_swap_device into swapin_readahead, simplify the code
> and prepare for follow up commits.
>
> For the later part in do_swap_page, using swp_swap_info directly is fine
> since in that context, the swap device is pinned by swapcache reference.
>
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
> mm/memory.c | 16 ++++------------
> mm/swap_state.c | 8 ++++++--
> mm/swapfile.c | 4 +++-
> 3 files changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 22af9f3e8c75..e399b37ef395 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3789,7 +3789,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> struct folio *swapcache = NULL, *folio = NULL;
> enum swap_cache_result cache_result;
> struct page *page;
> - struct swap_info_struct *si = NULL;
> rmap_t rmap_flags = RMAP_NONE;
> bool exclusive = false;
> swp_entry_t entry;
> @@ -3845,14 +3844,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
> goto out;
> }
>
> - /* Prevent swapoff from happening to us. */
> - si = get_swap_device(entry);
> - if (unlikely(!si))
> - goto out;
> -
> page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
> vmf, &cache_result);
> - if (page) {
> + if (PTR_ERR(page) == -EBUSY) {
> + goto out;
> + } else if (page) {
As Matthew suggested, using ERR_PTR(-EBUSY).
you also don't need the else here. The goto already terminates the flow.
if (page == ERR_PTR(-EBUSY))
goto out;
if (page) {
Chris
Kairui Song <ryncsn@gmail.com> writes:
> From: Kairui Song <kasong@tencent.com>
>
> Move get_swap_device into swapin_readahead, simplify the code
> and prepare for follow up commits.
No. Please don't do this. Please check the get/put_swap_device() usage
rule in the comments of get_swap_device().
"
* When we get a swap entry, if there aren't some other ways to
* prevent swapoff, such as the folio in swap cache is locked, page
* table lock is held, etc., the swap entry may become invalid because
* of swapoff. Then, we need to enclose all swap related functions
* with get_swap_device() and put_swap_device(), unless the swap
* functions call get/put_swap_device() by themselves.
"
This is to simplify the reasoning about swapoff and swap entry.
Why does it bother you?
> For the later part in do_swap_page, using swp_swap_info directly is fine
> since in that context, the swap device is pinned by swapcache reference.
[snip]
--
Best Regards,
Huang, Ying
Huang, Ying <ying.huang@intel.com> 于2023年11月22日周三 08:38写道:
>
> Kairui Song <ryncsn@gmail.com> writes:
>
> > From: Kairui Song <kasong@tencent.com>
> >
> > Move get_swap_device into swapin_readahead, simplify the code
> > and prepare for follow up commits.
>
> No. Please don't do this. Please check the get/put_swap_device() usage
> rule in the comments of get_swap_device().
>
> "
> * When we get a swap entry, if there aren't some other ways to
> * prevent swapoff, such as the folio in swap cache is locked, page
> * table lock is held, etc., the swap entry may become invalid because
> * of swapoff. Then, we need to enclose all swap related functions
> * with get_swap_device() and put_swap_device(), unless the swap
> * functions call get/put_swap_device() by themselves.
> "
>
> This is to simplify the reasoning about swapoff and swap entry.
>
> Why does it bother you?
Hi Ying,
This is trying to reduce LOC, avoid a trivial si read, and make error
checking logic easier to refactor in later commits.
And besides there is one trivial change I forgot to include in this
commit, get_swap_device can be put after swap_cache_get_folio in
swapin_readahead, since swap_cache_get_folio doesn't need to hold the
swap device, so in cache hit case this get/put_swap_device() can be
saved.
The comment also mentioned:
"Then, we need to enclose all swap related functions with
get_swap_device() and put_swap_device(), unless the swap functions
call get/put_swap_device() by themselves"
So I think it should be OK to do this.
Kairui Song <ryncsn@gmail.com> writes:
> Huang, Ying <ying.huang@intel.com> 于2023年11月22日周三 08:38写道:
>>
>> Kairui Song <ryncsn@gmail.com> writes:
>>
>> > From: Kairui Song <kasong@tencent.com>
>> >
>> > Move get_swap_device into swapin_readahead, simplify the code
>> > and prepare for follow up commits.
>>
>> No. Please don't do this. Please check the get/put_swap_device() usage
>> rule in the comments of get_swap_device().
>>
>> "
>> * When we get a swap entry, if there aren't some other ways to
>> * prevent swapoff, such as the folio in swap cache is locked, page
>> * table lock is held, etc., the swap entry may become invalid because
>> * of swapoff. Then, we need to enclose all swap related functions
>> * with get_swap_device() and put_swap_device(), unless the swap
>> * functions call get/put_swap_device() by themselves.
>> "
>>
>> This is to simplify the reasoning about swapoff and swap entry.
>>
>> Why does it bother you?
>
> Hi Ying,
>
> This is trying to reduce LOC, avoid a trivial si read, and make error
> checking logic easier to refactor in later commits.
The race with swapoff isn't considered by many developers usually. So,
we should use a simple rule as much as possible. For example, if you
get a swap entry, use get/put_swap_device() to enclose all code that
operate on the swap entry. This makes code easier to be maintained in
the long run. Yes. Sometimes we break the rule a little, but only if
we have enough benefit, such as improving performance in some practical
use cases.
> And besides there is one trivial change I forgot to include in this
> commit, get_swap_device can be put after swap_cache_get_folio in
> swapin_readahead, since swap_cache_get_folio doesn't need to hold the
> swap device, so in cache hit case this get/put_swap_device() can be
> saved.
swapoff is rare operation, it's OK to delay it a little to make the code
easier to be understood.
> The comment also mentioned:
>
> "Then, we need to enclose all swap related functions with
> get_swap_device() and put_swap_device(), unless the swap functions
> call get/put_swap_device() by themselves"
>
> So I think it should be OK to do this.
No. You should change the code with a good enough reason. Compared
with complexity it introduced, the benefit isn't enough for me so far.
--
Best Regards,
Huang, Ying
@@ -3789,7 +3789,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
struct folio *swapcache = NULL, *folio = NULL;
enum swap_cache_result cache_result;
struct page *page;
- struct swap_info_struct *si = NULL;
rmap_t rmap_flags = RMAP_NONE;
bool exclusive = false;
swp_entry_t entry;
@@ -3845,14 +3844,11 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
goto out;
}
- /* Prevent swapoff from happening to us. */
- si = get_swap_device(entry);
- if (unlikely(!si))
- goto out;
-
page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
vmf, &cache_result);
- if (page) {
+ if (PTR_ERR(page) == -EBUSY) {
+ goto out;
+ } else if (page) {
folio = page_folio(page);
if (cache_result != SWAP_CACHE_HIT) {
/* Had to read the page from swap area: Major fault */
@@ -3964,7 +3960,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
*/
exclusive = true;
} else if (exclusive && folio_test_writeback(folio) &&
- data_race(si->flags & SWP_STABLE_WRITES)) {
+ (swp_swap_info(entry)->flags & SWP_STABLE_WRITES)) {
/*
* This is tricky: not all swap backends support
* concurrent page modifications while under writeback.
@@ -4068,8 +4064,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (vmf->pte)
pte_unmap_unlock(vmf->pte, vmf->ptl);
out:
- if (si)
- put_swap_device(si);
return ret;
out_nomap:
if (vmf->pte)
@@ -4082,8 +4076,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
folio_unlock(swapcache);
folio_put(swapcache);
}
- if (si)
- put_swap_device(si);
return ret;
}
@@ -922,6 +922,11 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
struct page *page;
pgoff_t ilx;
+ /* Prevent swapoff from happening to us */
+ si = get_swap_device(entry);
+ if (unlikely(!si))
+ return ERR_PTR(-EBUSY);
+
folio = swap_cache_get_folio(entry, vmf, &shadow);
if (folio) {
page = folio_file_page(folio, swp_offset(entry));
@@ -929,7 +934,6 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
goto done;
}
- si = swp_swap_info(entry);
mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx);
if (swap_use_no_readahead(si, swp_offset(entry))) {
page = swapin_no_readahead(entry, gfp_mask, mpol, ilx, vmf->vma->vm_mm);
@@ -944,8 +948,8 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
cache_result = SWAP_CACHE_MISS;
}
mpol_cond_put(mpol);
-
done:
+ put_swap_device(si);
if (result)
*result = cache_result;
@@ -1857,7 +1857,9 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
pte = NULL;
page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
&vmf, NULL);
- if (page)
+ if (IS_ERR(page))
+ return PTR_ERR(page);
+ else if (page)
folio = page_folio(page);
if (!folio) {
/*