[v2,2/3] madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check

Message ID 20230808020917.2230692-3-fengwei.yin@intel.com
State New
Headers
Series don't use mapcount() to check large folio sharing |

Commit Message

Yin Fengwei Aug. 8, 2023, 2:09 a.m. UTC
  Commit fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to
use a folio") replaced the page_mapcount() with folio_mapcount() to check
whether the folio is shared by other mapping.

It's not correct for large folios. folio_mapcount() returns the total
mapcount of large folio which is not suitable to detect whether the folio
is shared.

Use folio_estimated_sharers() which returns a estimated number of shares.
That means it's not 100% correct. It should be OK for madvise case here.

User-visible effects is that the THP is skipped when user call madvise.
But the correct behavior is THP should be split and processed then.

NOTE: this change is a temporary fix to reduce the user-visible effects
before the long term fix from David is ready.

Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")
Cc: stable@vger.kernel.org
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
Reviewed-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
---
 mm/huge_memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Yu Zhao Aug. 8, 2023, 2:49 a.m. UTC | #1
On Mon, Aug 7, 2023 at 8:11 PM Yin Fengwei <fengwei.yin@intel.com> wrote:
>
> Commit fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to
> use a folio") replaced the page_mapcount() with folio_mapcount() to check
> whether the folio is shared by other mapping.
>
> It's not correct for large folios. folio_mapcount() returns the total
> mapcount of large folio which is not suitable to detect whether the folio
> is shared.
>
> Use folio_estimated_sharers() which returns a estimated number of shares.
> That means it's not 100% correct. It should be OK for madvise case here.
>
> User-visible effects is that the THP is skipped when user call madvise.
> But the correct behavior is THP should be split and processed then.
>
> NOTE: this change is a temporary fix to reduce the user-visible effects
> before the long term fix from David is ready.
>
> Fixes: fc986a38b670 ("mm: huge_memory: convert madvise_free_huge_pmd to use a folio")
> Cc: stable@vger.kernel.org

Andrew, this one isn't really a bug fix but an optimization, so please
feel free to drop the Fixes and Cc tags above. (It seems to me it
doesn't hurt for stable to take it though.)

Thank you.


> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
> Reviewed-by: Yu Zhao <yuzhao@google.com>
> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  mm/huge_memory.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 154c210892a1..0b709d2c46c6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1612,7 +1612,7 @@ bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>          * If other processes are mapping this folio, we couldn't discard
>          * the folio unless they all do MADV_FREE so let's skip the folio.
>          */
> -       if (folio_mapcount(folio) != 1)
> +       if (folio_estimated_sharers(folio) != 1)
>                 goto out;
>
>         if (!folio_trylock(folio))
> --
> 2.39.2
>
  

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 154c210892a1..0b709d2c46c6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1612,7 +1612,7 @@  bool madvise_free_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	 * If other processes are mapping this folio, we couldn't discard
 	 * the folio unless they all do MADV_FREE so let's skip the folio.
 	 */
-	if (folio_mapcount(folio) != 1)
+	if (folio_estimated_sharers(folio) != 1)
 		goto out;
 
 	if (!folio_trylock(folio))