[v2,0/3] don't use mapcount() to check large folio sharing

Message ID 20230808020917.2230692-1-fengwei.yin@intel.com

Message

Yin Fengwei Aug. 8, 2023, 2:09 a.m. UTC
  In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
folio_mapcount() is used to check whether the folio is shared. That is
not correct, because folio_mapcount() returns the total mapcount of a
large folio.

Use folio_estimated_sharers() here as the estimated number is enough.
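
To make the difference concrete, here is a minimal userspace sketch. It is
not kernel code: struct model_folio and the model_*() helpers are
hypothetical names invented for illustration. It assumes the total mapcount
is the sum of the per-subpage mapcounts of a PTE-mapped large folio, and
that the estimate looks only at the first subpage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a PTE-mapped large folio (not kernel code). */
#define MODEL_MAX_PAGES 512

struct model_folio {
	size_t nr_pages;                    /* number of subpages in the folio */
	int page_mapcount[MODEL_MAX_PAGES]; /* PTE mappings of each subpage */
};

/* Map every subpage of the folio "mappers" times. */
static void model_map_all(struct model_folio *f, size_t nr_pages, int mappers)
{
	assert(nr_pages > 0 && nr_pages <= MODEL_MAX_PAGES);
	f->nr_pages = nr_pages;
	for (size_t i = 0; i < nr_pages; i++)
		f->page_mapcount[i] = mappers;
}

/* Assumed model of the total mapcount: sum over all subpages. */
static int model_total_mapcount(const struct model_folio *f)
{
	int total = 0;

	for (size_t i = 0; i < f->nr_pages; i++)
		total += f->page_mapcount[i];
	return total;
}

/* Assumed model of the estimate: mapcount of the first subpage only. */
static int model_estimated_sharers(const struct model_folio *f)
{
	assert(f->nr_pages > 0);
	return f->page_mapcount[0];
}
```

In this model, a 512-page folio mapped by a single process has a total
mapcount of 512 but an estimated sharer count of 1, so a "mapcount != 1"
test treats it as shared while the estimate does not.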


This patchset fixes the following case:
A user space application calls madvise() with MADV_FREE, MADV_COLD or
MADV_PAGEOUT on a specific address range, and THPs are mapped in that
range. Without the patchset, the THP is skipped. With the patchset, the
THP is split and handled accordingly.

David reported that the cow selftest skips some cases because
MADV_PAGEOUT skips THP:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel.com/T/#mbf0f2ec7fbe45da47526de1d7036183981691e81
and I confirmed that this patchset makes it work again.


Changelog from v1:
  - Avoid two Fixes tags, which make backporting harder. Thanks to Andrew
    for pointing this out.

  - Add a note section mentioning that this is a temporary fix, which is
    fine for reducing user-visible effects. For the long-term fix, we
    should wait for David's solution. Thanks to Ryan and David for
    pointing this out.

  - Spell out the user-visible effects, so that people can decide whether
    these patches are necessary for the stable branch. Thanks to Andrew
    for pointing this out.

V1:
https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel.com/T/#med74caad0cbd0049641cfddc5b9fe793b4b50276  

Yin Fengwei (3):
  madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount()
    against large folio for sharing check
  madvise:madvise_free_huge_pmd(): don't use mapcount() against large
    folio for sharing check
  madvise:madvise_free_pte_range(): don't use mapcount() against large
    folio for sharing check

 mm/huge_memory.c | 2 +-
 mm/madvise.c     | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)
  

Comments

Yin Fengwei Aug. 8, 2023, 4:10 a.m. UTC | #1
On 8/8/2023 10:43 AM, Yu Zhao wrote:
> On Mon, Aug 7, 2023 at 8:10 PM Yin Fengwei <fengwei.yin@intel.com> wrote:
>>
>> In madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(),
>> folio_mapcount() is used to check whether the folio is shared. But it's
>> not correct as folio_mapcount() returns total mapcount of large folio.
>>
>> Use folio_estimated_sharers() here as the estimated number is enough.
>>
>>
>> This patchset will fix the cases:
>> User space application call madvise() with MADV_FREE, MADV_COLD and
>> MADV_PAGEOUT for specific address range. There are THP mapped to the
>> range. Without the patchset, the THP is skipped. With the patch, the
>> THP will be split and handled accordingly.
>>
>> David reported the cow self test skip some cases because of
>> MADV_PAGEOUT skip THP:
>> https://lore.kernel.org/linux-mm/9e92e42d-488f-47db-ac9d-75b24cd0d037@intel.com/T/#mbf0f2ec7fbe45da47526de1d7036183981691e81
>> and I confirmed this patchset make it work again.
>>
>>
>> Changelog from v1:
>>   - Avoid two Fixes tags make backport harder. Thank Andrew for pointing
>>     this out.
>>
>>   - Add note section to mention this is a temporary fix which is fine
>>     to reduce user-visble effects. For long term fix, we should wait for
>>     David's solution. Thank Ryan and David for pointing this out.
>>
>>   - Spell user-visible effects out. Then people could decide whether
>>     these patches are necessary for stable branch. Thank Andrew for
>>     pointing this out.
> 
> LGTM, thank you.
Thanks a lot for looking at this patchset.


Regards
Yin, Fengwei