[v3,2/4] mm/damon/ops-common: atomically test and clear young on ptes and pmds
Commit Message
It is racy to non-atomically read a pte, then clear the young bit, then
write it back as this could discard dirty information. Further, it is
bad practice to directly set a pte entry within a table. Instead
clearing young must go through the arch-provided helper,
ptep_test_and_clear_young() to ensure it is modified atomically and to
give the arch code visibility and allow it to check (and potentially
modify) the operation.
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
mm/damon/ops-common.c | 16 ++++++----------
mm/damon/ops-common.h | 4 ++--
mm/damon/paddr.c | 4 ++--
mm/damon/vaddr.c | 4 ++--
4 files changed, 12 insertions(+), 16 deletions(-)
Comments
On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> It is racy to non-atomically read a pte, then clear the young bit, then
> write it back as this could discard dirty information. Further, it is
> bad practice to directly set a pte entry within a table. Instead
> clearing young must go through the arch-provided helper,
> ptep_test_and_clear_young() to ensure it is modified atomically and to
> give the arch code visibility and allow it to check (and potentially
> modify) the operation.
>
> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
deemed unnecessary?
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: SeongJae Park <sj@kernel.org>
> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
On 02/06/2023 17:35, Yu Zhao wrote:
> On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> It is racy to non-atomically read a pte, then clear the young bit, then
>> write it back as this could discard dirty information. Further, it is
>> bad practice to directly set a pte entry within a table. Instead
>> clearing young must go through the arch-provided helper,
>> ptep_test_and_clear_young() to ensure it is modified atomically and to
>> give the arch code visibility and allow it to check (and potentially
>> modify) the operation.
>>
>> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
>
> Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> deemed unnecessary?
It was overlooked - incompetance strikes again! I was intending to cc the whole
series. What's the best way to fix this? Can I just add stable in cc on reply to
the cover letter or will I have to resend the lot?
>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Reviewed-by: SeongJae Park <sj@kernel.org>
>> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
On Fri, Jun 2, 2023 at 11:14 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 02/06/2023 17:35, Yu Zhao wrote:
> > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> It is racy to non-atomically read a pte, then clear the young bit, then
> >> write it back as this could discard dirty information. Further, it is
> >> bad practice to directly set a pte entry within a table. Instead
> >> clearing young must go through the arch-provided helper,
> >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> >> give the arch code visibility and allow it to check (and potentially
> >> modify) the operation.
> >>
> >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> >
> > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > deemed unnecessary?
>
> It was overlooked - incompetance strikes again! I was intending to cc the whole
> series. What's the best way to fix this? Can I just add stable in cc on reply to
> the cover letter or will I have to resend the lot?
Resending the whole series would be more reliable for the process (and
easier for Andrew). You might want to include a few new
reviewed/acked-bys anyway (I just acked the next patch).
Hi Ryan,
On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
> On 02/06/2023 17:35, Yu Zhao wrote:
> > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> It is racy to non-atomically read a pte, then clear the young bit, then
> >> write it back as this could discard dirty information. Further, it is
> >> bad practice to directly set a pte entry within a table. Instead
> >> clearing young must go through the arch-provided helper,
> >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> >> give the arch code visibility and allow it to check (and potentially
> >> modify) the operation.
> >>
> >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> >
> > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > deemed unnecessary?
>
> It was overlooked - incompetance strikes again! I was intending to cc the
> whole series.
Not the whole patches in this series but only this patch is intended to be
merged in stable series, right? If I'm not wrong, you could add
'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
maintainers easily understand exactly what patches should be merged in the
stable kernels. So, you wouldn't need to touch coverletter or cc whole series
but only this one.
[1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html
Thanks,
SJ
> What's the best way to fix this? Can I just add stable in cc on reply to
> the cover letter or will I have to resend the lot?
>
> >
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> >> Reviewed-by: Zi Yan <ziy@nvidia.com>
> >> Reviewed-by: SeongJae Park <sj@kernel.org>
> >> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
On Fri, 2 Jun 2023 19:15:01 +0000 SeongJae Park <sj@kernel.org> wrote:
> Hi Ryan,
>
> On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> > On 02/06/2023 17:35, Yu Zhao wrote:
> > > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> > >>
> > >> It is racy to non-atomically read a pte, then clear the young bit, then
> > >> write it back as this could discard dirty information. Further, it is
> > >> bad practice to directly set a pte entry within a table. Instead
> > >> clearing young must go through the arch-provided helper,
> > >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> > >> give the arch code visibility and allow it to check (and potentially
> > >> modify) the operation.
> > >>
> > >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> > >
> > > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > > deemed unnecessary?
> >
> > It was overlooked - incompetance strikes again! I was intending to cc the
> > whole series.
>
> Not the whole patches in this series but only this patch is intended to be
> merged in stable series, right? If I'm not wrong, you could add
> 'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
> maintainers easily understand exactly what patches should be merged in the
> stable kernels. So, you wouldn't need to touch coverletter or cc whole series
> but only this one.
>
> [1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html
And I just found Andrew added the tag while adding this to the -mm queue.
Thank you, Andrew!
[1] https://lore.kernel.org/mm-commits/20230602205509.9DFBDC433D2@smtp.kernel.org/
Thanks,
SJ
>
>
> Thanks,
> SJ
>
On 02/06/2023 22:43, SeongJae Park wrote:
> On Fri, 2 Jun 2023 19:15:01 +0000 SeongJae Park <sj@kernel.org> wrote:
>
>> Hi Ryan,
>>
>> On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>>> On 02/06/2023 17:35, Yu Zhao wrote:
>>>> On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>>
>>>>> It is racy to non-atomically read a pte, then clear the young bit, then
>>>>> write it back as this could discard dirty information. Further, it is
>>>>> bad practice to directly set a pte entry within a table. Instead
>>>>> clearing young must go through the arch-provided helper,
>>>>> ptep_test_and_clear_young() to ensure it is modified atomically and to
>>>>> give the arch code visibility and allow it to check (and potentially
>>>>> modify) the operation.
>>>>>
>>>>> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
>>>>
>>>> Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
>>>> deemed unnecessary?
>>>
>>> It was overlooked - incompetance strikes again! I was intending to cc the
>>> whole series.
>>
>> Not the whole patches in this series but only this patch is intended to be
>> merged in stable series, right? If I'm not wrong, you could add
>> 'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
>> maintainers easily understand exactly what patches should be merged in the
>> stable kernels. So, you wouldn't need to touch coverletter or cc whole series
>> but only this one.
>>
>> [1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html
>
> And I just found Andrew added the tag while adding this to the -mm queue.
> Thank you, Andrew!
Yes indeed - thanks for fixing that up for me, Andrew!
>
> [1] https://lore.kernel.org/mm-commits/20230602205509.9DFBDC433D2@smtp.kernel.org/
>
>
> Thanks,
> SJ
>
>>
>>
>> Thanks,
>> SJ
>>
@@ -37,7 +37,7 @@ struct folio *damon_get_folio(unsigned long pfn)
return folio;
}
-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
+void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr)
{
bool referenced = false;
struct folio *folio = damon_get_folio(pte_pfn(*pte));
@@ -45,13 +45,11 @@ void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
if (!folio)
return;
- if (pte_young(*pte)) {
+ if (ptep_test_and_clear_young(vma, addr, pte))
referenced = true;
- *pte = pte_mkold(*pte);
- }
#ifdef CONFIG_MMU_NOTIFIER
- if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
+ if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE))
referenced = true;
#endif /* CONFIG_MMU_NOTIFIER */
@@ -62,7 +60,7 @@ void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
folio_put(folio);
}
-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
+void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr)
{
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
bool referenced = false;
@@ -71,13 +69,11 @@ void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
if (!folio)
return;
- if (pmd_young(*pmd)) {
+ if (pmdp_test_and_clear_young(vma, addr, pmd))
referenced = true;
- *pmd = pmd_mkold(*pmd);
- }
#ifdef CONFIG_MMU_NOTIFIER
- if (mmu_notifier_clear_young(mm, addr, addr + HPAGE_PMD_SIZE))
+ if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + HPAGE_PMD_SIZE))
referenced = true;
#endif /* CONFIG_MMU_NOTIFIER */
@@ -9,8 +9,8 @@
struct folio *damon_get_folio(unsigned long pfn);
-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr);
-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr);
+void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr);
+void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr);
int damon_cold_score(struct damon_ctx *c, struct damon_region *r,
struct damos *s);
@@ -24,9 +24,9 @@ static bool __damon_pa_mkold(struct folio *folio, struct vm_area_struct *vma,
while (page_vma_mapped_walk(&pvmw)) {
addr = pvmw.address;
if (pvmw.pte)
- damon_ptep_mkold(pvmw.pte, vma->vm_mm, addr);
+ damon_ptep_mkold(pvmw.pte, vma, addr);
else
- damon_pmdp_mkold(pvmw.pmd, vma->vm_mm, addr);
+ damon_pmdp_mkold(pvmw.pmd, vma, addr);
}
return true;
}
@@ -311,7 +311,7 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
}
if (pmd_trans_huge(*pmd)) {
- damon_pmdp_mkold(pmd, walk->mm, addr);
+ damon_pmdp_mkold(pmd, walk->vma, addr);
spin_unlock(ptl);
return 0;
}
@@ -323,7 +323,7 @@ static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
if (!pte_present(*pte))
goto out;
- damon_ptep_mkold(pte, walk->mm, addr);
+ damon_ptep_mkold(pte, walk->vma, addr);
out:
pte_unmap_unlock(pte, ptl);
return 0;