[v3,2/4] mm/damon/ops-common: atomically test and clear young on ptes and pmds

Message ID 20230602092949.545577-3-ryan.roberts@arm.com
State New
Headers
Series Fixes for pte encapsulation bypasses |

Commit Message

Ryan Roberts June 2, 2023, 9:29 a.m. UTC
  It is racy to non-atomically read a pte, then clear the young bit, then
write it back as this could discard dirty information. Further, it is
bad practice to directly set a pte entry within a table. Instead
clearing young must go through the arch-provided helper,
ptep_test_and_clear_young() to ensure it is modified atomically and to
give the arch code visibility and allow it to check (and potentially
modify) the operation.

Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 mm/damon/ops-common.c | 16 ++++++----------
 mm/damon/ops-common.h |  4 ++--
 mm/damon/paddr.c      |  4 ++--
 mm/damon/vaddr.c      |  4 ++--
 4 files changed, 12 insertions(+), 16 deletions(-)
  

Comments

Yu Zhao June 2, 2023, 4:35 p.m. UTC | #1
On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> It is racy to non-atomically read a pte, then clear the young bit, then
> write it back as this could discard dirty information. Further, it is
> bad practice to directly set a pte entry within a table. Instead
> clearing young must go through the arch-provided helper,
> ptep_test_and_clear_young() to ensure it is modified atomically and to
> give the arch code visibility and allow it to check (and potentially
> modify) the operation.
>
> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").

Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
deemed unnecessary?

> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: SeongJae Park <sj@kernel.org>
> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
  
Ryan Roberts June 2, 2023, 5:14 p.m. UTC | #2
On 02/06/2023 17:35, Yu Zhao wrote:
> On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> It is racy to non-atomically read a pte, then clear the young bit, then
>> write it back as this could discard dirty information. Further, it is
>> bad practice to directly set a pte entry within a table. Instead
>> clearing young must go through the arch-provided helper,
>> ptep_test_and_clear_young() to ensure it is modified atomically and to
>> give the arch code visibility and allow it to check (and potentially
>> modify) the operation.
>>
>> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> 
> Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> deemed unnecessary?

It was overlooked - incompetance strikes again! I was intending to cc the whole
series. What's the best way to fix this? Can I just add stable in cc on reply to
the cover letter or will I have to resend the lot?

> 
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Reviewed-by: SeongJae Park <sj@kernel.org>
>> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
  
Yu Zhao June 2, 2023, 5:35 p.m. UTC | #3
On Fri, Jun 2, 2023 at 11:14 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 02/06/2023 17:35, Yu Zhao wrote:
> > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> It is racy to non-atomically read a pte, then clear the young bit, then
> >> write it back as this could discard dirty information. Further, it is
> >> bad practice to directly set a pte entry within a table. Instead
> >> clearing young must go through the arch-provided helper,
> >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> >> give the arch code visibility and allow it to check (and potentially
> >> modify) the operation.
> >>
> >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> >
> > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > deemed unnecessary?
>
> It was overlooked - incompetance strikes again! I was intending to cc the whole
> series. What's the best way to fix this? Can I just add stable in cc on reply to
> the cover letter or will I have to resend the lot?

Resending the whole series would be more reliable for the process (and
easier for Andrew). You might want to include a few new
reviewed/acked-bys anyway (I just acked the next patch).
  
SeongJae Park June 2, 2023, 7:15 p.m. UTC | #4
Hi Ryan,

On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:

> On 02/06/2023 17:35, Yu Zhao wrote:
> > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>
> >> It is racy to non-atomically read a pte, then clear the young bit, then
> >> write it back as this could discard dirty information. Further, it is
> >> bad practice to directly set a pte entry within a table. Instead
> >> clearing young must go through the arch-provided helper,
> >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> >> give the arch code visibility and allow it to check (and potentially
> >> modify) the operation.
> >>
> >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> > 
> > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > deemed unnecessary?
> 
> It was overlooked - incompetance strikes again! I was intending to cc the
> whole series.

Not the whole patches in this series but only this patch is intended to be
merged in stable series, right?  If I'm not wrong, you could add
'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
maintainers easily understand exactly what patches should be merged in the
stable kernels.  So, you wouldn't need to touch coverletter or cc whole series
but only this one.

[1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html


Thanks,
SJ

> What's the best way to fix this? Can I just add stable in cc on reply to
> the cover letter or will I have to resend the lot?
> 
> > 
> >> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> >> Reviewed-by: Zi Yan <ziy@nvidia.com>
> >> Reviewed-by: SeongJae Park <sj@kernel.org>
> >> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
  
SeongJae Park June 2, 2023, 9:43 p.m. UTC | #5
On Fri, 2 Jun 2023 19:15:01 +0000 SeongJae Park <sj@kernel.org> wrote:

> Hi Ryan,
> 
> On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
> 
> > On 02/06/2023 17:35, Yu Zhao wrote:
> > > On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> > >>
> > >> It is racy to non-atomically read a pte, then clear the young bit, then
> > >> write it back as this could discard dirty information. Further, it is
> > >> bad practice to directly set a pte entry within a table. Instead
> > >> clearing young must go through the arch-provided helper,
> > >> ptep_test_and_clear_young() to ensure it is modified atomically and to
> > >> give the arch code visibility and allow it to check (and potentially
> > >> modify) the operation.
> > >>
> > >> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
> > > 
> > > Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
> > > deemed unnecessary?
> > 
> > It was overlooked - incompetance strikes again! I was intending to cc the
> > whole series.
> 
> Not the whole patches in this series but only this patch is intended to be
> merged in stable series, right?  If I'm not wrong, you could add
> 'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
> maintainers easily understand exactly what patches should be merged in the
> stable kernels.  So, you wouldn't need to touch coverletter or cc whole series
> but only this one.
> 
> [1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html

And I just found Andrew added the tag while adding this to the -mm queue.
Thank you, Andrew!

[1] https://lore.kernel.org/mm-commits/20230602205509.9DFBDC433D2@smtp.kernel.org/


Thanks,
SJ

> 
> 
> Thanks,
> SJ
>
  
Ryan Roberts June 3, 2023, 6:20 p.m. UTC | #6
On 02/06/2023 22:43, SeongJae Park wrote:
> On Fri, 2 Jun 2023 19:15:01 +0000 SeongJae Park <sj@kernel.org> wrote:
> 
>> Hi Ryan,
>>
>> On Fri, 2 Jun 2023 18:14:25 +0100 Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>>> On 02/06/2023 17:35, Yu Zhao wrote:
>>>> On Fri, Jun 2, 2023 at 3:30 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>>
>>>>> It is racy to non-atomically read a pte, then clear the young bit, then
>>>>> write it back as this could discard dirty information. Further, it is
>>>>> bad practice to directly set a pte entry within a table. Instead
>>>>> clearing young must go through the arch-provided helper,
>>>>> ptep_test_and_clear_young() to ensure it is modified atomically and to
>>>>> give the arch code visibility and allow it to check (and potentially
>>>>> modify) the operation.
>>>>>
>>>>> Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces").
>>>>
>>>> Just to double check: was "Cc: stable@vger.kernel.org" overlooked or
>>>> deemed unnecessary?
>>>
>>> It was overlooked - incompetance strikes again! I was intending to cc the
>>> whole series.
>>
>> Not the whole patches in this series but only this patch is intended to be
>> merged in stable series, right?  If I'm not wrong, you could add
>> 'Cc: <stable@vger.kernel.org>' tag here[1] when resending, to let stable kernel
>> maintainers easily understand exactly what patches should be merged in the
>> stable kernels.  So, you wouldn't need to touch coverletter or cc whole series
>> but only this one.
>>
>> [1] https://www.kernel.org/doc/html/v4.10/process/stable-kernel-rules.html
> 
> And I just found Andrew added the tag while adding this to the -mm queue.
> Thank you, Andrew!

Yes indeed - thanks for fixing that up for me, Andrew!

> 
> [1] https://lore.kernel.org/mm-commits/20230602205509.9DFBDC433D2@smtp.kernel.org/
> 
> 
> Thanks,
> SJ
> 
>>
>>
>> Thanks,
>> SJ
>>
  

Patch

diff --git a/mm/damon/ops-common.c b/mm/damon/ops-common.c
index cc63cf953636..acc264b97903 100644
--- a/mm/damon/ops-common.c
+++ b/mm/damon/ops-common.c
@@ -37,7 +37,7 @@  struct folio *damon_get_folio(unsigned long pfn)
 	return folio;
 }
 
-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
+void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr)
 {
 	bool referenced = false;
 	struct folio *folio = damon_get_folio(pte_pfn(*pte));
@@ -45,13 +45,11 @@  void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
 	if (!folio)
 		return;
 
-	if (pte_young(*pte)) {
+	if (ptep_test_and_clear_young(vma, addr, pte))
 		referenced = true;
-		*pte = pte_mkold(*pte);
-	}
 
 #ifdef CONFIG_MMU_NOTIFIER
-	if (mmu_notifier_clear_young(mm, addr, addr + PAGE_SIZE))
+	if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + PAGE_SIZE))
 		referenced = true;
 #endif /* CONFIG_MMU_NOTIFIER */
 
@@ -62,7 +60,7 @@  void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr)
 	folio_put(folio);
 }
 
-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
+void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	bool referenced = false;
@@ -71,13 +69,11 @@  void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr)
 	if (!folio)
 		return;
 
-	if (pmd_young(*pmd)) {
+	if (pmdp_test_and_clear_young(vma, addr, pmd))
 		referenced = true;
-		*pmd = pmd_mkold(*pmd);
-	}
 
 #ifdef CONFIG_MMU_NOTIFIER
-	if (mmu_notifier_clear_young(mm, addr, addr + HPAGE_PMD_SIZE))
+	if (mmu_notifier_clear_young(vma->vm_mm, addr, addr + HPAGE_PMD_SIZE))
 		referenced = true;
 #endif /* CONFIG_MMU_NOTIFIER */
 
diff --git a/mm/damon/ops-common.h b/mm/damon/ops-common.h
index 14f4bc69f29b..18d837d11bce 100644
--- a/mm/damon/ops-common.h
+++ b/mm/damon/ops-common.h
@@ -9,8 +9,8 @@ 
 
 struct folio *damon_get_folio(unsigned long pfn);
 
-void damon_ptep_mkold(pte_t *pte, struct mm_struct *mm, unsigned long addr);
-void damon_pmdp_mkold(pmd_t *pmd, struct mm_struct *mm, unsigned long addr);
+void damon_ptep_mkold(pte_t *pte, struct vm_area_struct *vma, unsigned long addr);
+void damon_pmdp_mkold(pmd_t *pmd, struct vm_area_struct *vma, unsigned long addr);
 
 int damon_cold_score(struct damon_ctx *c, struct damon_region *r,
 			struct damos *s);
diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
index 467b99166b43..5b3a3463d078 100644
--- a/mm/damon/paddr.c
+++ b/mm/damon/paddr.c
@@ -24,9 +24,9 @@  static bool __damon_pa_mkold(struct folio *folio, struct vm_area_struct *vma,
 	while (page_vma_mapped_walk(&pvmw)) {
 		addr = pvmw.address;
 		if (pvmw.pte)
-			damon_ptep_mkold(pvmw.pte, vma->vm_mm, addr);
+			damon_ptep_mkold(pvmw.pte, vma, addr);
 		else
-			damon_pmdp_mkold(pvmw.pmd, vma->vm_mm, addr);
+			damon_pmdp_mkold(pvmw.pmd, vma, addr);
 	}
 	return true;
 }
diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c
index 1fec16d7263e..37994fb6120c 100644
--- a/mm/damon/vaddr.c
+++ b/mm/damon/vaddr.c
@@ -311,7 +311,7 @@  static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
 		}
 
 		if (pmd_trans_huge(*pmd)) {
-			damon_pmdp_mkold(pmd, walk->mm, addr);
+			damon_pmdp_mkold(pmd, walk->vma, addr);
 			spin_unlock(ptl);
 			return 0;
 		}
@@ -323,7 +323,7 @@  static int damon_mkold_pmd_entry(pmd_t *pmd, unsigned long addr,
 	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
 	if (!pte_present(*pte))
 		goto out;
-	damon_ptep_mkold(pte, walk->mm, addr);
+	damon_ptep_mkold(pte, walk->vma, addr);
 out:
 	pte_unmap_unlock(pte, ptl);
 	return 0;