[RFC,1/1] x86/mm/pat: Clear VM_PAT if copy_p4d_range failed

Message ID 20230217025615.1595558-1-mawupeng1@huawei.com
State New
Series [RFC,1/1] x86/mm/pat: Clear VM_PAT if copy_p4d_range failed

Commit Message

mawupeng Feb. 17, 2023, 2:56 a.m. UTC
  From: Ma Wupeng <mawupeng1@huawei.com>

x86/mm/pat: Clear VM_PAT if copy_p4d_range failed.

Syzbot reports a warning in untrack_pfn(). Digging into the root cause, we
found that it is triggered by a memory allocation failure in pmd_alloc_one(),
injected via failslab.

In copy_page_range(), memory allocation for a pmd fails. During the
subsequent error handling, mmput() is called to remove all vmas. When
untrack_pfn() then runs on this still-empty pfnmap vma, the warning fires.

Here's a simplified flow:

dup_mm
  dup_mmap
    copy_page_range
      copy_p4d_range
        copy_pud_range
          copy_pmd_range
            pmd_alloc
              __pmd_alloc
                pmd_alloc_one
                  page = alloc_pages(gfp, 0);
                    if (!page)
                      return NULL;
    mmput
        exit_mmap
          unmap_vmas
            unmap_single_vma
              untrack_pfn
                follow_phys
                  WARN_ON_ONCE(1);

Since this vma was never set up successfully, we can clear its VM_PAT flag.
In that case, untrack_pfn() will not be called when this vma is cleaned up.

Function untrack_pfn_moved() has also been renamed to untrack_pfn_clear()
to fit the new logic.

Reported-by: syzbot+5f488e922d047d8f00cc@syzkaller.appspotmail.com
Signed-off-by: Ma Wupeng <mawupeng1@huawei.com>
---
 arch/x86/mm/pat/memtype.c | 12 ++++++++----
 include/linux/pgtable.h   |  7 ++++---
 mm/memory.c               |  1 +
 mm/mremap.c               |  2 +-
 4 files changed, 14 insertions(+), 8 deletions(-)
  

Comments

Dave Hansen Feb. 22, 2023, 8:03 p.m. UTC | #1
On 2/16/23 18:56, Wupeng Ma wrote:
> dup_mm
>   dup_mmap
>     copy_page_range
>       copy_p4d_range
>         copy_pud_range
>           copy_pmd_range
>             pmd_alloc
>               __pmd_alloc
>                 pmd_alloc_one
>                   page = alloc_pages(gfp, 0);
>                     if (!page)
>                       return NULL;
>     mmput
>         exit_mmap
>           unmap_vmas
>             unmap_single_vma
>               untrack_pfn
>                 follow_phys
>                   WARN_ON_ONCE(1);

What's the point of that warning in the first place?  I can certainly
imagine follow_phys() failing for sparse mappings, for instance.  Is
there some requirement that VM_PFNMAP can't be sparse?
  
Andrew Morton Feb. 23, 2023, 10:14 p.m. UTC | #2
On Fri, 17 Feb 2023 10:56:15 +0800 Wupeng Ma <mawupeng1@huawei.com> wrote:

> From: Ma Wupeng <mawupeng1@huawei.com>
> 
> x86/mm/pat: Clear VM_PAT if copy_p4d_range failed.
> 
> Syzbot reports

Thanks.  It would be nice to have a Link: to this report - I cannot
find that email anywhere :(
  
mawupeng Feb. 24, 2023, 1:15 a.m. UTC | #3
On 2023/2/24 6:14, Andrew Morton wrote:
> On Fri, 17 Feb 2023 10:56:15 +0800 Wupeng Ma <mawupeng1@huawei.com> wrote:
> 
>> From: Ma Wupeng <mawupeng1@huawei.com>
>>
>> x86/mm/pat: Clear VM_PAT if copy_p4d_range failed.
>>
>> Syzbot reports
> 
> Thanks.  It would be nice to have a Link: to this report - I cannot
> find that email anywhere :(

This WARN_ON_ONCE has been reported by syzkaller since v4.19 [1], and we
found that the problem can be reproduced on v5.10 and the latest master.

[1]: https://syzkaller.appspot.com/bug?id=2ab293a84ed98e600f31dede8e082dc81be8a4fe

  
mawupeng March 2, 2023, 3:47 a.m. UTC | #4
On 2023/2/23 4:03, Dave Hansen wrote:
> On 2/16/23 18:56, Wupeng Ma wrote:
>> dup_mm
>>   dup_mmap
>>     copy_page_range
>>       copy_p4d_range
>>         copy_pud_range
>>           copy_pmd_range
>>             pmd_alloc
>>               __pmd_alloc
>>                 pmd_alloc_one
>>                   page = alloc_pages(gfp, 0);
>>                     if (!page)
>>                       return NULL;
>>     mmput
>>         exit_mmap
>>           unmap_vmas
>>             unmap_single_vma
>>               untrack_pfn
>>                 follow_phys
>>                   WARN_ON_ONCE(1);
> 
> What's the point of that warning in the first place?  I can certainly
> imagine follow_phys() failing for sparse mappings, for instance.  Is
> there some requirement that VM_PFNMAP can't be sparse?

Hi Dave

Thanks for reviewing.

Sorry, I have no idea why the warning was there in the first place.

I think we can delete this WARN_ON_ONCE in a separate patch?
  

Patch

diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index fb4b1b5e0dea..558bb71ff350 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -1070,11 +1070,15 @@  void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 }
 
 /*
- * untrack_pfn_moved is called, while mremapping a pfnmap for a new region,
- * with the old vma after its pfnmap page table has been removed.  The new
- * vma has a new pfnmap to the same pfn & cache type with VM_PAT set.
+ * untrack_pfn_clear is called in the following situations:
+ *
+ * 1) while mremapping a pfnmap for a new region, on the old vma after
+ * its pfnmap page table has been removed.  The new vma has a new pfnmap
+ * to the same pfn & cache type with VM_PAT set.
+ * 2) while duplicating a vm area, when the new vma fails to copy the
+ * pgtable from the old vma.
  */
-void untrack_pfn_moved(struct vm_area_struct *vma)
+void untrack_pfn_clear(struct vm_area_struct *vma)
 {
 	vma->vm_flags &= ~VM_PAT;
 }
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 1159b25b0542..e211124f4330 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1219,9 +1219,10 @@  static inline void untrack_pfn(struct vm_area_struct *vma,
 }
 
 /*
- * untrack_pfn_moved is called while mremapping a pfnmap for a new region.
+ * untrack_pfn_clear is called while mremapping a pfnmap for a new region
+ * or when failing to copy the pgtable while duplicating a vm area.
  */
-static inline void untrack_pfn_moved(struct vm_area_struct *vma)
+static inline void untrack_pfn_clear(struct vm_area_struct *vma)
 {
 }
 #else
@@ -1233,7 +1234,7 @@  extern void track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 extern int track_pfn_copy(struct vm_area_struct *vma);
 extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 			unsigned long size);
-extern void untrack_pfn_moved(struct vm_area_struct *vma);
+extern void untrack_pfn_clear(struct vm_area_struct *vma);
 #endif
 
 #ifdef CONFIG_MMU
diff --git a/mm/memory.c b/mm/memory.c
index f526b9152bef..846d4136d1de 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1278,6 +1278,7 @@  copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 			continue;
 		if (unlikely(copy_p4d_range(dst_vma, src_vma, dst_pgd, src_pgd,
 					    addr, next))) {
+			untrack_pfn_clear(dst_vma);
 			ret = -ENOMEM;
 			break;
 		}
diff --git a/mm/mremap.c b/mm/mremap.c
index 930f65c315c0..6ed28eeae5a8 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -682,7 +682,7 @@  static unsigned long move_vma(struct vm_area_struct *vma,
 
 	/* Tell pfnmap has moved from this vma */
 	if (unlikely(vma->vm_flags & VM_PFNMAP))
-		untrack_pfn_moved(vma);
+		untrack_pfn_clear(vma);
 
 	if (unlikely(!err && (flags & MREMAP_DONTUNMAP))) {
 		/* We always clear VM_LOCKED[ONFAULT] on the old vma */