mm: fix oops when filemap_map_pmd() without prealloc_pte

Message ID 6ed0c50c-78ef-0719-b3c5-60c0c010431c@google.com
State New

Commit Message

Hugh Dickins Nov. 17, 2023, 8:49 a.m. UTC
syzbot reports oops in lockdep's __lock_acquire(), called from
__pte_offset_map_lock() called from filemap_map_pages(); or when I run
the repro, the oops comes in pmd_install(), called from filemap_map_pmd()
called from filemap_map_pages(), just before the __pte_offset_map_lock().

The problem is that filemap_map_pmd() has been assuming that when it
finds pmd_none(), a page table has already been prepared in prealloc_pte;
and indeed do_fault_around() has been careful to preallocate one there,
when it finds pmd_none(): but what if *pmd became none in between?

My 6.6 mods in mm/khugepaged.c, avoiding mmap_lock for write, have made
it easy for *pmd to be cleared while servicing a page fault; but even
before those, a huge *pmd might be zapped while a fault is serviced.

The difference in symptomatic stack traces comes from the "memory model"
in use: pmd_install() calls pmd_populate(), which uses page_to_pfn(). In
some models that is strict, and will oops on the NULL prealloc_pte; in
other models, it will construct a bogus value to be populated into *pmd,
then __pte_offset_map_lock() oopses when trying to access the split ptlock
pointer (or shows some other symptom in the normal case, where the ptlock
is embedded rather than a pointer).

Reported-and-tested-by: syzbot+89edd67979b52675ddec@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-mm/0000000000005e44550608a0806c@google.com/
Link: https://lore.kernel.org/linux-mm/20231115065506.19780-1-jose.pekkarinen@foxhound.fi/
Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>    [5.12+]
---
 mm/filemap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
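
As a rough userspace illustration of the guard this patch adds (not
kernel code): the sketch below models the pmd slot as a plain pointer and
uses hypothetical names (struct model_fault, model_pmd_none(),
model_pmd_install(), model_map_pmd_fixed()) that do not exist in the
kernel. The assert() in model_pmd_install() is an assumption standing in
for a strict memory model that oopses on a NULL prealloc_pte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct vm_fault used here. */
struct model_fault {
	void **pmd;          /* slot: NULL means "none", else a page table */
	void *prealloc_pte;  /* preallocated page table, or NULL */
};

/* Models pmd_none(): the slot holds no page table. */
static bool model_pmd_none(void *entry)
{
	return entry == NULL;
}

/*
 * Models pmd_install() consuming the preallocation. The assert() plays
 * the role of a strict memory model faulting on a NULL page table.
 */
static void model_pmd_install(struct model_fault *vmf)
{
	assert(vmf->prealloc_pte != NULL);
	*vmf->pmd = vmf->prealloc_pte;
	vmf->prealloc_pte = NULL;
}

/*
 * Models the patched condition: install only when the slot is empty AND
 * a page table was actually preallocated. Returns true if it installed.
 */
static bool model_map_pmd_fixed(struct model_fault *vmf)
{
	if (model_pmd_none(*vmf->pmd) && vmf->prealloc_pte) {
		model_pmd_install(vmf);
		return true;
	}
	return false;
}
```

With an empty slot and a NULL prealloc_pte, model_map_pmd_fixed() returns
false and leaves the slot empty; dropping the second half of the
condition would make the same call trip the assert().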
  

Comments

David Hildenbrand Nov. 28, 2023, 6:18 p.m. UTC | #1
On 17.11.23 09:49, Hugh Dickins wrote:
> syzbot reports oops in lockdep's __lock_acquire(), called from
> __pte_offset_map_lock() called from filemap_map_pages(); or when I run
> the repro, the oops comes in pmd_install(), called from filemap_map_pmd()
> called from filemap_map_pages(), just before the __pte_offset_map_lock().
> 
> The problem is that filemap_map_pmd() has been assuming that when it
> finds pmd_none(), a page table has already been prepared in prealloc_pte;
> and indeed do_fault_around() has been careful to preallocate one there,
> when it finds pmd_none(): but what if *pmd became none in between?
> 
> My 6.6 mods in mm/khugepaged.c, avoiding mmap_lock for write, have made
> it easy for *pmd to be cleared while servicing a page fault; but even
> before those, a huge *pmd might be zapped while a fault is serviced.
> 
> The difference in symptomatic stack traces comes from the "memory model"
> in use: pmd_install() calls pmd_populate(), which uses page_to_pfn(). In
> some models that is strict, and will oops on the NULL prealloc_pte; in
> other models, it will construct a bogus value to be populated into *pmd,
> then __pte_offset_map_lock() oopses when trying to access the split ptlock
> pointer (or shows some other symptom in the normal case, where the ptlock
> is embedded rather than a pointer).
> 
> Reported-and-tested-by: syzbot+89edd67979b52675ddec@syzkaller.appspotmail.com
> Closes: https://lore.kernel.org/linux-mm/0000000000005e44550608a0806c@google.com/
> Link: https://lore.kernel.org/linux-mm/20231115065506.19780-1-jose.pekkarinen@foxhound.fi/
> Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")


Quite an old one

Reviewed-by: David Hildenbrand <david@redhat.com>
  

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 9710f43a89ac..3d4dae9d1070 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3371,7 +3371,7 @@  static bool filemap_map_pmd(struct vm_fault *vmf, struct folio *folio,
 		}
 	}
 
-	if (pmd_none(*vmf->pmd))
+	if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
 		pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
 
 	return false;