fs/userfaultfd: Fix maple tree iterator in userfaultfd_unregister()

Message ID 20221107201121.1169273-1-Liam.Howlett@oracle.com
State New
Headers
Series fs/userfaultfd: Fix maple tree iterator in userfaultfd_unregister() |

Commit Message

Liam R. Howlett Nov. 7, 2022, 8:11 p.m. UTC
  When iterating the VMAs, the maple state needs to be invalidated if the
tree is modified by a split or merge to ensure the maple tree node
contained in the maple state is still valid.  These invalidations were
missed, so add them to the paths which alter the tree.

Reported-by: syzbot+0d2014e4da2ccced5b41@syzkaller.appspotmail.com
Fixes: 69dbe6daf104 (userfaultfd: use maple tree iterator to iterate VMAs)
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
---
 fs/userfaultfd.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Linus Torvalds Nov. 7, 2022, 9:13 p.m. UTC | #1
On Mon, Nov 7, 2022 at 12:12 PM Liam Howlett <liam.howlett@oracle.com> wrote:
>
> When iterating the VMAs, the maple state needs to be invalidated if the
> tree is modified by a split or merge to ensure the maple tree node
> contained in the maple state is still valid.  These invalidations were
> missed, so add them to the paths which alter the tree.

I have applied this as an obvious fix, but I would *really* want to
also see longer-term

 - I'd really like the 'mas' operations to have 'vma' specializations
that get the type safety right

 - that mas_pause() name is horrible, please let's just fix it to
something sensible in this context

 - moving the iterator invalidation into split_vma() and vma_merge()
or at least have some other means of not having these mistakes

From what I can tell, things like mprotect() and mlock() - end up not
using the iterator at all because of this issue. Instead they seem to
just do

                vma = find_vma(current->mm, prev->vm_end);

despite having actually started out with the whole iterator state.
Except for 'apply_mlockall_flags()' that randomly does end up usign
the iterator (and has that mas_pause() as a result).

So it would make *sense* to have "mlock_fixup()" take a MA_STATE
instead of "vma, &prev" as arguments, but it doesn't.

I dunno. Maybe there's some other reason for this very non-intuitive
mix of "sometimes iterators, sometimes not, and always horrible
naming".

                Linus
  
Liam R. Howlett Nov. 7, 2022, 10:04 p.m. UTC | #2
* Linus Torvalds <torvalds@linuxfoundation.org> [221107 16:14]:
> On Mon, Nov 7, 2022 at 12:12 PM Liam Howlett <liam.howlett@oracle.com> wrote:
> >
> > When iterating the VMAs, the maple state needs to be invalidated if the
> > tree is modified by a split or merge to ensure the maple tree node
> > contained in the maple state is still valid.  These invalidations were
> > missed, so add them to the paths which alter the tree.
> 
> I have applied this as an obvious fix, but I would *really* want to
> also see longer-term
> 
>  - I'd really like the 'mas' operations to have 'vma' specializations
> that get the type safety right
> 
>  - that mas_pause() name is horrible, please let's just fix it to
> something sensible in this context
> 
>  - moving the iterator invalidation into split_vma() and vma_merge()
> or at least have some other means of not having these mistakes
> 

Yes, we intend to do all these things in the 6.2 merge window.

> From what I can tell, things like mprotect() and mlock() - end up not
> using the iterator at all because of this issue. Instead they seem to
> just do
> 
>                 vma = find_vma(current->mm, prev->vm_end);
> 
> despite having actually started out with the whole iterator state.
> Except for 'apply_mlockall_flags()' that randomly does end up usign
> the iterator (and has that mas_pause() as a result).
> 
> So it would make *sense* to have "mlock_fixup()" take a MA_STATE
> instead of "vma, &prev" as arguments, but it doesn't.
> 
> I dunno. Maybe there's some other reason for this very non-intuitive
> mix of "sometimes iterators, sometimes not, and always horrible
> naming".
> 
>                 Linus


I will have a deeper look at the mprotect(), mlock() areas as well.  My
first pass on this was to just replace the easier loops with iterators.

Thanks,
Liam
  

Patch

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 091d95ddf9a0..a5ed75271c15 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1630,17 +1630,20 @@  static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
 				 NULL_VM_UFFD_CTX, anon_vma_name(vma));
 		if (prev) {
 			vma = prev;
+			mas_pause(&mas);
 			goto next;
 		}
 		if (vma->vm_start < start) {
 			ret = split_vma(mm, vma, start, 1);
 			if (ret)
 				break;
+			mas_pause(&mas);
 		}
 		if (vma->vm_end > end) {
 			ret = split_vma(mm, vma, end, 0);
 			if (ret)
 				break;
+			mas_pause(&mas);
 		}
 	next:
 		/*