[v4,3/3] mm: perform the mapping_map_writable() check after call_mmap()
Commit Message
In order for a F_SEAL_WRITE sealed memfd mapping to have an opportunity to
clear VM_MAYWRITE, we must be able to invoke the appropriate vm_ops->mmap()
handler to do so. We would otherwise fail the mapping_map_writable() check
before we had the opportunity to avoid it.
This patch moves this check after the call_mmap() invocation. Only memfd
actively denies write access causing a potential failure here (in
memfd_add_seals()), so there should be no impact on non-memfd cases.
This patch makes the userland-visible change that MAP_SHARED, PROT_READ
mappings of an F_SEAL_WRITE sealed memfd mapping will now succeed.
There is a delicate situation with cleanup paths assuming that a writable
mapping must have occurred in circumstances where it may now not have. In
order to ensure we do not accidentally mark a writable file unwritable by
mistake, we explicitly track whether we have a writable mapping and
unmap only if we do.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
mm/mmap.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
Comments
On Thu, Oct 12, 2023 at 06:04:30PM +0100, Lorenzo Stoakes wrote:
> In order for a F_SEAL_WRITE sealed memfd mapping to have an opportunity to
> clear VM_MAYWRITE, we must be able to invoke the appropriate vm_ops->mmap()
> handler to do so. We would otherwise fail the mapping_map_writable() check
> before we had the opportunity to avoid it.
>
> This patch moves this check after the call_mmap() invocation. Only memfd
> actively denies write access causing a potential failure here (in
> memfd_add_seals()), so there should be no impact on non-memfd cases.
>
> This patch makes the userland-visible change that MAP_SHARED, PROT_READ
> mappings of an F_SEAL_WRITE sealed memfd mapping will now succeed.
>
> There is a delicate situation with cleanup paths assuming that a writable
> mapping must have occurred in circumstances where it may now not have. In
> order to ensure we do not accidentally mark a writable file unwritable by
> mistake, we explicitly track whether we have a writable mapping and
> unmap only if we do.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238
> Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
> ---
> mm/mmap.c | 23 ++++++++++++++---------
> 1 file changed, 14 insertions(+), 9 deletions(-)
>
[snip]
Andrew, could you apply the following -fix patch to this? As a bug was
detected in the implementation [0] - I was being over-zealous in setting
the writable_file_mapping flag and had falsely assumed vma->vm_file == file
in all instances of the cleanup. The fix is to only set it in one place.
[0]: https://lore.kernel.org/all/CA+G9fYtL7wK-dE-Tnz4t-GWmQb50EPYa=TWGjpgYU2Z=oeAO_w@mail.gmail.com/
----8<----
From 7feea6faada5b10a872c24755cc630220cba619a Mon Sep 17 00:00:00 2001
From: Lorenzo Stoakes <lstoakes@gmail.com>
Date: Mon, 16 Oct 2023 17:17:13 +0100
Subject: [PATCH] mm: perform the mapping_map_writable() check after
call_mmap()
Do not set writable_file_mapping in an instance where it is not appropriate
to do so.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
mm/mmap.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/mmap.c b/mm/mmap.c
index 7f45a08e7973..8b57e42fd980 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2923,10 +2923,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
mm->map_count++;
if (vma->vm_file) {
i_mmap_lock_write(vma->vm_file->f_mapping);
- if (vma_is_shared_maywrite(vma)) {
+ if (vma_is_shared_maywrite(vma))
mapping_allow_writable(vma->vm_file->f_mapping);
- writable_file_mapping = true;
- }
flush_dcache_mmap_lock(vma->vm_file->f_mapping);
vma_interval_tree_insert(vma, &vma->vm_file->f_mapping->i_mmap);
@@ -2752,6 +2752,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
unsigned long charged = 0;
unsigned long end = addr + len;
unsigned long merge_start = addr, merge_end = end;
+ bool writable_file_mapping = false;
pgoff_t vm_pgoff;
int error;
VMA_ITERATOR(vmi, mm, addr);
@@ -2846,17 +2847,19 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
vma->vm_pgoff = pgoff;
if (file) {
- if (is_shared_maywrite(vm_flags)) {
- error = mapping_map_writable(file->f_mapping);
- if (error)
- goto free_vma;
- }
-
vma->vm_file = get_file(file);
error = call_mmap(file, vma);
if (error)
goto unmap_and_free_vma;
+ if (vma_is_shared_maywrite(vma)) {
+ error = mapping_map_writable(file->f_mapping);
+ if (error)
+ goto close_and_free_vma;
+
+ writable_file_mapping = true;
+ }
+
/*
* Expansion is handled above, merging is handled below.
* Drivers should not alter the address of the VMA.
@@ -2920,8 +2923,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
mm->map_count++;
if (vma->vm_file) {
i_mmap_lock_write(vma->vm_file->f_mapping);
- if (vma_is_shared_maywrite(vma))
+ if (vma_is_shared_maywrite(vma)) {
mapping_allow_writable(vma->vm_file->f_mapping);
+ writable_file_mapping = true;
+ }
flush_dcache_mmap_lock(vma->vm_file->f_mapping);
vma_interval_tree_insert(vma, &vma->vm_file->f_mapping->i_mmap);
@@ -2937,7 +2942,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
/* Once vma denies write, undo our temporary denial count */
unmap_writable:
- if (file && is_shared_maywrite(vm_flags))
+ if (writable_file_mapping)
mapping_unmap_writable(file->f_mapping);
file = vma->vm_file;
ksm_add_vma(vma);
@@ -2985,7 +2990,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
unmap_region(mm, &vmi.mas, vma, prev, next, vma->vm_start,
vma->vm_end, vma->vm_end, true);
}
- if (file && is_shared_maywrite(vm_flags))
+ if (writable_file_mapping)
mapping_unmap_writable(file->f_mapping);
free_vma:
vm_area_free(vma);