[v5,4/6] io_uring: rsrc: delegate VMA file-backed check to GUP
Commit Message
Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
writing to file-backed mappings", there is no need to explicitly check VMAs
for this condition, so simply remove this logic from io_uring altogether.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
io_uring/rsrc.c | 34 ++++++----------------------------
1 file changed, 6 insertions(+), 28 deletions(-)
Comments
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
> writing to file-backed mappings", there is no need to explicitly check VMAs
> for this condition, so simply remove this logic from io_uring altogether.
Don't have the prerequisite patch handy (not in mainline yet), but if it
just moves the check, then:
Reviewed-by: Jens Axboe <axboe@kernel.dk>
On 15.05.23 21:55, Jens Axboe wrote:
> On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
>> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
>> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
>> writing to file-backed mappings", there is no need to explicitly check VMAs
>> for this condition, so simply remove this logic from io_uring altogether.
>
> Don't have the prerequisite patch handy (not in mainline yet), but if it
> just moves the check, then:
>
> Reviewed-by: Jens Axboe <axboe@kernel.dk>
>
Jens, please see my note regarding io_uring:
https://lore.kernel.org/bpf/6e96358e-bcb5-cc36-18c3-ec5153867b9a@redhat.com/
With this patch, MAP_PRIVATE will work as expected (2), but there will
be a change in return code handling (1) that we might have to document
in the man page.
On 14.05.23 23:26, Lorenzo Stoakes wrote:
> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
> writing to file-backed mappings", there is no need to explicitly check VMAs
> for this condition, so simply remove this logic from io_uring altogether.
>
Worth adding "Note that this change will make io_uring fixed buffers work
on MAP_PRIVATE file mappings."
I'll run my test cases with this series and expect no surprises :)
Reviewed-by: David Hildenbrand <david@redhat.com>
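
To make the MAP_PRIVATE case above concrete, here is a minimal userspace sketch of preparing a private (copy-on-write) file mapping as a candidate fixed buffer. The helper name map_private_as_iovec() is hypothetical, and the sketch stops short of the registration step so that it does not depend on any particular io_uring setup.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: map the first `len` bytes of `fd` privately
 * (copy-on-write) and describe the mapping as an iovec.
 * Returns 0 on success, -1 on failure (errno is set by mmap()).
 */
int map_private_as_iovec(int fd, size_t len, struct iovec *iov)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

	if (p == MAP_FAILED)
		return -1;

	iov->iov_base = p;
	iov->iov_len = len;
	return 0;
}
```

The resulting iovec could then be handed to a buffer registration call such as liburing's io_uring_register_buffers(). On kernels without this series that registration is expected to fail, because the mapping has a backing file even though writes to it never reach that file.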
On 5/16/23 2:25 AM, David Hildenbrand wrote:
> On 15.05.23 21:55, Jens Axboe wrote:
>> On 5/14/23 3:26 PM, Lorenzo Stoakes wrote:
>>> Now that the GUP explicitly checks FOLL_LONGTERM pin_user_pages() for
>>> broken file-backed mappings in "mm/gup: disallow FOLL_LONGTERM GUP-nonfast
>>> writing to file-backed mappings", there is no need to explicitly check VMAs
>>> for this condition, so simply remove this logic from io_uring altogether.
>>
>> Don't have the prerequisite patch handy (not in mainline yet), but if it
>> just moves the check, then:
>>
>> Reviewed-by: Jens Axboe <axboe@kernel.dk>
>>
>
> Jens, please see my note regarding io_uring:
>
> https://lore.kernel.org/bpf/6e96358e-bcb5-cc36-18c3-ec5153867b9a@redhat.com/
>
> With this patch, MAP_PRIVATE will work as expected (2), but there will
> be a change in return code handling (1) that we might have to document
> in the man page.
I think documenting that newer kernels will return -EFAULT rather than
-EOPNOTSUPP should be fine. It's not a new failure case, just a
different error value for an already failing case, so a doc update is
enough. Will do that now.
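
For userspace that has to run on kernels both with and without this change, the return-code difference discussed above could be absorbed along the lines of the following sketch. It assumes the liburing convention of returning a negative errno from buffer registration; the helper name fixed_buf_pin_rejected() is hypothetical and not part of io_uring or liburing.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if a buffer registration result
 * (negative errno, liburing style) may mean the kernel refused to pin
 * the buffer because of the kind of mapping backing it.
 *
 * Older kernels:          -EOPNOTSUPP from io_uring's own VMA check.
 * Kernels with this patch: -EFAULT, per the discussion in this thread.
 *
 * Caveat: -EFAULT is also what a genuinely bad address yields, so this
 * is a coarse classification, not a precise diagnosis.
 */
bool fixed_buf_pin_rejected(int ret)
{
	return ret == -EOPNOTSUPP || ret == -EFAULT;
}
```

A caller might use this to fall back to non-registered buffers instead of treating either value as fatal.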
@@ -1030,9 +1030,8 @@ static int io_buffer_account_pin(struct io_ring_ctx *ctx, struct page **pages,
 struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages)
 {
 	unsigned long start, end, nr_pages;
-	struct vm_area_struct **vmas = NULL;
 	struct page **pages = NULL;
-	int i, pret, ret = -ENOMEM;
+	int pret, ret = -ENOMEM;
 
 	end = (ubuf + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 	start = ubuf >> PAGE_SHIFT;
@@ -1042,45 +1041,24 @@ struct page **io_pin_pages(unsigned long ubuf, unsigned long len, int *npages)
 	if (!pages)
 		goto done;
 
-	vmas = kvmalloc_array(nr_pages, sizeof(struct vm_area_struct *),
-			      GFP_KERNEL);
-	if (!vmas)
-		goto done;
-
 	ret = 0;
 	mmap_read_lock(current->mm);
 	pret = pin_user_pages(ubuf, nr_pages, FOLL_WRITE | FOLL_LONGTERM,
-			      pages, vmas);
-	if (pret == nr_pages) {
-		/* don't support file backed memory */
-		for (i = 0; i < nr_pages; i++) {
-			struct vm_area_struct *vma = vmas[i];
-
-			if (vma_is_shmem(vma))
-				continue;
-			if (vma->vm_file &&
-			    !is_file_hugepages(vma->vm_file)) {
-				ret = -EOPNOTSUPP;
-				break;
-			}
-		}
+			      pages, NULL);
+	if (pret == nr_pages)
 		*npages = nr_pages;
-	} else {
+	else
 		ret = pret < 0 ? pret : -EFAULT;
-	}
+
 	mmap_read_unlock(current->mm);
 	if (ret) {
-		/*
-		 * if we did partial map, or found file backed vmas,
-		 * release any pages we did get
-		 */
+		/* if we did partial map, release any pages we did get */
 		if (pret > 0)
 			unpin_user_pages(pages, pret);
 		goto done;
 	}
 	ret = 0;
 done:
-	kvfree(vmas);
 	if (ret < 0) {
 		kvfree(pages);
 		pages = ERR_PTR(ret);
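
As an aid to reading the removed loop, the following is a small userspace model of the decision it made. It is only a sketch: struct model_vma and old_io_uring_vma_check() are hypothetical stand-ins, and the three booleans merely model vma->vm_file, vma_is_shmem() and is_file_hugepages() from the diff above rather than calling any kernel code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the VMA properties the removed loop looked at. */
struct model_vma {
	bool has_file;     /* models vma->vm_file != NULL */
	bool is_shmem;     /* models vma_is_shmem(vma) */
	bool is_hugetlbfs; /* models is_file_hugepages(vma->vm_file) */
};

/*
 * Models the removed check: returns 0 if every VMA is acceptable,
 * or -EOPNOTSUPP at the first file-backed VMA that is neither shmem
 * nor hugetlbfs.
 */
int old_io_uring_vma_check(const struct model_vma *vmas, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (vmas[i].is_shmem)
			continue;
		if (vmas[i].has_file && !vmas[i].is_hugetlbfs)
			return -EOPNOTSUPP;
	}
	return 0;
}
```

Note that the model, like the removed loop, never looks at whether the mapping is shared or private. That is consistent with the remark in the thread that MAP_PRIVATE file mappings only start working as fixed buffers once the check is delegated to GUP.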