[3/7] mm/gup: remove vmas parameter from get_user_pages_remote()
Commit Message
The only instances of get_user_pages_remote() invocations which used the
vmas parameter were for a single page which can instead simply look up the
VMA directly. In particular:-
- __update_ref_ctr() looked up the VMA but did nothing with it so we simply
remove it.
- __access_remote_vm() was already using vma_lookup() when the original
lookup failed so by doing the lookup directly this also de-duplicates the
code.
This forms part of a broader set of patches intended to eliminate the vmas
parameter altogether.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
---
arch/arm64/kernel/mte.c | 5 +++--
arch/s390/kvm/interrupt.c | 2 +-
fs/exec.c | 2 +-
include/linux/mm.h | 2 +-
kernel/events/uprobes.c | 10 +++++-----
mm/gup.c | 12 ++++--------
mm/memory.c | 9 +++++----
mm/rmap.c | 2 +-
security/tomoyo/domain.c | 2 +-
virt/kvm/async_pf.c | 3 +--
10 files changed, 23 insertions(+), 26 deletions(-)
Comments
On 2023/04/15 8:27, Lorenzo Stoakes wrote:
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5bcb0dc6267..74d8d4007dec 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> struct page *page = NULL;
>
> ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> - &vma, NULL);
> - if (ret <= 0)
> + NULL);
> + vma = vma_lookup(mm, addr);
> + if (ret <= 0 || !vma)
> break;
This conversion looks wrong. When get_user_pages_remote(&page) returned > 0,
put_page(page) is needed even if vma_lookup() returned NULL, isn't it?
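Tetsuo's leak can be checked with a small userspace model (every name below is a mock standing in for the kernel API, not the real function): if GUP returns with a referenced page but the subsequent vma_lookup() fails, breaking out without put_page() leaves the reference held.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the refcount leak: a "page" is just a counter,
 * and mock_gup()/mock_vma_lookup() stand in for the kernel APIs. */
struct page { int refcount; };
struct vm_area_struct { int dummy; };

static struct page mock_page;

/* Mock GUP: always succeeds and takes one reference on the page. */
static int mock_gup(struct page **pagep)
{
	mock_page.refcount++;
	*pagep = &mock_page;
	return 1;
}

/* Mock vma_lookup() forced to fail, to exercise the error path. */
static struct vm_area_struct *mock_vma_lookup(void)
{
	return NULL;
}

static void put_page(struct page *page)
{
	page->refcount--;
}

/* The pattern from the patch: ret > 0 but !vma breaks out while still
 * holding the reference taken by GUP. */
static void leaky_path(void)
{
	struct page *page = NULL;
	struct vm_area_struct *vma;
	int ret = mock_gup(&page);

	vma = mock_vma_lookup();
	if (ret <= 0 || !vma)
		return;		/* leak: the reference is never dropped */
	put_page(page);
}

/* Corrected variant: drop the reference on the !vma error path. */
static void fixed_path(void)
{
	struct page *page = NULL;
	struct vm_area_struct *vma;
	int ret = mock_gup(&page);

	vma = mock_vma_lookup();
	if (ret <= 0 || !vma) {
		if (ret > 0)
			put_page(page);
		return;
	}
	put_page(page);
}
```

Starting from a zero refcount, leaky_path() leaves the count at 1 while fixed_path() returns it to 0 — exactly the imbalance being pointed out.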
On Sat, Apr 15, 2023 at 09:25:51AM +0900, Tetsuo Handa wrote:
> On 2023/04/15 8:27, Lorenzo Stoakes wrote:
> > [snip mte.c hunk quoted above]
>
> This conversion looks wrong. When get_user_pages_remote(&page) returned > 0,
> put_page(page) is needed even if vma_lookup() returned NULL, isn't it?
>
You're right, though actually it's not possible to have ret > 0 and vma == NULL,
because the GUP code requires the VMA to exist for it to have returned > 0.
I was trying to be too cute here, I think; in any case we only want to be
doing that lookup if the GUP succeeded.
Let me respin with a fix for this.
On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> [snip commit message and diffstat]
>
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5bcb0dc6267..74d8d4007dec 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
> struct page *page = NULL;
>
> ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
> - &vma, NULL);
> - if (ret <= 0)
> + NULL);
> + vma = vma_lookup(mm, addr);
> + if (ret <= 0 || !vma)
> break;
Given the slightly tricky error handling, it would make sense to turn
this pattern into a helper function:
	page = get_single_user_page_locked(mm, addr, gup_flags, &vma);
	if (IS_ERR(page))
		[..]

static inline struct page *get_single_user_page_locked(struct mm_struct *mm,
		unsigned long addr, int gup_flags, struct vm_area_struct **vma)
{
	struct page *page;
	int ret;

	ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page, NULL, NULL);
	if (ret < 0)
		return ERR_PTR(ret);
	if (WARN_ON(ret == 0))
		return ERR_PTR(-EINVAL);

	*vma = vma_lookup(mm, addr);
	if (WARN_ON(!*vma)) {
		put_page(page);
		return ERR_PTR(-EINVAL);
	}
	return page;
}
It could be its own patch, so that this change stays a mechanical removal
of NULL.
Jason
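The helper's contract can likewise be modelled in userspace. The ERR_PTR()/IS_ERR() definitions below are simplified stand-ins for the kernel's macros, and get_single_user_page_locked() here is a mock of the sketch above, not the real thing; the point is the invariant: on every failure the caller gets an ERR_PTR and holds no page reference, and on success it gets both the page and a valid VMA.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR machinery:
 * negative errno values live in the top MAX_ERRNO addresses. */
#define MAX_ERRNO 4095
static void *ERR_PTR(long err) { return (void *)err; }
static long PTR_ERR(const void *p) { return (long)p; }
static int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct page { int refcount; };
struct vm_area_struct { int dummy; };

static struct page mock_page;
static struct vm_area_struct mock_vma;
static int gup_ret;	/* result the mock GUP should produce */
static int vma_ok;	/* whether the mock VMA lookup succeeds */

static int mock_gup(struct page **pagep)
{
	if (gup_ret > 0) {
		mock_page.refcount++;
		*pagep = &mock_page;
	}
	return gup_ret;
}

static void put_page(struct page *page)
{
	page->refcount--;
}

/* Mock of the proposed helper: never returns with a leaked reference. */
static struct page *get_single_user_page_locked(struct vm_area_struct **vma)
{
	struct page *page = NULL;
	int ret = mock_gup(&page);

	if (ret < 0)
		return ERR_PTR(ret);
	if (ret == 0)
		return ERR_PTR(-EINVAL);

	*vma = vma_ok ? &mock_vma : NULL;
	if (!*vma) {
		put_page(page);	/* drop the ref before reporting failure */
		return ERR_PTR(-EINVAL);
	}
	return page;
}
```

Because the reference drop lives inside the helper, callers cannot reproduce the open-coded leak even if they forget the !vma case.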
On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> On Sat, Apr 15, 2023 at 12:27:31AM +0100, Lorenzo Stoakes wrote:
> > [snip]
>
> Given the slightly tricky error handling, it would make sense to turn
> this pattern into a helper function:
>
> [snip helper code quoted above]
>
Agreed, though I think this would work better as a follow-up patch, so as
not to distract too much from the core change. I feel like there are quite
a few things we can follow up on, including assessing whether we might be
able to use the _fast() paths in places (I haven't assessed this yet).
On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
> On Mon, Apr 17, 2023 at 10:09:36AM -0300, Jason Gunthorpe wrote:
> > [snip]
>
> Agreed, I think this would work better as a follow up patch however so as
> not to distract too much from the core change.
I don't think you should open code sketchy error handling in several
places and then clean it up later. Just do it right from the start.
Jason
On Mon, Apr 17, 2023 at 10:16:28AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 17, 2023 at 02:13:39PM +0100, Lorenzo Stoakes wrote:
> > [snip]
>
> I don't think you should open code sketchy error handling in several
> places and then clean it up later. Just do it right from the start.
>
The intent was to make the smallest change possible (though that grew
through review, of course), but I see your point; in this instance this is
fiddly stuff, and it is probably better to abstract it to enforce correct
handling.
I'll respin + add something like this.
Lorenzo Stoakes <lstoakes@gmail.com> writes:
> On Mon, Apr 17, 2023 at 10:16:28AM -0300, Jason Gunthorpe wrote:
>> [snip]
>
> Intent was to do smallest change possible (though through review that grew
> of course), but I see your point, in this instance this is fiddly stuff and
> probably better to abstract it to enforce correct handling.
>
> I'll respin + add something like this.
Could you include in your description why looking up the vma after
getting the page does not introduce a race?
I am probably silly and just looking at this quickly but it does not
seem immediately obvious why the vma and the page should match.
I would not be surprised if you hold the appropriate mutex over the
entire operation but it just isn't apparent from the diff.
I am concerned because it is an easy mistake to refactor something into
two steps and then discover you have introduced a race.
Eric
On Mon, Apr 17, 2023 at 10:07:53AM -0500, Eric W. Biederman wrote:
> Lorenzo Stoakes <lstoakes@gmail.com> writes:
> > [snip]
>
> Could you include in your description why looking up the vma after
> getting the page does not introduce a race?
>
The mmap_lock is held, so VMAs cannot be altered and no such race can
occur. get_user_pages_remote() requires that the caller invoke it under
this lock, so it is implicitly assured that this cannot happen.
I'll update the description to make this clear on the next spin!
(side-note: here is another irritating issue with GUP, we don't suffix with
_locked() consistently)
@@ -437,8 +437,9 @@ static int __access_remote_tags(struct mm_struct *mm, unsigned long addr,
struct page *page = NULL;
ret = get_user_pages_remote(mm, addr, 1, gup_flags, &page,
- &vma, NULL);
- if (ret <= 0)
+ NULL);
+ vma = vma_lookup(mm, addr);
+ if (ret <= 0 || !vma)
break;
/*
@@ -2777,7 +2777,7 @@ static struct page *get_map_page(struct kvm *kvm, u64 uaddr)
mmap_read_lock(kvm->mm);
get_user_pages_remote(kvm->mm, uaddr, 1, FOLL_WRITE,
- &page, NULL, NULL);
+ &page, NULL);
mmap_read_unlock(kvm->mm);
return page;
}
@@ -219,7 +219,7 @@ static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
*/
mmap_read_lock(bprm->mm);
ret = get_user_pages_remote(bprm->mm, pos, 1, gup_flags,
- &page, NULL, NULL);
+ &page, NULL);
mmap_read_unlock(bprm->mm);
if (ret <= 0)
return NULL;
@@ -2374,7 +2374,7 @@ extern int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
- struct vm_area_struct **vmas, int *locked);
+ int *locked);
long pin_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
@@ -365,7 +365,6 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
{
void *kaddr;
struct page *page;
- struct vm_area_struct *vma;
int ret;
short *ptr;
@@ -373,7 +372,7 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
return -EINVAL;
ret = get_user_pages_remote(mm, vaddr, 1,
- FOLL_WRITE, &page, &vma, NULL);
+ FOLL_WRITE, &page, NULL);
if (unlikely(ret <= 0)) {
/*
* We are asking for 1 page. If get_user_pages_remote() fails,
@@ -475,8 +474,9 @@ int uprobe_write_opcode(struct arch_uprobe *auprobe, struct mm_struct *mm,
gup_flags |= FOLL_SPLIT_PMD;
/* Read the page with vaddr into memory */
ret = get_user_pages_remote(mm, vaddr, 1, gup_flags,
- &old_page, &vma, NULL);
- if (ret <= 0)
+ &old_page, NULL);
+ vma = vma_lookup(mm, vaddr);
+ if (ret <= 0 || !vma)
return ret;
ret = verify_opcode(old_page, vaddr, &opcode);
@@ -2028,7 +2028,7 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
* essentially a kernel access to the memory.
*/
result = get_user_pages_remote(mm, vaddr, 1, FOLL_FORCE, &page,
- NULL, NULL);
+ NULL);
if (result < 0)
return result;
@@ -2165,8 +2165,6 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
* @pages: array that receives pointers to the pages pinned.
* Should be at least nr_pages long. Or NULL, if caller
* only intends to ensure the pages are faulted in.
- * @vmas: array of pointers to vmas corresponding to each page.
- * Or NULL if the caller does not require them.
* @locked: pointer to lock flag indicating whether lock is held and
* subsequently whether VM_FAULT_RETRY functionality can be
* utilised. Lock must initially be held.
@@ -2181,8 +2179,6 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
*
* The caller is responsible for releasing returned @pages, via put_page().
*
- * @vmas are valid only as long as mmap_lock is held.
- *
* Must be called with mmap_lock held for read or write.
*
* get_user_pages_remote walks a process's page tables and takes a reference
@@ -2219,15 +2215,15 @@ static bool is_valid_gup_args(struct page **pages, struct vm_area_struct **vmas,
long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
- struct vm_area_struct **vmas, int *locked)
+ int *locked)
{
int local_locked = 1;
- if (!is_valid_gup_args(pages, vmas, locked, &gup_flags,
+ if (!is_valid_gup_args(pages, NULL, locked, &gup_flags,
FOLL_TOUCH | FOLL_REMOTE))
return -EINVAL;
- return __get_user_pages_locked(mm, start, nr_pages, pages, vmas,
+ return __get_user_pages_locked(mm, start, nr_pages, pages, NULL,
locked ? locked : &local_locked,
gup_flags);
}
@@ -2237,7 +2233,7 @@ EXPORT_SYMBOL(get_user_pages_remote);
long get_user_pages_remote(struct mm_struct *mm,
unsigned long start, unsigned long nr_pages,
unsigned int gup_flags, struct page **pages,
- struct vm_area_struct **vmas, int *locked)
+ int *locked)
{
return 0;
}
@@ -5596,7 +5596,11 @@ int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
struct page *page = NULL;
ret = get_user_pages_remote(mm, addr, 1,
- gup_flags, &page, &vma, NULL);
+ gup_flags, &page, NULL);
+ vma = vma_lookup(mm, addr);
+ if (!vma)
+ break;
+
if (ret <= 0) {
#ifndef CONFIG_HAVE_IOREMAP_PROT
break;
@@ -5605,9 +5609,6 @@ int __access_remote_vm(struct mm_struct *mm, unsigned long addr, void *buf,
* Check if this is a VM_IO | VM_PFNMAP VMA, which
* we can access using slightly different code.
*/
- vma = vma_lookup(mm, addr);
- if (!vma)
- break;
if (vma->vm_ops && vma->vm_ops->access)
ret = vma->vm_ops->access(vma, addr, buf,
len, write);
@@ -2324,7 +2324,7 @@ int make_device_exclusive_range(struct mm_struct *mm, unsigned long start,
npages = get_user_pages_remote(mm, start, npages,
FOLL_GET | FOLL_WRITE | FOLL_SPLIT_PMD,
- pages, NULL, NULL);
+ pages, NULL);
if (npages < 0)
return npages;
@@ -916,7 +916,7 @@ bool tomoyo_dump_page(struct linux_binprm *bprm, unsigned long pos,
*/
mmap_read_lock(bprm->mm);
ret = get_user_pages_remote(bprm->mm, pos, 1,
- FOLL_FORCE, &page, NULL, NULL);
+ FOLL_FORCE, &page, NULL);
mmap_read_unlock(bprm->mm);
if (ret <= 0)
return false;
@@ -61,8 +61,7 @@ static void async_pf_execute(struct work_struct *work)
* access remotely.
*/
mmap_read_lock(mm);
- get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, NULL,
- &locked);
+ get_user_pages_remote(mm, addr, 1, FOLL_WRITE, NULL, &locked);
if (locked)
mmap_read_unlock(mm);