[1/2] hugetlbfs: extend hugetlb_vma_lock to private VMAs

Message ID 20230920021811.3095089-2-riel@surriel.com
State New
Series hugetlbfs: close race between MADV_DONTNEED and page fault

Commit Message

Rik van Riel Sept. 20, 2023, 2:16 a.m. UTC
  From: Rik van Riel <riel@surriel.com>

Extend the locking scheme used to protect shared hugetlb mappings
from truncate vs page fault races, in order to protect private
hugetlb mappings (with resv_map) against MADV_DONTNEED.

Add a read-write semaphore to the resv_map data structure, and
use that from the hugetlb_vma_(un)lock_* functions, in preparation
for closing the race between MADV_DONTNEED and page faults.

Signed-off-by: Rik van Riel <riel@surriel.com>
---
 include/linux/hugetlb.h |  6 ++++++
 mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
 2 files changed, 38 insertions(+), 4 deletions(-)
  

Comments

Matthew Wilcox Sept. 20, 2023, 3:57 a.m. UTC | #1
On Tue, Sep 19, 2023 at 10:16:09PM -0400, riel@surriel.com wrote:
> From: Rik van Riel <riel@surriel.com>
> 
> Extend the locking scheme used to protect shared hugetlb mappings
> from truncate vs page fault races, in order to protect private
> hugetlb mappings (with resv_map) against MADV_DONTNEED.
> 
> Add a read-write semaphore to the resv_map data structure, and
> use that from the hugetlb_vma_(un)lock_* functions, in preparation
> for closing the race between MADV_DONTNEED and page faults.

This feels an awful lot like the invalidate_lock in struct address_space
which was recently added by Jan Kara.
  
Rik van Riel Sept. 20, 2023, 4:09 a.m. UTC | #2
On Wed, 2023-09-20 at 04:57 +0100, Matthew Wilcox wrote:
> On Tue, Sep 19, 2023 at 10:16:09PM -0400, riel@surriel.com wrote:
> > From: Rik van Riel <riel@surriel.com>
> > 
> > Extend the locking scheme used to protect shared hugetlb mappings
> > from truncate vs page fault races, in order to protect private
> > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > 
> > Add a read-write semaphore to the resv_map data structure, and
> > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > for closing the race between MADV_DONTNEED and page faults.
> 
> This feels an awful lot like the invalidate_lock in struct address_space
> which was recently added by Jan Kara.
> 
Indeed it does.

It might be even nicer if we could replace the hugetlb_vma_lock
special logic with the invalidate_lock for hugetlbfs.

Mike, can you think of any reason why the hugetlb_vma_lock logic
should not be replaced with the invalidate_lock?

If not, I'd be happy to implement that.
  
Mike Kravetz Sept. 20, 2023, 4:36 p.m. UTC | #3
On 09/20/23 00:09, Rik van Riel wrote:
> On Wed, 2023-09-20 at 04:57 +0100, Matthew Wilcox wrote:
> > On Tue, Sep 19, 2023 at 10:16:09PM -0400, riel@surriel.com wrote:
> > > From: Rik van Riel <riel@surriel.com>
> > > 
> > > Extend the locking scheme used to protect shared hugetlb mappings
> > > from truncate vs page fault races, in order to protect private
> > > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > > 
> > > Add a read-write semaphore to the resv_map data structure, and
> > > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > > for closing the race between MADV_DONTNEED and page faults.
> > 
> > This feels an awful lot like the invalidate_lock in struct address_space
> > which was recently added by Jan Kara.
> > 
> Indeed it does.
> 
> It might be even nicer if we could replace the hugetlb_vma_lock
> special logic with the invalidate_lock for hugetlbfs.
> 
> Mike, can you think of any reason why the hugetlb_vma_lock logic
> should not be replaced with the invalidate_lock?
> 
> If not, I'd be happy to implement that.
> 

Sorry Rik,

I have some other things that need immediate attention and have not had a
chance to take a close look here.  I'll take a closer look later today (my
time) or tomorrow.
  
Mike Kravetz Sept. 21, 2023, 10:42 p.m. UTC | #4
On 09/19/23 22:16, riel@surriel.com wrote:
> From: Rik van Riel <riel@surriel.com>
> 
> Extend the locking scheme used to protect shared hugetlb mappings
> from truncate vs page fault races, in order to protect private
> hugetlb mappings (with resv_map) against MADV_DONTNEED.
> 
> Add a read-write semaphore to the resv_map data structure, and
> use that from the hugetlb_vma_(un)lock_* functions, in preparation
> for closing the race between MADV_DONTNEED and page faults.
> 
> Signed-off-by: Rik van Riel <riel@surriel.com>
> ---
>  include/linux/hugetlb.h |  6 ++++++
>  mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
>  2 files changed, 38 insertions(+), 4 deletions(-)

This looks straightforward.

However, I ran just this patch through libhugetlbfs test suite and it hung on
misaligned_offset (2M: 32).
https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/misaligned_offset.c

Added lock/semaphore debugging to the kernel and got:
[   38.094690] =========================
[   38.095517] WARNING: held lock freed!
[   38.096350] 6.6.0-rc2-next-20230921-dirty #4 Not tainted
[   38.097556] -------------------------
[   38.098439] mlock/1002 is freeing memory ffff8881eff8dc00-ffff8881eff8ddff, with a lock still held there!
[   38.100550] ffff8881eff8dce8 (&resv_map->rw_sema){++++}-{3:3}, at: __unmap_hugepage_range_final+0x29/0x120
[   38.103564] 2 locks held by mlock/1002:
[   38.104552]  #0: ffff8881effa42a0 (&mm->mmap_lock){++++}-{3:3}, at: do_vmi_align_munmap+0x5c6/0x650
[   38.106611]  #1: ffff8881eff8dce8 (&resv_map->rw_sema){++++}-{3:3}, at: __unmap_hugepage_range_final+0x29/0x120
[   38.108827] 
[   38.108827] stack backtrace:
[   38.109929] CPU: 0 PID: 1002 Comm: mlock Not tainted 6.6.0-rc2-next-20230921-dirty #4
[   38.111812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[   38.113784] Call Trace:
[   38.114456]  <TASK>
[   38.115066]  dump_stack_lvl+0x57/0x90
[   38.116001]  debug_check_no_locks_freed+0x137/0x170
[   38.117193]  ? remove_vma+0x28/0x70
[   38.118088]  __kmem_cache_free+0x8f/0x2b0
[   38.119080]  remove_vma+0x28/0x70
[   38.119960]  do_vmi_align_munmap+0x3b1/0x650
[   38.121051]  do_vmi_munmap+0xc9/0x1a0
[   38.122006]  __vm_munmap+0xa4/0x190
[   38.122931]  __ia32_sys_munmap+0x15/0x20
[   38.123926]  __do_fast_syscall_32+0x68/0x100
[   38.125031]  do_fast_syscall_32+0x2f/0x70
[   38.126060]  entry_SYSENTER_compat_after_hwframe+0x7b/0x8d
[   38.127366] RIP: 0023:0xf7f05579
[   38.128198] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
[   38.132534] RSP: 002b:00000000fffa877c EFLAGS: 00000286 ORIG_RAX: 000000000000005b
[   38.135703] RAX: ffffffffffffffda RBX: 00000000f7a00000 RCX: 0000000000200000
[   38.137323] RDX: 00000000f7a00000 RSI: 0000000000200000 RDI: 0000000000000003
[   38.138965] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
[   38.140574] R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000000000
[   38.142191] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[   38.143865]  </TASK>

Something is not quite right.  If you do not get to it first, I will take a
look as time permits.
  
Mike Kravetz Sept. 21, 2023, 11:17 p.m. UTC | #5
On 09/21/23 15:42, Mike Kravetz wrote:
> On 09/19/23 22:16, riel@surriel.com wrote:
> > From: Rik van Riel <riel@surriel.com>
> > 
> > Extend the locking scheme used to protect shared hugetlb mappings
> > from truncate vs page fault races, in order to protect private
> > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > 
> > Add a read-write semaphore to the resv_map data structure, and
> > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > for closing the race between MADV_DONTNEED and page faults.
> > 
> > Signed-off-by: Rik van Riel <riel@surriel.com>
> > ---
> >  include/linux/hugetlb.h |  6 ++++++
> >  mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
> >  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> This looks straightforward.
> 
> However, I ran just this patch through libhugetlbfs test suite and it hung on
> misaligned_offset (2M: 32).
> https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/misaligned_offset.c
> 
> Added lock/semaphore debugging to the kernel and got:
> [   38.094690] =========================
> [   38.095517] WARNING: held lock freed!
> [   38.096350] 6.6.0-rc2-next-20230921-dirty #4 Not tainted
> [   38.097556] -------------------------
> [   38.098439] mlock/1002 is freeing memory ffff8881eff8dc00-ffff8881eff8ddff, with a lock still held there!
> [   38.100550] ffff8881eff8dce8 (&resv_map->rw_sema){++++}-{3:3}, at: __unmap_hugepage_range_final+0x29/0x120
> [   38.103564] 2 locks held by mlock/1002:
> [   38.104552]  #0: ffff8881effa42a0 (&mm->mmap_lock){++++}-{3:3}, at: do_vmi_align_munmap+0x5c6/0x650
> [   38.106611]  #1: ffff8881eff8dce8 (&resv_map->rw_sema){++++}-{3:3}, at: __unmap_hugepage_range_final+0x29/0x120
> [   38.108827] 
> [   38.108827] stack backtrace:
> [   38.109929] CPU: 0 PID: 1002 Comm: mlock Not tainted 6.6.0-rc2-next-20230921-dirty #4
> [   38.111812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
> [   38.113784] Call Trace:
> [   38.114456]  <TASK>
> [   38.115066]  dump_stack_lvl+0x57/0x90
> [   38.116001]  debug_check_no_locks_freed+0x137/0x170
> [   38.117193]  ? remove_vma+0x28/0x70
> [   38.118088]  __kmem_cache_free+0x8f/0x2b0
> [   38.119080]  remove_vma+0x28/0x70
> [   38.119960]  do_vmi_align_munmap+0x3b1/0x650
> [   38.121051]  do_vmi_munmap+0xc9/0x1a0
> [   38.122006]  __vm_munmap+0xa4/0x190
> [   38.122931]  __ia32_sys_munmap+0x15/0x20
> [   38.123926]  __do_fast_syscall_32+0x68/0x100
> [   38.125031]  do_fast_syscall_32+0x2f/0x70
> [   38.126060]  entry_SYSENTER_compat_after_hwframe+0x7b/0x8d
> [   38.127366] RIP: 0023:0xf7f05579
> [   38.128198] Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
> [   38.132534] RSP: 002b:00000000fffa877c EFLAGS: 00000286 ORIG_RAX: 000000000000005b
> [   38.135703] RAX: ffffffffffffffda RBX: 00000000f7a00000 RCX: 0000000000200000
> [   38.137323] RDX: 00000000f7a00000 RSI: 0000000000200000 RDI: 0000000000000003
> [   38.138965] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
> [   38.140574] R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000000000
> [   38.142191] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [   38.143865]  </TASK>
> 
> Something is not quite right.  If you do not get to it first, I will take a
> look as time permits.

Just for grins I threw on patch 2 (with lock debugging) and ran the test
suite.  It gets past misaligned_offset, but is spewing locking warnings
too fast to read.  Something is certainly missing.
  
Rik van Riel Sept. 22, 2023, 12:37 a.m. UTC | #6
On Thu, 2023-09-21 at 15:42 -0700, Mike Kravetz wrote:
> On 09/19/23 22:16, riel@surriel.com wrote:
> > From: Rik van Riel <riel@surriel.com>
> > 
> > Extend the locking scheme used to protect shared hugetlb mappings
> > from truncate vs page fault races, in order to protect private
> > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > 
> > Add a read-write semaphore to the resv_map data structure, and
> > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > for closing the race between MADV_DONTNEED and page faults.
> > 
> > Signed-off-by: Rik van Riel <riel@surriel.com>
> > ---
> >  include/linux/hugetlb.h |  6 ++++++
> >  mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
> >  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> This looks straightforward.
> 
> However, I ran just this patch through libhugetlbfs test suite and it hung on
> misaligned_offset (2M: 32).
> https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/misaligned_offset.c

Ah, so that's why I couldn't find hugetlbfs tests in the kernel
selftests directory. They're in libhugetlbfs.

I'll play around with those tests tomorrow. Let me see what's
going on.
  
Rik van Riel Sept. 22, 2023, 2:37 p.m. UTC | #7
On Thu, 2023-09-21 at 15:42 -0700, Mike Kravetz wrote:
> On 09/19/23 22:16, riel@surriel.com wrote:
> > From: Rik van Riel <riel@surriel.com>
> > 
> > Extend the locking scheme used to protect shared hugetlb mappings
> > from truncate vs page fault races, in order to protect private
> > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > 
> > Add a read-write semaphore to the resv_map data structure, and
> > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > for closing the race between MADV_DONTNEED and page faults.
> > 
> > Signed-off-by: Rik van Riel <riel@surriel.com>
> > ---
> >  include/linux/hugetlb.h |  6 ++++++
> >  mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
> >  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> This looks straightforward.
> 
> However, I ran just this patch through libhugetlbfs test suite and it hung on
> misaligned_offset (2M: 32).
> https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/misaligned_offset.c


Speaking of "looks straightforward", how do I compile the
libhugetlbfs code?

The __morecore variable, which is pointed at either the
THP or hugetlbfs morecore function, does not seem to be
defined anywhere in the sources.

Do I need to run some magic script (didn't find it) to
get a special header file set up before I can build
libhugetlbfs?



$ make
	 CC32 obj32/morecore.o
morecore.c: In function ‘__lh_hugetlbfs_setup_morecore’:
morecore.c:368:17: error: ‘__morecore’ undeclared (first use in this function); did you mean ‘thp_morecore’?
  368 |                 __morecore = &thp_morecore;
      |                 ^~~~~~~~~~
      |                 thp_morecore
morecore.c:368:17: note: each undeclared identifier is reported only once for each function it appears in
make: *** [Makefile:292: obj32/morecore.o] Error 1
$ grep __morecore *.[ch]
morecore.c:		__morecore = &thp_morecore;
morecore.c:		__morecore = &hugetlbfs_morecore;
  
Mike Kravetz Sept. 22, 2023, 4:44 p.m. UTC | #8
On 09/22/23 10:37, Rik van Riel wrote:
> On Thu, 2023-09-21 at 15:42 -0700, Mike Kravetz wrote:
> > On 09/19/23 22:16, riel@surriel.com wrote:
> > > From: Rik van Riel <riel@surriel.com>
> > > 
> > > Extend the locking scheme used to protect shared hugetlb mappings
> > > from truncate vs page fault races, in order to protect private
> > > hugetlb mappings (with resv_map) against MADV_DONTNEED.
> > > 
> > > Add a read-write semaphore to the resv_map data structure, and
> > > use that from the hugetlb_vma_(un)lock_* functions, in preparation
> > > for closing the race between MADV_DONTNEED and page faults.
> > > 
> > > Signed-off-by: Rik van Riel <riel@surriel.com>
> > > ---
> > >  include/linux/hugetlb.h |  6 ++++++
> > >  mm/hugetlb.c            | 36 ++++++++++++++++++++++++++++++++----
> > >  2 files changed, 38 insertions(+), 4 deletions(-)
> > 
> > This looks straightforward.
> > 
> > However, I ran just this patch through libhugetlbfs test suite and it hung on
> > misaligned_offset (2M: 32).
> > https://github.com/libhugetlbfs/libhugetlbfs/blob/master/tests/misaligned_offset.c
> 
> 
> Speaking of "looks straightforward", how do I compile the
> libhugetlbfs code?
> 
> The __morecore variable, which is pointed at either the
> THP or hugetlbfs morecore function, does not seem to be
> defined anywhere in the sources.
> 
> Do I need to run some magic script (didn't find it) to
> get a special header file set up before I can build
> libhugetlbfs?

libhugetlbfs is a mess!  Distros have dropped it.  However, I still find
the test cases useful.  I have a special VM with an old glibc just for
running the tests.

Sorry, can't give instructions for using tests on a recent glibc.

But, back to this patch ...
With the hints from the locking debug code, it came to me on my walk this
morning.  We need to also have __hugetlb_vma_unlock_write_free() work
for private vmas as called from __unmap_hugepage_range_final.  This
additional change (or something like it) is required in this patch.

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f906c5fa4d09..8f3d5895fffc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -372,6 +372,11 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		__hugetlb_vma_unlock_write_put(vma_lock);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		/* no free for anon vmas, but still need to unlock */
+		up_write(&resv_map->rw_sema);
 	}
 }
  
Rik van Riel Sept. 22, 2023, 4:56 p.m. UTC | #9
On Fri, 2023-09-22 at 09:44 -0700, Mike Kravetz wrote:
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f906c5fa4d09..8f3d5895fffc 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -372,6 +372,11 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma)
>                 struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
>  
>                 __hugetlb_vma_unlock_write_put(vma_lock);
> +       } else if (__vma_private_lock(vma)) {
> +               struct resv_map *resv_map = vma_resv_map(vma);
> +
> +               /* no free for anon vmas, but still need to unlock */
> +               up_write(&resv_map->rw_sema);
>         }
>  }
> 

Nice catch. I'll add that.

I was still trying to reproduce the bug here.

The libhugetlbfs code compiles with the offending bits
commented out, but the misaligned_offset test wasn't
causing trouble on my test VM here.

Given the potential negative impact of moving from a
per-VMA lock to a per-backing-address_space lock, I'll
keep the 3 patches separate, and in the order they are
in now.

Let me go spin and test v2.
  
Rik van Riel Sept. 22, 2023, 6:31 p.m. UTC | #10
On Fri, 2023-09-22 at 09:44 -0700, Mike Kravetz wrote:
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index f906c5fa4d09..8f3d5895fffc 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -372,6 +372,11 @@ static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma)
>                 struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
>  
>                 __hugetlb_vma_unlock_write_put(vma_lock);
> +       } else if (__vma_private_lock(vma)) {
> +               struct resv_map *resv_map = vma_resv_map(vma);
> +
> +               /* no free for anon vmas, but still need to unlock */
> +               up_write(&resv_map->rw_sema);
>         }
>  }
>  

That did the trick. The libhugetlbfs tests pass now, with
lockdep and KASAN enabled. Breno's MADV_DONTNEED test case
for hugetlbfs still passes, too.
  

Patch

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5b2626063f4f..694928fa06a3 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -60,6 +60,7 @@  struct resv_map {
 	long adds_in_progress;
 	struct list_head region_cache;
 	long region_cache_count;
+	struct rw_semaphore rw_sema;
 #ifdef CONFIG_CGROUP_HUGETLB
 	/*
 	 * On private mappings, the counter to uncharge reservations is stored
@@ -1231,6 +1232,11 @@  static inline bool __vma_shareable_lock(struct vm_area_struct *vma)
 	return (vma->vm_flags & VM_MAYSHARE) && vma->vm_private_data;
 }
 
+static inline bool __vma_private_lock(struct vm_area_struct *vma)
+{
+	return (!(vma->vm_flags & VM_MAYSHARE)) && vma->vm_private_data;
+}
+
 /*
  * Safe version of huge_pte_offset() to check the locks.  See comments
  * above huge_pte_offset().
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ba6d39b71cb1..b99d215d2939 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -97,6 +97,7 @@  static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
 static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma);
 static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
 		unsigned long start, unsigned long end);
+static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
 
 static inline bool subpool_is_free(struct hugepage_subpool *spool)
 {
@@ -267,6 +268,10 @@  void hugetlb_vma_lock_read(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		down_read(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		down_read(&resv_map->rw_sema);
 	}
 }
 
@@ -276,6 +281,10 @@  void hugetlb_vma_unlock_read(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		up_read(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		up_read(&resv_map->rw_sema);
 	}
 }
 
@@ -285,6 +294,10 @@  void hugetlb_vma_lock_write(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		down_write(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		down_write(&resv_map->rw_sema);
 	}
 }
 
@@ -294,17 +307,27 @@  void hugetlb_vma_unlock_write(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		up_write(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		up_write(&resv_map->rw_sema);
 	}
 }
 
 int hugetlb_vma_trylock_write(struct vm_area_struct *vma)
 {
-	struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
-	if (!__vma_shareable_lock(vma))
-		return 1;
+	if (__vma_shareable_lock(vma)) {
+		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
+
+		return down_write_trylock(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		return down_write_trylock(&resv_map->rw_sema);
+	}
 
-	return down_write_trylock(&vma_lock->rw_sema);
+	return 1;
 }
 
 void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
@@ -313,6 +336,10 @@  void hugetlb_vma_assert_locked(struct vm_area_struct *vma)
 		struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 		lockdep_assert_held(&vma_lock->rw_sema);
+	} else if (__vma_private_lock(vma)) {
+		struct resv_map *resv_map = vma_resv_map(vma);
+
+		lockdep_assert_held(&resv_map->rw_sema);
 	}
 }
 
@@ -1068,6 +1095,7 @@  struct resv_map *resv_map_alloc(void)
 	kref_init(&resv_map->refs);
 	spin_lock_init(&resv_map->lock);
 	INIT_LIST_HEAD(&resv_map->regions);
+	init_rwsem(&resv_map->rw_sema);
 
 	resv_map->adds_in_progress = 0;
 	/*