mm/memory: Replace kmap() with kmap_local_page()

Message ID 20231215084417.2002370-1-fabio.maria.de.francesco@linux.intel.com
State New
Series mm/memory: Replace kmap() with kmap_local_page()

Commit Message

Fabio M. De Francesco Dec. 15, 2023, 8:43 a.m. UTC
  kmap() has been deprecated in favor of kmap_local_page().

Therefore, replace kmap() with kmap_local_page() in mm/memory.c.

There are two main problems with kmap(): (1) it comes with an overhead, as
the mapping space is restricted and protected by a global lock for
synchronization, and (2) it requires global TLB invalidation when the kmap
pool wraps, and it might block until a slot becomes available when the
mapping space is fully utilized.

With kmap_local_page() the mappings are per thread, CPU local, can take
page-faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. The tasks can
be preempted and, when they are scheduled to run again, the kernel
virtual addresses are restored and still valid.

Obviously, thread locality implies that the kernel virtual addresses
returned by kmap_local_page() are only valid in the context of the
callers (i.e., they cannot be handed to other threads).

The use of kmap_local_page() in mm/memory.c does not break the
above-mentioned assumption, so it is allowed and preferred.

Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fabio.maria.de.francesco@linux.intel.com>
---
 mm/memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)
  

Comments

Ira Weiny Dec. 18, 2023, 3:34 a.m. UTC | #1
Fabio M. De Francesco wrote:

[snip]

> 
> Cc: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Fabio M. De Francesco <fabio.maria.de.francesco@linux.intel.com>
> ---
>  mm/memory.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 7d9f6b685032..88377a107fbe 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5852,7 +5852,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
>  			if (bytes > PAGE_SIZE-offset)
>  				bytes = PAGE_SIZE-offset;
>  
> -			maddr = kmap(page);
> +			maddr = kmap_local_page(page);
>  			if (write) {
>  				copy_to_user_page(vma, page, addr,
>  						  maddr + offset, buf, bytes);
> @@ -5861,8 +5861,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
>  				copy_from_user_page(vma, page, addr,
>  						    buf, maddr + offset, bytes);
>  			}
> -			kunmap(page);
> -			put_page(page);
> +			unmap_and_put_page(page, maddr);

Does this really have the same functionality?

Ira
  
Fabio M. De Francesco Dec. 18, 2023, 7:43 a.m. UTC | #2
On Monday, 18 December 2023 04:34:13 CET Ira Weiny wrote:
> Fabio M. De Francesco wrote:
> 
> [snip]
> 
> > Cc: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Fabio M. De Francesco
> > <fabio.maria.de.francesco@linux.intel.com> ---
> > 
> >  mm/memory.c | 5 ++---
> >  1 file changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 7d9f6b685032..88377a107fbe 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5852,7 +5852,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
> >  			if (bytes > PAGE_SIZE-offset)
> >  				bytes = PAGE_SIZE-offset;
> >  
> > -			maddr = kmap(page);
> > +			maddr = kmap_local_page(page);
> >  			if (write) {
> >  				copy_to_user_page(vma, page, addr,
> >  						  maddr + offset, buf, bytes);
> > @@ -5861,8 +5861,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
> >  				copy_from_user_page(vma, page, addr,
> >  						    buf, maddr + offset, bytes);
> >  			}
> > -			kunmap(page);
> > -			put_page(page);
> > +			unmap_and_put_page(page, maddr);
> 
> Does this really have the same functionality?
> 
> Ira

Do you have any specific reasons to say that? 

The unmap_and_put_page() helper was created by Al Viro (it was initially 
called put_and_unmap_page(); I sent a patch to rename it to its current 
name). He noticed that we have lots of kunmap_local() calls followed by 
put_page(). 

The implementation has since been changed (by Matthew, if I remember 
correctly).

My understanding of the current implementation is that unmap_and_put_page() 
calls folio_release_kmap(), passing it the folio that the page belongs to 
and the kernel virtual address returned by kmap_local_page().

folio_release_kmap() calls kunmap_local() and then folio_put(). The latter 
is called on the folio obtained by the unmap_and_put_page() wrapper and, if 
I'm not wrong, it drops a folio reference just like put_page() drops a page 
reference.

Am I missing something?

For further reference, please take a look at the following patch from Al 
Viro, which is modelled after my conversions in fs/sysv:
https://lore.kernel.org/all/20231213000849.2748576-4-viro@zeniv.linux.org.uk/

Thanks,

Fabio
  
Ira Weiny Dec. 20, 2023, 7:53 p.m. UTC | #3
Fabio M. De Francesco wrote:
> On Monday, 18 December 2023 04:34:13 CET Ira Weiny wrote:
> > Fabio M. De Francesco wrote:
> > 
> > [snip]
> > 
> > > Cc: Ira Weiny <ira.weiny@intel.com>
> > > Signed-off-by: Fabio M. De Francesco
> > > <fabio.maria.de.francesco@linux.intel.com> ---
> > > 
> > >  mm/memory.c | 5 ++---
> > >  1 file changed, 2 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memory.c b/mm/memory.c
> > > index 7d9f6b685032..88377a107fbe 100644
> > > --- a/mm/memory.c
> > > +++ b/mm/memory.c
> > > @@ -5852,7 +5852,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
> > >  			if (bytes > PAGE_SIZE-offset)
> > >  				bytes = PAGE_SIZE-offset;
> > >  
> > > -			maddr = kmap(page);
> > > +			maddr = kmap_local_page(page);
> > >  			if (write) {
> > >  				copy_to_user_page(vma, page, addr,
> > >  						  maddr + offset, buf, bytes);
> > > @@ -5861,8 +5861,7 @@ static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
> > >  				copy_from_user_page(vma, page, addr,
> > >  						    buf, maddr + offset, bytes);
> > >  			}
> > > -			kunmap(page);
> > > -			put_page(page);
> > > +			unmap_and_put_page(page, maddr);
> > 
> > Does this really have the same functionality?
> > 
> > Ira
> 
> Do you have any specific reasons to say that? 
> 
> The unmap_and_put_page() helper was created by Al Viro (it was initially 
> called put_and_unmap_page(); I sent a patch to rename it to its current 
> name). He noticed that we have lots of kunmap_local() calls followed by 
> put_page(). 
> 
> The implementation has since been changed (by Matthew, if I remember 
> correctly).
> 
> My understanding of the current implementation is that unmap_and_put_page() 
> calls folio_release_kmap(), passing it the folio that the page belongs to 
> and the kernel virtual address returned by kmap_local_page().
> 
> folio_release_kmap() calls kunmap_local() and then folio_put(). The latter 
> is called on the folio obtained by the unmap_and_put_page() wrapper and, if 
> I'm not wrong, it drops a folio reference just like put_page() drops a page 
> reference.

This is where my consternation came from.  I saw the folio_put() and did
not realize that get_page() now calls folio_get().

> 
> Am I missing something?

Nope, I just did not have time to trace code yesterday.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
  
Matthew Wilcox Dec. 20, 2023, 7:59 p.m. UTC | #4
On Wed, Dec 20, 2023 at 11:53:34AM -0800, Ira Weiny wrote:
> > My understanding of the current implementation is that unmap_and_put_page() 
> > calls folio_release_kmap(), passing it the folio that the page belongs to 
> > and the kernel virtual address returned by kmap_local_page().
> > 
> > folio_release_kmap() calls kunmap_local() and then folio_put(). The latter 
> > is called on the folio obtained by the unmap_and_put_page() wrapper and, if 
> > I'm not wrong, it drops a folio reference just like put_page() drops a page 
> > reference.
> 
> This is where my consternation came from.  I saw the folio_put() and did
> not realize that get_page() now calls folio_get().

That's not new.  See 86d234cb0499 which changed get_page() to call
folio_get(), but notice that it's doing the _exact same thing_ that
get_page() used to do.  And it's behaved this way since ddc58f27f9ee
in 2016.
  

Patch

diff --git a/mm/memory.c b/mm/memory.c
index 7d9f6b685032..88377a107fbe 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5852,7 +5852,7 @@  static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
 			if (bytes > PAGE_SIZE-offset)
 				bytes = PAGE_SIZE-offset;
 
-			maddr = kmap(page);
+			maddr = kmap_local_page(page);
 			if (write) {
 				copy_to_user_page(vma, page, addr,
 						  maddr + offset, buf, bytes);
@@ -5861,8 +5861,7 @@  static int __access_remote_vm(struct mm_struct *mm, unsigned long addr,
 				copy_from_user_page(vma, page, addr,
 						    buf, maddr + offset, bytes);
 			}
-			kunmap(page);
-			put_page(page);
+			unmap_and_put_page(page, maddr);
 		}
 		len -= bytes;
 		buf += bytes;