[PATCHv9,03/14] mm/page_alloc: Fake unaccepted memory

Message ID 20230330114956.20342-4-kirill.shutemov@linux.intel.com
State New
Series mm, x86/cc: Implement support for unaccepted memory

Commit Message

Kirill A. Shutemov March 30, 2023, 11:49 a.m. UTC
  For testing purposes, it is useful to fake unaccepted memory in the
system. It helps to understand unaccepted memory overhead to the page
allocator.

The patch allows treating memory above the specified physical memory
address as unaccepted.

The change only fakes unaccepted memory for page allocator. Memblock is
not affected.

It also assumes that arch-provided accept_memory() on already accepted
memory is a nop.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 mm/page_alloc.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
  

Comments

Vlastimil Babka April 3, 2023, 1:39 p.m. UTC | #1
On 3/30/23 13:49, Kirill A. Shutemov wrote:
> For testing purposes, it is useful to fake unaccepted memory in the
> system. It helps to understand unaccepted memory overhead to the page
> allocator.

Ack on being useful for testing, but the question is if we want to also
merge this patch into mainline as it is?

> The patch allows to treat memory above the specified physical memory
> address as unaccepted.
> 
> The change only fakes unaccepted memory for page allocator. Memblock is
> not affected.
> 
> It also assumes that arch-provided accept_memory() on already accepted
> memory is a nop.

I guess to be in mainline it would have to at least gracefully handle the
case of accept_memory actually not being a nop, and running on a system with
actual unaccepted memory (probably by ignoring the parameter in such case).
Then also the parameter would have to be documented.

Speaking of documented parameters, I found at least two that seem a more
generic variant of this (but I didn't look closely if that makes sense):

efi_fake_mem=   nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
    Add arbitrary attribute to specific memory range by
    updating original EFI memory map.

memmap=<size>%<offset>-<oldtype>+<newtype>
    [KNL,ACPI] Convert memory within the specified region
    from <oldtype> to <newtype>. If "-<oldtype>" is left

Would any of those be usable for this usecase?

> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  mm/page_alloc.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d62fcb2f28bd..509a93b7e5af 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7213,6 +7213,8 @@ static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
>  
>  static bool lazy_accept = true;
>  
> +static unsigned long fake_unaccepted_start = -1UL;
> +
>  static int __init accept_memory_parse(char *p)
>  {
>  	if (!strcmp(p, "lazy")) {
> @@ -7227,11 +7229,30 @@ static int __init accept_memory_parse(char *p)
>  }
>  early_param("accept_memory", accept_memory_parse);
>  
> +static int __init fake_unaccepted_start_parse(char *p)
> +{
> +	if (!p)
> +		return -EINVAL;
> +
> +	fake_unaccepted_start = memparse(p, &p);
> +
> +	if (*p != '\0') {
> +		fake_unaccepted_start = -1UL;
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +early_param("fake_unaccepted_start", fake_unaccepted_start_parse);
> +
>  static bool page_contains_unaccepted(struct page *page, unsigned int order)
>  {
>  	phys_addr_t start = page_to_phys(page);
>  	phys_addr_t end = start + (PAGE_SIZE << order);
>  
> +	if (start >= fake_unaccepted_start)
> +		return true;
> +
>  	return range_contains_unaccepted_memory(start, end);
>  }
>
  
Kirill A. Shutemov April 3, 2023, 2:39 p.m. UTC | #2
On Mon, Apr 03, 2023 at 03:39:53PM +0200, Vlastimil Babka wrote:
> On 3/30/23 13:49, Kirill A. Shutemov wrote:
> > For testing purposes, it is useful to fake unaccepted memory in the
> > system. It helps to understand unaccepted memory overhead to the page
> > allocator.
> 
> Ack on being useful for testing, but the question is if we want to also
> merge this patch into mainline as it is?

I don't insist on getting it upstream, but it can be handy to debug
related bugs in the future.

> > The patch allows to treat memory above the specified physical memory
> > address as unaccepted.
> > 
> > The change only fakes unaccepted memory for page allocator. Memblock is
> > not affected.
> > 
> > It also assumes that arch-provided accept_memory() on already accepted
> > memory is a nop.
> 
> I guess to be in mainline it would have to at least gracefully handle the
> case of accept_memory actually not being a nop, and running on a system with
> actual unaccepted memory (probably by ignoring the parameter in such case).
> Then also the parameter would have to be documented.

As it is written now, accept_memory() is a nop on a system with real
unaccepted memory if the memory is already accepted. Arch-specific code
checks against its own records to see if the memory needs accepting. If
not, it just returns.

And the option will not interfere with unaccepted memory declared by EFI
memmap. It can extend it, but that's it.

Looks safe to me.
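
The "nop on already accepted memory" assumption can be illustrated with a small userspace model. This is only a sketch, not the arch code from the series: the 2 MiB tracking unit, the array-based record and all model_* names are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_UNIT_SHIFT 21	/* assumed 2 MiB tracking granularity */
#define MODEL_NR_UNITS   64	/* the model covers 128 MiB */

/* Per-unit record: true means the unit still needs accepting. */
static bool model_unaccepted[MODEL_NR_UNITS];
/* Counts how often the (expensive) accept operation really ran. */
static unsigned int model_expensive_accepts;

/* Mark every unit overlapping [start, end) as unaccepted. */
static void model_mark_unaccepted(uint64_t start, uint64_t end)
{
	for (uint64_t u = start >> MODEL_UNIT_SHIFT;
	     u < MODEL_NR_UNITS && (u << MODEL_UNIT_SHIFT) < end; u++)
		model_unaccepted[u] = true;
}

/*
 * Accept [start, end). Units that the record already shows as accepted
 * are skipped, so calling this on accepted memory does no work.
 */
static void model_accept_memory(uint64_t start, uint64_t end)
{
	for (uint64_t u = start >> MODEL_UNIT_SHIFT;
	     u < MODEL_NR_UNITS && (u << MODEL_UNIT_SHIFT) < end; u++) {
		if (!model_unaccepted[u])
			continue;	/* already accepted: nothing to do */
		model_expensive_accepts++;
		model_unaccepted[u] = false;
	}
}
```

In this model a second call over the same range, or a call over memory that was never marked unaccepted (which is what the fake option would trigger), leaves the counter unchanged.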

> Speaking of documented parameters, I found at least two that seem a more
> generic variant of this (but I didn't look closely if that makes sense):
> 
> efi_fake_mem=   nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
>     Add arbitrary attribute to specific memory range by
>     updating original EFI memory map.
> 
> memmap=<size>%<offset>-<oldtype>+<newtype>
>     [KNL,ACPI] Convert memory within the specified region
>     from <oldtype> to <newtype>. If "-<oldtype>" is left
> 
> Would any of those be usable for this usecase?

Oh. I missed them. Will take a closer look.

> 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  mm/page_alloc.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index d62fcb2f28bd..509a93b7e5af 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7213,6 +7213,8 @@ static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
> >  
> >  static bool lazy_accept = true;
> >  
> > +static unsigned long fake_unaccepted_start = -1UL;
> > +
> >  static int __init accept_memory_parse(char *p)
> >  {
> >  	if (!strcmp(p, "lazy")) {
> > @@ -7227,11 +7229,30 @@ static int __init accept_memory_parse(char *p)
> >  }
> >  early_param("accept_memory", accept_memory_parse);
> >  
> > +static int __init fake_unaccepted_start_parse(char *p)
> > +{
> > +	if (!p)
> > +		return -EINVAL;
> > +
> > +	fake_unaccepted_start = memparse(p, &p);
> > +
> > +	if (*p != '\0') {
> > +		fake_unaccepted_start = -1UL;
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +early_param("fake_unaccepted_start", fake_unaccepted_start_parse);
> > +
> >  static bool page_contains_unaccepted(struct page *page, unsigned int order)
> >  {
> >  	phys_addr_t start = page_to_phys(page);
> >  	phys_addr_t end = start + (PAGE_SIZE << order);
> >  
> > +	if (start >= fake_unaccepted_start)
> > +		return true;
> > +
> >  	return range_contains_unaccepted_memory(start, end);
> >  }
> >  
>
  
David Hildenbrand April 3, 2023, 2:43 p.m. UTC | #3
On 30.03.23 13:49, Kirill A. Shutemov wrote:
> For testing purposes, it is useful to fake unaccepted memory in the
> system. It helps to understand unaccepted memory overhead to the page
> allocator.
> 
> The patch allows to treat memory above the specified physical memory
> address as unaccepted.
> 
> The change only fakes unaccepted memory for page allocator. Memblock is
> not affected.
> 
> It also assumes that arch-provided accept_memory() on already accepted
> memory is a nop.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>   mm/page_alloc.c | 21 +++++++++++++++++++++
>   1 file changed, 21 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d62fcb2f28bd..509a93b7e5af 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7213,6 +7213,8 @@ static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
>   
>   static bool lazy_accept = true;
>   
> +static unsigned long fake_unaccepted_start = -1UL;
> +
>   static int __init accept_memory_parse(char *p)
>   {
>   	if (!strcmp(p, "lazy")) {
> @@ -7227,11 +7229,30 @@ static int __init accept_memory_parse(char *p)
>   }
>   early_param("accept_memory", accept_memory_parse);
>   
> +static int __init fake_unaccepted_start_parse(char *p)
> +{
> +	if (!p)
> +		return -EINVAL;
> +
> +	fake_unaccepted_start = memparse(p, &p);
> +
> +	if (*p != '\0') {
> +		fake_unaccepted_start = -1UL;
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +early_param("fake_unaccepted_start", fake_unaccepted_start_parse);
> +
>   static bool page_contains_unaccepted(struct page *page, unsigned int order)
>   {
>   	phys_addr_t start = page_to_phys(page);
>   	phys_addr_t end = start + (PAGE_SIZE << order);
>   
> +	if (start >= fake_unaccepted_start)
> +		return true;
> +
>   	return range_contains_unaccepted_memory(start, end);
>   }
>   

The "unpleasant" thing about this is that page_contains_unaccepted()
cannot be used for sanity checks, because the fake result is static:
accepting the memory does not change it.

For example, something like

if (page_contains_unaccepted(page, 0))
	accept_memory(page, 0);
BUG_ON(page_contains_unaccepted(page, 0));

would work on real hardware, but not with the fake variant.
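
The static-result concern can be reproduced in a userspace model of the patched check. This is a sketch under assumptions: the per-page flag array stands in for the real arch records, and every model_* name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_NR_PAGES   16

/* Mirrors the patch default of -1UL: faking is disabled. */
static uint64_t model_fake_unaccepted_start = UINT64_MAX;
/* Stand-in for the real record: true means the page is unaccepted. */
static bool model_unaccepted[MODEL_NR_PAGES];

static bool model_range_contains_unaccepted(uint64_t start, uint64_t end)
{
	for (uint64_t pfn = start >> MODEL_PAGE_SHIFT;
	     pfn < MODEL_NR_PAGES && (pfn << MODEL_PAGE_SHIFT) < end; pfn++)
		if (model_unaccepted[pfn])
			return true;
	return false;
}

/* Same shape as the patched page_contains_unaccepted(). */
static bool model_page_contains_unaccepted(uint64_t start, unsigned int order)
{
	uint64_t end = start + (UINT64_C(1) << (MODEL_PAGE_SHIFT + order));

	if (start >= model_fake_unaccepted_start)
		return true;	/* static: accepting never changes this */

	return model_range_contains_unaccepted(start, end);
}

static void model_accept_memory(uint64_t start, uint64_t end)
{
	for (uint64_t pfn = start >> MODEL_PAGE_SHIFT;
	     pfn < MODEL_NR_PAGES && (pfn << MODEL_PAGE_SHIFT) < end; pfn++)
		model_unaccepted[pfn] = false;
}
```

Below the threshold, accepting a page flips the answer from true to false. At or above the threshold the answer stays true after accepting, so a check that expects the page to be accepted afterwards would trip.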
  
Kirill A. Shutemov April 3, 2023, 2:47 p.m. UTC | #4
On Mon, Apr 03, 2023 at 04:43:08PM +0200, David Hildenbrand wrote:
> On 30.03.23 13:49, Kirill A. Shutemov wrote:
> > For testing purposes, it is useful to fake unaccepted memory in the
> > system. It helps to understand unaccepted memory overhead to the page
> > allocator.
> > 
> > The patch allows to treat memory above the specified physical memory
> > address as unaccepted.
> > 
> > The change only fakes unaccepted memory for page allocator. Memblock is
> > not affected.
> > 
> > It also assumes that arch-provided accept_memory() on already accepted
> > memory is a nop.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >   mm/page_alloc.c | 21 +++++++++++++++++++++
> >   1 file changed, 21 insertions(+)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index d62fcb2f28bd..509a93b7e5af 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7213,6 +7213,8 @@ static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
> >   static bool lazy_accept = true;
> > +static unsigned long fake_unaccepted_start = -1UL;
> > +
> >   static int __init accept_memory_parse(char *p)
> >   {
> >   	if (!strcmp(p, "lazy")) {
> > @@ -7227,11 +7229,30 @@ static int __init accept_memory_parse(char *p)
> >   }
> >   early_param("accept_memory", accept_memory_parse);
> > +static int __init fake_unaccepted_start_parse(char *p)
> > +{
> > +	if (!p)
> > +		return -EINVAL;
> > +
> > +	fake_unaccepted_start = memparse(p, &p);
> > +
> > +	if (*p != '\0') {
> > +		fake_unaccepted_start = -1UL;
> > +		return -EINVAL;
> > +	}
> > +
> > +	return 0;
> > +}
> > +early_param("fake_unaccepted_start", fake_unaccepted_start_parse);
> > +
> >   static bool page_contains_unaccepted(struct page *page, unsigned int order)
> >   {
> >   	phys_addr_t start = page_to_phys(page);
> >   	phys_addr_t end = start + (PAGE_SIZE << order);
> > +	if (start >= fake_unaccepted_start)
> > +		return true;
> > +
> >   	return range_contains_unaccepted_memory(start, end);
> >   }
> 
> The "unpleasant" thing about this is, that page_contains_unaccepted() could
> not be used for sanity checks because the result is static.
> 
> For example, something like
> 
> if (page_contains_unaccepted(page, 0))
> 	accept_memory(page, 0);
> BUG_ON(page_contains_unaccepted(page, 0));
> 
> Would work on real hardware, however, not for the fake variant.

Need for raw_page_contains_unaccepted()? :P
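
One way to read this suggestion, sketched as a userspace model: keep a "raw" helper that consults only the real record, and layer the fake threshold on top in a wrapper. Nothing like this is in the posted patch; the split, the flag array and all model_* names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (UINT64_C(1) << MODEL_PAGE_SHIFT)
#define MODEL_NR_PAGES   16

static uint64_t model_fake_unaccepted_start = UINT64_MAX;	/* disabled */
static bool model_unaccepted[MODEL_NR_PAGES];	/* stand-in real record */

/* Raw variant: only the real state, suitable for sanity checks. */
static bool model_raw_page_contains_unaccepted(uint64_t start,
					       unsigned int order)
{
	uint64_t end = start + (MODEL_PAGE_SIZE << order);

	for (uint64_t pfn = start >> MODEL_PAGE_SHIFT;
	     pfn < MODEL_NR_PAGES && (pfn << MODEL_PAGE_SHIFT) < end; pfn++)
		if (model_unaccepted[pfn])
			return true;
	return false;
}

/* Allocator-facing variant: the fake threshold is applied on top. */
static bool model_page_contains_unaccepted(uint64_t start, unsigned int order)
{
	if (start >= model_fake_unaccepted_start)
		return true;

	return model_raw_page_contains_unaccepted(start, order);
}
```

With this split, a sanity check after accepting would call the raw helper, while the allocator path keeps seeing the faked answer.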
  
Kirill A. Shutemov April 3, 2023, 3:50 p.m. UTC | #5
On Mon, Apr 03, 2023 at 05:39:15PM +0300, Kirill A. Shutemov wrote:
> On Mon, Apr 03, 2023 at 03:39:53PM +0200, Vlastimil Babka wrote:
> > On 3/30/23 13:49, Kirill A. Shutemov wrote:
> > > For testing purposes, it is useful to fake unaccepted memory in the
> > > system. It helps to understand unaccepted memory overhead to the page
> > > allocator.
> > 
> > Ack on being useful for testing, but the question is if we want to also
> > merge this patch into mainline as it is?
> 
> I don't insist on getting it upstream, but it can be handy to debug
> related bugs in the future.
> 
> > > The patch allows to treat memory above the specified physical memory
> > > address as unaccepted.
> > > 
> > > The change only fakes unaccepted memory for page allocator. Memblock is
> > > not affected.
> > > 
> > > It also assumes that arch-provided accept_memory() on already accepted
> > > memory is a nop.
> > 
> > I guess to be in mainline it would have to at least gracefully handle the
> > case of accept_memory actually not being a nop, and running on a system with
> > actual unaccepted memory (probably by ignoring the parameter in such case).
> > Then also the parameter would have to be documented.
> 
> As it is written now, accept_memory() is nop on system with real
> unaccepted memory if the memory is already accepted. Arch-specific code
> will check against own records to see if the memory needs accepting. If
> not, just return.
> 
> And the option will not interfere with unaccepted memory declared by EFI
> memmap. It can extend it, but that's it.
> 
> Looks safe to me.
> 
> > Speaking of documented parameters, I found at least two that seem a more
> > generic variant of this (but I didn't look closely if that makes sense):
> > 
> > efi_fake_mem=   nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
> >     Add arbitrary attribute to specific memory range by
> >     updating original EFI memory map.

As of now, efi_fake_mem= can adjust attributes of memory. Unaccepted is
a type of memory, not an attribute. I guess we could allow it to override
the type too, but the syntax is going to be fun.

> > memmap=<size>%<offset>-<oldtype>+<newtype>
> >     [KNL,ACPI] Convert memory within the specified region
> >     from <oldtype> to <newtype>. If "-<oldtype>" is left

It overrides the E820 map, but unaccepted memory is not represented there.
Unaccepted memory is just RAM in E820.
  
Kirill A. Shutemov April 14, 2023, 10:19 a.m. UTC | #6
On Mon, Apr 03, 2023 at 06:50:11PM +0300, Kirill A. Shutemov wrote:
> On Mon, Apr 03, 2023 at 05:39:15PM +0300, Kirill A. Shutemov wrote:
> > On Mon, Apr 03, 2023 at 03:39:53PM +0200, Vlastimil Babka wrote:
> > > On 3/30/23 13:49, Kirill A. Shutemov wrote:
> > > > For testing purposes, it is useful to fake unaccepted memory in the
> > > > system. It helps to understand unaccepted memory overhead to the page
> > > > allocator.
> > > 
> > > Ack on being useful for testing, but the question is if we want to also
> > > merge this patch into mainline as it is?
> > 
> > I don't insist on getting it upstream, but it can be handy to debug
> > related bugs in the future.
> > 
> > > > The patch allows to treat memory above the specified physical memory
> > > > address as unaccepted.
> > > > 
> > > > The change only fakes unaccepted memory for page allocator. Memblock is
> > > > not affected.
> > > > 
> > > > It also assumes that arch-provided accept_memory() on already accepted
> > > > memory is a nop.
> > > 
> > > I guess to be in mainline it would have to at least gracefully handle the
> > > case of accept_memory actually not being a nop, and running on a system with
> > > actual unaccepted memory (probably by ignoring the parameter in such case).
> > > Then also the parameter would have to be documented.
> > 
> > As it is written now, accept_memory() is nop on system with real
> > unaccepted memory if the memory is already accepted. Arch-specific code
> > will check against own records to see if the memory needs accepting. If
> > not, just return.
> > 
> > And the option will not interfere with unaccepted memory declared by EFI
> > memmap. It can extend it, but that's it.
> > 
> > Looks safe to me.
> > 
> > > Speaking of documented parameters, I found at least two that seem a more
> > > generic variant of this (but I didn't look closely if that makes sense):
> > > 
> > > efi_fake_mem=   nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
> > >     Add arbitrary attribute to specific memory range by
> > >     updating original EFI memory map.
> 
> As of now, efi_fake_mem= can adjust attributes of memory. Unaccepted is
> type of memory, not an attribute. I guess we can allow it override type
> too. But syntax is going to be fun.

efi_fake_mem= is applied too late. The bitmap that represents unaccepted
memory for the kernel is created at the kernel decompression stage, but
efi_fake_mem= is handled in the main kernel.

I don't think pushing efi_fake_mem= into the decompressor makes sense. I
would rather drop the feature altogether.
  

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d62fcb2f28bd..509a93b7e5af 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7213,6 +7213,8 @@  static DEFINE_STATIC_KEY_FALSE(zones_with_unaccepted_pages);
 
 static bool lazy_accept = true;
 
+static unsigned long fake_unaccepted_start = -1UL;
+
 static int __init accept_memory_parse(char *p)
 {
 	if (!strcmp(p, "lazy")) {
@@ -7227,11 +7229,30 @@  static int __init accept_memory_parse(char *p)
 }
 early_param("accept_memory", accept_memory_parse);
 
+static int __init fake_unaccepted_start_parse(char *p)
+{
+	if (!p)
+		return -EINVAL;
+
+	fake_unaccepted_start = memparse(p, &p);
+
+	if (*p != '\0') {
+		fake_unaccepted_start = -1UL;
+		return -EINVAL;
+	}
+
+	return 0;
+}
+early_param("fake_unaccepted_start", fake_unaccepted_start_parse);
+
 static bool page_contains_unaccepted(struct page *page, unsigned int order)
 {
 	phys_addr_t start = page_to_phys(page);
 	phys_addr_t end = start + (PAGE_SIZE << order);
 
+	if (start >= fake_unaccepted_start)
+		return true;
+
 	return range_contains_unaccepted_memory(start, end);
 }
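
The parsing behaviour of fake_unaccepted_start_parse() can be tried out in userspace with the sketch below. It is a model, not kernel code: model_memparse() is a simplified stand-in for memparse() that handles only the K, M and G suffixes, and all model_* names are assumptions for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors the patch default of -1UL: nothing is faked. */
static uint64_t model_fake_unaccepted_start = UINT64_MAX;

/* Simplified stand-in for memparse(): number plus optional K/M/G. */
static uint64_t model_memparse(const char *s, char **end)
{
	uint64_t value = strtoull(s, end, 0);
	unsigned int shift = 0;

	switch (**end) {
	case 'K': case 'k': shift = 10; break;
	case 'M': case 'm': shift = 20; break;
	case 'G': case 'g': shift = 30; break;
	default: break;
	}
	if (shift)
		(*end)++;	/* consume the suffix character */

	return value << shift;
}

/* Same control flow as the patch: reject trailing garbage, reset on error. */
static int model_fake_unaccepted_start_parse(const char *p)
{
	char *end;
	uint64_t value;

	if (!p)
		return -EINVAL;

	value = model_memparse(p, &end);
	if (*end != '\0') {
		model_fake_unaccepted_start = UINT64_MAX;
		return -EINVAL;
	}

	model_fake_unaccepted_start = value;
	return 0;
}
```

With the patch applied, booting with something like fake_unaccepted_start=4G on the kernel command line would make the page allocator treat every page at or above 4 GiB as unaccepted, while a malformed value leaves the feature disabled.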