[RFC,06/34] x86/boot: Use consistent value for iomem_resource.end

Message ID 20240222183934.033178B5@davehans-spike.ostc.intel.com
State New
Series x86: Rework system-wide configuration masquerading as per-cpu data

Commit Message

Dave Hansen Feb. 22, 2024, 6:39 p.m. UTC
  From: Dave Hansen <dave.hansen@linux.intel.com>

The 'struct cpuinfo_x86' values (including 'boot_cpu_info') get
written and overwritten rather randomly.  They are not stable
during early boot and readers end up getting a random mishmash
of hard-coded defaults or CPUID-provided values based on when
the values are read.

iomem_resource.end is one of these users.  Because of where it
is called, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS
which is (mostly) a compile-time default.  But
iomem_resource.end is never updated if the runtime CPUID
x86_phys_bits is lower.

Set iomem_resource.end to the compile-time value explicitly.
It does not need to be precise as this is mostly to ensure
that insane values can't be reserved in 'iomem_resource'.

Make MAX_PHYSMEM_BITS available outside of sparsemem
configurations by removing the #ifdef CONFIG_SPARSEMEM in the
header.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
---

 b/arch/x86/include/asm/sparsemem.h |    3 ---
 b/arch/x86/kernel/setup.c          |   10 +++++++++-
 2 files changed, 9 insertions(+), 4 deletions(-)
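
To make the arithmetic in the changelog concrete, here is a minimal standalone sketch (userspace C, not kernel code). The helper names max_physmem_bits() and phys_end() are hypothetical and exist only for this illustration; the 52/46 split mirrors the MAX_PHYSMEM_BITS definition visible in the sparsemem.h hunk of this patch, and the 39-bit value in the usage note is an assumed example of a lower CPUID-reported width.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for MAX_PHYSMEM_BITS on x86-64:
 * 52 bits with 5-level paging enabled, otherwise 46.
 */
static unsigned int max_physmem_bits(bool l5_enabled)
{
	return l5_enabled ? 52 : 46;
}

/*
 * Hypothetical helper: the last physical address that fits in
 * 'bits' address bits, computed the same way as iomem_resource.end.
 */
static uint64_t phys_end(unsigned int bits)
{
	/* Shifting a 64-bit value by 64 or more is undefined behaviour. */
	assert(bits > 0 && bits < 64);
	return (UINT64_C(1) << bits) - 1;
}
```

With these helpers, phys_end(max_physmem_bits(false)) is the 46-bit limit, while phys_end(39) would be the much smaller limit on a hypothetical CPU reporting 39 physical address bits. The gap between the two is the "too big" range the changelog describes as harmless.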
  

Comments

Kai Huang Feb. 27, 2024, 10:59 a.m. UTC | #1
On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> The 'struct cpuinfo_x86' values (including 'boot_cpu_info') get
> written and overwritten rather randomly.  They are not stable
> during early boot and readers end up getting a random mishmash
> of hard-coded defaults or CPUID-provided values based on when
> the values are read.
> 
> iomem_resource.end is one of these users.  Because of where it
> is called, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS
> which is (mostly) a compile-time default.  But
> iomem_resource.end is never updated if the runtime CPUID
> x86_phys_bits is lower.
> 
> Set iomem_resource.end to the compile-time value explicitly.
> It does not need to be precise as this is mostly to ensure
> that insane values can't be reserved in 'iomem_resource'.
> 
> Make MAX_PHYSMEM_BITS available outside of sparsemem
> configurations by removing the #ifdef CONFIG_SPARSEMEM in the
> header.
> 
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
> 
>  b/arch/x86/include/asm/sparsemem.h |    3 ---
>  b/arch/x86/kernel/setup.c          |   10 +++++++++-
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
> --- a/arch/x86/kernel/setup.c~iomem_resource_end	2024-02-22 10:08:51.048554948 -0800
> +++ b/arch/x86/kernel/setup.c	2024-02-22 10:21:04.485531464 -0800
> @@ -51,6 +51,7 @@
>  #include <asm/pci-direct.h>
>  #include <asm/prom.h>
>  #include <asm/proto.h>
> +#include <asm/sparsemem.h>
>  #include <asm/thermal.h>
>  #include <asm/unwind.h>
>  #include <asm/vsyscall.h>
> @@ -813,7 +814,14 @@ void __init setup_arch(char **cmdline_p)
>  	 */
>  	early_reserve_memory();
>  
> -	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
> +	/*
> +	 * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
> +	 * even if .x86_phys_bits was eventually lowered below that.
> +	 * But that was evidently harmless, so leave it too big, but
> +	 * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
> +	 * trip through .x86_phys_bits.
> +	 */
> +	iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;

Paolo's patchset to move MKTME keyid bits detection to early_cpu_init() was
merged to tip:x86/urgent, so it looks like it will land in Linus's tree before
this series:

https://lore.kernel.org/lkml/eff34df2-fdc1-4ee0-bb8d-90da386b7cb6@intel.com/T/

Paolo's series actually moves the reduction of x86_phys_bits to before the
point where iomem_resource.end is set here, so after rebasing, the
changelog/comment no longer seems to apply.

Perhaps we can get rid of this patch and just set iomem_resource.end based on
x86_phys_bits()?
  
Zhang, Rui Feb. 28, 2024, 2:22 p.m. UTC | #2
On Tue, 2024-02-27 at 10:59 +0000, Huang, Kai wrote:
> On Thu, 2024-02-22 at 10:39 -0800, Dave Hansen wrote:
> > From: Dave Hansen <dave.hansen@linux.intel.com>
> > 
> > The 'struct cpuinfo_x86' values (including 'boot_cpu_info') get
> > written and overwritten rather randomly.  They are not stable
> > during early boot and readers end up getting a random mishmash
> > of hard-coded defaults or CPUID-provided values based on when
> > the values are read.
> > 
> > iomem_resource.end is one of these users.  Because of where it
> > is called, it ended up seeing .x86_phys_bits==MAX_PHYSMEM_BITS
> > which is (mostly) a compile-time default.  But
> > iomem_resource.end is never updated if the runtime CPUID
> > x86_phys_bits is lower.
> > 
> > Set iomem_resource.end to the compile-time value explicitly.
> > It does not need to be precise as this is mostly to ensure
> > that insane values can't be reserved in 'iomem_resource'.
> > 
> > Make MAX_PHYSMEM_BITS available outside of sparsemem
> > configurations by removing the #ifdef CONFIG_SPARSEMEM in the
> > header.
> > 
> > Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
> > ---
> > 
> >  b/arch/x86/include/asm/sparsemem.h |    3 ---
> >  b/arch/x86/kernel/setup.c          |   10 +++++++++-
> >  2 files changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
> > --- a/arch/x86/kernel/setup.c~iomem_resource_end        2024-02-22 10:08:51.048554948 -0800
> > +++ b/arch/x86/kernel/setup.c   2024-02-22 10:21:04.485531464 -0800
> > @@ -51,6 +51,7 @@
> >  #include <asm/pci-direct.h>
> >  #include <asm/prom.h>
> >  #include <asm/proto.h>
> > +#include <asm/sparsemem.h>
> >  #include <asm/thermal.h>
> >  #include <asm/unwind.h>
> >  #include <asm/vsyscall.h>
> > @@ -813,7 +814,14 @@ void __init setup_arch(char **cmdline_p)
> >          */
> >         early_reserve_memory();
> >  
> > -       iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
> > +       /*
> > +        * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
> > +        * even if .x86_phys_bits was eventually lowered below that.
> > +        * But that was evidently harmless, so leave it too big, but
> > +        * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
> > +        * trip through .x86_phys_bits.
> > +        */
> > +       iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;
> 
> Paolo's patchset to move MKTME keyid bits detection to early_cpu_init() was
> merged to tip:x86/urgent, so it looks like it will land in Linus's tree before
> this series:
> 
> https://lore.kernel.org/lkml/eff34df2-fdc1-4ee0-bb8d-90da386b7cb6@intel.com/T/
> 
> Paolo's series actually moves the reduction of x86_phys_bits to before the
> point where iomem_resource.end is set here, so after rebasing, the
> changelog/comment no longer seems to apply.

My understanding is that the call order below always holds,
setup_arch()
	early_cpu_init()
		get_cpu_address_sizes()
	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
with or without the above patch.

> 
> Perhaps we can get rid of this patch and just set iomem_resource.end based on
> x86_phys_bits()?
> 
Agreed.

thanks,
rui
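
The call order described in this reply can be modelled with a small standalone sketch. This is a toy userspace model, not kernel code: every mock_* name is hypothetical, the "CPUID" width is simply passed in as a parameter, and the initial value of 46 is an assumed compile-time default. It only shows that when the address-size detection runs before the assignment, the assignment sees the runtime value.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_MAX_PHYSMEM_BITS 46	/* assumed compile-time default */

/* Hypothetical stand-ins for the boot CPU data and iomem_resource. */
static unsigned int mock_phys_bits = MOCK_MAX_PHYSMEM_BITS;
static uint64_t mock_iomem_end;

/* Models get_cpu_address_sizes(): adopt the width "CPUID" reports. */
static void mock_get_cpu_address_sizes(unsigned int cpuid_bits)
{
	mock_phys_bits = cpuid_bits;
}

/* Models early_cpu_init(), which calls the address-size detection. */
static void mock_early_cpu_init(unsigned int cpuid_bits)
{
	mock_get_cpu_address_sizes(cpuid_bits);
}

/* Models the relevant part of setup_arch(): detect first, then assign. */
static void mock_setup_arch(unsigned int cpuid_bits)
{
	mock_early_cpu_init(cpuid_bits);

	/* Shifting a 64-bit value by 64 or more is undefined behaviour. */
	assert(mock_phys_bits > 0 && mock_phys_bits < 64);
	mock_iomem_end = (UINT64_C(1) << mock_phys_bits) - 1;
}
```

Calling mock_setup_arch(39) in this model leaves mock_iomem_end at the 39-bit limit rather than the 46-bit default, which is the behaviour one would expect if iomem_resource.end keeps using x86_phys_bits() under the ordering above.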
  

Patch

diff -puN arch/x86/kernel/setup.c~iomem_resource_end arch/x86/kernel/setup.c
--- a/arch/x86/kernel/setup.c~iomem_resource_end	2024-02-22 10:08:51.048554948 -0800
+++ b/arch/x86/kernel/setup.c	2024-02-22 10:21:04.485531464 -0800
@@ -51,6 +51,7 @@ 
 #include <asm/pci-direct.h>
 #include <asm/prom.h>
 #include <asm/proto.h>
+#include <asm/sparsemem.h>
 #include <asm/thermal.h>
 #include <asm/unwind.h>
 #include <asm/vsyscall.h>
@@ -813,7 +814,14 @@  void __init setup_arch(char **cmdline_p)
 	 */
 	early_reserve_memory();
 
-	iomem_resource.end = (1ULL << x86_phys_bits()) - 1;
+	/*
+	 * This was too big before.  It ended up getting MAX_PHYSMEM_BITS
+	 * even if .x86_phys_bits was eventually lowered below that.
+	 * But that was evidently harmless, so leave it too big, but
+	 * set it explicitly to MAX_PHYSMEM_BITS instead of taking a
+	 * trip through .x86_phys_bits.
+	 */
+	iomem_resource.end = (1ULL << MAX_PHYSMEM_BITS) - 1;
 	e820__memory_setup();
 	parse_setup_data();
 
diff -puN arch/x86/include/asm/sparsemem.h~iomem_resource_end arch/x86/include/asm/sparsemem.h
--- a/arch/x86/include/asm/sparsemem.h~iomem_resource_end	2024-02-22 10:19:56.842831828 -0800
+++ b/arch/x86/include/asm/sparsemem.h	2024-02-22 10:20:21.207804806 -0800
@@ -4,7 +4,6 @@ 
 
 #include <linux/types.h>
 
-#ifdef CONFIG_SPARSEMEM
 /*
  * generic non-linear memory support:
  *
@@ -29,8 +28,6 @@ 
 # define MAX_PHYSMEM_BITS	(pgtable_l5_enabled() ? 52 : 46)
 #endif
 
-#endif /* CONFIG_SPARSEMEM */
-
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_NUMA_KEEP_MEMINFO
 extern int phys_to_target_node(phys_addr_t start);