x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM

Message ID 20230811053137.2789-1-decui@microsoft.com
State New
Series x86/hyperv: Fix the detection of E820_TYPE_PRAM in a Gen2 VM

Commit Message

Dexuan Cui Aug. 11, 2023, 5:31 a.m. UTC
A Gen2 VM doesn't support legacy PCI/PCIe, so both raw_pci_ops and
raw_pci_ext_ops are NULL, and pci_subsys_init() -> pcibios_init()
doesn't call pcibios_resource_survey() -> e820__reserve_resources_late();
as a result, any emulated persistent memory of E820_TYPE_PRAM (12) via
the kernel parameter memmap=nn[KMG]!ss is not added into iomem_resource
and hence can't be detected by register_e820_pmem().
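
For readers unfamiliar with the memmap=nn[KMG]!ss syntax mentioned above, the following is a minimal user-space sketch of how such an argument splits into a size (nn) and a start address (ss). It is not the kernel's parser: struct pram_range, parse_size() and parse_memmap_pram() are hypothetical names invented for this illustration, and only the K/M/G suffixes are handled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical holder for one parsed "nn!ss" entry (not a kernel type). */
struct pram_range {
	uint64_t start;	/* ss: physical start address */
	uint64_t size;	/* nn: length in bytes */
};

/*
 * Parse a number with an optional K/M/G suffix.
 * On return, *end points just past what was consumed; *end == s means
 * no digits were found.
 */
static uint64_t parse_size(const char *s, const char **end)
{
	char *p;
	uint64_t val = strtoull(s, &p, 0);

	if (p == s) {		/* no digits consumed */
		*end = s;
		return 0;
	}

	switch (*p) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		p++;
		break;
	default:
		break;
	}

	*end = p;
	return val;
}

/* Return true and fill *out if arg has the form "nn[KMG]!ss[KMG]". */
static bool parse_memmap_pram(const char *arg, struct pram_range *out)
{
	const char *p, *q;
	uint64_t size, start;

	size = parse_size(arg, &p);
	if (p == arg || *p != '!')	/* '!' selects the PRAM form */
		return false;

	start = parse_size(p + 1, &q);
	if (q == p + 1 || *q != '\0')
		return false;

	out->size = size;
	out->start = start;
	return true;
}
```

With this sketch, "4G!12G" describes a 4 GiB range starting at the 12 GiB physical address, while an argument using a different separator such as "4G@12G" is rejected because it is not the PRAM form.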

Fix this by directly calling e820__reserve_resources_late() in
hv_pci_init(), which is called from arch_initcall(pci_arch_init).

It's ok to move a Gen2 VM's e820__reserve_resources_late() from
subsys_initcall(pci_subsys_init) to arch_initcall(pci_arch_init) because
the code in-between doesn't depend on the E820 resources.
e820__reserve_resources_late() depends on e820__reserve_resources(),
which has been called earlier from setup_arch().
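
The ordering argument above can be illustrated with a small user-space model of the two initcall stages. This is only a sketch under stated assumptions: struct boot_model and the model_*() functions are hypothetical stand-ins, not kernel code, and the model assumes that a Gen1 VM always has a raw PCI config-space accessor while a Gen2 VM never does.

```c
#include <stdbool.h>

/* Hypothetical model state; none of this is a kernel data structure. */
struct boot_model {
	bool gen2vm;		/* stands in for efi_enabled(EFI_BOOT) */
	bool with_fix;		/* apply the change made by this patch */
	bool have_raw_pci_ops;	/* raw_pci_ops or raw_pci_ext_ops is non-NULL */
	int late_calls;		/* how often the late E820 reservation ran */
};

/* Stands in for e820__reserve_resources_late(). */
static void model_reserve_resources_late(struct boot_model *m)
{
	m->late_calls++;
}

/* arch_initcall stage: models hv_pci_init(). */
static int model_hv_pci_init(struct boot_model *m)
{
	if (m->gen2vm) {
		if (m->with_fix)
			model_reserve_resources_late(m);
		return 0;	/* skip the rest of the PCI arch init */
	}
	return 1;		/* Gen1: proceed with the PCI arch init */
}

/* subsys_initcall stage: models the early return in pcibios_init(). */
static void model_pcibios_init(struct boot_model *m)
{
	if (!m->have_raw_pci_ops)
		return;		/* no resource survey, no late reservation */
	model_reserve_resources_late(m);	/* via the resource survey */
}

/* Run both stages in initcall order; true if PRAM would be registered. */
static bool model_boot(bool gen2vm, bool with_fix, struct boot_model *m)
{
	m->gen2vm = gen2vm;
	m->with_fix = with_fix;
	m->late_calls = 0;
	/* Assumption of this model: only a Gen1 VM has raw PCI accessors. */
	m->have_raw_pci_ops = !gen2vm;

	model_hv_pci_init(m);	/* arch_initcall(pci_arch_init) */
	model_pcibios_init(m);	/* subsys_initcall(pci_subsys_init) */

	return m->late_calls > 0;
}
```

In this model a Gen2 VM without the fix never reaches the late reservation, a Gen2 VM with the fix reaches it exactly once from the arch_initcall stage, and a Gen1 VM still reaches it exactly once from the subsys_initcall stage.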

For a Gen2 VM, the new hv_pci_init() also adds any memory of
E820_TYPE_PMEM (7) into iomem_resource, and acpi_nfit_register_region() ->
acpi_nfit_insert_resource() -> region_intersects() returns
REGION_INTERSECTS, so the memory of E820_TYPE_PMEM won't get added twice.
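
The "won't get added twice" reasoning can be sketched with a simplified user-space model of an intersection check followed by a guarded insert. The names below (struct model_res, model_region_intersects(), model_insert_pmem()) are hypothetical, the resource tree is flattened into an array, and the classification is a simplification of what the kernel's region_intersects() does.

```c
#include <stddef.h>
#include <stdint.h>

/* Model-only result codes, named after the kernel's REGION_* values. */
enum {
	MODEL_REGION_INTERSECTS,	/* overlaps only matching entries */
	MODEL_REGION_DISJOINT,		/* overlaps no matching entry */
	MODEL_REGION_MIXED,		/* overlaps matching and other entries */
};

enum model_desc { MODEL_DESC_NONE, MODEL_DESC_PMEM };

/* Hypothetical flat stand-in for an iomem_resource entry. */
struct model_res {
	uint64_t start;
	uint64_t end;		/* inclusive */
	enum model_desc desc;
};

/* Classify [start, start + size) against the table; size must be > 0. */
static int model_region_intersects(const struct model_res *tbl, size_t n,
				   uint64_t start, uint64_t size,
				   enum model_desc desc)
{
	uint64_t end = start + size - 1;
	size_t match = 0, other = 0;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].end < start || tbl[i].start > end)
			continue;	/* no overlap with this entry */
		if (tbl[i].desc == desc)
			match++;
		else
			other++;
	}

	if (match == 0)
		return MODEL_REGION_DISJOINT;
	return other ? MODEL_REGION_MIXED : MODEL_REGION_INTERSECTS;
}

/*
 * Guarded insert, loosely modelled on the check described above.
 * Returns 0 if the range is already registered as PMEM, 1 if a new
 * entry was added, -1 if the table is full.
 */
static int model_insert_pmem(struct model_res *tbl, size_t *n, size_t cap,
			     uint64_t start, uint64_t size)
{
	if (model_region_intersects(tbl, *n, start, size, MODEL_DESC_PMEM) ==
	    MODEL_REGION_INTERSECTS)
		return 0;	/* already present: nothing to do */
	if (*n == cap)
		return -1;

	tbl[*n].start = start;
	tbl[*n].end = start + size - 1;
	tbl[*n].desc = MODEL_DESC_PMEM;
	(*n)++;
	return 1;
}
```

If the table already holds a PMEM entry for a range, as it would after the late E820 reservation in this model, a second insert of the same range is a no-op, whereas a range that is not yet covered is added normally.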

Also change the local variable "int gen2vm" to "bool gen2vm".

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
  

Comments

Saurabh Singh Sengar Sept. 19, 2023, 5:36 a.m. UTC | #1
On Thu, Aug 10, 2023 at 10:31:37PM -0700, Dexuan Cui wrote:
> A Gen2 VM doesn't support legacy PCI/PCIe, so both raw_pci_ops and
> raw_pci_ext_ops are NULL, and pci_subsys_init() -> pcibios_init()
> doesn't call pcibios_resource_survey() -> e820__reserve_resources_late();
> as a result, any emulated persistent memory of E820_TYPE_PRAM (12) via
> the kernel parameter memmap=nn[KMG]!ss is not added into iomem_resource
> and hence can't be detected by register_e820_pmem().
> 
> Fix this by directly calling e820__reserve_resources_late() in
> hv_pci_init(), which is called from arch_initcall(pci_arch_init).
> 
> It's ok to move a Gen2 VM's e820__reserve_resources_late() from
> subsys_initcall(pci_subsys_init) to arch_initcall(pci_arch_init) because
> the code in-between doesn't depend on the E820 resources.
> e820__reserve_resources_late() depends on e820__reserve_resources(),
> which has been called earlier from setup_arch().
> 
> For a Gen-2 VM, the new hv_pci_init() also adds any memory of
> E820_TYPE_PMEM (7) into iomem_resource, and acpi_nfit_register_region() ->
> acpi_nfit_insert_resource() -> region_intersects() returns
> REGION_INTERSECTS, so the memory of E820_TYPE_PMEM won't get added twice.
> 
> Changed the local variable "int gen2vm" to "bool gen2vm".
> 
> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index b004370d3b01..6b22d49aee7b 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -13,6 +13,7 @@
>  #include <linux/io.h>
>  #include <asm/apic.h>
>  #include <asm/desc.h>
> +#include <asm/e820/api.h>
>  #include <asm/sev.h>
>  #include <asm/hypervisor.h>
>  #include <asm/hyperv-tlfs.h>
> @@ -282,15 +283,31 @@ static int hv_cpu_die(unsigned int cpu)
>  
>  static int __init hv_pci_init(void)
>  {
> -	int gen2vm = efi_enabled(EFI_BOOT);
> +	bool gen2vm = efi_enabled(EFI_BOOT);
>  
>  	/*
> -	 * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
> -	 * The purpose is to suppress the harmless warning:
> +	 * A Generation-2 VM doesn't support legacy PCI/PCIe, so both
> +	 * raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() ->
> +	 * pcibios_init() doesn't call pcibios_resource_survey() ->
> +	 * e820__reserve_resources_late(); as a result, any emulated persistent
> +	 * memory of E820_TYPE_PRAM (12) via the kernel parameter
> +	 * memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be
> +	 * detected by register_e820_pmem(). Fix this by directly calling
> +	 * e820__reserve_resources_late() here: e820__reserve_resources_late()
> +	 * depends on e820__reserve_resources(), which has been called earlier
> +	 * from setup_arch(). Note: e820__reserve_resources_late() also adds
> +	 * any memory of E820_TYPE_PMEM (7) into iomem_resource, and
> +	 * acpi_nfit_register_region() -> acpi_nfit_insert_resource() ->
> +	 * region_intersects() returns REGION_INTERSECTS, so the memory of
> +	 * E820_TYPE_PMEM won't get added twice.
> +	 *
> +	 * We return 0 here so that pci_arch_init() won't print the warning:
>  	 * "PCI: Fatal: No config space access function found"
>  	 */
> -	if (gen2vm)
> +	if (gen2vm) {
> +		e820__reserve_resources_late();
>  		return 0;
> +	}


Kind reminder to review this.

- Saurabh

>  
>  	/* For Generation-1 VM, we'll proceed in pci_arch_init().  */
>  	return 1;
> 
> -- 
> 2.25.1
>
  
Wei Liu Nov. 10, 2023, 11:38 p.m. UTC | #2
On Mon, Sep 18, 2023 at 10:36:47PM -0700, Saurabh Singh Sengar wrote:
> On Thu, Aug 10, 2023 at 10:31:37PM -0700, Dexuan Cui wrote:
> > A Gen2 VM doesn't support legacy PCI/PCIe, so both raw_pci_ops and
> > raw_pci_ext_ops are NULL, and pci_subsys_init() -> pcibios_init()
> > doesn't call pcibios_resource_survey() -> e820__reserve_resources_late();
> > as a result, any emulated persistent memory of E820_TYPE_PRAM (12) via
> > the kernel parameter memmap=nn[KMG]!ss is not added into iomem_resource
> > and hence can't be detected by register_e820_pmem().
> > 
> > Fix this by directly calling e820__reserve_resources_late() in
> > hv_pci_init(), which is called from arch_initcall(pci_arch_init).
> > 
> > It's ok to move a Gen2 VM's e820__reserve_resources_late() from
> > subsys_initcall(pci_subsys_init) to arch_initcall(pci_arch_init) because
> > the code in-between doesn't depend on the E820 resources.
> > e820__reserve_resources_late() depends on e820__reserve_resources(),
> > which has been called earlier from setup_arch().
> > 
> > For a Gen-2 VM, the new hv_pci_init() also adds any memory of
> > E820_TYPE_PMEM (7) into iomem_resource, and acpi_nfit_register_region() ->
> > acpi_nfit_insert_resource() -> region_intersects() returns
> > REGION_INTERSECTS, so the memory of E820_TYPE_PMEM won't get added twice.
> > 
> > Changed the local variable "int gen2vm" to "bool gen2vm".
> > 
> > Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> > Signed-off-by: Dexuan Cui <decui@microsoft.com>
> > ---
> >  arch/x86/hyperv/hv_init.c | 25 +++++++++++++++++++++----
> >  1 file changed, 21 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> > index b004370d3b01..6b22d49aee7b 100644
> > --- a/arch/x86/hyperv/hv_init.c
> > +++ b/arch/x86/hyperv/hv_init.c
> > @@ -13,6 +13,7 @@
> >  #include <linux/io.h>
> >  #include <asm/apic.h>
> >  #include <asm/desc.h>
> > +#include <asm/e820/api.h>
> >  #include <asm/sev.h>
> >  #include <asm/hypervisor.h>
> >  #include <asm/hyperv-tlfs.h>
> > @@ -282,15 +283,31 @@ static int hv_cpu_die(unsigned int cpu)
> >  
> >  static int __init hv_pci_init(void)
> >  {
> > -	int gen2vm = efi_enabled(EFI_BOOT);
> > +	bool gen2vm = efi_enabled(EFI_BOOT);
> >  
> >  	/*
> > -	 * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
> > -	 * The purpose is to suppress the harmless warning:
> > +	 * A Generation-2 VM doesn't support legacy PCI/PCIe, so both
> > +	 * raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() ->
> > +	 * pcibios_init() doesn't call pcibios_resource_survey() ->
> > +	 * e820__reserve_resources_late(); as a result, any emulated persistent
> > +	 * memory of E820_TYPE_PRAM (12) via the kernel parameter
> > +	 * memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be
> > +	 * detected by register_e820_pmem(). Fix this by directly calling
> > +	 * e820__reserve_resources_late() here: e820__reserve_resources_late()
> > +	 * depends on e820__reserve_resources(), which has been called earlier
> > +	 * from setup_arch(). Note: e820__reserve_resources_late() also adds
> > +	 * any memory of E820_TYPE_PMEM (7) into iomem_resource, and
> > +	 * acpi_nfit_register_region() -> acpi_nfit_insert_resource() ->
> > +	 * region_intersects() returns REGION_INTERSECTS, so the memory of
> > +	 * E820_TYPE_PMEM won't get added twice.
> > +	 *
> > +	 * We return 0 here so that pci_arch_init() won't print the warning:
> >  	 * "PCI: Fatal: No config space access function found"
> >  	 */
> > -	if (gen2vm)
> > +	if (gen2vm) {
> > +		e820__reserve_resources_late();
> >  		return 0;
> > +	}
> 
> 
> Kind reminder to review this.
> 

I tried to apply this patch to hyperv-fixes, but it doesn't apply
cleanly.

Please resend.

Thanks,
Wei.
  

Patch

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index b004370d3b01..6b22d49aee7b 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -13,6 +13,7 @@ 
 #include <linux/io.h>
 #include <asm/apic.h>
 #include <asm/desc.h>
+#include <asm/e820/api.h>
 #include <asm/sev.h>
 #include <asm/hypervisor.h>
 #include <asm/hyperv-tlfs.h>
@@ -282,15 +283,31 @@  static int hv_cpu_die(unsigned int cpu)
 
 static int __init hv_pci_init(void)
 {
-	int gen2vm = efi_enabled(EFI_BOOT);
+	bool gen2vm = efi_enabled(EFI_BOOT);
 
 	/*
-	 * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
-	 * The purpose is to suppress the harmless warning:
+	 * A Generation-2 VM doesn't support legacy PCI/PCIe, so both
+	 * raw_pci_ops and raw_pci_ext_ops are NULL, and pci_subsys_init() ->
+	 * pcibios_init() doesn't call pcibios_resource_survey() ->
+	 * e820__reserve_resources_late(); as a result, any emulated persistent
+	 * memory of E820_TYPE_PRAM (12) via the kernel parameter
+	 * memmap=nn[KMG]!ss is not added into iomem_resource and hence can't be
+	 * detected by register_e820_pmem(). Fix this by directly calling
+	 * e820__reserve_resources_late() here: e820__reserve_resources_late()
+	 * depends on e820__reserve_resources(), which has been called earlier
+	 * from setup_arch(). Note: e820__reserve_resources_late() also adds
+	 * any memory of E820_TYPE_PMEM (7) into iomem_resource, and
+	 * acpi_nfit_register_region() -> acpi_nfit_insert_resource() ->
+	 * region_intersects() returns REGION_INTERSECTS, so the memory of
+	 * E820_TYPE_PMEM won't get added twice.
+	 *
+	 * We return 0 here so that pci_arch_init() won't print the warning:
 	 * "PCI: Fatal: No config space access function found"
 	 */
-	if (gen2vm)
+	if (gen2vm) {
+		e820__reserve_resources_late();
 		return 0;
+	}
 
 	/* For Generation-1 VM, we'll proceed in pci_arch_init().  */
 	return 1;