[RFC] efi: Add ACPI_MEMORY_NVS into the linear map

Message ID 20240215225116.3435953-1-boqun.feng@gmail.com
State New
Series [RFC] efi: Add ACPI_MEMORY_NVS into the linear map

Commit Message

Boqun Feng Feb. 15, 2024, 10:51 p.m. UTC
Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
trouble with the following firmware memory region setup:

	[..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
	[..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]

On ARM64 with a 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
range will be omitted from the linear map due to 64k round-up, and a
page fault happens when trying to access the ACPI_RECLAIM_MEMORY:

	[...] Unable to handle kernel paging request at virtual address ffff0000dfd80000

To fix this, add ACPI_MEMORY_NVS into the linear map.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org # 5.15+
---
We hit this in an ARM64 Hyper-V VM when using a 64k page size. The issue
could also be fixed by making all the EFI memory regions 64k aligned, but
I don't find that this memory region setup is invalid per the UEFI spec,
nor that the spec disallows mapping ACPI_MEMORY_NVS in the OS linear map.
If there is a better way, or I'm reading the spec incorrectly, please let
me know.

It's Cc'ed to stable since 5.15 because that's when Hyper-V ARM64 support
was added, and Hyper-V is the only platform that hits the problem so far.
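
As a rough illustration of the round-up described in the commit message
(this is not part of the patch), the user-space C sketch below rounds the
ACPI NVS region outward to a page boundary and checks whether the result
overlaps the neighbouring ACPI Reclaim region. The struct and helper names
are hypothetical; only the addresses are taken from the EFI memory map
quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K  0x1000ULL
#define PAGE_SIZE_64K 0x10000ULL

/* Physical address range with an inclusive end, as printed by "efi:". */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Round a range outward so it covers whole pages of the given size. */
static struct range round_outward(struct range r, uint64_t page_size)
{
	struct range out;

	out.start = r.start & ~(page_size - 1);	/* round start down */
	out.end = r.end | (page_size - 1);		/* round end up */
	return out;
}

/* True if the two inclusive ranges share at least one byte. */
static bool ranges_overlap(struct range a, struct range b)
{
	return a.start <= b.end && b.start <= a.end;
}
```

With the regions from the log, rounding the NVS range
0xdfd84000-0xdfd87fff to 64k gives 0xdfd80000-0xdfd8ffff, which overlaps
the last 16k of the ACPI Reclaim range. With 4k pages the rounded range
is unchanged and there is no overlap.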

 drivers/firmware/efi/efi-init.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
  

Comments

Boqun Feng Feb. 15, 2024, 10:53 p.m. UTC | #1
(Cc the correct arm mailing list)

On Thu, Feb 15, 2024 at 02:51:06PM -0800, Boqun Feng wrote:
> Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> a trouble with the following firmware memory region setup:
> 
> 	[..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> 	[..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> 
> , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> range will be omitted from the the linear map due to 64k round-up. And
> a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> 
> 	[...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> 
> To fix this, add ACPI_MEMORY_NVS into the linear map.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: stable@vger.kernel.org # 5.15+
> ---
> We hit this in an ARM64 Hyper-V VM when using 64k page size, although
> this issue may also be fixed if the efi memory regions are all 64k
> aligned, but I don't find this memory region setup is invalid per UEFI
> spec, also I don't find that spec disallows ACPI_MEMORY_NVS to be mapped
> in the OS linear map, but if there is any better way or I'm reading the
> spec incorrectly, please let me know.
> 
> It's Cced stable since 5.15 because that's when Hyper-V ARM64 support is
> added, and Hyper-V is the only one that hits the problem so far.
> 
>  drivers/firmware/efi/efi-init.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/firmware/efi/efi-init.c b/drivers/firmware/efi/efi-init.c
> index a00e07b853f2..9a1b9bc66d50 100644
> --- a/drivers/firmware/efi/efi-init.c
> +++ b/drivers/firmware/efi/efi-init.c
> @@ -139,6 +139,7 @@ static __init int is_usable_memory(efi_memory_desc_t *md)
>  	case EFI_LOADER_CODE:
>  	case EFI_LOADER_DATA:
>  	case EFI_ACPI_RECLAIM_MEMORY:
> +	case EFI_ACPI_MEMORY_NVS:
>  	case EFI_BOOT_SERVICES_CODE:
>  	case EFI_BOOT_SERVICES_DATA:
>  	case EFI_CONVENTIONAL_MEMORY:
> @@ -202,8 +203,12 @@ static __init void reserve_regions(void)
>  			if (!is_usable_memory(md))
>  				memblock_mark_nomap(paddr, size);
>  
> -			/* keep ACPI reclaim memory intact for kexec etc. */
> -			if (md->type == EFI_ACPI_RECLAIM_MEMORY)
> +			/*
> +			 * keep ACPI reclaim and NVS memory and intact for kexec
> +			 * etc.
> +			 */
> +			if (md->type == EFI_ACPI_RECLAIM_MEMORY ||
> +			    md->type == EFI_ACPI_MEMORY_NVS)
>  				memblock_reserve(paddr, size);
>  		}
>  	}
> -- 
> 2.43.0
>
  
Ard Biesheuvel Feb. 15, 2024, 11:21 p.m. UTC | #2
(cc Oliver)

On Thu, 15 Feb 2024 at 23:51, Boqun Feng <boqun.feng@gmail.com> wrote:
>
> Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> a trouble with the following firmware memory region setup:
>
>         [..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
>         [..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
>

Which memory types were listed here?

> , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> range will be omitted from the the linear map due to 64k round-up. And
> a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
>
>         [...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
>

You trimmed all the useful information here. ACPI reclaim memory is
reclaimable, but we don't actually do so in Linux. So this is not
general purpose memory, it is used for a specific purpose, and the
code that accesses it is assuming that it is accessible via the linear
map. There are reasons why this may not be the case, so the fix might
be to use memremap() in the access instead.

> To fix this, add ACPI_MEMORY_NVS into the linear map.
>

There is a requirement in the arm64 bindings in the UEFI spec that
says that mixed attribute mappings within a 64k page are not allowed.

This is not a very clear description of the requirement or the issue
it is intended to work around. In short, the following memory types
are special:

- EfiRuntimeServicesCode
- EfiRuntimeServicesData
- EfiReserved
- EfiACPIMemoryNVS

and care must be taken to ensure that allocations of these types are
never mapped with mismatched attributes, which might happen on a 64k
page size OS if a mapping is rounded outwards and ends up covering the
adjacent region.

The Tianocore reference implementation of UEFI achieves this by simply
aligning all allocations of these types to 64k, so that the OS never
has to reason about whether or not region A and region B sharing a 64k
page frame could have mappings or aliases that are incompatible.
(I.e., all mappings of A are compatible with all mappings of B)

ACPI reclaim is just memory, but EfiACPIMemoryNVS could have special
semantics that the OS knows nothing about. That makes it unsafe to
assume that we can simply create a cacheable and writable mapping for
this memory.
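
To make the alignment approach concrete, here is a minimal user-space
sketch of a check that a firmware memory map entry of one of the special
types starts on a 64k boundary and spans a multiple of 64k. This is only
one way to satisfy the requirement, modelled on the 64k alignment
described above. The struct, enum, and function names are hypothetical;
the numeric type values follow the UEFI memory type enumeration.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT	12		/* EFI pages are always 4k */
#define ALIGN_64K	0x10000ULL

/* Subset of the UEFI memory types relevant to this thread. */
enum mem_type {
	MEM_RESERVED		= 0,	/* EfiReservedMemoryType */
	MEM_RUNTIME_CODE	= 5,	/* EfiRuntimeServicesCode */
	MEM_RUNTIME_DATA	= 6,	/* EfiRuntimeServicesData */
	MEM_ACPI_RECLAIM	= 9,	/* EfiACPIReclaimMemory */
	MEM_ACPI_NVS		= 10,	/* EfiACPIMemoryNVS */
};

/* Hypothetical, simplified memory descriptor. */
struct mem_desc {
	enum mem_type type;
	uint64_t phys_addr;
	uint64_t num_pages;	/* in 4k EFI pages */
};

/* The types singled out above as needing consistent attributes. */
static bool is_special_type(enum mem_type type)
{
	return type == MEM_RESERVED || type == MEM_RUNTIME_CODE ||
	       type == MEM_RUNTIME_DATA || type == MEM_ACPI_NVS;
}

/* True if the entry cannot share a 64k frame with a neighbour. */
static bool desc_ok_for_64k(const struct mem_desc *md)
{
	uint64_t size = md->num_pages << EFI_PAGE_SHIFT;

	if (!is_special_type(md->type))
		return true;

	return (md->phys_addr % ALIGN_64K) == 0 && (size % ALIGN_64K) == 0;
}
```

Under this check, the NVS entry from the report (0xdfd84000, four 4k
pages) fails, while the same entry moved to 0xdfd80000 and padded to
sixteen pages would pass.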

> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: stable@vger.kernel.org # 5.15+
> ---
> We hit this in an ARM64 Hyper-V VM when using 64k page size, although
> this issue may also be fixed if the efi memory regions are all 64k
> aligned, but I don't find this memory region setup is invalid per UEFI
> spec, also I don't find that spec disallows ACPI_MEMORY_NVS to be mapped
> in the OS linear map, but if there is any better way or I'm reading the
> spec incorrectly, please let me know.
>

I'd prefer fixing this in the firmware.

> It's Cced stable since 5.15 because that's when Hyper-V ARM64 support is
> added, and Hyper-V is the only one that hits the problem so far.
>
>  drivers/firmware/efi/efi-init.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/firmware/efi/efi-init.c b/drivers/firmware/efi/efi-init.c
> index a00e07b853f2..9a1b9bc66d50 100644
> --- a/drivers/firmware/efi/efi-init.c
> +++ b/drivers/firmware/efi/efi-init.c
> @@ -139,6 +139,7 @@ static __init int is_usable_memory(efi_memory_desc_t *md)
>         case EFI_LOADER_CODE:
>         case EFI_LOADER_DATA:
>         case EFI_ACPI_RECLAIM_MEMORY:
> +       case EFI_ACPI_MEMORY_NVS:
>         case EFI_BOOT_SERVICES_CODE:
>         case EFI_BOOT_SERVICES_DATA:
>         case EFI_CONVENTIONAL_MEMORY:
> @@ -202,8 +203,12 @@ static __init void reserve_regions(void)
>                         if (!is_usable_memory(md))
>                                 memblock_mark_nomap(paddr, size);
>
> -                       /* keep ACPI reclaim memory intact for kexec etc. */
> -                       if (md->type == EFI_ACPI_RECLAIM_MEMORY)
> +                       /*
> +                        * keep ACPI reclaim and NVS memory and intact for kexec
> +                        * etc.
> +                        */
> +                       if (md->type == EFI_ACPI_RECLAIM_MEMORY ||
> +                           md->type == EFI_ACPI_MEMORY_NVS)
>                                 memblock_reserve(paddr, size);
>                 }
>         }
> --
> 2.43.0
>
>
  
Ard Biesheuvel Feb. 15, 2024, 11:40 p.m. UTC | #3
On Fri, 16 Feb 2024 at 00:21, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> (cc Oliver)
>
> On Thu, 15 Feb 2024 at 23:51, Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> > a trouble with the following firmware memory region setup:
> >
> >         [..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> >         [..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> >
>
> Which memory types were listed here?
>
> > , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> > range will be omitted from the the linear map due to 64k round-up. And
> > a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> >
> >         [...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> >
>
> You trimmed all the useful information here. ACPI reclaim memory is
> reclaimable, but we don't actually do so in Linux. So this is not
> general purpose memory, it is used for a specific purpose, and the
> code that accesses it is assuming that it is accessible via the linear
> map. There are reason why this may not be the case, so the fix might
> be to use memremap() in the access instead.
>

Please try the below if the caller is already using memremap(). It
might misidentify the region because the start is in the linear map
but the end is not.


diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index 269f2f63ab7d..fef0281e223c 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -31,7 +31,6 @@ void __init early_ioremap_init(void)
 bool arch_memremap_can_ram_remap(resource_size_t offset, size_t size,
                                 unsigned long flags)
 {
-       unsigned long pfn = PHYS_PFN(offset);
-
-       return pfn_is_map_memory(pfn);
+       return pfn_is_map_memory(PHYS_PFN(offset)) &&
+              pfn_is_map_memory(PHYS_PFN(offset + size - 1));
 }
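
The difference between the two checks can be modelled in user space as
below. This is only a sketch: mock_pfn_is_map_memory() is a hypothetical
stand-in for the kernel's pfn_is_map_memory(), and it assumes that the
only nomap 64k frame is the one at 0xdfd80000 from the report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_64K 16

/* Hypothetical model: only the 64k frame at 0xdfd80000 is nomap. */
static bool mock_pfn_is_map_memory(uint64_t pfn)
{
	return pfn != (0xdfd80000ULL >> PAGE_SHIFT_64K);
}

/* Old behaviour: only the first page frame of the request is checked. */
static bool can_ram_remap_start_only(uint64_t offset, size_t size)
{
	(void)size;
	return mock_pfn_is_map_memory(offset >> PAGE_SHIFT_64K);
}

/* Proposed behaviour: both the first and the last page frame are checked. */
static bool can_ram_remap_both_ends(uint64_t offset, size_t size)
{
	return mock_pfn_is_map_memory(offset >> PAGE_SHIFT_64K) &&
	       mock_pfn_is_map_memory((offset + size - 1) >> PAGE_SHIFT_64K);
}
```

For a request that starts at 0xdfd7f000 and is 0x2000 bytes long, the
start-only check reports the range as linear-mapped RAM even though its
tail lies in the nomap frame, whereas the two-ended check rejects it.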
  
Greg KH Feb. 17, 2024, 7:49 a.m. UTC | #4
On Thu, Feb 15, 2024 at 02:51:06PM -0800, Boqun Feng wrote:
> Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> a trouble with the following firmware memory region setup:
> 
> 	[..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> 	[..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> 
> , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> range will be omitted from the the linear map due to 64k round-up. And
> a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> 
> 	[...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> 
> To fix this, add ACPI_MEMORY_NVS into the linear map.
> 
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: stable@vger.kernel.org # 5.15+

What commit id does this fix?  Can you include that as well?

thanks,

greg k-h
  
Boqun Feng Feb. 20, 2024, 4:09 a.m. UTC | #5
On Sat, Feb 17, 2024 at 08:49:32AM +0100, Greg KH wrote:
> On Thu, Feb 15, 2024 at 02:51:06PM -0800, Boqun Feng wrote:
> > Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> > a trouble with the following firmware memory region setup:
> > 
> > 	[..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> > 	[..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> > 
> > , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> > range will be omitted from the the linear map due to 64k round-up. And
> > a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> > 
> > 	[...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> > 
> > To fix this, add ACPI_MEMORY_NVS into the linear map.
> > 
> > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > Cc: stable@vger.kernel.org # 5.15+
> 
> What commit id does this fix?  Can you include that as well?
> 

It should be 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built
on ARM64"), but as Ard mentioned earlier, this could be fixed in the VM
firmware, and Oliver is working on that. Should the situation change, I
will send a V2 with more information and include the commit id.

Regards,
Boqun

> thanks,
> 
> greg k-h
  
Ard Biesheuvel Feb. 20, 2024, 8:27 a.m. UTC | #6
On Tue, 20 Feb 2024 at 05:10, Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Sat, Feb 17, 2024 at 08:49:32AM +0100, Greg KH wrote:
> > On Thu, Feb 15, 2024 at 02:51:06PM -0800, Boqun Feng wrote:
> > > Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> > > a trouble with the following firmware memory region setup:
> > >
> > >     [..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> > >     [..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> > >
> > > , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> > > range will be omitted from the the linear map due to 64k round-up. And
> > > a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> > >
> > >     [...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> > >
> > > To fix this, add ACPI_MEMORY_NVS into the linear map.
> > >
> > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > > Cc: stable@vger.kernel.org # 5.15+
> >
> > What commit id does this fix?  Can you include that as well?
> >
>
> It should be 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built
> on ARM64"), but as Ard mentioned earlier, this could be fixed at the VM
> firmware, and Oliver is working on that. Should the situation change, I
> will send a V2 with more information and include the commit id.
>

The patch as-is is not acceptable to me, so no need to send a v2 just
to add more information.

Please consider the fix I proposed for arch_memremap_can_ram_remap()
if fixing this in the firmware is not feasible.
  
Boqun Feng Feb. 20, 2024, 4:28 p.m. UTC | #7
On Tue, Feb 20, 2024 at 09:27:54AM +0100, Ard Biesheuvel wrote:
> On Tue, 20 Feb 2024 at 05:10, Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > On Sat, Feb 17, 2024 at 08:49:32AM +0100, Greg KH wrote:
> > > On Thu, Feb 15, 2024 at 02:51:06PM -0800, Boqun Feng wrote:
> > > > Currently ACPI_MEMORY_NVS is omitted from the linear map, which causes
> > > > a trouble with the following firmware memory region setup:
> > > >
> > > >     [..] efi:   0x0000dfd62000-0x0000dfd83fff [ACPI Reclaim|...]
> > > >     [..] efi:   0x0000dfd84000-0x0000dfd87fff [ACPI Mem NVS|...]
> > > >
> > > > , on ARM64 with 64k page size, the whole 0x0000dfd80000-0x0000dfd8ffff
> > > > range will be omitted from the the linear map due to 64k round-up. And
> > > > a page fault happens when trying to access the ACPI_RECLAIM_MEMORY:
> > > >
> > > >     [...] Unable to handle kernel paging request at virtual address ffff0000dfd80000
> > > >
> > > > To fix this, add ACPI_MEMORY_NVS into the linear map.
> > > >
> > > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> > > > Cc: stable@vger.kernel.org # 5.15+
> > >
> > > What commit id does this fix?  Can you include that as well?
> > >
> >
> > It should be 7aff79e297ee ("Drivers: hv: Enable Hyper-V code to be built
> > on ARM64"), but as Ard mentioned earlier, this could be fixed at the VM
> > firmware, and Oliver is working on that. Should the situation change, I
> > will send a V2 with more information and include the commit id.
> >
> 
> The patch as-is is not acceptable to me, so no need to send a v2 just
> to add more information.
> 
> Please consider the fix I proposed for arch_memremap_can_ram_remap()
> if fixing this in the firmware is not feasible.

Got it. Will do if necessary, thanks!

Regards,
Boqun
  

Patch

diff --git a/drivers/firmware/efi/efi-init.c b/drivers/firmware/efi/efi-init.c
index a00e07b853f2..9a1b9bc66d50 100644
--- a/drivers/firmware/efi/efi-init.c
+++ b/drivers/firmware/efi/efi-init.c
@@ -139,6 +139,7 @@  static __init int is_usable_memory(efi_memory_desc_t *md)
 	case EFI_LOADER_CODE:
 	case EFI_LOADER_DATA:
 	case EFI_ACPI_RECLAIM_MEMORY:
+	case EFI_ACPI_MEMORY_NVS:
 	case EFI_BOOT_SERVICES_CODE:
 	case EFI_BOOT_SERVICES_DATA:
 	case EFI_CONVENTIONAL_MEMORY:
@@ -202,8 +203,12 @@  static __init void reserve_regions(void)
 			if (!is_usable_memory(md))
 				memblock_mark_nomap(paddr, size);
 
-			/* keep ACPI reclaim memory intact for kexec etc. */
-			if (md->type == EFI_ACPI_RECLAIM_MEMORY)
+			/*
+			 * keep ACPI reclaim and NVS memory intact for kexec
+			 * etc.
+			 */
+			if (md->type == EFI_ACPI_RECLAIM_MEMORY ||
+			    md->type == EFI_ACPI_MEMORY_NVS)
 				memblock_reserve(paddr, size);
 		}
 	}