[1/2] x86/mm: Do not zap PMD entry mapping unaccepted memory table during kdump.

Message ID a0bf771e1472eb1a6a241acd2e16c98ab8ac9253.1708390906.git.ashish.kalra@amd.com
State New
Headers
Series x86/snp: Add kexec support |

Commit Message

Kalra, Ashish Feb. 20, 2024, 1:18 a.m. UTC
  From: Ashish Kalra <ashish.kalra@amd.com>

During crashkernel boot only pre-allocated crash memory is presented as
E820_TYPE_RAM. This can cause PMD entry mapping unaccepted memory table
to be zapped during phys_pmd_init() as SNP/TDX guest use E820_TYPE_ACPI
to store the unaccepted memory table and pass it between the kernels on
kexec/kdump.

E820_TYPE_ACPI covers not only ACPI data, but also EFI tables and might
be required by kernel to function properly.

The problem was discovered during debugging kdump for SNP guest. The
unaccepted memory table stored with E820_TYPE_ACPI and passed between
the kernels on kdump was getting zapped as the PMD entry mapping this
is above the E820_TYPE_RAM range for the reserved crashkernel memory.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/mm/init_64.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Kirill A. Shutemov Feb. 20, 2024, 12:42 p.m. UTC | #1
On Tue, Feb 20, 2024 at 01:18:29AM +0000, Ashish Kalra wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> During crashkernel boot only pre-allocated crash memory is presented as
> E820_TYPE_RAM. This can cause PMD entry mapping unaccepted memory table
> to be zapped during phys_pmd_init() as SNP/TDX guest use E820_TYPE_ACPI
> to store the unaccepted memory table and pass it between the kernels on
> kexec/kdump.
> 
> E820_TYPE_ACPI covers not only ACPI data, but also EFI tables and might
> be required by kernel to function properly.
> 
> The problem was discovered during debugging kdump for SNP guest. The
> unaccepted memory table stored with E820_TYPE_ACPI and passed between
> the kernels on kdump was getting zapped as the PMD entry mapping this
> is above the E820_TYPE_RAM range for the reserved crashkernel memory.
> 
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/mm/init_64.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index a0dffaca6d2b..207c6dddde0c 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -524,7 +524,9 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
>  			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
>  					     E820_TYPE_RAM) &&
>  			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
> -					     E820_TYPE_RESERVED_KERN))
> +					     E820_TYPE_RESERVED_KERN) &&
> +			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
> +					     E820_TYPE_ACPI))
>  				set_pmd_init(pmd, __pmd(0), init);
>  			continue;

Why do you single out phys_pmd_init()? I think it has to be addressed for
all page table levels as we do for E820_TYPE_RAM and E820_TYPE_RESERVED_KERN.
  
Kalra, Ashish Feb. 20, 2024, 7:09 p.m. UTC | #2
Hi Kirill,

On 2/20/2024 6:42 AM, Kirill A. Shutemov wrote:
> On Tue, Feb 20, 2024 at 01:18:29AM +0000, Ashish Kalra wrote:
>> From: Ashish Kalra <ashish.kalra@amd.com>
>>
>> During crashkernel boot only pre-allocated crash memory is presented as
>> E820_TYPE_RAM. This can cause PMD entry mapping unaccepted memory table
>> to be zapped during phys_pmd_init() as SNP/TDX guest use E820_TYPE_ACPI
>> to store the unaccepted memory table and pass it between the kernels on
>> kexec/kdump.
>>
>> E820_TYPE_ACPI covers not only ACPI data, but also EFI tables and might
>> be required by kernel to function properly.
>>
>> The problem was discovered during debugging kdump for SNP guest. The
>> unaccepted memory table stored with E820_TYPE_ACPI and passed between
>> the kernels on kdump was getting zapped as the PMD entry mapping this
>> is above the E820_TYPE_RAM range for the reserved crashkernel memory.
>>
>> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
>> ---
>>   arch/x86/mm/init_64.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>> index a0dffaca6d2b..207c6dddde0c 100644
>> --- a/arch/x86/mm/init_64.c
>> +++ b/arch/x86/mm/init_64.c
>> @@ -524,7 +524,9 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
>>   			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
>>   					     E820_TYPE_RAM) &&
>>   			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
>> -					     E820_TYPE_RESERVED_KERN))
>> +					     E820_TYPE_RESERVED_KERN) &&
>> +			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
>> +					     E820_TYPE_ACPI))
>>   				set_pmd_init(pmd, __pmd(0), init);
>>   			continue;
> Why do you single out phys_pmd_init()? I think it has to be addressed for
> all page table levels as we do for E820_TYPE_RAM and E820_TYPE_RESERVED_KERN.

I believe i only discovered the issue with PMDe's (phys_pmd_init()) 
because of the crashkernel reserved memory size and the E820_TYPE_ACPI 
physical memory range mapping on my test system, but you are right this 
fix needs to be done for all page table levels and i will add also the 
fix in phys_pte_init(), phys_pud_init() and phys_p4d_init().

Thanks, Ashish
  

Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a0dffaca6d2b..207c6dddde0c 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -524,7 +524,9 @@  phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
 					     E820_TYPE_RAM) &&
 			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
-					     E820_TYPE_RESERVED_KERN))
+					     E820_TYPE_RESERVED_KERN) &&
+			    !e820__mapped_any(paddr & PMD_MASK, paddr_next,
+					     E820_TYPE_ACPI))
 				set_pmd_init(pmd, __pmd(0), init);
 			continue;
 		}