[0/3] arm64: kdump : take off the protection on crashkernel memory region

Message ID 20230324131838.409996-1-bhe@redhat.com

Message

Baoquan He March 24, 2023, 1:18 p.m. UTC
Problem:
========
On arm64, the linear mapping can be built with block and section
mappings. However, the kernel currently forces base page mapping for the
whole linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled
and the crashkernel kernel parameter is set. This lengthens the linear
mapping process during boot and causes severe performance degradation at
run time.
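
To give a rough sense of the cost, the sketch below counts the leaf
page-table entries needed to map a region with base pages versus block
mappings. It assumes a 4 KiB translation granule with 2 MiB block
mappings; the macro and function names are made up for illustration and
are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB granule, so base pages are 4 KiB and blocks are 2 MiB. */
#define BASE_PAGE_SIZE  (4ULL * 1024)
#define BLOCK_MAP_SIZE  (2ULL * 1024 * 1024)

/* Hypothetical helper: leaf entries needed to map `bytes` in `map_size` units. */
static uint64_t leaf_entries(uint64_t bytes, uint64_t map_size)
{
	assert(map_size != 0);
	return (bytes + map_size - 1) / map_size;	/* round up */
}
```

Under these assumptions, 1 GiB of linear mapping needs 262144 base page
entries but only 512 block entries, which is where the extra boot time
and TLB pressure come from.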

Root cause:
===========
On arm64, crashkernel reservation relies on knowing the upper limit of
the low memory zone, because it needs to reserve memory in that zone so
that devices' DMA addressing in the kdump kernel can be satisfied.
However, the upper limit of low memory on arm64 varies between
platforms, and it can only be determined late, when bootmem_init() is
called [1].

In addition, the crashkernel region has to be mapped with base page
granularity in the linear mapping, because kdump protects the region via
set_memory_valid(,0) after the kdump kernel is loaded. However, arm64
does not cope well with splitting an already built block or section
mapping, due to some CPU restriction [2]. And unfortunately, the linear
mapping is done before bootmem_init().

To resolve the above conflict on arm64, the current compromise is to
force base page mapping for the entire linear mapping if crashkernel is
set and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled. Hence
performance is sacrificed.
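
The condition described above can be modelled as a small predicate. This
is a user-space sketch only: struct boot_config and
forces_base_page_mapping() are hypothetical names standing in for the
kernel's config options and command-line state, not the actual code in
arch/arm64/mm/mmu.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant boot-time state. */
struct boot_config {
	bool crashkernel_set;	/* crashkernel= given on the command line */
	bool zone_dma;		/* CONFIG_ZONE_DMA enabled */
	bool zone_dma32;	/* CONFIG_ZONE_DMA32 enabled */
};

/*
 * Before this series: the whole linear mapping falls back to base pages
 * when crashkernel is set and a DMA zone is configured.
 */
static bool forces_base_page_mapping(const struct boot_config *cfg)
{
	assert(cfg);
	return cfg->crashkernel_set && (cfg->zone_dma || cfg->zone_dma32);
}
```

With the protection removed, the crashkernel region no longer needs
special treatment in the linear mapping, so this condition should go
away.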

Solution:
=========
Compared with base page mapping for the whole linear region, which is
always incurred, it's better to take off the protection on the
crashkernel memory region for now: the protection only comes into play
with a one-in-a-million chance, while base page mapping of the whole
linear mapping always penalizes arm64 systems with crashkernel set.

This also gives distros a chance to backport this patchset to fix the
performance issue caused by base page mapping of the whole linear
region.

Extra words
===========
I personally expect that we can add these back in the near future
when arm64_dma_phys_limit is fixed, e.g. Raspberry Pi enlarges the device
addressing limit to 32bit, or when arm64 can support splitting a built
block or section mapping. That way, the code would be the simplest and
clearest.

Or, as Catalin suggested, of the 4 cases below whose handling we
currently defer to bootmem_init(), we can try to handle case 3) in
advance so that memory above 4G can avoid base page mapping entirely.
This will complicate the already complex code; let's see how it looks if
interested people post a patch.

crashkernel=size
1)first attempt:  low memory under arm64_dma_phys_limit
2)fallback:       finding memory above 4G

crashkernel=size,high
3)first attempt:  finding memory above 4G
4)fallback:       low memory under arm64_dma_phys_limit
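
The attempt order in the four cases above can be sketched as a small
selection function. This is an illustrative model only: enum
crash_region and pick_crash_region() are hypothetical names, and the
low_fits/high_fits flags stand in for whether a reservation of the
requested size would succeed in that range.

```c
#include <stdbool.h>

enum crash_region { CRASH_NONE, CRASH_LOW, CRASH_HIGH };

/*
 * high_requested: true for crashkernel=size,high, false for crashkernel=size.
 * low_fits:  a region fits below arm64_dma_phys_limit.
 * high_fits: a region fits above 4G.
 */
static enum crash_region pick_crash_region(bool high_requested,
					   bool low_fits, bool high_fits)
{
	if (high_requested) {
		if (high_fits)
			return CRASH_HIGH;	/* case 3) */
		if (low_fits)
			return CRASH_LOW;	/* case 4) */
	} else {
		if (low_fits)
			return CRASH_LOW;	/* case 1) */
		if (high_fits)
			return CRASH_HIGH;	/* case 2) */
	}
	return CRASH_NONE;
}
```

Only case 3) starts from memory above 4G, so it is the one case that
does not depend on arm64_dma_phys_limit for its first attempt.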


[1]
https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u

[2]
https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@suse.de/T/

Baoquan He (3):
  arm64: kdump : take off the protection on crashkernel memory region
  arm64: kdump: do not map crashkernel region specifically
  arm64: kdump: defer the crashkernel reservation for platforms with no
    DMA memory zones

 arch/arm64/include/asm/kexec.h    |  6 -----
 arch/arm64/include/asm/memory.h   |  5 ----
 arch/arm64/kernel/machine_kexec.c | 20 --------------
 arch/arm64/mm/init.c              |  6 +----
 arch/arm64/mm/mmu.c               | 43 -------------------------------
 5 files changed, 1 insertion(+), 79 deletions(-)
  

Comments

Catalin Marinas March 24, 2023, 5:11 p.m. UTC | #1
On Fri, Mar 24, 2023 at 09:18:35PM +0800, Baoquan He wrote:
> Baoquan He (3):
>   arm64: kdump : take off the protection on crashkernel memory region
>   arm64: kdump: do not map crashkernel region specifically
>   arm64: kdump: defer the crashkernel reservation for platforms with no
>     DMA memory zones
> 
>  arch/arm64/include/asm/kexec.h    |  6 -----
>  arch/arm64/include/asm/memory.h   |  5 ----
>  arch/arm64/kernel/machine_kexec.c | 20 --------------
>  arch/arm64/mm/init.c              |  6 +----
>  arch/arm64/mm/mmu.c               | 43 -------------------------------
>  5 files changed, 1 insertion(+), 79 deletions(-)

This series works for me and it has a negative diffstat as well (though
I'm sure people will try to bring it back ;)).

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
  
Zhen Lei March 25, 2023, 2:14 a.m. UTC | #2
On 2023/3/25 1:11, Catalin Marinas wrote:
> On Fri, Mar 24, 2023 at 09:18:35PM +0800, Baoquan He wrote:
>> Baoquan He (3):
>>   arm64: kdump : take off the protection on crashkernel memory region
>>   arm64: kdump: do not map crashkernel region specifically
>>   arm64: kdump: defer the crashkernel reservation for platforms with no
>>     DMA memory zones
>>
>>  arch/arm64/include/asm/kexec.h    |  6 -----
>>  arch/arm64/include/asm/memory.h   |  5 ----
>>  arch/arm64/kernel/machine_kexec.c | 20 --------------
>>  arch/arm64/mm/init.c              |  6 +----
>>  arch/arm64/mm/mmu.c               | 43 -------------------------------
>>  5 files changed, 1 insertion(+), 79 deletions(-)
> 
> This series works for me and it has a negative diffstat as well (though
> I'm sure people will try to bring it back ;)).

After the write protection is removed, it is recommended that a crc32
check be added. However, it can be added later.

> 
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> .
>
  
Baoquan He March 25, 2023, 3 a.m. UTC | #3
On 03/25/23 at 10:14am, Leizhen (ThunderTown) wrote:
> 
> 
> On 2023/3/25 1:11, Catalin Marinas wrote:
> > On Fri, Mar 24, 2023 at 09:18:35PM +0800, Baoquan He wrote:
> >> Baoquan He (3):
> >>   arm64: kdump : take off the protection on crashkernel memory region
> >>   arm64: kdump: do not map crashkernel region specifically
> >>   arm64: kdump: defer the crashkernel reservation for platforms with no
> >>     DMA memory zones
> >>
> >>  arch/arm64/include/asm/kexec.h    |  6 -----
> >>  arch/arm64/include/asm/memory.h   |  5 ----
> >>  arch/arm64/kernel/machine_kexec.c | 20 --------------
> >>  arch/arm64/mm/init.c              |  6 +----
> >>  arch/arm64/mm/mmu.c               | 43 -------------------------------
> >>  5 files changed, 1 insertion(+), 79 deletions(-)
> > 
> > This series works for me and it has a negative diffstat as well (though
> > I'm sure people will try to bring it back ;)).
> 
> After the write protection is removed, it is recommended that crc32 check
> be added. However, it can be added later.

That's a great catch. We calculate the checksum with sha256 in user
space and in the kernel, and verify it in purgatory in user space.
However, arm64 seems not to do the verification in the kernel if
kexec_file_load is used. Please see kexec_calculate_store_digests().

If stomping happens, the checksum verification can help us spot it.
Yes, this can be added later. Thanks for raising that.
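
As a rough illustration of the idea, the sketch below records a checksum
when an image is loaded and checks it again before use. It uses a plain
bitwise CRC-32 (the crc32 Zhen Lei mentioned) only to stay short; the
existing kexec code uses sha256, and crc32_ieee() and image_intact() are
hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320), no lookup table. */
static uint32_t crc32_ieee(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Returns true if the image still matches the checksum stored at load time. */
static bool image_intact(const uint8_t *image, size_t len, uint32_t stored_crc)
{
	return crc32_ieee(image, len) == stored_crc;
}
```

A caller would store crc32_ieee() of the loaded image and call
image_intact() right before jumping to it; a mismatch would indicate the
region was overwritten after loading.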
  
Mike Rapoport March 25, 2023, 6:02 a.m. UTC | #4
On Fri, Mar 24, 2023 at 09:18:35PM +0800, Baoquan He wrote:
> Problem:
> =======
> On arm64, block and section mapping is supported to build page tables.
> However, currently it enforces to take base page mapping for the whole
> linear mapping if CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabled and
> crashkernel kernel parameter is set. This will cause longer time of the
> linear mapping process during bootup and severe performance degradation
> during running time.
> 
> Root cause:
> ==========
> On arm64, crashkernel reservation relies on knowing the upper limit of
> low memory zone because it needs to reserve memory in the zone so that
> devices' DMA addressing in kdump kernel can be satisfied. However, the
> upper limit of low memory on arm64 is variant. And the upper limit can
> only be decided late till bootmem_init() is called [1].
> 
> And we need to map the crashkernel region with base page granularity when
> doing linear mapping, because kdump needs to protect the crashkernel region
> via set_memory_valid(,0) after kdump kernel loading. However, arm64 doesn't
> support well on splitting the built block or section mapping due to some
> cpu reststriction [2]. And unfortunately, the linear mapping is done before
> bootmem_init().
> 
> To resolve the above conflict on arm64, the compromise is enforcing to
> take base page mapping for the entire linear mapping if crashkernel is
> set, and CONFIG_ZONE_DMA or CONFIG_ZONE_DMA32 is enabed. Hence
> performance is sacrificed.
> 
> Solution:
> =========
> Comparing with the always encountered base page mapping for the whole
> linear region, it's better to take off the protection on crashkernel memory
> region for now because the protection can only happen in a chance in one
> million, while the base page mapping for the whole linear mapping is
> always mitigating arm64 systems with crashkernel set.
> 
> This can let distros have chance to back port this patchset to fix the
> performance issue caused by the base page mapping in the whole linear
> region.
> 
> Extra words
> ===========
> I personally expect that  we can add these back in the near future
> when arm64_dma_phys_limit is fixed, e.g Raspberry Pi enlarges the device
> addressing limit to 32bit; or Arm64 can support splitting built block or
> section mapping. Like this, the code is the simplest and clearest.
> 
> Or as Catalin suggested, for below 4 cases we currently defer to handle
> in bootme_init(), we can try to handle case 3) in advance so that memory
> above 4G can avoid base page mapping wholly. This will complicate the
> already complex code, let's see how it looks if people interested post patch.
> 
> crashkernel=size
> 1)first attempt:  low memory under arm64_dma_phys_limit
> 2)fallback:       finding memory above 4G
> 
> crashkernel=size,high
> 3)first attempt:  finding memory above 4G
> 4)fallback:       low memory under arm64_dma_phys_limit
> 
> 
> [1]
> https://lore.kernel.org/all/YrIIJkhKWSuAqkCx@arm.com/T/#u
> 
> [2]
> https://lore.kernel.org/linux-arm-kernel/20190911182546.17094-1-nsaenzjulienne@suse.de/T/
> 
> Baoquan He (3):
>   arm64: kdump : take off the protection on crashkernel memory region
>   arm64: kdump: do not map crashkernel region specifically
>   arm64: kdump: defer the crashkernel reservation for platforms with no
>     DMA memory zones
> 
>  arch/arm64/include/asm/kexec.h    |  6 -----
>  arch/arm64/include/asm/memory.h   |  5 ----
>  arch/arm64/kernel/machine_kexec.c | 20 --------------
>  arch/arm64/mm/init.c              |  6 +----
>  arch/arm64/mm/mmu.c               | 43 -------------------------------
>  5 files changed, 1 insertion(+), 79 deletions(-)

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>

> -- 
> 2.34.1
>