Message ID | 20230406220206.3067006-2-chenjiahao16@huawei.com |
---|---|
State | New |
Headers |
From: Chen Jiahao <chenjiahao16@huawei.com>
To: <linux-kernel@vger.kernel.org>, <linux-riscv@lists.infradead.org>, <kexec@lists.infradead.org>, <linux-doc@vger.kernel.org>
Cc: <paul.walmsley@sifive.com>, <palmer@dabbelt.com>, <conor.dooley@microchip.com>, <guoren@kernel.org>, <heiko@sntech.de>, <bjorn@rivosinc.com>, <alex@ghiti.fr>, <akpm@linux-foundation.org>, <atishp@rivosinc.com>, <bhe@redhat.com>, <thunder.leizhen@huawei.com>, <horms@kernel.org>
Subject: [PATCH -next v3 1/2] riscv: kdump: Implement crashkernel=X,[high,low]
Date: Fri, 7 Apr 2023 06:02:05 +0800
Message-ID: <20230406220206.3067006-2-chenjiahao16@huawei.com>
In-Reply-To: <20230406220206.3067006-1-chenjiahao16@huawei.com>
References: <20230406220206.3067006-1-chenjiahao16@huawei.com>
X-Mailer: git-send-email 2.31.1
List-ID: <linux-kernel.vger.kernel.org>
Series | support allocating crashkernel above 4G explicitly on riscv |
Commit Message
Chen Jiahao
April 6, 2023, 10:02 p.m. UTC
On riscv, the current crash kernel allocation logic tries to
allocate within the 32-bit addressable memory region by default and,
if that fails, retries without the 4G restriction.

To save DMA zone memory when reserving a relatively large crash
kernel region, allocating the reserved memory top down in high
memory, without overlapping the DMA zone, is a mature solution.
This patch introduces the parameter option crashkernel=X,[high,low].

One can reserve the crash kernel from high memory above the DMA zone
by explicitly passing "crashkernel=X,high", or reserve a memory range
below 4G with "crashkernel=X,low".
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
---
arch/riscv/kernel/setup.c | 5 +++
arch/riscv/mm/init.c | 74 ++++++++++++++++++++++++++++++++++++---
2 files changed, 74 insertions(+), 5 deletions(-)
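For reference, the command-line forms handled by this patch combine roughly as shown below; the sizes and the @offset value are illustrative only and not taken from the patch:

    crashkernel=256M          classic form, parsed first by parse_crashkernel()
    crashkernel=256M@2G       fixed base: if that exact range cannot be reserved,
                              the code gives up instead of retrying elsewhere
    crashkernel=512M,high     reserve above the DMA32 limit; a default 128M low
                              region (DEFAULT_CRASH_KERNEL_LOW_SIZE) is typically
                              also reserved for the swiotlb buffer when the high
                              region ends up above that limit
    crashkernel=512M,high crashkernel=128M,low
                              as above, but with an explicit low reservation
                              below 4G; "crashkernel=Y,low" is only honoured
                              together with "crashkernel=X,high"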
Comments
On Thu, Apr 6, 2023 at 10:06 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
>
> On riscv, the current crash kernel allocation logic is trying to
> allocate within 32bit addressible memory region by default, if
> failed, try to allocate without 4G restriction.
>
> In need of saving DMA zone memory while allocating a relatively large
> crash kernel region, allocating the reserved memory top down in
> high memory, without overlapping the DMA zone, is a mature solution.
> Here introduce the parameter option crashkernel=X,[high,low].
>
> One can reserve the crash kernel from high memory above DMA zone range
> by explicitly passing "crashkernel=X,high"; or reserve a memory range
> below 4G with "crashkernel=X,low".
Asked-by: Guo Ren <guoren@kernel.org>
>
> Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>

...
On Fri, Apr 07, 2023 at 06:02:05AM +0800, Chen Jiahao wrote:
> On riscv, the current crash kernel allocation logic is trying to
> allocate within 32bit addressible memory region by default, if
> failed, try to allocate without 4G restriction.
>
> In need of saving DMA zone memory while allocating a relatively large
> crash kernel region, allocating the reserved memory top down in
> high memory, without overlapping the DMA zone, is a mature solution.
> Here introduce the parameter option crashkernel=X,[high,low].
>
> One can reserve the crash kernel from high memory above DMA zone range
> by explicitly passing "crashkernel=X,high"; or reserve a memory range
> below 4G with "crashkernel=X,low".
>
> Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>

...

> @@ -1180,14 +1206,37 @@ static void __init reserve_crashkernel(void)
>  		return;
>  	}
>
> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
>  				&crash_size, &crash_base);
> -	if (ret || !crash_size)
> +	if (ret == -ENOENT) {
> +		/*
> +		 * crashkernel=X,[high,low] can be specified or not, but
> +		 * invalid value is not allowed.

nit: Perhaps something like this would be easier to correlate with the
code that follows:

	/* Fallback to crashkernel=X,[high,low] */

> +		 */
> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> +		if (ret || !crash_size)
> +			return;
> +
> +		/*
> +		 * crashkernel=Y,low is valid only when crashkernel=X,high
> +		 * is passed and high memory is reserved successful.

nit: s/successful/successfully/

> +		 */
> +		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
> +		if (ret == -ENOENT)
> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +		else if (ret)
> +			return;
> +
> +		search_start = search_low_max;
> +	} else if (ret || !crash_size) {
> +		/* Invalid argument value specified */
>  		return;
> +	}

...
On Fri, Apr 07, 2023 at 05:06:24PM +0800, Guo Ren wrote:
> On Thu, Apr 6, 2023 at 10:06 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
> >
> > On riscv, the current crash kernel allocation logic is trying to
> > allocate within 32bit addressible memory region by default, if
> > failed, try to allocate without 4G restriction.
...
> > One can reserve the crash kernel from high memory above DMA zone range
> > by explicitly passing "crashkernel=X,high"; or reserve a memory range
> > below 4G with "crashkernel=X,low".
> Asked-by: Guo Ren <guoren@kernel.org>

Perhaps 'Acked-by' :)
On 2023/4/7 20:03, Simon Horman wrote:
> On Fri, Apr 07, 2023 at 06:02:05AM +0800, Chen Jiahao wrote:
>> On riscv, the current crash kernel allocation logic is trying to
>> allocate within 32bit addressible memory region by default, if
>> failed, try to allocate without 4G restriction.
...
>> +	if (ret == -ENOENT) {
>> +		/*
>> +		 * crashkernel=X,[high,low] can be specified or not, but
>> +		 * invalid value is not allowed.
>
> nit: Perhaps something like this would be easier to correlate with the
> code that follows:
>
> 	/* Fallback to crashkernel=X,[high,low] */

The description "crashkernel=X,[high,low] can be specified or not" is not
correct, because crashkernel=X,high must be specified when walking into this
branch. So use Simon's comments or copy arm64's comments(it's written for
parse_crashkernel_low()).

>> +		/*
>> +		 * crashkernel=Y,low is valid only when crashkernel=X,high
>> +		 * is passed and high memory is reserved successful.
>
> nit: s/successful/successfully/

Seems like the whole "and high memory is reserved successful" needs to be deleted.
Only the dependency between the two boot options should be described here,
regardless of whether their memory is successfully allocated.

...
On 2023/4/7 20:58, Leizhen (ThunderTown) wrote:
> On 2023/4/7 20:03, Simon Horman wrote:
>> nit: Perhaps something like this would be easier to correlate with the
>> code that follows:
>>
>> 	/* Fallback to crashkernel=X,[high,low] */
>
> The description "crashkernel=X,[high,low] can be specified or not" is not
> correct, because crashkernel=X,high must be specified when walking into this
> branch. So use Simon's comments or copy arm64's comments(it's written for
> parse_crashkernel_low()).

I rethink it a little bit, if it's relative to crashkernel=X[@offset],
that's also true.

Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>

...
On 2023/4/7 20:03, Simon Horman wrote:
> On Fri, Apr 07, 2023 at 06:02:05AM +0800, Chen Jiahao wrote:
...
>> +	if (ret == -ENOENT) {
>> +		/*
>> +		 * crashkernel=X,[high,low] can be specified or not, but
>> +		 * invalid value is not allowed.
> nit: Perhaps something like this would be easier to correlate with the
> code that follows:
>
> 	/* Fallback to crashkernel=X,[high,low] */
>
Agreed, this would be more concise and accurate.

>> +		/*
>> +		 * crashkernel=Y,low is valid only when crashkernel=X,high
>> +		 * is passed and high memory is reserved successful.
> nit: s/successful/successfully/

I will fix above nits and resend another version later, thanks.

...
On 2023/4/8 10:00, Leizhen (ThunderTown) wrote:
> On 2023/4/7 20:58, Leizhen (ThunderTown) wrote:
>> On 2023/4/7 20:03, Simon Horman wrote:
>>> nit: Perhaps something like this would be easier to correlate with the
>>> code that follows:
>>>
>>> 	/* Fallback to crashkernel=X,[high,low] */
>> The description "crashkernel=X,[high,low] can be specified or not" is not
>> correct, because crashkernel=X,high must be specified when walking into this
>> branch. So use Simon's comments or copy arm64's comments(it's written for
>> parse_crashkernel_low()).
> I rethink it a little bit, if it's relative to crashkernel=X[@offset],
> that's also true.
>
> Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>

Sure, The commit should not be ambiguous like this, Simon's comment above
is a better option.

>>> nit: s/successful/successfully/
>> Seems like the whole "and high memory is reserved successful" needs to be deleted.
>> Only the dependency between the two boot options should be described here,
>> regardless of whether their memory is successfully allocated.

The comment here is imprecise, since there is absolutely no check whether
the allocation is successful before "parse_crashkernel_low"

...

BR,
Jiahao
On Fri, Apr 7, 2023 at 8:03 PM Simon Horman <horms@kernel.org> wrote:
>
> On Fri, Apr 07, 2023 at 05:06:24PM +0800, Guo Ren wrote:
> > On Thu, Apr 6, 2023 at 10:06 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
> > >
> > > On riscv, the current crash kernel allocation logic is trying to
> > > allocate within 32bit addressible memory region by default, if
> > > failed, try to allocate without 4G restriction.
...
> > > below 4G with "crashkernel=X,low".
> > Asked-by: Guo Ren <guoren@kernel.org>
>
> Perhaps 'Acked-by' :)
Sorry, my typo.

Acked-by: Guo Ren <guoren@kernel.org>
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 5d3184cbf518..ea84e5047c23 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -176,6 +176,11 @@ static void __init init_resources(void)
 		if (ret < 0)
 			goto error;
 	}
+	if (crashk_low_res.start != crashk_low_res.end) {
+		ret = add_resource(&iomem_resource, &crashk_low_res);
+		if (ret < 0)
+			goto error;
+	}
 #endif
 
 #ifdef CONFIG_CRASH_DUMP
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 478d6763a01a..b5b457193423 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1152,6 +1152,28 @@ static inline void setup_vm_final(void)
 }
 #endif /* CONFIG_MMU */
 
+/* Reserve 128M low memory by default for swiotlb buffer */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+	unsigned long long low_base;
+
+	low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
+	if (!low_base) {
+		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+		return -ENOMEM;
+	}
+
+	pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+		low_base, low_base + low_size, low_size >> 20);
+
+	crashk_low_res.start = low_base;
+	crashk_low_res.end = low_base + low_size - 1;
+
+	return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -1163,8 +1185,12 @@ static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_base = 0;
 	unsigned long long crash_size = 0;
+	unsigned long long crash_low_size = 0;
 	unsigned long search_start = memblock_start_of_DRAM();
 	unsigned long search_end = memblock_end_of_DRAM();
+	unsigned long search_low_max = (unsigned long)dma32_phys_limit;
+	char *cmdline = boot_command_line;
+	bool fixed_base = false;
 
 	int ret = 0;
 
@@ -1180,14 +1206,37 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
 				&crash_size, &crash_base);
-	if (ret || !crash_size)
+	if (ret == -ENOENT) {
+		/*
+		 * crashkernel=X,[high,low] can be specified or not, but
+		 * invalid value is not allowed.
+		 */
+		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
+		if (ret || !crash_size)
+			return;
+
+		/*
+		 * crashkernel=Y,low is valid only when crashkernel=X,high
+		 * is passed and high memory is reserved successful.
+		 */
+		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
+		if (ret == -ENOENT)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+		else if (ret)
+			return;
+
+		search_start = search_low_max;
+	} else if (ret || !crash_size) {
+		/* Invalid argument value specified */
 		return;
+	}
 
 	crash_size = PAGE_ALIGN(crash_size);
 
 	if (crash_base) {
+		fixed_base = true;
 		search_start = crash_base;
 		search_end = crash_base + crash_size;
 	}
@@ -1201,16 +1250,31 @@ static void __init reserve_crashkernel(void)
 	 */
 	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
 					       search_start,
-					       min(search_end, (unsigned long) SZ_4G));
+					       min(search_end, search_low_max));
 	if (crash_base == 0) {
-		/* Try again without restricting region to 32bit addressible memory */
+		if (fixed_base) {
+			pr_warn("crashkernel: allocating failed with given size@offset\n");
+			return;
+		}
+
+		/* Try again above the region of 32bit addressible memory */
 		crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
-						search_start, search_end);
+						max(search_start, search_low_max),
+						search_end);
 		if (crash_base == 0) {
 			pr_warn("crashkernel: couldn't allocate %lldKB\n",
 				crash_size >> 10);
 			return;
 		}
+
+		if (!crash_low_size)
+			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+	}
+
+	if ((crash_base > dma32_phys_limit - crash_low_size) &&
+	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
+		memblock_phys_free(crash_base, crash_size);
+		return;
 	}
 
 	pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
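As a rough illustration only (the addresses and sizes below are made up, not taken from any real boot log), a successful "crashkernel=512M,high" reservation with the default low region would produce boot messages of the following form from reserve_crashkernel_low() and reserve_crashkernel(), and both ranges should then appear as "Crash kernel" resources in /proc/iomem once init_resources() has added them:

    crashkernel low memory reserved: 0x00000000f0000000 - 0x00000000f8000000 (128 MB)
    crashkernel: reserved 0x0000000100000000 - 0x0000000120000000 (512 MB)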