From patchwork Mon May 15 06:02:58 2023
X-Patchwork-Submitter: Baoquan He
X-Patchwork-Id: 93882
From: Baoquan He
To: linux-kernel@vger.kernel.org
Cc: catalin.marinas@arm.com, will@kernel.org, horms@kernel.org,
 thunder.leizhen@huawei.com, John.p.donnelly@oracle.com,
 kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 Baoquan He
Subject: [PATCH v6 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high
Date: Mon, 15 May 2023 14:02:58 +0800
Message-Id: <20230515060259.830662-2-bhe@redhat.com>
In-Reply-To: <20230515060259.830662-1-bhe@redhat.com>
References: <20230515060259.830662-1-bhe@redhat.com>

On arm64, reservation for 'crashkernel=xM,high' is taken by searching for
suitable memory region top down.
If the 'xM' of crashkernel high memory is reserved from high memory
successfully, it will try to reserve crashkernel low memory later
accordingly. Otherwise, it will try to search the low memory area for a
suitable 'xM' region. Please see the details in
Documentation/admin-guide/kernel-parameters.txt.

However, we observed an unexpected case where a reserved region crosses
the high and low memory boundary. E.g. on a system with 4G as the low
memory end, if the user adds a kernel parameter like 'crashkernel=512M,high',
the running kernel could end up with the regions [4G-126M, 4G+386M] and
[1G, 1G+128M].

A crashkernel high region crossing the low and high memory boundary brings
issues:

1) For crashkernel=x,high, if the crashkernel high region crosses the low
   and high memory boundary, the user will see two memory regions in low
   memory, and one memory region in high memory. The two crashkernel low
   memory regions are confusing, as shown in the above example.

2) If people explicitly specify "crashkernel=x,high crashkernel=y,low"
   and y <= 128M, when the crashkernel high region crosses the low and high
   memory boundary and the part of the crashkernel high reservation below
   the boundary is bigger than y, the expected crashkernel low reservation
   will be skipped. But the expected crashkernel high reservation is
   shrunk and could not satisfy the user space requirement.

3) The boundary-crossing behaviour of the crashkernel high reservation is
   different from the x86 arch. On x86_64, the low memory end is fixed at
   4G, and the memory near 4G is reserved by the system, e.g. for mapping
   firmware, pci mapping, so the crashkernel reservation never crosses the
   boundary. From a distro's point of view, this brings inconsistency and
   confusion. Users need to dig into x86 and arm64 system details to find
   out why.

For the kernel itself, the impact of issue 3) could be slight. Issues 1)
and 2), however, have actual impact because they bring obscure semantics
and behaviour to the crashkernel=,high reservation.
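The arithmetic behind the example above and behind issue 2) can be checked with a small user-space sketch. This is illustration only, not kernel code: the struct and helper names are hypothetical, and old_needs_low() merely mirrors the pre-patch condition in reserve_crashkernel() (crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) under the assumption that low_size does not exceed low_end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M (UINT64_C(1) << 20)
#define SZ_1G (UINT64_C(1) << 30)

/* Hypothetical helper type: a half-open physical range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/* True if the region has bytes on both sides of the low memory end. */
static inline bool crosses_boundary(struct region r, uint64_t low_end)
{
	assert(r.start < r.end);
	return r.start < low_end && r.end > low_end;
}

/* Number of bytes of the region that lie below the low memory end. */
static inline uint64_t bytes_below(struct region r, uint64_t low_end)
{
	assert(r.start < r.end);
	if (r.start >= low_end)
		return 0;
	return (r.end < low_end ? r.end : low_end) - r.start;
}

/*
 * Models the pre-patch check that decides whether a separate low
 * reservation is still made after the main region was placed at 'base'.
 * Assumes low_size <= low_end.
 */
static inline bool old_needs_low(uint64_t base, uint64_t low_end,
				 uint64_t low_size)
{
	return low_size && base > low_end - low_size;
}
```

With the region [4G-126M, 4G+386M] and a 4G low memory end, crosses_boundary() is true and bytes_below() gives 126M. old_needs_low() is true for the default 128M, which is why [1G, 1G+128M] also shows up, and false for y = 64M, which is the skipped low reservation of issue 2).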
Here, for crashkernel=xM,high, search for the suitable region only in
high memory. If that fails, try reserving the suitable region only in low
memory. This way, the crashkernel high region will only exist in high
memory, and the crashkernel low region only exists in low memory. The
reservation behaviour for crashkernel=,high is clearer and simpler.

Note: RPi4 has different zone ranges than a normal arm64 system. Its DMA
zone is 0~1G, and its DMA32 zone is 1G~4G if CONFIG_ZONE_DMA|DMA32 are
enabled by default. The low memory end is 1G in order to validate all
devices, and high memory starts at 1G. However, to be consistent with a
normal arm64 system, its low memory end is still 1G, while crashkernel
high memory is reserved from 4G if crashkernel=size,high is specified.
This will remove confusion.

With above change applied, summary of arm64 crashkernel reservation range:
1)
RPi4(zone DMA:0~1G; DMA32:1G~4G):
 crashkernel=size
  0~1G: low memory | 1G~top: high memory

 crashkernel=size,high
  0~1G: low memory | 4G~top: high memory

2)
Other normal system:
 crashkernel=size
 crashkernel=size,high
  0~4G: low memory | 4G~top: high memory

3)
Systems w/o zone DMA|DMA32
 crashkernel=size
 crashkernel=size,high
  0~top: low memory

Signed-off-by: Baoquan He

arm64: kdump: fix warning reported by static checker

Signed-off-by: Baoquan He
---
 arch/arm64/mm/init.c | 44 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 34 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 66e70ca47680..c28c2c8483cc 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -69,6 +69,7 @@ phys_addr_t __ro_after_init arm64_dma_phys_limit;
 
 #define CRASH_ADDR_LOW_MAX		arm64_dma_phys_limit
 #define CRASH_ADDR_HIGH_MAX		(PHYS_MASK + 1)
+#define CRASH_HIGH_SEARCH_BASE		SZ_4G
 
 #define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
 
@@ -101,12 +102,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
  */
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_base, crash_size;
-	unsigned long long crash_low_size = 0;
+	unsigned long long crash_low_size = 0, search_base = 0;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+	unsigned long long crash_base, crash_size;
 	char *cmdline = boot_command_line;
-	int ret;
 	bool fixed_base = false;
+	bool high = false;
+	int ret;
 
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
@@ -129,7 +131,9 @@ static void __init reserve_crashkernel(void)
 		else if (ret)
 			return;
 
+		search_base = CRASH_HIGH_SEARCH_BASE;
 		crash_max = CRASH_ADDR_HIGH_MAX;
+		high = true;
 	} else if (ret || !crash_size) {
 		/* The specified value is invalid */
 		return;
@@ -140,31 +144,51 @@ static void __init reserve_crashkernel(void)
 	/* User specifies base address explicitly. */
 	if (crash_base) {
 		fixed_base = true;
+		search_base = crash_base;
 		crash_max = crash_base + crash_size;
 	}
 
 retry:
 	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
-					       crash_base, crash_max);
+					       search_base, crash_max);
 	if (!crash_base) {
 		/*
-		 * If the first attempt was for low memory, fall back to
-		 * high memory, the minimum required low memory will be
-		 * reserved later.
+		 * For crashkernel=size[KMG]@offset[KMG], print out failure
+		 * message if can't reserve the specified region.
 		 */
-		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+		if (fixed_base) {
+			pr_warn("crashkernel reservation failed - memory is in use.\n");
+			return;
+		}
+
+		/*
+		 * For crashkernel=size[KMG], if the first attempt was for
+		 * low memory, fall back to high memory, the minimum required
+		 * low memory will be reserved later.
+		 */
+		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
+			search_base = CRASH_ADDR_LOW_MAX;
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 			goto retry;
 		}
 
+		/*
+		 * For crashkernel=size[KMG],high, if the first attempt was
+		 * for high memory, fall back to low memory.
+		 */
+		if (high && crash_max == CRASH_ADDR_HIGH_MAX) {
+			crash_max = CRASH_ADDR_LOW_MAX;
+			search_base = 0;
+			goto retry;
+		}
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
-	if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
-	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
+	     reserve_crashkernel_low(crash_low_size)) {
 		memblock_phys_free(crash_base, crash_size);
 		return;
 	}

From patchwork Mon May 15 06:02:59 2023
X-Patchwork-Submitter: Baoquan He
X-Patchwork-Id: 93881
From: Baoquan He
To: linux-kernel@vger.kernel.org
Cc: catalin.marinas@arm.com, will@kernel.org, horms@kernel.org,
 thunder.leizhen@huawei.com, John.p.donnelly@oracle.com,
 kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org,
 Baoquan He
Subject: [PATCH v6 2/2] Documentation: add kdump.rst to present crashkernel reservation on arm64
Date: Mon, 15 May 2023 14:02:59 +0800
Message-Id: <20230515060259.830662-3-bhe@redhat.com>
In-Reply-To: <20230515060259.830662-1-bhe@redhat.com>
References: <20230515060259.830662-1-bhe@redhat.com>

People complained that the crashkernel reservation code flow is hard to
follow, so add this document to explain the background, concepts and
implementation of crashkernel reservation on arm64. Hopefully this helps
people understand it more easily.

Signed-off-by: Baoquan He
Reviewed-by: Zhen Lei
---
 Documentation/arm64/kdump.rst | 103 ++++++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)
 create mode 100644 Documentation/arm64/kdump.rst

diff --git a/Documentation/arm64/kdump.rst b/Documentation/arm64/kdump.rst
new file mode 100644
index 000000000000..78b22017c490
--- /dev/null
+++ b/Documentation/arm64/kdump.rst
@@ -0,0 +1,103 @@
+=======================================
+crashkernel memory reservation on arm64
+=======================================
+
+Author: Baoquan He
+
+The kdump mechanism is used to capture the vmcore of a crashed kernel so
+that people can analyze it to find the root cause of the crash. In
+order to do that, memory must be reserved in advance so that the kdump
+kernel can be loaded into it, and then be booted and run if a crash
+happens.
+
+The memory reserved for kdump is sized to be just large enough for the
+kdump kernel to boot and run, and for the user space programs that
+collect the vmcore.
+
+Kernel parameter
+================
+Through kernel parameters like the ones below, memory can be reserved
+accordingly during the early stage of the 1st kernel's bootup so that a
+large contiguous chunk of memory can be found and reserved.
+Meanwhile, the need for low memory must be considered if crashkernel is
+reserved in high memory area.
+
+- crashkernel=size@offset
+- crashkernel=size
+- crashkernel=size,high crashkernel=size,low
+
+Low memory and high memory
+==========================
+What are low memory and high memory? In kdump reservation, low memory
+means the memory area under a specific limit, which is usually decided
+by the smallest addressing width among the PCI devices that the kdump
+kernel relies on to boot up and collect the vmcore successfully. Devices
+not related to vmcore dumping can be ignored, e.g. on x86, some i2c
+devices may only be able to access a 24-bit address range, but the kdump
+kernel still takes 4G as the limit because all known devices that it
+cares about have 32-bit addressing ability. On arm64, the upper boundary
+of low memory is not fixed: it's 1G on the RPi4 platform, and 4G on
+normal arm64 systems. On special systems with CONFIG_ZONE_DMA|DMA32
+disabled, the whole system RAM is low memory. Apart from low memory, all
+the rest of system RAM is high memory, which the kernel and user space
+programs can allocate and use.
+
+Implementation
+==============
+1) crashkernel=size@offset
+--------------------------
+crashkernel memory must be reserved at the user specified region; the
+reservation fails if that region is already occupied.
+
+
+2) crashkernel=size
+-------------------
+The crashkernel memory region will be reserved at any available position
+according to the search order.
+
+Firstly, the low memory area is searched for an available region of the
+specified size.
+
+Secondly, if searching low memory fails, fall back to searching the high
+memory area for the specified size. If the reservation in high memory
+succeeds, a default reservation in low memory is also done. The current
+default value is 128M, which satisfies the low memory needs, e.g. pci
+device driver initialization.
+
+If both searches fail, the reservation fails.
+
+Note: crashkernel=size is the recommended option among the crashkernel
+kernel parameters. With it, the user doesn't need to know much about the
+system memory layout, and only needs to specify however much memory the
+kdump kernel needs to make vmcore dumping succeed.
+
+3) crashkernel=size,high crashkernel=size,low
+---------------------------------------------
+crashkernel=size,high is an important supplement to crashkernel=size. It
+allows the user to precisely specify how much memory needs to be allocated
+from high memory, and how much memory is needed from low memory. On systems
+with large memory, low memory is small and precious since some kernel
+features and many devices can only request memory from that area, while
+taking a large chunk of contiguous memory from the high memory area doesn't
+matter much, and high memory can satisfy most kernel and almost all user
+space programs' requirements. In such a case, only a small amount of memory
+from the low memory area is needed. With it, the 1st kernel's normal
+running won't be impacted by the limited low memory resource.
+
+To reserve memory for crashkernel=size,high, firstly, the high memory
+region is searched. If the reservation succeeds, the low memory reservation
+is done subsequently.
+
+Secondly, if the reservation in high memory fails, fall back to searching
+the low memory for the size specified in crashkernel=,high. If that
+succeeds, everything is fine since no additional low memory is needed.
+
+Notes:
+- If crashkernel=,low is not specified, the default low memory reservation
+  will be done automatically.
+
+- If crashkernel=0,low is specified, it means that the low memory
+  reservation is omitted intentionally.
+
+3)
+
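
The search order for crashkernel=size,high described in the document above can be sketched as a small decision function in plain C. This is a simplified user-space model, not the kernel implementation: the type and function names and the LOW_UNSET marker are hypothetical, the two booleans stand in for the real memblock searches, and the case where the follow-up low reservation itself fails is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default low reservation, 128M as stated in the document. */
#define DEFAULT_LOW_SIZE (UINT64_C(128) << 20)
/* Hypothetical marker meaning "no crashkernel=,low was given". */
#define LOW_UNSET UINT64_MAX

enum placement { PLACED_NONE, PLACED_HIGH, PLACED_LOW };

struct plan {
	enum placement where;	/* where the main region ends up */
	uint64_t low_size;	/* extra low memory to reserve, 0 if none */
};

/*
 * high_fits / low_fits: whether a region of the requested size can be
 * found in high / low memory. low_param: the value of crashkernel=,low,
 * or LOW_UNSET if it was not specified.
 */
static inline struct plan plan_crashkernel_high(bool high_fits, bool low_fits,
						uint64_t low_param)
{
	struct plan p = { PLACED_NONE, 0 };

	if (high_fits) {
		/* First choice: high memory, plus a low reservation. */
		p.where = PLACED_HIGH;
		p.low_size = (low_param == LOW_UNSET) ? DEFAULT_LOW_SIZE
						      : low_param;
	} else if (low_fits) {
		/* Fallback: everything in low memory, nothing extra. */
		p.where = PLACED_LOW;
	}
	return p;
}
```

For example, plan_crashkernel_high(true, true, LOW_UNSET) places the region in high memory with the default 128M low reservation, while passing 0 as low_param models crashkernel=0,low and yields no low reservation.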