From patchwork Fri Nov 24 19:54:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiri Bohac
X-Patchwork-Id: 17073
Date: Fri, 24 Nov 2023 20:54:36 +0100
From: Jiri Bohac
To: Baoquan He, Vivek Goyal, Dave Young, kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, mhocko@suse.cz
Subject: [PATCH 0/4] kdump: crashkernel reservation from CMA
Content-Disposition: inline

Hi,

this series implements a new way to reserve additional crash kernel
memory using CMA.

Currently, none of the memory reserved for the crash kernel is usable
by the 1st (production) kernel. It is also unmapped so that it can't
be corrupted by the fault that will eventually trigger the crash.
This makes sense for the memory actually used by the kexec-loaded
crash kernel image and initrd and the data prepared during the
load (vmcoreinfo, ...). However, the reserved space needs to be much
larger than that to provide enough run-time memory for the crash
kernel and the kdump userspace.

Estimating the amount of memory to reserve is difficult. Reserving too
little makes kdump likely to end in OOM; reserving too much takes even
more memory away from the production system. Also, the reservation only
allows reserving a single contiguous block (or two with the "low"
suffix). I've seen systems where this fails because the physical
memory is fragmented.

By reserving additional crashkernel memory from CMA, the main
crashkernel reservation can be just large enough to fit the kernel and
initrd image, minimizing the memory taken away from the production
system. Most of the run-time memory for the crash kernel will be
memory previously available to userspace in the production system.
As this memory is no longer wasted, the reservation can be done with
a generous margin, making kdump more reliable. Kernel memory that we
need to preserve for dumping is never allocated from CMA. User data is
typically not dumped by makedumpfile. When dumping of user data is
intended, this new CMA reservation cannot be used.

There are four patches in this series:

The first adds a new ",cma" suffix to the recently introduced generic
crashkernel parsing code. parse_crashkernel() takes one more argument
to store the CMA reservation size.

The second patch implements reserve_crashkernel_cma() which performs
the reservation. If the requested size is not available in a single
range, multiple smaller ranges will be reserved.

The third patch enables the functionality for x86 as a proof of
concept. There are just three things every arch needs to do:
- call reserve_crashkernel_cma()
- include the CMA-reserved ranges in the physical memory map
- exclude the CMA-reserved ranges from the memory available
  through /proc/vmcore by excluding them from the vmcoreinfo
  PT_LOAD ranges.
Adding other architectures is easy and I can do that as soon as
this series is merged.

The fourth patch just updates Documentation/.

Now, specifying
	crashkernel=100M crashkernel=1G,cma
on the command line will make a standard crashkernel reservation
of 100M, where kexec will load the kernel and initrd.

An additional 1G will be reserved from CMA, still usable by the
production system. The crash kernel will have 1.1G memory available.
The 100M can be reliably predicted based on the size of the kernel
and initrd.

When no crashkernel=size,cma is specified, everything works as before.
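
To illustrate how the two parameters combine, here is a minimal
userspace sketch of the command-line split described above. It is not
the code from this series and not the kernel's parse_crashkernel():
struct crash_sizes, parse_size() and parse_cmdline() are hypothetical
names made up for this example, and the sketch deliberately ignores the
other crashkernel= forms (",high", ",low", "@offset", range syntax).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a kernel structure. */
struct crash_sizes {
	uint64_t main_size;	/* from plain crashkernel=SIZE */
	uint64_t cma_size;	/* from crashkernel=SIZE,cma */
};

/* Parse "100M", "1G", ...; similar in spirit to the kernel's memparse(). */
static uint64_t parse_size(const char *s, const char **end)
{
	char *e;
	uint64_t v = strtoull(s, &e, 0);

	switch (*e) {
	case 'G': case 'g':
		v <<= 10;	/* fall through */
	case 'M': case 'm':
		v <<= 10;	/* fall through */
	case 'K': case 'k':
		v <<= 10;
		e++;
		break;
	default:
		break;
	}
	*end = e;
	return v;
}

/* Collect the main and the CMA reservation sizes from a command line. */
struct crash_sizes parse_cmdline(const char *cmdline)
{
	static const char key[] = "crashkernel=";
	struct crash_sizes out = { 0, 0 };
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Only accept the key at the start of a word. */
		int at_word_start = (p == cmdline) || (p[-1] == ' ');
		const char *end;
		uint64_t size;

		p += sizeof(key) - 1;
		if (!at_word_start)
			continue;

		size = parse_size(p, &end);
		if (strncmp(end, ",cma", 4) == 0 &&
		    (end[4] == ' ' || end[4] == '\0'))
			out.cma_size = size;	/* crashkernel=SIZE,cma */
		else if (*end == ' ' || *end == '\0')
			out.main_size = size;	/* plain crashkernel=SIZE */
		/* Any other suffix is ignored in this sketch. */
		p = end;
	}
	return out;
}
```

With "crashkernel=100M crashkernel=1G,cma", parse_cmdline() reports a
main size of 100M and a CMA size of 1G; with only "crashkernel=256M"
the CMA size stays 0, matching the "everything works as before" case.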
NAcked-by: Philipp Rudo