From patchwork Mon Feb 13 23:48:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 56599
From: "Kirill A. Shutemov"
To: Dave Hansen, Borislav Petkov
Cc: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Isaku Yamahata,
 x86@kernel.org, linux-coco@lists.linux.dev, kexec@lists.infradead.org,
 linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCH 1/2] x86/kexec: Preserve CR4.MCE during kexec
Date: Tue, 14 Feb 2023 02:48:35 +0300
Message-Id: <20230213234836.3683-2-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230213234836.3683-1-kirill.shutemov@linux.intel.com>
References: <20230213234836.3683-1-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

TDX guests are not allowed to clear CR4.MCE. An attempt to clear it
leads to #VE.

Preserve the flag during kexec.

Signed-off-by: Kirill A. Shutemov
Tested-by: Rick Edgecombe
---
 arch/x86/kernel/relocate_kernel_64.S | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 4a73351f87f8..18f19dcc40e9 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -145,8 +145,12 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	 * Set cr4 to a known state:
 	 *  - physical address extension enabled
 	 *  - 5-level paging, if it was enabled before
+	 *  - Preserve MCE, if it was set. Clearing MCE may fault in some
+	 *    environments.
 	 */
-	movl	$X86_CR4_PAE, %eax
+	movq	%cr4, %rax
+	andl	$X86_CR4_MCE, %eax
+	orl	$X86_CR4_PAE, %eax
 	testq	$X86_CR4_LA57, %r13
 	jz	1f
 	orl	$X86_CR4_LA57, %eax

From patchwork Mon Feb 13 23:48:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 56598
From: "Kirill A. Shutemov"
To: Dave Hansen, Borislav Petkov
Cc: Kuppuswamy Sathyanarayanan, Thomas Gleixner, Isaku Yamahata,
 x86@kernel.org, linux-coco@lists.linux.dev, kexec@lists.infradead.org,
 linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCH 2/2] x86/tdx: Convert shared memory back to private on kexec
Date: Tue, 14 Feb 2023 02:48:36 +0300
Message-Id: <20230213234836.3683-3-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230213234836.3683-1-kirill.shutemov@linux.intel.com>
References: <20230213234836.3683-1-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

TDX guests allocate shared buffers to perform I/O. This is done by
allocating pages normally from the buddy allocator and converting them
to shared with set_memory_decrypted().

The target kernel has no idea what memory is converted this way. It only
sees E820_TYPE_RAM.

Accessing shared memory via a private mapping is fatal. It leads to an
unrecoverable TD exit.

Walk the direct mapping and convert all shared memory back to private.
This makes all RAM private again, so the target kernel may use it
normally.

Skip the conversion on kexec of a crash kernel. It uses its own pool of
memory and will not accidentally allocate from the memory the first
kernel made shared. For crash investigation, it might be useful to
access data in the shared buffers.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/coco/tdx/Makefile         |  1 +
 arch/x86/coco/tdx/kexec.c          | 82 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/tdx.h         |  4 ++
 arch/x86/kernel/machine_kexec_64.c |  2 +
 4 files changed, 89 insertions(+)
 create mode 100644 arch/x86/coco/tdx/kexec.c

diff --git a/arch/x86/coco/tdx/Makefile b/arch/x86/coco/tdx/Makefile
index 46c55998557d..a5daa2a33531 100644
--- a/arch/x86/coco/tdx/Makefile
+++ b/arch/x86/coco/tdx/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-y += tdx.o tdcall.o
+obj-$(CONFIG_KEXEC_CORE) += kexec.o
diff --git a/arch/x86/coco/tdx/kexec.c b/arch/x86/coco/tdx/kexec.c
new file mode 100644
index 000000000000..f1f31515f372
--- /dev/null
+++ b/arch/x86/coco/tdx/kexec.c
@@ -0,0 +1,82 @@
+#define pr_fmt(fmt)	"tdx: " fmt
+
+#include
+#include
+#include
+
+static inline bool pud_decrypted(pud_t pud)
+{
+	return cc_mkdec(pud_val(pud)) == pud_val(pud);
+}
+
+static inline bool pmd_decrypted(pmd_t pmd)
+{
+	return cc_mkdec(pmd_val(pmd)) == pmd_val(pmd);
+}
+
+static inline bool pte_decrypted(pte_t pte)
+{
+	return cc_mkdec(pte_val(pte)) == pte_val(pte);
+}
+
+static inline void unshare_range(unsigned long start, unsigned long end)
+{
+	int pages = (end - start) / PAGE_SIZE;
+
+	if (!x86_platform.guest.enc_status_change_finish(start, pages, true))
+		pr_err("Failed to unshare range %#lx-%#lx\n", start, end);
+}
+
+static int unshare_pud(pud_t *pud, unsigned long addr, unsigned long next,
+		       struct mm_walk *walk)
+{
+	if (pud_decrypted(*pud))
+		unshare_range(addr, next);
+
+	return 0;
+}
+
+static int unshare_pmd(pmd_t *pmd, unsigned long addr, unsigned long next,
+		       struct mm_walk *walk)
+{
+	if (pmd_decrypted(*pmd))
+		unshare_range(addr, next);
+
+	return 0;
+}
+
+static int unshare_pte(pte_t *pte, unsigned long addr, unsigned long next,
+		       struct mm_walk *walk)
+{
+	if (pte_decrypted(*pte))
+		unshare_range(addr, next);
+
+	return 0;
+}
+
+static const struct mm_walk_ops unshare_ops = {
+	.pud_entry = unshare_pud,
+	.pmd_entry = unshare_pmd,
+	.pte_entry = unshare_pte,
+};
+
+void tdx_kexec_prepare(bool crash)
+{
+	/*
+	 * Crash kernel may want to see data in the shared buffers.
+	 * Do not revert them to private on kexec of crash kernel.
+	 */
+	if (crash)
+		return;
+
+	/*
+	 * Walk direct mapping and convert all shared memory back to private,
+	 * so the target kernel will be able to use it normally.
+	 */
+	mmap_write_lock(&init_mm);
+	walk_page_range_novma(&init_mm,
+			      PAGE_OFFSET,
+			      PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT),
+			      &unshare_ops, init_mm.pgd, NULL);
+	mmap_write_unlock(&init_mm);
+}
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 28d889c9aa16..7cdbf10e9f7d 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -69,6 +69,8 @@ bool tdx_early_handle_ve(struct pt_regs *regs);
 
 int tdx_mcall_get_report0(u8 *reportdata, u8 *tdreport);
 
+void tdx_kexec_prepare(bool crash);
+
 #else
 
 static inline void tdx_early_init(void) { };
@@ -76,6 +78,8 @@ static inline void tdx_safe_halt(void) { };
 
 static inline bool tdx_early_handle_ve(struct pt_regs *regs) { return false; }
 
+static inline void tdx_kexec_prepare(bool crash) {}
+
 #endif /* CONFIG_INTEL_TDX_GUEST */
 
 #if defined(CONFIG_KVM_GUEST) && defined(CONFIG_INTEL_TDX_GUEST)
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index 0611fd83858e..adbb3e5347da 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 
 #ifdef CONFIG_ACPI
 /*
@@ -312,6 +313,7 @@ void machine_kexec(struct kimage *image)
 	local_irq_disable();
 	hw_breakpoint_disable();
 	cet_disable();
+	tdx_kexec_prepare(image->type == KEXEC_TYPE_CRASH);
 
 	if (image->preserve_context) {
 #ifdef CONFIG_X86_IO_APIC