From patchwork Wed Jan 31 11:31:53 2024
X-Patchwork-Submitter: Kai Huang
X-Patchwork-Id: 194705
From: "Huang, Kai"
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, dave.hansen@intel.com, kirill.shutemov@linux.intel.com,
    tglx@linutronix.de, bp@alien8.de, mingo@redhat.com, hpa@zytor.com,
    luto@kernel.org, peterz@infradead.org, thomas.lendacky@amd.com,
    chao.gao@intel.com, bhe@redhat.com, nik.borisov@suse.com,
    pbonzini@redhat.com
Subject: [PATCH 1/4] x86/coco: Add a new CC attribute to unify cache flush during kexec
Date: Wed, 31 Jan 2024 11:31:53 +0000
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Kai Huang

Currently on AMD SME platforms, caches are flushed manually during kexec()
before jumping to the new kernel because of memory encryption.  Intel TDX
likewise needs to flush the cachelines of TDX private memory before jumping
to the second kernel; otherwise they may silently corrupt the new kernel.

Instead of sprinkling AMD- and Intel-specific checks around, introduce a new
CC_ATTR_HOST_MEM_INCOHERENT attribute to unify both vendors and simplify the
logic: could the old kernel have left incoherent caches around?  If so, do
WBINVD.  Convert AMD SME to use this new CC attribute; a later patch will use
it for Intel TDX as well.

Specifically, AMD SME flushes caches in two places: 1) stop_this_cpu();
2) relocate_kernel().  stop_this_cpu() checks the CPUID bit directly before
doing WBINVD because the current kernel's SME enabling status may not match
the new kernel's choice.  relocate_kernel(), however, only does the WBINVD
when the current kernel has enabled SME, because the new kernel is always
placed in an "unencrypted" area.

To simplify the logic, change AMD SME to always use the approach taken in
stop_this_cpu().  This causes an additional WBINVD in relocate_kernel() when
the current kernel hasn't enabled SME (e.g., it was disabled on the kernel
command line), but that is acceptable for the sake of less complicated code
(see [1] for the relevant discussion).

Note that the kernel currently only advertises a CC vendor for AMD SME when
SME is actually enabled by the kernel.  To always advertise the new
CC_ATTR_HOST_MEM_INCOHERENT regardless of the kernel's SME enabling status,
change to setting the CC vendor as long as the hardware has enabled SME.
Note that "advertising CC_ATTR_HOST_MEM_INCOHERENT when the hardware has
enabled SME" is still different from "checking the CPUID" (the way
stop_this_cpu() does it), but technically the former serves the purpose too
and is actually more accurate.  This change allows sme_me_mask to be 0 while
the CC vendor is reported as AMD, but it doesn't impact other CC attributes
on AMD platforms, nor does it impact cc_mkdec()/cc_mkenc().

[1] https://lore.kernel.org/lkml/cbc9c527-17e5-4a63-80fe-85451394cc7c@amd.com/
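To make the new rule concrete, here is a minimal before/after sketch of the
stop_this_cpu() flush site; it only restates what the arch/x86/kernel/process.c
hunk below does and is not an additional change:

	/* Before: probe the SME CPUID leaf (0x8000001f, EAX bit 0) directly. */
	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
		native_wbinvd();

	/* After: ask the CC layer whether the old kernel could have left incoherent caches. */
	if (cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT))
		native_wbinvd();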
Suggested-by: Dave Hansen
Signed-off-by: Kai Huang
---
 arch/x86/coco/core.c               | 13 +++++++++++++
 arch/x86/kernel/machine_kexec_64.c |  2 +-
 arch/x86/kernel/process.c          | 14 +++-----------
 arch/x86/mm/mem_encrypt_identity.c | 11 ++++++++++-
 include/linux/cc_platform.h        | 15 +++++++++++++++
 5 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index eeec9986570e..8d6d727e6e18 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -72,6 +72,19 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_HOST_MEM_ENCRYPT:
 		return sme_me_mask && !(sev_status & MSR_AMD64_SEV_ENABLED);
 
+	case CC_ATTR_HOST_MEM_INCOHERENT:
+		/*
+		 * CC_ATTR_HOST_MEM_INCOHERENT represents whether SME is
+		 * enabled on the platform, regardless of whether the
+		 * kernel has actually enabled SME.
+		 */
+		return !(sev_status & MSR_AMD64_SEV_ENABLED);
+
+	/*
+	 * For all CC_ATTR_GUEST_* there's no need to check sme_me_mask,
+	 * as it must be set when any SEV enable bit is set in
+	 * sev_status.
+	 */
 	case CC_ATTR_GUEST_MEM_ENCRYPT:
 		return sev_status & MSR_AMD64_SEV_ENABLED;
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index bc0a5348b4a6..c9c6974e2e9c 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -358,7 +358,7 @@ void machine_kexec(struct kimage *image)
 				       (unsigned long)page_list,
 				       image->start,
 				       image->preserve_context,
-				       cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT));
+				       cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT));
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (image->preserve_context)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ab49ade31b0d..2c7e8d9889c0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -813,18 +813,10 @@ void __noreturn stop_this_cpu(void *dummy)
 	mcheck_cpu_clear(c);
 
 	/*
-	 * Use wbinvd on processors that support SME. This provides support
-	 * for performing a successful kexec when going from SME inactive
-	 * to SME active (or vice-versa). The cache must be cleared so that
-	 * if there are entries with the same physical address, both with and
-	 * without the encryption bit, they don't race each other when flushed
-	 * and potentially end up with the wrong entry being committed to
-	 * memory.
-	 *
-	 * Test the CPUID bit directly because the machine might've cleared
-	 * X86_FEATURE_SME due to cmdline options.
+	 * Use wbinvd on processors where the first kernel *could*
+	 * have left incoherent cachelines behind.
 	 */
-	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
+	if (cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT))
 		native_wbinvd();
 
 	/*
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7f72472a34d6..87e4fddab770 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -570,9 +570,19 @@ void __init sme_enable(struct boot_params *bp)
 		msr = __rdmsr(MSR_AMD64_SYSCFG);
 		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
 			return;
+
+		/*
+		 * Always set the CC vendor when the platform has SME enabled,
+		 * regardless of whether the kernel actually activates SME.
+		 * This keeps CC_ATTR_HOST_MEM_INCOHERENT reporting true as
+		 * long as the platform has SME enabled, so that
+		 * stop_this_cpu() can do the necessary WBINVD during kexec().
+		 */
+		cc_vendor = CC_VENDOR_AMD;
 	} else {
 		/* SEV state cannot be controlled by a command line option */
 		sme_me_mask = me_mask;
+		cc_vendor = CC_VENDOR_AMD;
 		goto out;
 	}
 
@@ -608,7 +618,6 @@ void __init sme_enable(struct boot_params *bp)
 out:
 	if (sme_me_mask) {
 		physical_mask &= ~sme_me_mask;
-		cc_vendor = CC_VENDOR_AMD;
 		cc_set_mask(sme_me_mask);
 	}
 }
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index cb0d6cd1c12f..2f7273596102 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -42,6 +42,21 @@ enum cc_attr {
 	 */
 	CC_ATTR_HOST_MEM_ENCRYPT,
 
+	/**
+	 * @CC_ATTR_HOST_MEM_INCOHERENT: Host memory encryption can be
+	 *				 incoherent
+	 *
+	 * The platform/OS is running as a bare-metal system or a hypervisor.
+	 * The memory encryption engine might have left non-cache-coherent
+	 * data in the caches that needs to be flushed.
+	 *
+	 * Use this in places where the cache coherency of the memory matters
+	 * but the encryption status does not.
+	 *
+	 * Includes all systems that set CC_ATTR_HOST_MEM_ENCRYPT.
+	 */
+	CC_ATTR_HOST_MEM_INCOHERENT,
+
 	/**
 	 * @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active
 	 *
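As a usage note for the new attribute's kernel-doc above, the following is an
illustrative sketch only (the helper name is hypothetical and not part of this
patch): test CC_ATTR_HOST_MEM_INCOHERENT where cache coherency matters, and
keep testing CC_ATTR_HOST_MEM_ENCRYPT where the encryption status itself
matters.

	#include <linux/cc_platform.h>
	#include <asm/special_insns.h>	/* native_wbinvd() */

	/* Hypothetical helper, for illustration only. */
	static void example_flush_before_kexec(void)
	{
		/*
		 * Cache coherency matters here, encryption status does not:
		 * flush whenever the memory encryption engine could have left
		 * non-coherent cachelines behind, even if this kernel never
		 * enabled SME itself.
		 */
		if (cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT))
			native_wbinvd();
	}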