Message ID | cover.1706307364.git.thomas.lendacky@amd.com |
---|---|
Headers |
From: Tom Lendacky <thomas.lendacky@amd.com>
To: <linux-kernel@vger.kernel.org>, <x86@kernel.org>
CC: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>, Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Dan Williams <dan.j.williams@intel.com>, Michael Roth <michael.roth@amd.com>, Ashish Kalra <ashish.kalra@amd.com>
Subject: [PATCH 00/11] Provide SEV-SNP support for running under an SVSM
Date: Fri, 26 Jan 2024 16:15:53 -0600
Message-ID: <cover.1706307364.git.thomas.lendacky@amd.com>
X-Mailer: git-send-email 2.42.0
List-Id: <linux-kernel.vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain
Series |
Provide SEV-SNP support for running under an SVSM
Message
Tom Lendacky
Jan. 26, 2024, 10:15 p.m. UTC
This series adds SEV-SNP support for running Linux under a Secure VM
Service Module (SVSM) at a less privileged VM Privilege Level (VMPL).
By running at a less privileged VMPL, the SVSM can be used to provide
services, e.g. a virtual TPM, for Linux within the SEV-SNP confidential
VM (CVM) rather than trusting such services from the hypervisor.

Currently, a Linux guest expects to run at the highest VMPL, VMPL0, and
there are certain SNP-related operations that require that VMPL level:
specifically, the PVALIDATE instruction and the RMPADJUST instruction
when setting the VMSA attribute of a page (used when starting APs).

If Linux is to run at a less privileged VMPL, e.g. VMPL2, then it must
use an SVSM (which is running at VMPL0) to perform the operations that
it is no longer able to perform.

How Linux interacts with and uses the SVSM is documented in the SVSM
specification [1] and the GHCB specification [2].

This series introduces support to run Linux under an SVSM. It consists
of:
  - Detecting the presence of an SVSM
  - When not running at VMPL0, invoking the SVSM for page validation and
    VMSA page creation/deletion (a simplified sketch of this split
    follows the diffstat below)
  - Adding a sysfs entry that specifies the Linux VMPL
  - Modifying the sev-guest driver to use the VMPCK key associated with
    the Linux VMPL
  - Expanding the config-fs TSM support to request attestation reports
    from the SVSM
  - Detecting and allowing Linux to run in a VMPL other than 0 when an
    SVSM is present

The series is based on and tested against the tip tree:
  https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git master

  b0c57a7002b0 ("Merge branch into tip/master: 'x86/cpu'")

[1] https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/58019.pdf
[2] https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/56421.pdf

---

Tom Lendacky (11):
  x86/sev: Rename snp_init() in the boot/compressed/sev.c file
  x86/sev: Make the VMPL0 checking function more generic
  x86/sev: Check for the presence of an SVSM in the SNP Secrets page
  x86/sev: Use kernel provided SVSM Calling Areas
  x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0
  x86/sev: Use the SVSM to create a vCPU when not in VMPL0
  x86/sev: Provide SVSM discovery support
  x86/sev: Provide guest VMPL level to userspace
  virt: sev-guest: Choose the VMPCK key based on executing VMPL
  x86/sev: Extend the config-fs attestation support for an SVSM
  x86/sev: Allow non-VMPL0 execution when an SVSM is present

 Documentation/ABI/testing/configfs-tsm  |  55 +++
 arch/x86/boot/compressed/sev.c          | 253 ++++++++------
 arch/x86/include/asm/msr-index.h        |   2 +
 arch/x86/include/asm/sev-common.h       |  18 +
 arch/x86/include/asm/sev.h              | 114 ++++++-
 arch/x86/include/uapi/asm/svm.h         |   1 +
 arch/x86/kernel/sev-shared.c            | 338 ++++++++++++++++++-
 arch/x86/kernel/sev.c                   | 426 +++++++++++++++++++++---
 arch/x86/mm/mem_encrypt_amd.c           |   8 +-
 drivers/virt/coco/sev-guest/sev-guest.c | 147 +++++++-
 drivers/virt/coco/tsm.c                 |  95 +++++-
 include/linux/tsm.h                     |  11 +
 12 files changed, 1300 insertions(+), 168 deletions(-)
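To make the delegation concrete, here is a minimal sketch of how the
page-validation split might look from a lower-VMPL guest. The helper
names running_at_vmpl0() and svsm_pvalidate_page() are hypothetical and
are only stand-ins; pvalidate() and RMP_PG_SIZE_4K are existing kernel
definitions, and the real SVSM request flow (calling area layout,
VMGEXIT encoding) is defined by the SVSM [1] and GHCB [2] specifications
rather than by this sketch.

	/*
	 * Simplified, illustrative sketch of the VMPL-aware page-validation
	 * split described above. Helper names are placeholders, not the
	 * series' actual code.
	 */
	#include <asm/sev.h>	/* pvalidate(); RMP_PG_SIZE_4K via <asm/sev-common.h> */

	/* Hypothetical helpers: VMPL detection and an SVSM-mediated request. */
	extern bool running_at_vmpl0(void);
	extern int svsm_pvalidate_page(unsigned long vaddr, bool validate);

	static int validate_4k_page(unsigned long vaddr, bool validate)
	{
		if (running_at_vmpl0()) {
			/* VMPL0 may execute PVALIDATE directly. */
			return pvalidate(vaddr, RMP_PG_SIZE_4K, validate);
		}

		/*
		 * At a lower VMPL, PVALIDATE is not permitted, so the request
		 * is handed to the SVSM (running at VMPL0) through its calling
		 * area and a VMGEXIT, and the SVSM performs PVALIDATE on the
		 * guest's behalf.
		 */
		return svsm_pvalidate_page(vaddr, validate);
	}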
Comments
> This series adds SEV-SNP support for running Linux under a Secure VM
> Service Module (SVSM) at a less privileged VM Privilege Level (VMPL).
> By running at a less privileged VMPL, the SVSM can be used to provide
> services, e.g. a virtual TPM, for Linux within the SEV-SNP confidential
> VM (CVM) rather than trusting such services from the hypervisor.
>
> Currently, a Linux guest expects to run at the highest VMPL, VMPL0, and
> there are certain SNP-related operations that require that VMPL level:
> specifically, the PVALIDATE instruction and the RMPADJUST instruction
> when setting the VMSA attribute of a page (used when starting APs).
>
> If Linux is to run at a less privileged VMPL, e.g. VMPL2, then it must
> use an SVSM (which is running at VMPL0) to perform the operations that
> it is no longer able to perform.
>
> How Linux interacts with and uses the SVSM is documented in the SVSM
> specification [1] and the GHCB specification [2].
>
> This series introduces support to run Linux under an SVSM. It consists
> of:
>   - Detecting the presence of an SVSM
>   - When not running at VMPL0, invoking the SVSM for page validation and
>     VMSA page creation/deletion
>   - Adding a sysfs entry that specifies the Linux VMPL
>   - Modifying the sev-guest driver to use the VMPCK key associated with
>     the Linux VMPL
>   - Expanding the config-fs TSM support to request attestation reports
>     from the SVSM
>   - Detecting and allowing Linux to run in a VMPL other than 0 when an
>     SVSM is present

Hi Tom and everyone,

This patch set imo is a good opportunity to start a wider discussion on
SVSM-style confidential guests that we actually wanted to start anyhow,
because TDX will need something similar in the future.
So let me explain our thinking and try to align together here.

In addition to the existing notion of a Confidential Computing (CoCo)
guest, both Intel and AMD define a concept by which a CoCo guest can be
further subdivided/partitioned into different SW layers running with
different privileges. In the AMD Secure Encrypted Virtualization with
Secure Nested Paging (SEV-SNP) architecture this is called VM Permission
Levels (VMPLs), and in the Intel Trust Domain Extensions (TDX)
architecture it is called TDX Partitioning. The most privileged part of
a CoCo guest is referred to as running at VMPL0 for AMD SEV-SNP and as
L1 for Intel TDX Partitioning. This privilege level has full control
over the other components running inside a CoCo guest, and some
operations are only allowed to be executed by the SW running at this
privilege level. The assumption is that this level is used for a
Virtual Machine Monitor (VMM)/Hypervisor like KVM and others, or a
lightweight Service Manager (SM) like coconut-SVSM [3].

The actual workload VM (together with its OS) is expected to run at a
different privilege level (!VMPL0 in the AMD case and the L2 layer in
the Intel case). Both architectures, in our current understanding
(please correct if this is not true for AMD), allow for different
workload VM options, ranging from a fully unmodified legacy OS to a
fully enabled/enlightened AMD SEV-SNP/Intel TDX guest and anything in
between. However, each workload guest option requires a different level
of implementation support from the most privileged VMPL0/L1 layer as
well as from the workload OS itself (running at !VMPL0/L2), and also
has different effects on overall performance and other factors.

Linux, as one of the workload OSes, currently doesn't define a common
notion or interfaces for such a special type of CoCo guest, and there is
a risk that each vendor duplicates a lot of common concepts inside the
AMD SEV-SNP or Intel TDX specific code. This is not the approach Linux
usually prefers, and a vendor-agnostic solution should be explored
first.

So this is an attempt to start a joint discussion on how/what/if we can
unify in this space and, following the recent lkml thread [1], it seems
we need to first clarify how we see this special !VMPL0/L2 guest and
whether we can or need to define a common notion for it.
The following options are *theoretically* possible:

1. Keep the !VMPL0/L2 guest as an unmodified AMD SEV-SNP/Intel TDX guest
   and hide all complexity inside the VMPL0/L1 VMM and/or the respective
   Intel/AMD architecture internal components. This likely creates
   additional complexity in the implementation of the VMPL0/L1 layer
   compared to the other options below. This option also doesn't allow
   service providers to unify their interfaces between AMD/Intel
   solutions, but requires their VMPL0/L1 layer to handle the
   differences between these guests. On the plus side, this option
   requires no changes in the existing AMD SEV-SNP/Intel TDX Linux guest
   code to support a !VMPL0/L2 guest. The big open question we have here
   to the AMD folks is whether it is architecturally feasible for you to
   support this case.

2. Keep it as an Intel TDX/AMD SEV-SNP guest with some Linux guest
   internal code logic to handle whether it runs in L1 vs L2/VMPL0 vs
   !VMPL0. This is essentially what this patch series is doing for AMD.
   This option potentially creates many if statements inside the
   respective Linux implementations of these technologies to handle the
   differences, complicates the code, and doesn't allow service
   providers to unify their L1/VMPL0 code. This option was also
   previously proposed for Intel TDX in the lkml thread [1] and got a
   negative initial reception.

3. Keep it as a legacy non-CoCo guest. This option is very bad from a
   performance point of view since all I/O must be done via the VMPL0/L1
   layer, and it is considered infeasible/unacceptable by service
   providers (performance of networking and disk is horrible). It also
   requires an extensive implementation in the VMPL0/L1 layer to support
   emulation of all devices.

4. Define a new guest abstraction/guest type that would be used for the
   !VMPL0/L2 guest. This allows, in the future, defining a unified
   L2 <-> L1 / !VMPL0 <-> VMPL0 communication interface that underneath
   would use the Intel TDX/AMD SEV-SNP specified communication
   primitives. Of the existing Linux code, this approach is followed to
   some initial degree by the MSFT Hyper-V implementation [2]. It
   defines a new type of virtualized guest with its own initialization
   path and callbacks in x86_platform.guest/hyper.*. However, in our
   understanding no one has yet attempted to define a unified
   abstraction for such a guest, or a unified interface. AMD SEV-SNP has
   defined in [4] a VMPL0 <--> !VMPL0 communication interface which is
   AMD specific. (An illustrative sketch of what such an abstraction
   could look like follows this message.)

5. Anything else that is missing?

References:

[1] https://lkml.org/lkml/2023/11/22/1089

[2] MSFT hyper-v implementation of AMD SEV-SNP !VMPL0 guest and TDX L2
    partitioning guest:
    https://elixir.bootlin.com/linux/latest/source/arch/x86/hyperv/ivm.c#L575

[3] https://github.com/coconut-svsm/svsm

[4] https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/58019.pdf
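To make option 4 above slightly more concrete, the following is a
minimal, purely hypothetical sketch of what a vendor-agnostic
abstraction for a !VMPL0/L2 guest could look like. The struct name
coco_psl_ops, its callbacks, and the registration helper are invented
for illustration only; they do not exist in the kernel or in any
specification, and a real proposal would have to map onto the existing
x86_platform.guest callbacks and the vendor-specific primitives (SVSM
protocol, TDX Partitioning) underneath.

	/*
	 * Hypothetical sketch only: a vendor-agnostic interface a !VMPL0/L2
	 * guest could use to reach its more privileged VMPL0/L1 layer. The
	 * struct, field names and helpers below are invented for illustration.
	 */
	#include <linux/types.h>
	#include <linux/errno.h>

	struct coco_psl_ops {
		/* Validate/invalidate a range of guest pages (PVALIDATE / TDX accept). */
		int (*page_state_change)(unsigned long paddr, unsigned int npages,
					 bool validate);
		/* Create or destroy a vCPU context (e.g. an SNP VMSA) for AP bringup. */
		int (*vcpu_create)(unsigned int cpu, unsigned long ctx_paddr);
		int (*vcpu_destroy)(unsigned int cpu);
		/* Request an attestation report from the privileged layer. */
		int (*attest)(const void *req, size_t req_len, void *report,
			      size_t *report_len);
	};

	static const struct coco_psl_ops *coco_psl;

	/* A vendor backend (SVSM, TDX L1, Hyper-V paravisor) registers once. */
	int coco_psl_register(const struct coco_psl_ops *ops)
	{
		if (coco_psl)
			return -EBUSY;
		coco_psl = ops;
		return 0;
	}

	/* Common code calls through the abstraction without knowing the vendor. */
	int coco_psl_validate_pages(unsigned long paddr, unsigned int npages)
	{
		if (!coco_psl || !coco_psl->page_state_change)
			return -EOPNOTSUPP;
		return coco_psl->page_state_change(paddr, npages, true);
	}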
On 2/12/24 04:40, Reshetova, Elena wrote:
>> This series adds SEV-SNP support for running Linux under a Secure VM
>> Service Module (SVSM) at a less privileged VM Privilege Level (VMPL).
>> [...]
>
> Hi Tom and everyone,
>
> This patch set imo is a good opportunity to start a wider discussion on
> SVSM-style confidential guests that we actually wanted to start anyhow,
> because TDX will need something similar in the future.
> So let me explain our thinking and try to align together here.
>
> In addition to the existing notion of a Confidential Computing (CoCo)
> guest, both Intel and AMD define a concept by which a CoCo guest can be
> further subdivided/partitioned into different SW layers running with
> different privileges. In the AMD Secure Encrypted Virtualization with
> Secure Nested Paging (SEV-SNP) architecture this is called VM
> Permission Levels (VMPLs), and in the Intel Trust Domain Extensions
> (TDX) architecture it is called TDX Partitioning. The most privileged
> part of a CoCo guest is referred to as running at VMPL0 for AMD SEV-SNP
> and as L1 for Intel TDX Partitioning. This privilege level has full
> control over the other components running inside a CoCo guest, and some
> operations are only allowed to be executed by the SW running at this
> privilege level. The assumption is that this level is used for a
> Virtual Machine Monitor (VMM)/Hypervisor like KVM and others, or a
> lightweight Service Manager (SM) like coconut-SVSM [3].

I'm not sure what you mean about the level being used for a
VMM/hypervisor, since they are running in the host. Coconut-SVSM is
correct, since it is running within the guest context.

> The actual workload VM (together with its OS) is expected to run at a
> different privilege level (!VMPL0 in the AMD case and the L2 layer in
> the Intel case). Both architectures, in our current understanding
> (please correct if this is not true for AMD), allow for different
> workload VM options, ranging from a fully unmodified legacy OS to a
> fully enabled/enlightened AMD SEV-SNP/Intel TDX guest and anything in
> between. However, each workload guest

I'm not sure about the "anything in between" aspect. I would think that
if the guest is enlightened it would be fully enlightened or not at all.
It would be difficult to try to decide what operations should be sent to
the SVSM to handle, and how that would occur if the guest OS is unaware
of the SVSM protocol to use. If it is aware of the protocol, then it
would just use it.

For the unenlightened guest, it sounds like more of a para-visor
approach being used, where the guest wouldn't know that control was ever
transferred to the para-visor to handle the event. With SNP, that would
be done through a feature called Reflect-VC. But that means it is an
all-or-nothing action.

> option requires a different level of implementation support from the
> most privileged VMPL0/L1 layer as well as from the workload OS itself
> (running at !VMPL0/L2), and also has different effects on overall
> performance and other factors. Linux, as one of the workload OSes,
> currently doesn't define a common notion or interfaces for such a
> special type of CoCo guest, and there is a risk that each vendor
> duplicates a lot of common concepts inside the AMD SEV-SNP or Intel TDX
> specific code. This is not the approach Linux usually prefers, and a
> vendor-agnostic solution should be explored first.
>
> So this is an attempt to start a joint discussion on how/what/if we can
> unify in this space and, following the recent lkml thread [1], it seems
> we need to first clarify how we see this special !VMPL0/L2 guest and
> whether we can or need to define a common notion for it.
> The following options are *theoretically* possible:
>
> 1. Keep the !VMPL0/L2 guest as an unmodified AMD SEV-SNP/Intel TDX
>    guest and hide all complexity inside the VMPL0/L1 VMM and/or the
>    respective Intel/AMD architecture internal components. This likely
>    creates additional complexity in the implementation of the VMPL0/L1
>    layer compared to the other options below. This option also doesn't
>    allow service providers to unify their interfaces between AMD/Intel
>    solutions, but requires their VMPL0/L1 layer to handle the
>    differences between these guests. On the plus side, this option
>    requires no changes in the existing AMD SEV-SNP/Intel TDX Linux
>    guest code to support a !VMPL0/L2 guest. The big open question we
>    have here to the AMD folks is whether it is architecturally feasible
>    for you to support this case.

It is architecturally feasible to support this, but it would come with a
performance penalty. For SNP, all #VC exceptions would be routed back to
the HV, into the SVSM/para-visor to be processed, back to the HV and
finally back to the guest. While we would expect some operations, such
as PVALIDATE, to have to make this kind of exchange, operations such as
CPUID or MSR accesses would suffer.

> 2. Keep it as an Intel TDX/AMD SEV-SNP guest with some Linux guest
>    internal code logic to handle whether it runs in L1 vs L2/VMPL0 vs
>    !VMPL0. This is essentially what this patch series is doing for AMD.
>    This option potentially creates many if statements inside the
>    respective Linux implementations of these technologies to handle the
>    differences, complicates the code, and doesn't allow service
>    providers to unify their L1/VMPL0 code. This option was also
>    previously proposed for Intel TDX in the lkml thread [1] and got a
>    negative initial reception.

I think the difference here is that the guest would still be identified
as an SNP guest and still use all of the memory encryption and #VC
handling it does today. It is just specific VMPL0-only operations that
would need to be performed by the SVSM instead of by the guest.

> 3. Keep it as a legacy non-CoCo guest. This option is very bad from a
>    performance point of view since all I/O must be done via the
>    VMPL0/L1 layer, and it is considered infeasible/unacceptable by
>    service providers (performance of networking and disk is horrible).
>    It also requires an extensive implementation in the VMPL0/L1 layer
>    to support emulation of all devices.
>
> 4. Define a new guest abstraction/guest type that would be used for the
>    !VMPL0/L2 guest. This allows, in the future, defining a unified
>    L2 <-> L1 / !VMPL0 <-> VMPL0 communication interface that underneath
>    would use the Intel TDX/AMD SEV-SNP specified communication
>    primitives. Of the existing Linux code, this approach is followed to
>    some initial degree by the MSFT Hyper-V implementation [2]. It
>    defines a new type of virtualized guest with its own initialization
>    path and callbacks in x86_platform.guest/hyper.*. However, in our
>    understanding no one has yet attempted to define a unified
>    abstraction for such a guest, or a unified interface. AMD SEV-SNP
>    has defined in [4] a VMPL0 <--> !VMPL0 communication interface which
>    is AMD specific.

Can TDX create a new protocol within the SVSM that it could use?

Thanks,
Tom

> 5. Anything else that is missing?
>
> References:
>
> [1] https://lkml.org/lkml/2023/11/22/1089
>
> [2] MSFT hyper-v implementation of AMD SEV-SNP !VMPL0 guest and TDX L2
>     partitioning guest:
>     https://elixir.bootlin.com/linux/latest/source/arch/x86/hyperv/ivm.c#L575
>
> [3] https://github.com/coconut-svsm/svsm
>
> [4] https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/specifications/58019.pdf
On Fri, Feb 16, 2024 at 01:46:41PM -0600, Tom Lendacky wrote:
> > 4. Define a new guest abstraction/guest type that would be used for
> >    the !VMPL0/L2 guest. This allows, in the future, defining a
> >    unified L2 <-> L1 / !VMPL0 <-> VMPL0 communication interface that
> >    underneath would use the Intel TDX/AMD SEV-SNP specified
> >    communication primitives.
> >    [...]
>
> Can TDX create a new protocol within the SVSM that it could use?

Sure we can. But it contributes to the virtualization zoo. The situation
is bad enough as it is. Ideally we would have a single SVSM guest type
instead of SVSM/TDX and SVSM/SEV.
> Subject: Re: [PATCH 00/11] Provide SEV-SNP support for running under an SVSM
>
> On 2/12/24 04:40, Reshetova, Elena wrote:
> >> This series adds SEV-SNP support for running Linux under a Secure VM
> >> Service Module (SVSM) at a less privileged VM Privilege Level (VMPL).
> >> [...]
> >
> > Hi Tom and everyone,
> > [...]
> > The assumption is that this level is used for a Virtual Machine
> > Monitor (VMM)/Hypervisor like KVM and others, or a lightweight
> > Service Manager (SM) like coconut-SVSM [3].
>
> I'm not sure what you mean about the level being used for a
> VMM/hypervisor, since they are running in the host. Coconut-SVSM is
> correct, since it is running within the guest context.

What I meant is that this privilege level can in principle also be used
to host any hypervisor/VMM (not on the host, but in the guest). For TDX
we have PoCs published in the past that enabled KVM running as L1 inside
the guest.

> > The actual workload VM (together with its OS) is expected to run at a
> > different privilege level (!VMPL0 in the AMD case and the L2 layer in
> > the Intel case). Both architectures, in our current understanding
> > (please correct if this is not true for AMD), allow for different
> > workload VM options, ranging from a fully unmodified legacy OS to a
> > fully enabled/enlightened AMD SEV-SNP/Intel TDX guest and anything in
> > between. However, each workload guest
>
> I'm not sure about the "anything in between" aspect. I would think that
> if the guest is enlightened it would be fully enlightened or not at
> all. It would be difficult to try to decide what operations should be
> sent to the SVSM to handle, and how that would occur if the guest OS is
> unaware of the SVSM protocol to use. If it is aware of the protocol,
> then it would just use it.

Architecturally we can support guests that would fall somewhere in
between a fully enlightened guest and a legacy non-CoCo guest, albeit I
am not saying it is the way to go. A minimally enlightened guest can ask
the SVSM for a service on some things (i.e. attestation evidence) but
behave fully unenlightened when it comes to other things (like handling
MMIO - it would be emulated by the SVSM or forwarded to the host).

> For the unenlightened guest, it sounds like more of a para-visor
> approach being used, where the guest wouldn't know that control was
> ever transferred to the para-visor to handle the event. With SNP, that
> would be done through a feature called Reflect-VC. But that means it is
> an all-or-nothing action.

Thank you for the SEV insights.

> > 1. Keep the !VMPL0/L2 guest as an unmodified AMD SEV-SNP/Intel TDX
> >    guest and hide all complexity inside the VMPL0/L1 VMM and/or the
> >    respective Intel/AMD architecture internal components.
> >    [...]
> >    The big open question we have here to the AMD folks is whether it
> >    is architecturally feasible for you to support this case.
>
> It is architecturally feasible to support this, but it would come with
> a performance penalty. For SNP, all #VC exceptions would be routed back
> to the HV, into the SVSM/para-visor to be processed, back to the HV and
> finally back to the guest. While we would expect some operations, such
> as PVALIDATE, to have to make this kind of exchange, operations such as
> CPUID or MSR accesses would suffer.

Sorry for my ignorance, but what is the HV?

> > 2. Keep it as an Intel TDX/AMD SEV-SNP guest with some Linux guest
> >    internal code logic to handle whether it runs in L1 vs L2/VMPL0 vs
> >    !VMPL0. This is essentially what this patch series is doing for
> >    AMD.
> >    [...]
>
> I think the difference here is that the guest would still be identified
> as an SNP guest and still use all of the memory encryption and #VC
> handling it does today. It is just specific VMPL0-only operations that
> would need to be performed by the SVSM instead of by the guest.

I see, you are saying there is less fragmentation overall, but I think
this option still reflects it to some degree as well.

> > 3. Keep it as a legacy non-CoCo guest.
> >    [...]
> >
> > 4. Define a new guest abstraction/guest type that would be used for
> >    the !VMPL0/L2 guest. This allows, in the future, defining a
> >    unified L2 <-> L1 / !VMPL0 <-> VMPL0 communication interface that
> >    underneath would use the Intel TDX/AMD SEV-SNP specified
> >    communication primitives.
> >    [...]
>
> Can TDX create a new protocol within the SVSM that it could use?

Kirill already commented on this, and the answer is of course we can,
but imo we need to see the bigger picture first. If we go with option 2
above, then coming up with a joint protocol is only of limited use
because we likely won't be able to share the code in the guest kernel.
Ideally, I think we want a common concept and a common protocol that we
can share in both the guest kernel and coconut-svsm.

Btw, is continuing the discussion here the best/preferred/most efficient
way forward? Or should we set up a call with anyone who is interested in
the topic to form a joint understanding of what can be done here?

Best Regards,
Elena.
On 2/19/24 11:54, Reshetova, Elena wrote:
>> Subject: Re: [PATCH 00/11] Provide SEV-SNP support for running under an SVSM
>>
>> On 2/12/24 04:40, Reshetova, Elena wrote:
>>>> This series adds SEV-SNP support for running Linux under a Secure VM
>>>> Service Module (SVSM) at a less privileged VM Privilege Level (VMPL).
>>>> [...]
>
> Sorry for my ignorance, but what is the HV?

HV == Hypervisor

> Kirill already commented on this, and the answer is of course we can,
> but imo we need to see the bigger picture first. If we go with option 2
> above, then coming up with a joint protocol is only of limited use
> because we likely won't be able to share the code in the guest kernel.
> Ideally, I think we want a common concept and a common protocol that we
> can share in both the guest kernel and coconut-svsm.
>
> Btw, is continuing the discussion here the best/preferred/most
> efficient way forward? Or should we set up a call with anyone who is
> interested in the topic to form a joint understanding of what can be
> done here?

I'm not sure what the best way forward is since I'm not sure what a
common concept / common protocol would look like. If you feel we can
effectively describe it via email, then we should continue that, maybe
on a new thread under linux-coco. If not, then a call might be best.

Thanks,
Tom
> > Kirill already commented on this, and the answer is of course we
> > can, but imo we need to see the bigger picture first. If we go with
> > option 2 above, then coming up with a joint protocol is only of
> > limited use because we likely won't be able to share the code in the
> > guest kernel. Ideally, I think we want a common concept and a common
> > protocol that we can share in both the guest kernel and coconut-svsm.
> >
> > Btw, is continuing the discussion here the best/preferred/most
> > efficient way forward? Or should we set up a call with anyone who is
> > interested in the topic to form a joint understanding of what can be
> > done here?
>
> I'm not sure what the best way forward is since I'm not sure what a
> common concept / common protocol would look like. If you feel we can
> effectively describe it via email, then we should continue that, maybe
> on a new thread under linux-coco. If not, then a call might be best.

OK, let us first put together a proposal from our side on how this could
potentially look. It might be easier to have a discussion against
something more concrete.

Best Regards,
Elena.