Message ID | 1d38d28c2731075d66ac65b56b813a138900f638.1680628986.git.thomas.lendacky@amd.com |
---|---|
State | New |
Headers |
From: Tom Lendacky <thomas.lendacky@amd.com> To: <linux-kernel@vger.kernel.org>, <x86@kernel.org> CC: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "Kirill A. Shutemov" <kirill@shutemov.name>, "H. Peter Anvin" <hpa@zytor.com>, Michael Roth <michael.roth@amd.com>, Joerg Roedel <jroedel@suse.de>, Dionna Glaze <dionnaglaze@google.com>, Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Ard Biescheuvel <ardb@kernel.org>, "Min M. Xu" <min.m.xu@intel.com>, Gerd Hoffmann <kraxel@redhat.com>, James Bottomley <jejb@linux.ibm.com>, Tom Lendacky <Thomas.Lendacky@amd.com>, Jiewen Yao <jiewen.yao@intel.com>, Erdem Aktas <erdemaktas@google.com>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Subject: [PATCH v7 6/6] x86/efi: Safely enable unaccepted memory in UEFI Date: Tue, 4 Apr 2023 12:23:06 -0500 Message-ID: <1d38d28c2731075d66ac65b56b813a138900f638.1680628986.git.thomas.lendacky@amd.com> In-Reply-To: <cover.1680628986.git.thomas.lendacky@amd.com> References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> <cover.1680628986.git.thomas.lendacky@amd.com> List-ID: <linux-kernel.vger.kernel.org> |
Series |
Provide SEV-SNP support for unaccepted memory
|
|
Commit Message
Tom Lendacky
April 4, 2023, 5:23 p.m. UTC
From: Dionna Glaze <dionnaglaze@google.com>

The UEFI v2.9 specification includes a new memory type to be used in
environments where the OS must accept memory that is provided from its
host. Before the introduction of this memory type, all memory was
accepted eagerly in the firmware. In order for the firmware to safely
stop accepting memory on the OS's behalf, the OS must affirmatively
indicate support to the firmware. This is only a problem for AMD
SEV-SNP, since Linux has had support for it since 5.19. The other
technology that can make use of unaccepted memory, Intel TDX, does not
yet have Linux support, so it can strictly require unaccepted memory
support as a dependency of CONFIG_TDX and not require communication with
the firmware.

Enabling unaccepted memory requires calling a 0-argument enablement
protocol before ExitBootServices. This call is only made if the kernel
is compiled with UNACCEPTED_MEMORY=y

This protocol will be removed after the end of life of the first LTS
that includes it, in order to give firmware implementations an
expiration date for it. When the protocol is removed, firmware will
strictly infer that a SEV-SNP VM is running an OS that supports the
unaccepted memory type. At the earliest convenience, when unaccepted
memory support is added to Linux, SEV-SNP may take strict dependence in
it. After the firmware removes support for the protocol, this patch
should be reverted.

[tl: address some checkscript warnings]

Cc: Ard Biescheuvel <ardb@kernel.org>
Cc: "Min M. Xu" <min.m.xu@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/firmware/efi/libstub/x86-stub.c | 36 +++++++++++++++++++++++++
 include/linux/efi.h                     |  3 +++
 2 files changed, 39 insertions(+)
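The handshake the commit message describes can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyFirmware`, `boot`, and the page addresses are hypothetical names invented for illustration, and nothing here is taken from the patch or from a real UEFI interface. It shows the one behavior the message relies on: firmware accepts all memory at ExitBootServices unless the OS opted in beforehand.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ToyFirmware:
    """Toy stand-in for SEV-SNP guest firmware (hypothetical, not a UEFI API)."""

    unaccepted: Set[int] = field(default_factory=lambda: {0x1000, 0x2000, 0x3000})
    os_opted_in: bool = False

    def allow_unaccepted_memory(self) -> None:
        # Zero-argument opt-in, modeled on the enablement call in the commit message.
        self.os_opted_in = True

    def exit_boot_services(self) -> Set[int]:
        # No opt-in seen: accept every page eagerly on the OS's behalf.
        if not self.os_opted_in:
            self.unaccepted.clear()
        # Return a copy of whatever is still unaccepted at hand-off.
        return set(self.unaccepted)


def boot(firmware: ToyFirmware, kernel_supports_unaccepted: bool) -> Set[int]:
    """Return the pages left unaccepted when the kernel takes over."""
    if kernel_supports_unaccepted:  # stands in for UNACCEPTED_MEMORY=y
        firmware.allow_unaccepted_memory()
    return firmware.exit_boot_services()


old_kernel_view = boot(ToyFirmware(), kernel_supports_unaccepted=False)
new_kernel_view = boot(ToyFirmware(), kernel_supports_unaccepted=True)
print(sorted(hex(p) for p in old_kernel_view))
print(sorted(hex(p) for p in new_kernel_view))
```

In this model an older kernel never sees unaccepted pages, while a kernel that made the call receives them untouched and has to accept them itself.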
Comments
On Tue, Apr 04, 2023 at 12:23:06PM -0500, Tom Lendacky wrote:
> From: Dionna Glaze <dionnaglaze@google.com>
>
> [...]
>
> Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

I still think it is a bad idea.

As I asked before, please include my

Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

into the patch.
On 4/4/23 10:45, Kirill A. Shutemov wrote:
> I still think it is a bad idea.
>
> As I asked before, please include my
>
> Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>
> into the patch.

I was pretty opposed to this when I first saw it too. But, Tom and
company have worn down my opposition a bit.

The fact is that we have upstream kernels out there with SEV-SNP support
that don't know anything about unaccepted memory. They're either
relegated to using the pre-accepted memory (4GB??) or _some_ entity
needs to accept the memory. That entity obviously can't be the kernel
unless we backport unaccepted memory support.

This both lets the BIOS be the page-accepting entity _and_ allows the
entity to delegate that to the kernel when it needs to.

As much as I want to nak this and pretend that those existing
kernels don't exist, my powers of self-delusion do have their limits.

If our AMD friends don't do this, what is their alternative?
On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> [...]
>
> If our AMD friends don't do this, what is their alternative?

The alternative is coordination on the host side: VMM can load a BIOS that
pre-accepts all memory if the kernel is older.

I know that it is not convenient for VMM, but it is technically possible.

Introduce an ABI with an expiration date is much more ugly. And nobody
will care about the expiration date, until you will try to remove it.
On 4/4/23 11:09, Kirill A. Shutemov wrote:
>> If our AMD friends don't do this, what is their alternative?
> The alternative is coordination on the host side: VMM can load a BIOS that
> pre-accepts all memory if the kernel is older.
>
> I know that it is not convenient for VMM, but it is technically possible.

Yeah, either a specific BIOS or a knob to tell the BIOS what it has to
do. But, either way, that requires coordination between the BIOS (or
BIOS configuration) and the specific guest. I can see why that's
unpalatable.

> Introduce an ABI with an expiration date is much more ugly. And nobody
> will care about the expiration date, until you will try to remove it.

Yeah, the only real expiration date for an ABI is "never". I don't
believe for a second that we'll ever be able to remove the interface.

Either way, I'd love to hear more from folks about why a BIOS-side option
(configuration or otherwise) is not a good option. I know we've
discussed this in a few mail threads, but it would be even better to get
it into the cover letter or documentation.
On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > [...]
> >
> > If our AMD friends don't do this, what is their alternative?
>
> The alternative is coordination on the host side: VMM can load a BIOS that
> pre-accepts all memory if the kernel is older.
>

And how does one identify such a kernel? How does the VMM know which
kernel the guest is going to load after it boots?

> I know that it is not convenient for VMM, but it is technically possible.
>
> Introduce an ABI with an expiration date is much more ugly. And nobody
> will care about the expiration date, until you will try to remove it.
>

None of us are thrilled about this, but the simple reality is that
there are kernels that do not understand unaccepted memory. EFI being
an extensible, generic, protocol based programmatic interface, the
best way of informing the loader that a kernel does understand it is
/not/ by adding some flag to some highly arch and OS specific header,
but to discover a protocol and call it.

We're past arguing that a legitimate need exists for a solution to
this problem. So what solution are you proposing?
On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > [...]
> >
> > The alternative is coordination on the host side: VMM can load a BIOS that
> > pre-accepts all memory if the kernel is older.
> >
>
> And how does one identify such a kernel? How does the VMM know which
> kernel the guest is going to load after it boots?

VMM has to know what it is running. Yes, it is cumbersome. But enabling
phase for a feature is often rough. It will get smoother over time.

> > I know that it is not convenient for VMM, but it is technically possible.
> >
> > Introduce an ABI with an expiration date is much more ugly. And nobody
> > will care about the expiration date, until you will try to remove it.
> >
>
> None of us are thrilled about this, but the simple reality is that
> there are kernels that do not understand unaccepted memory.

How is it different from any other feature the kernel is not [yet] aware
of?

Like if we boot a legacy kernel on machine with persistent memory or
memory attached over CXL, it will not see it as conventional memory.

> EFI being
> an extensible, generic, protocol based programmatic interface, the
> best way of informing the loader that a kernel does understand it is
> /not/ by adding some flag to some highly arch and OS specific header,
> but to discover a protocol and call it.
>
> We're past arguing that a legitimate need exists for a solution to
> this problem. So what solution are you proposing?

I described the solution multiple times. You just don't like it.
On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > [...]
> >
> > And how does one identify such a kernel? How does the VMM know which
> > kernel the guest is going to load after it boots?
>
> VMM has to know what it is running. Yes, it is cumbersome. But enabling
> phase for a feature is often rough. It will get smoother over time.
>

So how does the VMM get informed about what it is running? How does it
distinguish between kernels that support unaccepted memory and ones
that don't? And how does it predict which kernel a guest is going to
load?

If the solution you described many times addresses these questions,
could you please share a link?
On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote:
> [...]
>
> So how does the VMM get informed about what it is running? How does it
> distinguish between kernels that support unaccepted memory and ones
> that don't? And how does it predict which kernel a guest is going to
> load?

User will specify if it wants unaccepted memory or not for the VM. And if
it does it is his responsibility to have kernel that supports it.

And you have not addressed my question:

How is it different from any other feature the kernel is not [yet] aware
of?
On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote: > > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote: > > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote: > > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote: > > > > > > > I still think it is a bad idea. > > > > > > > > > > > > > > As I asked before, please include my > > > > > > > > > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > > > > > > > > > > > into the patch. > > > > > > > > > > > > I was pretty opposed to this when I first saw it too. But, Tom and > > > > > > company have worn down my opposition a bit. > > > > > > > > > > > > The fact is that we have upstream kernels out there with SEV-SNP support > > > > > > that don't know anything about unaccepted memory. They're either > > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity > > > > > > needs to accept the memory. That entity obviously can't be the kernel > > > > > > unless we backport unaccepted memory support. > > > > > > > > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the > > > > > > entity to delegate that to the kernel when it needs to. > > > > > > > > > > > > As much as I want to nak this and pretend that that those existing > > > > > > kernel's don't exist, my powers of self-delusion do have their limits. > > > > > > > > > > > > If our AMD friends don't do this, what is their alternative? > > > > > > > > > > The alternative is coordination on the host side: VMM can load a BIOS that > > > > > pre-accepts all memory if the kernel is older. > > > > > > > > > > > > > And how does one identify such a kernel? 
How does the VMM know which > > > > kernel the guest is going to load after it boots? > > > > > > VMM has to know what it is running. Yes, it is cumbersome. But enabling > > > phase for a feature is often rough. It will get smoother overtime. > > > > > > > So how does the VMM get informed about what it is running? How does it > > distinguish between kernels that support unaccepted memory and ones > > that don't? And how does it predict which kernel a guest is going to > > load? > > User will specify if it wants unaccepted memory or not for the VM. And if > it does it is his responsibility to have kernel that supports it. > > And you have not addressed my question: > > How is it different from any other feature the kernel is not [yet] aware > of? > It is the same problem, but this is just a better solution. Having a BIOS menu option (or similar) to choose between unaccepted memory or not (or to expose CXL memory via the EFI memory map, which is another hack I have seen) is just unnecessary complication, if the kernel can simply inform the loader about what it supports. We do this all the time with things like OsIndications. We can phase out the protocol implementation from the firmware once we no longer need it, at which point the LocateProtocol() call just becomes a NOP (we do the same thing for UGA support, which has disappeared a long time ago, but we still look for the protocol in the EFI stub). Once the firmware stops exposing this protocol (and ceases to accept memory on the OS's behalf), we can phase it out from the kernel as well. The only other potential solution I see is exposing the unaccepted memory as coldplugged ACPI memory objects, and implementing the accept calls via PRM methods. But PRM has had very little test coverage, so it is anybody's guess whether it works for the stable kernels that we need to support with this. 
It would also mean that the new unaccepted memory logic would need to be updated to cross-reference these memory regions with the EFI unaccepted memory regions and avoid claiming them both.
Hi,

> User will specify if it wants unaccepted memory or not for the VM. And if
> it does it is his responsibility to have kernel that supports it.
>
> And you have not addressed my question:
>
> How is it different from any other feature the kernel is not [yet] aware
> of?

Come on. Automatic feature negotiation is standard procedure in many
places. It's not like we are inventing something totally new here.

Just one example: When a virtio device learns a new trick, a feature flag
is added for it, and if both guest and host support it, it can be
enabled; otherwise not. There is no need for the user to configure the
virtio device features manually according to the capabilities of the
kernel it is going to boot.

take care,
  Gerd
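To make the virtio comparison concrete, here is a minimal sketch of this kind of negotiation. It is not taken from the virtio code or from this patch: the feature bit names and both helper functions are hypothetical, and the only point is that a feature becomes active when both sides advertise it, with no user configuration involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bits, named for illustration only. */
#define FEAT_LAZY_ACCEPT	(UINT64_C(1) << 0)
#define FEAT_OTHER		(UINT64_C(1) << 1)

/*
 * One side offers a set of features, the other side reports which ones it
 * understands. Only the intersection is enabled.
 */
static uint64_t negotiate_features(uint64_t offered, uint64_t understood)
{
	return offered & understood;
}

/* Check whether a single feature survived the negotiation. */
static bool feature_enabled(uint64_t negotiated, uint64_t bit)
{
	return (negotiated & bit) != 0;
}
```

With this shape, an older guest that does not know FEAT_LAZY_ACCEPT simply never sets the bit, and the offering side falls back to the old behaviour.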
On 4/5/23 00:46, Ard Biesheuvel wrote:
> Once the firmware stops exposing this protocol (and ceases to accept
> memory on the OS's behalf), we can phase it out from the kernel as
> well.

This is a part of the story that I have doubts about.

How and when do you think this phase-out would happen, realistically?

The firmware will need the unaccepted memory protocol support as long as
there are guests around that need it, right?

People like to keep running old kernels for a _long_ time. Doesn't that
mean _some_ firmware will need to keep doing this dance for a long time?

As long as there is firmware out there in the wild that people want to
run new kernels on, the support needs to stay in mainline. It can't be
dropped.
On Wed, Apr 05, 2023 at 09:46:59AM +0200, Ard Biesheuvel wrote: > On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote: > > > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote: > > > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > > > > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote: > > > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote: > > > > > > > > I still think it is a bad idea. > > > > > > > > > > > > > > > > As I asked before, please include my > > > > > > > > > > > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > > > > > > > > > > > > > into the patch. > > > > > > > > > > > > > > I was pretty opposed to this when I first saw it too. But, Tom and > > > > > > > company have worn down my opposition a bit. > > > > > > > > > > > > > > The fact is that we have upstream kernels out there with SEV-SNP support > > > > > > > that don't know anything about unaccepted memory. They're either > > > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity > > > > > > > needs to accept the memory. That entity obviously can't be the kernel > > > > > > > unless we backport unaccepted memory support. > > > > > > > > > > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the > > > > > > > entity to delegate that to the kernel when it needs to. > > > > > > > > > > > > > > As much as I want to nak this and pretend that that those existing > > > > > > > kernel's don't exist, my powers of self-delusion do have their limits. > > > > > > > > > > > > > > If our AMD friends don't do this, what is their alternative? 
> > > > > >
> > > > > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > > > > pre-accepts all memory if the kernel is older.
> > > > > >
> > > > >
> > > > > And how does one identify such a kernel? How does the VMM know which
> > > > > kernel the guest is going to load after it boots?
> > > >
> > > > VMM has to know what it is running. Yes, it is cumbersome. But enabling
> > > > phase for a feature is often rough. It will get smoother overtime.
> > > >
> > >
> > > So how does the VMM get informed about what it is running? How does it
> > > distinguish between kernels that support unaccepted memory and ones
> > > that don't? And how does it predict which kernel a guest is going to
> > > load?
> >
> > User will specify if it wants unaccepted memory or not for the VM. And if
> > it does it is his responsibility to have kernel that supports it.
> >
> > And you have not addressed my question:
> >
> > How is it different from any other feature the kernel is not [yet] aware
> > of?
> >
>
> It is the same problem, but this is just a better solution.

Okay, we at least agree that there is more than one solution to the
problem.

> Having a BIOS menu option (or similar) to choose between unaccepted
> memory or not (or to expose CXL memory via the EFI memory map, which is
> another hack I have seen) is just unnecessary complication, if the
> kernel can simply inform the loader about what it supports. We do this
> all the time with things like OsIndications.

It assumes that the kernel calls ExitBootServices(), which is not always
true. A bootloader in between will make it impossible for the kernel to
use any of the features exposed this way.

But we talked about this before.

BTW, can we at least acknowledge the limitation in the commit message?
> We can phase out the protocol implementation from the firmware once we
> no longer need it, at which point the LocateProtocol() call just
> becomes a NOP (we do the same thing for UGA support, which has
> disappeared a long time ago, but we still look for the protocol in the
> EFI stub).
>
> Once the firmware stops exposing this protocol (and ceases to accept
> memory on the OS's behalf), we can phase it out from the kernel as
> well.

It is unlikely to ever happen. In a few years everybody will forget about
this conversation, regardless of what is written in the commit message.

Everything works, why bother?

> The only other potential solution I see is exposing the unaccepted
> memory as coldplugged ACPI memory objects, and implementing the accept
> calls via PRM methods. But PRM has had very little test coverage, so
> it is anybody's guess whether it works for the stable kernels that we
> need to support with this. It would also mean that the new unaccepted
> memory logic would need to be updated and cross reference these memory
> regions with EFI unaccepted memory regions and avoid claiming them
> both.

Nah. That is a lot of complexity for no particular reason.
On Wed, 5 Apr 2023 at 15:00, Dave Hansen <dave.hansen@intel.com> wrote: > > On 4/5/23 00:46, Ard Biesheuvel wrote: > > Once the firmware stops exposing this protocol (and ceases to accept > > memory on the OS's behalf), we can phase it out from the kernel as > > well. > > This is a part of the story that I have doubts about. > > How and when do you think this phase-out would happen, realistically? > > The firmware will need the unaccepted memory protocol support as long as > there are guests around that need it, right? > Current firmware will accept all memory on behalf of the OS unless the OS invokes the protocol to prevent it from doing so. Future firmware will simply never accept all memory on behalf of the OS, and not expose the protocol at all. So the difference of opinion mainly comes down to whether or not the intermediate, first step is needed or not. Unenlightened OS kernels will not invoke the protocol, and will therefore need current firmware in order to see all of their memory. Enlightened OS kernels will invoke the protocol unless it does not exist, and so will be able to accept their memory lazily both on current and future firmware. We will be able to move to future firmware once we no longer need to support unenlightened kernels. > People like to keep running old kernels for a _long_ time. Doesn't that > mean _some_ firmware will need to keep doing this dance for a long time? > Yes. > As long as there is firmware out there in the wild that people want to > run new kernels on, the support needs to stay in mainline. It can't be > dropped. The penalty for not calling the protocol on firmware that implements it is a much slower boot, but everything works as it should beyond that. Given that the intent here is to retain compatibility with unenlightened workloads (i.e., which do not upgrade their kernels), I think it is perfectly reasonable to drop this from mainline at some point.
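The four firmware/kernel combinations described in this message can be written down as a small decision function. This is only a model of the argument, not kernel or firmware code: the enum names and boot_result() are hypothetical. The outcome for an unenlightened kernel on future firmware is not spelled out in this message; it is filled in from the description Kirill gives in his reply to Dave's table (only pre-accepted memory is usable).

```c
#include <stdbool.h>

/* Hypothetical labels for the two firmware generations discussed above. */
enum firmware {
	FW_CURRENT,	/* exposes the protocol, accepts all memory by default */
	FW_FUTURE,	/* no protocol, never accepts all memory for the OS */
};

enum boot_outcome {
	BOOT_EAGER_ALL_MEMORY,	/* firmware accepted everything: slow boot */
	BOOT_LAZY_ACCEPT,	/* kernel accepts memory on demand: fast boot */
	BOOT_PREACCEPTED_ONLY,	/* kernel only sees the pre-accepted memory */
};

static enum boot_outcome boot_result(enum firmware fw, bool kernel_enlightened)
{
	bool protocol_exposed = (fw == FW_CURRENT);
	/* An enlightened kernel invokes the protocol unless it does not exist. */
	bool protocol_invoked = kernel_enlightened && protocol_exposed;

	/* Current firmware accepts everything unless told not to. */
	if (fw == FW_CURRENT && !protocol_invoked)
		return BOOT_EAGER_ALL_MEMORY;

	/* Unaccepted memory reaches the kernel; only enlightened ones cope. */
	if (kernel_enlightened)
		return BOOT_LAZY_ACCEPT;

	return BOOT_PREACCEPTED_ONLY;
}
```

The model shows why the intermediate step matters: without FW_CURRENT, unenlightened kernels only ever land in BOOT_PREACCEPTED_ONLY.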
On Wed, 5 Apr 2023 at 15:42, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > On Wed, Apr 05, 2023 at 09:46:59AM +0200, Ard Biesheuvel wrote: > > On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote: > > > > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > > > > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote: > > > > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote: > > > > > > > > > > > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote: > > > > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote: > > > > > > > > > I still think it is a bad idea. > > > > > > > > > > > > > > > > > > As I asked before, please include my > > > > > > > > > > > > > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> > > > > > > > > > > > > > > > > > > into the patch. > > > > > > > > > > > > > > > > I was pretty opposed to this when I first saw it too. But, Tom and > > > > > > > > company have worn down my opposition a bit. > > > > > > > > > > > > > > > > The fact is that we have upstream kernels out there with SEV-SNP support > > > > > > > > that don't know anything about unaccepted memory. They're either > > > > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity > > > > > > > > needs to accept the memory. That entity obviously can't be the kernel > > > > > > > > unless we backport unaccepted memory support. > > > > > > > > > > > > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the > > > > > > > > entity to delegate that to the kernel when it needs to. > > > > > > > > > > > > > > > > As much as I want to nak this and pretend that that those existing > > > > > > > > kernel's don't exist, my powers of self-delusion do have their limits. 
> > > > > > > > > > > > > > > > If our AMD friends don't do this, what is their alternative? > > > > > > > > > > > > > > The alternative is coordination on the host side: VMM can load a BIOS that > > > > > > > pre-accepts all memory if the kernel is older. > > > > > > > > > > > > > > > > > > > And how does one identify such a kernel? How does the VMM know which > > > > > > kernel the guest is going to load after it boots? > > > > > > > > > > VMM has to know what it is running. Yes, it is cumbersome. But enabling > > > > > phase for a feature is often rough. It will get smoother overtime. > > > > > > > > > > > > > So how does the VMM get informed about what it is running? How does it > > > > distinguish between kernels that support unaccepted memory and ones > > > > that don't? And how does it predict which kernel a guest is going to > > > > load? > > > > > > User will specify if it wants unaccepted memory or not for the VM. And if > > > it does it is his responsibility to have kernel that supports it. > > > > > > And you have not addressed my question: > > > > > > How is it different from any other feature the kernel is not [yet] aware > > > of? > > > > > > > It is the same problem, but this is just a better solution. > > Okay, we at least agree that there are more then one solution to the > problem. > > > Having a BIOS menu option (or similar) to choose between unaccepted > > memory or not (or to expose CXL memory via the EFI memory map, which is > > another hack I have seen) is just unnecessary complication, if the > > kernel can simply inform the loader about what it supports. We do this > > all the time with things like OsIndications. > > It assumes that kernel calls ExitBootServices() which is not always true. > A bootloader in between will make impossible for kernel to use any of > futures exposed this way. > > But we talked about this before. > Yes, we have. 
But this is a theoretical concern, as nobody who is deploying this stuff is interested in booting the kernel without the stub: even the trenchboot folks are bending over backwards to incorporate execution of the kernel's EFI stub into the D-RTM bootflow, and all of the confidential compute attestation logic is based on EFI protocols as well. So using a bootloader that calls ExitBootServices() and subsequently boots the Linux kernel using the legacy boot protocol is simply not something anyone is interested in doing. But don't take my word for it. > BTW, can we at least acknowledge the limitation in the commit message? > Sure. > > We can phase out the protocol implementation from the firmware once we > > no longer need it, at which point the LocateProtocol() call just > > becomes a NOP (we do the same thing for UGA support, which has > > disappeared a long time ago, but we still look for the protocol in the > > EFI stub). > > > > Once the firmware stops exposing this protocol (and ceases to accept > > memory on the OS's behalf), we can phase it out from the kernel as > > well. > > It is unlikely to ever happen. In few year everybody will forget about > this conversation. Regardless of what is written in commit message. > > Everything works, why bother? > That is a good question. If it doesn't get in the way and does not prevent us from doing any of the things we want to do, why would we even care? But as I argued in my reply to Dave, we can actually drop it from mainline later if we provide an upgrade path for legacy workloads that want to upgrade their kernels. > > The only other potential solution I see is exposing the unaccepted > > memory as coldplugged ACPI memory objects, and implementing the accept > > calls via PRM methods. But PRM has had very little test coverage, so > > it is anybody's guess whether it works for the stable kernels that we > > need to support with this. 
It would also mean that the new unaccepted > > memory logic would need to be updated and cross reference these memory > > regions with EFI unaccepted memory regions and avoid claiming them > > both. > > Nah. That is a lot of complexity for no particular reason. > Good, at least we agree on that :-)
On 4/5/23 06:44, Ard Biesheuvel wrote:
> Given that the intent here is to retain compatibility with
> unenlightened workloads (i.e., which do not upgrade their kernels), I
> think it is perfectly reasonable to drop this from mainline at some
> point.

OK, so there are three firmware types that matter:

1. Today's SEV-SNP deployed firmware.
2. Near future SEV-SNP firmware that exposes the new ExitBootServices()
   protocol that allows guests that speak the protocol to boot faster
   by participating in the unaccepted memory dance.
3. Far future firmware that doesn't have the ExitBootServices() protocol

There are also three kernel types:
1. Old kernels with zero unaccepted memory support: no
   ExitBootServices() protocol support and no hypercalls to accept pages
2. Kernels that can accept pages and twiddle the ExitBootServices() flag
3. Future kernels that can accept pages, but have had ExitBootServices()
   support removed.

That leads to nine possible mix-and-match firmware/kernel combos. I'm
personally assuming that folks are going to *try* to run with all of
these combos and will send us kernel folks bug reports if they see
regressions. Let's just enumerate all of them and their implications
before we go consult our crystal balls about what folks will actually do
in the future.

So, here we go:

            |                    Kernel                  |
            |                                            |
            | Unenlightened | Enlightened | Dropped UEFI |
Firmware    |     ~5.19??   |    ~6.4??   | protocol     |
            |---------------+-------------+--------------|
Deployed    | Slow boot     | Slow boot   | Slow boot    |
Near future | Slow boot     | Fast boot   | Slow boot    |
Far future  | Crashes??     | Fast Boot   | Fast boot    |

I hope I got that all right.

The thing that worries me is the "Near future firmware" where someone
runs a ~6.4 kernel and has a fast boot experience. They upgrade to a
newer, "dropped protocol" kernel and their boot gets slower.

I'm also a little fuzzy about what an ancient enlightened kernel would
do on a "far future" firmware that requires unaccepted memory support.
I _think_ those kernels would hit some unaccepted memory, and
#VC/#VE/#whatever and die. Is that right, or is there some fallback there?
On Wed, Apr 05, 2023 at 09:15:15AM -0700, Dave Hansen wrote: > On 4/5/23 06:44, Ard Biesheuvel wrote: > > Given that the intent here is to retain compatibility with > > unenlightened workloads (i.e., which do not upgrade their kernels), I > > think it is perfectly reasonable to drop this from mainline at some > > point. > > OK, so there are three firmware types that matter: > > 1. Today's SEV-SNP deployed firmware. > 2. Near future SEV-SNP firmware that exposes the new ExitBootServices() > protocol that allows guests that speak the protocol to boot faster > by participating in the unaccepted memory dance. > 3. Far future firmware that doesn't have the ExitBootServices() protocol > > There are also three kernel types: > 1. Old kernels with zero unaccepted memory support: no > ExitBootServices() protocol support and no hypercalls to accept pages > 2. Kernels that can accept pages and twiddle the ExitBootServices() flag > 3. Future kernels that can accept pages, but have had ExitBootServices() > support removed. > > That leads to nine possible mix-and-match firmware/kernel combos. I'm > personally assuming that folks are going to *try* to run with all of > these combos and will send us kernel folks bug reports if they see > regressions. Let's just enumerate all of them and their implications > before we go consult our crystal balls about what folks will actually do > in the future. > > So, here we go: > > | Kernel | > | | > | Unenlightened | Enlightened | Dropped UEFI | > Firmware | ~5.19?? | ~6.4?? | protocol | > |---------------+-------------+--------------| > Deployed | Slow boot | Slow boot | Slow boot | > Near future | Slow boot | Fast boot | Slow boot | > Far future | Crashes?? | Fast Boot | Fast boot | > > I hope I got that all right. > > The thing that worries me is the "Near future firmware" where someone > runs a ~6.4 kernel and has a fast boot experience. They upgrade to a > newer, "dropped protocol" kernel and their boot gets slower. 
>
> I'm also a little fuzzy about what an ancient enlightened kernel would
> do on a "far future" firmware that requires unaccepted memory support.
> I _think_ those kernels would hit some unaccepted memory, and
> #VC/#VE/#whatever and die. Is that right, or is there some fallback there?

The far future firmware in this scheme would expose unaccepted memory in
the EFI memory map without the kernel needing to declare unaccepted
memory support. The unenlightened kernel in this case will not be able
to use that memory and will consider it reserved. Only memory accepted
by the firmware will be accessible. Depending on how much memory the
firmware pre-accepts, the kernel can run out of memory, but more likely
it will boot fine with only a fraction of the memory usable.
On 4/5/23 14:06, Kirill A. Shutemov wrote:
> On Wed, Apr 05, 2023 at 09:15:15AM -0700, Dave Hansen wrote:
>> On 4/5/23 06:44, Ard Biesheuvel wrote:
>>> Given that the intent here is to retain compatibility with
>>> unenlightened workloads (i.e., which do not upgrade their kernels), I
>>> think it is perfectly reasonable to drop this from mainline at some
>>> point.
>>
>> OK, so there are three firmware types that matter:
>>
>> 1. Today's SEV-SNP deployed firmware.

SNP support was originally available as part of the edk2-stable202202
release.

>> 2. Near future SEV-SNP firmware that exposes the new ExitBootServices()
>>    protocol that allows guests that speak the protocol to boot faster
>>    by participating in the unaccepted memory dance.

This is already out and available as part of the edk2-stable202302
release. But it did come out after general SNP support, so the near
future terminology works.

>> 3. Far future firmware that doesn't have the ExitBootServices() protocol
>>
>> There are also three kernel types:
>> 1. Old kernels with zero unaccepted memory support: no
>>    ExitBootServices() protocol support and no hypercalls to accept pages
>> 2. Kernels that can accept pages and twiddle the ExitBootServices() flag
>> 3. Future kernels that can accept pages, but have had ExitBootServices()
>>    support removed.
>>
>> That leads to nine possible mix-and-match firmware/kernel combos. I'm
>> personally assuming that folks are going to *try* to run with all of
>> these combos and will send us kernel folks bug reports if they see
>> regressions. Let's just enumerate all of them and their implications
>> before we go consult our crystal balls about what folks will actually do
>> in the future.
>>
>> So, here we go:
>>
>>             |                    Kernel                  |
>>             |                                            |
>>             | Unenlightened | Enlightened | Dropped UEFI |
>> Firmware    |     ~5.19??   |    ~6.4??   | protocol     |
>>             |---------------+-------------+--------------|
>> Deployed    | Slow boot     | Slow boot   | Slow boot    |
>> Near future | Slow boot     | Fast boot   | Slow boot    |
>> Far future  | Crashes??     | Fast Boot   | Fast boot    |
>>
>> I hope I got that all right.

Looks correct to me (with Kirill's description below in place of the
"Crashes??").

>>
>> The thing that worries me is the "Near future firmware" where someone
>> runs a ~6.4 kernel and has a fast boot experience. They upgrade to a
>> newer, "dropped protocol" kernel and their boot gets slower.

Right, so that is what begs the question of when to actually drop the
call. Or does it really need to be dropped? It's a small patch to
execute a boot services call, I guess I don't see the big deal of it
being there.
If the firmware still has the protocol, the call is made; if it doesn't,
it's not. In the overall support for unaccepted memory, this seems to be
a very minor piece.

>>
>> I'm also a little fuzzy about what an ancient enlightened kernel would
>> do on a "far future" firmware that requires unaccepted memory support.
>> I _think_ those kernels would hit some unaccepted memory, and
>> #VC/#VE/#whatever and die. Is that right, or is there some fallback there?
>
> The far future firmware in this scheme would expose unaccepted memory in
> EFI memory map without need of kernel to declare unaccepted memory
> support. The unenlightened kernel in this case will not be able to use the
> memory and consider it reserved. Only memory accepted by firmware will be
> accessible. Depending on how much memory firmware would pre-accept it can
> be OOM, but more likely it will boot fine with the fraction of memory
> usable.

Right, since a typical Qemu VM has a 2GB hole for PCI/MMIO, the guest is
likely to only see 2GB of memory available to it.

Thanks,
Tom

>
On 4/5/23 13:11, Tom Lendacky wrote:
>>> The thing that worries me is the "Near future firmware" where someone
>>> runs a ~6.4 kernel and has a fast boot experience. They upgrade to a
>>> newer, "dropped protocol" kernel and their boot gets slower.
>
> Right, so that is what begs the question of when to actually drop the
> call. Or does it really need to be dropped? It's a small patch to
> execute a boot services call, I guess I don't see the big deal of it
> being there.
> If the firmware still has the protocol, the call is made, if it doesn't,
> its not. In the overall support for unaccepted memory, this seems to be
> a very minor piece.

I honestly don't think it's a big deal either, at least on the kernel
side. Maybe it's a bigger deal to the firmware folks on their side.

So, the corrected table looks something like this:

            |                    Kernel                  |
            |                                            |
            | Unenlightened | Enlightened | Dropped UEFI |
Firmware    |     ~5.19??   |    ~6.4??   | protocol     |
            |---------------+-------------+--------------|
Deployed    | Slow boot     | Slow boot   | Slow boot    |
Near future | Slow boot     | Fast boot   | Slow boot    |
Far future  | 2GB limited   | Fast Boot   | Fast boot    |

But, honestly, I don't see much benefit to the "dropped UEFI protocol".
It adds complexity and will represent a regression either in boot
speeds, or in unenlightened kernels losing RAM when moving to newer
firmware. Neither of those is great.

Looking at this _purely_ from the kernel perspective, I think I'd prefer
this situation:

            |           Kernel            |
            |                             |
            | Unenlightened | Enlightened |
Firmware    |     ~5.19??   |    ~6.4??   |
            |---------------+-------------+
Deployed    | Slow boot     | Slow boot   |
Future      | Slow boot     | Fast boot   |

and not have future firmware drop support for the handshake protocol.
That way there are no potential regressions.

Is there a compelling reason on the firmware side to drop the
ExitBootServices() protocol that I'm missing?
On Wed, 5 Apr 2023 at 23:23, Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 4/5/23 13:11, Tom Lendacky wrote:
> >>> The thing that worries me is the "Near future firmware" where someone
> >>> runs a ~6.4 kernel and has a fast boot experience. They upgrade to a
> >>> newer, "dropped protocol" kernel and their boot gets slower.
> >
> > Right, so that is what begs the question of when to actually drop the
> > call. Or does it really need to be dropped? It's a small patch to
> > execute a boot services call, I guess I don't see the big deal of it
> > being there.
> > If the firmware still has the protocol, the call is made, if it doesn't,
> > its not. In the overall support for unaccepted memory, this seems to be
> > a very minor piece.
>
> I honestly don't think it's a big deal either, at least on the kernel
> side. Maybe it's a bigger deal to the firmware folks on their side.
>
> So, the corrected table looks something like this:
>
>             |                    Kernel                  |
>             |                                            |
>             | Unenlightened | Enlightened | Dropped UEFI |
> Firmware    |     ~5.19??   |    ~6.4??   | protocol     |
>             |---------------+-------------+--------------|
> Deployed    | Slow boot     | Slow boot   | Slow boot    |
> Near future | Slow boot     | Fast boot   | Slow boot    |
> Far future  | 2GB limited   | Fast Boot   | Fast boot    |
>

I don't think there is any agreement on the firmware side on what
constitutes a reasonable minimum to accept when lazy accept is in
use, so the 2 GiB is really the upper bound here, and it could be
substantially less.

>
> But, honestly, I don't see much benefit to the "dropped UEFI protocol".
> It adds complexity and will represent a regression either in boot
> speeds, or in unenlightened kernels losing RAM when moving to newer
> firmware. Neither of those is great.
>
> Looking at this _purely_ from the kernel perspective, I think I'd prefer
> this situation:
>
>             |           Kernel            |
>             |                             |
>             | Unenlightened | Enlightened |
> Firmware    |     ~5.19??   |    ~6.4??   |
>             |---------------+-------------+
> Deployed    | Slow boot     | Slow boot   |
> Future      | Slow boot     | Fast boot   |
>
> and not have future firmware drop support for the handshake protocol.
> That way there are no potential regressions.
>
> Is there a compelling reason on the firmware side to drop the
> ExitBootServices() protocol that I'm missing?

The protocol only exists to stop the firmware from eagerly accepting
all memory on behalf of the OS. So from the firmware side, it would be
more about removing that functionality (making the protocol call moot)
rather than removing the protocol itself.
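A rough model of the firmware side may help here. This is a sketch only and is not taken from OVMF: struct mem_range, struct fw_state and both functions are hypothetical. It assumes the firmware keeps a single flag that the protocol call sets, and that on ExitBootServices() it accepts every remaining unaccepted range unless the flag is set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one memory map entry. */
struct mem_range {
	unsigned long long start;
	unsigned long long size;
	bool accepted;
};

/* Hypothetical firmware state: did the OS ask to keep memory unaccepted? */
struct fw_state {
	bool os_allows_unaccepted;
};

/* Models what the protocol call does: it only records the OS's request. */
static void model_allow_unaccepted_memory(struct fw_state *fw)
{
	fw->os_allows_unaccepted = true;
}

/*
 * Models the ExitBootServices() hook. Returns how many ranges the firmware
 * accepted on behalf of the OS.
 */
static size_t model_exit_boot_services(const struct fw_state *fw,
				       struct mem_range *map, size_t count)
{
	size_t accepted_now = 0;
	size_t i;

	/* The OS opted in to lazy acceptance: leave the map alone. */
	if (fw->os_allows_unaccepted)
		return 0;

	for (i = 0; i < count; i++) {
		if (!map[i].accepted) {
			map[i].accepted = true;	/* the slow, eager path */
			accepted_now++;
		}
	}
	return accepted_now;
}
```

In this model, "removing that functionality" corresponds to deleting the eager loop, after which the flag and the call that sets it no longer change anything.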
diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 1afe7b5b02e1..119e201cfc68 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -27,6 +27,17 @@ const efi_dxe_services_table_t *efi_dxe_table;
 u32 image_offset __section(".data");
 static efi_loaded_image_t *image = NULL;
 
+typedef union sev_memory_acceptance_protocol sev_memory_acceptance_protocol_t;
+union sev_memory_acceptance_protocol {
+	struct {
+		efi_status_t (__efiapi * allow_unaccepted_memory)(
+			sev_memory_acceptance_protocol_t *);
+	};
+	struct {
+		u32 allow_unaccepted_memory;
+	} mixed_mode;
+};
+
 static efi_status_t
 preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
 {
@@ -311,6 +322,29 @@ setup_memory_protection(unsigned long image_base, unsigned long image_size)
 #endif
 }
 
+static void setup_unaccepted_memory(void)
+{
+	efi_guid_t mem_acceptance_proto = OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID;
+	sev_memory_acceptance_protocol_t *proto;
+	efi_status_t status;
+
+	if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+		return;
+
+	/*
+	 * Enable unaccepted memory before calling exit boot services in order
+	 * for the UEFI to not accept all memory on EBS.
+	 */
+	status = efi_bs_call(locate_protocol, &mem_acceptance_proto, NULL,
+			     (void **)&proto);
+	if (status != EFI_SUCCESS)
+		return;
+
+	status = efi_call_proto(proto, allow_unaccepted_memory);
+	if (status != EFI_SUCCESS)
+		efi_err("Memory acceptance protocol failed\n");
+}
+
 static const efi_char16_t apple[] = L"Apple";
 
 static void setup_quirks(struct boot_params *boot_params,
@@ -967,6 +1001,8 @@ asmlinkage unsigned long efi_main(efi_handle_t handle,
 
 	setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start);
 
+	setup_unaccepted_memory();
+
 	status = exit_boot(boot_params, handle);
 	if (status != EFI_SUCCESS) {
 		efi_err("exit_boot() failed!\n");
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 1d4f0343c710..e728b8cf6b73 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -436,6 +436,9 @@ void efi_native_runtime_setup(void);
 #define DELLEMC_EFI_RCI2_TABLE_GUID	EFI_GUID(0x2d9f28a2, 0xa886, 0x456a, 0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
 #define AMD_SEV_MEM_ENCRYPT_GUID	EFI_GUID(0x0cf29b71, 0x9e51, 0x433a, 0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
 
+/* OVMF protocol GUIDs */
+#define OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID	EFI_GUID(0xc5a010fe, 0x38a7, 0x4531, 0x8a, 0x4a, 0x05, 0x00, 0xd2, 0xfd, 0x16, 0x49)
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;