[RFC PATCH V6 02/14] x86/sev: Add Check of #HV event in path

Message ID | 20230515165917.1306922-3-ltykernel@gmail.com |
---|---|
State | New |
Headers |
From: Tianyu Lan <ltykernel@gmail.com>
Subject: [RFC PATCH V6 02/14] x86/sev: Add Check of #HV event in path
Date: Mon, 15 May 2023 12:59:04 -0400
Message-Id: <20230515165917.1306922-3-ltykernel@gmail.com>
In-Reply-To: <20230515165917.1306922-1-ltykernel@gmail.com>
References: <20230515165917.1306922-1-ltykernel@gmail.com> |
Series | x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv |
Commit Message
Tianyu Lan
May 15, 2023, 4:59 p.m. UTC
From: Tianyu Lan <tiala@microsoft.com>

Add check_hv_pending() and check_hv_pending_irq_enable() to check for
#HV events that were queued while IRQs were disabled.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
 arch/x86/entry/entry_64.S       | 18 ++++++++++++++++
 arch/x86/include/asm/irqflags.h | 14 +++++++++++-
 arch/x86/kernel/sev.c           | 38 +++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+), 1 deletion(-)
Comments
On Mon, May 15, 2023 at 12:59:04PM -0400, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
>
> Add check_hv_pending() and check_hv_pending_irq_enable() to check for
> #HV events that were queued while IRQs were disabled.
>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  arch/x86/entry/entry_64.S       | 18 ++++++++++++++++
>  arch/x86/include/asm/irqflags.h | 14 +++++++++++-
>  arch/x86/kernel/sev.c           | 38 +++++++++++++++++++++++++++++++++
>  3 files changed, 69 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 653b1f10699b..147b850babf6 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1019,6 +1019,15 @@ SYM_CODE_END(paranoid_entry)
>   * R15 - old SPEC_CTRL
>   */
>  SYM_CODE_START_LOCAL(paranoid_exit)
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * If a #HV was delivered during execution and interrupts were
> +	 * disabled, then check if it can be handled before the iret
> +	 * (which may re-enable interrupts).
> +	 */
> +	mov	%rsp, %rdi
> +	call	check_hv_pending
> +#endif
>  	UNWIND_HINT_REGS
>
>  	/*
> @@ -1143,6 +1152,15 @@ SYM_CODE_START(error_entry)
>  SYM_CODE_END(error_entry)
>
>  SYM_CODE_START_LOCAL(error_return)
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * If a #HV was delivered during execution and interrupts were
> +	 * disabled, then check if it can be handled before the iret
> +	 * (which may re-enable interrupts).
> +	 */
> +	mov	%rsp, %rdi
> +	call	check_hv_pending
> +#endif
>  	UNWIND_HINT_REGS
>  	DEBUG_ENTRY_ASSERT_IRQS_OFF
>  	testb	$3, CS(%rsp)

Oh hell no... so now you're adding unconditional calls to every single
interrupt and nmi exit path, with the grand total of 0 justification.
On 5/16/2023 5:32 PM, Peter Zijlstra wrote:
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> [...]
> Oh hell no... so now you're adding unconditional calls to every single
> interrupt and nmi exit path, with the grand total of 0 justification.

Sorry, the SEV-SNP check was added inside check_hv_pending(). I will
move the check before the call to check_hv_pending() in the next
version. Thanks.
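[Editorial note: one shape such a call-site gate could take is boot-time
alternatives patching, so that non-SNP systems never execute the call at
all. The thread does not specify the mechanism; the use of ALTERNATIVE
and the X86_FEATURE_SEV_SNP gate below are assumptions for illustration,
not code from any version of this series.]

```asm
SYM_CODE_START_LOCAL(paranoid_exit)
#ifdef CONFIG_AMD_MEM_ENCRYPT
	/*
	 * Sketch only: the alternative is patched to NOPs at boot when
	 * the feature bit is absent, so the common interrupt/NMI exit
	 * path pays nothing on bare metal and non-SNP guests. The
	 * feature bit here is chosen for illustration.
	 */
	ALTERNATIVE "", \
		    "mov %rsp, %rdi; call check_hv_pending", \
		    X86_FEATURE_SEV_SNP
#endif
	UNWIND_HINT_REGS
```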
On Wed, May 17, 2023 at 05:55:45PM +0800, Tianyu Lan wrote:
> On 5/16/2023 5:32 PM, Peter Zijlstra wrote:
> > > [...]
> > Oh hell no... so now you're adding unconditional calls to every single
> > interrupt and nmi exit path, with the grand total of 0 justification.
>
> Sorry, the SEV-SNP check was added inside check_hv_pending(). I will
> move the check before the call to check_hv_pending() in the next
> version. Thanks.

You will also explain, in the Changelog, in excruciating detail, *WHY*
any of this is required.

Any additional code in these paths that is only required for some
random hypervisor had better come with proof that it is absolutely
required, that no alternative solution exists, and that it has no
performance impact on normal users.

If this is due to Hyper-V design idiocies rather than something
fundamentally required by the hardware design you'll get a NAK.
From: Peter Zijlstra <peterz@infradead.org> Sent: Wednesday, May 17, 2023 6:10 AM
>
> On Wed, May 17, 2023 at 05:55:45PM +0800, Tianyu Lan wrote:
> > [...]
> > Sorry, the SEV-SNP check was added inside check_hv_pending(). I will
> > move the check before the call to check_hv_pending() in the next
> > version. Thanks.
>
> You will also explain, in the Changelog, in excruciating detail, *WHY*
> any of this is required.
>
> Any additional code in these paths that is only required for some
> random hypervisor had better come with proof that it is absolutely
> required, that no alternative solution exists, and that it has no
> performance impact on normal users.
>
> If this is due to Hyper-V design idiocies rather than something
> fundamentally required by the hardware design you'll get a NAK.

I'm jumping in to answer some of the basic questions here. Yesterday
there was a discussion about nested #HV exceptions, so maybe some of
this is already understood, but let me recap at a higher level, provide
some references, and suggest the path forward.

This code and some of the other patches in this series are for handling
the #HV exception that is introduced by the Restricted Interrupt
Injection feature of the SEV-SNP architecture. See Section 15.36.16 of
[1] and Section 5 of [2]. There's also an AMD presentation from LPC
last fall [3].

Hyper-V requires that the guest implement Restricted Interrupt
Injection to handle the case of a compromised hypervisor injecting an
exception (and forcing the running of that exception handler) even when
it should be disallowed by guest state. For example, the hypervisor
could inject an interrupt while the guest has interrupts disabled. In
time, presumably other hypervisors like KVM will at least have an
option where they expect SEV-SNP guests to implement Restricted
Interrupt Injection functionality, so it's not Hyper-V specific.

Naming the new exception #HV, and using "hv" as the Linux prefix for
related functions and variable names, is a bit unfortunate. It
conflicts with the existing use of the "hv" prefix to denote Hyper-V
specific code in the Linux kernel, and at first glance makes this code
look Hyper-V specific. Maybe we can choose a different prefix ("hvex"?)
for this #HV exception related code to avoid that "first glance"
confusion.

I've talked with Tianyu offline, and he will do the following:

1) Split this patch set into two patch sets. The first patch set is
Hyper-V specific code for managing communication pages that must be
shared between the guest and Hyper-V, for starting APs, etc. The
second patch set will be only the Restricted Interrupt Injection and
#HV code.

2) For the Restricted Interrupt Injection code, Tianyu will look at how
to absolutely minimize the impact in the hot code paths, particularly
when SEV-SNP is not active. Hopefully the impact can be a couple of
instructions at most, or even less with the use of other existing
kernel techniques. He'll look at the other things you've commented on
and get the code into a better state. I'll work with him on writing
commit messages and comments that explain what's going on.

Michael

[1] https://www.amd.com/system/files/TechDocs/24593.pdf
[2] https://www.amd.com/system/files/TechDocs/56421-guest-hypervisor-communication-block-standardization.pdf
[3] https://lpc.events/event/16/contributions/1321/attachments/965/1886/SNP_Interrupt_Security.pptx
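[Editorial note: for item 2, one existing kernel technique that fits the
"couple of instructions at most" target on the C-level hot paths (such
as native_irq_enable()) is a static key. A minimal sketch follows; the
key name, init hook, and wrapper are hypothetical and do not appear in
this series.]

```c
#include <linux/init.h>
#include <linux/jump_label.h>
#include <linux/cc_platform.h>
#include <asm/irqflags.h>	/* check_hv_pending_irq_enable(), per this patch */

/* Hypothetical key: defaults to false, so the branch below is patched
 * to a NOP on systems without SEV-SNP Restricted Interrupt Injection. */
DEFINE_STATIC_KEY_FALSE(snp_restricted_inj_enabled);

/* Hypothetical init hook, run once during early SNP guest setup. */
void __init snp_restricted_inj_init(void)
{
	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
		static_branch_enable(&snp_restricted_inj_enabled);
}

/* Hot-path wrapper: on non-SNP systems this costs a single NOP. */
static __always_inline void snp_check_hv_pending_irq_enable(void)
{
	if (static_branch_unlikely(&snp_restricted_inj_enabled))
		check_hv_pending_irq_enable();
}
```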
On Wed, May 31, 2023 at 02:50:50PM +0000, Michael Kelley (LINUX) wrote:
> I'm jumping in to answer some of the basic questions here. Yesterday
> there was a discussion about nested #HV exceptions, so maybe some of
> this is already understood, but let me recap at a higher level,
> provide some references, and suggest the path forward.
>
> 2) For the Restricted Interrupt Injection code, Tianyu will look at
> how to absolutely minimize the impact in the hot code paths,
> particularly when SEV-SNP is not active. Hopefully the impact can be a
> couple of instructions at most, or even less with the use of other
> existing kernel techniques. He'll look at the other things you've
> commented on and get the code into a better state. I'll work with him
> on writing commit messages and comments that explain what's going on.

From what I understand, all this SEV-SNP/#HV muck is near impossible to
get right without ucode/hw changes. Hence my request to Tom to look
into that.

The feature as specified in the AMD documentation seems fundamentally
buggered.

Specifically, #HV needs to be IST because the hypervisor can inject at
any moment, irrespective of IF or anything else, even #HV itself. This
means also in the syscall gap.

Since it is IST, a nested #HV is instant stack corruption. #HV can
attempt to play stack games as per the copied #VC crap (which I'm not
at all convinced is correct itself), but this doesn't actually fix
anything; all you need is a single instruction window to wreck things.

Because, as stated, the whole premise is that the hypervisor is out to
get you; you must not leave it room to wiggle. As is, this is security
through prayer, and we don't do that.

In short: I really want solid proof that what you propose to implement
is correct and not wishful thinking.
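[Editorial note: the "#VC crap" referenced above is the IST-stack
shifting done by __sev_es_ist_enter()/__sev_es_ist_exit() in
arch/x86/kernel/sev.c. A simplified sketch of that scheme, transposed
here to a hypothetical #HV IST slot (IST_INDEX_HV and on_hv_stack() are
illustrative names, not kernel API), shows where the window sits.]

```c
#include <asm/processor.h>	/* cpu_tss_rw */
#include <asm/ptrace.h>

/* Sketch only: models the #VC-style IST adjustment, not actual kernel
 * code. IST_INDEX_HV and on_hv_stack() are hypothetical. */
static void hv_ist_enter(struct pt_regs *regs)
{
	unsigned long old_ist, new_ist;

	old_ist = __this_cpu_read(cpu_tss_rw.x86_tss.ist[IST_INDEX_HV]);

	/*
	 * If the #HV interrupted code already running on the #HV IST
	 * stack, lower the IST entry below the live frame so the next
	 * #HV does not overwrite it.
	 */
	if (on_hv_stack(regs->sp))
		new_ist = ALIGN_DOWN(regs->sp, 8) - sizeof(old_ist);
	else
		new_ist = old_ist;

	/*
	 * The weakness Peter points at: between hardware delivery of
	 * the #HV and this write, the IST entry still holds the old
	 * value. A second #HV injected in that window pushes its frame
	 * over the first one; one instruction of exposure is enough,
	 * and the hypervisor controls injection timing.
	 */
	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_HV], new_ist);
}
```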
From: Peter Zijlstra <peterz@infradead.org> Sent: Wednesday, May 31, 2023 8:49 AM
>
> On Wed, May 31, 2023 at 02:50:50PM +0000, Michael Kelley (LINUX) wrote:
> > [...]
>
> In short: I really want solid proof that what you propose to implement
> is correct and not wishful thinking.

Fair enough. We will be sync'ing with the AMD folks to make sure that
one way or another this really will work.

Michael
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 653b1f10699b..147b850babf6 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1019,6 +1019,15 @@ SYM_CODE_END(paranoid_entry)
  * R15 - old SPEC_CTRL
  */
 SYM_CODE_START_LOCAL(paranoid_exit)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * If a #HV was delivered during execution and interrupts were
+	 * disabled, then check if it can be handled before the iret
+	 * (which may re-enable interrupts).
+	 */
+	mov	%rsp, %rdi
+	call	check_hv_pending
+#endif
 	UNWIND_HINT_REGS
 
 	/*
@@ -1143,6 +1152,15 @@ SYM_CODE_START(error_entry)
 SYM_CODE_END(error_entry)
 
 SYM_CODE_START_LOCAL(error_return)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * If a #HV was delivered during execution and interrupts were
+	 * disabled, then check if it can be handled before the iret
+	 * (which may re-enable interrupts).
+	 */
+	mov	%rsp, %rdi
+	call	check_hv_pending
+#endif
 	UNWIND_HINT_REGS
 	DEBUG_ENTRY_ASSERT_IRQS_OFF
 	testb	$3, CS(%rsp)
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 8c5ae649d2df..d09ec6d76591 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -11,6 +11,10 @@
 /*
  * Interrupt control:
  */
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+void check_hv_pending(struct pt_regs *regs);
+void check_hv_pending_irq_enable(void);
+#endif
 
 /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */
 extern inline unsigned long native_save_fl(void);
@@ -40,12 +44,20 @@ static __always_inline void native_irq_disable(void)
 static __always_inline void native_irq_enable(void)
 {
 	asm volatile("sti": : :"memory");
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	check_hv_pending_irq_enable();
+#endif
 }
 
 static __always_inline void native_safe_halt(void)
 {
 	mds_idle_clear_cpu_buffers();
-	asm volatile("sti; hlt": : :"memory");
+	asm volatile("sti": : :"memory");
+
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	check_hv_pending_irq_enable();
+#endif
+	asm volatile("hlt": : :"memory");
 }
 
 static __always_inline void native_halt(void)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index e25445de0957..ff5eab48bfe2 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -181,6 +181,44 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
 }
 
+static void do_exc_hv(struct pt_regs *regs)
+{
+	/* Handle #HV exception. */
+}
+
+void check_hv_pending(struct pt_regs *regs)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	if ((regs->flags & X86_EFLAGS_IF) == 0)
+		return;
+
+	do_exc_hv(regs);
+}
+
+void check_hv_pending_irq_enable(void)
+{
+	struct pt_regs regs;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	memset(&regs, 0, sizeof(struct pt_regs));
+	asm volatile("movl %%cs, %%eax;" : "=a" (regs.cs));
+	asm volatile("movl %%ss, %%eax;" : "=a" (regs.ss));
+	regs.orig_ax = 0xffffffff;
+	regs.flags = native_save_fl();
+
+	/*
+	 * Disable irq when handle pending #HV events after
+	 * re-enabling irq.
+	 */
+	asm volatile("cli" : : : "memory");
+	do_exc_hv(&regs);
+	asm volatile("sti" : : : "memory");
+}
+
 void noinstr __sev_es_ist_exit(void)
 {
 	unsigned long ist;
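[Editorial note: do_exc_hv() is a stub in this version of the patch. For
context, a functional handler would drain events the hypervisor queues
in the per-CPU #HV doorbell page described in the GHCB specification
[2]. A rough sketch follows; the union layout tracks the spec, but
hv_doorbell_page() and the dispatch calls are assumptions, not code
from this series.]

```c
#include <asm/idtentry.h>	/* exc_nmi(), common_interrupt() */
#include <asm/irq_vectors.h>	/* FIRST_EXTERNAL_VECTOR */

/* Pending-event word at the start of the #HV doorbell page, per the
 * GHCB spec [2]. Sketch only. */
union hv_pending_events {
	u16 events;
	struct {
		u8 vector;		/* pending interrupt vector */
		u8 nmi : 1;		/* pending NMI */
		u8 mc : 1;		/* pending machine check */
		u8 reserved : 5;
		u8 no_further_signal : 1;
	};
};

static void do_exc_hv(struct pt_regs *regs)
{
	union hv_pending_events pending;

	/* Atomically claim everything signalled so far; the hypervisor
	 * may set new bits immediately afterwards. hv_doorbell_page()
	 * is a hypothetical per-CPU accessor. */
	pending.events = xchg(&hv_doorbell_page()->pending_events.events, 0);

	if (pending.nmi)
		exc_nmi(regs);

	if (pending.vector >= FIRST_EXTERNAL_VECTOR)
		common_interrupt(regs, pending.vector);
}
```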