[3/9] x86/hyperv: Mark Hyper-V vp assist page unencrypted in SEV-SNP enlightened guest

Message ID 20230601151624.1757616-4-ltykernel@gmail.com
State New
Series x86/hyperv: Add AMD sev-snp enlightened guest support on hyperv

Commit Message

Tianyu Lan June 1, 2023, 3:16 p.m. UTC
  From: Tianyu Lan <tiala@microsoft.com>

The Hyper-V VP assist page needs to be shared between the SEV-SNP guest
and Hyper-V, so mark the page unencrypted in the SEV-SNP guest.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
 arch/x86/hyperv/hv_init.c | 6 ++++++
 1 file changed, 6 insertions(+)
  

Comments

Vitaly Kuznetsov June 5, 2023, 12:13 p.m. UTC | #1
Tianyu Lan <ltykernel@gmail.com> writes:

> From: Tianyu Lan <tiala@microsoft.com>
>
> hv vp assist page needs to be shared between SEV-SNP guest and Hyper-V.
> So mark the page unencrypted in the SEV-SNP guest.
>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index b4a2327c823b..331b855314b7 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -18,6 +18,7 @@
>  #include <asm/hyperv-tlfs.h>
>  #include <asm/mshyperv.h>
>  #include <asm/idtentry.h>
> +#include <asm/set_memory.h>
>  #include <linux/kexec.h>
>  #include <linux/version.h>
>  #include <linux/vmalloc.h>
> @@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
>  
>  	}
>  	if (!WARN_ON(!(*hvp))) {
> +		if (hv_isolation_type_en_snp()) {
> +			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
> +			memset(*hvp, 0, PAGE_SIZE);
> +		}

Why do we need to set the page as decrypted here and not when we
allocate the page (a few lines above)? And why do we need to clear it
_after_ we made it decrypted? In case we care about not leaking the
stale content to the hypervisor, we should've cleared it _before_, but
the bigger problem I see is that memset() is problematic e.g. for KVM
which uses enlightened VMCS. You put a CPU offline and then back online
and this path will be taken. Clearing VP assist page will likely break
things. (AFAIU SEV-SNP Hyper-V guests don't expose SVM yet so the
problem is likely theoretical only, but still).

> +
>  		msr.enable = 1;
>  		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
>  	}
  
Tianyu Lan June 6, 2023, 3:22 p.m. UTC | #2
On 6/5/2023 8:13 PM, Vitaly Kuznetsov wrote:
>> @@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
>>   
>>   	}
>>   	if (!WARN_ON(!(*hvp))) {
>> +		if (hv_isolation_type_en_snp()) {
>> +			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
>> +			memset(*hvp, 0, PAGE_SIZE);
>> +		}
> Why do we need to set the page as decrypted here and not when we
> allocate the page (a few lines above)?

If the Linux root partition boots in an SEV-SNP guest, the page still needs
to be decrypted.

> And why do we need to clear it
> _after_ we made it decrypted? In case we care about not leaking the
> stale content to the hypervisor, we should've cleared it _before_, but
> the bigger problem I see is that memset() is problematic e.g. for KVM
> which uses enlightened VMCS. You put a CPU offline and then back online
> and this path will be taken. Clearing VP assist page will likely break
> things. (AFAIU SEV-SNP Hyper-V guests don't expose SVM yet so the
> problem is likely theoretical only, but still).
> 

The page is left dirty by the hardware after the decryption operation, so
we memset the page after that.
  
Vitaly Kuznetsov June 6, 2023, 3:49 p.m. UTC | #3
Tianyu Lan <ltykernel@gmail.com> writes:

> On 6/5/2023 8:13 PM, Vitaly Kuznetsov wrote:
>>> @@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
>>>   
>>>   	}
>>>   	if (!WARN_ON(!(*hvp))) {
>>> +		if (hv_isolation_type_en_snp()) {
>>> +			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
>>> +			memset(*hvp, 0, PAGE_SIZE);
>>> +		}
>> Why do we need to set the page as decrypted here and not when we
>> allocate the page (a few lines above)?
>
> If Linux root partition boots in the SEV-SNP guest, the page still needs 
> to be decrypted.
>

I'd suggest we add a flag to indicate that VP assist page was actually
set (on the first invocation of hv_cpu_init() for guest partitions and
all invocations for root partition) and only call
set_memory_decrypted()/memset() then: that would both help with the
potential issue with KVM using enlightened vmcs and avoid the unneeded
hypercall.
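
For illustration only, the flag idea could be modelled in userspace C
roughly as follows. This is a sketch and not the kernel change:
struct vp_assist_model, model_set_decrypted() and model_cpu_online() are
hypothetical names, the mock only counts calls instead of changing page
encryption, and only the guest-partition case is covered (the root
partition, where the page is set on every invocation, is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_NR_CPUS   4

/* Hypothetical per-CPU state; stands in for the per-CPU VP assist page. */
struct vp_assist_model {
	unsigned char *page;        /* allocated once, kept across CPU offline */
	bool shared;                /* flag: already decrypted and zeroed */
	unsigned int decrypt_calls; /* how often the mock was invoked */
};

static struct vp_assist_model model_cpus[MODEL_NR_CPUS];

/* Mock of set_memory_decrypted(): it only counts calls (assumption). */
static int model_set_decrypted(struct vp_assist_model *vp)
{
	vp->decrypt_calls++;
	return 0;
}

/* Models the CPU online path of a guest partition with the flag. */
static int model_cpu_online(unsigned int cpu)
{
	struct vp_assist_model *vp;

	assert(cpu < MODEL_NR_CPUS);
	vp = &model_cpus[cpu];

	if (!vp->page) {
		vp->page = malloc(MODEL_PAGE_SIZE);
		if (!vp->page)
			return -1;
	}

	/* Decrypt and clear only once; later onlines reuse the page as is. */
	if (!vp->shared) {
		if (model_set_decrypted(vp))
			return -1;
		memset(vp->page, 0, MODEL_PAGE_SIZE);
		vp->shared = true;
	}

	return 0;
}
```

In this model, bringing the same CPU online twice results in a single
decrypt call, and data written to the page after the first online is
still there after the second one.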
  
Michael Kelley (LINUX) June 8, 2023, 1:25 p.m. UTC | #4
From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Tuesday, June 6, 2023 8:49 AM
> 
> Tianyu Lan <ltykernel@gmail.com> writes:
> 
> > On 6/5/2023 8:13 PM, Vitaly Kuznetsov wrote:
> >>> @@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
> >>>
> >>>   	}
> >>>   	if (!WARN_ON(!(*hvp))) {
> >>> +		if (hv_isolation_type_en_snp()) {
> >>> +			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
> >>> +			memset(*hvp, 0, PAGE_SIZE);
> >>> +		}
> >> Why do we need to set the page as decrypted here and not when we
> >> allocate the page (a few lines above)?
> >
> > If Linux root partition boots in the SEV-SNP guest, the page still needs
> > to be decrypted.

We have code in place that prevents this scenario.  We don't allow Linux
in the root partition to run in SEV-SNP mode.  See commit f8acb24aaf89.

> >
> 
> I'd suggest we add a flag to indicate that VP assist page was actually
> set (on the first invocation of hv_cpu_init() for guest partitions and
> all invocations for root partition) and only call
> set_memory_decrypted()/memset() then: that would both help with the
> potential issue with KVM using enlightened vmcs and avoid the unneeded
> hypercall.
> 

I think there's actually a more immediate problem with the code as
written.  The VP assist page for a CPU is not re-encrypted or freed when
a CPU goes offline (for reasons that have been discussed elsewhere).  So
if a CPU in an SEV-SNP VM goes offline and then comes back online, the
originally allocated and already decrypted VP assist page will be reused.
But bad things will happen if we try to decrypt the page again.

Given that we disallow the root partition running in SEV-SNP mode,
can we avoid the complexity of a flag, and just do the decryption and
zeroing when the page is allocated?

Michael
  
Vitaly Kuznetsov June 8, 2023, 1:44 p.m. UTC | #5
"Michael Kelley (LINUX)" <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Tuesday, June 6, 2023 8:49 AM
>> 
>> Tianyu Lan <ltykernel@gmail.com> writes:
>> 
>> > On 6/5/2023 8:13 PM, Vitaly Kuznetsov wrote:
>> >>> @@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
>> >>>
>> >>>   	}
>> >>>   	if (!WARN_ON(!(*hvp))) {
>> >>> +		if (hv_isolation_type_en_snp()) {
>> >>> +			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
>> >>> +			memset(*hvp, 0, PAGE_SIZE);
>> >>> +		}
>> >> Why do we need to set the page as decrypted here and not when we
>> >> allocate the page (a few lines above)?
>> >
>> > If Linux root partition boots in the SEV-SNP guest, the page still needs
>> > to be decrypted.
>
> We have code in place that prevents this scenario.  We don't allow Linux
> in the root partition to run in SEV-SNP mode.  See commit f8acb24aaf89.
>
>> >
>> 
>> I'd suggest we add a flag to indicate that VP assist page was actually
>> set (on the first invocation of hv_cpu_init() for guest partitions and
>> all invocations for root partition) and only call
>> set_memory_decrypted()/memset() then: that would both help with the
>> potential issue with KVM using enlightened vmcs and avoid the unneeded
>> hypercall.
>> 
>
> I think there's actually a more immediate problem with the code as
> written.  The VP assist page for a CPU is not re-encrypted or freed when
> a CPU goes offline (for reasons that have been discussed elsewhere).  So
> if a CPU in an SEV-SNP VM goes offline and then comes back online, the
> originally allocated and already decrypted VP assist page will be reused.
> But bad things will happen if we try to decrypt the page again.
>
> Given that we disallow the root partition running in SEV-SNP mode,
> can we avoid the complexity of a flag, and just do the decryption and
> zero'ing when the page is allocated?

Sure, makes perfect sense but let's leave a [one line] comment why we
don't do any decryption for root partition then.
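
A minimal userspace sketch of the approach agreed on above, under stated
assumptions: all sketch_* names are hypothetical, calloc() stands in for
the kernel allocation, the is_snp_guest parameter stands in for
hv_isolation_type_en_snp(), and the mock rejecting a second decryption is
only a way to model "bad things will happen". It is not the actual
follow-up patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical stand-in for one CPU's VP assist page state. */
struct sketch_vp_page {
	unsigned char *mem; /* survives CPU offline; never freed or re-encrypted */
	bool decrypted;
};

/*
 * Mock of set_memory_decrypted(). Failing on a second call is an
 * assumption used to model decrypting an already decrypted page.
 */
static int sketch_decrypt(struct sketch_vp_page *p)
{
	assert(p != NULL);
	if (p->decrypted)
		return -1;
	p->decrypted = true;
	return 0;
}

/* CPU online path: decrypt and zero only in the allocation branch. */
static int sketch_cpu_online(struct sketch_vp_page *p, bool is_snp_guest)
{
	assert(p != NULL);

	if (!p->mem) {
		p->mem = calloc(1, SKETCH_PAGE_SIZE);
		if (!p->mem)
			return -1;

		/*
		 * The root partition is not allowed to run in SEV-SNP mode,
		 * so only the guest allocation path needs decryption.
		 */
		if (is_snp_guest) {
			if (sketch_decrypt(p)) {
				free(p->mem);
				p->mem = NULL;
				return -1;
			}
			/* Contents are stale after the state change. */
			memset(p->mem, 0, SKETCH_PAGE_SIZE);
		}
	}

	/* The real code would enable the page through the MSR write here. */
	return 0;
}
```

Because the page pointer stays set across an offline/online cycle, the
second online skips the allocation branch entirely, so no flag is needed
to avoid a repeated decryption or a memset() of live contents.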
  

Patch

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index b4a2327c823b..331b855314b7 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -18,6 +18,7 @@ 
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
 #include <asm/idtentry.h>
+#include <asm/set_memory.h>
 #include <linux/kexec.h>
 #include <linux/version.h>
 #include <linux/vmalloc.h>
@@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
 
 	}
 	if (!WARN_ON(!(*hvp))) {
+		if (hv_isolation_type_en_snp()) {
+			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
+			memset(*hvp, 0, PAGE_SIZE);
+		}
+
 		msr.enable = 1;
 		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
 	}