[v2,05/12] x86/hyperv: Change vTOM handling to use standard coco mechanisms
Commit Message
Hyper-V guests on AMD SEV-SNP hardware have the option of using the
"virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
architecture. With vTOM, shared vs. private memory accesses are
controlled by splitting the guest physical address space into two
halves. vTOM is the dividing line where the uppermost bit of the
physical address space is set; e.g., with 47 bits of guest physical
address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal
memory is accessible at two parallel physical addresses -- one below
vTOM and one above vTOM. Accesses below vTOM are private (encrypted)
while accesses above vTOM are shared (decrypted). In this sense, vTOM
is like the GPA.SHARED bit in Intel TDX.
Support for Hyper-V guests using vTOM was added to the Linux kernel in
two patch sets[1][2]. This support treats the vTOM bit as part of
the physical address. For accessing shared (decrypted) memory, these
patch sets create a second kernel virtual mapping that maps to physical
addresses above vTOM.
A better approach is to treat the vTOM bit as a protection flag, not
as part of the physical address. This new approach is like the approach
for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
virtual mapping, the existing mapping is updated using recently added
coco mechanisms. When memory is changed between private and shared using
set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
existing kernel mapping are changed to add or remove the vTOM bit
in the guest physical address, just as with TDX. The hypercalls to
change the memory status on the host side are made using the existing
callback mechanism. Everything just works, with a minor tweak to map
the I/O APIC to use private accesses.
To accomplish the switch in approach, the following must be done in
in this single patch:
* Update Hyper-V initialization to set the cc _mask based on vTOM
and do other coco initialization.
* Update physical_mask so the vTOM bit is no longer treated as part
of the physical address
* Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests.
This makes the vTOM bit part of the protection flags.
* Code already exists to make hypercalls to inform Hyper-V about pages
changing between shared and private. Update this code to run as a
callback from __set_memory_enc_pgtable().
* Remove the Hyper-V special case from __set_memory_enc_dec(), and
make the normal case active for Hyper-V VMs, which have
CC_ATTR_GUEST_MEM_ENCRYPT, but not CC_ATTR_MEM_ENCRYPT.
[1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
[2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
arch/x86/coco/core.c | 10 ++++++++-
arch/x86/hyperv/ivm.c | 45 +++++++++++++++++++++++++++++++----------
arch/x86/include/asm/mshyperv.h | 8 ++------
arch/x86/kernel/cpu/mshyperv.c | 15 +++++++-------
arch/x86/mm/pat/set_memory.c | 6 ++----
5 files changed, 54 insertions(+), 30 deletions(-)
Comments
On 11/11/22 00:21, Michael Kelley wrote:
> Hyper-V guests on AMD SEV-SNP hardware have the option of using the
> "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
> architecture. With vTOM, shared vs. private memory accesses are
> controlled by splitting the guest physical address space into two
> halves. vTOM is the dividing line where the uppermost bit of the
> physical address space is set; e.g., with 47 bits of guest physical
> address space, vTOM is 0x40000000000 (bit 46 is set). Guest phyiscal
> memory is accessible at two parallel physical addresses -- one below
> vTOM and one above vTOM. Accesses below vTOM are private (encrypted)
> while accesses above vTOM are shared (decrypted). In this sense, vTOM
> is like the GPA.SHARED bit in Intel TDX.
>
> Support for Hyper-V guests using vTOM was added to the Linux kernel in
> two patch sets[1][2]. This support treats the vTOM bit as part of
> the physical address. For accessing shared (decrypted) memory, these
> patch sets create a second kernel virtual mapping that maps to physical
> addresses above vTOM.
>
> A better approach is to treat the vTOM bit as a protection flag, not
> as part of the physical address. This new approach is like the approach
> for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
> virtual mapping, the existing mapping is updated using recently added
> coco mechanisms. When memory is changed between private and shared using
> set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
> existing kernel mapping are changed to add or remove the vTOM bit
> in the guest physical address, just as with TDX. The hypercalls to
> change the memory status on the host side are made using the existing
> callback mechanism. Everything just works, with a minor tweak to map
> the I/O APIC to use private accesses.
>
> To accomplish the switch in approach, the following must be done in
> in this single patch:
>
> * Update Hyper-V initialization to set the cc _mask based on vTOM
> and do other coco initialization.
>
> * Update physical_mask so the vTOM bit is no longer treated as part
> of the physical address
>
> * Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests.
> This makes the vTOM bit part of the protection flags.
>
> * Code already exists to make hypercalls to inform Hyper-V about pages
> changing between shared and private. Update this code to run as a
> callback from __set_memory_enc_pgtable().
>
> * Remove the Hyper-V special case from __set_memory_enc_dec(), and
> make the normal case active for Hyper-V VMs, which have
> CC_ATTR_GUEST_MEM_ENCRYPT, but not CC_ATTR_MEM_ENCRYPT.
>
> [1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
> [2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/
>
> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
> arch/x86/coco/core.c | 10 ++++++++-
> arch/x86/hyperv/ivm.c | 45 +++++++++++++++++++++++++++++++----------
> arch/x86/include/asm/mshyperv.h | 8 ++------
> arch/x86/kernel/cpu/mshyperv.c | 15 +++++++-------
> arch/x86/mm/pat/set_memory.c | 6 ++----
> 5 files changed, 54 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index 06eb8910..024fbf4 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>
> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> {
> - if (hv_is_isolation_supported())
> - return hv_set_mem_host_visibility(addr, numpages, !enc);
> -
> - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
> + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) ||
> + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean
either HOST or GUEST memory encryption, but then you check for GUEST
memory encryption directly. Can your cc_platform_has() support be setup to
handle the CC_ATTR_MEM_ENCRYPT attribute in some way?
Thanks,
Tom
> return __set_memory_enc_pgtable(addr, numpages, enc);
>
> return 0;
From: Tom Lendacky <thomas.lendacky@amd.com> Sent: Friday, November 11, 2022 10:50 AM
>
> On 11/11/22 00:21, Michael Kelley wrote:
[snip]
> > diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> > index 06eb8910..024fbf4 100644
> > --- a/arch/x86/mm/pat/set_memory.c
> > +++ b/arch/x86/mm/pat/set_memory.c
> > @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long
> addr, int numpages, bool enc)
> >
> > static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> > {
> > - if (hv_is_isolation_supported())
> > - return hv_set_mem_host_visibility(addr, numpages, !enc);
> > -
> > - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
> > + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) ||
> > + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
>
> This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean
> either HOST or GUEST memory encryption, but then you check for GUEST
> memory encryption directly. Can your cc_platform_has() support be setup to
> handle the CC_ATTR_MEM_ENCRYPT attribute in some way?
>
> Thanks,
> Tom
Current upstream code for Hyper-V guests with vTOM enables only
CC_ATTR_GUEST_MEM_ENCRYPT. I had been wary of also enabling
CC_ATTR_MEM_ENCRYPT because that would enable other code paths that
Might not be right for the vTOM case. But looking at it more closely, enabling
CC_ATTR_MEM_ENCRYPT may work.
There are two problems with Hyper-V vTOM enabling CC_ATTR_MEM_ENCRYPT,
but both are fixable:
1) The call to mem_encrypt_init() happens a little bit too soon. Hyper-V is fully
initialized and hypercalls become possible after start_kernel() calls late_time_init().
mem_encrypt_init() needs to happen after the call to late_time_init() so that
marking the swiotlb memory as decrypted can make the hypercalls to sync the
page state change with the host. Moving mem_encrypt_init() a few lines later in
start_kernel() works in my case, but I can't test all the cases that you probably
have. This change also has the benefit of removing the call to
swiotlb_update_mem_attributes() at the end of hyperv_init(), which always
seemed like a hack.
2) mem_encrypt_free_decrypted_mem() is mismatched with
sme_postprocess_startup() in its handling of bss decrypted memory. The
decryption is done if sme_me_mask is non-zero, while the re-encryption is
done if CC_ATTR_MEM_ENCRYPT is true, and those conditions won't be
equivalent in a Hyper-V vTOM VM if we enable CC_ATTR_MEM_ENCRYPT
(sme_me_mask is always zero in a Hyper-V vTOM VM). Changing
mem_encrypt_free_decrypted_mem() to do re-encryption only if sme_me_mask
is non-zero solves that problem. Note that there doesn't seem to be a way for a
Hyper-V vTOM VM to have decrypted bss, since there's no way to sync the
page state change with the host that early in the boot process, but I don't think
there's a requirement for such, so all is good.
With the above two changes, Hyper-V vTOM VMs can enable
CC_ATTR_MEM_ENCRYPT. The Hyper-V hack in __set_memory_enc_dec()
still goes away, and there's no change to the condition for invoking
__set_memory_enc_pgtable().
Thoughts? Have I missed anything? Overall, I'm persuaded that this is a better
approach and can submit a v3 patch series with these changes if you agree.
Michael
On 11/13/22 10:01, Michael Kelley (LINUX) wrote:
> From: Tom Lendacky <thomas.lendacky@amd.com> Sent: Friday, November 11, 2022 10:50 AM
>>
>> On 11/11/22 00:21, Michael Kelley wrote:
>
> [snip]
>
>>> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
>>> index 06eb8910..024fbf4 100644
>>> --- a/arch/x86/mm/pat/set_memory.c
>>> +++ b/arch/x86/mm/pat/set_memory.c
>>> @@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long
>> addr, int numpages, bool enc)
>>>
>>> static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
>>> {
>>> - if (hv_is_isolation_supported())
>>> - return hv_set_mem_host_visibility(addr, numpages, !enc);
>>> -
>>> - if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
>>> + if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) ||
>>> + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
>>
>> This seems kind of strange since CC_ATTR_MEM_ENCRYPT is supposed to mean
>> either HOST or GUEST memory encryption, but then you check for GUEST
>> memory encryption directly. Can your cc_platform_has() support be setup to
>> handle the CC_ATTR_MEM_ENCRYPT attribute in some way?
>>
>> Thanks,
>> Tom
>
> Current upstream code for Hyper-V guests with vTOM enables only
> CC_ATTR_GUEST_MEM_ENCRYPT. I had been wary of also enabling
> CC_ATTR_MEM_ENCRYPT because that would enable other code paths that
> Might not be right for the vTOM case. But looking at it more closely, enabling
> CC_ATTR_MEM_ENCRYPT may work.
>
> There are two problems with Hyper-V vTOM enabling CC_ATTR_MEM_ENCRYPT,
> but both are fixable:
>
> 1) The call to mem_encrypt_init() happens a little bit too soon. Hyper-V is fully
> initialized and hypercalls become possible after start_kernel() calls late_time_init().
> mem_encrypt_init() needs to happen after the call to late_time_init() so that
> marking the swiotlb memory as decrypted can make the hypercalls to sync the
> page state change with the host. Moving mem_encrypt_init() a few lines later in
> start_kernel() works in my case, but I can't test all the cases that you probably
> have. This change also has the benefit of removing the call to
> swiotlb_update_mem_attributes() at the end of hyperv_init(), which always
> seemed like a hack.
It seems safe for SME/SEV since mem_encrypt_init() is only updating the
SWIOTLB attributes at this point. I'll do some quick testing, but you
might want to verify with TDX folks, too.
>
> 2) mem_encrypt_free_decrypted_mem() is mismatched with
> sme_postprocess_startup() in its handling of bss decrypted memory. The
> decryption is done if sme_me_mask is non-zero, while the re-encryption is
> done if CC_ATTR_MEM_ENCRYPT is true, and those conditions won't be
> equivalent in a Hyper-V vTOM VM if we enable CC_ATTR_MEM_ENCRYPT
> (sme_me_mask is always zero in a Hyper-V vTOM VM). Changing
> mem_encrypt_free_decrypted_mem() to do re-encryption only if sme_me_mask
> is non-zero solves that problem. Note that there doesn't seem to be a way for a
Hmmm, yes, this was because of an issue using the cc_platform_has() call
during identity mapped paging. I think matching them in this case would be
best, e.g., changing mem_encrypt_free_decrypted_mem() to check for a
non-zero sme_me_mask - along with a nice comment on why it is checking
sme_me_mask.
Thanks,
Tom
> Hyper-V vTOM VM to have decrypted bss, since there's no way to sync the
> page state change with the host that early in the boot process, but I don't think
> there's a requirement for such, so all is good.
>
> With the above two changes, Hyper-V vTOM VMs can enable
> CC_ATTR_MEM_ENCRYPT. The Hyper-V hack in __set_memory_enc_dec()
> still goes away, and there's no change to the condition for invoking
> __set_memory_enc_pgtable().
>
> Thoughts? Have I missed anything? Overall, I'm persuaded that this is a better
> approach and can submit a v3 patch series with these changes if you agree.
>
> Michael
@@ -78,7 +78,13 @@ static bool amd_cc_platform_has(enum cc_attr attr)
static bool hyperv_cc_platform_has(enum cc_attr attr)
{
- return attr == CC_ATTR_GUEST_MEM_ENCRYPT;
+ switch (attr) {
+ case CC_ATTR_GUEST_MEM_ENCRYPT:
+ case CC_ATTR_HAS_PARAVISOR:
+ return true;
+ default:
+ return false;
+ }
}
bool cc_platform_has(enum cc_attr attr)
@@ -108,6 +114,7 @@ u64 cc_mkenc(u64 val)
switch (vendor) {
case CC_VENDOR_AMD:
return val | cc_mask;
+ case CC_VENDOR_HYPERV:
case CC_VENDOR_INTEL:
return val & ~cc_mask;
default:
@@ -121,6 +128,7 @@ u64 cc_mkdec(u64 val)
switch (vendor) {
case CC_VENDOR_AMD:
return val & ~cc_mask;
+ case CC_VENDOR_HYPERV:
case CC_VENDOR_INTEL:
return val | cc_mask;
default:
@@ -13,6 +13,7 @@
#include <asm/svm.h>
#include <asm/sev.h>
#include <asm/io.h>
+#include <asm/coco.h>
#include <asm/mshyperv.h>
#include <asm/hypervisor.h>
@@ -233,7 +234,6 @@ void hv_ghcb_msr_read(u64 msr, u64 *value)
local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(hv_ghcb_msr_read);
-#endif
/*
* hv_mark_gpa_visibility - Set pages visible to host via hvcall.
@@ -286,27 +286,25 @@ static int hv_mark_gpa_visibility(u16 count, const u64 pfn[],
}
/*
- * hv_set_mem_host_visibility - Set specified memory visible to host.
+ * hv_vtom_set_host_visibility - Set specified memory visible to host.
*
* In Isolation VM, all guest memory is encrypted from host and guest
* needs to set memory visible to host via hvcall before sharing memory
* with host. This function works as wrap of hv_mark_gpa_visibility()
* with memory base and size.
*/
-int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visible)
+static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc)
{
- enum hv_mem_host_visibility visibility = visible ?
- VMBUS_PAGE_VISIBLE_READ_WRITE : VMBUS_PAGE_NOT_VISIBLE;
+ enum hv_mem_host_visibility visibility = enc ?
+ VMBUS_PAGE_NOT_VISIBLE : VMBUS_PAGE_VISIBLE_READ_WRITE;
u64 *pfn_array;
int ret = 0;
+ bool result = true;
int i, pfn;
- if (!hv_is_isolation_supported() || !hv_hypercall_pg)
- return 0;
-
pfn_array = kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
if (!pfn_array)
- return -ENOMEM;
+ return false;
for (i = 0, pfn = 0; i < pagecount; i++) {
pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE);
@@ -315,17 +313,42 @@ int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visibl
if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) {
ret = hv_mark_gpa_visibility(pfn, pfn_array,
visibility);
- if (ret)
+ if (ret) {
+ result = false;
goto err_free_pfn_array;
+ }
pfn = 0;
}
}
err_free_pfn_array:
kfree(pfn_array);
- return ret;
+ return result;
+}
+
+static bool hv_vtom_tlb_flush_required(bool private)
+{
+ return true;
}
+static bool hv_vtom_cache_flush_required(void)
+{
+ return false;
+}
+
+void __init hv_vtom_init(void)
+{
+ cc_set_vendor(CC_VENDOR_HYPERV);
+ cc_set_mask(ms_hyperv.shared_gpa_boundary);
+ physical_mask &= ms_hyperv.shared_gpa_boundary - 1;
+
+ x86_platform.guest.enc_cache_flush_required = hv_vtom_cache_flush_required;
+ x86_platform.guest.enc_tlb_flush_required = hv_vtom_tlb_flush_required;
+ x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility;
+}
+
+#endif /* CONFIG_AMD_MEM_ENCRYPT */
+
/*
* hv_map_memory - map memory to extra space in the AMD SEV-SNP Isolation VM.
*/
@@ -174,18 +174,19 @@ static inline void hv_apic_init(void) {}
int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
struct hv_interrupt_entry *entry);
int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
-int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible);
#ifdef CONFIG_AMD_MEM_ENCRYPT
void hv_ghcb_msr_write(u64 msr, u64 value);
void hv_ghcb_msr_read(u64 msr, u64 *value);
bool hv_ghcb_negotiate_protocol(void);
void hv_ghcb_terminate(unsigned int set, unsigned int reason);
+void hv_vtom_init(void);
#else
static inline void hv_ghcb_msr_write(u64 msr, u64 value) {}
static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
static inline bool hv_ghcb_negotiate_protocol(void) { return false; }
static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {}
+static inline void hv_vtom_init(void) {}
#endif
extern bool hv_isolation_type_snp(void);
@@ -241,11 +242,6 @@ static inline int hyperv_flush_guest_mapping_range(u64 as,
}
static inline void hv_set_register(unsigned int reg, u64 value) { }
static inline u64 hv_get_register(unsigned int reg) { return 0; }
-static inline int hv_set_mem_host_visibility(unsigned long addr, int numpages,
- bool visible)
-{
- return -1;
-}
#endif /* CONFIG_HYPERV */
@@ -33,7 +33,6 @@
#include <asm/nmi.h>
#include <clocksource/hyperv_timer.h>
#include <asm/numa.h>
-#include <asm/coco.h>
/* Is Linux running as the root partition? */
bool hv_root_partition;
@@ -325,8 +324,10 @@ static void __init ms_hyperv_init_platform(void)
if (ms_hyperv.priv_high & HV_ISOLATION) {
ms_hyperv.isolation_config_a = cpuid_eax(HYPERV_CPUID_ISOLATION_CONFIG);
ms_hyperv.isolation_config_b = cpuid_ebx(HYPERV_CPUID_ISOLATION_CONFIG);
- ms_hyperv.shared_gpa_boundary =
- BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
+
+ if (ms_hyperv.shared_gpa_boundary_active)
+ ms_hyperv.shared_gpa_boundary =
+ BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
@@ -337,11 +338,6 @@ static void __init ms_hyperv_init_platform(void)
swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary;
#endif
}
- /* Isolation VMs are unenlightened SEV-based VMs, thus this check: */
- if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
- if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
- cc_set_vendor(CC_VENDOR_HYPERV);
- }
}
if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) {
@@ -410,6 +406,9 @@ static void __init ms_hyperv_init_platform(void)
i8253_clear_counter_on_shutdown = false;
#if IS_ENABLED(CONFIG_HYPERV)
+ if ((hv_get_isolation_type() == HV_ISOLATION_TYPE_VBS) ||
+ (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP))
+ hv_vtom_init();
/*
* Setup the hook to get control post apic initialization.
*/
@@ -2126,10 +2126,8 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
{
- if (hv_is_isolation_supported())
- return hv_set_mem_host_visibility(addr, numpages, !enc);
-
- if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
+ if (cc_platform_has(CC_ATTR_MEM_ENCRYPT) ||
+ cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return __set_memory_enc_pgtable(addr, numpages, enc);
return 0;