[v3,07/14] x86/hyperv: Change vTOM handling to use standard coco mechanisms

Message ID 1668624097-14884-8-git-send-email-mikelley@microsoft.com
State New
Headers
Series Add PCI pass-thru support to Hyper-V Confidential VMs |

Commit Message

Michael Kelley (LINUX) Nov. 16, 2022, 6:41 p.m. UTC
  Hyper-V guests on AMD SEV-SNP hardware have the option of using the
"virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
architecture. With vTOM, shared vs. private memory accesses are
controlled by splitting the guest physical address space into two
halves.  vTOM is the dividing line where the uppermost bit of the
physical address space is set; e.g., with 47 bits of guest physical
address space, vTOM is 0x40000000000 (bit 46 is set).  Guest phyiscal
memory is accessible at two parallel physical addresses -- one below
vTOM and one above vTOM.  Accesses below vTOM are private (encrypted)
while accesses above vTOM are shared (decrypted). In this sense, vTOM
is like the GPA.SHARED bit in Intel TDX.

Support for Hyper-V guests using vTOM was added to the Linux kernel in
two patch sets[1][2]. This support treats the vTOM bit as part of
the physical address. For accessing shared (decrypted) memory, these
patch sets create a second kernel virtual mapping that maps to physical
addresses above vTOM.

A better approach is to treat the vTOM bit as a protection flag, not
as part of the physical address. This new approach is like the approach
for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
virtual mapping, the existing mapping is updated using recently added
coco mechanisms.  When memory is changed between private and shared using
set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
existing kernel mapping are changed to add or remove the vTOM bit
in the guest physical address, just as with TDX. The hypercalls to
change the memory status on the host side are made using the existing
callback mechanism. Everything just works, with a minor tweak to map
the I/O APIC to use private accesses.

To accomplish the switch in approach, the following must be done in
in this single patch:

* Update Hyper-V initialization to set the cc_mask based on vTOM
  and do other coco initialization.

* Update physical_mask so the vTOM bit is no longer treated as part
  of the physical address

* Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests.
  This makes the vTOM bit part of the protection flags.

* Code already exists to make hypercalls to inform Hyper-V about pages
  changing between shared and private.  Update this code to run as a
  callback from __set_memory_enc_pgtable().

* Remove the Hyper-V special case from __set_memory_enc_dec()

* Remove the Hyper-V specific call to swiotlb_update_mem_attributes()
  since mem_encrypt_init() will now do it.

[1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
[2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
 arch/x86/coco/core.c            | 11 +++++++++-
 arch/x86/hyperv/hv_init.c       | 11 ----------
 arch/x86/hyperv/ivm.c           | 45 +++++++++++++++++++++++++++++++----------
 arch/x86/include/asm/mshyperv.h |  8 ++------
 arch/x86/kernel/cpu/mshyperv.c  | 15 +++++++-------
 arch/x86/mm/pat/set_memory.c    |  3 ---
 6 files changed, 53 insertions(+), 40 deletions(-)
  

Comments

Tianyu Lan Nov. 17, 2022, 2:59 a.m. UTC | #1
On 11/17/2022 2:41 AM, Michael Kelley wrote:
> Hyper-V guests on AMD SEV-SNP hardware have the option of using the
> "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
> architecture. With vTOM, shared vs. private memory accesses are
> controlled by splitting the guest physical address space into two
> halves.  vTOM is the dividing line where the uppermost bit of the
> physical address space is set; e.g., with 47 bits of guest physical
> address space, vTOM is 0x40000000000 (bit 46 is set).  Guest phyiscal
> memory is accessible at two parallel physical addresses -- one below
> vTOM and one above vTOM.  Accesses below vTOM are private (encrypted)
> while accesses above vTOM are shared (decrypted). In this sense, vTOM
> is like the GPA.SHARED bit in Intel TDX.
> 
> Support for Hyper-V guests using vTOM was added to the Linux kernel in
> two patch sets[1][2]. This support treats the vTOM bit as part of
> the physical address. For accessing shared (decrypted) memory, these
> patch sets create a second kernel virtual mapping that maps to physical
> addresses above vTOM.
> 
> A better approach is to treat the vTOM bit as a protection flag, not
> as part of the physical address. This new approach is like the approach
> for the GPA.SHARED bit in Intel TDX. Rather than creating a second kernel
> virtual mapping, the existing mapping is updated using recently added
> coco mechanisms.  When memory is changed between private and shared using
> set_memory_decrypted() and set_memory_encrypted(), the PTEs for the
> existing kernel mapping are changed to add or remove the vTOM bit
> in the guest physical address, just as with TDX. The hypercalls to
> change the memory status on the host side are made using the existing
> callback mechanism. Everything just works, with a minor tweak to map
> the I/O APIC to use private accesses.
> 
> To accomplish the switch in approach, the following must be done in
> in this single patch:
> 
> * Update Hyper-V initialization to set the cc_mask based on vTOM
>    and do other coco initialization.
> 
> * Update physical_mask so the vTOM bit is no longer treated as part
>    of the physical address
> 
> * Update cc_mkenc() and cc_mkdec() to be active for Hyper-V guests.
>    This makes the vTOM bit part of the protection flags.
> 
> * Code already exists to make hypercalls to inform Hyper-V about pages
>    changing between shared and private.  Update this code to run as a
>    callback from __set_memory_enc_pgtable().
> 
> * Remove the Hyper-V special case from __set_memory_enc_dec()
> 
> * Remove the Hyper-V specific call to swiotlb_update_mem_attributes()
>    since mem_encrypt_init() will now do it.
> 
> [1]https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
> [2]https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/
> 
> Signed-off-by: Michael Kelley<mikelley@microsoft.com>
> ---
>   arch/x86/coco/core.c            | 11 +++++++++-
>   arch/x86/hyperv/hv_init.c       | 11 ----------
>   arch/x86/hyperv/ivm.c           | 45 +++++++++++++++++++++++++++++++----------
>   arch/x86/include/asm/mshyperv.h |  8 ++------
>   arch/x86/kernel/cpu/mshyperv.c  | 15 +++++++-------
>   arch/x86/mm/pat/set_memory.c    |  3 ---
>   6 files changed, 53 insertions(+), 40 deletions(-)

Reviewed-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
  
Borislav Petkov Nov. 21, 2022, 3:03 p.m. UTC | #2
On Wed, Nov 16, 2022 at 10:41:30AM -0800, Michael Kelley wrote:
> Hyper-V guests on AMD SEV-SNP hardware have the option of using the
> "virtual Top Of Memory" (vTOM) feature specified by the SEV-SNP
> architecture. With vTOM, shared vs. private memory accesses are
> controlled by splitting the guest physical address space into two
> halves.  vTOM is the dividing line where the uppermost bit of the
> physical address space is set; e.g., with 47 bits of guest physical
> address space, vTOM is 0x40000000000 (bit 46 is set).  Guest phyiscal

Unknown word [phyiscal] in commit message.
Suggestions: ['physical', 'physically', ... ]

Please introduce a spellchecker into your patch creation workflow.

...

> @@ -108,6 +115,7 @@ u64 cc_mkenc(u64 val)
>  	switch (vendor) {
>  	case CC_VENDOR_AMD:
>  		return val | cc_mask;
> +	case CC_VENDOR_HYPERV:
>  	case CC_VENDOR_INTEL:
>  		return val & ~cc_mask;
>  	default:
> @@ -121,6 +129,7 @@ u64 cc_mkdec(u64 val)
>  	switch (vendor) {
>  	case CC_VENDOR_AMD:
>  		return val & ~cc_mask;
> +	case CC_VENDOR_HYPERV:
>  	case CC_VENDOR_INTEL:
>  		return val | cc_mask;
>  	default:

Uuuh, this needs a BIG FAT COMMENT.

You're running on SNP and yet the enc/dec meaning is flipped. And that's
because of vTOM.

What happens if you have other types of SNP-based VMs on HyperV? The
isolation VMs thing? Or is that the same?

What happens when you do TDX guests with HyperV?

This becomes wrong then.

I think you need a more finer-grained check here in the sense of "is it
a HyperV guest using a paravisor and vTOM is enabled" or so.

Otherwise, I like the removal of the HyperV-specific checks ofc.

Thx.
  
Michael Kelley (LINUX) Nov. 22, 2022, 6:22 p.m. UTC | #3
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 21, 2022 7:03 AM
> 
> On Wed, Nov 16, 2022 at 10:41:30AM -0800, Michael Kelley wrote:
> ...
> 
> > @@ -108,6 +115,7 @@ u64 cc_mkenc(u64 val)
> >  	switch (vendor) {
> >  	case CC_VENDOR_AMD:
> >  		return val | cc_mask;
> > +	case CC_VENDOR_HYPERV:
> >  	case CC_VENDOR_INTEL:
> >  		return val & ~cc_mask;
> >  	default:
> > @@ -121,6 +129,7 @@ u64 cc_mkdec(u64 val)
> >  	switch (vendor) {
> >  	case CC_VENDOR_AMD:
> >  		return val & ~cc_mask;
> > +	case CC_VENDOR_HYPERV:
> >  	case CC_VENDOR_INTEL:
> >  		return val | cc_mask;
> >  	default:
> 
> Uuuh, this needs a BIG FAT COMMENT.
> 
> You're running on SNP and yet the enc/dec meaning is flipped. And that's
> because of vTOM.
> 
> What happens if you have other types of SNP-based VMs on HyperV? The
> isolation VMs thing? Or is that the same?
> 
> What happens when you do TDX guests with HyperV?
> 
> This becomes wrong then.
> 
> I think you need a more finer-grained check here in the sense of "is it
> a HyperV guest using a paravisor and vTOM is enabled" or so.
> 
> Otherwise, I like the removal of the HyperV-specific checks ofc.
> 

I think the core problem here is the naming and meaning of
CC_VENDOR_HYPERV. The name was created originally when the
Hyper-V vTOM handling code was a lot of special cases.   With the
changes in this patch series that make the vTOM functionality more
mainstream, the name would be better as CC_VENDOR_AMD_VTOM.
vTOM is part of the AMD SEV-SNP spec, and it's a different way of
doing the encryption from the "C-bit" based approach.  As much as
possible, I'm trying to not make it be Hyper-V specific, though currently
we have N=1 for hypervisors that offer the vTOM option, so it's a little
hard to generalize.

With the thinking oriented that way, a Linux guest on Hyper-V using
TDX will run with CC_VENDOR_INTEL.  A Linux guest on Hyper-V that
is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD.

Dexuan Cui just posted a patch set for initial TDX support on Hyper-V,
and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if
I'm wrong about that -- I haven't reviewed your patches yet).  Tianyu Lan
has a patch set out for Hyper-V guests using the "C-bit".  That patch set
still uses CC_VENDOR_HYPERV.  Tianyu and I need to work through
whether his patch set can run with CC_VENDOR_AMD like everyone
else using the "C-bit" approach.

Yes, the polarity of the AMD vTOM bit matches the polarity of the
TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit".
I'll add a comment to that effect.

Anyway, that's where I think this should go. Does it make sense?
Other thoughts?

Michael
  
Dave Hansen Nov. 22, 2022, 6:30 p.m. UTC | #4
On 11/22/22 10:22, Michael Kelley (LINUX) wrote:
> Anyway, that's where I think this should go. Does it make sense?
> Other thoughts?

I think hard-coding the C-bit behavior and/or position to a vendor was
probably a bad idea.  Even the comment:

u64 cc_mkenc(u64 val)
{
        /*
         * Both AMD and Intel use a bit in the page table to indicate
         * encryption status of the page.
         *
         * - for AMD, bit *set* means the page is encrypted
         * - for Intel *clear* means encrypted.
         */

doesn't make a lot of sense now.  Maybe we should just have a:

	CC_ATTR_ENC_SET

which gets set for the "AMD" behavior and is clear for the "Intel"
behavior.  Hyper-V code running on AMD can set the attribute to get teh
"Intel" behavior.

That sure beats having a Hyper-V vendor.
  
Michael Kelley (LINUX) Nov. 22, 2022, 10:02 p.m. UTC | #5
From: Dave Hansen <dave.hansen@intel.com> Sent: Tuesday, November 22, 2022 10:30 AM
> 
> On 11/22/22 10:22, Michael Kelley (LINUX) wrote:
> > Anyway, that's where I think this should go. Does it make sense?
> > Other thoughts?
> 
> I think hard-coding the C-bit behavior and/or position to a vendor was
> probably a bad idea.  Even the comment:
> 
> u64 cc_mkenc(u64 val)
> {
>         /*
>          * Both AMD and Intel use a bit in the page table to indicate
>          * encryption status of the page.
>          *
>          * - for AMD, bit *set* means the page is encrypted
>          * - for Intel *clear* means encrypted.
>          */
> 
> doesn't make a lot of sense now.  Maybe we should just have a:
> 
> 	CC_ATTR_ENC_SET
> 
> which gets set for the "AMD" behavior and is clear for the "Intel"
> behavior.  Hyper-V code running on AMD can set the attribute to get teh
> "Intel" behavior.
> 
> That sure beats having a Hyper-V vendor.

To me, figuring out the encryption bit polarity and position isn't
the hard part to set up.  We've got three technologies: TDX,
AMD "C-bit", and arguably, AMD vTOM.  Future silicon and architectural
enhancements will most likely be variations and improvements on
these rather than something completely new (not that I'm necessarily
aware of what the processor vendors may have planned).   The AMD
"C-bit" technology already has sub-cases (SME, SEV, SEV-ES, SEV-SNP)
because of the architectural history.  Any of the technologies may
get additional subcases over time.  Whether those subcases are
represented as new CC_VENDOR_* options, or CC_ATTR_* options
on an existing CC_VENDOR_* should be driven by the level of
commonality with what already exists.  There's no hard-and-fast
rule.  Just do whatever makes the most sense.

I'm proposing AMD vTOM as a separate CC_VENDOR_AMD_VTOM
because it is rather different from CC_VENDOR_AMD, and not just
in the polarity and position of the encryption bit.  But if we really
wanted to just make it a sub-case of CC_VENDOR_AMD, I could
probably be convinced.  The key is cleanly handling all the other
attributes like CC_ATTR_GUEST_STATE_ENCRYPT,
CC_ATTR_ACCESS_IOAPIC_ENCRYPTED (that I add in this patch set),
CC_ATTR_GUEST_UNROLL_STRING_IO, etc. where we want to avoid
too many hacky special cases.  The current code structure where the
CC_VENDOR_* selection determines the CC_ATTR_* values seems
to work OK.

Anyway, that's my thinking.  CC_VENDOR_HYPERV goes away, but
the behavior is still essentially the same when replaced by
CC_VENDOR_AMD_VTOM.

Michael
  
Borislav Petkov Nov. 22, 2022, 10:18 p.m. UTC | #6
On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote:
> I think the core problem here is the naming and meaning of
> CC_VENDOR_HYPERV. The name was created originally when the
> Hyper-V vTOM handling code was a lot of special cases.   With the
> changes in this patch series that make the vTOM functionality more
> mainstream, the name would be better as CC_VENDOR_AMD_VTOM.

No, if CC_VENDOR_HYPERV means different things depending on what kind of
guests you're doing, then you should not use a CC_VENDOR at all.

> vTOM is part of the AMD SEV-SNP spec, and it's a different way of
> doing the encryption from the "C-bit" based approach. As much as
> possible, I'm trying to not make it be Hyper-V specific, though
> currently we have N=1 for hypervisors that offer the vTOM option, so
> it's a little hard to generalize.

Actually, it is very simple to generalize: vTOM and the paravisor and
VMPL are all part of the effort to support unenlightened, unmodified
guests with SNP.

So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it
probably would need the same contraptions.

> With the thinking oriented that way, a Linux guest on Hyper-V using
> TDX will run with CC_VENDOR_INTEL.  A Linux guest on Hyper-V that
> is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD.

Right.

> Dexuan Cui just posted a patch set for initial TDX support on Hyper-V,
> and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if
> I'm wrong about that -- I haven't reviewed your patches yet).  Tianyu Lan
> has a patch set out for Hyper-V guests using the "C-bit".  That patch set
> still uses CC_VENDOR_HYPERV.  Tianyu and I need to work through
> whether his patch set can run with CC_VENDOR_AMD like everyone
> else using the "C-bit" approach.

So I'm not sure the vendor is the right approach here. I guess we need
to specify the *type* of guest being supported.

> Yes, the polarity of the AMD vTOM bit matches the polarity of the
> TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit".
> I'll add a comment to that effect.
> 
> Anyway, that's where I think this should go. Does it make sense?
> Other thoughts?

I think all that polarity doesn't matter as long as we abstract it away
with, "mark encrypted" and "mark decrypted".

But it is late now and I could be wrong - I'll look at the rest
tomorrow-ish.

Thx.
  
Michael Kelley (LINUX) Nov. 23, 2022, 12:59 a.m. UTC | #7
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 22, 2022 2:18 PM
> 
> On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote:
> > I think the core problem here is the naming and meaning of
> > CC_VENDOR_HYPERV. The name was created originally when the
> > Hyper-V vTOM handling code was a lot of special cases.   With the
> > changes in this patch series that make the vTOM functionality more
> > mainstream, the name would be better as CC_VENDOR_AMD_VTOM.
> 
> No, if CC_VENDOR_HYPERV means different things depending on what kind of
> guests you're doing, then you should not use a CC_VENDOR at all.

Agreed.   My proposal is to drop CC_VENDOR_HYPERV entirely.
Replace it with CC_VENDOR_AMD_VTOM (or something like that) that
is set *only* by Linux guests that are running on AMD SEV-SNP processors
and using the vTOM scheme instead of the AMD C-bit scheme.

> 
> > vTOM is part of the AMD SEV-SNP spec, and it's a different way of
> > doing the encryption from the "C-bit" based approach. As much as
> > possible, I'm trying to not make it be Hyper-V specific, though
> > currently we have N=1 for hypervisors that offer the vTOM option, so
> > it's a little hard to generalize.
> 
> Actually, it is very simple to generalize: vTOM and the paravisor and
> VMPL are all part of the effort to support unenlightened, unmodified
> guests with SNP.
> 
> So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it
> probably would need the same contraptions.

Yes, agreed.  My point about generalization is that Hyper-V is the only
actual implementation today.  Edge cases, like whether the IO-APIC is
accessed as encrypted or as decrypted don't have a pattern yet.  But
that's not a blocker.  Such cases can be resolved or special-cased later
when/if N > 1.

> 
> > With the thinking oriented that way, a Linux guest on Hyper-V using
> > TDX will run with CC_VENDOR_INTEL.  A Linux guest on Hyper-V that
> > is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD.
> 
> Right.

Good. We're in agreement.  :-)

> 
> > Dexuan Cui just posted a patch set for initial TDX support on Hyper-V,
> > and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if
> > I'm wrong about that -- I haven't reviewed your patches yet).

I confirmed with Dexuan that his new patch set for TDX guests on Hyper-V
has the guest running with CC_VENDOR_INTEL, which is what we want.

> > Tianyu Lan
> > has a patch set out for Hyper-V guests using the "C-bit".  That patch set
> > still uses CC_VENDOR_HYPERV.  Tianyu and I need to work through
> > whether his patch set can run with CC_VENDOR_AMD like everyone
> > else using the "C-bit" approach.

I haven't followed up with Tianyu yet.

> 
> So I'm not sure the vendor is the right approach here. I guess we need
> to specify the *type* of guest being supported.

Yes, calling it the "vendor" turns out to not quite be right because in
the AMD case, the technology/architecture/scheme/"type" (or
whatever you want to call it) is not 1:1 with the vendor.   Intel has just
one (TDX) while AMD has two (C-bit and vTOM).   "vendor" is just a label,
but we should get the label right to avoid future confusion.  The key point
is that we'll have three top-level types:

* TDX
* AMD with C-bit  (and this has some sub-types)
* AMD with vTOM

The CC_ATTR_* values are then derived from the "type".

> 
> > Yes, the polarity of the AMD vTOM bit matches the polarity of the
> > TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit".
> > I'll add a comment to that effect.
> >
> > Anyway, that's where I think this should go. Does it make sense?
> > Other thoughts?
> 
> I think all that polarity doesn't matter as long as we abstract it away
> with, "mark encrypted" and "mark decrypted".

Agreed.

Michael
  
Michael Kelley (LINUX) Nov. 28, 2022, 2:38 p.m. UTC | #8
From: Michael Kelley (LINUX) <mikelley@microsoft.com> Sent: Tuesday, November 22, 2022 4:59 PM
> 
> From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 22, 2022 2:18 PM
> >
> > On Tue, Nov 22, 2022 at 06:22:46PM +0000, Michael Kelley (LINUX) wrote:
> > > I think the core problem here is the naming and meaning of
> > > CC_VENDOR_HYPERV. The name was created originally when the
> > > Hyper-V vTOM handling code was a lot of special cases.   With the
> > > changes in this patch series that make the vTOM functionality more
> > > mainstream, the name would be better as CC_VENDOR_AMD_VTOM.
> >
> > No, if CC_VENDOR_HYPERV means different things depending on what kind of
> > guests you're doing, then you should not use a CC_VENDOR at all.
> 
> Agreed.   My proposal is to drop CC_VENDOR_HYPERV entirely.
> Replace it with CC_VENDOR_AMD_VTOM (or something like that) that
> is set *only* by Linux guests that are running on AMD SEV-SNP processors
> and using the vTOM scheme instead of the AMD C-bit scheme.
> 
> >
> > > vTOM is part of the AMD SEV-SNP spec, and it's a different way of
> > > doing the encryption from the "C-bit" based approach. As much as
> > > possible, I'm trying to not make it be Hyper-V specific, though
> > > currently we have N=1 for hypervisors that offer the vTOM option, so
> > > it's a little hard to generalize.
> >
> > Actually, it is very simple to generalize: vTOM and the paravisor and
> > VMPL are all part of the effort to support unenlightened, unmodified
> > guests with SNP.
> >
> > So, if KVM wants to run Windows NT 4.0 guests as SNP guests, then it
> > probably would need the same contraptions.
> 
> Yes, agreed.  My point about generalization is that Hyper-V is the only
> actual implementation today.  Edge cases, like whether the IO-APIC is
> accessed as encrypted or as decrypted don't have a pattern yet.  But
> that's not a blocker.  Such cases can be resolved or special-cased later
> when/if N > 1.
> 
> >
> > > With the thinking oriented that way, a Linux guest on Hyper-V using
> > > TDX will run with CC_VENDOR_INTEL.  A Linux guest on Hyper-V that
> > > is fully enlightened to use the "C-bit" will run with CC_VENDOR_AMD.
> >
> > Right.
> 
> Good. We're in agreement.  :-)
> 
> >
> > > Dexuan Cui just posted a patch set for initial TDX support on Hyper-V,
> > > and I think that runs with CC_VENDOR_INTEL (Dexuan -- correct me if
> > > I'm wrong about that -- I haven't reviewed your patches yet).
> 
> I confirmed with Dexuan that his new patch set for TDX guests on Hyper-V
> has the guest running with CC_VENDOR_INTEL, which is what we want.
> 
> > > Tianyu Lan
> > > has a patch set out for Hyper-V guests using the "C-bit".  That patch set
> > > still uses CC_VENDOR_HYPERV.  Tianyu and I need to work through
> > > whether his patch set can run with CC_VENDOR_AMD like everyone
> > > else using the "C-bit" approach.
> 
> I haven't followed up with Tianyu yet.
> 
> >
> > So I'm not sure the vendor is the right approach here. I guess we need
> > to specify the *type* of guest being supported.
> 
> Yes, calling it the "vendor" turns out to not quite be right because in
> the AMD case, the technology/architecture/scheme/"type" (or
> whatever you want to call it) is not 1:1 with the vendor.   Intel has just
> one (TDX) while AMD has two (C-bit and vTOM).   "vendor" is just a label,
> but we should get the label right to avoid future confusion.  The key point
> is that we'll have three top-level types:
> 
> * TDX
> * AMD with C-bit  (and this has some sub-types)
> * AMD with vTOM
> 
> The CC_ATTR_* values are then derived from the "type".
> 
> >
> > > Yes, the polarity of the AMD vTOM bit matches the polarity of the
> > > TDX GPA.SHARED bit, and is the opposite polarity of the AMD "C-bit".
> > > I'll add a comment to that effect.
> > >
> > > Anyway, that's where I think this should go. Does it make sense?
> > > Other thoughts?
> >
> > I think all that polarity doesn't matter as long as we abstract it away
> > with, "mark encrypted" and "mark decrypted".
> 
> Agreed.
>

Boris --

Any further comment on this patch?  I think we're agreement.  For
this patch series I propose to change the symbol "CC_VENDOR_HYPERV"
to "CC_VENDOR_AMD_VTOM" and the function name
hyperv_cc_platform_has() to amd_vtom_cc_platform_has().  There's
no functionality change vs. the current version of this patch.  The naming
changes just make clear that the scope is only for the AMD vTOM encryption
scheme.

As discussed, the concept "vendor" isn't quite right either, but let's not
add that change to this patch series.  I can follow with a separate patch
to change symbols with "vendor" or "VENDOR" to something else, which
will probably require some discussion as to what that something else
should be.

Michael
  
Borislav Petkov Nov. 28, 2022, 4:33 p.m. UTC | #9
On Mon, Nov 28, 2022 at 02:38:11PM +0000, Michael Kelley (LINUX) wrote:
> Any further comment on this patch?  I think we're agreement.  For
> this patch series I propose to change the symbol "CC_VENDOR_HYPERV"
> to "CC_VENDOR_AMD_VTOM" and the function name
> hyperv_cc_platform_has() to amd_vtom_cc_platform_has().

That doesn't sound optimal to me.

So, let's clarify things first: those Isolation VMs - are they going to
be the paravisors?

I don't see any other option because the unmodified guest must be some
old windoze....

So, if they're going to be that, then I guess this should be called

	CC_ATTR_PARAVISOR

to denote that it is a thin layer of virt gunk between an unmodified
guest and a hypervisor.

And if TDX wants to do that too later, then they can use that flag too.

Yes, no?
  
Michael Kelley (LINUX) Nov. 28, 2022, 4:59 p.m. UTC | #10
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 8:34 AM
> 
> On Mon, Nov 28, 2022 at 02:38:11PM +0000, Michael Kelley (LINUX) wrote:
> > Any further comment on this patch?  I think we're agreement.  For
> > this patch series I propose to change the symbol "CC_VENDOR_HYPERV"
> > to "CC_VENDOR_AMD_VTOM" and the function name
> > hyperv_cc_platform_has() to amd_vtom_cc_platform_has().
> 
> That doesn't sound optimal to me.
> 
> So, let's clarify things first: those Isolation VMs - are they going to
> be the paravisors?

No.  The paravisor exists in another layer (VMPL 0 in the AMD vTOM
architecture) that isn't visible to the Linux guest.  The paravisor is
a separate code base that isn't currently open source.  All code
in this patch series is code that runs in the Linux guest.

From an encryption standpoint, the Linux guest sees the following:

1) Guest memory is encrypted by default
2) The Linux guest must set the vTOM flag in a PTE to access a page
as unencrypted.
3) The Linux guest must explicitly notify the hypervisor to change a
page from private (encrypted) to shared (decrypted), and vice versa.

> 
> I don't see any other option because the unmodified guest must be some
> old windoze....

What Windows guests do isn't really relevant.  Again, the code in this patch
series all runs directly in the Linux guest, not the paravisor.  And the Linux
guest isn't unmodified.  We've added changes to understand vTOM and
the need to communicate with the hypervisor about page changes
between private and shared.  But there are other changes for a fully
enlightened guest that we don't have to make when using AMD vTOM,
because the paravisor transparently (to the guest -- Linux or Windows)
handles those issues.

> 
> So, if they're going to be that, then I guess this should be called
> 
> 	CC_ATTR_PARAVISOR
> 
> to denote that it is a thin layer of virt gunk between an unmodified
> guest and a hypervisor.

No, the code is this patch series is not that at all.  It's not code that is
part of the paravisor.  It's Linux guest code that understands it is running
in an environment where AMD vTOM is the encryption scheme, which is
different from the AMD C-bit encryption scheme.

> 
> And if TDX wants to do that too later, then they can use that flag too.
> 

Again, no.  What I have proposed as CC_VENDOR_AMD_VTOM is
specific to AMD's virtual-Top-of-Memory architecture.  The TDX
architecture doesn't really have a way to use a paravisor.

To summarize, the code in this patch series is about a 3rd encryption
scheme that is used by the guest.  It is completely parallel to the AMD
C-bit encryption scheme and the Intel TDX encryption scheme.   With
the AMD vTOM scheme, there is a paravisor that transparently emulates
some things for the guest so there are fewer code changes needed in the
guest, but this patch series is not about that paravisor code.

Michael
  
Borislav Petkov Nov. 28, 2022, 5:24 p.m. UTC | #11
On Mon, Nov 28, 2022 at 04:59:27PM +0000, Michael Kelley (LINUX) wrote:
> 2) The Linux guest must set the vTOM flag in a PTE to access a page
> as unencrypted.

What exactly do you call the "vTOM flag in the PTE"?

I see this here:

"When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active
VM, the VIRTUAL_TOM field is used to determine the C-bit for data
accesses instead of the guest page table contents. All data accesses
below VIRTUAL_TOM are accessed with an effective C-bit of 1 and all
addresses at or above VIRTUAL_TOM are accessed with an effective C-bit
of 0."

Now you say

"vTOM is the dividing line where the uppermost bit of the physical
address space is set; e.g., with 47 bits of guest physical address
space, vTOM is 0x40000000000 (bit 46 is set)."

So on your guests, is VIRTUAL_TOM == 0x400000000000?

Btw, that 0x4000.. does not have bit 46 set but bit 42. Bit 46 set is

	0x400000000000

which means one more '0' at the end.

So before we discuss this further, let's agree on the basics first.

> What Windows guests do isn't really relevant.  Again, the code in this patch
> series all runs directly in the Linux guest, not the paravisor.  And the Linux
> guest isn't unmodified.  We've added changes to understand vTOM and
> the need to communicate with the hypervisor about page changes
> between private and shared.  But there are other changes for a fully
> enlightened guest that we don't have to make when using AMD vTOM,
> because the paravisor transparently (to the guest -- Linux or Windows)
> handles those issues.

So this is some other type of guest you wanna run.

Where is the documentation of that thing?

I'd like to know what exactly it is going to use in the kernel.

> Again, no.  What I have proposed as CC_VENDOR_AMD_VTOM is

There's no vendor AMD_VTOM!

We did the vendor thing to denote Intel or AMD wrt to confidential
computing.

Now you're coming up with something special. It can't be HYPERV because
Hyper-V does other types of confidential solutions too, apparently.

Now before this goes any further I'd like for "something special" to be
defined properly and not just be a one-off Hyper-V thing.

> specific to AMD's virtual-Top-of-Memory architecture.  The TDX
> architecture doesn't really have a way to use a paravisor.
> 
> To summarize, the code in this patch series is about a 3rd encryption
> scheme that is used by the guest.

Yes, can that third thing be used by other hypervisors or is this
Hyper-V special?

> It is completely parallel to the AMD C-bit encryption scheme and
> the Intel TDX encryption scheme. With the AMD vTOM scheme, there is
> a paravisor that transparently emulates some things for the guest
> so there are fewer code changes needed in the guest, but this patch
> series is not about that paravisor code.

Would other hypervisors want to support such a scheme?

Is this architecture documented somewhere? If so, where?

What would it mean for the kernel to support it.

And so on and so on.

Thanks.
  
Michael Kelley (LINUX) Nov. 28, 2022, 5:55 p.m. UTC | #12
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 9:25 AM
> 
> On Mon, Nov 28, 2022 at 04:59:27PM +0000, Michael Kelley (LINUX) wrote:
> > 2) The Linux guest must set the vTOM flag in a PTE to access a page
> > as unencrypted.
> 
> What exactly do you call the "vTOM flag in the PTE"?
> 
> I see this here:
> 
> "When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active
> VM, the VIRTUAL_TOM field is used to determine the C-bit for data
> accesses instead of the guest page table contents. All data accesses
> below VIRTUAL_TOM are accessed with an effective C-bit of 1 and all
> addresses at or above VIRTUAL_TOM are accessed with an effective C-bit
> of 0."
> 
> Now you say
> 
> "vTOM is the dividing line where the uppermost bit of the physical
> address space is set; e.g., with 47 bits of guest physical address
> space, vTOM is 0x40000000000 (bit 46 is set)."
> 
> So on your guests, is VIRTUAL_TOM == 0x400000000000?
> 
> Btw, that 0x4000.. does not have bit 46 set but bit 42. Bit 46 set is
> 
> 	0x400000000000
> 
> which means one more '0' at the end.

Yeah, looks like I dropped a '0' in my comment text.  Will fix.

> 
> So before we discuss this further, let's agree on the basics first.
> 
> > What Windows guests do isn't really relevant.  Again, the code in this patch
> > series all runs directly in the Linux guest, not the paravisor.  And the Linux
> > guest isn't unmodified.  We've added changes to understand vTOM and
> > the need to communicate with the hypervisor about page changes
> > between private and shared.  But there are other changes for a fully
> > enlightened guest that we don't have to make when using AMD vTOM,
> > because the paravisor transparently (to the guest -- Linux or Windows)
> > handles those issues.
> 
> So this is some other type of guest you wanna run.
> 
> Where is the documentation of that thing?
> 
> I'd like to know what exactly it is going to use in the kernel.

Standard Linux is the guest.  It's fully functional for running general
Purpose workloads that want "confidential computing" where guest
memory is encrypted and the data in the guest is not visible to the
host hypervisor.  It's a standard Linux kernel.  I'm not sure what you
mean by "some other type of guest".

> 
> > Again, no.  What I have proposed as CC_VENDOR_AMD_VTOM is
> 
> There's no vendor AMD_VTOM!
> 
> We did the vendor thing to denote Intel or AMD wrt to confidential
> computing.

But vendor AMD effectively offers two different encryption schemes that
could be seen by the guest VM.  The hypervisor chooses which scheme a
particular guest will see.  Hyper-V has chosen to present the vTOM scheme
to guest VMs, including normal Linux and Windows guests, that have been
modestly updated to understand vTOM.

> 
> Now you're coming up with something special. It can't be HYPERV because
> Hyper-V does other types of confidential solutions too, apparently.

In the future, Hyper-V may also choose to present original AMD C-bit scheme
in some guest VMs, depending on the use case.  And it will present the Intel
TDX scheme when running on that hardware.

> 
> Now before this goes any further I'd like for "something special" to be
> defined properly and not just be a one-off Hyper-V thing.
> 
> > specific to AMD's virtual-Top-of-Memory architecture.  The TDX
> > architecture doesn't really have a way to use a paravisor.
> >
> > To summarize, the code in this patch series is about a 3rd encryption
> > scheme that is used by the guest.
> 
> Yes, can that third thing be used by other hypervisors or is this
> Hyper-V special?

This third thing is part of the AMD SEV-SNP architecture and could be used
by any hypervisor.

> 
> > It is completely parallel to the AMD C-bit encryption scheme and
> > the Intel TDX encryption scheme. With the AMD vTOM scheme, there is
> > a paravisor that transparently emulates some things for the guest
> > so there are fewer code changes needed in the guest, but this patch
> > series is not about that paravisor code.
> 
> Would other hypervisors want to support such a scheme?
> 
> Is this architecture documented somewhere? If so, where?

The AMD vTOM scheme is documented in the AMD SEV-SNP
architecture specs.

> 
> What would it mean for the kernel to support it.

The Linux kernel running as a guest already supports it, at least when
running on Hyper-V.   The code went into the 5.15 kernel [1][2] about
a year ago.  But that code is more Hyper-V specific than is desirable. 
This patch set makes the AMD vTOM support more parallel to the Intel
TDX and AMD C-bit support.

To my knowledge, KVM does not support the AMD vTOM scheme.
Someone from AMD may have a better sense whether adding that
support is likely in the future.

[1] https://lore.kernel.org/all/20211025122116.264793-1-ltykernel@gmail.com/
[2] https://lore.kernel.org/all/20211213071407.314309-1-ltykernel@gmail.com/

Michael
  
Borislav Petkov Nov. 28, 2022, 7:56 p.m. UTC | #13
On Mon, Nov 28, 2022 at 05:55:11PM +0000, Michael Kelley (LINUX) wrote:
> But vendor AMD effectively offers two different encryption schemes that
> could be seen by the guest VM.  The hypervisor chooses which scheme a
> particular guest will see.  Hyper-V has chosen to present the vTOM scheme
> to guest VMs, including normal Linux and Windows guests, that have been
> modestly updated to understand vTOM.

If this is a standard SNP guest then you can detect vTOM support using
SEV_FEATURES. See this thread here:

https://lore.kernel.org/r/20221117044433.244656-1-nikunj@amd.com

Which then means, you don't need any special gunk except extending this
patch above to check SNP has vTOM support.

> In the future, Hyper-V may also choose to present original AMD C-bit scheme
> in some guest VMs, depending on the use case.  And it will present the Intel
> TDX scheme when running on that hardware.

And all those should JustWork(tm) because we already support such guests.

> To my knowledge, KVM does not support the AMD vTOM scheme.
> Someone from AMD may have a better sense whether adding that
> support is likely in the future.

Yah, see above.

Thx.
  
Michael Kelley (LINUX) Nov. 29, 2022, 1:15 a.m. UTC | #14
From: Borislav Petkov <bp@alien8.de> Sent: Monday, November 28, 2022 11:57 AM
> 
> On Mon, Nov 28, 2022 at 05:55:11PM +0000, Michael Kelley (LINUX) wrote:
> > But vendor AMD effectively offers two different encryption schemes that
> > could be seen by the guest VM.  The hypervisor chooses which scheme a
> > particular guest will see.  Hyper-V has chosen to present the vTOM scheme
> > to guest VMs, including normal Linux and Windows guests, that have been
> > modestly updated to understand vTOM.
> 
> If this is a standard SNP guest then you can detect vTOM support using
> SEV_FEATURES. See this thread here:
> 
> https://lore.kernel.org/all/20221117044433.244656-1-nikunj@amd.com/
> 
> Which then means, you don't need any special gunk except extending this
> patch above to check SNP has vTOM support.

Yes, we could do that.  But this patch set is still needed to get the preliminaries
in place.  The initial code for supporting AMD vTOM that went upstream a
year ago is too Hyper-V specific.  This patch set removes the Hyper-V specific
stuff, and integrates vTOM support into the overall confidential computing
framework in arch/x86/coco/core.c.  The Hyper-V code was already somewhat
there via CC_VENDOR_HYPERV, but that really should be named something
else now.  So that's why I'm suggesting CC_VENDOR_AMD_VTOM.
 
> 
> > In the future, Hyper-V may also choose to present original AMD C-bit scheme
> > in some guest VMs, depending on the use case.  And it will present the Intel
> > TDX scheme when running on that hardware.
> 
> And all those should JustWork(tm) because we already support such guests.

Agreed.  That's where we want to be.

Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM
and Hyper-V, you find some places where incorrect assumptions were made
or some generalizations are needed.  Dexuan Cui's patch set for TDX support
is fixing those places that he has encountered.  But with those fixes, the TDX
support will JustWork(tm) for Linux guests on Hyper-V.

I haven't gone deeply into the situation with AMD C-bit support on Hyper-V.
Tianyu Lan's set of patches for that support is a bit bigger, and I'm
planning to look closely to understand whether it's also just fixing incorrect
assumptions and such, or whether they really are some differences with
Hyper-V.  If there are differences, I want to understand why.

Michael
  
Borislav Petkov Nov. 29, 2022, 8:40 a.m. UTC | #15
On Tue, Nov 29, 2022 at 01:15:39AM +0000, Michael Kelley (LINUX) wrote:
> So that's why I'm suggesting CC_VENDOR_AMD_VTOM.

There's no CC_VENDOR_AMD_VTOM. How many times do I need to explain this?!
CC_VENDOR is well the CC vendor - not some special case.

IOW, the goal here is for generic SNP functionality to be the same for
*all* SNP guests. We won't do the AMD's version of vTOM enablement,
Hyper-V's version of vTOM enablement and so on. It must be a single
vTOM feature which works on *all* hypervisors as vTOM is a generic SNP
feature - not Hyper-V feature.

> Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM
> and Hyper-V, you find some places where incorrect assumptions were
> made or some generalizations are needed. Dexuan Cui's patch set for
> TDX support is fixing those places that he has encountered. But with
> those fixes, the TDX support will JustWork(tm) for Linux guests on
> Hyper-V.

That sounds like the right direction to take.

> I haven't gone deeply into the situation with AMD C-bit support on
> Hyper-V. Tianyu Lan's set of patches for that support is a bit bigger,
> and I'm planning to look closely to understand whether it's also just
> fixing incorrect assumptions and such, or whether they really are
> some differences with Hyper-V. If there are differences, I want to
> understand why.

You and me too. So I guess we should look at Tianyu's set first.

Thx.
  
Michael Kelley (LINUX) Nov. 29, 2022, 3:49 p.m. UTC | #16
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 29, 2022 12:41 AM
> 
> On Tue, Nov 29, 2022 at 01:15:39AM +0000, Michael Kelley (LINUX) wrote:
> > So that's why I'm suggesting CC_VENDOR_AMD_VTOM.
> 
> There's no CC_VENDOR_AMD_VTOM. How many times do I need to explain this?!
> CC_VENDOR is well the CC vendor - not some special case.

Let's back up half a step.  We have CC_VENDOR_INTEL and
CC_VENDOR_AMD because that's a convenient way to identify
two fairly different schemes for doing guest encryption.  The
schemes have some things in common, but the details are pretty
different.  Tagging the schemes based on the vendor makes sense
because the schemes were independently developed by each
processor vendor.

But it turns out that AMD really has two fairly different schemes:
the C-bit scheme and the vTOM scheme.  The details of these
two AMD schemes are pretty different.  vTOM is *not* just a minor
option on the C-bit scheme.  It's an either/or -- a guest VM is either
doing the C-bit scheme or the vTOM scheme, not some combination.
Linux code in coco/core.c could choose to treat C-bit and vTOM as
two sub-schemes under CC_VENDOR_AMD, but that makes the
code a bit messy because we end up with "if" statements to figure
out whether to do things the C-bit way or the vTOM way.  The
difference in the C-bit way and the vTOM way is not Hyper-V
specific.  Any hypervisor running on an AMD processor can make
the same choice to offer C-bit VMs or vTOM VMs.

Or we could model the two AMD schemes as two different vendors,
which is what I'm suggesting.  Doing so recognizes that the two schemes
are fairly disjoint, and it makes the code cleaner.

Tom Lendacky -- any thoughts on this question from AMD's
standpoint?   

> 
> IOW, the goal here is for generic SNP functionality to be the same for
> *all* SNP guests. We won't do the AMD's version of vTOM enablement,
> Hyper-V's version of vTOM enablement and so on. It must be a single
> vTOM feature which works on *all* hypervisors as vTOM is a generic SNP
> feature - not Hyper-V feature.

Yes, there's only one version of vTOM enablement -- the AMD version.

But the broader AMD SNP functionality is bifurcated:  there's the
C-bit scheme and the vTOM scheme.  The question is how Linux code
should model those two different schemes.  I'm suggesting that things
are cleaner conceptually and in the code if we model the two different
AMD schemes as two different vendors.

Michael

> 
> > Of course, when you go from N=1 hypervisors (i.e., KVM) to N=2 (KVM
> > and Hyper-V, you find some places where incorrect assumptions were
> > made or some generalizations are needed. Dexuan Cui's patch set for
> > TDX support is fixing those places that he has encountered. But with
> > those fixes, the TDX support will JustWork(tm) for Linux guests on
> > Hyper-V.
> 
> That sounds like the right direction to take.
> 
> > I haven't gone deeply into the situation with AMD C-bit support on
> > Hyper-V. Tianyu Lan's set of patches for that support is a bit bigger,
> > and I'm planning to look closely to understand whether it's also just
> > fixing incorrect assumptions and such, or whether they really are
> > some differences with Hyper-V. If there are differences, I want to
> > understand why.
> 
> You and me too. So I guess we should look at Tianyu's set first.
>
  
Borislav Petkov Nov. 29, 2022, 5:47 p.m. UTC | #17
On Tue, Nov 29, 2022 at 03:49:06PM +0000, Michael Kelley (LINUX) wrote:
> But it turns out that AMD really has two fairly different schemes:
> the C-bit scheme and the vTOM scheme.

Except it doesn't:

"In the VMSA of an SNP-active guest, the VIRTUAL_TOM field designates
a 2MB aligned guest physical address called the virtual top of memory.
When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active
VM, the VIRTUAL_TOM..."

So SEV_FEATURES[1] is vTOM and it is part of SNP.

Why do you keep harping on this being something else is beyond me...

I already pointed you to the patch which adds this along with the other
SEV_FEATURES.

> The details of these two AMD schemes are pretty different. vTOM is
> *not* just a minor option on the C-bit scheme. It's an either/or -- a
> guest VM is either doing the C-bit scheme or the vTOM scheme, not some
> combination. Linux code in coco/core.c could choose to treat C-bit and
> vTOM as two sub-schemes under CC_VENDOR_AMD, but that makes the code a
> bit messy because we end up with "if" statements to figure out whether
> to do things the C-bit way or the vTOM way.

Are you saying that that:

	if (cc_vendor == CC_VENDOR_AMD &&
	    sev_features & MSR_AMD64_SNP_VTOM_ENABLED)

is messy? Why?

We will have to support vTOM sooner or later.

> Or we could model the two AMD schemes as two different vendors,
> which is what I'm suggesting.  Doing so recognizes that the two schemes
> are fairly disjoint, and it makes the code cleaner.

How is that any different from the above check?

You *need* some sort of a check to differentiate between the two anyway.
  
Michael Kelley (LINUX) Nov. 30, 2022, 4:11 p.m. UTC | #18
From: Borislav Petkov <bp@alien8.de> Sent: Tuesday, November 29, 2022 9:47 AM
> 
> On Tue, Nov 29, 2022 at 03:49:06PM +0000, Michael Kelley (LINUX) wrote:
> > But it turns out that AMD really has two fairly different schemes:
> > the C-bit scheme and the vTOM scheme.
> 
> Except it doesn't:
> 
> "In the VMSA of an SNP-active guest, the VIRTUAL_TOM field designates
> a 2MB aligned guest physical address called the virtual top of memory.
> When bit 1 (vTOM) of SEV_FEATURES is set in the VMSA of an SNP-active
> VM, the VIRTUAL_TOM..."
> 
> So SEV_FEATURES[1] is vTOM and it is part of SNP.
> 
> Why do you keep harping on this being something else is beyond me...
> 
> I already pointed you to the patch which adds this along with the other
> SEV_FEATURES.
> 
> > The details of these two AMD schemes are pretty different. vTOM is
> > *not* just a minor option on the C-bit scheme. It's an either/or -- a
> > guest VM is either doing the C-bit scheme or the vTOM scheme, not some
> > combination. Linux code in coco/core.c could choose to treat C-bit and
> > vTOM as two sub-schemes under CC_VENDOR_AMD, but that makes the code a
> > bit messy because we end up with "if" statements to figure out whether
> > to do things the C-bit way or the vTOM way.
> 
> Are you saying that that:
> 
> 	if (cc_vendor == CC_VENDOR_AMD &&
> 	    sev_features & MSR_AMD64_SNP_VTOM_ENABLED)
> 
> is messy? Why?
> 
> We will have to support vTOM sooner or later.
> 
> > Or we could model the two AMD schemes as two different vendors,
> > which is what I'm suggesting.  Doing so recognizes that the two schemes
> > are fairly disjoint, and it makes the code cleaner.
> 
> How is that any different from the above check?
> 
> You *need* some sort of a check to differentiate between the two anyway.
> 

Alright.  Enough conceptual debate.  I'll do a v4 of the patch series with
the AMD C-bit and vTOM schemes folder under CC_VENDOR_AMD and
we can see if there's any further feedback.  I should have that v4 out later
today or tomorrow.

Michael
  

Patch

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index 49b44f8..f5e1f2d 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -78,7 +78,14 @@  static bool amd_cc_platform_has(enum cc_attr attr)
 
 static bool hyperv_cc_platform_has(enum cc_attr attr)
 {
-	return attr == CC_ATTR_GUEST_MEM_ENCRYPT;
+	switch (attr) {
+	case CC_ATTR_GUEST_MEM_ENCRYPT:
+	case CC_ATTR_MEM_ENCRYPT:
+	case CC_ATTR_EMULATED_IOAPIC:
+		return true;
+	default:
+		return false;
+	}
 }
 
 bool cc_platform_has(enum cc_attr attr)
@@ -108,6 +115,7 @@  u64 cc_mkenc(u64 val)
 	switch (vendor) {
 	case CC_VENDOR_AMD:
 		return val | cc_mask;
+	case CC_VENDOR_HYPERV:
 	case CC_VENDOR_INTEL:
 		return val & ~cc_mask;
 	default:
@@ -121,6 +129,7 @@  u64 cc_mkdec(u64 val)
 	switch (vendor) {
 	case CC_VENDOR_AMD:
 		return val & ~cc_mask;
+	case CC_VENDOR_HYPERV:
 	case CC_VENDOR_INTEL:
 		return val | cc_mask;
 	default:
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index f49bc3e..89a97d7 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -29,7 +29,6 @@ 
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
 #include <linux/highmem.h>
-#include <linux/swiotlb.h>
 
 int hyperv_init_cpuhp;
 u64 hv_current_partition_id = ~0ull;
@@ -504,16 +503,6 @@  void __init hyperv_init(void)
 	/* Query the VMs extended capability once, so that it can be cached. */
 	hv_query_ext_cap(0);
 
-#ifdef CONFIG_SWIOTLB
-	/*
-	 * Swiotlb bounce buffer needs to be mapped in extra address
-	 * space. Map function doesn't work in the early place and so
-	 * call swiotlb_update_mem_attributes() here.
-	 */
-	if (hv_is_isolation_supported())
-		swiotlb_update_mem_attributes();
-#endif
-
 	return;
 
 clean_guest_os_id:
diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index e8be4c2..29ccbe8 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -13,6 +13,7 @@ 
 #include <asm/svm.h>
 #include <asm/sev.h>
 #include <asm/io.h>
+#include <asm/coco.h>
 #include <asm/mshyperv.h>
 #include <asm/hypervisor.h>
 
@@ -233,7 +234,6 @@  void hv_ghcb_msr_read(u64 msr, u64 *value)
 	local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(hv_ghcb_msr_read);
-#endif
 
 /*
  * hv_mark_gpa_visibility - Set pages visible to host via hvcall.
@@ -286,27 +286,25 @@  static int hv_mark_gpa_visibility(u16 count, const u64 pfn[],
 }
 
 /*
- * hv_set_mem_host_visibility - Set specified memory visible to host.
+ * hv_vtom_set_host_visibility - Set specified memory visible to host.
  *
  * In Isolation VM, all guest memory is encrypted from host and guest
  * needs to set memory visible to host via hvcall before sharing memory
  * with host. This function works as wrap of hv_mark_gpa_visibility()
  * with memory base and size.
  */
-int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visible)
+static bool hv_vtom_set_host_visibility(unsigned long kbuffer, int pagecount, bool enc)
 {
-	enum hv_mem_host_visibility visibility = visible ?
-			VMBUS_PAGE_VISIBLE_READ_WRITE : VMBUS_PAGE_NOT_VISIBLE;
+	enum hv_mem_host_visibility visibility = enc ?
+			VMBUS_PAGE_NOT_VISIBLE : VMBUS_PAGE_VISIBLE_READ_WRITE;
 	u64 *pfn_array;
 	int ret = 0;
+	bool result = true;
 	int i, pfn;
 
-	if (!hv_is_isolation_supported() || !hv_hypercall_pg)
-		return 0;
-
 	pfn_array = kmalloc(HV_HYP_PAGE_SIZE, GFP_KERNEL);
 	if (!pfn_array)
-		return -ENOMEM;
+		return false;
 
 	for (i = 0, pfn = 0; i < pagecount; i++) {
 		pfn_array[pfn] = virt_to_hvpfn((void *)kbuffer + i * HV_HYP_PAGE_SIZE);
@@ -315,17 +313,42 @@  int hv_set_mem_host_visibility(unsigned long kbuffer, int pagecount, bool visibl
 		if (pfn == HV_MAX_MODIFY_GPA_REP_COUNT || i == pagecount - 1) {
 			ret = hv_mark_gpa_visibility(pfn, pfn_array,
 						     visibility);
-			if (ret)
+			if (ret) {
+				result = false;
 				goto err_free_pfn_array;
+			}
 			pfn = 0;
 		}
 	}
 
  err_free_pfn_array:
 	kfree(pfn_array);
-	return ret;
+	return result;
+}
+
+static bool hv_vtom_tlb_flush_required(bool private)
+{
+	return true;
 }
 
+static bool hv_vtom_cache_flush_required(void)
+{
+	return false;
+}
+
+void __init hv_vtom_init(void)
+{
+	cc_set_vendor(CC_VENDOR_HYPERV);
+	cc_set_mask(ms_hyperv.shared_gpa_boundary);
+	physical_mask &= ms_hyperv.shared_gpa_boundary - 1;
+
+	x86_platform.guest.enc_cache_flush_required = hv_vtom_cache_flush_required;
+	x86_platform.guest.enc_tlb_flush_required = hv_vtom_tlb_flush_required;
+	x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility;
+}
+
+#endif /* CONFIG_AMD_MEM_ENCRYPT */
+
 /*
  * hv_map_memory - map memory to extra space in the AMD SEV-SNP Isolation VM.
  */
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 61f0c20..59b3310 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -174,18 +174,19 @@  static inline void hv_apic_init(void) {}
 int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
 		struct hv_interrupt_entry *entry);
 int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
-int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible);
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 void hv_ghcb_msr_write(u64 msr, u64 value);
 void hv_ghcb_msr_read(u64 msr, u64 *value);
 bool hv_ghcb_negotiate_protocol(void);
 void hv_ghcb_terminate(unsigned int set, unsigned int reason);
+void hv_vtom_init(void);
 #else
 static inline void hv_ghcb_msr_write(u64 msr, u64 value) {}
 static inline void hv_ghcb_msr_read(u64 msr, u64 *value) {}
 static inline bool hv_ghcb_negotiate_protocol(void) { return false; }
 static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {}
+static inline void hv_vtom_init(void) {}
 #endif
 
 extern bool hv_isolation_type_snp(void);
@@ -241,11 +242,6 @@  static inline int hyperv_flush_guest_mapping_range(u64 as,
 }
 static inline void hv_set_register(unsigned int reg, u64 value) { }
 static inline u64 hv_get_register(unsigned int reg) { return 0; }
-static inline int hv_set_mem_host_visibility(unsigned long addr, int numpages,
-					     bool visible)
-{
-	return -1;
-}
 #endif /* CONFIG_HYPERV */
 
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 8316139..b080795 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -33,7 +33,6 @@ 
 #include <asm/nmi.h>
 #include <clocksource/hyperv_timer.h>
 #include <asm/numa.h>
-#include <asm/coco.h>
 
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
@@ -325,8 +324,10 @@  static void __init ms_hyperv_init_platform(void)
 	if (ms_hyperv.priv_high & HV_ISOLATION) {
 		ms_hyperv.isolation_config_a = cpuid_eax(HYPERV_CPUID_ISOLATION_CONFIG);
 		ms_hyperv.isolation_config_b = cpuid_ebx(HYPERV_CPUID_ISOLATION_CONFIG);
-		ms_hyperv.shared_gpa_boundary =
-			BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
+
+		if (ms_hyperv.shared_gpa_boundary_active)
+			ms_hyperv.shared_gpa_boundary =
+				BIT_ULL(ms_hyperv.shared_gpa_boundary_bits);
 
 		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
 			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
@@ -337,11 +338,6 @@  static void __init ms_hyperv_init_platform(void)
 			swiotlb_unencrypted_base = ms_hyperv.shared_gpa_boundary;
 #endif
 		}
-		/* Isolation VMs are unenlightened SEV-based VMs, thus this check: */
-		if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) {
-			if (hv_get_isolation_type() != HV_ISOLATION_TYPE_NONE)
-				cc_set_vendor(CC_VENDOR_HYPERV);
-		}
 	}
 
 	if (hv_max_functions_eax >= HYPERV_CPUID_NESTED_FEATURES) {
@@ -410,6 +406,9 @@  static void __init ms_hyperv_init_platform(void)
 	i8253_clear_counter_on_shutdown = false;
 
 #if IS_ENABLED(CONFIG_HYPERV)
+	if ((hv_get_isolation_type() == HV_ISOLATION_TYPE_VBS) ||
+	    (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP))
+		hv_vtom_init();
 	/*
 	 * Setup the hook to get control post apic initialization.
 	 */
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 06eb8910..0757cfe 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2126,9 +2126,6 @@  static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 
 static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
 {
-	if (hv_is_isolation_supported())
-		return hv_set_mem_host_visibility(addr, numpages, !enc);
-
 	if (cc_platform_has(CC_ATTR_MEM_ENCRYPT))
 		return __set_memory_enc_pgtable(addr, numpages, enc);