[v3,05/14] x86/mm: Handle decryption/re-encryption of bss_decrypted consistently

Message ID 1668624097-14884-6-git-send-email-mikelley@microsoft.com
State New
Headers
Series Add PCI pass-thru support to Hyper-V Confidential VMs |

Commit Message

Michael Kelley (LINUX) Nov. 16, 2022, 6:41 p.m. UTC
  Current code in sme_postprocess_startup() decrypts the bss_decrypted
section when sme_me_mask is non-zero.  But code in
mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
conditions are not equivalent as sme_me_mask is always zero when
using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
to re-encrypt memory that was never decrypted.

Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
guests using vTOM don't need the bss_decrypted section to be
decrypted, so skipping the decryption/re-encryption doesn't cause
a problem.

Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
 arch/x86/mm/mem_encrypt_amd.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
  

Comments

Tom Lendacky Nov. 16, 2022, 8:35 p.m. UTC | #1
On 11/16/22 12:41, Michael Kelley wrote:
> Current code in sme_postprocess_startup() decrypts the bss_decrypted
> section when sme_me_mask is non-zero.  But code in
> mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
> on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
> conditions are not equivalent as sme_me_mask is always zero when
> using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
> to re-encrypt memory that was never decrypted.
> 
> Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
> re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
> guests using vTOM don't need the bss_decrypted section to be
> decrypted, so skipping the decryption/re-encryption doesn't cause
> a problem.
> 
> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> ---
>   arch/x86/mm/mem_encrypt_amd.c | 10 +++++++---
>   1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
> index 9c4d8db..5a51343 100644
> --- a/arch/x86/mm/mem_encrypt_amd.c
> +++ b/arch/x86/mm/mem_encrypt_amd.c
> @@ -513,10 +513,14 @@ void __init mem_encrypt_free_decrypted_mem(void)
>   	npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
>   
>   	/*
> -	 * The unused memory range was mapped decrypted, change the encryption
> -	 * attribute from decrypted to encrypted before freeing it.
> +	 * If the unused memory range was mapped decrypted, change the encryption
> +	 * attribute from decrypted to encrypted before freeing it. Base the
> +	 * re-encryption on the same condition used for the decryption in
> +	 * sme_postprocess_startup(). Higher level abstractions, such as
> +	 * CC_ATTR_MEM_ENCRYPT, aren't necessarily equivalent in a Hyper-V VM
> +	 * using vTOM, where sme_me_mask is always zero.
>   	 */
> -	if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
> +	if (sme_get_me_mask()) {

To be consistent within this file, you should use sme_me_mask directly.

Thanks,
Tom

>   		r = set_memory_encrypted(vaddr, npages);
>   		if (r) {
>   			pr_warn("failed to free unused decrypted pages\n");
  
Tom Lendacky Nov. 16, 2022, 9:15 p.m. UTC | #2
On 11/16/22 14:35, Tom Lendacky wrote:
> On 11/16/22 12:41, Michael Kelley wrote:
>> Current code in sme_postprocess_startup() decrypts the bss_decrypted
>> section when sme_me_mask is non-zero.  But code in
>> mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
>> on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
>> conditions are not equivalent as sme_me_mask is always zero when
>> using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
>> to re-encrypt memory that was never decrypted.
>>
>> Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
>> re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
>> guests using vTOM don't need the bss_decrypted section to be
>> decrypted, so skipping the decryption/re-encryption doesn't cause
>> a problem.
>>
>> Signed-off-by: Michael Kelley <mikelley@microsoft.com>

Meant to add this in the previous reply...

With the change to use sme_me_mask directly

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>

>> ---
>>   arch/x86/mm/mem_encrypt_amd.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
>> index 9c4d8db..5a51343 100644
>> --- a/arch/x86/mm/mem_encrypt_amd.c
>> +++ b/arch/x86/mm/mem_encrypt_amd.c
>> @@ -513,10 +513,14 @@ void __init mem_encrypt_free_decrypted_mem(void)
>>       npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
>>       /*
>> -     * The unused memory range was mapped decrypted, change the encryption
>> -     * attribute from decrypted to encrypted before freeing it.
>> +     * If the unused memory range was mapped decrypted, change the 
>> encryption
>> +     * attribute from decrypted to encrypted before freeing it. Base the
>> +     * re-encryption on the same condition used for the decryption in
>> +     * sme_postprocess_startup(). Higher level abstractions, such as
>> +     * CC_ATTR_MEM_ENCRYPT, aren't necessarily equivalent in a Hyper-V VM
>> +     * using vTOM, where sme_me_mask is always zero.
>>        */
>> -    if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
>> +    if (sme_get_me_mask()) {
> 
> To be consistent within this file, you should use sme_me_mask directly.
> 
> Thanks,
> Tom
> 
>>           r = set_memory_encrypted(vaddr, npages);
>>           if (r) {
>>               pr_warn("failed to free unused decrypted pages\n");
  
Kuppuswamy Sathyanarayanan Nov. 17, 2022, 9:47 p.m. UTC | #3
On 11/16/22 10:41 AM, Michael Kelley wrote:
> Current code in sme_postprocess_startup() decrypts the bss_decrypted
> section when sme_me_mask is non-zero.  But code in
> mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
> on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
> conditions are not equivalent as sme_me_mask is always zero when
> using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
> to re-encrypt memory that was never decrypted.
> 
> Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
> re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
> guests using vTOM don't need the bss_decrypted section to be
> decrypted, so skipping the decryption/re-encryption doesn't cause
> a problem.
> 

Do you think it needs Fixes tag?

> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> ---
>  arch/x86/mm/mem_encrypt_amd.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
> index 9c4d8db..5a51343 100644
> --- a/arch/x86/mm/mem_encrypt_amd.c
> +++ b/arch/x86/mm/mem_encrypt_amd.c
> @@ -513,10 +513,14 @@ void __init mem_encrypt_free_decrypted_mem(void)
>  	npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
>  
>  	/*
> -	 * The unused memory range was mapped decrypted, change the encryption
> -	 * attribute from decrypted to encrypted before freeing it.
> +	 * If the unused memory range was mapped decrypted, change the encryption
> +	 * attribute from decrypted to encrypted before freeing it. Base the
> +	 * re-encryption on the same condition used for the decryption in
> +	 * sme_postprocess_startup(). Higher level abstractions, such as
> +	 * CC_ATTR_MEM_ENCRYPT, aren't necessarily equivalent in a Hyper-V VM
> +	 * using vTOM, where sme_me_mask is always zero.
>  	 */
> -	if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
> +	if (sme_get_me_mask()) {
>  		r = set_memory_encrypted(vaddr, npages);
>  		if (r) {
>  			pr_warn("failed to free unused decrypted pages\n");
  
Michael Kelley (LINUX) Nov. 18, 2022, 2:55 a.m. UTC | #4
From: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> On 11/16/22 10:41 AM, Michael Kelley wrote:
> > Current code in sme_postprocess_startup() decrypts the bss_decrypted
> > section when sme_me_mask is non-zero.  But code in
> > mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
> > on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
> > conditions are not equivalent as sme_me_mask is always zero when
> > using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
> > to re-encrypt memory that was never decrypted.
> >
> > Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
> > re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
> > guests using vTOM don't need the bss_decrypted section to be
> > decrypted, so skipping the decryption/re-encryption doesn't cause
> > a problem.
> >
> 
> Do you think it needs Fixes tag?
> 

At least for my purposes, it doesn't.  The original assumption that non-zero
sme_me_mask and CC_ATTR_MEM_ENCRYPT are equivalent was valid until
this patch series where Hyper-V guests are reporting CC_ATTR_MEM_ENCRYPT
as "true" but sme_me_mask is zero.  This patch series won't be backported,
so the old assumption remains valid for older kernels.  There's no benefit in
backporting the change.

But I had not thought about TDX.  In the TDX case, it appears that
sme_postprocess_startup() will not decrypt the bss_decrypted section.
The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
kernel image that supports both TDX and AMD encryption, it could break
at runtime on a TDX system.  I would also note that on a TDX system
without CONFIG_AMD_MEM_ENCRYPT, the unused memory in the
bss_decrypted section never gets freed.

But check my logic. :-)   I'm not averse to adding the Fixes: tag if there's a
scenario for TDX where doing the backport will solve a real problem.

And thanks for reviewing the code!

Michael
  
Michael Kelley (LINUX) Nov. 18, 2022, 2:59 a.m. UTC | #5
From: Tom Lendacky <thomas.lendacky@amd.com> Sent: Wednesday, November 16, 2022 1:16 PM
> 
> On 11/16/22 14:35, Tom Lendacky wrote:
> > On 11/16/22 12:41, Michael Kelley wrote:
> >> Current code in sme_postprocess_startup() decrypts the bss_decrypted
> >> section when sme_me_mask is non-zero.  But code in
> >> mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
> >> on CC_ATTR_MEM_ENCRYPT.  In a Hyper-V guest VM using vTOM, these
> >> conditions are not equivalent as sme_me_mask is always zero when
> >> using vTOM.  Consequently, mem_encrypt_free_decrypted_mem() attempts
> >> to re-encrypt memory that was never decrypted.
> >>
> >> Fix this in mem_encrypt_free_decrypted_mem() by conditioning the
> >> re-encryption on the same test for non-zero sme_me_mask.  Hyper-V
> >> guests using vTOM don't need the bss_decrypted section to be
> >> decrypted, so skipping the decryption/re-encryption doesn't cause
> >> a problem.
> >>
> >> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> 
> Meant to add this in the previous reply...
> 
> With the change to use sme_me_mask directly
> 
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> 

Thanks for the reviews.  And I see your point about sme_me_mask.  I had
not previously noticed that it is defined in the module, so no need to use
a getter function.

Michael
  
Borislav Petkov Nov. 21, 2022, 2:39 p.m. UTC | #6
On Fri, Nov 18, 2022 at 02:55:32AM +0000, Michael Kelley (LINUX) wrote:
> But I had not thought about TDX.  In the TDX case, it appears that
> sme_postprocess_startup() will not decrypt the bss_decrypted section.
> The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
> CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
> kernel image that supports both TDX and AMD encryption, it could break

sme_me_mask better be 0 on a kernel with both built in and running as a
TDX guest.
  
Borislav Petkov Nov. 21, 2022, 2:40 p.m. UTC | #7
On Wed, Nov 16, 2022 at 10:41:28AM -0800, Michael Kelley wrote:
> Current code in sme_postprocess_startup() decrypts the bss_decrypted
> section when sme_me_mask is non-zero.  But code in
> mem_encrypt_free_decrytped_mem() re-encrypts the unused portion based
			^^

letters flipped.

> @@ -513,10 +513,14 @@ void __init mem_encrypt_free_decrypted_mem(void)
>  	npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
>  
>  	/*
> -	 * The unused memory range was mapped decrypted, change the encryption
> -	 * attribute from decrypted to encrypted before freeing it.
> +	 * If the unused memory range was mapped decrypted, change the encryption
> +	 * attribute from decrypted to encrypted before freeing it. Base the
> +	 * re-encryption on the same condition used for the decryption in
> +	 * sme_postprocess_startup(). Higher level abstractions, such as
> +	 * CC_ATTR_MEM_ENCRYPT, aren't necessarily equivalent in a Hyper-V VM
> +	 * using vTOM, where sme_me_mask is always zero.

Good, an example why one needs to pay attention here.

Thx.
  
Kuppuswamy Sathyanarayanan Nov. 21, 2022, 10:06 p.m. UTC | #8
On 11/21/22 6:39 AM, Borislav Petkov wrote:
> On Fri, Nov 18, 2022 at 02:55:32AM +0000, Michael Kelley (LINUX) wrote:
>> But I had not thought about TDX.  In the TDX case, it appears that
>> sme_postprocess_startup() will not decrypt the bss_decrypted section.
>> The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
>> CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
>> kernel image that supports both TDX and AMD encryption, it could break
> 
> sme_me_mask better be 0 on a kernel with both built in and running as a
> TDX guest.
> 

Yes. It will be 0 in TDX. In sme_enable(), AMD code checks for CPUID
support before updating the sme_me_mask.
  
Michael Kelley (LINUX) Nov. 22, 2022, 5:59 p.m. UTC | #9
From: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
> 
> On 11/21/22 6:39 AM, Borislav Petkov wrote:
> > On Fri, Nov 18, 2022 at 02:55:32AM +0000, Michael Kelley (LINUX) wrote:
> >> But I had not thought about TDX.  In the TDX case, it appears that
> >> sme_postprocess_startup() will not decrypt the bss_decrypted section.
> >> The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
> >> CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
> >> kernel image that supports both TDX and AMD encryption, it could break
> >
> > sme_me_mask better be 0 on a kernel with both built in and running as a
> > TDX guest.
> >
> 
> Yes. It will be 0 in TDX. In sme_enable(), AMD code checks for CPUID
> support before updating the sme_me_mask.
> 

Right.  But here's my point:  With current code and an image built with
CONFIG_AMD_MEM_ENCRYPT=y and running as a TDX guest,
sme_postprocess_startup() will not decrypt the bss_decrypted section.
Then later mem_encrypt_free_decrypted_mem() will run, see that
CC_ATTR_MEM_ENCRYPT is true, and try to re-encrypt the memory.
In other words, a TDX guest would break in the same way as a Hyper-V
vTOM guest would break.  This patch fixes the problem for both cases.

The only things I see in the bss_decrypted section are two clock structures
In arch/x86/kernel/kvmclock.c, which aren't needed when Hyper-V is the
hypervisor.  But with a TDX guest on KVM, will *not* decrypting the
bss_decrypted section be a problem?  I don't know that kvmclock
code or why the two clock structures need to be decrypted for
AMD mem encryption.

Michael
  
Dexuan Cui Nov. 28, 2022, 2:52 a.m. UTC | #10
> From: Michael Kelley (LINUX) <mikelley@microsoft.com>
> Sent: Thursday, November 17, 2022 6:56 PM
> [...]
> 
> But I had not thought about TDX.  In the TDX case, it appears that
> sme_postprocess_startup() will not decrypt the bss_decrypted section.
> The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
> CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
> kernel image that supports both TDX and AMD encryption, it could break
> at runtime on a TDX system.  I would also note that on a TDX system
> without CONFIG_AMD_MEM_ENCRYPT, the unused memory in the
> bss_decrypted section never gets freed.

On a TDX system *with* CONFIG_AMD_MEM_ENCRYPT, the unused 
memory in the bss_decrypted section also never gets freed due to the
below "return;"

I'd suggest a Fixes tag should be added to make sure the distro vendors
notice the patch and backport it :-)

BTW, I just posted a similar patch as I didn't notice this patch. I have
replied to my patch email, asking people to ignore my patch.

Fixes: b3f0907c71e0 ("x86/mm: Add .bss..decrypted section to hold shared variables")

void __init mem_encrypt_free_decrypted_mem(void)
{
        unsigned long vaddr, vaddr_end, npages;
        int r;

        vaddr = (unsigned long)__start_bss_decrypted_unused;
        vaddr_end = (unsigned long)__end_bss_decrypted;
        npages = (vaddr_end - vaddr) >> PAGE_SHIFT;

        /*
         * The unused memory range was mapped decrypted, change the encryption
         * attribute from decrypted to encrypted before freeing it.
         */
        if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
                r = set_memory_encrypted(vaddr, npages);
                if (r) {
                        pr_warn("failed to free unused decrypted pages\n");
                        return;
                }
        }

        free_init_pages("unused decrypted", vaddr, vaddr_end);
}
  
Borislav Petkov Nov. 28, 2022, 10:50 a.m. UTC | #11
On Tue, Nov 22, 2022 at 05:59:04PM +0000, Michael Kelley (LINUX) wrote:
> Right.  But here's my point:  With current code and an image built with
> CONFIG_AMD_MEM_ENCRYPT=y and running as a TDX guest,
> sme_postprocess_startup() will not decrypt the bss_decrypted section.
> Then later mem_encrypt_free_decrypted_mem() will run, see that
> CC_ATTR_MEM_ENCRYPT is true, and try to re-encrypt the memory.
> In other words, a TDX guest would break in the same way as a Hyper-V
> vTOM guest would break.  This patch fixes the problem for both cases.

I guess making the check more concrete by checking sme_me_mask directly
along with a comment makes sense.

We need to be very careful here not to fragment the code too much for
all the different guest types.
  
Tom Lendacky Nov. 28, 2022, 2:15 p.m. UTC | #12
On 11/27/22 20:52, Dexuan Cui wrote:
>> From: Michael Kelley (LINUX) <mikelley@microsoft.com>
>> Sent: Thursday, November 17, 2022 6:56 PM
>> [...]
>>
>> But I had not thought about TDX.  In the TDX case, it appears that
>> sme_postprocess_startup() will not decrypt the bss_decrypted section.
>> The corresponding mem_encrypt_free_decrypted_mem() is a no-op unless
>> CONFIG_AMD_MEM_ENCRYPT is set.  But maybe if someone builds a
>> kernel image that supports both TDX and AMD encryption, it could break
>> at runtime on a TDX system.  I would also note that on a TDX system
>> without CONFIG_AMD_MEM_ENCRYPT, the unused memory in the
>> bss_decrypted section never gets freed.
> 
> On a TDX system *with* CONFIG_AMD_MEM_ENCRYPT, the unused
> memory in the bss_decrypted section also never gets freed due to the
> below "return;"
> 
> I'd suggest a Fixes tag should be added to make sure the distro vendors
> notice the patch and backport it :-)
> 
> BTW, I just posted a similar patch as I didn't notice this patch. I have
> replied to my patch email, asking people to ignore my patch.
> 
> Fixes: b3f0907c71e0 ("x86/mm: Add .bss..decrypted section to hold shared variables")

I think the Fixes: tag should really be:

e9d1d2bb75b2 ("treewide: Replace the use of mem_encrypt_active() with cc_platform_has()")

since mem_encrypt_active() used to return sme_me_mask, so the checks were
balanced at that point.

Thanks,
Tom

> 
> void __init mem_encrypt_free_decrypted_mem(void)
> {
>          unsigned long vaddr, vaddr_end, npages;
>          int r;
> 
>          vaddr = (unsigned long)__start_bss_decrypted_unused;
>          vaddr_end = (unsigned long)__end_bss_decrypted;
>          npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
> 
>          /*
>           * The unused memory range was mapped decrypted, change the encryption
>           * attribute from decrypted to encrypted before freeing it.
>           */
>          if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
>                  r = set_memory_encrypted(vaddr, npages);
>                  if (r) {
>                          pr_warn("failed to free unused decrypted pages\n");
>                          return;
>                  }
>          }
> 
>          free_init_pages("unused decrypted", vaddr, vaddr_end);
> }
>
  

Patch

diff --git a/arch/x86/mm/mem_encrypt_amd.c b/arch/x86/mm/mem_encrypt_amd.c
index 9c4d8db..5a51343 100644
--- a/arch/x86/mm/mem_encrypt_amd.c
+++ b/arch/x86/mm/mem_encrypt_amd.c
@@ -513,10 +513,14 @@  void __init mem_encrypt_free_decrypted_mem(void)
 	npages = (vaddr_end - vaddr) >> PAGE_SHIFT;
 
 	/*
-	 * The unused memory range was mapped decrypted, change the encryption
-	 * attribute from decrypted to encrypted before freeing it.
+	 * If the unused memory range was mapped decrypted, change the encryption
+	 * attribute from decrypted to encrypted before freeing it. Base the
+	 * re-encryption on the same condition used for the decryption in
+	 * sme_postprocess_startup(). Higher level abstractions, such as
+	 * CC_ATTR_MEM_ENCRYPT, aren't necessarily equivalent in a Hyper-V VM
+	 * using vTOM, where sme_me_mask is always zero.
 	 */
-	if (cc_platform_has(CC_ATTR_MEM_ENCRYPT)) {
+	if (sme_get_me_mask()) {
 		r = set_memory_encrypted(vaddr, npages);
 		if (r) {
 			pr_warn("failed to free unused decrypted pages\n");