[RFC,RFT,1/4] hv: Leak pages if set_memory_encrypted() fails

Message ID 20240222021006.2279329-2-rick.p.edgecombe@intel.com
State New
Headers
Series Handle set_memory_XXcrypted() errors in hyperv |

Commit Message

Edgecombe, Rick P Feb. 22, 2024, 2:10 a.m. UTC
  On TDX it is possible for the untrusted host to cause
set_memory_encrypted() or set_memory_decrypted() to fail such that an
error is returned and the resulting memory is shared. Callers need to take
care to handle these errors to avoid returning decrypted (shared) memory to
the page allocator, which could lead to functional or security issues.

Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.
Leak the pages if this happens.

Only compile tested.

Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: linux-hyperv@vger.kernel.org
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
 drivers/hv/connection.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)
  

Comments

Wei Liu March 1, 2024, 9:26 a.m. UTC | #1
On Wed, Feb 21, 2024 at 06:10:03PM -0800, Rick Edgecombe wrote:
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> 
> Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.

"Hyper-V" throughout.

> Leak the pages if this happens.
> 
> Only compile tested.
> 
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Dexuan Cui <decui@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  drivers/hv/connection.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 3cabeeabb1ca..e39493421bbb 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -315,6 +315,7 @@ int vmbus_connect(void)
>  
>  void vmbus_disconnect(void)
>  {
> +	int ret;
>  	/*
>  	 * First send the unload request to the host.
>  	 */
> @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
>  		vmbus_connection.int_page = NULL;
>  	}
>  
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> +	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> +	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
>  
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	if (!ret) {
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	}

This silently leaks the pages if set_memory_encrypted() fails.  I think
there should print some warning or error messages here.

Thanks,
Wei.

>  	vmbus_connection.monitor_pages[0] = NULL;
>  	vmbus_connection.monitor_pages[1] = NULL;
>  }
> -- 
> 2.34.1
>
  
Michael Kelley March 1, 2024, 7 p.m. UTC | #2
From: Rick Edgecombe <rick.p.edgecombe@intel.com> Sent: Wednesday, February 21, 2024 6:10 PM
> 

Historically, the preferred Subject prefix for changes to connection.c has
been "Drivers: hv: vmbus:", not just "hv:".  Sometimes that preference
isn't followed, but most of the time it is.

> On TDX it is possible for the untrusted host to cause

I'd argue that this is for CoCo VMs in general, not just TDX.  I don't know
all the failure modes for SEV-SNP, but the code paths you are changing
are run in both TDX and SEV-SNP CoCo VMs.

> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> 
> Hyperv could free decrypted/shared pages if set_memory_encrypted() fails.

It's not Hyper-V doing the freeing.  Maybe say "VMBus code could
free ...."

> Leak the pages if this happens.
> 
> Only compile tested.
> 
> Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> Cc: Haiyang Zhang <haiyangz@microsoft.com>
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Dexuan Cui <decui@microsoft.com>
> Cc: linux-hyperv@vger.kernel.org
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
>  drivers/hv/connection.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 3cabeeabb1ca..e39493421bbb 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -315,6 +315,7 @@ int vmbus_connect(void)
> 
>  void vmbus_disconnect(void)
>  {
> +	int ret;
>  	/*
>  	 * First send the unload request to the host.
>  	 */
> @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
>  		vmbus_connection.int_page = NULL;
>  	}
> 
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> -	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> +	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
> +	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
> 
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> -	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	if (!ret) {
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> +		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> +	}

Of course, this will leak the memory for both pages if only one of the
set_memory_encrypted() calls fails, but I'm OK with that.  It doesn't
seem worth the additional complexity to treat each page separately.

>  	vmbus_connection.monitor_pages[0] = NULL;
>  	vmbus_connection.monitor_pages[1] = NULL;
>  }
> --
> 2.34.1
  
Edgecombe, Rick P March 1, 2024, 7:12 p.m. UTC | #3
On Fri, 2024-03-01 at 09:26 +0000, Wei Liu wrote:
> > Hyperv could free decrypted/shared pages if set_memory_encrypted()
> > fails.
> 
> "Hyper-V" throughout.

Ok.

> 
> > Leak the pages if this happens.
> > 
> > Only compile tested.
> > 
> > Cc: "K. Y. Srinivasan" <kys@microsoft.com>
> > Cc: Haiyang Zhang <haiyangz@microsoft.com>
> > Cc: Wei Liu <wei.liu@kernel.org>
> > Cc: Dexuan Cui <decui@microsoft.com>
> > Cc: linux-hyperv@vger.kernel.org
> > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> > ---
> >   drivers/hv/connection.c | 11 +++++++----
> >   1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> > index 3cabeeabb1ca..e39493421bbb 100644
> > --- a/drivers/hv/connection.c
> > +++ b/drivers/hv/connection.c
> > @@ -315,6 +315,7 @@ int vmbus_connect(void)
> >   
> >   void vmbus_disconnect(void)
> >   {
> > +       int ret;
> >         /*
> >          * First send the unload request to the host.
> >          */
> > @@ -337,11 +338,13 @@ void vmbus_disconnect(void)
> >                 vmbus_connection.int_page = NULL;
> >         }
> >   
> > -       set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[0], 1);
> > -       set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[1], 1);
> > +       ret = set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[0], 1);
> > +       ret |= set_memory_encrypted((unsigned
> > long)vmbus_connection.monitor_pages[1], 1);
> >   
> > -       hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
> > -       hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
> > +       if (!ret) {
> > +               hv_free_hyperv_page(vmbus_connection.monitor_pages[
> > 0]);
> > +               hv_free_hyperv_page(vmbus_connection.monitor_pages[
> > 1]);
> > +       }
> 
> This silently leaks the pages if set_memory_encrypted() fails.  I
> think
> there should print some warning or error messages here.

Another patch will warn in CPA for the particular case:
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/mm

Do we want a warning in the caller too? It is more robust to other
types of failures in the future I guess.
  
Edgecombe, Rick P March 1, 2024, 7:13 p.m. UTC | #4
On Fri, 2024-03-01 at 19:00 +0000, Michael Kelley wrote:
> From: Rick Edgecombe <rick.p.edgecombe@intel.com> Sent: Wednesday,
> February 21, 2024 6:10 PM
> > 
> 
> Historically, the preferred Subject prefix for changes to
> connection.c has
> been "Drivers: hv: vmbus:", not just "hv:".  Sometimes that
> preference
> isn't followed, but most of the time it is.

Ok, I can update it.

> 
> > On TDX it is possible for the untrusted host to cause
> 
> I'd argue that this is for CoCo VMs in general, not just TDX.  I
> don't know
> all the failure modes for SEV-SNP, but the code paths you are
> changing
> are run in both TDX and SEV-SNP CoCo VMs.

On SEV-SNP the host can cause the call to fail too was my
understanding. But in Linux, that side panics and never gets to the
point of being able to free the shared memory. So it's not TDX
architecture specific, it's just how Linux handles it on the different
sids. For TDX the suggestion was to avoid panicing because it is
possible to handle in SW, as Linux usually tries it's best to do.

> 
> > set_memory_encrypted() or set_memory_decrypted() to fail such that
> > an
> > error is returned and the resulting memory is shared. Callers need
> > to take
> > care to handle these errors to avoid returning decrypted (shared)
> > memory to
> > the page allocator, which could lead to functional or security
> > issues.
> > 
> > Hyperv could free decrypted/shared pages if set_memory_encrypted()
> > fails.
> 
> It's not Hyper-V doing the freeing.  Maybe say "VMBus code could
> free ...."

Ok.
  
Michael Kelley March 1, 2024, 8:21 p.m. UTC | #5
From: Edgecombe, Rick P <rick.p.edgecombe@intel.com> Sent: Friday, March 1, 2024 11:13 AM
> >
> > > On TDX it is possible for the untrusted host to cause
> >
> > I'd argue that this is for CoCo VMs in general, not just TDX.  I don't know
> > all the failure modes for SEV-SNP, but the code paths you are changing
> > are run in both TDX and SEV-SNP CoCo VMs.
> 
> On SEV-SNP the host can cause the call to fail too was my
> understanding. But in Linux, that side panics and never gets to the
> point of being able to free the shared memory. So it's not TDX
> architecture specific, it's just how Linux handles it on the different
> sids. For TDX the suggestion was to avoid panicing because it is
> possible to handle in SW, as Linux usually tries it's best to do.
> 

The Hyper-V case can actually be a third path when a paravisor
is being used.  In that case, for both TDX and SEV-SNP, the
hypervisor callbacks in __set_memory_enc_pgtable() go
to Hyper-V specific functions that talk to the paravisor. Those
callbacks never panic. After a failure, either at the paravisor
level or in the paravisor talking to the hypervisor/VMM, the
decrypted/encrypted state of the memory isn't known.  So
leaking the memory is still the right thing to do, and your
patch set is good. But in the Hyper-V with paravisor case,
the leaking is applicable more broadly than just TDX.

The text in the commit message isn't something that I'll
go to the mat over.  But I wanted to offer the slightly broader
perspective. 

Michael
  
Edgecombe, Rick P March 1, 2024, 8:47 p.m. UTC | #6
On Fri, 2024-03-01 at 20:21 +0000, Michael Kelley wrote:
> 
> The Hyper-V case can actually be a third path when a paravisor
> is being used.  In that case, for both TDX and SEV-SNP, the
> hypervisor callbacks in __set_memory_enc_pgtable() go
> to Hyper-V specific functions that talk to the paravisor. Those
> callbacks never panic. After a failure, either at the paravisor
> level or in the paravisor talking to the hypervisor/VMM, the
> decrypted/encrypted state of the memory isn't known.  So
> leaking the memory is still the right thing to do, and your
> patch set is good. But in the Hyper-V with paravisor case,
> the leaking is applicable more broadly than just TDX.
> 
> The text in the commit message isn't something that I'll
> go to the mat over.  But I wanted to offer the slightly broader
> perspective.

Oh, interesting. I think I missed it because it only has a special
enc_status_change_finish(). But yea. It does sound like the text you
suggested is more accurate. I'll update it.
  

Patch

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 3cabeeabb1ca..e39493421bbb 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -315,6 +315,7 @@  int vmbus_connect(void)
 
 void vmbus_disconnect(void)
 {
+	int ret;
 	/*
 	 * First send the unload request to the host.
 	 */
@@ -337,11 +338,13 @@  void vmbus_disconnect(void)
 		vmbus_connection.int_page = NULL;
 	}
 
-	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
-	set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
+	ret = set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[0], 1);
+	ret |= set_memory_encrypted((unsigned long)vmbus_connection.monitor_pages[1], 1);
 
-	hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
-	hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
+	if (!ret) {
+		hv_free_hyperv_page(vmbus_connection.monitor_pages[0]);
+		hv_free_hyperv_page(vmbus_connection.monitor_pages[1]);
+	}
 	vmbus_connection.monitor_pages[0] = NULL;
 	vmbus_connection.monitor_pages[1] = NULL;
 }