diff mbox series

[v2] x86/mm/cpa: Warn if set_memory_XXcrypted() fails

Message ID	20231027214744.1742056-1-rick.p.edgecombe@intel.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; From: Rick Edgecombe <rick.p.edgecombe@intel.com> To: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, luto@kernel.org, peterz@infradead.org, kirill.shutemov@linux.intel.com, elena.reshetova@intel.com, isaku.yamahata@intel.com, seanjc@google.com, Michael Kelley <mikelley@microsoft.com>, thomas.lendacky@amd.com, decui@microsoft.com, sathyanarayanan.kuppuswamy@linux.intel.com, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [PATCH v2] x86/mm/cpa: Warn if set_memory_XXcrypted() fails Date: Fri, 27 Oct 2023 14:47:44 -0700 Message-Id: <20231027214744.1742056-1-rick.p.edgecombe@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[v2] x86/mm/cpa: Warn if set_memory_XXcrypted() fails

Commit Message

Edgecombe, Rick P Oct. 27, 2023, 9:47 p.m. UTC

  On TDX it is possible for the untrusted host to cause
set_memory_encrypted() or set_memory_decrypted() to fail such that an
error is returned and the resulting memory is shared. Callers need to take
care to handle these errors to avoid returning decrypted (shared) memory to
the page allocator, which could lead to functional or security issues.
In terms of security, the problematic case is guest PTEs mapping the
shared alias GFNs, since the VMM has control of the shared mapping in the
EPT/NPT.
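
A caller-side sketch of the handling described above, written as a userspace model rather than kernel code. The names fake_set_memory_encrypted(), fake_free_pages(), release_shared_buffer(), struct shared_buf and the fake_vmm_refuses knob are all hypothetical stand-ins and are not part of this patch. The only point illustrated is that a buffer whose conversion back to private failed is leaked rather than returned to the allocator.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer that was converted to shared. */
struct shared_buf {
	void *addr;
	int numpages;
	bool freed;   /* set when the pages go back to the (fake) allocator */
	bool leaked;  /* set when the pages are intentionally abandoned */
};

/* Test knob: when true, the fake conversion fails as a hostile VMM could make it. */
static bool fake_vmm_refuses;

/* Stand-in for set_memory_encrypted(): 0 on success, negative errno on failure. */
static int fake_set_memory_encrypted(void *addr, int numpages)
{
	(void)addr;
	(void)numpages;
	return fake_vmm_refuses ? -EIO : 0;
}

/* Stand-in for returning pages to the page allocator. */
static void fake_free_pages(struct shared_buf *buf)
{
	buf->freed = true;
}

/*
 * Convert the buffer back to private before freeing it. If the
 * conversion fails, the pages may still be shared, so leak them
 * instead of handing them back to the allocator.
 */
static int release_shared_buffer(struct shared_buf *buf)
{
	int ret = fake_set_memory_encrypted(buf->addr, buf->numpages);

	if (ret) {
		buf->leaked = true;
		return ret;
	}

	fake_free_pages(buf);
	return 0;
}
```

Leaking costs memory, but it keeps possibly-shared pages from being reused for private data later.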

Such conversion errors may herald future system instability, but are
temporarily survivable with proper handling in the caller. The kernel
traditionally makes every effort to keep running, but it is expected that
some coco guests may prefer to play it safe security-wise, and panic in
this case. To accommodate both cases, warn when the arch breakouts for
converting memory at the VMM layer return an error to CPA. Security focused
users can rely on panic_on_warn to defend against bugs in the callers. Some
VMMs are not known to behave in the troublesome way, so users that would
like to terminate on any unusual behavior by the VMM around this will be
covered as well.
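
To make the "accommodate both cases" point concrete, here is a small userspace model of how a once-only warning interacts with panic_on_warn. It is a sketch under assumptions: struct warn_model, enum warn_action and model_warn_once() are hypothetical names invented for illustration and do not reproduce the kernel's WARN_ONCE() implementation.

```c
#include <stdbool.h>

/* What the model decided to do for one failing conversion. */
enum warn_action { WARN_ACT_NONE, WARN_ACT_WARN, WARN_ACT_PANIC };

/* Hypothetical model state: a "warned already" latch plus the panic_on_warn knob. */
struct warn_model {
	bool already_warned;
	bool panic_on_warn;
};

/*
 * The first failure either warns (default) or escalates to a panic
 * (panic_on_warn set). Later failures stay silent in this model
 * because the once-only latch has already fired.
 */
static enum warn_action model_warn_once(struct warn_model *m)
{
	if (m->already_warned)
		return WARN_ACT_NONE;

	m->already_warned = true;
	return m->panic_on_warn ? WARN_ACT_PANIC : WARN_ACT_WARN;
}
```

With the knob clear the guest keeps running after the first warning, while a security-focused guest that sets it terminates on the first VMM conversion failure.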

Since the arch breakouts host the logic for handling coco implementation
specific errors, an error returned from them means that the set_memory()
call is out of options for handling the error internally. Make this the
condition to warn about.

It is possible that very rarely these functions could fail due to guest
memory pressure (in the case of failing to allocate a huge page when
splitting a page table). Don't warn in this case because it is a lot less
likely to indicate an attack by the host and it is not clear which
set_memory() calls should get the same treatment. That corner should be
addressed by future work that considers the more general problem and not
just papers over a single set_memory() variant.
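
The resulting policy (warn only when a VMM hook fails, and stay quiet when the page-table change itself fails, for example under memory pressure) can be sketched as a userspace model. struct enc_model and model_set_memory_enc() are hypothetical names, and the model counts every warning instead of warning only once as WARN_ONCE() would.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical knobs controlling each step of the model. */
struct enc_model {
	bool prepare_ok;  /* result of the "prepare" hook */
	int cpa_ret;      /* result of the page-table change */
	bool finish_ok;   /* result of the "finish" hook */
	int warnings;     /* how many times the model warned */
};

static int model_set_memory_enc(struct enc_model *m)
{
	if (!m->prepare_ok)
		goto vmm_fail;

	/* Guest-side failure (for example -ENOMEM): propagate without warning. */
	if (m->cpa_ret)
		return m->cpa_ret;

	if (!m->finish_ok)
		goto vmm_fail;

	return 0;

vmm_fail:
	/* Both VMM hook failures share one warning site and one error code. */
	m->warnings++;
	return -EIO;
}
```

Routing both hook failures to a single label keeps the warning text and the -EIO return in one place.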

Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Suggested-by: Michael Kelley (LINUX) <mikelley@microsoft.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
For v2:
 - Update commit log to call out importance of PTEs being shared in
   guest for there to be a problem, and that some users may want to
   terminate the guest on any unusual behavior. (Michael Kelley)
 - Remove out label (Thomas Lendacky, Sathyanarayanan Kuppuswamy)

v1 is here:
https://lore.kernel.org/lkml/20231024234829.1443125-1-rick.p.edgecombe@intel.com/
---
 arch/x86/mm/pat/set_memory.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

Comments

Kirill A. Shutemov Oct. 30, 2023, 8:27 a.m. UTC | #1

On Fri, Oct 27, 2023 at 02:47:44PM -0700, Rick Edgecombe wrote:
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> In terms of security, the problematic case is guest PTEs mapping the
> shared alias GFNs, since the VMM has control of the shared mapping in the
> EPT/NPT.
> 
> Such conversion errors may herald future system instability, but are
> temporarily survivable with proper handling in the caller. The kernel
> traditionally makes every effort to keep running, but it is expected that
> some coco guests may prefer to play it safe security-wise, and panic in
> this case. To accommodate both cases, warn when the arch breakouts for
> converting memory at the VMM layer return an error to CPA. Security focused
> users can rely on panic_on_warn to defend against bugs in the callers. Some
> VMMs are not known to behave in the troublesome way, so users that would
> like to terminate on any unusual behavior by the VMM around this will be
> covered as well.
> 
> Since the arch breakouts host the logic for handling coco implementation
> specific errors, an error returned from them means that the set_memory()
> call is out of options for handling the error internally. Make this the
> condition to warn about.
> 
> It is possible that very rarely these functions could fail due to guest
> memory pressure (in the case of failing to allocate a huge page when
> splitting a page table). Don't warn in this case because it is a lot less
> likely to indicate an attack by the host and it is not clear which
> set_memory() calls should get the same treatment. That corner should be
> addressed by future work that considers the more general problem and not
> just papers over a single set_memory() variant.
> 
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Suggested-by: Michael Kelley (LINUX) <mikelley@microsoft.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

The patch looks good:

Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

It is intended to go upstream alongside the caller fixes that leak memory
on failure, right? Maybe put them all in one patchset?

Edgecombe, Rick P Oct. 30, 2023, 4:58 p.m. UTC | #2

On Mon, 2023-10-30 at 11:27 +0300, kirill.shutemov@linux.intel.com
wrote:
> The patch looks good:
> 
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
Thanks!

> It is intended to go upstream alongside the caller fixes that leak
> memory on failure, right? Maybe put them all in one patchset?

Why do you think so? Since the callers are scattered across various
drivers, and those changes are now disconnected from the changes to
CPA, I planned to follow up on each area separately. For example, I
was going to put all the Hyper-V related changes together, but that
part is RFC because I can't really test it. The MS folks said they
could help out there. So the different areas felt like separate
series.

Michael Kelley Oct. 30, 2023, 5:04 p.m. UTC | #3

From: Rick Edgecombe <rick.p.edgecombe@intel.com> Sent: Friday, October 27, 2023 2:48 PM
> 
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take care
> to handle these errors to avoid returning decrypted (shared) memory to the
> page allocator, which could lead to functional or security issues.
> In terms of security, the problematic case is guest PTEs mapping the shared
> alias GFNs, since the VMM has control of the shared mapping in the EPT/NPT.
> 
> Such conversion errors may herald future system instability, but are
> temporarily survivable with proper handling in the caller. The kernel
> traditionally makes every effort to keep running, but it is expected that some
> coco guests may prefer to play it safe security-wise, and panic in this case. To
> accommodate both cases, warn when the arch breakouts for converting
> memory at the VMM layer return an error to CPA. Security focused users can
> rely on panic_on_warn to defend against bugs in the callers. Some VMMs are
> not known to behave in the troublesome way, so users that would like to
> terminate on any unusual behavior by the VMM around this will be covered as
> well.
> 
> Since the arch breakouts host the logic for handling coco implementation
> specific errors, an error returned from them means that the set_memory() call
> is out of options for handling the error internally. Make this the condition to
> warn about.
> 
> It is possible that very rarely these functions could fail due to guest memory
> pressure (in the case of failing to allocate a huge page when splitting a page
> table). Don't warn in this case because it is a lot less likely to indicate an attack
> by the host and it is not clear which
> set_memory() calls should get the same treatment. That corner should be
> addressed by future work that considers the more general problem and not
> just papers over a single set_memory() variant.
> 
> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Suggested-by: Michael Kelley (LINUX) <mikelley@microsoft.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> For v2:
>  - Update commit log to call out importance of PTEs being shared in
>    guest for there to be a problem, and that some users may want to
>    terminate the guest on any unusual behavior. (Michael Kelley)
>  - Remove out label (Thomas Lendacky, Sathyanarayanan Kuppuswamy)
> 
> v1 is here:
> https://lore.kernel.org/lkml/20231024234829.1443125-1-rick.p.edgecombe@intel.com/
> ---
>  arch/x86/mm/pat/set_memory.c | 19 +++++++++++++------
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index bda9f129835e..34f2c0c88a6b 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2153,7 +2153,7 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
> 
>  	/* Notify hypervisor that we are about to set/clr encryption attribute. */
>  	if (!x86_platform.guest.enc_status_change_prepare(addr, numpages, enc))
> -		return -EIO;
> +		goto vmm_fail;
> 
>  	ret = __change_page_attr_set_clr(&cpa, 1);
> 
> @@ -2167,12 +2167,19 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	cpa_flush(&cpa, 0);
> 
>  	/* Notify hypervisor that we have successfully set/clr encryption attribute. */
> -	if (!ret) {
> -		if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
> -			ret = -EIO;
> -	}
> +	if (ret)
> +		return ret;
> 
> -	return ret;
> +	if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
> +		goto vmm_fail;
> +
> +	return 0;
> +
> +vmm_fail:
> +	WARN_ONCE(1, "CPA VMM failure to convert memory (addr=%p, numpages=%d) to %s.\n",
> +		  (void *)addr, numpages, enc ? "private" : "shared");
> +
> +	return -EIO;
>  }
> 
>  static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)
> --
> 2.34.1

Reviewed-by: Michael Kelley <mikelley@microsoft.com>

Dave Hansen Oct. 30, 2023, 5:10 p.m. UTC | #4

I can fix this up when it gets applied, but a nit about the subject:
This isn't handling generic "set_memory_XXcrypted()" failures.  It's
specifically about presumed VMM-specific failures, thus the "vmm_fail"
label and the warning text.

Kirill A. Shutemov Oct. 31, 2023, 6:07 a.m. UTC | #5

On Mon, Oct 30, 2023 at 04:58:37PM +0000, Edgecombe, Rick P wrote:
> > It is intended to go upstream alongside the caller fixes that leak
> > memory on failure, right? Maybe put them all in one patchset?
> 
> Why do you think so? Since the callers are scattered across various
> drivers, and those changes are now disconnected from the changes to
> CPA, I planned to follow up on each area separately. For example, I
> was going to put all the Hyper-V related changes together, but that
> part is RFC because I can't really test it. The MS folks said they
> could help out there. So the different areas felt like separate
> series.

I am okay with doing it separately. I just was not clear on your plans
with the fixes.

Edgecombe, Rick P Dec. 6, 2023, 6:36 p.m. UTC | #6

On Fri, 2023-10-27 at 14:47 -0700, Rick Edgecombe wrote:
> On TDX it is possible for the untrusted host to cause
> set_memory_encrypted() or set_memory_decrypted() to fail such that an
> error is returned and the resulting memory is shared. Callers need to take
> care to handle these errors to avoid returning decrypted (shared) memory to
> the page allocator, which could lead to functional or security issues.
> In terms of security, the problematic case is guest PTEs mapping the
> shared alias GFNs, since the VMM has control of the shared mapping in the
> EPT/NPT.
> 
> Such conversion errors may herald future system instability, but are
> temporarily survivable with proper handling in the caller. The kernel
> traditionally makes every effort to keep running, but it is expected that
> some coco guests may prefer to play it safe security-wise, and panic in
> this case. To accommodate both cases, warn when the arch breakouts for
> converting memory at the VMM layer return an error to CPA. Security focused
> users can rely on panic_on_warn to defend against bugs in the callers. Some
> VMMs are not known to behave in the troublesome way, so users that would
> like to terminate on any unusual behavior by the VMM around this will be
> covered as well.
> 
> Since the arch breakouts host the logic for handling coco implementation
> specific errors, an error returned from them means that the set_memory()
> call is out of options for handling the error internally. Make this the
> condition to warn about.
> 
> It is possible that very rarely these functions could fail due to guest
> memory pressure (in the case of failing to allocate a huge page when
> splitting a page table). Don't warn in this case because it is a lot less
> likely to indicate an attack by the host and it is not clear which
> set_memory() calls should get the same treatment. That corner should be
> addressed by future work that considers the more general problem and not
> just papers over a single set_memory() variant.

x86 maintainers,

If you don't want this patch yet but are OK with the general approach,
could you say so? I didn't want to start fixing up the callers until
this was settled. If you can confirm that you are OK with the approach,
I can start in the meantime.

Thanks,

Rick

Patch

diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index bda9f129835e..34f2c0c88a6b 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2153,7 +2153,7 @@  static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 
 	/* Notify hypervisor that we are about to set/clr encryption attribute. */
 	if (!x86_platform.guest.enc_status_change_prepare(addr, numpages, enc))
-		return -EIO;
+		goto vmm_fail;
 
 	ret = __change_page_attr_set_clr(&cpa, 1);
 
@@ -2167,12 +2167,19 @@  static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 	cpa_flush(&cpa, 0);
 
 	/* Notify hypervisor that we have successfully set/clr encryption attribute. */
-	if (!ret) {
-		if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
-			ret = -EIO;
-	}
+	if (ret)
+		return ret;
 
-	return ret;
+	if (!x86_platform.guest.enc_status_change_finish(addr, numpages, enc))
+		goto vmm_fail;
+
+	return 0;
+
+vmm_fail:
+	WARN_ONCE(1, "CPA VMM failure to convert memory (addr=%p, numpages=%d) to %s.\n",
+		  (void *)addr, numpages, enc ? "private" : "shared");
+
+	return -EIO;
 }
 
 static int __set_memory_enc_dec(unsigned long addr, int numpages, bool enc)