x86/coco: Require seeding RNG with RDRAND on CoCo systems

Message ID 20240209164946.4164052-1-Jason@zx2c4.com
State New

Commit Message

Jason A. Donenfeld Feb. 9, 2024, 4:49 p.m. UTC
There are few uses of CoCo that don't rely on working cryptography and
hence a working RNG. Unfortunately, the CoCo threat model means that the
VM host cannot be trusted and may actively work against guests to
extract secrets or manipulate computation. Since a malicious host can
modify or observe nearly all inputs to guests, the only remaining source
of entropy for CoCo guests is RDRAND.

Unfortunately, RDRAND itself can be rendered unreliable by the host,
since the host controls guest scheduling and can starve RDRAND's
generation. A malicious host could also choose to simply terminate or
not boot a CoCo guest. So, tie the starvation of RDRAND to a BUG_ON at
boot time.

Specifically, try at boot to seed the RNG using 256 bits of RDRAND
output. If these fail, BUG(). This doesn't handle the more complicated
case of reseeding later in boot, but that's fraught with its own
difficulties, such as a malicious userspace starving the kernel. For
now, simply make sure the RNG is initially seeded securely during boot,
avoiding the worst of potential pitfalls.

This patch is deliberately written to be "just a CoCo x86 driver
feature" and not part of the RNG itself. Many device drivers and
platforms have some desire to contribute something to the RNG, and
add_device_randomness() is specifically meant for this purpose. Any
driver can call this with seed data of any quality, or even garbage
quality, and it can only possibly make the quality of the RNG better or
have no effect, but can never make it worse. Rather than trying to
build something into the core of the RNG, this patch interprets the
particular CoCo issue as just a CoCo issue, and therefore separates this
all out into driver (well, arch/platform) code.

Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
Probably this shouldn't be merged until Dave/Elena and others get back
with regards to the full picture, with information from inside Intel.
But I have a feeling this patch, or something like it, is ultimately
what we'll wind up with, so I'm posting it now.

I don't have a functional CoCo setup, so this patch has only been very
lightly tested.

 arch/x86/coco/core.c        | 36 ++++++++++++++++++++++++++++++++++++
 arch/x86/include/asm/coco.h |  2 ++
 arch/x86/kernel/setup.c     |  2 ++
 3 files changed, 40 insertions(+)
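
For illustration, the seeding loop described in the commit message can be modeled in userspace. This is a minimal sketch, not the kernel code: collect_seed(), get_longs_fn, mock_one_word() and mock_starved() are hypothetical names standing in for the loop in cc_random_init() and for arch_get_random_longs(), and the sketch returns -1 at the point where the patch would hit BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical stand-in for arch_get_random_longs(): fills up to
 * max_longs words and returns how many it produced (0 = total failure).
 */
typedef size_t (*get_longs_fn)(unsigned long *v, size_t max_longs);

/* Mock source that yields one word per call, to exercise partial returns. */
static size_t mock_one_word(unsigned long *v, size_t max_longs)
{
	static unsigned long counter = 1;

	if (max_longs == 0)
		return 0;
	v[0] = counter++;	/* not random; placeholder data only */
	return 1;
}

/* Mock source that models a starved RDRAND: it never produces anything. */
static size_t mock_starved(unsigned long *v, size_t max_longs)
{
	(void)v;
	(void)max_longs;
	return 0;
}

/*
 * Fill seed[0..nlongs) by calling the source repeatedly, advancing by
 * however many words each call returned. Returns 0 once the buffer is
 * full, or -1 on the first call that returns zero words.
 */
static int collect_seed(unsigned long *seed, size_t nlongs, get_longs_fn get)
{
	size_t i, longs;

	for (i = 0; i < nlongs; i += longs) {
		longs = get(&seed[i], nlongs - i);
		assert(longs <= nlongs - i);	/* source must not overrun */
		if (longs == 0)
			return -1;
	}
	return 0;
}
```

With a buffer declared as unsigned long seed[32 / sizeof(long)] (256 bits), collect_seed(seed, ARRAY_SIZE(seed), mock_one_word) makes one call per word and returns 0, while collect_seed(seed, ARRAY_SIZE(seed), mock_starved) returns -1 on the first call.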
  

Comments

Andi Kleen Feb. 10, 2024, 5:09 a.m. UTC | #1
> +	for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
> +		longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
> +
> +		/*
> +		 * A zero return value means that the guest is under attack,
> +		 * the hardware is broken, or some other mishap has occurred
> +		 * that means the RNG cannot be properly seeded, which also
> +		 * likely means most crypto inside of the CoCo instance will be
> +		 * broken, defeating the purpose of CoCo in the first place. So
> +		 * just panic here because it's absolutely unsafe to continue
> +		 * executing.
> +		 */
> +		BUG_ON(longs == 0);

BUG_ON doesn't necessarily panic. If you want panic, use panic.

-Andi
  
Tom Lendacky Feb. 23, 2024, 7:33 p.m. UTC | #2
On 2/9/24 10:49, Jason A. Donenfeld wrote:
> There are few uses of CoCo that don't rely on working cryptography and
> hence a working RNG. Unfortunately, the CoCo threat model means that the
> VM host cannot be trusted and may actively work against guests to
> extract secrets or manipulate computation. Since a malicious host can
> modify or observe nearly all inputs to guests, the only remaining source
> of entropy for CoCo guests is RDRAND.
> 
> Unfortunately, RDRAND itself can be rendered unreliable by the host,
> since the host controls guest scheduling and can starve RDRAND's
> generation. A malicious host could also choose to simply terminate or
> not boot a CoCo guest. So, tie the starvation of RDRAND to a BUG_ON at
> boot time.
> 
> Specifically, try at boot to seed the RNG using 256 bits of RDRAND
> output. If these fail, BUG(). This doesn't handle the more complicated
> case of reseeding later in boot, but that's fraught with its own
> difficulties, such as a malicious userspace starving the kernel. For
> now, simply make sure the RNG is initially seeded securely during boot,
> avoiding the worst of potential pitfalls.
> 
> This patch is deliberately written to be "just a CoCo x86 driver
> feature" and not part of the RNG itself. Many device drivers and
> platforms have some desire to contribute something to the RNG, and
> add_device_randomness() is specifically meant for this purpose. Any
> driver can call this with seed data of any quality, or even garbage
> quality, and it can only possibly make the quality of the RNG better or
> have no effect, but can never make it worse. Rather than trying to
> build something into the core of the RNG, this patch interprets the
> particular CoCo issue as just a CoCo issue, and therefore separates this
> all out into driver (well, arch/platform) code.
> 
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Daniel P. Berrangé <berrange@redhat.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Elena Reshetova <elena.reshetova@intel.com>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Theodore Ts'o <tytso@mit.edu>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> Probably this shouldn't be merged until Dave/Elena and others get back
> with regards to the full picture, with information from inside Intel.
> But I have a feeling this patch, or something like it, is ultimately
> what we'll wind up with, so I'm posting it now.
> 
> I don't have a functional CoCo setup, so this patch has only been very
> lightly tested.
> 
>   arch/x86/coco/core.c        | 36 ++++++++++++++++++++++++++++++++++++
>   arch/x86/include/asm/coco.h |  2 ++
>   arch/x86/kernel/setup.c     |  2 ++
>   3 files changed, 40 insertions(+)
> 
> diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
> index eeec9986570e..4e3b1cfe0063 100644
> --- a/arch/x86/coco/core.c
> +++ b/arch/x86/coco/core.c
> @@ -3,13 +3,16 @@
>    * Confidential Computing Platform Capability checks
>    *
>    * Copyright (C) 2021 Advanced Micro Devices, Inc.
> + * Copyright (C) 2024 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
>    *
>    * Author: Tom Lendacky <thomas.lendacky@amd.com>
>    */
>   
>   #include <linux/export.h>
>   #include <linux/cc_platform.h>
> +#include <linux/random.h>
>   
> +#include <asm/archrandom.h>
>   #include <asm/coco.h>
>   #include <asm/processor.h>
>   
> @@ -153,3 +156,36 @@ __init void cc_set_mask(u64 mask)
>   {
>   	cc_mask = mask;
>   }
> +
> +__init void cc_random_init(void)
> +{
> +	unsigned long rng_seed[32 / sizeof(long)];
> +	size_t i, longs;
> +
> +	if (cc_vendor == CC_VENDOR_NONE)

You probably want to use:

	if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
		return;

Otherwise, you can hit the bare-metal case where AMD SME is active and 
then cc_vendor will not be CC_VENDOR_NONE.

Thanks,
Tom
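
The difference between the two checks can be sketched with a toy userspace model. Everything here is an assumption made for illustration: struct toy_platform and the seeds_by_*() helpers are hypothetical, and the guest_mem_encrypt field merely stands in for what cc_platform_has() would report for the guest memory encryption attribute (CC_ATTR_GUEST_MEM_ENCRYPT in the kernel's enum cc_attr).

```c
#include <stdbool.h>

/* Toy model only: names mirror the kernel's, but this is not kernel code. */
enum cc_vendor { CC_VENDOR_NONE, CC_VENDOR_AMD, CC_VENDOR_INTEL };

struct toy_platform {
	enum cc_vendor vendor;
	bool guest_mem_encrypt;	/* models the guest memory encryption attribute */
};

/* Bare-metal AMD host with SME active: vendor is set, but it is not a guest. */
static const struct toy_platform sme_host  = { CC_VENDOR_AMD,  false };
/* Confidential guest: vendor is set and guest memory is encrypted. */
static const struct toy_platform sev_guest = { CC_VENDOR_AMD,  true  };
/* Ordinary machine with no confidential computing support in use. */
static const struct toy_platform plain     = { CC_VENDOR_NONE, false };

/* Check as written in the patch: proceed unless the vendor is NONE. */
static bool seeds_by_vendor(const struct toy_platform *p)
{
	return p->vendor != CC_VENDOR_NONE;
}

/* Check as suggested in the review: proceed only for encrypted guests. */
static bool seeds_by_guest_attr(const struct toy_platform *p)
{
	return p->guest_mem_encrypt;
}
```

In this model the two predicates agree for sev_guest and plain, and disagree only for sme_host, which is the bare-metal case described above.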

> +		return;
> +
> +	/*
> +	 * Since the CoCo threat model includes the host, the only reliable
> +	 * source of entropy that can be neither observed nor manipulated is
> +	 * RDRAND. Usually, RDRAND failure is considered tolerable, but since a
> +	 * host can possibly induce failures consistently, it's important to at
> +	 * least ensure the RNG gets some initial random seeds.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
> +		longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
> +
> +		/*
> +		 * A zero return value means that the guest is under attack,
> +		 * the hardware is broken, or some other mishap has occurred
> +		 * that means the RNG cannot be properly seeded, which also
> +		 * likely means most crypto inside of the CoCo instance will be
> +		 * broken, defeating the purpose of CoCo in the first place. So
> +		 * just panic here because it's absolutely unsafe to continue
> +		 * executing.
> +		 */
> +		BUG_ON(longs == 0);
> +	}
> +	add_device_randomness(rng_seed, sizeof(rng_seed));
> +	memzero_explicit(rng_seed, sizeof(rng_seed));
> +}
> diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
> index 76c310b19b11..e9d059449885 100644
> --- a/arch/x86/include/asm/coco.h
> +++ b/arch/x86/include/asm/coco.h
> @@ -15,6 +15,7 @@ extern enum cc_vendor cc_vendor;
>   void cc_set_mask(u64 mask);
>   u64 cc_mkenc(u64 val);
>   u64 cc_mkdec(u64 val);
> +void cc_random_init(void);
>   #else
>   #define cc_vendor (CC_VENDOR_NONE)
>   
> @@ -27,6 +28,7 @@ static inline u64 cc_mkdec(u64 val)
>   {
>   	return val;
>   }
> +static inline void cc_random_init(void) { }
>   #endif
>   
>   #endif /* _ASM_X86_COCO_H */
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 84201071dfac..30a653cfc7d2 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -36,6 +36,7 @@
>   #include <asm/bios_ebda.h>
>   #include <asm/bugs.h>
>   #include <asm/cacheinfo.h>
> +#include <asm/coco.h>
>   #include <asm/cpu.h>
>   #include <asm/efi.h>
>   #include <asm/gart.h>
> @@ -994,6 +995,7 @@ void __init setup_arch(char **cmdline_p)
>   	 * memory size.
>   	 */
>   	mem_encrypt_setup_arch();
> +	cc_random_init();
>   
>   	efi_fake_memmap();
>   	efi_find_mirror();
  

Patch

diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index eeec9986570e..4e3b1cfe0063 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -3,13 +3,16 @@ 
  * Confidential Computing Platform Capability checks
  *
  * Copyright (C) 2021 Advanced Micro Devices, Inc.
+ * Copyright (C) 2024 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
  *
  * Author: Tom Lendacky <thomas.lendacky@amd.com>
  */
 
 #include <linux/export.h>
 #include <linux/cc_platform.h>
+#include <linux/random.h>
 
+#include <asm/archrandom.h>
 #include <asm/coco.h>
 #include <asm/processor.h>
 
@@ -153,3 +156,36 @@  __init void cc_set_mask(u64 mask)
 {
 	cc_mask = mask;
 }
+
+__init void cc_random_init(void)
+{
+	unsigned long rng_seed[32 / sizeof(long)];
+	size_t i, longs;
+
+	if (cc_vendor == CC_VENDOR_NONE)
+		return;
+
+	/*
+	 * Since the CoCo threat model includes the host, the only reliable
+	 * source of entropy that can be neither observed nor manipulated is
+	 * RDRAND. Usually, RDRAND failure is considered tolerable, but since a
+	 * host can possibly induce failures consistently, it's important to at
+	 * least ensure the RNG gets some initial random seeds.
+	 */
+	for (i = 0; i < ARRAY_SIZE(rng_seed); i += longs) {
+		longs = arch_get_random_longs(&rng_seed[i], ARRAY_SIZE(rng_seed) - i);
+
+		/*
+		 * A zero return value means that the guest is under attack,
+		 * the hardware is broken, or some other mishap has occurred
+		 * that means the RNG cannot be properly seeded, which also
+		 * likely means most crypto inside of the CoCo instance will be
+		 * broken, defeating the purpose of CoCo in the first place. So
+		 * just panic here because it's absolutely unsafe to continue
+		 * executing.
+		 */
+		BUG_ON(longs == 0);
+	}
+	add_device_randomness(rng_seed, sizeof(rng_seed));
+	memzero_explicit(rng_seed, sizeof(rng_seed));
+}
diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h
index 76c310b19b11..e9d059449885 100644
--- a/arch/x86/include/asm/coco.h
+++ b/arch/x86/include/asm/coco.h
@@ -15,6 +15,7 @@  extern enum cc_vendor cc_vendor;
 void cc_set_mask(u64 mask);
 u64 cc_mkenc(u64 val);
 u64 cc_mkdec(u64 val);
+void cc_random_init(void);
 #else
 #define cc_vendor (CC_VENDOR_NONE)
 
@@ -27,6 +28,7 @@  static inline u64 cc_mkdec(u64 val)
 {
 	return val;
 }
+static inline void cc_random_init(void) { }
 #endif
 
 #endif /* _ASM_X86_COCO_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 84201071dfac..30a653cfc7d2 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -36,6 +36,7 @@ 
 #include <asm/bios_ebda.h>
 #include <asm/bugs.h>
 #include <asm/cacheinfo.h>
+#include <asm/coco.h>
 #include <asm/cpu.h>
 #include <asm/efi.h>
 #include <asm/gart.h>
@@ -994,6 +995,7 @@  void __init setup_arch(char **cmdline_p)
 	 * memory size.
 	 */
 	mem_encrypt_setup_arch();
+	cc_random_init();
 
 	efi_fake_memmap();
 	efi_find_mirror();