x86/head/64: Switch to KERNEL_CS as soon as new GDT is installed

Message ID 6ff1f28af2829cc9aea357ebee285825f90a431f.1684340801.git.thomas.lendacky@amd.com
State New
Headers
Series x86/head/64: Switch to KERNEL_CS as soon as new GDT is installed |

Commit Message

Tom Lendacky May 17, 2023, 4:26 p.m. UTC
  The call to startup_64_setup_env() will install a new GDT but does not
actually switch to using the KERNEL_CS entry until returning from the
function call.

Commit bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in
boot") moved the call to sme_enable() earlier in the boot process and in
between the call to startup_64_setup_env() and the switch to KERNEL_CS.
An SEV-ES or an SEV-SNP guest will trigger #VC exceptions during the call
to sme_enable() and if the CS pushed on the stack as part of the exception
and used by IRETQ is not mapped by the new GDT, then problems occur.
Today, the current CS when entering startup_64 is the kernel CS value
because it was set up by the decompressor code, so no issue is seen.

However, a recent patchset that looked to avoid using the legacy
decompressor during an EFI boot exposed this bug. At entry to startup_64,
the CS value is that of EFI and is not mapped in the new kernel GDT. So
when a #VC exception occurs, the CS value used by IRETQ is not valid and
the guest boot crashes.

Fix this issue by moving the block that switches to the KERNEL_CS value to
be done immediately after returning from startup_64_setup_env().

Fixes: bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in boot")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kernel/head_64.S | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
  

Comments

Joerg Roedel May 22, 2023, 1:12 p.m. UTC | #1
On Wed, May 17, 2023 at 11:26:41AM -0500, Tom Lendacky wrote:
> The call to startup_64_setup_env() will install a new GDT but does not
> actually switch to using the KERNEL_CS entry until returning from the
> function call.
> 
> Commit bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in
> boot") moved the call to sme_enable() earlier in the boot process and in
> between the call to startup_64_setup_env() and the switch to KERNEL_CS.
> An SEV-ES or an SEV-SNP guest will trigger #VC exceptions during the call
> to sme_enable() and if the CS pushed on the stack as part of the exception
> and used by IRETQ is not mapped by the new GDT, then problems occur.
> Today, the current CS when entering startup_64 is the kernel CS value
> because it was set up by the decompressor code, so no issue is seen.
> 
> However, a recent patchset that looked to avoid using the legacy
> decompressor during an EFI boot exposed this bug. At entry to startup_64,
> the CS value is that of EFI and is not mapped in the new kernel GDT. So
> when a #VC exception occurs, the CS value used by IRETQ is not valid and
> the guest boot crashes.
> 
> Fix this issue by moving the block that switches to the KERNEL_CS value to
> be done immediately after returning from startup_64_setup_env().
> 
> Fixes: bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in boot")
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/kernel/head_64.S | 18 +++++++++---------
>  1 file changed, 9 insertions(+), 9 deletions(-)

Makes sense, thanks for the fix.

Reviewed-by: Joerg Roedel <jroedel@suse.de>

> 
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 9cd77d319555..c5b9289837dc 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -79,6 +79,15 @@ SYM_CODE_START_NOALIGN(startup_64)
>  	call	startup_64_setup_env
>  	popq	%rsi
>  
> +	/* Now switch to __KERNEL_CS so IRET works reliably */
> +	pushq	$__KERNEL_CS
> +	leaq	.Lon_kernel_cs(%rip), %rax
> +	pushq	%rax
> +	lretq
> +
> +.Lon_kernel_cs:
> +	UNWIND_HINT_END_OF_STACK
> +
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
>  	/*
>  	 * Activate SEV/SME memory encryption if supported/enabled. This needs to
> @@ -92,15 +101,6 @@ SYM_CODE_START_NOALIGN(startup_64)
>  	popq	%rsi
>  #endif
>  
> -	/* Now switch to __KERNEL_CS so IRET works reliably */
> -	pushq	$__KERNEL_CS
> -	leaq	.Lon_kernel_cs(%rip), %rax
> -	pushq	%rax
> -	lretq
> -
> -.Lon_kernel_cs:
> -	UNWIND_HINT_END_OF_STACK
> -
>  	/* Sanitize CPU configuration */
>  	call verify_cpu
>  
> -- 
> 2.40.0
>
  
Dave Hansen May 25, 2023, 11:17 p.m. UTC | #2
On 5/17/23 09:26, Tom Lendacky wrote:
> However, a recent patchset that looked to avoid using the legacy
> decompressor during an EFI boot exposed this bug. At entry to startup_64,
> the CS value is that of EFI and is not mapped in the new kernel GDT. So
> when a #VC exception occurs, the CS value used by IRETQ is not valid and
> the guest boot crashes.

This confused me a bit.  Nobody merged that patchset yet, right?  You
just happened across this issue when debugging a crash in that *other* set?

> Fix this issue by moving the block that switches to the KERNEL_CS value to
> be done immediately after returning from startup_64_setup_env().
> 
> Fixes: bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in boot")
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

Any thoughts on whether we want this in stable@?

I also wonder whether we need a comment in that little chunk of code
something along the lines of:

	/*
	 * Do not add anything which might take a fault or exception.
	 * IRET does not work here.
	 */

Michael, do you think you would have spotted something like this had it
been in the code when you were patching it?
  
Tom Lendacky May 26, 2023, 1:07 p.m. UTC | #3
On 5/25/23 18:17, Dave Hansen wrote:
> On 5/17/23 09:26, Tom Lendacky wrote:
>> However, a recent patchset that looked to avoid using the legacy
>> decompressor during an EFI boot exposed this bug. At entry to startup_64,
>> the CS value is that of EFI and is not mapped in the new kernel GDT. So
>> when a #VC exception occurs, the CS value used by IRETQ is not valid and
>> the guest boot crashes.
> 
> This confused me a bit.  Nobody merged that patchset yet, right?  You
> just happened across this issue when debugging a crash in that *other* set?

Correct, it was Ard's EFI decompressor patchset series that he submitted, 
but has not yet been accepted, that I was testing:

[PATCH 0/6] efi/x86: Avoid legacy decompressor during EFI boot
   https://lore.kernel.org/lkml/20230424165726.2245548-1-ardb@kernel.org/

I reported the problem to Ard and submitted this patch before I realized 
that he also included a patch in his next version of the series:

https://lore.kernel.org/lkml/20230508070330.582131-16-ardb@kernel.org/

> 
>> Fix this issue by moving the block that switches to the KERNEL_CS value to
>> be done immediately after returning from startup_64_setup_env().
>>
>> Fixes: bcce82908333 ("x86/sev: Detect/setup SEV/SME features earlier in boot")
>> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> 
> Any thoughts on whether we want this in stable@?

It probably doesn't need to go to stable since the current decompressor 
code switches to a new GDT and uses the same kernel CS value that is being 
setup in arch/x86/kernel/head_64.S (which is why it hasn't been a problem).

> 
> I also wonder whether we need a comment in that little chunk of code
> something along the lines of:
> 
> 	/*
> 	 * Do not add anything which might take a fault or exception.
> 	 * IRET does not work here.
> 	 */
> 
> Michael, do you think you would have spotted something like this had it
> been in the code when you were patching it?

Let me know if you would like a v2 with that comment. I'm also fine if you 
  just want to add it as part of applying the patch, too.

Thanks,
Tom
  

Patch

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 9cd77d319555..c5b9289837dc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -79,6 +79,15 @@  SYM_CODE_START_NOALIGN(startup_64)
 	call	startup_64_setup_env
 	popq	%rsi
 
+	/* Now switch to __KERNEL_CS so IRET works reliably */
+	pushq	$__KERNEL_CS
+	leaq	.Lon_kernel_cs(%rip), %rax
+	pushq	%rax
+	lretq
+
+.Lon_kernel_cs:
+	UNWIND_HINT_END_OF_STACK
+
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 	/*
 	 * Activate SEV/SME memory encryption if supported/enabled. This needs to
@@ -92,15 +101,6 @@  SYM_CODE_START_NOALIGN(startup_64)
 	popq	%rsi
 #endif
 
-	/* Now switch to __KERNEL_CS so IRET works reliably */
-	pushq	$__KERNEL_CS
-	leaq	.Lon_kernel_cs(%rip), %rax
-	pushq	%rax
-	lretq
-
-.Lon_kernel_cs:
-	UNWIND_HINT_END_OF_STACK
-
 	/* Sanitize CPU configuration */
 	call verify_cpu