[v3,00/17] x86: head_64.S spring cleaning

Message ID 20221122161017.2426828-1-ardb@kernel.org
Headers
Series x86: head_64.S spring cleaning |

Message

Ard Biesheuvel Nov. 22, 2022, 4:10 p.m. UTC
  After doing some cleanup work on the EFI code in head_64.S, the mixed
mode code in particular, I noticed that the memory encryption pieces
could use some attention as well, so I cleaned that up too.

Changes since v2:
- add some clarifying comments to the EFI mixed mode changes
- include patch to make the EFI handover protocol optional that was sent
  out separately before
- rebase onto tip/master

Changes since v1:
- at Boris's request, split the patches into smaller ones that are
  easier to review

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michael Roth <michael.roth@amd.com>

Ard Biesheuvel (17):
  x86/compressed: efi-mixed: rename efi_thunk_64.S to efi-mixed.S
  x86/compressed: efi-mixed: move 32-bit entrypoint code into .text
    section
  x86/compressed: efi-mixed: move bootargs parsing out of 32-bit startup
    code
  x86/compressed: efi-mixed: move efi32_pe_entry into .text section
  x86/compressed: efi-mixed: move efi32_entry out of head_64.S
  x86/compressed: efi-mixed: move efi32_pe_entry() out of head_64.S
  x86/compressed: efi: merge multiple definitions of image_offset into
    one
  x86/compressed: efi-mixed: simplify IDT/GDT preserve/restore
  x86/compressed: avoid touching ECX in startup32_set_idt_entry()
  x86/compressed: pull global variable ref up into startup32_load_idt()
  x86/compressed: move startup32_load_idt() into .text section
  x86/compressed: move startup32_load_idt() out of head_64.S
  x86/compressed: move startup32_check_sev_cbit() into .text
  x86/compressed: move startup32_check_sev_cbit() out of head_64.S
  x86/compressed: adhere to calling convention in
    get_sev_encryption_bit()
  x86/compressed: only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
  efi: x86: Make the deprecated EFI handover protocol optional

 arch/x86/Kconfig                        |  17 +
 arch/x86/boot/compressed/Makefile       |   8 +-
 arch/x86/boot/compressed/efi_mixed.S    | 351 ++++++++++++++++++++
 arch/x86/boot/compressed/efi_thunk_64.S | 195 -----------
 arch/x86/boot/compressed/head_32.S      |   4 -
 arch/x86/boot/compressed/head_64.S      | 303 +----------------
 arch/x86/boot/compressed/mem_encrypt.S  | 152 ++++++++-
 arch/x86/boot/header.S                  |   2 +-
 arch/x86/boot/tools/build.c             |   2 +
 drivers/firmware/efi/libstub/x86-stub.c |   2 +-
 10 files changed, 533 insertions(+), 503 deletions(-)
 create mode 100644 arch/x86/boot/compressed/efi_mixed.S
 delete mode 100644 arch/x86/boot/compressed/efi_thunk_64.S
  

Comments

Tom Lendacky Nov. 22, 2022, 8:48 p.m. UTC | #1
On 11/22/22 10:10, Ard Biesheuvel wrote:
> After doing some cleanup work on the EFI code in head_64.S, the mixed
> mode code in particular, I noticed that the memory encryption pieces
> could use some attention as well, so I cleaned that up too.
> 
> Changes since v2:
> - add some clarifying comments to the EFI mixed mode changes
> - include patch to make the EFI handover protocol optional that was sent
>    out separately before
> - rebase onto tip/master
> 
> Changes since v1:
> - at Boris's request, split the patches into smaller ones that are
>    easier to review
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Michael Roth <michael.roth@amd.com>

This causes an SEV guest to blow up on boot in the early boot code. It
looks like the stack pointer is not valid and it triple faults on a pushq
instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
startup_64).

Here is the Qemu register dump:
RAX=00000000029cc260 RBX=ffffffffdd98c000 RCX=0000000000000010 RDX=0000000000000002
RSI=000000003dec1000 RDI=0000000000000000 RBP=ffffffffdb000000 RSP=ffffffffde36e000
R8 =000000003dec1410 R9 =000000003dec13fc R10=000000000000006c R11=0000000000000000
R12=0000000000000000 R13=0000000000000001 R14=0000000000000004 R15=000000003eacdf44
RIP=0000000002000263 RFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 00000000 00000000
CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0000 0000000000000000 00000000 00000000
DS =0000 0000000000000000 00000000 00000000
FS =0000 0000000000000000 00000000 00000000
GS =0000 0000000000000000 00000000 00000000
LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
GDT=     00000000029cc270 0000002f
IDT=     000000003f55e018 00000fff
CR0=80010033 CR2=ffffffffde36dff8 CR3=000000003fc01000 CR4=00000668
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d00

Thanks,
Tom

> 
> Ard Biesheuvel (17):
>    x86/compressed: efi-mixed: rename efi_thunk_64.S to efi-mixed.S
>    x86/compressed: efi-mixed: move 32-bit entrypoint code into .text
>      section
>    x86/compressed: efi-mixed: move bootargs parsing out of 32-bit startup
>      code
>    x86/compressed: efi-mixed: move efi32_pe_entry into .text section
>    x86/compressed: efi-mixed: move efi32_entry out of head_64.S
>    x86/compressed: efi-mixed: move efi32_pe_entry() out of head_64.S
>    x86/compressed: efi: merge multiple definitions of image_offset into
>      one
>    x86/compressed: efi-mixed: simplify IDT/GDT preserve/restore
>    x86/compressed: avoid touching ECX in startup32_set_idt_entry()
>    x86/compressed: pull global variable ref up into startup32_load_idt()
>    x86/compressed: move startup32_load_idt() into .text section
>    x86/compressed: move startup32_load_idt() out of head_64.S
>    x86/compressed: move startup32_check_sev_cbit() into .text
>    x86/compressed: move startup32_check_sev_cbit() out of head_64.S
>    x86/compressed: adhere to calling convention in
>      get_sev_encryption_bit()
>    x86/compressed: only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
>    efi: x86: Make the deprecated EFI handover protocol optional
> 
>   arch/x86/Kconfig                        |  17 +
>   arch/x86/boot/compressed/Makefile       |   8 +-
>   arch/x86/boot/compressed/efi_mixed.S    | 351 ++++++++++++++++++++
>   arch/x86/boot/compressed/efi_thunk_64.S | 195 -----------
>   arch/x86/boot/compressed/head_32.S      |   4 -
>   arch/x86/boot/compressed/head_64.S      | 303 +----------------
>   arch/x86/boot/compressed/mem_encrypt.S  | 152 ++++++++-
>   arch/x86/boot/header.S                  |   2 +-
>   arch/x86/boot/tools/build.c             |   2 +
>   drivers/firmware/efi/libstub/x86-stub.c |   2 +-
>   10 files changed, 533 insertions(+), 503 deletions(-)
>   create mode 100644 arch/x86/boot/compressed/efi_mixed.S
>   delete mode 100644 arch/x86/boot/compressed/efi_thunk_64.S
>
  
Ard Biesheuvel Nov. 22, 2022, 9:37 p.m. UTC | #2
On Tue, 22 Nov 2022 at 21:48, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 11/22/22 10:10, Ard Biesheuvel wrote:
> > After doing some cleanup work on the EFI code in head_64.S, the mixed
> > mode code in particular, I noticed that the memory encryption pieces
> > could use some attention as well, so I cleaned that up too.
> >
> > Changes since v2:
> > - add some clarifying comments to the EFI mixed mode changes
> > - include patch to make the EFI handover protocol optional that was sent
> >    out separately before
> > - rebase onto tip/master
> >
> > Changes since v1:
> > - at Boris's request, split the patches into smaller ones that are
> >    easier to review
> >
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > Cc: Michael Roth <michael.roth@amd.com>
>
> This causes an SEV guest to blow up on boot in the early boot code. It
> looks like the stack pointer is not valid and it triple faults on a pushq
> instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
> startup_64).
>

Thanks for the report.

So the mystery here (at least to me) is that all the changes are to
the 32-bit code, and startup_64 reloads the stack pointer from the
symbol

Does your config have CONFIG_EFI_MIXED enabled?

Can I reproduce this fully emulated with QEMU? Or do I need a SEV host?

> Here is the Qemu register dump:
> RAX=00000000029cc260 RBX=ffffffffdd98c000 RCX=0000000000000010 RDX=0000000000000002
> RSI=000000003dec1000 RDI=0000000000000000 RBP=ffffffffdb000000 RSP=ffffffffde36e000
> R8 =000000003dec1410 R9 =000000003dec13fc R10=000000000000006c R11=0000000000000000
> R12=0000000000000000 R13=0000000000000001 R14=0000000000000004 R15=000000003eacdf44
> RIP=0000000002000263 RFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
> ES =0000 0000000000000000 00000000 00000000
> CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
> SS =0000 0000000000000000 00000000 00000000
> DS =0000 0000000000000000 00000000 00000000
> FS =0000 0000000000000000 00000000 00000000
> GS =0000 0000000000000000 00000000 00000000
> LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
> TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
> GDT=     00000000029cc270 0000002f
> IDT=     000000003f55e018 00000fff
> CR0=80010033 CR2=ffffffffde36dff8 CR3=000000003fc01000 CR4=00000668
> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
> DR6=00000000ffff0ff0 DR7=0000000000000400
> EFER=0000000000000d00
>
> Thanks,
> Tom
>
> >
> > Ard Biesheuvel (17):
> >    x86/compressed: efi-mixed: rename efi_thunk_64.S to efi-mixed.S
> >    x86/compressed: efi-mixed: move 32-bit entrypoint code into .text
> >      section
> >    x86/compressed: efi-mixed: move bootargs parsing out of 32-bit startup
> >      code
> >    x86/compressed: efi-mixed: move efi32_pe_entry into .text section
> >    x86/compressed: efi-mixed: move efi32_entry out of head_64.S
> >    x86/compressed: efi-mixed: move efi32_pe_entry() out of head_64.S
> >    x86/compressed: efi: merge multiple definitions of image_offset into
> >      one
> >    x86/compressed: efi-mixed: simplify IDT/GDT preserve/restore
> >    x86/compressed: avoid touching ECX in startup32_set_idt_entry()
> >    x86/compressed: pull global variable ref up into startup32_load_idt()
> >    x86/compressed: move startup32_load_idt() into .text section
> >    x86/compressed: move startup32_load_idt() out of head_64.S
> >    x86/compressed: move startup32_check_sev_cbit() into .text
> >    x86/compressed: move startup32_check_sev_cbit() out of head_64.S
> >    x86/compressed: adhere to calling convention in
> >      get_sev_encryption_bit()
> >    x86/compressed: only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
> >    efi: x86: Make the deprecated EFI handover protocol optional
> >
> >   arch/x86/Kconfig                        |  17 +
> >   arch/x86/boot/compressed/Makefile       |   8 +-
> >   arch/x86/boot/compressed/efi_mixed.S    | 351 ++++++++++++++++++++
> >   arch/x86/boot/compressed/efi_thunk_64.S | 195 -----------
> >   arch/x86/boot/compressed/head_32.S      |   4 -
> >   arch/x86/boot/compressed/head_64.S      | 303 +----------------
> >   arch/x86/boot/compressed/mem_encrypt.S  | 152 ++++++++-
> >   arch/x86/boot/header.S                  |   2 +-
> >   arch/x86/boot/tools/build.c             |   2 +
> >   drivers/firmware/efi/libstub/x86-stub.c |   2 +-
> >   10 files changed, 533 insertions(+), 503 deletions(-)
> >   create mode 100644 arch/x86/boot/compressed/efi_mixed.S
> >   delete mode 100644 arch/x86/boot/compressed/efi_thunk_64.S
> >
  
Ard Biesheuvel Nov. 22, 2022, 9:42 p.m. UTC | #3
On Tue, 22 Nov 2022 at 22:37, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Tue, 22 Nov 2022 at 21:48, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >
> > On 11/22/22 10:10, Ard Biesheuvel wrote:
> > > After doing some cleanup work on the EFI code in head_64.S, the mixed
> > > mode code in particular, I noticed that the memory encryption pieces
> > > could use some attention as well, so I cleaned that up too.
> > >
> > > Changes since v2:
> > > - add some clarifying comments to the EFI mixed mode changes
> > > - include patch to make the EFI handover protocol optional that was sent
> > >    out separately before
> > > - rebase onto tip/master
> > >
> > > Changes since v1:
> > > - at Boris's request, split the patches into smaller ones that are
> > >    easier to review
> > >
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Ingo Molnar <mingo@redhat.com>
> > > Cc: Borislav Petkov <bp@alien8.de>
> > > Cc: Dave Hansen <dave.hansen@linux.intel.com>
> > > Cc: Michael Roth <michael.roth@amd.com>
> >
> > This causes an SEV guest to blow up on boot in the early boot code. It
> > looks like the stack pointer is not valid and it triple faults on a pushq
> > instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
> > startup_64).
> >
>
> Thanks for the report.
>
> So the mystery here (at least to me) is that all the changes are to
> the 32-bit code, and startup_64 reloads the stack pointer from the
> symbol
>
> Does your config have CONFIG_EFI_MIXED enabled?
>
> Can I reproduce this fully emulated with QEMU? Or do I need a SEV host?
>

Also, mind giving this a quick spin?

diff --git a/drivers/firmware/efi/libstub/x86-stub.c
b/drivers/firmware/efi/libstub/x86-stub.c
index cb5f0befee57..1af11d34bc6c 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -23,7 +23,7 @@

 const efi_system_table_t *efi_system_table;
 const efi_dxe_services_table_t *efi_dxe_table;
-u32 image_offset;
+u32 __section(".data") image_offset;
 static efi_loaded_image_t *image = NULL;

 static efi_status_t

Thanks,
Ard.
  
Tom Lendacky Nov. 22, 2022, 9:49 p.m. UTC | #4
On 11/22/22 15:37, Ard Biesheuvel wrote:
> On Tue, 22 Nov 2022 at 21:48, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>
>> On 11/22/22 10:10, Ard Biesheuvel wrote:
>>> After doing some cleanup work on the EFI code in head_64.S, the mixed
>>> mode code in particular, I noticed that the memory encryption pieces
>>> could use some attention as well, so I cleaned that up too.
>>>
>>> Changes since v2:
>>> - add some clarifying comments to the EFI mixed mode changes
>>> - include patch to make the EFI handover protocol optional that was sent
>>>     out separately before
>>> - rebase onto tip/master
>>>
>>> Changes since v1:
>>> - at Boris's request, split the patches into smaller ones that are
>>>     easier to review
>>>
>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: Borislav Petkov <bp@alien8.de>
>>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>>> Cc: Michael Roth <michael.roth@amd.com>
>>
>> This causes an SEV guest to blow up on boot in the early boot code. It
>> looks like the stack pointer is not valid and it triple faults on a pushq
>> instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
>> startup_64).
>>
> 
> Thanks for the report.
> 
> So the mystery here (at least to me) is that all the changes are to
> the 32-bit code, and startup_64 reloads the stack pointer from the
> symbol

I bisected it to:

99b7f4b23d9f ("x86/boot/compressed, efi: Merge multiple definitions of image_offset into one")

And doing the following fixed it:

diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index cb5f0befee57..a0bfd31358ba 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -23,7 +23,7 @@
  
  const efi_system_table_t *efi_system_table;
  const efi_dxe_services_table_t *efi_dxe_table;
-u32 image_offset;
+u32 image_offset __section(".data");
  static efi_loaded_image_t *image = NULL;
  
  static efi_status_t

I assume it has to do with being in .data vs .bss and not being explicitly
cleared with the encryption bit set. With the change to put image_offset in
the .data section, it is read as zero, where as when it was in the .bss
section it was reading "ciphertext".

Thanks,
Tom

> 
> Does your config have CONFIG_EFI_MIXED enabled?
> 
> Can I reproduce this fully emulated with QEMU? Or do I need a SEV host?
> 
>> Here is the Qemu register dump:
>> RAX=00000000029cc260 RBX=ffffffffdd98c000 RCX=0000000000000010 RDX=0000000000000002
>> RSI=000000003dec1000 RDI=0000000000000000 RBP=ffffffffdb000000 RSP=ffffffffde36e000
>> R8 =000000003dec1410 R9 =000000003dec13fc R10=000000000000006c R11=0000000000000000
>> R12=0000000000000000 R13=0000000000000001 R14=0000000000000004 R15=000000003eacdf44
>> RIP=0000000002000263 RFL=00200002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 0000000000000000 00000000 00000000
>> CS =0038 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
>> SS =0000 0000000000000000 00000000 00000000
>> DS =0000 0000000000000000 00000000 00000000
>> FS =0000 0000000000000000 00000000 00000000
>> GS =0000 0000000000000000 00000000 00000000
>> LDT=0000 0000000000000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 0000000000000000 0000ffff 00008b00 DPL=0 TSS64-busy
>> GDT=     00000000029cc270 0000002f
>> IDT=     000000003f55e018 00000fff
>> CR0=80010033 CR2=ffffffffde36dff8 CR3=000000003fc01000 CR4=00000668
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000d00
>>
>> Thanks,
>> Tom
>>
>>>
>>> Ard Biesheuvel (17):
>>>     x86/compressed: efi-mixed: rename efi_thunk_64.S to efi-mixed.S
>>>     x86/compressed: efi-mixed: move 32-bit entrypoint code into .text
>>>       section
>>>     x86/compressed: efi-mixed: move bootargs parsing out of 32-bit startup
>>>       code
>>>     x86/compressed: efi-mixed: move efi32_pe_entry into .text section
>>>     x86/compressed: efi-mixed: move efi32_entry out of head_64.S
>>>     x86/compressed: efi-mixed: move efi32_pe_entry() out of head_64.S
>>>     x86/compressed: efi: merge multiple definitions of image_offset into
>>>       one
>>>     x86/compressed: efi-mixed: simplify IDT/GDT preserve/restore
>>>     x86/compressed: avoid touching ECX in startup32_set_idt_entry()
>>>     x86/compressed: pull global variable ref up into startup32_load_idt()
>>>     x86/compressed: move startup32_load_idt() into .text section
>>>     x86/compressed: move startup32_load_idt() out of head_64.S
>>>     x86/compressed: move startup32_check_sev_cbit() into .text
>>>     x86/compressed: move startup32_check_sev_cbit() out of head_64.S
>>>     x86/compressed: adhere to calling convention in
>>>       get_sev_encryption_bit()
>>>     x86/compressed: only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
>>>     efi: x86: Make the deprecated EFI handover protocol optional
>>>
>>>    arch/x86/Kconfig                        |  17 +
>>>    arch/x86/boot/compressed/Makefile       |   8 +-
>>>    arch/x86/boot/compressed/efi_mixed.S    | 351 ++++++++++++++++++++
>>>    arch/x86/boot/compressed/efi_thunk_64.S | 195 -----------
>>>    arch/x86/boot/compressed/head_32.S      |   4 -
>>>    arch/x86/boot/compressed/head_64.S      | 303 +----------------
>>>    arch/x86/boot/compressed/mem_encrypt.S  | 152 ++++++++-
>>>    arch/x86/boot/header.S                  |   2 +-
>>>    arch/x86/boot/tools/build.c             |   2 +
>>>    drivers/firmware/efi/libstub/x86-stub.c |   2 +-
>>>    10 files changed, 533 insertions(+), 503 deletions(-)
>>>    create mode 100644 arch/x86/boot/compressed/efi_mixed.S
>>>    delete mode 100644 arch/x86/boot/compressed/efi_thunk_64.S
>>>
  
Tom Lendacky Nov. 22, 2022, 9:50 p.m. UTC | #5
On 11/22/22 15:42, Ard Biesheuvel wrote:
> On Tue, 22 Nov 2022 at 22:37, Ard Biesheuvel <ardb@kernel.org> wrote:
>>
>> On Tue, 22 Nov 2022 at 21:48, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>>>
>>> On 11/22/22 10:10, Ard Biesheuvel wrote:
>>>> After doing some cleanup work on the EFI code in head_64.S, the mixed
>>>> mode code in particular, I noticed that the memory encryption pieces
>>>> could use some attention as well, so I cleaned that up too.
>>>>
>>>> Changes since v2:
>>>> - add some clarifying comments to the EFI mixed mode changes
>>>> - include patch to make the EFI handover protocol optional that was sent
>>>>     out separately before
>>>> - rebase onto tip/master
>>>>
>>>> Changes since v1:
>>>> - at Boris's request, split the patches into smaller ones that are
>>>>     easier to review
>>>>
>>>> Cc: Thomas Gleixner <tglx@linutronix.de>
>>>> Cc: Ingo Molnar <mingo@redhat.com>
>>>> Cc: Borislav Petkov <bp@alien8.de>
>>>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>>>> Cc: Michael Roth <michael.roth@amd.com>
>>>
>>> This causes an SEV guest to blow up on boot in the early boot code. It
>>> looks like the stack pointer is not valid and it triple faults on a pushq
>>> instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
>>> startup_64).
>>>
>>
>> Thanks for the report.
>>
>> So the mystery here (at least to me) is that all the changes are to
>> the 32-bit code, and startup_64 reloads the stack pointer from the
>> symbol
>>
>> Does your config have CONFIG_EFI_MIXED enabled?
>>
>> Can I reproduce this fully emulated with QEMU? Or do I need a SEV host?
>>
> 
> Also, mind giving this a quick spin?

Just saw this after I sent out my email. Yes, this fixes it.

Thanks,
Tom

> 
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c
> b/drivers/firmware/efi/libstub/x86-stub.c
> index cb5f0befee57..1af11d34bc6c 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -23,7 +23,7 @@
> 
>   const efi_system_table_t *efi_system_table;
>   const efi_dxe_services_table_t *efi_dxe_table;
> -u32 image_offset;
> +u32 __section(".data") image_offset;
>   static efi_loaded_image_t *image = NULL;
> 
>   static efi_status_t
> 
> Thanks,
> Ard.
  
Ard Biesheuvel Nov. 22, 2022, 9:51 p.m. UTC | #6
On Tue, 22 Nov 2022 at 22:51, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 11/22/22 15:42, Ard Biesheuvel wrote:
> > On Tue, 22 Nov 2022 at 22:37, Ard Biesheuvel <ardb@kernel.org> wrote:
> >>
> >> On Tue, 22 Nov 2022 at 21:48, Tom Lendacky <thomas.lendacky@amd.com> wrote:
> >>>
> >>> On 11/22/22 10:10, Ard Biesheuvel wrote:
> >>>> After doing some cleanup work on the EFI code in head_64.S, the mixed
> >>>> mode code in particular, I noticed that the memory encryption pieces
> >>>> could use some attention as well, so I cleaned that up too.
> >>>>
> >>>> Changes since v2:
> >>>> - add some clarifying comments to the EFI mixed mode changes
> >>>> - include patch to make the EFI handover protocol optional that was sent
> >>>>     out separately before
> >>>> - rebase onto tip/master
> >>>>
> >>>> Changes since v1:
> >>>> - at Boris's request, split the patches into smaller ones that are
> >>>>     easier to review
> >>>>
> >>>> Cc: Thomas Gleixner <tglx@linutronix.de>
> >>>> Cc: Ingo Molnar <mingo@redhat.com>
> >>>> Cc: Borislav Petkov <bp@alien8.de>
> >>>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> >>>> Cc: Michael Roth <michael.roth@amd.com>
> >>>
> >>> This causes an SEV guest to blow up on boot in the early boot code. It
> >>> looks like the stack pointer is not valid and it triple faults on a pushq
> >>> instruction (pushq $__KERNEL_CS in arch/x86/boot/compressed/head_64.S of
> >>> startup_64).
> >>>
> >>
> >> Thanks for the report.
> >>
> >> So the mystery here (at least to me) is that all the changes are to
> >> the 32-bit code, and startup_64 reloads the stack pointer from the
> >> symbol
> >>
> >> Does your config have CONFIG_EFI_MIXED enabled?
> >>
> >> Can I reproduce this fully emulated with QEMU? Or do I need a SEV host?
> >>
> >
> > Also, mind giving this a quick spin?
>
> Just saw this after I sent out my email. Yes, this fixes it.
>

Excellent, thanks for testing.
  
Borislav Petkov Nov. 22, 2022, 10:20 p.m. UTC | #7
On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
> I bisected it to:
> 
> 99b7f4b23d9f ("x86/boot/compressed, efi: Merge multiple definitions of image_offset into one")
> 
> And doing the following fixed it:
> 
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> index cb5f0befee57..a0bfd31358ba 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -23,7 +23,7 @@
>  const efi_system_table_t *efi_system_table;
>  const efi_dxe_services_table_t *efi_dxe_table;
> -u32 image_offset;
> +u32 image_offset __section(".data");
>  static efi_loaded_image_t *image = NULL;
>  static efi_status_t
> 
> I assume it has to do with being in .data vs .bss and not being explicitly
> cleared with the encryption bit set. With the change to put image_offset in
> the .data section, it is read as zero, where as when it was in the .bss
> section it was reading "ciphertext".

Thank you both. I'll refresh the set and run it here tomorrow too, to
make sure it boots here too.

Thx.
  
Borislav Petkov Nov. 23, 2022, 10:49 a.m. UTC | #8
On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> index cb5f0befee57..a0bfd31358ba 100644
> --- a/drivers/firmware/efi/libstub/x86-stub.c
> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> @@ -23,7 +23,7 @@
>  const efi_system_table_t *efi_system_table;
>  const efi_dxe_services_table_t *efi_dxe_table;
> -u32 image_offset;
> +u32 image_offset __section(".data");
>  static efi_loaded_image_t *image = NULL;
>  static efi_status_t
> 
> I assume it has to do with being in .data vs .bss and not being explicitly
> cleared with the encryption bit set. With the change to put image_offset in
> the .data section, it is read as zero, where as when it was in the .bss
> section it was reading "ciphertext".

Hmm, two points about this:

1. Can we do

u32 image_offset __bss_decrypted;

here instead? We have this special section just for that fun and it
self-documents this way.

2. Also, why does my SEV-ES guest boot just fine without that change?

[    0.000000] Linux version 6.1.0-rc6+ (root@ml) (gcc (Debian 11.3.0-1) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38) #1 SMP PREEMPT_DYNAMIC Wed Nov 23 11:27:17 CET 2022
...
[    0.336132] Memory Encryption Features active: AMD SEV SEV-ES

Thx.
  
Ard Biesheuvel Nov. 23, 2022, 10:52 a.m. UTC | #9
On Wed, 23 Nov 2022 at 11:49, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
> > diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> > index cb5f0befee57..a0bfd31358ba 100644
> > --- a/drivers/firmware/efi/libstub/x86-stub.c
> > +++ b/drivers/firmware/efi/libstub/x86-stub.c
> > @@ -23,7 +23,7 @@
> >  const efi_system_table_t *efi_system_table;
> >  const efi_dxe_services_table_t *efi_dxe_table;
> > -u32 image_offset;
> > +u32 image_offset __section(".data");
> >  static efi_loaded_image_t *image = NULL;
> >  static efi_status_t
> >
> > I assume it has to do with being in .data vs .bss and not being explicitly
> > cleared with the encryption bit set. With the change to put image_offset in
> > the .data section, it is read as zero, where as when it was in the .bss
> > section it was reading "ciphertext".
>
> Hmm, two points about this:
>
> 1. Can we do
>
> u32 image_offset __bss_decrypted;
>
> here instead? We have this special section just for that fun and it
> self-documents this way.
>

The patch moves it from .data to .bss inadvertently, and I am not
convinced Tom's analysis is entirely accurate: we may simply have
garbage in image_offset if we access it before .bss gets cleared.

> 2. Also, why does my SEV-ES guest boot just fine without that change?
>

Indeed, so it needs to be in .data


> [    0.000000] Linux version 6.1.0-rc6+ (root@ml) (gcc (Debian 11.3.0-1) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38) #1 SMP PREEMPT_DYNAMIC Wed Nov 23 11:27:17 CET 2022
> ...
> [    0.336132] Memory Encryption Features active: AMD SEV SEV-ES
>
> Thx.
>
> --
> Regards/Gruss,
>     Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
  
Borislav Petkov Nov. 23, 2022, 11:09 a.m. UTC | #10
On Wed, Nov 23, 2022 at 11:52:32AM +0100, Ard Biesheuvel wrote:
> The patch moves it from .data to .bss inadvertently, and I am not
> convinced Tom's analysis is entirely accurate: we may simply have
> garbage in image_offset if we access it before .bss gets cleared.

That should not be too hard to find out: add an endless loop in asm in
the guest right after the first image_offset access:

1:
	jmp 1b

and then dump its value.

Or Tom might have an even better solution.

But looking at the code, BSS clearing happens later, at .Lrelocated and
the EFI stub comes before it. AFAICT.
  
Ard Biesheuvel Nov. 23, 2022, 11:22 a.m. UTC | #11
On Wed, 23 Nov 2022 at 12:09, Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Nov 23, 2022 at 11:52:32AM +0100, Ard Biesheuvel wrote:
> > The patch moves it from .data to .bss inadvertently, and I am not
> > convinced Tom's analysis is entirely accurate: we may simply have
> > garbage in image_offset if we access it before .bss gets cleared.
>
> That should not be too hard to find out: add an endless loop in asm in
> the guest right after the first image_offset access:
>
> 1:
>         jmp 1b
>
> and then dump its value.
>
> Or Tom might have an even better solution.
>
> But looking at the code, BSS clearing happens later, at .Lrelocated and
> the EFI stub comes before it. AFAICT.
>

Indeed. And moving it back into .data makes the most sense in any case
- the point of the patch is to drop the duplicate definitions from asm
code, not to move it into a different section.

The reason I hadn't spotted this is because my boot chain always sets
the value of image_offset during the boot, and does not rely on the
statically initialized value at all.
  
Tom Lendacky Nov. 23, 2022, 2:13 p.m. UTC | #12
On 11/23/22 04:49, Borislav Petkov wrote:
> On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
>> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
>> index cb5f0befee57..a0bfd31358ba 100644
>> --- a/drivers/firmware/efi/libstub/x86-stub.c
>> +++ b/drivers/firmware/efi/libstub/x86-stub.c
>> @@ -23,7 +23,7 @@
>>   const efi_system_table_t *efi_system_table;
>>   const efi_dxe_services_table_t *efi_dxe_table;
>> -u32 image_offset;
>> +u32 image_offset __section(".data");
>>   static efi_loaded_image_t *image = NULL;
>>   static efi_status_t
>>
>> I assume it has to do with being in .data vs .bss and not being explicitly
>> cleared with the encryption bit set. With the change to put image_offset in
>> the .data section, it is read as zero, where as when it was in the .bss
>> section it was reading "ciphertext".
> 
> Hmm, two points about this:
> 
> 1. Can we do
> 
> u32 image_offset __bss_decrypted;
> 
> here instead? We have this special section just for that fun and it
> self-documents this way.

Yes, but __bss_decrypted is for the main kernel, not the decompression kernel.

The original value was in the .data section of the assembler (before the 
patch moved it), which gets initialized when loaded. Having it in the .bss 
section where you hope that memory was zeroed before hand is the issue.

> 
> 2. Also, why does my SEV-ES guest boot just fine without that change?
> 
> [    0.000000] Linux version 6.1.0-rc6+ (root@ml) (gcc (Debian 11.3.0-1) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38) #1 SMP PREEMPT_DYNAMIC Wed Nov 23 11:27:17 CET 2022
> ...
> [    0.336132] Memory Encryption Features active: AMD SEV SEV-ES

Are you booting directly using the -kernel option on Qemu or going through 
the bootloader. It was only when using Grub that the problem appeared for me.

Thanks,
Tom

> 
> Thx.
>
  
Tom Lendacky Nov. 23, 2022, 2:16 p.m. UTC | #13
On 11/23/22 04:52, Ard Biesheuvel wrote:
> On Wed, 23 Nov 2022 at 11:49, Borislav Petkov <bp@alien8.de> wrote:
>>
>> On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
>>> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
>>> index cb5f0befee57..a0bfd31358ba 100644
>>> --- a/drivers/firmware/efi/libstub/x86-stub.c
>>> +++ b/drivers/firmware/efi/libstub/x86-stub.c
>>> @@ -23,7 +23,7 @@
>>>   const efi_system_table_t *efi_system_table;
>>>   const efi_dxe_services_table_t *efi_dxe_table;
>>> -u32 image_offset;
>>> +u32 image_offset __section(".data");
>>>   static efi_loaded_image_t *image = NULL;
>>>   static efi_status_t
>>>
>>> I assume it has to do with being in .data vs .bss and not being explicitly
>>> cleared with the encryption bit set. With the change to put image_offset in
>>> the .data section, it is read as zero, where as when it was in the .bss
>>> section it was reading "ciphertext".
>>
>> Hmm, two points about this:
>>
>> 1. Can we do
>>
>> u32 image_offset __bss_decrypted;
>>
>> here instead? We have this special section just for that fun and it
>> self-documents this way.
>>
> 
> The patch moves it from .data to .bss inadvertently, and I am not
> convinced Tom's analysis is entirely accurate: we may simply have
> garbage in image_offset if we access it before .bss gets cleared.

When running non-encrypted, I imagine all the memory is cleared to zero as 
part of Qemu allocating it. As soon as you put an SEV guest on top of 
that, host made zeroes will not appear as zeroes to the SEV guest, rather 
they will be decrypted and end up looking like ciphertext (hence the 
random values I kept seeing in image_offset). The SEV guest must 
explicitly clear it, which is why having it in .bss doesn't work for SEV.

Thanks,
Tom

> 
>> 2. Also, why does my SEV-ES guest boot just fine without that change?
>>
> 
> Indeed, so it needs to be in .data
> 
> 
>> [    0.000000] Linux version 6.1.0-rc6+ (root@ml) (gcc (Debian 11.3.0-1) 11.3.0, GNU ld (GNU Binutils for Debian) 2.38) #1 SMP PREEMPT_DYNAMIC Wed Nov 23 11:27:17 CET 2022
>> ...
>> [    0.336132] Memory Encryption Features active: AMD SEV SEV-ES
>>
>> Thx.
>>
>> --
>> Regards/Gruss,
>>      Boris.
>>
>> https://people.kernel.org/tglx/notes-about-netiquette
  
Ard Biesheuvel Nov. 23, 2022, 2:33 p.m. UTC | #14
On Wed, 23 Nov 2022 at 15:16, Tom Lendacky <thomas.lendacky@amd.com> wrote:
>
> On 11/23/22 04:52, Ard Biesheuvel wrote:
> > On Wed, 23 Nov 2022 at 11:49, Borislav Petkov <bp@alien8.de> wrote:
> >>
> >> On Tue, Nov 22, 2022 at 03:49:29PM -0600, Tom Lendacky wrote:
> >>> diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
> >>> index cb5f0befee57..a0bfd31358ba 100644
> >>> --- a/drivers/firmware/efi/libstub/x86-stub.c
> >>> +++ b/drivers/firmware/efi/libstub/x86-stub.c
> >>> @@ -23,7 +23,7 @@
> >>>   const efi_system_table_t *efi_system_table;
> >>>   const efi_dxe_services_table_t *efi_dxe_table;
> >>> -u32 image_offset;
> >>> +u32 image_offset __section(".data");
> >>>   static efi_loaded_image_t *image = NULL;
> >>>   static efi_status_t
> >>>
> >>> I assume it has to do with being in .data vs .bss and not being explicitly
> >>> cleared with the encryption bit set. With the change to put image_offset in
> >>> the .data section, it is read as zero, where as when it was in the .bss
> >>> section it was reading "ciphertext".
> >>
> >> Hmm, two points about this:
> >>
> >> 1. Can we do
> >>
> >> u32 image_offset __bss_decrypted;
> >>
> >> here instead? We have this special section just for that fun and it
> >> self-documents this way.
> >>
> >
> > The patch moves it from .data to .bss inadvertently, and I am not
> > convinced Tom's analysis is entirely accurate: we may simply have
> > garbage in image_offset if we access it before .bss gets cleared.
>
> When running non-encrypted, I imagine all the memory is cleared to zero as
> part of Qemu allocating it. As soon as you put an SEV guest on top of
> that, host made zeroes will not appear as zeroes to the SEV guest, rather
> they will be decrypted and end up looking like ciphertext (hence the
> random values I kept seeing in image_offset). The SEV guest must
> explicitly clear it, which is why having it in .bss doesn't work for SEV.
>

Yes, QEMU will probably clear it, but GRUB, shim, etc do a terrible
job at implementing image loading correctly, i.e., not bothering to
parse the PE/COFF header at all, but just copying the image into
memory and invoke it at a fixed offset of 0x290 bytes into the image
(the famous EFI handover protocol, see the last patch in this series
if you want to know more :-))

This also means that the balance of the file size vs the image size
(where BSS lives) is completely ignored, and we actually have to
relocate the image from the EFI stub in such cases, because we cannot
be sure that the BSS does not overlap with memory that is already in
use, given the top down allocation strategy the EFI usually employs.

In summary, I guess there are multiple way how we might end up in this
situation, and simply putting the variable back into .data where it
came from is probably the best approach here.