[v5,00/16] x86: Confine early 1:1 mapped startup code

Message ID 20240221113506.2565718-18-ardb+git@google.com

Ard Biesheuvel Feb. 21, 2024, 11:35 a.m. UTC
  From: Ard Biesheuvel <ardb@kernel.org>

This is a follow-up to [0], which implemented rigorous build-time checks
to ensure that any code that is executed during early startup supports
running from the initial 1:1 mapping of memory, which is how the kernel
is entered from the decompressor or the EFI firmware.

Using PIC codegen and introducing new magic sections into generic code
would create a maintenance burden, and more experimentation is needed
there.  One issue with PIC codegen is that it still permits the compiler
to make assumptions about the runtime address of global objects (modulo
runtime relocation), which is incompatible with how the kernel is
entered, i.e., running a fully linked and relocated executable from the
wrong runtime address.

The RIP_REL_REF() macro that was introduced recently [1] is actually
more appropriate for this use case, as it hides the access from the
compiler entirely, and so the compiler can never predict its result.
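
As a rough illustration of the idea behind RIP_REL_REF(), the sketch
below takes an object's address with an explicit RIP-relative LEA in
inline assembly, so the compiler only ever sees an opaque pointer. This
is a user-space approximation and not the kernel's definition: the names
RIP_REL_PTR_SKETCH(), RIP_REL_REF_SKETCH(), demo_counter and
bump_demo_counter() are hypothetical, GNU C extensions are assumed, and
on targets other than x86-64 the macro falls back to a plain address-of.

```c
/* Hypothetical user-space sketch; not the kernel's RIP_REL_REF(). */
#if defined(__x86_64__) && defined(__GNUC__)
/*
 * Take the address with an explicit RIP-relative LEA. The pointer
 * reaches the compiler only as an asm output, so the compiler cannot
 * assume anything about its value.
 */
#define RIP_REL_PTR_SKETCH(var) __extension__ ({			\
	void *p_;							\
	__asm__("leaq %c1(%%rip), %0" : "=r"(p_) : "i"(&(var)));	\
	(__typeof__(&(var)))p_;						\
})
#else
/* Fallback for other targets (assumption): ordinary address-of. */
#define RIP_REL_PTR_SKETCH(var) (&(var))
#endif

/* Access the object through the opaque pointer. */
#define RIP_REL_REF_SKETCH(var) (*RIP_REL_PTR_SKETCH(var))

/* Hypothetical global standing in for an early-boot variable. */
static long demo_counter = 41;

/* Increment the global and return its new value, both via the macro. */
static long bump_demo_counter(void)
{
	RIP_REL_REF_SKETCH(demo_counter) += 1;
	return RIP_REL_REF_SKETCH(demo_counter);
}
```

Because the address is formed relative to the instruction pointer at
run time, the access lands on the object wherever the image happens to
be executing from, which is the property the early 1:1 mapped code
needs.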

To make incremental progress on this, this v5 drops the special
instrumentation for .pi.text and PIC codegen, but retains all the
cleanup work on the startup code to make it more maintainable and more
obviously correct.

In particular, this involves:
- getting rid of early accesses to global objects, either by moving them
  to the stack, deferring the access until later, or dropping the
  globals entirely;
- moving all code that runs early via the 1:1 mapping into .head.text,
  and moving code that does not out of it, so that build time checks can
  be added later to ensure that no inadvertent absolute references were
  emitted into code that does not tolerate them;
- removing fixup_pointer() and occurrences of __pa_symbol(), which rely
  on the compiler emitting absolute references. This is not guaranteed:
  even without -fpic, the compiler might still use RIP-relative
  references in some cases.
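
To illustrate why the fixup_pointer() approach is fragile, here is a
sketch of the kind of arithmetic such a helper performs, modeled with
plain integers. The function name, the macro names and the address
values are hypothetical and chosen for illustration only; they are not
taken from this series.

```c
#include <stdint.h>

/*
 * Hypothetical model of a fixup_pointer()-style helper: translate a
 * link-time address into the address where the object currently sits,
 * given where the image was linked and where it is running from.
 */
static uint64_t fixup_addr_sketch(uint64_t addr, uint64_t link_base,
				  uint64_t load_base)
{
	return addr - link_base + load_base;
}

/* Illustrative values only (assumptions). */
#define LINK_BASE	0xffffffff81000000ULL	/* address the image was linked at */
#define LOAD_BASE	0x0000000001000000ULL	/* address it runs from via the 1:1 mapping */
#define OBJ_OFFSET	0x2000ULL		/* offset of some global within the image */
```

Passing the link-time address LINK_BASE + OBJ_OFFSET yields
LOAD_BASE + OBJ_OFFSET, as intended. If the compiler instead produced
the reference RIP-relatively, the input is already
LOAD_BASE + OBJ_OFFSET, and applying the same adjustment a second time
yields a bogus address.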

Changes since v4 [2]:
- incorporate Boris's tweaked version of patch #1
- split __startup_64() changes into multiple patches, and align more
  closely with the original logic
- fix build for CONFIG_X86_5LEVEL=n
- add comment to clarify that CR4.PSE is always set deliberately
- add separate SME startup change to remove SME/SEV related calls from
  the non-SME/SEV boot path (this is easier to backport, and can go
  further back than the changes we need for SEV guest boot)

Changes since v3:
- dropped half of the patches and added a couple of new ones
- applied feedback from Boris to patches that were retained, mostly
  related to some minor oversights on my part, and to some style issues

[0] https://lkml.kernel.org/r/20240129180502.4069817-21-ardb%2Bgit%40google.com
[1] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/sev&id=1c811d403afd73f0
[2] https://lkml.kernel.org/r/20240213124143.1484862-13-ardb%2Bgit%40google.com

Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dionna Glaze <dionnaglaze@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: llvm@lists.linux.dev

Ard Biesheuvel (16):
  x86/startup_64: Simplify global variable accesses in GDT/IDT
    programming
  x86/startup_64: Use RIP_REL_REF() to assign phys_base
  x86/startup_64: Use RIP_REL_REF() to access early_dynamic_pgts[]
  x86/startup_64: Use RIP_REL_REF() to access __supported_pte_mask
  x86/startup_64: Use RIP_REL_REF() to access early page tables
  x86/startup_64: Use RIP_REL_REF() to access early_top_pgt[]
  x86/startup_64: Simplify CR4 handling in startup code
  x86/startup_64: Defer assignment of 5-level paging global variables
  x86/startup_64: Simplify calculation of initial page table address
  x86/startup_64: Simplify virtual switch on primary boot
  x86/sme: Avoid SME/SVE related checks on non-SME/SVE platforms
  efi/libstub: Add generic support for parsing mem_encrypt=
  x86/boot: Move mem_encrypt= parsing to the decompressor
  x86/sme: Move early SME kernel encryption handling into .head.text
  x86/sev: Move early startup code into .head.text section
  x86/startup_64: Drop global variables keeping track of LA57 state

 arch/x86/boot/compressed/misc.c                |  15 ++
 arch/x86/boot/compressed/misc.h                |   4 -
 arch/x86/boot/compressed/pgtable_64.c          |  12 --
 arch/x86/boot/compressed/sev.c                 |   3 +
 arch/x86/boot/compressed/vmlinux.lds.S         |   1 +
 arch/x86/include/asm/mem_encrypt.h             |   8 +-
 arch/x86/include/asm/pgtable_64_types.h        |  43 ++---
 arch/x86/include/asm/setup.h                   |   2 +-
 arch/x86/include/asm/sev.h                     |  10 +-
 arch/x86/include/uapi/asm/bootparam.h          |   1 +
 arch/x86/kernel/cpu/common.c                   |   2 -
 arch/x86/kernel/head64.c                       | 195 ++++++--------------
 arch/x86/kernel/head_64.S                      |  95 ++++------
 arch/x86/kernel/sev-shared.c                   |  23 +--
 arch/x86/kernel/sev.c                          |  14 +-
 arch/x86/lib/Makefile                          |  13 --
 arch/x86/mm/kasan_init_64.c                    |   3 -
 arch/x86/mm/mem_encrypt_identity.c             |  89 +++------
 drivers/firmware/efi/libstub/efi-stub-helper.c |   8 +
 drivers/firmware/efi/libstub/efistub.h         |   2 +-
 drivers/firmware/efi/libstub/x86-stub.c        |   3 +
 21 files changed, 203 insertions(+), 343 deletions(-)


base-commit: ee8ff8768735edc3e013837c4416f819543ddc17