[v3,05/19] x86/startup_64: Simplify CR4 handling in startup code

Message ID 20240129180502.4069817-26-ardb+git@google.com
State New
Series x86: Confine early 1:1 mapped startup code

Commit Message

Ard Biesheuvel Jan. 29, 2024, 6:05 p.m. UTC
  From: Ard Biesheuvel <ardb@kernel.org>

When executing in long mode, the CR4.PAE and CR4.LA57 control bits
cannot be updated, and so they can simply be preserved rather than
reasoning about whether or not they need to be set. CR4.PSE has no effect
in long mode so it can be omitted.

CR4.PGE is used to flush the TLBs, by clearing it if it was set, and
subsequently re-enabling it. So there is no need to set it just to
disable and re-enable it later.
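
As an illustration only, the toggle can be modeled in user-space C. The
sketch below mirrors the btcl/jc loop from the patch; `toggle_pge` and
`struct cr4_log` are hypothetical names, and the log merely stands in for
the writes to CR4, which cannot be performed outside ring 0.

```c
#include <assert.h>
#include <stdint.h>

#define CR4_PGE_BIT 7                       /* architectural position of CR4.PGE */
#define CR4_PGE     (UINT32_C(1) << CR4_PGE_BIT)

/* Hypothetical record of the values "written to CR4", for inspection. */
struct cr4_log {
    uint32_t writes[2];
    int count;
};

/*
 * Model of the loop in the patch:
 *   0: btcl $X86_CR4_PGE_BIT, %ecx ; movq %rcx, %cr4 ; jc 0b
 * btc copies the old bit into the carry flag, then complements the bit.
 */
uint32_t toggle_pge(uint32_t cr4, struct cr4_log *log)
{
    int carry;

    log->count = 0;
    do {
        carry = (cr4 >> CR4_PGE_BIT) & 1;   /* old PGE value, as btc leaves it in CF */
        cr4 ^= CR4_PGE;                     /* complement PGE */
        assert(log->count < 2);             /* the loop runs at most twice */
        log->writes[log->count++] = cr4;    /* stands in for the write to CR4 */
    } while (carry);                        /* repeat only if PGE was set on entry */

    return cr4;
}
```

If PGE is set on entry, the model records two writes (PGE cleared, then
set again). If PGE is clear on entry, it records a single write that sets
PGE. Either way the returned value has PGE set.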

CR4.MCE must be preserved unless the kernel was built without
CONFIG_X86_MCE, in which case it must be cleared.

Reimplement the above logic in a more straightforward way, by defining
a mask of CR4 bits to preserve, and applying that to CR4 at the point
where it needs to be updated anyway.
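
A minimal C sketch of the mask logic, for illustration only. The
`mce_configured` parameter is a hypothetical stand-in for
CONFIG_X86_MCE, and the function names are not from the kernel; the bit
positions are the architectural CR4 ones.

```c
#include <stdint.h>

/* Architectural CR4 bit values. */
#define CR4_PAE  (UINT32_C(1) << 5)
#define CR4_MCE  (UINT32_C(1) << 6)
#define CR4_PGE  (UINT32_C(1) << 7)
#define CR4_LA57 (UINT32_C(1) << 12)

/* Build the mask of CR4 bits to preserve, as the patch does in %edx. */
uint32_t cr4_preserve_mask(int mce_configured)
{
    uint32_t mask = CR4_PAE | CR4_PGE | CR4_LA57;

    /* MCE is preserved only when the kernel will enable #MC support. */
    if (mce_configured)
        mask |= CR4_MCE;

    return mask;
}

/* Apply the mask to the current CR4 value: every other bit is cleared. */
uint32_t cr4_apply_mask(uint32_t cr4, int mce_configured)
{
    return cr4 & cr4_preserve_mask(mce_configured);
}
```

For example, an incoming value of 0x70 (PSE, PAE and MCE set) is reduced
to 0x60 when MCE is configured and to 0x20 when it is not. PSE is dropped
in both cases.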

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 27 ++++++++------------
 1 file changed, 10 insertions(+), 17 deletions(-)
  

Comments

Borislav Petkov Feb. 6, 2024, 6:21 p.m. UTC | #1
On Mon, Jan 29, 2024 at 07:05:08PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
> 
> When executing in long mode, the CR4.PAE and CR4.LA57 control bits
> cannot be updated,

"Long mode requires PAE to be enabled in order to use the 64-bit
page-translation data structures to translate 64-bit virtual addresses
to 52-bit physical addresses."

which is actually already enabled at that point:

cr4            0x20                [ PAE ]

"5-Level paging is enabled by setting CR4[LA57]=1 when EFER[LMA]=1.
CR4[LA57] is ignored when long mode is not active (EFER[LMA]=0)."

and if I had a 5-level guest, it would have LA57 already set too.

So I think you mean "When paging is enabled" as dhansen correctly points
out.

> and so they can simply be preserved rather than reason about whether
> or not they need to be set. CR4.PSE has no effect in long mode so it
> can be omitted.

f4c5ca985012 ("x86_64: Show CR4.PSE on auxiliaries like on BSP")

Please don't forget about git history before doing changes here.

> CR4.PGE is used to flush the TLBs, by clearing it if it was set, and

... to flush TLB entries with the global bit set.

And just like the above commit says, I think the CR4 settings across all
CPUs on the machine should be the same. So we want to keep PSE.
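
To make the consistency concern concrete, here is a rough C sketch that
compares two CR4 values bit by bit. All function names are hypothetical,
the MCE case is left out for brevity, and the "with PSE" variant is only
one assumed way the mask approach could keep PSE; it is not taken from
any posted revision.

```c
#include <stdint.h>

/* Architectural CR4 bit values. */
#define CR4_PSE  (UINT32_C(1) << 4)
#define CR4_PAE  (UINT32_C(1) << 5)
#define CR4_PGE  (UINT32_C(1) << 7)
#define CR4_LA57 (UINT32_C(1) << 12)

/* Bits that differ between the CR4 values of two CPUs. */
uint32_t cr4_mismatch(uint32_t bsp_cr4, uint32_t ap_cr4)
{
    return bsp_cr4 ^ ap_cr4;
}

/* Result of the posted patch: mask applied, PGE ends up set after the toggle. */
uint32_t cr4_as_posted(uint32_t cr4)
{
    return (cr4 & (CR4_PAE | CR4_PGE | CR4_LA57)) | CR4_PGE;
}

/* Hypothetical variant that additionally forces PSE on. */
uint32_t cr4_with_pse(uint32_t cr4)
{
    return cr4_as_posted(cr4) | CR4_PSE;
}
```

Assuming a boot CPU whose CR4 reads 0xb0 (PSE, PAE and PGE) and a
secondary that enters with 0x20 (PAE only), cr4_mismatch() against
cr4_as_posted() reports 0x10, i.e. PSE differs, while the variant that
forces PSE on reports no difference.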

Removing the CONFIG_X86_5LEVEL ifdeffery is nice, OTOH.

Thx.
  
Ard Biesheuvel Feb. 7, 2024, 10:38 a.m. UTC | #2
On Tue, 6 Feb 2024 at 18:21, Borislav Petkov <bp@alien8.de> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:08PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > When executing in long mode, the CR4.PAE and CR4.LA57 control bits
> > cannot be updated,
>
> "Long mode requires PAE to be enabled in order to use the 64-bit
> page-translation data structures to translate 64-bit virtual addresses
> to 52-bit physical addresses."
>
> which is actually already enabled at that point:
>
> cr4            0x20                [ PAE ]
>
> "5-Level paging is enabled by setting CR4[LA57]=1 when EFER[LMA]=1.
> CR4[LA57] is ignored when long mode is not active (EFER[LMA]=0)."
>
> and if I had a 5-level guest, it would have LA57 already set too.
>
> So I think you mean "When paging is enabled" as dhansen correctly points
> out.
>

Ack.

> > and so they can simply be preserved rather than reason about whether
> > or not they need to be set. CR4.PSE has no effect in long mode so it
> > can be omitted.
>
> f4c5ca985012 ("x86_64: Show CR4.PSE on auxiliaries like on BSP")
>
> Please don't forget about git history before doing changes here.
>

My bad - I misunderstood what is going on here.

> > CR4.PGE is used to flush the TLBs, by clearing it if it was set, and
>
> ... to flush TLB entries with the global bit set.
>
> And just like the above commit says, I think the CR4 settings across all
> CPUs on the machine should be the same. So we want to keep PSE.
>
> Removing the CONFIG_X86_5LEVEL ifdeffery is nice, OTOH.
>

Cheers.
  

Patch

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 6d24c2014759..ca46995205d4 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -179,6 +179,12 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 
 1:
 
+	/*
+	 * Define a mask of CR4 bits to preserve. PAE and LA57 cannot be
+	 * modified while paging remains enabled. PGE will be toggled below if
+	 * it is already set.
+	 */
+	movl	$(X86_CR4_PAE | X86_CR4_PGE | X86_CR4_LA57), %edx
 #ifdef CONFIG_X86_MCE
 	/*
 	 * Preserve CR4.MCE if the kernel will enable #MC support.
@@ -187,22 +193,9 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * configured will crash the system regardless of the CR4.MCE value set
 	 * here.
 	 */
-	movq	%cr4, %rcx
-	andl	$X86_CR4_MCE, %ecx
-#else
-	movl	$0, %ecx
+	orl	$X86_CR4_MCE, %edx
 #endif
 
-	/* Enable PAE mode, PSE, PGE and LA57 */
-	orl	$(X86_CR4_PAE | X86_CR4_PSE | X86_CR4_PGE), %ecx
-#ifdef CONFIG_X86_5LEVEL
-	testb	$1, __pgtable_l5_enabled(%rip)
-	jz	1f
-	orl	$X86_CR4_LA57, %ecx
-1:
-#endif
-	movq	%rcx, %cr4
-
 	/*
 	 * Switch to new page-table
 	 *
@@ -218,10 +211,10 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * entries from the identity mapping are flushed.
 	 */
 	movq	%cr4, %rcx
-	movq	%rcx, %rax
-	xorq	$X86_CR4_PGE, %rcx
+	andl	%edx, %ecx
+0:	btcl	$X86_CR4_PGE_BIT, %ecx
 	movq	%rcx, %cr4
-	movq	%rax, %cr4
+	jc	0b
 
 	/* Ensure I am executing from virtual addresses */
 	movq	$1f, %rax