[v7,1/9] x86/startup_64: Simplify CR4 handling in startup code

Message ID 20240227151907.387873-12-ardb+git@google.com
Series x86: Confine early 1:1 mapped startup code

Commit Message

Ard Biesheuvel Feb. 27, 2024, 3:19 p.m. UTC
  From: Ard Biesheuvel <ardb@kernel.org>

When paging is enabled, the CR4.PAE and CR4.LA57 control bits cannot be
changed, and so they can simply be preserved rather than reasoning about
whether or not they need to be set. CR4.MCE should be preserved unless
the kernel was built without CONFIG_X86_MCE, in which case it must be
cleared.

CR4.PSE should be set explicitly, regardless of whether or not it was
set before.

CR4.PGE is set explicitly, and then cleared and set again after
programming CR3 in order to flush TLB entries based on global
translations. This makes the first assignment redundant, and can
therefore be omitted. So clear PGE by omitting it from the preserve
mask, and set it again explicitly after switching to the new page
tables.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/x86/kernel/head_64.S | 30 ++++++++------------
 1 file changed, 12 insertions(+), 18 deletions(-)
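The preserve-mask logic this patch introduces can be modeled in C as a sketch (the bit positions mirror the architectural CR4 layout; the function name and the boolean parameter standing in for CONFIG_X86_MCE are illustrative, not kernel code):

```c
#include <stdint.h>

/* Architectural CR4 bit positions (Intel SDM Vol. 3, "Control Registers"). */
#define X86_CR4_PSE  (1u << 4)
#define X86_CR4_PAE  (1u << 5)
#define X86_CR4_MCE  (1u << 6)
#define X86_CR4_PGE  (1u << 7)
#define X86_CR4_LA57 (1u << 12)

/*
 * Model of the patched assembly: build a mask of bits to preserve
 * (PGE deliberately omitted, so the CR4 write clears it and flushes
 * global translations), then set PSE unconditionally.
 */
static uint32_t early_cr4(uint32_t cr4, int kernel_has_mce)
{
	uint32_t mask = X86_CR4_PAE | X86_CR4_LA57;

	if (kernel_has_mce)
		mask |= X86_CR4_MCE;	/* preserve CR4.MCE only with #MC support */

	cr4 &= mask;			/* PGE and all other bits dropped */
	cr4 |= X86_CR4_PSE;		/* set PSE uniformly on all logical CPUs */
	return cr4;
}
```

Note that PAE and LA57 pass through unchanged, as they cannot differ from the current paging mode anyway.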
  

Comments

Borislav Petkov Feb. 28, 2024, 1:45 p.m. UTC | #1
On Tue, Feb 27, 2024 at 04:19:09PM +0100, Ard Biesheuvel wrote:
> +	/*
> +	 * Create a mask of CR4 bits to preserve. Omit PGE in order to clean
> +	 * global 1:1 translations from the TLBs.

Brian raised this question when exactly global entries get flushed and
I was looking for the exact definition in the SDM, here's what I'll do
ontop:

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 2d8762887c6a..24df91535062 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -186,8 +186,13 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 1:
 
 	/*
-	 * Create a mask of CR4 bits to preserve. Omit PGE in order to clean
+	 * Create a mask of CR4 bits to preserve. Omit PGE in order to flush
 	 * global 1:1 translations from the TLBs.
+	 *
+	 * From the SDM:
+	 * "If CR4.PGE is changing from 0 to 1, there were no global TLB
+	 *  entries before the execution; if CR4.PGE is changing from 1 to 0,
+	 *  there will be no global TLB entries after the execution."
 	 */
 	movl	$(X86_CR4_PAE | X86_CR4_LA57), %edx
 #ifdef CONFIG_X86_MCE
---

And now it is perfectly clear.

Thx.
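The quoted SDM rule boils down to: a MOV to CR4 that changes PGE in either direction leaves no global TLB entries behind. That condition can be stated as a one-line C model (the function is illustrative, not an actual kernel helper):

```c
#include <stdint.h>

#define X86_CR4_PGE (1u << 7)

/*
 * SDM rule in code: if PGE changes 0 -> 1 there were no global entries
 * before the write; if it changes 1 -> 0 there are none after it.
 * Either way, no global TLB entry survives a PGE transition.
 */
static int cr4_write_flushes_globals(uint32_t old_cr4, uint32_t new_cr4)
{
	return ((old_cr4 ^ new_cr4) & X86_CR4_PGE) != 0;
}
```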
  
Ard Biesheuvel Feb. 29, 2024, 10:36 p.m. UTC | #2
On Wed, 28 Feb 2024 at 14:45, Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Feb 27, 2024 at 04:19:09PM +0100, Ard Biesheuvel wrote:
> > +     /*
> > +      * Create a mask of CR4 bits to preserve. Omit PGE in order to clean
> > +      * global 1:1 translations from the TLBs.
>
> Brian raised this question when exactly global entries get flushed and
> I was looking for the exact definition in the SDM, here's what I'll do
> ontop:
>
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 2d8762887c6a..24df91535062 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -186,8 +186,13 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>  1:
>
>         /*
> -        * Create a mask of CR4 bits to preserve. Omit PGE in order to clean
> +        * Create a mask of CR4 bits to preserve. Omit PGE in order to flush
>          * global 1:1 translations from the TLBs.
> +        *
> +        * From the SDM:
> +        * "If CR4.PGE is changing from 0 to 1, there were no global TLB
> +        *  entries before the execution; if CR4.PGE is changing from 1 to 0,
> +        *  there will be no global TLB entries after the execution."
>          */
>         movl    $(X86_CR4_PAE | X86_CR4_LA57), %edx
>  #ifdef CONFIG_X86_MCE
> ---
>
> And now it is perfectly clear.
>

Looks good to me - thanks.
  

Patch

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index d295bf68bf94..1b054585bfd1 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -185,6 +185,11 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	addq	$(init_top_pgt - __START_KERNEL_map), %rax
 1:
 
+	/*
+	 * Create a mask of CR4 bits to preserve. Omit PGE in order to clean
+	 * global 1:1 translations from the TLBs.
+	 */
+	movl	$(X86_CR4_PAE | X86_CR4_LA57), %edx
 #ifdef CONFIG_X86_MCE
 	/*
 	 * Preserve CR4.MCE if the kernel will enable #MC support.
@@ -193,20 +198,13 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	 * configured will crash the system regardless of the CR4.MCE value set
 	 * here.
 	 */
-	movq	%cr4, %rcx
-	andl	$X86_CR4_MCE, %ecx
-#else
-	movl	$0, %ecx
+	orl	$X86_CR4_MCE, %edx
 #endif
+	movq	%cr4, %rcx
+	andl	%edx, %ecx
 
-	/* Enable PAE mode, PSE, PGE and LA57 */
-	orl	$(X86_CR4_PAE | X86_CR4_PSE | X86_CR4_PGE), %ecx
-#ifdef CONFIG_X86_5LEVEL
-	testb	$1, __pgtable_l5_enabled(%rip)
-	jz	1f
-	orl	$X86_CR4_LA57, %ecx
-1:
-#endif
+	/* Even if ignored in long mode, set PSE uniformly on all logical CPUs. */
+	btsl	$X86_CR4_PSE_BIT, %ecx
 	movq	%rcx, %cr4
 
 	/* Setup early boot stage 4-/5-level pagetables. */
@@ -223,14 +221,10 @@  SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
 	movq	%rax, %cr3
 
 	/*
-	 * Do a global TLB flush after the CR3 switch to make sure the TLB
-	 * entries from the identity mapping are flushed.
+	 * Set CR4.PGE to re-enable global translations.
 	 */
-	movq	%cr4, %rcx
-	movq	%rcx, %rax
-	xorq	$X86_CR4_PGE, %rcx
+	btsl	$X86_CR4_PGE_BIT, %ecx
 	movq	%rcx, %cr4
-	movq	%rax, %cr4
 
 	/* Ensure I am executing from virtual addresses */
 	movq	$1f, %rax
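Why the final hunk is safe to simplify can be sketched in C (hypothetical trace helpers, not kernel code): the old three-instruction toggle and the new single bit-set both pass through a PGE=0 state, which is what flushes the global 1:1 translations, and both end with PGE set.

```c
#include <stdint.h>

#define X86_CR4_PGE (1u << 7)

/* Old flow: PGE was set before the CR3 switch, then toggled off and on. */
static uint32_t trace_old(uint32_t cr4, int *saw_pge_clear)
{
	cr4 ^= X86_CR4_PGE;		/* xorq $X86_CR4_PGE: PGE 1 -> 0, flush */
	if (!(cr4 & X86_CR4_PGE))
		*saw_pge_clear = 1;
	cr4 ^= X86_CR4_PGE;		/* restore saved value: PGE 0 -> 1 */
	return cr4;
}

/* New flow: PGE already clear from the preserve mask; set it once. */
static uint32_t trace_new(uint32_t cr4, int *saw_pge_clear)
{
	if (!(cr4 & X86_CR4_PGE))
		*saw_pge_clear = 1;
	cr4 |= X86_CR4_PGE;		/* btsl $X86_CR4_PGE_BIT */
	return cr4;
}
```

Both traces end in the same CR4 state, with one fewer CR4 write (and one fewer implicit TLB flush) in the new flow.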