[1/2] x86/kexec: Preserve CR4.MCE during kexec

Message ID 20230213234836.3683-2-kirill.shutemov@linux.intel.com
State New
Series Kexec enabling in TDX guest

Commit Message

Kirill A. Shutemov Feb. 13, 2023, 11:48 p.m. UTC
TDX guests are not allowed to clear CR4.MCE. An attempt to clear it leads
to a #VE.

Preserve the flag during kexec.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/relocate_kernel_64.S | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Edgecombe, Rick P Feb. 16, 2023, 1:49 a.m. UTC | #1
On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> TDX guests are not allowed to clear CR4.MCE. Attempt to clear it
> leads
> to #VE.
> 
> Preserve the flag during kexec.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

I wonder what's going on with the pre-existing switching between eax and
rax in this code for the cr0 and cr4 manipulations. Do you know what
the reason is?

Also, for a simple non-tdx kexec regression test:
Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
  
Kirill A. Shutemov Feb. 16, 2023, 9:43 a.m. UTC | #2
On Thu, Feb 16, 2023 at 01:49:39AM +0000, Edgecombe, Rick P wrote:
> On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> > TDX guests are not allowed to clear CR4.MCE. Attempt to clear it
> > leads
> > to #VE.
> > 
> > Preserve the flag during kexec.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> I wonder whats going on with the pre-existing switching between eax and
> rax in this code for the cr0 and cr4 manipulations. Do you know what
> the reason is?

32-bit ORs and ANDs save one byte per instruction. And there's no 32-bit
MOV to/from control registers in 64-bit mode.
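
The one-byte saving mentioned above can be illustrated with a small sketch. The byte sequences below are the encodings an assembler would typically choose for these instructions (the 64-bit forms differ only by the REX.W prefix, 0x48); the array and function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* andl $0x40, %eax   (X86_CR4_MCE is bit 6): opcode 83 /4 ib */
static const uint8_t and_eax_mce[] = { 0x83, 0xe0, 0x40 };
/* andq $0x40, %rax   same instruction with a REX.W prefix */
static const uint8_t and_rax_mce[] = { 0x48, 0x83, 0xe0, 0x40 };

/* orl $0x1000, %eax  (X86_CR4_LA57 is bit 12): opcode 0d id */
static const uint8_t or_eax_la57[] = { 0x0d, 0x00, 0x10, 0x00, 0x00 };
/* orq $0x1000, %rax  same instruction with a REX.W prefix */
static const uint8_t or_rax_la57[] = { 0x48, 0x0d, 0x00, 0x10, 0x00, 0x00 };

/* Extra bytes the 64-bit AND costs over the 32-bit AND. */
static size_t and_rex_overhead(void)
{
	return sizeof(and_rax_mce) - sizeof(and_eax_mce);
}

/* Extra bytes the 64-bit OR costs over the 32-bit OR. */
static size_t or_rex_overhead(void)
{
	return sizeof(or_rax_la57) - sizeof(or_eax_la57);
}
```

Both helpers return 1. The 32-bit forms are safe here because a write to %eax zero-extends into %rax, so the subsequent 64-bit MOV to the control register sees clean upper bits.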

> 
> Also, for a simple non-tdx kexec regression test:
> Tested-by: Rick Edgecombe <rick.p.edgecombe@intel.com>

Thanks!
  
Edgecombe, Rick P Feb. 16, 2023, 5:33 p.m. UTC | #3
On Thu, 2023-02-16 at 12:43 +0300, Kirill A. Shutemov wrote:
> On Thu, Feb 16, 2023 at 01:49:39AM +0000, Edgecombe, Rick P wrote:
> > On Tue, 2023-02-14 at 02:48 +0300, Kirill A. Shutemov wrote:
> > > TDX guests are not allowed to clear CR4.MCE. Attempt to clear it
> > > leads
> > > to #VE.
> > > 
> > > Preserve the flag during kexec.
> > > 
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > I wonder whats going on with the pre-existing switching between eax
> > and
> > rax in this code for the cr0 and cr4 manipulations. Do you know
> > what
> > the reason is?
> 
> 32-bit ORs and ANDs save one byte per instruction. And there's no
> 32-bit MOV to/from control registers in 64-bit mode.

Oh right, I think I recall now. There is a 64-bit AND in the CR0 piece
here too, which of course is outside of these changes.

But otherwise, it's not clear from the patch what the implications are
of leaving CR4.MCE set for the non-TDX environment. I see in head_64.S
it will clear it during boot if the kernel doesn't support machine
check. So it leaves a little window where CR4.MCE is set where it
wasn't before.

The piece in head_64.S talks about how an #MC will crash the system if
it happens before the machine check stuff is fully set up anyway, so it
doesn't hurt to leave it on. Is that the reasoning for this change as
well? If so, it might help to add a little more about the reasoning in
the commit log.
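
To make the behavioural difference under discussion concrete, here is a minimal C model of the CR4 value the identity-mapped code computes before and after the patch. It is a sketch, not kernel code: the function names are hypothetical, and it assumes that %r13 holds the CR4 value saved earlier in the kexec path, as the "if it was enabled before" comment in the diff suggests.

```c
#include <stdint.h>

/* CR4 bit definitions (architectural bit positions). */
#define X86_CR4_PAE  (UINT64_C(1) << 5)   /* physical address extension */
#define X86_CR4_MCE  (UINT64_C(1) << 6)   /* machine-check enable */
#define X86_CR4_LA57 (UINT64_C(1) << 12)  /* 5-level paging */

/* Before the patch: start from PAE only, so MCE is always dropped. */
static uint64_t kexec_cr4_before_patch(uint64_t saved_cr4)
{
	uint64_t cr4 = X86_CR4_PAE;		/* movl $X86_CR4_PAE, %eax */

	if (saved_cr4 & X86_CR4_LA57)		/* testq $X86_CR4_LA57, %r13 */
		cr4 |= X86_CR4_LA57;		/* orl $X86_CR4_LA57, %eax */
	return cr4;
}

/* After the patch: keep MCE if it is currently set, then add PAE. */
static uint64_t kexec_cr4_after_patch(uint64_t live_cr4, uint64_t saved_cr4)
{
	/* movq %cr4, %rax; andl $X86_CR4_MCE, %eax (zero-extends into %rax) */
	uint64_t cr4 = (uint32_t)live_cr4 & X86_CR4_MCE;

	cr4 |= X86_CR4_PAE;			/* orl $X86_CR4_PAE, %eax */
	if (saved_cr4 & X86_CR4_LA57)
		cr4 |= X86_CR4_LA57;
	return cr4;
}
```

With MCE set in the live CR4, the old sequence yields PAE (plus LA57 if applicable) while the new one yields PAE | MCE. With MCE clear, the two produce the same value, so the only change for non-TDX systems is the window in which MCE stays set.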
  

Patch

diff --git a/arch/x86/kernel/relocate_kernel_64.S b/arch/x86/kernel/relocate_kernel_64.S
index 4a73351f87f8..18f19dcc40e9 100644
--- a/arch/x86/kernel/relocate_kernel_64.S
+++ b/arch/x86/kernel/relocate_kernel_64.S
@@ -145,8 +145,12 @@  SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
 	 * Set cr4 to a known state:
 	 *  - physical address extension enabled
 	 *  - 5-level paging, if it was enabled before
+	 *  - Preserve MCE, if it was set. Clearing MCE may fault in some
+	 *    environments.
 	 */
-	movl	$X86_CR4_PAE, %eax
+	movq	%cr4, %rax
+	andl	$X86_CR4_MCE, %eax
+	orl	$X86_CR4_PAE, %eax
 	testq	$X86_CR4_LA57, %r13
 	jz	1f
 	orl	$X86_CR4_LA57, %eax