x86/realmode: Make stack lock work in trampoline_compat()

Message ID 87h6rujdvl.ffs@tglx
State New
Headers
Series x86/realmode: Make stack lock work in trampoline_compat() |

Commit Message

Thomas Gleixner May 30, 2023, 10:46 a.m. UTC
  The stack locking and stack assignment macro LOAD_REALMODE_ESP fails to
work when invoked from the 64bit trampoline entry point:

trampoline_start64
  trampoline_compat
    LOAD_REALMODE_ESP <- lock

Accessing tr_lock is only possible from 16bit mode. For the compat entry
point this needs to be pa_tr_lock so that the required relocation entry is
generated. Otherwise it locks the non-relocated address which is
aside of being wrong never cleared in secondary_startup_64() causing all
but the first CPU to get stuck on the lock.

Make the macro take an argument lock_pa which defaults to 0 and rename it
to LOCK_AND_LOAD_REALMODE_ESP to make it clear what this is about.

Fixes: f6f1ae9128d2 ("x86/smpboot: Implement a bit spinlock to protect the realmode stack")
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/realmode/rm/trampoline_64.S |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
  

Comments

Kirill A. Shutemov May 30, 2023, 11:12 a.m. UTC | #1
On Tue, May 30, 2023 at 12:46:22PM +0200, Thomas Gleixner wrote:
> The stack locking and stack assignment macro LOAD_REALMODE_ESP fails to
> work when invoked from the 64bit trampoline entry point:
> 
> trampoline_start64
>   trampoline_compat
>     LOAD_REALMODE_ESP <- lock
> 
> Accessing tr_lock is only possible from 16bit mode. For the compat entry
> point this needs to be pa_tr_lock so that the required relocation entry is
> generated. Otherwise it locks the non-relocated address which is
> aside of being wrong never cleared in secondary_startup_64() causing all
> but the first CPU to get stuck on the lock.
> 
> Make the macro take an argument lock_pa which defaults to 0 and rename it
> to LOCK_AND_LOAD_REALMODE_ESP to make it clear what this is about.
> 
> Fixes: f6f1ae9128d2 ("x86/smpboot: Implement a bit spinlock to protect the realmode stack")
> Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
  
Yunhong Jiang June 8, 2023, 11:34 p.m. UTC | #2
On Tue, May 30, 2023 at 12:46:22PM +0200, Thomas Gleixner wrote:
> The stack locking and stack assignment macro LOAD_REALMODE_ESP fails to
> work when invoked from the 64bit trampoline entry point:
> 
> trampoline_start64
>   trampoline_compat
>     LOAD_REALMODE_ESP <- lock
One possibly dumb question and hope get some hints. The LOAD_REALMODE_ESP is
defined under .code16 directive and will be used by 32-bit mode caller also. Is
it ok because the instructions there will be same for both 16-bit and 32-bit? I
checked
https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_16.html#SEC205 and
don't find much information there.
  
Yunhong Jiang June 9, 2023, 12:22 a.m. UTC | #3
On Fri, Jun 09, 2023 at 12:57:46AM +0100, Andrew Cooper wrote:
> On 09/06/2023 12:34 am, Yunhong Jiang wrote:
> > On Tue, May 30, 2023 at 12:46:22PM +0200, Thomas Gleixner wrote:
> >> The stack locking and stack assignment macro LOAD_REALMODE_ESP fails to
> >> work when invoked from the 64bit trampoline entry point:
> >>
> >> trampoline_start64
> >>   trampoline_compat
> >>     LOAD_REALMODE_ESP <- lock
> > One possibly dumb question and hope get some hints.
> 
> There's a phrase.  "The only dumb question is the one not asked".
> 
> If you have this question, there's an excellent chance that someone else
> reading this thread has the same question.
> 
> >  The LOAD_REALMODE_ESP is
> > defined under .code16 directive and will be used by 32-bit mode caller also. Is
> > it ok because the instructions there will be same for both 16-bit and 32-bit? I
> > checked
> > https://ftp.gnu.org/old-gnu/Manuals/gas-2.9.1/html_chapter/as_16.html#SEC205 and
> > don't find much information there.
> 
> The position of the LOAD_REALMODE_ESP .macro itself doesn't matter. 
> It's just some text which gets pasted elsewhere.  Imagine it just the
> same as running the C preprocessor on a file before compiling it.
> 
> As you note, some expansions of the macro are in .code16, and some are
> not.  This does result in different bytes being emitted.  The default
> operands size flips between .code16 and .code32, so there will be some
> 0x66 prefixes in one mode, and not in others.
> 
> The important point is the l suffix on btsl, which forces it to be long
> (32bit) irrespective of the default operand size.
> 
> So yes, it will work, but that's because gas is handling the differing
> encodings automatically based on the default operand size where the
> LOAD_REALMODE_ESP gets expanded.
> 
> Hope this helps,
Thank you for the explaination, it's quite clear now.
> 
> ~Andrew
  
David Laight June 10, 2023, 7:50 p.m. UTC | #4
From: Andrew Cooper
> Sent: 09 June 2023 00:58
> 
...
> The important point is the l suffix on btsl, which forces it to be long
> (32bit) irrespective of the default operand size.

Does that matter at all?
The 'bit' opcodes (I'm sure 'bts' is 'bit test and set') take
a bit offset from the base address.
This accesses the same bit regardless of the operand size.

The one real issue is that a byte operand will only lock the one byte.
This might be problematic if non-bit locked accesses are also used.
Although it would need to be rather obscure use.
(This may be one of them...)

The only other problem is that btsl always does a locked 32bit
access. If the base address is misaligned this is a misaligned
locked access - problematic if it crosses a cache line boundary.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
  

Patch

--- a/arch/x86/realmode/rm/trampoline_64.S
+++ b/arch/x86/realmode/rm/trampoline_64.S
@@ -37,12 +37,16 @@ 
 	.text
 	.code16
 
-.macro LOAD_REALMODE_ESP
+.macro LOCK_AND_LOAD_REALMODE_ESP lock_pa=0
 	/*
 	 * Make sure only one CPU fiddles with the realmode stack
 	 */
 .Llock_rm\@:
+	.if \lock_pa
+        lock btsl       $0, pa_tr_lock
+	.else
         lock btsl       $0, tr_lock
+	.endif
         jnc             2f
         pause
         jmp             .Llock_rm\@
@@ -63,7 +67,7 @@  SYM_CODE_START(trampoline_start)
 	mov	%ax, %es
 	mov	%ax, %ss
 
-	LOAD_REALMODE_ESP
+	LOCK_AND_LOAD_REALMODE_ESP
 
 	call	verify_cpu		# Verify the cpu supports long mode
 	testl   %eax, %eax		# Check for return code
@@ -106,7 +110,7 @@  SYM_CODE_START(sev_es_trampoline_start)
 	mov	%ax, %es
 	mov	%ax, %ss
 
-	LOAD_REALMODE_ESP
+	LOCK_AND_LOAD_REALMODE_ESP
 
 	jmp	.Lswitch_to_protected
 SYM_CODE_END(sev_es_trampoline_start)
@@ -189,7 +193,7 @@  SYM_CODE_START(pa_trampoline_compat)
 	 * In compatibility mode.  Prep ESP and DX for startup_32, then disable
 	 * paging and complete the switch to legacy 32-bit mode.
 	 */
-	LOAD_REALMODE_ESP
+	LOCK_AND_LOAD_REALMODE_ESP lock_pa=1
 	movw	$__KERNEL_DS, %dx
 
 	movl	$(CR0_STATE & ~X86_CR0_PG), %eax