x86/percpu: Return correct variable from current_top_of_stack()

Message ID 20231024142830.3226-1-ubizjak@gmail.com
State New
Series x86/percpu: Return correct variable from current_top_of_stack()

Commit Message

Uros Bizjak Oct. 24, 2023, 2:28 p.m. UTC
current_top_of_stack() should return the variable from the __seg_gs
qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
is enabled.

Fixes: ed2f752e0e0a ("x86/percpu: Introduce const-qualified const_pcpu_hot to micro-optimize code generation")
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
---
 arch/x86/include/asm/processor.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Borislav Petkov Oct. 24, 2023, 3:56 p.m. UTC | #1
On Tue, Oct 24, 2023 at 04:28:14PM +0200, Uros Bizjak wrote:
> current_top_of_stack() should return variable from _seg_gs
> qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
> is enabled.

I presume you're sending those two in order to fix stuff like the splat
below which fires in my guest with latest Linus + latest tip/master
lineup.

Because disabling CONFIG_USE_X86_SEG_SUPPORT fixes it.

I'm wondering whether, this close to the merge window, we should delay
all that new and fancy percpu stuff one more round until it is tested
more widely...

[    1.623994] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    1.627398] HugeTLB: registered 1.00 GiB page size, pre-allocated 0 pages
[    1.627101] BUG: unable to handle page fault for address: 000000000002f0d8
[    1.629645] HugeTLB: 16380 KiB vmemmap can be freed for a 1.00 GiB page
[    1.628158] #PF: supervisor read access in kernel mode
[    1.628161] #PF: error_code(0x0000) - not-present page
[    1.628163] PGD 0 P4D 0 
[    1.628167] Oops: 0000 [#1] PREEMPT SMP
[    1.628171] CPU: 1 PID: 10 Comm: kworker/u32:0 Not tainted 6.6.0-rc7+ #1
[    1.631566] HugeTLB: registered 2.00 MiB page size, pre-allocated 0 pages
[    1.629156] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[    1.632494] HugeTLB: 28 KiB vmemmap can be freed for a 2.00 MiB page
[    1.629990] Workqueue: ftrace_check_wq ftrace_check_work_func
[    1.631041] RIP: 0010:raw_irqentry_exit_cond_resched+0x16/0x50
[    1.631041] Code: 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 65 f7 05 d4 ff ef 7e ff ff ff 7f 75 21 <48> 8b 05 db ff ef 7e 48 29 e0 48 3d ff 3f 00 00 77 19 65 48 8b 05
[    1.631041] RSP: 0018:ffffc9000005bab8 EFLAGS: 00010046
[    1.631041] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000000002f900
[    1.631041] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc9000005bac8
[    1.631041] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000001
[    1.631041] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000
[    1.631041] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    1.631041] FS:  0000000000000000(0000) GS:ffff88807da40000(0000) knlGS:0000000000000000
[    1.631041] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.631041] CR2: 000000000002f0d8 CR3: 0000000002416000 CR4: 00000000003506f0
[    1.631041] Call Trace:
[    1.631041]  <TASK>
[    1.631041]  ? __die+0x31/0x80
[    1.631041]  ? page_fault_oops+0x160/0x440
[    1.631041]  ? exc_page_fault+0x74/0x150
[    1.631041]  ? asm_exc_page_fault+0x26/0x30
[    1.631041]  ? raw_irqentry_exit_cond_resched+0x16/0x50
[    1.631041]  irqentry_exit+0x21/0x60
[    1.631041]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[    1.631041] RIP: 0010:get_symbol_offset+0x26/0x60
[    1.631041] Code: 90 90 90 90 0f 1f 44 00 00 48 89 f8 48 c1 e8 08 8b 04 85 80 4f 0b 82 48 05 88 af f1 81 81 e7 ff 00 00 00 74 25 31 c9 0f b6 10 <84> d2 79 0e 0f b6 70 01 83 e2 7f c1 e6 07 09 f2 ff c2 ff c2 ff c1
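
The first "Code:" line above can be read by hand, and it seems to point at the problem: the faulting instruction at the <48> marker looks like a plain RIP-relative load (48 8b 05 + disp32) with no 0x65 GS segment-override prefix, while the test a few bytes earlier (65 f7 05 ...) does carry one. That would be consistent with a per-CPU variable being read without the GS base, and with CR2 holding a small, offset-like address. The sketch below only checks the prefix byte and extracts the displacement; the byte arrays are copied from the log, and the helper names has_gs_override() and disp32() are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define X86_PREFIX_GS 0x65 /* GS segment-override prefix byte */

/* Copied from the first "Code:" line: the instruction before the 75 21 jump. */
static const uint8_t percpu_insn[] = {
	0x65, 0xf7, 0x05, 0xd4, 0xff, 0xef, 0x7e, 0xff, 0xff, 0xff, 0x7f
};

/* Copied from the same line, starting at the <48> fault marker. */
static const uint8_t faulting_insn[] = {
	0x48, 0x8b, 0x05, 0xdb, 0xff, 0xef, 0x7e
};

/* Hypothetical helper: does the instruction begin with a GS override? */
static int has_gs_override(const uint8_t *insn, size_t len)
{
	assert(len > 0);
	return insn[0] == X86_PREFIX_GS;
}

/* Hypothetical helper: little-endian 32-bit displacement at insn[off]. */
static uint32_t disp32(const uint8_t *insn, size_t len, size_t off)
{
	assert(off + 4 <= len);
	return (uint32_t)insn[off] |
	       (uint32_t)insn[off + 1] << 8 |
	       (uint32_t)insn[off + 2] << 16 |
	       (uint32_t)insn[off + 3] << 24;
}
```

Calling has_gs_override() on both arrays distinguishes the GS-relative test from the unprefixed load, and disp32(faulting_insn, sizeof(faulting_insn), 3) yields the RIP-relative displacement of the load.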
  
Uros Bizjak Oct. 24, 2023, 4:05 p.m. UTC | #2
On Tue, Oct 24, 2023 at 5:56 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Tue, Oct 24, 2023 at 04:28:14PM +0200, Uros Bizjak wrote:
> > current_top_of_stack() should return variable from _seg_gs
> > qualified named address space when CONFIG_USE_X86_SEG_SUPPORT
> > is enabled.
>
> I presume you're sending those two in order to fix stuff like the splat
> below which fires in my guest with latest Linus + latest tip/master
> lineup.

Yes, the first one is the fix, the second one is only tangentially
related to the fix.

> Because disabling CONFIG_USE_X86_SEG_SUPPORT fixes it.
>
> I'm wondering whether, this close to the merge window, we should delay
> all that new and fancy percpu stuff one more round until it is tested
> more widely...

The percpu stuff won't be merged for 6.7, it will have to sit out until 6.8.

Thanks,
Uros.


  

Patch

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index a807025a4dee..4b130d894cb6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -534,7 +534,7 @@  static __always_inline unsigned long current_top_of_stack(void)
 	 *  entry trampoline.
 	 */
 	if (IS_ENABLED(CONFIG_USE_X86_SEG_SUPPORT))
-		return pcpu_hot.top_of_stack;
+		return const_pcpu_hot.top_of_stack;
 
 	return this_cpu_read_stable(pcpu_hot.top_of_stack);
 }
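
Why the one-word change matters can be shown with a small user-space model. This is only a sketch under stated assumptions, not kernel code: an array index stands in for the GS base, a separate zero-filled object stands in for the link-time address of the symbol, and every name (pcpu_hot_model, read_with_segment(), read_without_segment(), NR_CPUS_MODEL) is hypothetical. The GS-qualified read, which is what const_pcpu_hot and this_cpu_read_stable() are meant to provide, reaches the current CPU's copy; a plain access through the unqualified symbol does not.

```c
#include <assert.h>

#define NR_CPUS_MODEL 2 /* assumption: two CPUs, for illustration only */

struct pcpu_hot_model {
	unsigned long top_of_stack;
};

/* Stands in for the symbol's link-time address; never a valid stack. */
static struct pcpu_hot_model link_time_copy;

/* One copy per CPU; the index plays the role of the GS base. */
static struct pcpu_hot_model percpu_area[NR_CPUS_MODEL];
static int current_cpu;

/* Fill each CPU's copy with a distinct, made-up stack top. */
static void model_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		percpu_area[cpu].top_of_stack = 0x1000UL * (cpu + 1);
}

/* Models a GS-relative read: the current CPU's copy is used. */
static unsigned long read_with_segment(void)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS_MODEL);
	return percpu_area[current_cpu].top_of_stack;
}

/* Models a plain read at the link-time address: no per-CPU base added. */
static unsigned long read_without_segment(void)
{
	return link_time_copy.top_of_stack;
}
```

In this model the unqualified read merely returns a stale zero. On the real system the corresponding address was not mapped at all, so the access faulted as in the oops reported in this thread.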