x86/alternatives: Disable KASAN on text_poke_early() in apply_alternatives()

Message ID 20231010053716.2481-1-kirill.shutemov@linux.intel.com
State New
Series x86/alternatives: Disable KASAN on text_poke_early() in apply_alternatives()

Commit Message

Kirill A. Shutemov Oct. 10, 2023, 5:37 a.m. UTC
Fei has reported that KASAN triggers during apply_alternatives() on a
5-level paging machine:

	BUG: KASAN: out-of-bounds in rcu_is_watching (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:444 ./include/linux/context_tracking.h:122 kernel/rcu/tree.c:699)
	Read of size 4 at addr ff110003ee6419a0 by task swapper/0/0

	CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.6.0-rc5 #12
	Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
	Call Trace:
	<TASK>
	dump_stack_lvl (lib/dump_stack.c:107)
	print_report (mm/kasan/report.c:365 mm/kasan/report.c:475)
	? __phys_addr (arch/x86/mm/physaddr.h:7 arch/x86/mm/physaddr.c:28)
	? kasan_addr_to_slab (./include/linux/mm.h:1265 (discriminator 1) mm/kasan/../slab.h:213 (discriminator 1) mm/kasan/common.c:36 (discriminator 1))
	kasan_report (mm/kasan/report.c:590)
	? rcu_is_watching (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:444 ./include/linux/context_tracking.h:122 kernel/rcu/tree.c:699)
	? rcu_is_watching (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:444 ./include/linux/context_tracking.h:122 kernel/rcu/tree.c:699)
	? apply_alternatives (arch/x86/kernel/alternative.c:415 (discriminator 1))
	__asan_load4 (mm/kasan/generic.c:259)
	rcu_is_watching (./arch/x86/include/asm/atomic.h:23 ./include/linux/atomic/atomic-arch-fallback.h:444 ./include/linux/context_tracking.h:122 kernel/rcu/tree.c:699)
	? text_poke_early (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:77 ./arch/x86/include/asm/irqflags.h:135 arch/x86/kernel/alternative.c:1675)
	trace_hardirqs_on (./include/trace/events/preemptirq.h:40 (discriminator 2) ./include/trace/events/preemptirq.h:40 (discriminator 2) kernel/trace/trace_preemptirq.c:56 (discriminator 2))
	? __asan_load4 (./arch/x86/include/asm/cpufeature.h:171 mm/kasan/kasan.h:306 mm/kasan/generic.c:175 mm/kasan/generic.c:259)
	text_poke_early (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:77 ./arch/x86/include/asm/irqflags.h:135 arch/x86/kernel/alternative.c:1675)
	apply_alternatives (arch/x86/kernel/alternative.c:415 (discriminator 1))
	? __asan_load4 (./arch/x86/include/asm/cpufeature.h:171 mm/kasan/kasan.h:306 mm/kasan/generic.c:175 mm/kasan/generic.c:259)
	? __pfx_apply_alternatives (arch/x86/kernel/alternative.c:400)
	? __pfx_apply_returns (arch/x86/kernel/alternative.c:720)
	? __this_cpu_preempt_check (lib/smp_processor_id.c:67)
	? _sub_I_65535_1 (init/main.c:1573)
	? int3_selftest_ip (arch/x86/kernel/alternative.c:1496)
	? __pfx_int3_selftest (arch/x86/kernel/alternative.c:1496)
	? lockdep_hardirqs_on (kernel/locking/lockdep.c:4422)
	? fpu__init_cpu_generic (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:77 ./arch/x86/include/asm/irqflags.h:135 ./arch/x86/include/asm/tlbflush.h:47 arch/x86/kernel/fpu/init.c:30)
	alternative_instructions (arch/x86/kernel/alternative.c:1618)
	arch_cpu_finalize_init (arch/x86/kernel/cpu/common.c:2404)
	start_kernel (init/main.c:1037)
	x86_64_start_reservations (arch/x86/kernel/head64.c:544)
	x86_64_start_kernel (arch/x86/kernel/head64.c:486 (discriminator 5))
	secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)
	</TASK>

	The buggy address belongs to the physical page:
	page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x3ee641
	flags: 0x20000000004000(reserved|node=0|zone=2)
	page_type: 0xffffffff()
	raw: 0020000000004000 ffd400000fb99048 ffd400000fb99048 0000000000000000
	raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
	page dumped because: kasan: bad access detected

	Memory state around the buggy address:
	ff110003ee641880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	ff110003ee641900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	>ff110003ee641980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
                               ^
	ff110003ee641a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	ff110003ee641a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
gets patched. The patch sites include KASAN code, where KASAN_SHADOW_START
depends on __VIRTUAL_MASK_SHIFT, which is defined via cpu_feature_enabled().
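
For context, a paraphrased sketch of the definitions involved (modelled
on arch/x86/include/asm/page_64_types.h; the exact form varies by kernel
version):

	#ifdef CONFIG_X86_5LEVEL
	/*
	 * pgtable_l5_enabled() boils down to
	 * cpu_feature_enabled(X86_FEATURE_LA57), i.e. an
	 * alternatives-patched branch.
	 */
	#define __VIRTUAL_MASK_SHIFT	(pgtable_l5_enabled() ? 56 : 47)
	#else
	#define __VIRTUAL_MASK_SHIFT	47
	#endif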

It seems that KASAN gets confused when apply_alternatives() patches the
KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.

During text_poke_early() in apply_alternatives(), KASAN should be
disabled. KASAN is already disabled in non-_early() text_poke().
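
For comparison, the non-early path already does this; a rough paraphrase
of the relevant lines of __text_poke() in arch/x86/kernel/alternative.c
(setup and teardown elided):

	/* Shadow checks are suppressed around the actual copy. */
	kasan_disable_current();
	func((u8 *)poking_addr + offset_in_page(addr), src, len);
	kasan_enable_current();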

It is unclear why the issue was not reported earlier. Bisecting does not
help. Older kernels trigger the issue less frequently, but it still
occurs. In the absence of any other clear offenders, the initial dynamic
5-level paging support is to blame.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Fei Yang <fei.yang@intel.com>
Fixes: 6657fca06e3f ("x86/mm: Allow to boot without LA57 if CONFIG_X86_5LEVEL=y")
Cc: stable@vger.kernel.org
---
 arch/x86/kernel/alternative.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Borislav Petkov Oct. 10, 2023, 8:19 a.m. UTC | #1
On Tue, Oct 10, 2023 at 08:37:16AM +0300, Kirill A. Shutemov wrote:
> On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
> gets patched. The patch sites include KASAN code, where KASAN_SHADOW_START
> depends on __VIRTUAL_MASK_SHIFT, which is defined via cpu_feature_enabled().

So use boot_cpu_has(X86_FEATURE_LA57).

> It seems that KASAN gets confused when apply_alternatives() patches the

It seems?

> KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
> static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.
> 
> During text_poke_early() in apply_alternatives(), KASAN should be
> disabled. KASAN is already disabled in non-_early() text_poke().
> 
> It is unclear why the issue was not reported earlier. Bisecting does not
> help. Older kernels trigger the issue less frequently, but it still
> occurs. In the absence of any other clear offenders, the initial dynamic
> 5-level paging support is to blame.

This whole thing sounds like it is still not really clear what is
actually happening...
  
Kirill A. Shutemov Oct. 10, 2023, 8:40 a.m. UTC | #2
On Tue, Oct 10, 2023 at 10:19:38AM +0200, Borislav Petkov wrote:
> On Tue, Oct 10, 2023 at 08:37:16AM +0300, Kirill A. Shutemov wrote:
> > On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
> > gets patched. The patch sites include KASAN code, where KASAN_SHADOW_START
> > depends on __VIRTUAL_MASK_SHIFT, which is defined via cpu_feature_enabled().
> 
> So use boot_cpu_has(X86_FEATURE_LA57).

__VIRTUAL_MASK_SHIFT is used in many places. I don't think it is a good
idea to give up on patching completely.

> > It seems that KASAN gets confused when apply_alternatives() patches the
> 
> It seems?

Admittedly, I don't understand KASAN well enough. I confirmed my idea
indirectly, by patching KASAN_SHADOW_START, as I mentioned.

> > KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
> > static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.
> > 
> > During text_poke_early() in apply_alternatives(), KASAN should be
> > disabled. KASAN is already disabled in non-_early() text_poke().
> > 
> > It is unclear why the issue was not reported earlier. Bisecting does not
> > help. Older kernels trigger the issue less frequently, but it still
> > occurs. In the absence of any other clear offenders, the initial dynamic
> > 5-level paging support is to blame.
> 
> This whole thing sounds like it is still not really clear what is
> actually happening...

Maybe KASAN folks can help to understand the situation.
  
Borislav Petkov Oct. 10, 2023, 9:12 a.m. UTC | #3
On Tue, Oct 10, 2023 at 11:40:41AM +0300, Kirill A. Shutemov wrote:
> __VIRTUAL_MASK_SHIFT is used in many places. I don't think it is a good
> idea to give up on patching completely.

Have you even looked at boot_cpu_has()'s asm?
  
Peter Zijlstra Oct. 10, 2023, 10:10 a.m. UTC | #4
On Tue, Oct 10, 2023 at 10:19:38AM +0200, Borislav Petkov wrote:
> On Tue, Oct 10, 2023 at 08:37:16AM +0300, Kirill A. Shutemov wrote:
> > On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
> > gets patched. The patch sites include KASAN code, where KASAN_SHADOW_START
> > depends on __VIRTUAL_MASK_SHIFT, which is defined via cpu_feature_enabled().
> 
> So use boot_cpu_has(X86_FEATURE_LA57).
> 
> > It seems that KASAN gets confused when apply_alternatives() patches the
> 
> It seems?
> 
> > KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
> > static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.
> > 
> > During text_poke_early() in apply_alternatives(), KASAN should be
> > disabled. KASAN is already disabled in non-_early() text_poke().
> > 
> > It is unclear why the issue was not reported earlier. Bisecting does not
> > help. Older kernels trigger the issue less frequently, but it still
> > occurs. In the absence of any other clear offenders, the initial dynamic
> > 5-level paging support is to blame.
> 
> This whole thing sounds like it is still not really clear what is
> actually happening...

Somewhere along the line __asan_loadN() gets tripped; this then ends up
in kasan_check_range() -> check_region_inline() -> addr_has_metadata().

The latter compares the address against
kasan_shadow_to_mem(KASAN_SHADOW_START), which includes, as Kirill says,
__VIRTUAL_MASK_SHIFT.
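
For reference, a paraphrase of those generic-KASAN helpers from
mm/kasan/kasan.h (details vary by kernel version):

	static inline void *kasan_shadow_to_mem(const void *shadow_addr)
	{
		return (void *)(((unsigned long)shadow_addr - KASAN_SHADOW_OFFSET)
			<< KASAN_SHADOW_SCALE_SHIFT);
	}

	static inline bool addr_has_metadata(const void *addr)
	{
		/*
		 * KASAN_SHADOW_START expands __VIRTUAL_MASK_SHIFT, i.e. the
		 * very LA57 alternative being patched.
		 */
		return (kasan_reset_tag(addr) >=
			kasan_shadow_to_mem((void *)KASAN_SHADOW_START));
	}

So a KASAN check that runs while the LA57 sites are half-patched can
compute a bogus shadow boundary.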

Now, obviously you really don't want boot_cpu_has() in
__VIRTUAL_MASK_SHIFT, that would be really bad (Linus recently
complained about how horrible the code-gen is around this already, must
not make it far worse).


Anyway, being half-way through patching X86_FEATURE_LA57, things *are*
inconsistent and I really can't blame things for going sideways.

That said, I don't particularly like the patch; I think it should, at
the very least, cover all of apply_alternatives(), not just
text_poke_early().
  
Peter Zijlstra Oct. 10, 2023, 10:16 a.m. UTC | #5
On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:

> That said, I don't particularly like the patch; I think it should, at
> the very least, cover all of apply_alternatives(), not just
> text_poke_early().

kasan_arch_is_ready() is another option; x86 doesn't currently define
that, but it would allow us to shut KASAN down harder around patching.
Not sure if it's worth the trouble though.
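
For reference: generic KASAN falls back to an always-true
kasan_arch_is_ready() unless the architecture overrides it. A minimal
sketch of what an x86 override could look like follows; the flag is
hypothetical, as x86 defines no such hook today:

	/* Hypothetical: gate KASAN checks on a per-arch readiness flag. */
	extern bool x86_kasan_ready;

	static inline bool kasan_arch_is_ready(void)
	{
		return READ_ONCE(x86_kasan_ready);
	}
	#define kasan_arch_is_ready kasan_arch_is_ready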
  
Kirill A. Shutemov Oct. 10, 2023, 10:24 a.m. UTC | #6
On Tue, Oct 10, 2023 at 11:12:35AM +0200, Borislav Petkov wrote:
> On Tue, Oct 10, 2023 at 11:40:41AM +0300, Kirill A. Shutemov wrote:
> > __VIRTUAL_MASK_SHIFT is used in many places. I don't think it is a good
> > idea to give up on patching completely.
> 
> Have you even looked at boot_cpu_has()'s asm?

Obviously not :/

Okay, as an alternative, the patch below also makes the issue go away.

But I am not sure it is fundamentally better. boot_cpu_has() generates a
call to __asan_load8_noabort(). I think it only works because all KASAN
code has ASAN instrumentation disabled.

diff --git a/arch/x86/include/asm/kasan.h b/arch/x86/include/asm/kasan.h
index de75306b932e..bfe97013abb0 100644
--- a/arch/x86/include/asm/kasan.h
+++ b/arch/x86/include/asm/kasan.h
@@ -12,8 +12,15 @@
  * for kernel really starts from compiler's shadow offset +
  * 'kernel address space start' >> KASAN_SHADOW_SCALE_SHIFT
  */
+
+#ifdef USE_EARLY_PGTABLE_L5
+#define __KASAN_VIRT_SHIFT	(__pgtable_l5_enabled ? 56 : 47)
+#else
+#define __KASAN_VIRT_SHIFT	(boot_cpu_has(X86_FEATURE_LA57) ? 56 : 47)
+#endif
+
 #define KASAN_SHADOW_START      (KASAN_SHADOW_OFFSET + \
-					((-1UL << __VIRTUAL_MASK_SHIFT) >> \
+					((-1UL << __KASAN_VIRT_SHIFT) >> \
 						KASAN_SHADOW_SCALE_SHIFT))
 /*
  * 47 bits for kernel address -> (47 - KASAN_SHADOW_SCALE_SHIFT) bits for shadow
  
Kirill A. Shutemov Oct. 10, 2023, 10:25 a.m. UTC | #7
On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 10, 2023 at 10:19:38AM +0200, Borislav Petkov wrote:
> > On Tue, Oct 10, 2023 at 08:37:16AM +0300, Kirill A. Shutemov wrote:
> > > On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
> > > gets patched. The patch sites include KASAN code, where KASAN_SHADOW_START
> > > depends on __VIRTUAL_MASK_SHIFT, which is defined via cpu_feature_enabled().
> > 
> > So use boot_cpu_has(X86_FEATURE_LA57).
> > 
> > > It seems that KASAN gets confused when apply_alternatives() patches the
> > 
> > It seems?
> > 
> > > KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
> > > static, by replacing __VIRTUAL_MASK_SHIFT with 56, fixes the issue.
> > > 
> > > During text_poke_early() in apply_alternatives(), KASAN should be
> > > disabled. KASAN is already disabled in non-_early() text_poke().
> > > 
> > > It is unclear why the issue was not reported earlier. Bisecting does not
> > > help. Older kernels trigger the issue less frequently, but it still
> > > occurs. In the absence of any other clear offenders, the initial dynamic
> > > 5-level paging support is to blame.
> > 
> > This whole thing sounds like it is still not really clear what is
> > actually happening...
> 
> Somewhere along the line __asan_loadN() gets tripped; this then ends up
> in kasan_check_range() -> check_region_inline() -> addr_has_metadata().
> 
> The latter compares the address against
> kasan_shadow_to_mem(KASAN_SHADOW_START), which includes, as Kirill says,
> __VIRTUAL_MASK_SHIFT.
> 
> Now, obviously you really don't want boot_cpu_has() in
> __VIRTUAL_MASK_SHIFT, that would be really bad (Linus recently
> complained about how horrible the code-gen is around this already, must
> not make it far worse).
> 
> 
> Anyway, being half-way through patching X86_FEATURE_LA57, things *are*
> inconsistent and I really can't blame things for going sideways.
> 
> That said, I don't particularly like the patch; I think it should, at
> the very least, cover all of apply_alternatives(), not just
> text_poke_early().

I can do this, if it is the only stopper.

Do you want it disabled on the caller side or inside apply_alternatives()?
  
Kirill A. Shutemov Oct. 10, 2023, 10:30 a.m. UTC | #8
On Tue, Oct 10, 2023 at 12:16:21PM +0200, Peter Zijlstra wrote:
> On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:
> 
> > That said, I don't particularly like the patch; I think it should, at
> > the very least, cover all of apply_alternatives(), not just
> > text_poke_early().
> 
> kasan_arch_is_ready() is another option; x86 doesn't currently define
> that, but it would allow us to shut KASAN down harder around patching.
> Not sure if it's worth the trouble though.

IIUC, it was intended to delay KASAN usage until it is ready. KASAN is
functional well before apply_alternatives(), and making
kasan_arch_is_ready() temporarily false for patching feels like an abuse
of the hook.
  
Peter Zijlstra Oct. 10, 2023, 11:24 a.m. UTC | #9
On Tue, Oct 10, 2023 at 01:25:37PM +0300, Kirill A. Shutemov wrote:

> > That said, I don't particularly like the patch; I think it should, at
> > the very least, cover all of apply_alternatives(), not just
> > text_poke_early().
> 
> I can do this, if it is the only stopper.
> 
> Do you want it disabled on the caller side or inside apply_alternatives()?

Inside probably, covering the whole for()-loop thingy. Ideally with a
comment explaining how KASAN doesn't like partial LA57 patching.
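
A sketch of what that could look like, assuming the loop keeps its
current shape (this is not the posted patch):

	/*
	 * KASAN_SHADOW_START is defined via __VIRTUAL_MASK_SHIFT, which is
	 * itself patched here via X86_FEATURE_LA57. While the alternatives
	 * are only half-applied, KASAN's shadow calculations are
	 * inconsistent, so suppress its checks for the whole loop.
	 */
	kasan_disable_current();

	for (a = start; a < end; a++) {
		/* ... existing patching loop, ending in text_poke_early() ... */
	}

	kasan_enable_current();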
  
Borislav Petkov Oct. 10, 2023, 1:10 p.m. UTC | #10
On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:
> Now, obviously you really don't want boot_cpu_has() in
> __VIRTUAL_MASK_SHIFT, that would be really bad (Linus recently
> complained about how horrible the code-gen is around this already, must
> not make it far worse).

You mean a MOV (%rip) and a TEST are so horrible there because it is
a mask?

I'd experiment with it when I get a chance...
  
Peter Zijlstra Oct. 10, 2023, 1:54 p.m. UTC | #11
On Tue, Oct 10, 2023 at 03:10:54PM +0200, Borislav Petkov wrote:
> On Tue, Oct 10, 2023 at 12:10:56PM +0200, Peter Zijlstra wrote:
> > Now, obviously you really don't want boot_cpu_has() in
> > __VIRTUAL_MASK_SHIFT, that would be really bad (Linus recently
> > complained about how horrible the code-gen is around this already, must
> > not make it far worse).
> 
> You mean a MOV (%rip) and a TEST are so horrible there because it is
> a mask?
> 
> I'd experiment with it when I get a chance...

That gets you a memory reference and a potential cache miss for what
should have been an immediate :/
  

Patch

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 517ee01503be..56187fd8816e 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -450,7 +450,9 @@  void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 		DUMP_BYTES(ALT, replacement, a->replacementlen, "%px:   rpl_insn: ", replacement);
 		DUMP_BYTES(ALT, insn_buff, insn_buff_sz, "%px: final_insn: ", instr);
 
+		kasan_disable_current();
 		text_poke_early(instr, insn_buff, insn_buff_sz);
+		kasan_enable_current();
 	}
 }