[v2,12/33] kmsan: Allow disabling KMSAN checks for the current task
Commit Message
Like for KASAN, it's useful to temporarily disable KMSAN checks around,
e.g., redzone accesses. Introduce kmsan_disable_current() and
kmsan_enable_current(), which are similar to their KASAN counterparts.
Even though it's not strictly necessary, make them reentrant, in order
to match the KASAN behavior. Repurpose the allow_reporting field for
this.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
Documentation/dev-tools/kmsan.rst | 4 ++--
include/linux/kmsan-checks.h | 12 ++++++++++++
include/linux/kmsan_types.h | 2 +-
mm/kmsan/core.c | 2 +-
mm/kmsan/hooks.c | 14 +++++++++++++-
mm/kmsan/report.c | 6 +++---
6 files changed, 32 insertions(+), 8 deletions(-)
Comments
On Tue, Nov 21, 2023 at 11:06 PM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> Like for KASAN, it's useful to temporarily disable KMSAN checks around,
> e.g., redzone accesses. Introduce kmsan_disable_current() and
> kmsan_enable_current(), which are similar to their KASAN counterparts.
Initially we used to have this disablement counter in KMSAN, but
adding it uncontrollably can result in KMSAN not functioning properly.
E.g. forgetting to call kmsan_disable_current() or underflowing the
counter will break reporting.
We'd better put this API in include/linux/kmsan.h to indicate it
should be discouraged.
> Even though it's not strictly necessary, make them reentrant, in order
> to match the KASAN behavior.
Until this becomes strictly necessary, I think we'd better
KMSAN_WARN_ON if the counter is re-entered.
On Mon, 2023-12-11 at 12:50 +0100, Alexander Potapenko wrote:
> On Tue, Nov 21, 2023 at 11:06 PM Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> >
> > Like for KASAN, it's useful to temporarily disable KMSAN checks
> > around,
> > e.g., redzone accesses. Introduce kmsan_disable_current() and
> > kmsan_enable_current(), which are similar to their KASAN
> > counterparts.
>
> Initially we used to have this disablement counter in KMSAN, but
> adding it uncontrollably can result in KMSAN not functioning
> properly.
> E.g. forgetting to call kmsan_disable_current() or underflowing the
> counter will break reporting.
> We'd better put this API in include/linux/kmsan.h to indicate it
> should be discouraged.
>
> > Even though it's not strictly necessary, make them reentrant, in
> > order
> > to match the KASAN behavior.
>
> Until this becomes strictly necessary, I think we'd better
> KMSAN_WARN_ON if the counter is re-entered.
I encountered a case when we are freeing memory from an interrupt
handler:
[ 149.840553] ------------[ cut here ]------------
[ 149.840649] WARNING: CPU: 1 PID: 181 at mm/kmsan/hooks.c:447
kmsan_disable_current+0x2e/0x40
[ 149.840790] Modules linked in:
[ 149.840894] CPU: 1 PID: 181 Comm: (direxec) Tainted: G B W
N 6.7.0-rc5-gd34a4b46f382 #13
[ 149.841003] Hardware name: IBM 3931 A01 704 (KVM/Linux)
[ 149.841094] Krnl PSW : 0404c00180000000 000000000197dbc2
(kmsan_disable_current+0x32/0x40)
[ 149.841276] R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0
PM:0 RI:0 EA:3
[ 149.841420] Krnl GPRS: 0000000000000040 0000000096914100
0000000000001000 0000000000000001
[ 149.841518] 0000036d827daee0 0000000007c97008
0000000080096500 0000000092f4f000
[ 149.841617] 0000036d00000000 0000000000000000
0000000000000040 0000000000000000
[ 149.841712] 0000000092f4efc0 00000001ff710f60
000000000193acba 0000037f0008f710
[ 149.841893] Krnl Code: 000000000197dbb6: eb0018640352 mviy
14436(%r1),0
[ 149.841893] 000000000197dbbc: 07fe bcr
15,%r14
[ 149.841893] #000000000197dbbe: af000000 mc
0,0
[ 149.841893] >000000000197dbc2: a7f4fffa brc
15,000000000197dbb6
[ 149.841893] 000000000197dbc6: 0700 bcr
0,%r0
[ 149.841893] 000000000197dbc8: 0700 bcr
0,%r0
[ 149.841893] 000000000197dbca: 0700 bcr
0,%r0
[ 149.841893] 000000000197dbcc: 0700 bcr
0,%r0
[ 149.842438] Call Trace:
15:37:25 [90/1838]
[ 149.842510] [<000000000197dbc2>] kmsan_disable_current+0x32/0x40
[ 149.842631] ([<000000000193ac14>] slab_pad_check+0x1d4/0xac0)
[ 149.842738] [<0000000001949222>] free_to_partial_list+0x1d72/0x3b80
[ 149.842850] [<0000000001947066>] __slab_free+0xd86/0x11d0
[ 149.842956] [<00000000019111e8>] kmem_cache_free+0x15d8/0x25d0
[ 149.843062] [<0000000000229e3a>] __tlb_remove_table+0x20a/0xa50
[ 149.843174] [<00000000016c7f98>] tlb_remove_table_rcu+0x98/0x120
[ 149.843291] [<000000000083e1c6>] rcu_core+0x15b6/0x54b0
[ 149.843406] [<00000000069c3c0e>] __do_softirq+0xa1e/0x2178
[ 149.843514] [<00000000003467b4>] irq_exit_rcu+0x2c4/0x630
[ 149.843623] [<0000000006949f6e>] do_ext_irq+0x9e/0x120
[ 149.843736] [<00000000069c18d4>] ext_int_handler+0xc4/0xf0
[ 149.843841] [<000000000197e428>] kmsan_get_metadata+0x68/0x280
[ 149.843950] [<000000000197e344>]
kmsan_get_shadow_origin_ptr+0x74/0xf0
[ 149.844071] [<000000000197ba3a>]
__msan_metadata_ptr_for_load_8+0x2a/0x40
[ 149.844192] [<0000000000184e4a>]
unwind_get_return_address+0xda/0x150
[ 149.844313] [<000000000018fd12>] arch_stack_walk+0x172/0x2f0
[ 149.844417] [<00000000008f1af0>] stack_trace_save+0x100/0x160
[ 149.844529] [<000000000197af22>]
kmsan_internal_chain_origin+0x62/0xe0
[ 149.844647] [<000000000197c1f0>] __msan_chain_origin+0xd0/0x160
[ 149.844763] [<00000000068b3ba4>] memchr_inv+0x5b4/0xb20
[ 149.844877] [<000000000193e730>] check_bytes_and_report+0xa0/0xd30
[ 149.844986] [<000000000193b920>] check_object+0x420/0x17d0
[ 149.845092] [<000000000194aa8a>] free_to_partial_list+0x35da/0x3b80
[ 149.845202] [<0000000001947066>] __slab_free+0xd86/0x11d0
[ 149.845308] [<00000000019111e8>] kmem_cache_free+0x15d8/0x25d0
[ 149.845414] [<00000000016bc2fe>] exit_mmap+0x87e/0x1200
[ 149.845524] [<00000000002f315c>] mmput+0x13c/0x5b0
[ 149.845632] [<0000000001b9d634>] exec_mmap+0xc34/0x1230
[ 149.845744] [<0000000001b996c2>] begin_new_exec+0xcf2/0x2520
[ 149.845857] [<0000000001f6a084>] load_elf_binary+0x2364/0x67d0
[ 149.845971] [<0000000001ba5ba4>] bprm_execve+0x25b4/0x4010
[ 149.846083] [<0000000001baa7e6>] do_execveat_common+0x2436/0x2600
[ 149.846200] [<0000000001ba78f8>] __s390x_sys_execve+0x108/0x140
[ 149.846314] [<000000000011b192>] do_syscall+0x4c2/0x690
[ 149.846424] [<0000000006949d78>] __do_syscall+0x98/0xe0
[ 149.846536] [<00000000069c1640>] system_call+0x70/0xa0
[ 149.846638] INFO: lockdep is turned off.
[ 149.846846] Last Breaking-Event-Address:
[ 149.846916] [<000000000197dbb2>] kmsan_disable_current+0x22/0x40
[ 149.847057] irq event stamp: 0
[ 149.847128] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 149.847227] hardirqs last disabled at (0): [<00000000002f8f46>]
copy_process+0x21f6/0x8b20
[ 149.847344] softirqs last enabled at (0): [<00000000002f8f80>]
copy_process+0x2230/0x8b20
[ 149.847461] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 149.847559] ---[ end trace 0000000000000000 ]---
[ 149.865485] =====================================================
Using a counter resolves this issue, but, of course, at the expense of
not reporting valid issues in the interrupt handler.
Unfortunately I don't see another easy way to solve this problem. The
possibilities that come to mind are providing uninstrumented
memchr_inv() or disablement flags for each context, but I'm not sure if
we want to go there, especially since KASAN already has this
limitation.
@@ -338,11 +338,11 @@ Per-task KMSAN state
~~~~~~~~~~~~~~~~~~~~
Every task_struct has an associated KMSAN task state that holds the KMSAN
-context (see above) and a per-task flag disallowing KMSAN reports::
+context (see above) and a per-task counter disallowing KMSAN reports::
struct kmsan_context {
...
- bool allow_reporting;
+ unsigned int depth;
struct kmsan_context_state cstate;
...
}
@@ -72,6 +72,10 @@ void kmsan_copy_to_user(void __user *to, const void *from, size_t to_copy,
*/
void kmsan_memmove_metadata(void *dst, const void *src, size_t n);
+void kmsan_enable_current(void);
+
+void kmsan_disable_current(void);
+
#else
static inline void kmsan_poison_memory(const void *address, size_t size,
@@ -92,6 +96,14 @@ static inline void kmsan_memmove_metadata(void *dst, const void *src, size_t n)
{
}
+static inline void kmsan_enable_current(void)
+{
+}
+
+static inline void kmsan_disable_current(void)
+{
+}
+
#endif
#endif /* _LINUX_KMSAN_CHECKS_H */
@@ -29,7 +29,7 @@ struct kmsan_context_state {
struct kmsan_ctx {
struct kmsan_context_state cstate;
int kmsan_in_runtime;
- bool allow_reporting;
+ unsigned int depth;
};
#endif /* _LINUX_KMSAN_TYPES_H */
@@ -43,7 +43,7 @@ void kmsan_internal_task_create(struct task_struct *task)
struct thread_info *info = current_thread_info();
__memset(ctx, 0, sizeof(*ctx));
- ctx->allow_reporting = true;
+ ctx->depth = 0;
kmsan_internal_unpoison_memory(info, sizeof(*info), false);
}
@@ -44,7 +44,7 @@ void kmsan_task_exit(struct task_struct *task)
if (!kmsan_enabled || kmsan_in_runtime())
return;
- ctx->allow_reporting = false;
+ ctx->depth++;
}
void kmsan_slab_alloc(struct kmem_cache *s, void *object, gfp_t flags)
@@ -434,3 +434,15 @@ void kmsan_check_memory(const void *addr, size_t size)
REASON_ANY);
}
EXPORT_SYMBOL(kmsan_check_memory);
+
+void kmsan_enable_current(void)
+{
+ current->kmsan_ctx.depth--;
+}
+EXPORT_SYMBOL(kmsan_enable_current);
+
+void kmsan_disable_current(void)
+{
+ current->kmsan_ctx.depth++;
+}
+EXPORT_SYMBOL(kmsan_disable_current);
@@ -158,12 +158,12 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,
if (!kmsan_enabled)
return;
- if (!current->kmsan_ctx.allow_reporting)
+ if (current->kmsan_ctx.depth)
return;
if (!origin)
return;
- current->kmsan_ctx.allow_reporting = false;
+ current->kmsan_ctx.depth++;
ua_flags = user_access_save();
raw_spin_lock(&kmsan_report_lock);
pr_err("=====================================================\n");
@@ -216,5 +216,5 @@ void kmsan_report(depot_stack_handle_t origin, void *address, int size,
if (panic_on_kmsan)
panic("kmsan.panic set ...\n");
user_access_restore(ua_flags);
- current->kmsan_ctx.allow_reporting = true;
+ current->kmsan_ctx.depth--;
}