[v4,21/22] kasan: use stack_depot_put for Generic mode

Message ID 5cef104d9b842899489b4054fe8d1339a71acee0.1700502145.git.andreyknvl@google.com
State New
Series stackdepot: allow evicting stack traces

Commit Message

andrey.konovalov@linux.dev Nov. 20, 2023, 5:47 p.m. UTC
  From: Andrey Konovalov <andreyknvl@google.com>

Evict alloc/free stack traces from the stack depot for Generic KASAN
once they are evicted from the quarantine.

For auxiliary stack traces, evict the oldest stack trace once a new one
is saved (KASAN only keeps references to the last two).

Also evict all saved stack traces on krealloc.

To avoid double-evicting and mis-evicting stack traces (in case KASAN's
metadata was corrupted), reset KASAN's per-object metadata that stores
stack depot handles when the object is initialized and when it's evicted
from the quarantine.

Note that stack_depot_put is a no-op if the handle is 0.

Reviewed-by: Marco Elver <elver@google.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 mm/kasan/common.c     |  3 ++-
 mm/kasan/generic.c    | 22 ++++++++++++++++++----
 mm/kasan/quarantine.c | 26 ++++++++++++++++++++------
 3 files changed, 40 insertions(+), 11 deletions(-)
  

Comments

Hyeonggon Yoo Nov. 22, 2023, 3:17 a.m. UTC | #1
On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
>
> From: Andrey Konovalov <andreyknvl@google.com>
>
> Evict alloc/free stack traces from the stack depot for Generic KASAN
> once they are evicted from the quarantine.
>
> For auxiliary stack traces, evict the oldest stack trace once a new one
> is saved (KASAN only keeps references to the last two).
>
> Also evict all saved stack traces on krealloc.
>
> To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> metadata was corrupted), reset KASAN's per-object metadata that stores
> stack depot handles when the object is initialized and when it's evicted
> from the quarantine.
>
> Note that stack_depot_put is a no-op if the handle is 0.
>
> Reviewed-by: Marco Elver <elver@google.com>
> Signed-off-by: Andrey Konovalov <andreyknvl@google.com>

I observed boot hangs on a few SLUB configurations.

Having other users of stackdepot might be the cause. After passing
'slub_debug=-' which disables SLUB debugging, it boots fine.

compiler version: gcc-11
config: https://download.kerneltesting.org/builds/2023-11-21-f121f2/.config
bisect log: https://download.kerneltesting.org/builds/2023-11-21-f121f2/bisect.log.txt

[dmesg]
(gdb) lx-dmesg
[    0.000000] Linux version 6.7.0-rc1-00136-g0e8b630f3053 (hyeyoo@localhost.localdomain) (gcc (GCC) 11.3.1 20221121 (R3
[    0.000000] Command line: console=ttyS0 root=/dev/sda1 nokaslr
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.7.0-rc1-00136-g0e8b630f3053 #22
[    0.000000] RIP: 0010:setup_arch+0x500/0x2250
[    0.000000] Code: c6 09 08 00 48 89 c5 48 85 c0 0f 84 58 13 00 00 48 c1 e8 03 48 83 05 be 97 66 00 01 80 3c 18 00 0f3
[    0.000000] RSP: 0000:ffffffff86007e00 EFLAGS: 00010046 ORIG_RAX: 0000000000000009
[    0.000000] RAX: 1fffffffffe40088 RBX: dffffc0000000000 RCX: 1ffffffff11ed630
[    0.000000] RDX: 0000000000000000 RSI: feec4698e8103000 RDI: ffffffff88f6b180
[    0.000000] RBP: ffffffffff200444 R08: 8000000000000163 R09: 1ffffffff11ed628
[    0.000000] R10: ffffffff88f7a150 R11: 0000000000000000 R12: 0000000000000010
[    0.000000] R13: ffffffffff200450 R14: feec4698e8102444 R15: feec4698e8102444
[    0.000000] FS:  0000000000000000(0000) GS:ffffffff88d5b000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.000000] CR2: ffffffffff200444 CR3: 0000000008f0e000 CR4: 00000000000000b0
[    0.000000] Call Trace:
[    0.000000]  <TASK>
[    0.000000]  ? show_regs+0x87/0xa0
[    0.000000]  ? early_fixup_exception+0x130/0x310
[    0.000000]  ? do_early_exception+0x23/0x90
[    0.000000]  ? early_idt_handler_common+0x2f/0x40
[    0.000000]  ? setup_arch+0x500/0x2250
[    0.000000]  ? __pfx_setup_arch+0x10/0x10
[    0.000000]  ? vprintk_default+0x20/0x30
[    0.000000]  ? vprintk+0x4c/0x80
[    0.000000]  ? _printk+0xba/0xf0
[    0.000000]  ? __pfx__printk+0x10/0x10
[    0.000000]  ? init_cgroup_root+0x10f/0x2f0
[    0.000000]  ? cgroup_init_early+0x1e4/0x440
[    0.000000]  ? start_kernel+0xae/0x790
[    0.000000]  ? x86_64_start_reservations+0x28/0x50
[    0.000000]  ? x86_64_start_kernel+0x10e/0x130
[    0.000000]  ? secondary_startup_64_no_verify+0x178/0x17b
[    0.000000]  </TASK>

--
Hyeonggon
  
Hyeonggon Yoo Nov. 22, 2023, 12:37 p.m. UTC | #2
On Wed, Nov 22, 2023 at 12:17 PM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > Evict alloc/free stack traces from the stack depot for Generic KASAN
> > once they are evicted from the quarantine.
> >
> > For auxiliary stack traces, evict the oldest stack trace once a new one
> > is saved (KASAN only keeps references to the last two).
> >
> > Also evict all saved stack traces on krealloc.
> >
> > To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> > metadata was corrupted), reset KASAN's per-object metadata that stores
> > stack depot handles when the object is initialized and when it's evicted
> > from the quarantine.
> >
> > Note that stack_depot_put is a no-op if the handle is 0.
> >
> > Reviewed-by: Marco Elver <elver@google.com>
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
>
> I observed boot hangs on a few SLUB configurations.
>
> Having other users of stackdepot might be the cause. After passing
> 'slub_debug=-' which disables SLUB debugging, it boots fine.

Looks like I forgot to Cc regzbot.
If you need more information, please let me know.

#regzbot introduced: f0ff84b7c3a

Thanks,
Hyeonggon

  
Andrey Konovalov Nov. 22, 2023, 11:13 p.m. UTC | #3
On Wed, Nov 22, 2023 at 4:17 AM Hyeonggon Yoo <42.hyeyoo@gmail.com> wrote:
>
> On Tue, Nov 21, 2023 at 1:08 PM <andrey.konovalov@linux.dev> wrote:
> >
> > From: Andrey Konovalov <andreyknvl@google.com>
> >
> > Evict alloc/free stack traces from the stack depot for Generic KASAN
> > once they are evicted from the quarantine.
> >
> > For auxiliary stack traces, evict the oldest stack trace once a new one
> > is saved (KASAN only keeps references to the last two).
> >
> > Also evict all saved stack traces on krealloc.
> >
> > To avoid double-evicting and mis-evicting stack traces (in case KASAN's
> > metadata was corrupted), reset KASAN's per-object metadata that stores
> > stack depot handles when the object is initialized and when it's evicted
> > from the quarantine.
> >
> > Note that stack_depot_put is a no-op if the handle is 0.
> >
> > Reviewed-by: Marco Elver <elver@google.com>
> > Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
>
> I observed boot hangs on a few SLUB configurations.
>
> Having other users of stackdepot might be the cause. After passing
> 'slub_debug=-' which disables SLUB debugging, it boots fine.

Hi Hyeonggon,

Just mailed a fix.

Thank you for the report!
  

Patch

diff --git a/mm/kasan/common.c b/mm/kasan/common.c
index 825a0240ec02..b5d8bd26fced 100644
--- a/mm/kasan/common.c
+++ b/mm/kasan/common.c
@@ -50,7 +50,8 @@  depot_stack_handle_t kasan_save_stack(gfp_t flags, depot_flags_t depot_flags)
 void kasan_set_track(struct kasan_track *track, gfp_t flags)
 {
 	track->pid = current->pid;
-	track->stack = kasan_save_stack(flags, STACK_DEPOT_FLAG_CAN_ALLOC);
+	track->stack = kasan_save_stack(flags,
+			STACK_DEPOT_FLAG_CAN_ALLOC | STACK_DEPOT_FLAG_GET);
 }
 
 #if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
diff --git a/mm/kasan/generic.c b/mm/kasan/generic.c
index 5d168c9afb32..50cc519e23f4 100644
--- a/mm/kasan/generic.c
+++ b/mm/kasan/generic.c
@@ -449,10 +449,14 @@  struct kasan_free_meta *kasan_get_free_meta(struct kmem_cache *cache,
 void kasan_init_object_meta(struct kmem_cache *cache, const void *object)
 {
 	struct kasan_alloc_meta *alloc_meta;
+	struct kasan_free_meta *free_meta;
 
 	alloc_meta = kasan_get_alloc_meta(cache, object);
 	if (alloc_meta)
 		__memset(alloc_meta, 0, sizeof(*alloc_meta));
+	free_meta = kasan_get_free_meta(cache, object);
+	if (free_meta)
+		__memset(free_meta, 0, sizeof(*free_meta));
 }
 
 size_t kasan_metadata_size(struct kmem_cache *cache, bool in_object)
@@ -489,18 +493,20 @@  static void __kasan_record_aux_stack(void *addr, depot_flags_t depot_flags)
 	if (!alloc_meta)
 		return;
 
+	stack_depot_put(alloc_meta->aux_stack[1]);
 	alloc_meta->aux_stack[1] = alloc_meta->aux_stack[0];
 	alloc_meta->aux_stack[0] = kasan_save_stack(0, depot_flags);
 }
 
 void kasan_record_aux_stack(void *addr)
 {
-	return __kasan_record_aux_stack(addr, STACK_DEPOT_FLAG_CAN_ALLOC);
+	return __kasan_record_aux_stack(addr,
+			STACK_DEPOT_FLAG_CAN_ALLOC | STACK_DEPOT_FLAG_GET);
 }
 
 void kasan_record_aux_stack_noalloc(void *addr)
 {
-	return __kasan_record_aux_stack(addr, 0);
+	return __kasan_record_aux_stack(addr, STACK_DEPOT_FLAG_GET);
 }
 
 void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags)
@@ -508,8 +514,16 @@  void kasan_save_alloc_info(struct kmem_cache *cache, void *object, gfp_t flags)
 	struct kasan_alloc_meta *alloc_meta;
 
 	alloc_meta = kasan_get_alloc_meta(cache, object);
-	if (alloc_meta)
-		kasan_set_track(&alloc_meta->alloc_track, flags);
+	if (!alloc_meta)
+		return;
+
+	/* Evict previous stack traces (might exist for krealloc). */
+	stack_depot_put(alloc_meta->alloc_track.stack);
+	stack_depot_put(alloc_meta->aux_stack[0]);
+	stack_depot_put(alloc_meta->aux_stack[1]);
+	__memset(alloc_meta, 0, sizeof(*alloc_meta));
+
+	kasan_set_track(&alloc_meta->alloc_track, flags);
 }
 
 void kasan_save_free_info(struct kmem_cache *cache, void *object)
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c
index ca4529156735..265ca2bbe2dd 100644
--- a/mm/kasan/quarantine.c
+++ b/mm/kasan/quarantine.c
@@ -143,11 +143,22 @@  static void *qlink_to_object(struct qlist_node *qlink, struct kmem_cache *cache)
 static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 {
 	void *object = qlink_to_object(qlink, cache);
-	struct kasan_free_meta *meta = kasan_get_free_meta(cache, object);
+	struct kasan_alloc_meta *alloc_meta = kasan_get_alloc_meta(cache, object);
+	struct kasan_free_meta *free_meta = kasan_get_free_meta(cache, object);
 	unsigned long flags;
 
-	if (IS_ENABLED(CONFIG_SLAB))
-		local_irq_save(flags);
+	if (alloc_meta) {
+		stack_depot_put(alloc_meta->alloc_track.stack);
+		stack_depot_put(alloc_meta->aux_stack[0]);
+		stack_depot_put(alloc_meta->aux_stack[1]);
+		__memset(alloc_meta, 0, sizeof(*alloc_meta));
+	}
+
+	if (free_meta &&
+	    *(u8 *)kasan_mem_to_shadow(object) == KASAN_SLAB_FREETRACK) {
+		stack_depot_put(free_meta->free_track.stack);
+		free_meta->free_track.stack = 0;
+	}
 
 	/*
 	 * If init_on_free is enabled and KASAN's free metadata is stored in
@@ -157,14 +168,17 @@  static void qlink_free(struct qlist_node *qlink, struct kmem_cache *cache)
 	 */
 	if (slab_want_init_on_free(cache) &&
 	    cache->kasan_info.free_meta_offset == 0)
-		memzero_explicit(meta, sizeof(*meta));
+		memzero_explicit(free_meta, sizeof(*free_meta));
 
 	/*
-	 * As the object now gets freed from the quarantine, assume that its
-	 * free track is no longer valid.
+	 * As the object now gets freed from the quarantine,
+	 * take note that its free track no longer exists.
 	 */
 	*(u8 *)kasan_mem_to_shadow(object) = KASAN_SLAB_FREE;
 
+	if (IS_ENABLED(CONFIG_SLAB))
+		local_irq_save(flags);
+
 	___cache_free(cache, object, _THIS_IP_);
 
 	if (IS_ENABLED(CONFIG_SLAB))