Message ID | 20230831101026.3122590-1-mark.rutland@arm.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c792:0:b0:3f2:4152:657d with SMTP id b18csp356255vqu; Thu, 31 Aug 2023 09:29:23 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG8ak4MQdjeT5Lem+lTwSz5+SiagvqcqhR2lx2gD6Mm19BWgkhk6GbtmnPPgMJq7xFvdUkY X-Received: by 2002:a05:6808:1446:b0:3a7:2524:6d3b with SMTP id x6-20020a056808144600b003a725246d3bmr20952oiv.12.1693499362645; Thu, 31 Aug 2023 09:29:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693499362; cv=none; d=google.com; s=arc-20160816; b=0BRCFSwgUmheXeG4TSjY/kmqFrgUS5VXndvbGMQK+jDytvOc1R2UIR4j7oj88/mQoN 7vhX5iMI4WQfC8ZhDVfbgfh2mZuXaad8Qow84CEcb1g4E4fdym05N5Uss4ogyduNb9jN v9whJg663ObMPXwU55KwXu4JIsxLPvt9z+AYdf9JuaKgytlz7lbTO4dLPZXBY9K/02WG uE/qkBVgADUNsRoimBATWnC6JbGZzGb2XjBA1POGnqxkwIaNuGH12qOrGzyJcqkaLzHL LM++0N8TmRxoAkVMU6bSZybJdsVBL5o1Pwi6jf4ZOeysbw3h9fr6xvnYYtrprxfKHoso N4ag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=nReMy8kpOaO4Ze+KfORQWcCwmEPy/+hUAi4zl+3MP1I=; fh=xue16Ma4t8k91xmHIQrHTtc23Wg5VDUkzPj6YATSL8A=; b=aECv1CA5KeCI/732XW+74den1OHYFJrVMYtyBVuYKVyUtKZWmXyQZ9pI7F8dX+3Gnn 8x0pL/l9tqnWMjHBuQGGS6ZsCxLBbq3L7rgxvQ3BWj3BovHrCNosnvVo3W+AZ9I3Zfyp /540/Mh+LIzRB6Urzqy6PZkpJLnqjanZ7OJ5MPjoE8G2zKVDdkjbAhjf7SsKyKr0tqDV qdqd0VheBY8IIBYIATmCfUU7PvHORE84iNF78iRjzlmu9R73+Eg2gbI02WwnAUlHG84A I6XR1quTVPAqg5gL7d5sENxap+swQQyKrxUduL1CQ0+6Xpg98oeIXt0abBdPw7VJDERS p9Wg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s73-20020a63774c000000b0056c2f5088besi1506222pgc.318.2023.08.31.09.29.06; Thu, 31 Aug 2023 09:29:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343704AbjHaKKh (ORCPT <rfc822;zouguomin@gmail.com> + 99 others); Thu, 31 Aug 2023 06:10:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58606 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230030AbjHaKKg (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 31 Aug 2023 06:10:36 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 8001BCFA for <linux-kernel@vger.kernel.org>; Thu, 31 Aug 2023 03:10:31 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B8223C15; Thu, 31 Aug 2023 03:11:10 -0700 (PDT) Received: from lakrids.cambridge.arm.com (usa-sjc-imap-foss1.foss.arm.com [10.121.207.14]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 4C0533F64C; Thu, 31 Aug 2023 03:10:30 -0700 (PDT) From: Mark Rutland <mark.rutland@arm.com> To: linux-kernel@vger.kernel.org Cc: dianders@chromium.org, keescook@chromium.org, mark.rutland@arm.com, sumit.garg@linaro.org, swboyd@chromium.org Subject: [PATCH] lkdtm/bugs: add test for panic() with stuck secondary CPUs Date: Thu, 31 Aug 2023 11:10:26 +0100 Message-Id: <20230831101026.3122590-1-mark.rutland@arm.com> X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775762787680250339 X-GMAIL-MSGID: 1775762787680250339 |
Series |
lkdtm/bugs: add test for panic() with stuck secondary CPUs
|
|
Commit Message
Mark Rutland
Aug. 31, 2023, 10:10 a.m. UTC
Upon a panic() the kernel will use either smp_send_stop() or
crash_smp_send_stop() to attempt to stop secondary CPUs via an IPI,
which may or may not be an NMI. Generally it's preferable that this is an
NMI so that CPUs can be stopped in as many situations as possible, but
it's not always possible to provide an NMI, and there are cases where
CPUs may be unable to handle the NMI regardless.
This patch adds a test for panic() where all other CPUs are stuck with
interrupts disabled, which can be used to check whether the kernel
gracefully handles CPUs failing to respond to a stop, and whe NMIs stops
work.
For example, on arm64 *without* an NMI, this results in:
| # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry PANIC_STOP_IRQOFF
| Kernel panic - not syncing: panic stop irqoff test
| CPU: 2 PID: 24 Comm: migration/2 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
| Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
| Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
| Call trace:
| dump_backtrace+0x94/0xec
| show_stack+0x18/0x24
| dump_stack_lvl+0x74/0xc0
| dump_stack+0x18/0x24
| panic+0x358/0x3e8
| lkdtm_PANIC+0x0/0x18
| multi_cpu_stop+0x9c/0x1a0
| cpu_stopper_thread+0x84/0x118
| smpboot_thread_fn+0x224/0x248
| kthread+0x114/0x118
| ret_from_fork+0x10/0x20
| SMP: stopping secondary CPUs
| SMP: failed to stop secondary CPUs 0-3
| Kernel Offset: 0x401cf3490000 from 0xffff800080000000
| PHYS_OFFSET: 0x40000000
| CPU features: 0x00000000,68c167a1,cce6773f
| Memory Limit: none
| ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
On arm64 *with* an NMI, this results in:
| # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT
| lkdtm: Performing direct entry PANIC_STOP_IRQOFF
| Kernel panic - not syncing: panic stop irqoff test
| CPU: 1 PID: 19 Comm: migration/1 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4
| Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
| Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4
| Call trace:
| dump_backtrace+0x94/0xec
| show_stack+0x18/0x24
| dump_stack_lvl+0x74/0xc0
| dump_stack+0x18/0x24
| panic+0x358/0x3e8
| lkdtm_PANIC+0x0/0x18
| multi_cpu_stop+0x9c/0x1a0
| cpu_stopper_thread+0x84/0x118
| smpboot_thread_fn+0x224/0x248
| kthread+0x114/0x118
| ret_from_fork+0x10/0x20
| SMP: stopping secondary CPUs
| Kernel Offset: 0x55a9c0bc0000 from 0xffff800080000000
| PHYS_OFFSET: 0x40000000
| CPU features: 0x00000000,68c167a1,fce6773f
| Memory Limit: none
| ---[ end Kernel panic - not syncing: panic stop irqoff test ]---
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Stephen Boyd <swboyd@chromium.org
Cc: Sumit Garg <sumit.garg@linaro.org>
---
drivers/misc/lkdtm/bugs.c | 29 ++++++++++++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)
I've tested this with the arm64 NMI IPI patches:
https://lore.kernel.org/linux-arm-kernel/20230830191314.1618136-1-dianders@chromium.org/
Specifically, with the patch that uses an NMI for IPI_CPU_STOP and
IPI_CPU_CRASH_STOP:
https://lore.kernel.org/linux-arm-kernel/20230830121115.v12.5.Ifadbfd45b22c52edcb499034dd4783d096343260@changeid/
Mark.
Comments
On Thu, Aug 31, 2023 at 06:15:29PM +0530, Sumit Garg wrote: > Hi Mark, > > Thanks for putting up a test case for this. > > On Thu, 31 Aug 2023 at 15:40, Mark Rutland <mark.rutland@arm.com> wrote: > > > > Upon a panic() the kernel will use either smp_send_stop() or > > crash_smp_send_stop() to attempt to stop secondary CPUs via an IPI, > > which may or may not be an NMI. Generally it's preferable that this is an > > NMI so that CPUs can be stopped in as many situations as possible, but > > it's not always possible to provide an NMI, and there are cases where > > CPUs may be unable to handle the NMI regardless. > > > > This patch adds a test for panic() where all other CPUs are stuck with > > interrupts disabled, which can be used to check whether the kernel > > gracefully handles CPUs failing to respond to a stop, and whe NMIs stops > > s/whe/when/ > > > work. > > > > For example, on arm64 *without* an NMI, this results in: > > > > | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT > > | lkdtm: Performing direct entry PANIC_STOP_IRQOFF > > | Kernel panic - not syncing: panic stop irqoff test > > | CPU: 2 PID: 24 Comm: migration/2 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4 > > | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 > > | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4 > > | Call trace: > > | dump_backtrace+0x94/0xec > > | show_stack+0x18/0x24 > > | dump_stack_lvl+0x74/0xc0 > > | dump_stack+0x18/0x24 > > | panic+0x358/0x3e8 > > | lkdtm_PANIC+0x0/0x18 > > | multi_cpu_stop+0x9c/0x1a0 > > | cpu_stopper_thread+0x84/0x118 > > | smpboot_thread_fn+0x224/0x248 > > | kthread+0x114/0x118 > > | ret_from_fork+0x10/0x20 > > | SMP: stopping secondary CPUs > > | SMP: failed to stop secondary CPUs 0-3 > > | Kernel Offset: 0x401cf3490000 from 0xffff800080000000 > > | PHYS_OFFSET: 0x40000000 > > | CPU features: 0x00000000,68c167a1,cce6773f > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: panic stop irqoff test ]--- > > > > On arm64 *with* an NMI, this results in: > > I suppose a more interesting test scenario to show difference among > NMI stop IPI and regular stop IPI would be: > > - First put any CPU into hard lockup state via: > $ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT > > - And then provoke following from other CPU: > $ echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT I don't follow. IIUC that's only going to test whether a HW watchdog can fire and reset the system? The PANIC_STOP_IRQOFF test has each CPU run panic_stop_irqoff_fn() with IRQs disabled, and if one CPU is stuck in the HARDLOCKUP test, we'll never get all CPUs into panic_stop_irqoff_fn(), and so all CPUs will be stuck with IRQs disabled, spinning. The PANIC_STOP_IRQOFF test itself tests the different between an NMI stop IPI and regular stop IPI, as the results in the commit message shows. Look for the line above that says: | SMP: failed to stop secondary CPUs 0-3 ... which is *not* present in the NMI case (though we don't have an explicit "stoppped all CPUs" message). > > > > | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT > > | lkdtm: Performing direct entry PANIC_STOP_IRQOFF > > | Kernel panic - not syncing: panic stop irqoff test > > | CPU: 1 PID: 19 Comm: migration/1 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4 > > | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 > > | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4 > > | Call trace: > > | dump_backtrace+0x94/0xec > > | show_stack+0x18/0x24 > > | dump_stack_lvl+0x74/0xc0 > > | dump_stack+0x18/0x24 > > | panic+0x358/0x3e8 > > | lkdtm_PANIC+0x0/0x18 > > | multi_cpu_stop+0x9c/0x1a0 > > | cpu_stopper_thread+0x84/0x118 > > | smpboot_thread_fn+0x224/0x248 > > | kthread+0x114/0x118 > > | ret_from_fork+0x10/0x20 > > | SMP: stopping secondary CPUs > > | Kernel Offset: 0x55a9c0bc0000 from 0xffff800080000000 > > | PHYS_OFFSET: 0x40000000 > > | CPU features: 0x00000000,68c167a1,fce6773f > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: panic stop irqoff test ]--- > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > > Cc: Douglas Anderson <dianders@chromium.org> > > Cc: Kees Cook <keescook@chromium.org> > > Cc: Stephen Boyd <swboyd@chromium.org > > Cc: Sumit Garg <sumit.garg@linaro.org> > > --- > > drivers/misc/lkdtm/bugs.c | 29 ++++++++++++++++++++++++++++- > > 1 file changed, 28 insertions(+), 1 deletion(-) > > > > I've tested this with the arm64 NMI IPI patches: > > > > https://lore.kernel.org/linux-arm-kernel/20230830191314.1618136-1-dianders@chromium.org/ > > > > Specifically, with the patch that uses an NMI for IPI_CPU_STOP and > > IPI_CPU_CRASH_STOP: > > > > https://lore.kernel.org/linux-arm-kernel/20230830121115.v12.5.Ifadbfd45b22c52edcb499034dd4783d096343260@changeid/ > > > > Mark. > > > > diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c > > index 3c95600ab2f71..368da8b83cd1c 100644 > > --- a/drivers/misc/lkdtm/bugs.c > > +++ b/drivers/misc/lkdtm/bugs.c > > @@ -6,12 +6,14 @@ > > * test source files. > > */ > > #include "lkdtm.h" > > +#include <linux/cpu.h> > > #include <linux/list.h> > > #include <linux/sched.h> > > #include <linux/sched/signal.h> > > #include <linux/sched/task_stack.h> > > -#include <linux/uaccess.h> > > #include <linux/slab.h> > > +#include <linux/stop_machine.h> > > +#include <linux/uaccess.h> > > > > #if IS_ENABLED(CONFIG_X86_32) && !IS_ENABLED(CONFIG_UML) > > #include <asm/desc.h> > > @@ -73,6 +75,30 @@ static void lkdtm_PANIC(void) > > panic("dumptest"); > > } > > > > +static int panic_stop_irqoff_fn(void *arg) > > +{ > > + atomic_t *v = arg; > > + > > + /* > > + * Trigger the panic after all other CPUs have entered this function, > > + * so that they are guaranteed to have IRQs disabled. > > + */ > > + if (atomic_inc_return(v) == num_online_cpus()) > > + panic("panic stop irqoff test"); > > + > > + for (;;) > > + cpu_relax(); > > +} > > + > > +static void lkdtm_PANIC_STOP_IRQOFF(void) > > +{ > > + atomic_t v = ATOMIC_INIT(0); > > + > > + cpus_read_lock(); > > + stop_machine(panic_stop_irqoff_fn, &v, cpu_online_mask); > > + cpus_read_unlock(); > > stop_machine() does internally use cpus_read_{lock/unlock}(), is there > any need to have them here as well? For some reason I thought that stop_machine() copied the mask prior to calling cpus_read_lock(), and hance it was necessary to avoid the online mask changing, but from taking a look just now that's not the case. I'll drop the cpus_read_lock() .. cpus_read_unlock() here. Thanks, Mark.
On Thu, Aug 31, 2023 at 09:16:31AM -0700, Doug Anderson wrote: > Hi, > > On Thu, Aug 31, 2023 at 3:10 AM Mark Rutland <mark.rutland@arm.com> wrote: > > > > Upon a panic() the kernel will use either smp_send_stop() or > > crash_smp_send_stop() to attempt to stop secondary CPUs via an IPI, > > which may or may not be an NMI. Generally it's preferable that this is an > > NMI so that CPUs can be stopped in as many situations as possible, but > > it's not always possible to provide an NMI, and there are cases where > > CPUs may be unable to handle the NMI regardless. > > > > This patch adds a test for panic() where all other CPUs are stuck with > > interrupts disabled, which can be used to check whether the kernel > > gracefully handles CPUs failing to respond to a stop, and whe NMIs stops > > work. > > > > For example, on arm64 *without* an NMI, this results in: > > > > | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT > > | lkdtm: Performing direct entry PANIC_STOP_IRQOFF > > | Kernel panic - not syncing: panic stop irqoff test > > | CPU: 2 PID: 24 Comm: migration/2 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4 > > | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 > > | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4 > > | Call trace: > > | dump_backtrace+0x94/0xec > > | show_stack+0x18/0x24 > > | dump_stack_lvl+0x74/0xc0 > > | dump_stack+0x18/0x24 > > | panic+0x358/0x3e8 > > | lkdtm_PANIC+0x0/0x18 > > | multi_cpu_stop+0x9c/0x1a0 > > | cpu_stopper_thread+0x84/0x118 > > | smpboot_thread_fn+0x224/0x248 > > | kthread+0x114/0x118 > > | ret_from_fork+0x10/0x20 > > | SMP: stopping secondary CPUs > > | SMP: failed to stop secondary CPUs 0-3 > > | Kernel Offset: 0x401cf3490000 from 0xffff800080000000 > > | PHYS_OFFSET: 0x40000000 > > | CPU features: 0x00000000,68c167a1,cce6773f > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: panic stop irqoff test ]--- > > > > On arm64 *with* an NMI, this results in: > > > > | # echo PANIC_STOP_IRQOFF > /sys/kernel/debug/provoke-crash/DIRECT > > | lkdtm: Performing direct entry PANIC_STOP_IRQOFF > > | Kernel panic - not syncing: panic stop irqoff test > > | CPU: 1 PID: 19 Comm: migration/1 Not tainted 6.5.0-rc3-00077-ge6c782389895-dirty #4 > > | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 > > | Stopper: multi_cpu_stop+0x0/0x1a0 <- stop_machine_cpuslocked+0x158/0x1a4 > > | Call trace: > > | dump_backtrace+0x94/0xec > > | show_stack+0x18/0x24 > > | dump_stack_lvl+0x74/0xc0 > > | dump_stack+0x18/0x24 > > | panic+0x358/0x3e8 > > | lkdtm_PANIC+0x0/0x18 > > | multi_cpu_stop+0x9c/0x1a0 > > | cpu_stopper_thread+0x84/0x118 > > | smpboot_thread_fn+0x224/0x248 > > | kthread+0x114/0x118 > > | ret_from_fork+0x10/0x20 > > | SMP: stopping secondary CPUs > > | Kernel Offset: 0x55a9c0bc0000 from 0xffff800080000000 > > | PHYS_OFFSET: 0x40000000 > > | CPU features: 0x00000000,68c167a1,fce6773f > > | Memory Limit: none > > | ---[ end Kernel panic - not syncing: panic stop irqoff test ]--- > > Since Sumit didn't notice it and I had to use a "diff" tool to help me > spot the difference in your two examples, maybe telegraph for the > reader of the commit message that the important thing is the lack of > "SMP: failed to stop secondary CPUs" in your example. Sure; I've added explicit "note" lines to v2 for that. > > +static void lkdtm_PANIC_STOP_IRQOFF(void) > > +{ > > + atomic_t v = ATOMIC_INIT(0); > > + > > + cpus_read_lock(); > > + stop_machine(panic_stop_irqoff_fn, &v, cpu_online_mask); > > It broke my brain just a little bit that stop_machine() isn't backed > by "IPI_CPU_STOP" since the point of this was to test the patch > modifying how "IPI_CPU_STOP" worked. ;-) Assuming I didn't mess up, I > guess stop_machine() is just used to get all the CPUs running with > interrupts disabled and then the last CPU does a panic. The _panic_ > then tests "IPI_CPU_STOP" / "IPI_CPU_CRASH_STOP". Yup; I had hoped that the comment within panic_stop_irqoff_fn() was sufficient for that: /* * Trigger the panic after all other CPUs have entered this function, * so that they are guaranteed to have IRQs disabled. */ I've made that a bit more explicit now: /* * As stop_machine() disables interrupts, all CPUs within this function * have interrupts disabled and cannot take a regular IPI. * * The last CPU which enters here will trigger a panic, and as all CPUs * cannot take a regular IPI, we'll only be able to stop secondaries if * smp_send_stop() or crash_smp_send_stop() uses an NMI. */ > Probably this is obvious to everyone but me, so no action is needed > unless you agree that it's confusing and can think of a good comment / > commit message addition to help. Hopefully the above is good enough? I agree the stop_machine() naming is also somewhat confusing here, but I can't do much about that. > In any case, after Sumit's feedback is addressed: > > Reviewed-by: Douglas Anderson <dianders@chromium.org> Thanks! Mark.
diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c index 3c95600ab2f71..368da8b83cd1c 100644 --- a/drivers/misc/lkdtm/bugs.c +++ b/drivers/misc/lkdtm/bugs.c @@ -6,12 +6,14 @@ * test source files. */ #include "lkdtm.h" +#include <linux/cpu.h> #include <linux/list.h> #include <linux/sched.h> #include <linux/sched/signal.h> #include <linux/sched/task_stack.h> -#include <linux/uaccess.h> #include <linux/slab.h> +#include <linux/stop_machine.h> +#include <linux/uaccess.h> #if IS_ENABLED(CONFIG_X86_32) && !IS_ENABLED(CONFIG_UML) #include <asm/desc.h> @@ -73,6 +75,30 @@ static void lkdtm_PANIC(void) panic("dumptest"); } +static int panic_stop_irqoff_fn(void *arg) +{ + atomic_t *v = arg; + + /* + * Trigger the panic after all other CPUs have entered this function, + * so that they are guaranteed to have IRQs disabled. + */ + if (atomic_inc_return(v) == num_online_cpus()) + panic("panic stop irqoff test"); + + for (;;) + cpu_relax(); +} + +static void lkdtm_PANIC_STOP_IRQOFF(void) +{ + atomic_t v = ATOMIC_INIT(0); + + cpus_read_lock(); + stop_machine(panic_stop_irqoff_fn, &v, cpu_online_mask); + cpus_read_unlock(); +} + static void lkdtm_BUG(void) { BUG(); @@ -598,6 +624,7 @@ static noinline void lkdtm_CORRUPT_PAC(void) static struct crashtype crashtypes[] = { CRASHTYPE(PANIC), + CRASHTYPE(PANIC_STOP_IRQOFF), CRASHTYPE(BUG), CRASHTYPE(WARNING), CRASHTYPE(WARNING_MESSAGE),