panic: add option to dump blocked tasks in panic_print
Commit Message
For debugging kernel panics and other bugs, panic_print already has an
option to dump the call stacks of all tasks. On today's large servers
running many containers, there can be thousands of tasks or more, so
this prints a huge number of call stacks and takes a long time
(especially on a serial console, the main use case for panic_print).
In many cases, only the few blocked tasks are key to diagnosing the
panic, so add an option to dump only the call stacks of blocked tasks.
Signed-off-by: Feng Tang <feng.tang@intel.com>
---
Documentation/admin-guide/kernel-parameters.txt | 1 +
Documentation/admin-guide/sysctl/kernel.rst | 1 +
kernel/panic.c | 4 ++++
3 files changed, 6 insertions(+)
Comments
On 02/02/2024 10:20, Feng Tang wrote:
> For debugging kernel panics and other bugs, panic_print already has an
> option to dump the call stacks of all tasks. On today's large servers
> running many containers, there can be thousands of tasks or more, so
> this prints a huge number of call stacks and takes a long time
> (especially on a serial console, the main use case for panic_print).
>
> In many cases, only the few blocked tasks are key to diagnosing the
> panic, so add an option to dump only the call stacks of blocked tasks.
>
> Signed-off-by: Feng Tang <feng.tang@intel.com>
> [...]
Thank you Feng Tang, this is an interesting and useful idea!
I've just tested the patch and it works fine. I also see no code issues
from my side. So, feel free to add:
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Cheers!
---
> Documentation/admin-guide/kernel-parameters.txt | 1 +
> Documentation/admin-guide/sysctl/kernel.rst | 1 +
> kernel/panic.c | 4 ++++
> 3 files changed, 6 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 31b3a25680d0..0f2369e87175 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4182,6 +4182,7 @@
> bit 4: print ftrace buffer
> bit 5: print all printk messages in buffer
> bit 6: print all CPUs backtrace (if available in the arch)
> + bit 7: print tasks in uninterruptible (blocked) state
> *Be aware* that this option may print a _lot_ of lines,
> so there are risks of losing older messages in the log.
> Use this option carefully, maybe worth to setup a
> diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
> index 6584a1f9bfe3..e066a16b35d5 100644
> --- a/Documentation/admin-guide/sysctl/kernel.rst
> +++ b/Documentation/admin-guide/sysctl/kernel.rst
> @@ -850,6 +850,7 @@ bit 3 print locks info if ``CONFIG_LOCKDEP`` is on
> bit 4 print ftrace buffer
> bit 5 print all printk messages in buffer
> bit 6 print all CPUs backtrace (if available in the arch)
> +bit 7 print tasks in uninterruptible (blocked) state
> ===== ============================================
>
> So for example to print tasks and memory info on panic, user can::
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 2807639aab51..aa17ae0897c0 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -73,6 +73,7 @@ EXPORT_SYMBOL_GPL(panic_timeout);
> #define PANIC_PRINT_FTRACE_INFO 0x00000010
> #define PANIC_PRINT_ALL_PRINTK_MSG 0x00000020
> #define PANIC_PRINT_ALL_CPU_BT 0x00000040
> +#define PANIC_PRINT_BLOCKED_TASKS 0x00000080
> unsigned long panic_print;
>
> ATOMIC_NOTIFIER_HEAD(panic_notifier_list);
> @@ -227,6 +228,9 @@ static void panic_print_sys_info(bool console_flush)
>
> if (panic_print & PANIC_PRINT_FTRACE_INFO)
> ftrace_dump(DUMP_ALL);
> +
> + if (panic_print & PANIC_PRINT_BLOCKED_TASKS)
> + show_state_filter(TASK_UNINTERRUPTIBLE);
> }
>
> void check_panic_on_warn(const char *origin)
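
For readers who want to see how the new bit fits into the panic_print mask, here is a minimal user-space sketch. The four PANIC_PRINT_* values are copied from the kernel/panic.c hunk above; the two helper functions are hypothetical names made up for illustration and are not part of the patch or the kernel.

```c
#include <stdbool.h>

/* Bit values as they appear in the kernel/panic.c hunk of this patch. */
#define PANIC_PRINT_FTRACE_INFO		0x00000010
#define PANIC_PRINT_ALL_PRINTK_MSG	0x00000020
#define PANIC_PRINT_ALL_CPU_BT		0x00000040
#define PANIC_PRINT_BLOCKED_TASKS	0x00000080

/*
 * Hypothetical helper (not in the patch): convert a documented
 * "bit N" number into the mask value to write into panic_print.
 */
static unsigned long panic_print_mask_from_bit(unsigned int bit)
{
	return 1UL << bit;
}

/*
 * Hypothetical helper (not in the patch): mirrors the test that the
 * patch adds to panic_print_sys_info(), i.e. whether the blocked
 * tasks dump is requested by the given mask.
 */
static bool panic_print_wants_blocked_tasks(unsigned long panic_print)
{
	return (panic_print & PANIC_PRINT_BLOCKED_TASKS) != 0;
}
```

Under these assumptions, a panic_print value of 0x80 (decimal 128) would request only the blocked tasks dump, and 0xc0 would additionally request the all-CPU backtrace from bit 6.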