panic: Reenable preemption in WARN slowpath

Message ID 3ec48fde01e4ee6505f77908ba351bad200ae3d1.1694763684.git.lukas@wunner.de
State New
Headers
Series panic: Reenable preemption in WARN slowpath |

Commit Message

Lukas Wunner Sept. 15, 2023, 7:55 a.m. UTC
  Commit 5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during
WARN/BUG") amended warn_slowpath_fmt() to disable preemption until the
WARN splat has been emitted.

However the commit neglected to reenable preemption in the !fmt codepath,
i.e. when a WARN splat is emitted without additional format string.

One consequence is that users may see more splats than intended.  E.g. a
WARN splat emitted in a work item results in at least two extra splats:

  BUG: workqueue leaked lock or atomic
  (emitted by process_one_work())

  BUG: scheduling while atomic
  (emitted by worker_thread() -> schedule())

Ironically the point of the commit was to *avoid* extra splats. ;)

Fix it.

Fixes: 5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v6.1+
Cc: Peter Zijlstra <peterz@infradead.org>
---
The original commit went in through the tip tree, hence submitting to
tip maintainers.  The commit was backported to v6.1-stable (even though
it wasn't tagged for stable), hence this fix needs a stable designation.

 kernel/panic.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Peter Zijlstra Sept. 15, 2023, 9:38 a.m. UTC | #1
On Fri, Sep 15, 2023 at 09:55:39AM +0200, Lukas Wunner wrote:
> Commit 5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during
> WARN/BUG") amended warn_slowpath_fmt() to disable preemption until the
> WARN splat has been emitted.
> 
> However the commit neglected to reenable preemption in the !fmt codepath,
> i.e. when a WARN splat is emitted without additional format string.
> 
> One consequence is that users may see more splats than intended.  E.g. a
> WARN splat emitted in a work item results in at least two extra splats:
> 
>   BUG: workqueue leaked lock or atomic
>   (emitted by process_one_work())
> 
>   BUG: scheduling while atomic
>   (emitted by worker_thread() -> schedule())
> 
> Ironically the point of the commit was to *avoid* extra splats. ;)
> 
> Fix it.
> 
> Fixes: 5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v6.1+
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
> The original commit went in through the tip tree, hence submitting to
> tip maintainers.  The commit was backported to v6.1-stable (even though
> it wasn't tagged for stable), hence this fix needs a stable designation.
> 
>  kernel/panic.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/panic.c b/kernel/panic.c
> index 07239d4..ffa037fa 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -697,6 +697,7 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
>  	if (!fmt) {
>  		__warn(file, line, __builtin_return_address(0), taint,
>  		       NULL, NULL);
> +		warn_rcu_exit(rcu);
>  		return;
>  	}

Urgh, thanks!
  
Ingo Molnar Sept. 15, 2023, 9:45 a.m. UTC | #2
* tip-bot2 for Lukas Wunner <tip-bot2@linutronix.de> wrote:

> The following commit has been merged into the core/urgent branch of tip:
> 
> Commit-ID:     cccd32816506cbac3a4c65d9dff51b3125ef1a03
> Gitweb:        https://git.kernel.org/tip/cccd32816506cbac3a4c65d9dff51b3125ef1a03
> Author:        Lukas Wunner <lukas@wunner.de>
> AuthorDate:    Fri, 15 Sep 2023 09:55:39 +02:00
> Committer:     Ingo Molnar <mingo@kernel.org>
> CommitterDate: Fri, 15 Sep 2023 11:28:08 +02:00
> 
> panic: Reenable preemption in WARN slowpath
> 
> Commit:
> 
>   5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
> 
> amended warn_slowpath_fmt() to disable preemption until the WARN splat
> has been emitted.
> 
> However the commit neglected to reenable preemption in the !fmt codepath,
> i.e. when a WARN splat is emitted without additional format string.
> 
> One consequence is that users may see more splats than intended.  E.g. a
> WARN splat emitted in a work item results in at least two extra splats:
> 
>   BUG: workqueue leaked lock or atomic
>   (emitted by process_one_work())
> 
>   BUG: scheduling while atomic
>   (emitted by worker_thread() -> schedule())
> 
> Ironically the point of the commit was to *avoid* extra splats. ;)
> 
> Fix it.

> diff --git a/kernel/panic.c b/kernel/panic.c
> index 07239d4..ffa037f 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -697,6 +697,7 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
>  	if (!fmt) {
>  		__warn(file, line, __builtin_return_address(0), taint,
>  		       NULL, NULL);
> +		warn_rcu_exit(rcu);
>  		return;

BTW., one more thing we might want to consider here is to re-enable 
preemption in warn_rcu_exit() a bit more gently, without forcing a
pending reschedule, ie. preempt_enable_no_resched() or so?

Thanks,

	Ingo
  
Peter Zijlstra Sept. 15, 2023, 10:35 a.m. UTC | #3
On Fri, Sep 15, 2023 at 11:45:02AM +0200, Ingo Molnar wrote:
> 
> * tip-bot2 for Lukas Wunner <tip-bot2@linutronix.de> wrote:
> 
> > The following commit has been merged into the core/urgent branch of tip:
> > 
> > Commit-ID:     cccd32816506cbac3a4c65d9dff51b3125ef1a03
> > Gitweb:        https://git.kernel.org/tip/cccd32816506cbac3a4c65d9dff51b3125ef1a03
> > Author:        Lukas Wunner <lukas@wunner.de>
> > AuthorDate:    Fri, 15 Sep 2023 09:55:39 +02:00
> > Committer:     Ingo Molnar <mingo@kernel.org>
> > CommitterDate: Fri, 15 Sep 2023 11:28:08 +02:00
> > 
> > panic: Reenable preemption in WARN slowpath
> > 
> > Commit:
> > 
> >   5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
> > 
> > amended warn_slowpath_fmt() to disable preemption until the WARN splat
> > has been emitted.
> > 
> > However the commit neglected to reenable preemption in the !fmt codepath,
> > i.e. when a WARN splat is emitted without additional format string.
> > 
> > One consequence is that users may see more splats than intended.  E.g. a
> > WARN splat emitted in a work item results in at least two extra splats:
> > 
> >   BUG: workqueue leaked lock or atomic
> >   (emitted by process_one_work())
> > 
> >   BUG: scheduling while atomic
> >   (emitted by worker_thread() -> schedule())
> > 
> > Ironically the point of the commit was to *avoid* extra splats. ;)
> > 
> > Fix it.
> 
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index 07239d4..ffa037f 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -697,6 +697,7 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
> >  	if (!fmt) {
> >  		__warn(file, line, __builtin_return_address(0), taint,
> >  		       NULL, NULL);
> > +		warn_rcu_exit(rcu);
> >  		return;
> 
> BTW., one more thing we might want to consider here is to re-enable 
> preemption in warn_rcu_exit() a bit more gently, without forcing a
> pending reschedule, ie. preempt_enable_no_resched() or so?

nah, it's a warn, if that triggers you get to keep the pieces. Also
preempt_enable_no_resched() isn't exported because its a horribly
dangerous function.
  
Ingo Molnar Sept. 15, 2023, 11:13 a.m. UTC | #4
* Peter Zijlstra <peterz@infradead.org> wrote:

> > > panic: Reenable preemption in WARN slowpath
> > > 
> > > Commit:
> > > 
> > >   5a5d7e9badd2 ("cpuidle: lib/bug: Disable rcu_is_watching() during WARN/BUG")
> > > 
> > > amended warn_slowpath_fmt() to disable preemption until the WARN splat
> > > has been emitted.
> > > 
> > > However the commit neglected to reenable preemption in the !fmt codepath,
> > > i.e. when a WARN splat is emitted without additional format string.
> > > 
> > > One consequence is that users may see more splats than intended.  E.g. a
> > > WARN splat emitted in a work item results in at least two extra splats:
> > > 
> > >   BUG: workqueue leaked lock or atomic
> > >   (emitted by process_one_work())
> > > 
> > >   BUG: scheduling while atomic
> > >   (emitted by worker_thread() -> schedule())
> > > 
> > > Ironically the point of the commit was to *avoid* extra splats. ;)
> > > 
> > > Fix it.
> > 
> > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > index 07239d4..ffa037f 100644
> > > --- a/kernel/panic.c
> > > +++ b/kernel/panic.c
> > > @@ -697,6 +697,7 @@ void warn_slowpath_fmt(const char *file, int line, unsigned taint,
> > >  	if (!fmt) {
> > >  		__warn(file, line, __builtin_return_address(0), taint,
> > >  		       NULL, NULL);
> > > +		warn_rcu_exit(rcu);
> > >  		return;
> > 
> > BTW., one more thing we might want to consider here is to re-enable 
> > preemption in warn_rcu_exit() a bit more gently, without forcing a
> > pending reschedule, ie. preempt_enable_no_resched() or so?
> 
> nah, it's a warn, if that triggers you get to keep the pieces.

But but ... my overall point is that since we just WARN()ed, we are facing 
some sort of kernel bug, and scheduling policies are only a secondary 
concern, debuggability & getting the bug fixed is the primary concern.

So the scheduler should switch to a debugging-friendlier behavior:

  'Schedule tasks around as little as possible, to keep the debug output 
   tidy & to keep things working a bit better even if it's all broken 
   already'.

... or so. My suggestion was a small subset of that principle.

> [...] Also preempt_enable_no_resched() isn't exported because its a 
> horribly dangerous function.

Special exception for RCU debugging only, or so - it's a core kernel 
facility after all.

No strong feelings either way though.

Thanks,

	Ingo
  

Patch

diff --git a/kernel/panic.c b/kernel/panic.c
index 07239d4..ffa037fa 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -697,6 +697,7 @@  void warn_slowpath_fmt(const char *file, int line, unsigned taint,
 	if (!fmt) {
 		__warn(file, line, __builtin_return_address(0), taint,
 		       NULL, NULL);
+		warn_rcu_exit(rcu);
 		return;
 	}