[RFC] hrtimer: Use printk_deferred_once for hrtimer_interrupt message

Message ID 20240222051253.1361002-1-jstultz@google.com
State New

Commit Message

John Stultz Feb. 22, 2024, 5:12 a.m. UTC
With qemu, I constantly see lockdep warnings after the
hrtimer_interrupt message is printed:

[   43.434557] hrtimer: interrupt took 6517564 ns
[   43.435000]
[   43.435000] =============================
[   43.435000] [ BUG: Invalid wait context ]
[   43.435000] 6.8.0-rc5-00002-g28763ef29a5b #3743 Not tainted
[   43.435000] -----------------------------
[   43.435000] lock_torture_wr/605 is trying to lock:
[   43.435000] ffffffffbdcdc6f8 (&port_lock_key){-...}-{3:3}, at: serial8250_console_write+0xdd/0x710
[   43.435000] other info that might help us debug this:
[   43.435000] context-{2:2}
[   43.435000] 4 locks held by lock_torture_wr/605:
[   43.435000]  #0: ffffffffbd6f1de8 (torture_mutex_init#4){+.+.}-{4:4}, at: torture_mutex_nested_lock+0x4b/0x70
[   43.435000]  #1: ffffffffbb557260 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0xd3/0x330
[   43.435000]  #2: ffffffffbb5572d0 (console_srcu){....}-{0:0}, at: console_flush_all+0xd6/0x6b0
[   43.435000]  #3: ffffffffbb396e20 (console_owner){-...}-{0:0}, at: console_flush_all+0x2a0/0x6b0
[   43.435000] stack backtrace:
[   43.435000] CPU: 36 PID: 605 Comm: lock_torture_wr Not tainted 6.8.0-rc5-00002-g28763ef29a5b #3743
[   43.435000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   43.435000] Call Trace:
[   43.435000]  <IRQ>
[   43.435000]  dump_stack_lvl+0x57/0x90
[   43.435000]  __lock_acquire+0xd07/0x3260
[   43.435000]  ? __pfx___lock_acquire+0x10/0x10
[   43.435000]  ? memchr+0x1e/0x50
[   43.435000]  lock_acquire+0x159/0x3b0
[   43.435000]  ? serial8250_console_write+0xdd/0x710
[   43.435000]  ? __pfx_lock_acquire+0x10/0x10
[   43.435000]  ? __pfx___lock_acquire+0x10/0x10
[   43.435000]  _raw_spin_lock_irqsave+0x42/0x60
[   43.435000]  ? serial8250_console_write+0xdd/0x710
[   43.435000]  serial8250_console_write+0xdd/0x710
[   43.435000]  ? __pfx_serial8250_console_write+0x10/0x10
[   43.435000]  ? __pfx_lock_release+0x10/0x10
[   43.435000]  ? do_raw_spin_lock+0x104/0x180
[   43.435000]  ? __pfx_do_raw_spin_lock+0x10/0x10
[   43.435000]  ? console_flush_all+0x2a0/0x6b0
[   43.435000]  console_flush_all+0x2ea/0x6b0
[   43.435000]  ? console_flush_all+0x2a0/0x6b0
[   43.435000]  ? __pfx_console_flush_all+0x10/0x10
[   43.435000]  ? __pfx_lock_acquire+0x10/0x10
[   43.435000]  console_unlock+0x9d/0x150
[   43.435000]  ? __pfx_console_unlock+0x10/0x10
[   43.435000]  ? vprintk_emit+0xd3/0x330
[   43.435000]  ? __down_trylock_console_sem+0x62/0xa0
[   43.435000]  ? vprintk_emit+0xd3/0x330
[   43.435000]  vprintk_emit+0xdc/0x330
[   43.435000]  _printk+0x92/0xb0
[   43.435000]  ? __pfx__printk+0x10/0x10
[   43.435000]  ? hrtimer_interrupt+0x2f0/0x360
[   43.439262]  __sysvec_apic_timer_interrupt+0xb8/0x290
[   43.439345]  sysvec_apic_timer_interrupt+0x8a/0xb0
[   43.439345]  </IRQ>
[   43.439345]  <TASK>
[   43.439345]  asm_sysvec_apic_timer_interrupt+0x16/0x20

I thought the new printk work was going to resolve this, but
apparently not, so to avoid trying to printk in this problematic
context, let's use printk_deferred_once() instead.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@android.com
Signed-off-by: John Stultz <jstultz@google.com>
---
 kernel/time/hrtimer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Thomas Gleixner Feb. 22, 2024, 3:17 p.m. UTC | #1
On Wed, Feb 21 2024 at 21:12, John Stultz wrote:

Cc+ John. Keeping context intact,

> With qemu, I constantly see lockdep warnings after the
> hrtimer_interrupt message is printed:
>
> [   43.434557] hrtimer: interrupt took 6517564 ns
> [   43.435000]
> [   43.435000] =============================
> [   43.435000] [ BUG: Invalid wait context ]

Do you have PROVE_RAW_LOCK_NESTING enabled?

> [   43.435000] 6.8.0-rc5-00002-g28763ef29a5b #3743 Not tainted
> [   43.435000] -----------------------------
> [   43.435000] lock_torture_wr/605 is trying to lock:
> [   43.435000] ffffffffbdcdc6f8 (&port_lock_key){-...}-{3:3}, at: serial8250_console_write+0xdd/0x710
> [   43.435000] other info that might help us debug this:
> [   43.435000] context-{2:2}
> [   43.435000] 4 locks held by lock_torture_wr/605:
> [   43.435000]  #0: ffffffffbd6f1de8 (torture_mutex_init#4){+.+.}-{4:4}, at: torture_mutex_nested_lock+0x4b/0x70
> [   43.435000]  #1: ffffffffbb557260 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0xd3/0x330
> [   43.435000]  #2: ffffffffbb5572d0 (console_srcu){....}-{0:0}, at: console_flush_all+0xd6/0x6b0
> [   43.435000]  #3: ffffffffbb396e20 (console_owner){-...}-{0:0}, at: console_flush_all+0x2a0/0x6b0
> [   43.435000] stack backtrace:
> [   43.435000] CPU: 36 PID: 605 Comm: lock_torture_wr Not tainted 6.8.0-rc5-00002-g28763ef29a5b #3743
> [   43.435000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [   43.435000] Call Trace:
> [   43.435000]  <IRQ>
> [   43.435000]  dump_stack_lvl+0x57/0x90
> [   43.435000]  __lock_acquire+0xd07/0x3260
> [   43.435000]  ? __pfx___lock_acquire+0x10/0x10
> [   43.435000]  ? memchr+0x1e/0x50
> [   43.435000]  lock_acquire+0x159/0x3b0
> [   43.435000]  ? serial8250_console_write+0xdd/0x710
> [   43.435000]  ? __pfx_lock_acquire+0x10/0x10
> [   43.435000]  ? __pfx___lock_acquire+0x10/0x10
> [   43.435000]  _raw_spin_lock_irqsave+0x42/0x60
> [   43.435000]  ? serial8250_console_write+0xdd/0x710
> [   43.435000]  serial8250_console_write+0xdd/0x710
> [   43.435000]  ? __pfx_serial8250_console_write+0x10/0x10
> [   43.435000]  ? __pfx_lock_release+0x10/0x10
> [   43.435000]  ? do_raw_spin_lock+0x104/0x180
> [   43.435000]  ? __pfx_do_raw_spin_lock+0x10/0x10
> [   43.435000]  ? console_flush_all+0x2a0/0x6b0
> [   43.435000]  console_flush_all+0x2ea/0x6b0
> [   43.435000]  ? console_flush_all+0x2a0/0x6b0
> [   43.435000]  ? __pfx_console_flush_all+0x10/0x10
> [   43.435000]  ? __pfx_lock_acquire+0x10/0x10
> [   43.435000]  console_unlock+0x9d/0x150
> [   43.435000]  ? __pfx_console_unlock+0x10/0x10
> [   43.435000]  ? vprintk_emit+0xd3/0x330
> [   43.435000]  ? __down_trylock_console_sem+0x62/0xa0
> [   43.435000]  ? vprintk_emit+0xd3/0x330
> [   43.435000]  vprintk_emit+0xdc/0x330
> [   43.435000]  _printk+0x92/0xb0
> [   43.435000]  ? __pfx__printk+0x10/0x10
> [   43.435000]  ? hrtimer_interrupt+0x2f0/0x360
> [   43.439262]  __sysvec_apic_timer_interrupt+0xb8/0x290
> [   43.439345]  sysvec_apic_timer_interrupt+0x8a/0xb0
> [   43.439345]  </IRQ>
> [   43.439345]  <TASK>
> [   43.439345]  asm_sysvec_apic_timer_interrupt+0x16/0x20
>
> I thought the new printk work was going to resolve this, but
> apparently not, so to avoid trying to printk in this problematic
> context, let's use printk_deferred_once() instead.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: kernel-team@android.com
> Signed-off-by: John Stultz <jstultz@google.com>
> ---
>  kernel/time/hrtimer.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index edb0f821dcea..e6b060403384 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -1870,7 +1870,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
>  	else
>  		expires_next = ktime_add(now, delta);
>  	tick_program_event(expires_next, 1);
> -	pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
> +	printk_deferred_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
> +			     ktime_to_ns(delta));
>  }
>  
>  /* called with interrupts disabled */
  
John Stultz Feb. 22, 2024, 4:45 p.m. UTC | #2
On Thu, Feb 22, 2024 at 7:17 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Feb 21 2024 at 21:12, John Stultz wrote:
>
> Cc+ John. Keeping context intact,
>
> > With qemu, I constantly see lockdep warnings after the
> > hrtimer_interrupt message is printed:
> >
> > [   43.434557] hrtimer: interrupt took 6517564 ns
> > [   43.435000]
> > [   43.435000] =============================
> > [   43.435000] [ BUG: Invalid wait context ]
>
> Do you have PROVE_RAW_LOCK_NESTING enabled?

Yes, I do. Let me know if there's anything else you'd like me to try.

thanks
-john
  
John Ogness Feb. 22, 2024, 8:33 p.m. UTC | #3
On 2024-02-22, John Stultz <jstultz@google.com> wrote:
> On Thu, Feb 22, 2024 at 7:17 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> On Wed, Feb 21 2024 at 21:12, John Stultz wrote:
>> > With qemu, I constantly see lockdep warnings after the
>> > hrtimer_interrupt message is printed:
>> >
>> > [   43.434557] hrtimer: interrupt took 6517564 ns
>> > [   43.435000]
>> > [   43.435000] =============================
>> > [   43.435000] [ BUG: Invalid wait context ]
>>
>> Do you have PROVE_RAW_LOCK_NESTING enabled?
>
> Yes, I do. Let me know if there's anything else you'd like me to try.

This option is to "ensure that the lock nesting rules for PREEMPT_RT
enabled kernels are not violated."

Since you are not running a PREEMPT_RT enabled kernel, these warnings
are irrelevant for _your_ kernel.

>> > I thought the new printk work was going to resolve this, but
>> > apparently not

Yes, it will, but it is not all mainline yet. The full printk rework is
only available as part of the PREEMPT_RT patch series [0]. With that
series applied, it makes more sense to enable PROVE_RAW_LOCK_NESTING
because the series should resolve all known lock nesting problems with
PREEMPT_RT. (And indeed, the warning you are reporting does not occur
there.)

If you really want to test lock nesting for PREEMPT_RT, I recommend
applying the PREEMPT_RT series and keeping PROVE_RAW_LOCK_NESTING
enabled. Otherwise, if you do not want to apply the PREEMPT_RT series, I
recommend disabling PROVE_RAW_LOCK_NESTING.

Note that you can apply the PREEMPT_RT series and still choose the
!PREEMPT_RT preemption model for your kernel.

John Ogness

[0] https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/log/?h=linux-6.8.y-rt-rebase
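
For the second option above (no PREEMPT_RT series applied), a .config
fragment along these lines would keep the rest of lockdep active while
dropping only the RT-specific nesting check. This is an illustrative
sketch rather than part of the original mail; it assumes the Kconfig
symbol names as they exist in v6.8, where PROVE_RAW_LOCK_NESTING is a
sub-option of PROVE_LOCKING.

```
CONFIG_PROVE_LOCKING=y
# CONFIG_PROVE_RAW_LOCK_NESTING is not set
```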
  
John Stultz Feb. 22, 2024, 8:47 p.m. UTC | #4
On Thu, Feb 22, 2024 at 12:33 PM John Ogness <jogness@linutronix.de> wrote:
> On 2024-02-22, John Stultz <jstultz@google.com> wrote:
> > On Thu, Feb 22, 2024 at 7:17 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> >> On Wed, Feb 21 2024 at 21:12, John Stultz wrote:
> >> > With qemu, I constantly see lockdep warnings after the
> >> > hrtimer_interrupt message is printed:
> >> >
> >> > [   43.434557] hrtimer: interrupt took 6517564 ns
> >> > [   43.435000]
> >> > [   43.435000] =============================
> >> > [   43.435000] [ BUG: Invalid wait context ]
> >>
> >> Do you have PROVE_RAW_LOCK_NESTING enabled?
> >
> > Yes, I do. Let me know if there's anything else you'd like me to try.
>
> This option is to "ensure that the lock nesting rules for PREEMPT_RT
> enabled kernels are not violated."
>
> Since you are not running a PREEMPT_RT enabled kernel, these warnings
> are irrelevant for _your_ kernel.

Ah, mostly I've been running with all the lockdep options enabled as
part of my development of the proxy-exec series, as I want to avoid
accidentally introducing any new problems with my work.

> >> > I thought the new printk work was going to resolve this, but
> >> > apparently not
>
> Yes, it will, but it is not all mainline yet. The full printk rework is
> only available as part of the PREEMPT_RT patch series [0]. With that

Ah, my apologies! I know the printk changes are *very* eagerly
awaited, but I haven't been following it closely, and at plumbers I
had the mistaken sense that the key parts had been queued to be merged
in 6.8.

> If you really want to test lock nesting for PREEMPT_RT, I recommend
> applying the PREEMPT_RT series and keeping PROVE_RAW_LOCK_NESTING
> enabled. Otherwise, if you do not want to apply the PREEMPT_RT series, I
> recommend disabling PROVE_RAW_LOCK_NESTING.
>

Ok, will do.  Though would it make sense to hide
PROVE_RAW_LOCK_NESTING under BROKEN or something upstream in the
meantime?

Thanks so much for the response and your efforts on the printk improvements!
-john
  

Patch

diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index edb0f821dcea..e6b060403384 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1870,7 +1870,8 @@  void hrtimer_interrupt(struct clock_event_device *dev)
 	else
 		expires_next = ktime_add(now, delta);
 	tick_program_event(expires_next, 1);
-	pr_warn_once("hrtimer: interrupt took %llu ns\n", ktime_to_ns(delta));
+	printk_deferred_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
+			     ktime_to_ns(delta));
 }
 
 /* called with interrupts disabled */
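
For readers unfamiliar with the idea behind the change, the following
is a minimal userspace sketch of "deferred, print-once" logging. It is
not the kernel's implementation: log_deferred(), log_deferred_once(),
flush_deferred(), fake_hrtimer_interrupt() and the single-slot buffer
are all hypothetical names invented for illustration. The point it
tries to show is only that the timing-sensitive path formats the
message into memory and leaves the console write (and any locks that
write would take) to a later, safer context.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical single-slot message store; not a kernel data structure. */
#define DEFERRED_BUF_SIZE 256

char deferred_buf[DEFERRED_BUF_SIZE];
bool deferred_pending;

/*
 * Stand-in for the "interrupt context" side: format into memory only,
 * never touch the output device here.
 */
void log_deferred(const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(deferred_buf, sizeof(deferred_buf), fmt, args);
	va_end(args);
	deferred_pending = true;
}

/* Record the message at most once per call site. */
#define log_deferred_once(fmt, ...)				\
	do {							\
		static bool warned;				\
		if (!warned) {					\
			warned = true;				\
			log_deferred(fmt, __VA_ARGS__);		\
		}						\
	} while (0)

/*
 * Stand-in for the "safe context" side: actually write the message.
 * Returns the number of messages written (0 or 1).
 */
int flush_deferred(void)
{
	if (!deferred_pending)
		return 0;
	fputs(deferred_buf, stdout);
	deferred_pending = false;
	return 1;
}

/* Hypothetical caller mimicking the warning site in the patch. */
void fake_hrtimer_interrupt(unsigned long long delta_ns)
{
	log_deferred_once("hrtimer: interrupt took %llu ns\n", delta_ns);
}
```

Calling fake_hrtimer_interrupt() twice and then flush_deferred() would
write only the first message, and only at flush time. The real
printk_deferred_once() differs in many ways (ring buffer, irq_work,
per-CPU state); the sketch only mirrors the ordering of "record now,
print later".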