softirq: Do not loop if running under a real-time task

Message ID 20230306154548.655799-1-oss@malat.biz
State New
Series softirq: Do not loop if running under a real-time task

Commit Message

Petr Malat March 6, 2023, 3:45 p.m. UTC
Softirq processing can be a source of scheduling jitter if it executes
in a real-time task, because in that case need_resched() is false unless
there is another runnable task with a higher priority. This is especially
bad if the softirq processing runs in a migration thread, which has
priority 99 and usually runs only for a short time.

One option would be to not restart the softirq processing if there is
any other runnable task, allowing the high-priority task to finish and
yield the CPU; the second is to not restart if the softirq executes in
a real-time task. Usually, real-time tasks don't want to be interrupted,
so implement the second option.
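
For context, not part of the change itself: in kernel priority numbering,
lower prio values mean higher priority, and prio values below MAX_RT_PRIO
(100) denote real-time tasks (deadline tasks have prio < 0 and qualify
too). A minimal sketch of the test the patch relies on, equivalent to the
rt_task() helper from <linux/sched/rt.h>; the helper name below is
illustrative:

  #include <linux/sched.h>
  #include <linux/sched/prio.h>

  /* Sketch only: true if @p runs under a real-time policy. The patch's
   * "current->prio >= MAX_RT_PRIO" is the negation of this test. */
  static inline bool runs_realtime(struct task_struct *p)
  {
  	return p->prio < MAX_RT_PRIO;
  }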

Signed-off-by: Petr Malat <oss@malat.biz>
---
 kernel/softirq.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Sebastian Andrzej Siewior March 8, 2023, 9:14 a.m. UTC | #1
On 2023-03-06 16:45:48 [+0100], Petr Malat wrote:
> Softirq processing can be a source of scheduling jitter if it executes
> in a real-time task, because in that case need_resched() is false unless
> there is another runnable task with a higher priority. This is especially
> bad if the softirq processing runs in a migration thread, which has
> priority 99 and usually runs only for a short time.
>
> One option would be to not restart the softirq processing if there is
> any other runnable task, allowing the high-priority task to finish and
> yield the CPU; the second is to not restart if the softirq executes in
> a real-time task. Usually, real-time tasks don't want to be interrupted,
> so implement the second option.

This affects only PREEMPT_RT, right?

I have plans to redo parts of it. You shouldn't enter ksoftirqd to
begin with. There is this ktimerd in v6.1 which mitigates this to some
extent, and I plan to extend it to also cover the sched-softirq.
Other than that, you are right in saying that the softirq must not
continue with an RT prio and that need_resched() is not visible here.
However, ksoftirqd itself must be able to keep looping until the
need-resched flag is seen.

Since you mentioned the migration thread: how often do you see this,
and how does it trigger?

> Signed-off-by: Petr Malat <oss@malat.biz>

Sebastian
  
Petr Malat March 9, 2023, 1:03 p.m. UTC | #2
On Wed, Mar 08, 2023 at 10:14:58AM +0100, Sebastian Andrzej Siewior wrote:
> On 2023-03-06 16:45:48 [+0100], Petr Malat wrote:
> > Softirq processing can be a source of scheduling jitter if it executes
> > in a real-time task, because in that case need_resched() is false unless
> > there is another runnable task with a higher priority. This is especially
> > bad if the softirq processing runs in a migration thread, which has
> > priority 99 and usually runs only for a short time.
> >
> > One option would be to not restart the softirq processing if there is
> > any other runnable task, allowing the high-priority task to finish and
> > yield the CPU; the second is to not restart if the softirq executes in
> > a real-time task. Usually, real-time tasks don't want to be interrupted,
> > so implement the second option.
> 
> This affects only PREEMPT_RT, right?
I have observed the issue on a 5.15 CONFIG_PREEMPT=y arm32 kernel.

> I have plans to redo parts of it. You shouldn't enter ksoftirqd to
> begin with. There is this ktimerd in v6.1 which mitigates this to some
> extent, and I plan to extend it to also cover the sched-softirq.
> Other than that, you are right in saying that the softirq must not
> continue with an RT prio and that need_resched() is not visible here.
> However, ksoftirqd itself must be able to keep looping until the
> need-resched flag is seen.
> 
> Since you mentioned the migration thread: how often do you see this,
> and how does it trigger?
I have seen only one occurrence where I have a backtrace available
(from hundreds of systems). I think that's because on my system it can
occur only if it hits the migration thread; otherwise there are more
runnable threads of the same priority and need_resched() breaks the loop.

I obtained the stack trace with a debugging module that uses a periodic
timer to monitor active tasks and dumps the stack when it finds
something fishy (a sketch of the idea follows the trace). This is what
I got:
 [<bf84f559>] (hogger_handler [hogger]) from [<c04850ef>] (__hrtimer_run_queues+0x13f/0x2f4)
 [<c04850ef>] (__hrtimer_run_queues) from [<c04858a5>] (hrtimer_interrupt+0xc9/0x1c4)
 [<c04858a5>] (hrtimer_interrupt) from [<c0810533>] (arch_timer_handler_phys+0x27/0x2c)
 [<c0810533>] (arch_timer_handler_phys) from [<c046de3b>] (handle_percpu_devid_irq+0x5b/0x1e4)
 [<c046de3b>] (handle_percpu_devid_irq) from [<c0469a27>] (__handle_domain_irq+0x53/0x94)
 [<c0469a27>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
 [<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
 Exception stack(0xc1595ca8 to 0xc1595cf0)
 [<c0400ad3>] (__irq_svc) from [<c098e404>] (_raw_spin_unlock_irqrestore+0x1c/0x3c)
 [<c098e404>] (_raw_spin_unlock_irqrestore) from [<c0446b6d>] (try_to_wake_up+0x1d9/0x5d0)
 [<c0446b6d>] (try_to_wake_up) from [<c0483d2d>] (call_timer_fn+0x31/0x16c)
 [<c0483d2d>] (call_timer_fn) from [<c048406f>] (run_timer_softirq+0x207/0x2d4)
 [<c048406f>] (run_timer_softirq) from [<c0401293>] (__do_softirq+0xd3/0x2f8)
 [<c0401293>] (__do_softirq) from [<c042876b>] (irq_exit+0x57/0x78)
 [<c042876b>] (irq_exit) from [<c0469a2b>] (__handle_domain_irq+0x57/0x94)
 [<c0469a2b>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
 [<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
 Exception stack(0xc1595e78 to 0xc1595ec0)
 [<c0400ad3>] (__irq_svc) from [<c044d37c>] (active_load_balance_cpu_stop+0x1ec/0x234)
 [<c044d37c>] (active_load_balance_cpu_stop) from [<c04ac099>] (cpu_stopper_thread+0x69/0xd8)
 [<c04ac099>] (cpu_stopper_thread) from [<c0440b53>] (smpboot_thread_fn+0x9f/0x17c)
 [<c0440b53>] (smpboot_thread_fn) from [<c043ccf9>] (kthread+0x129/0x12c)
 [<c043ccf9>] (kthread) from [<c0400131>] (ret_from_fork+0x11/0x20)
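
For reference, a minimal sketch of the kind of watchdog module described
above, assuming a 10 ms period; the hogger_handler name follows the
trace, the rest is illustrative and not the actual module:

  #include <linux/module.h>
  #include <linux/hrtimer.h>
  #include <linux/ktime.h>
  #include <linux/sched.h>
  #include <linux/sched/rt.h>

  static struct hrtimer hog_timer;
  static struct task_struct *last_seen;

  /* Runs every 10 ms in hardirq context: if the same real-time task is
   * still on this CPU as on the previous tick, consider it fishy and
   * dump the interrupted stack. */
  static enum hrtimer_restart hogger_handler(struct hrtimer *t)
  {
  	if (rt_task(current) && current == last_seen)
  		dump_stack();
  	last_seen = current;

  	hrtimer_forward_now(t, ms_to_ktime(10));
  	return HRTIMER_RESTART;
  }

  static int __init hogger_init(void)
  {
  	hrtimer_init(&hog_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
  	hog_timer.function = hogger_handler;
  	hrtimer_start(&hog_timer, ms_to_ktime(10), HRTIMER_MODE_REL);
  	return 0;
  }

  static void __exit hogger_exit(void)
  {
  	hrtimer_cancel(&hog_timer);
  }

  module_init(hogger_init);
  module_exit(hogger_exit);
  MODULE_LICENSE("GPL");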

I then looked into the code to see how it could happen that softirqs
were not offloaded to ksoftirqd, and the only explanation I have is the
one I described in the original mail.
BR,
  Petr
  

Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..6a66d28bf020 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -478,7 +478,8 @@  asmlinkage __visible void do_softirq(void)
 
 /*
  * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
- * but break the loop if need_resched() is set or after 2 ms.
+ * but break the loop after 2 ms or if need_resched() is set or if we
+ * execute in a real-time task.
  * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
  * certain cases, such as stop_machine(), jiffies may cease to
  * increment and so we need the MAX_SOFTIRQ_RESTART limit as
@@ -589,6 +590,7 @@  asmlinkage __visible void __softirq_entry __do_softirq(void)
 	pending = local_softirq_pending();
 	if (pending) {
 		if (time_before(jiffies, end) && !need_resched() &&
+		    (current->prio >= MAX_RT_PRIO || current == __this_cpu_read(ksoftirqd)) &&
 		    --max_restart)
 			goto restart;
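
With this change, the restart condition reads: keep looping only while
time remains, need_resched() is not set, the restart budget is not yet
exhausted, and the current context is either a non-real-time task (prio
values below MAX_RT_PRIO, i.e. 100, are real-time) or ksoftirqd itself.
The ksoftirqd exception preserves the requirement raised in the review
that ksoftirqd must be able to keep looping until the need-resched flag
is seen.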