[v10,20/20] timers: Always queue timers on the local CPU
Commit Message
The timer pull model is in place so we can remove the heuristics which try
to guess the best target CPU at enqueue/modification time.
All non pinned timers are queued on the local CPU in the separate storage
and eventually pulled at expiry time to a remote CPU.
Originally-by: Richard Cochran (linutronix GmbH) <richardcochran@gmail.com>
Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
v9:
- Update to the changes of the preceding patches
v6:
- Update TIMER_PINNED flag description.
v5:
- Move WARN_ONCE() in add_timer_on() into a previous patch
- Fold crystallball magic related hunks into this patch
v4: Update comment about TIMER_PINNED flag (heristic is removed)
---
include/linux/timer.h | 14 ++++----------
kernel/time/timer.c | 34 +++++++++++++---------------------
2 files changed, 17 insertions(+), 31 deletions(-)
Comments
Le Mon, Jan 15, 2024 at 03:37:43PM +0100, Anna-Maria Behnsen a écrit :
> The timer pull model is in place so we can remove the heuristics which try
> to guess the best target CPU at enqueue/modification time.
>
> All non pinned timers are queued on the local CPU in the separate storage
> and eventually pulled at expiry time to a remote CPU.
>
> Originally-by: Richard Cochran (linutronix GmbH) <richardcochran@gmail.com>
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Just one detail below:
> @@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
>
> /*
> * We might have to IPI the remote CPU if the base is idle and the
> - * timer is not deferrable. If the other CPU is on the way to idle
> - * then it can't set base->is_idle as we hold the base lock:
> + * timer is pinned. If it is a non pinned timer, it is only queued
> + * on the remote CPU, when timer was running during queueing. Then
> + * everything is handled by remote CPU anyway. If the other CPU is
> + * on the way to idle then it can't set base->is_idle as we hold
> + * the base lock:
> */
> - if (base->is_idle)
> + if (base->is_idle && timer->flags & TIMER_PINNED)
Is the TIMER_PINNED test necessary? If base->is_idle, then the timer
is now guaranteed to be TIMER_PINNED, right?
Thanks.
Frederic Weisbecker <frederic@kernel.org> writes:
> Le Mon, Jan 15, 2024 at 03:37:43PM +0100, Anna-Maria Behnsen a écrit :
>> The timer pull model is in place so we can remove the heuristics which try
>> to guess the best target CPU at enqueue/modification time.
>>
>> All non pinned timers are queued on the local CPU in the separate storage
>> and eventually pulled at expiry time to a remote CPU.
>>
>> Originally-by: Richard Cochran (linutronix GmbH) <richardcochran@gmail.com>
>> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
>
> Just one detail below:
>
>> @@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
>>
>> /*
>> * We might have to IPI the remote CPU if the base is idle and the
>> - * timer is not deferrable. If the other CPU is on the way to idle
>> - * then it can't set base->is_idle as we hold the base lock:
>> + * timer is pinned. If it is a non pinned timer, it is only queued
>> + * on the remote CPU, when timer was running during queueing. Then
>> + * everything is handled by remote CPU anyway. If the other CPU is
>> + * on the way to idle then it can't set base->is_idle as we hold
>> + * the base lock:
>> */
>> - if (base->is_idle)
>> + if (base->is_idle && timer->flags & TIMER_PINNED)
>
> Is the TIMER_PINNED test necessary? If base->is_idle, then the timer
> is now guaranteed to be TIMER_PINNED, right?
>
Yes, you are right. Should I drop it? To clarify it, I could add a
WARN_ON_ONCE(!timer->flags & TIMER_PINNED)
instead.
Thanks,
Anna-Maria
On Thu, Feb 01, 2024 at 09:58:38PM +0100, Anna-Maria Behnsen wrote:
> Frederic Weisbecker <frederic@kernel.org> writes:
>
> > Le Mon, Jan 15, 2024 at 03:37:43PM +0100, Anna-Maria Behnsen a écrit :
> >> The timer pull model is in place so we can remove the heuristics which try
> >> to guess the best target CPU at enqueue/modification time.
> >>
> >> All non pinned timers are queued on the local CPU in the separate storage
> >> and eventually pulled at expiry time to a remote CPU.
> >>
> >> Originally-by: Richard Cochran (linutronix GmbH) <richardcochran@gmail.com>
> >> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> >
> > Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> >
> > Just one detail below:
> >
> >> @@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
> >>
> >> /*
> >> * We might have to IPI the remote CPU if the base is idle and the
> >> - * timer is not deferrable. If the other CPU is on the way to idle
> >> - * then it can't set base->is_idle as we hold the base lock:
> >> + * timer is pinned. If it is a non pinned timer, it is only queued
> >> + * on the remote CPU, when timer was running during queueing. Then
> >> + * everything is handled by remote CPU anyway. If the other CPU is
> >> + * on the way to idle then it can't set base->is_idle as we hold
> >> + * the base lock:
> >> */
> >> - if (base->is_idle)
> >> + if (base->is_idle && timer->flags & TIMER_PINNED)
> >
> > Is the TIMER_PINNED test necessary? If base->is_idle, then the timer
> > is now guaranteed to be TIMER_PINNED, right?
> >
>
> Yes, you are right. Should I drop it? To clarify it, I could add a
>
> WARN_ON_ONCE(!timer->flags & TIMER_PINNED)
Yep, that looks good!
Thanks.
>
> instead.
>
> Thanks,
>
> Anna-Maria
>
@@ -50,16 +50,10 @@ struct timer_list {
* workqueue locking issues. It's not meant for executing random crap
* with interrupts disabled. Abuse is monitored!
*
- * @TIMER_PINNED: A pinned timer will not be affected by any timer
- * placement heuristics (like, NOHZ) and will always expire on the CPU
- * on which the timer was enqueued.
- *
- * Note: Because enqueuing of timers can migrate the timer from one
- * CPU to another, pinned timers are not guaranteed to stay on the
- * initialy selected CPU. They move to the CPU on which the enqueue
- * function is invoked via mod_timer() or add_timer(). If the timer
- * should be placed on a particular CPU, then add_timer_on() has to be
- * used.
+ * @TIMER_PINNED: A pinned timer will always expire on the CPU on which the
+ * timer was enqueued. When a particular CPU is required, add_timer_on()
+ * has to be used. Enqueue via mod_timer() and add_timer() is always done
+ * on the local CPU.
*/
#define TIMER_CPUMASK 0x0003FFFF
#define TIMER_MIGRATING 0x00040000
@@ -590,10 +590,13 @@ trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
/*
* We might have to IPI the remote CPU if the base is idle and the
- * timer is not deferrable. If the other CPU is on the way to idle
- * then it can't set base->is_idle as we hold the base lock:
+ * timer is pinned. If it is a non pinned timer, it is only queued
+ * on the remote CPU, when timer was running during queueing. Then
+ * everything is handled by remote CPU anyway. If the other CPU is
+ * on the way to idle then it can't set base->is_idle as we hold
+ * the base lock:
*/
- if (base->is_idle)
+ if (base->is_idle && timer->flags & TIMER_PINNED)
wake_up_nohz_cpu(base->cpu);
}
@@ -941,17 +944,6 @@ static inline struct timer_base *get_timer_base(u32 tflags)
return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}
-static inline struct timer_base *
-get_target_base(struct timer_base *base, unsigned tflags)
-{
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
- if (static_branch_likely(&timers_migration_enabled) &&
- !(tflags & TIMER_PINNED))
- return get_timer_cpu_base(tflags, get_nohz_timer_target());
-#endif
- return get_timer_this_cpu_base(tflags);
-}
-
static inline void __forward_timer_base(struct timer_base *base,
unsigned long basej)
{
@@ -1106,7 +1098,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires, unsigned int option
if (!ret && (options & MOD_TIMER_PENDING_ONLY))
goto out_unlock;
- new_base = get_target_base(base, timer->flags);
+ new_base = get_timer_this_cpu_base(timer->flags);
if (base != new_base) {
/*
@@ -2237,7 +2229,7 @@ static inline u64 __get_next_timer_interrupt(unsigned long basej, u64 basem,
* granularity skew (by design).
*/
if (!base_local->is_idle && time_after(nextevt, basej + 1)) {
- base_local->is_idle = base_global->is_idle = true;
+ base_local->is_idle = true;
trace_timer_base_idle(true, base_local->cpu);
}
*idle = base_local->is_idle;
@@ -2303,13 +2295,13 @@ u64 timer_base_try_to_set_idle(unsigned long basej, u64 basem, bool *idle)
void timer_clear_idle(void)
{
/*
- * We do this unlocked. The worst outcome is a remote enqueue sending
- * a pointless IPI, but taking the lock would just make the window for
- * sending the IPI a few instructions smaller for the cost of taking
- * the lock in the exit from idle path.
+ * We do this unlocked. The worst outcome is a remote pinned timer
+ * enqueue sending a pointless IPI, but taking the lock would just
+ * make the window for sending the IPI a few instructions smaller
+ * for the cost of taking the lock in the exit from idle
+ * path. Required for BASE_LOCAL only.
*/
__this_cpu_write(timer_bases[BASE_LOCAL].is_idle, false);
- __this_cpu_write(timer_bases[BASE_GLOBAL].is_idle, false);
trace_timer_base_idle(false, smp_processor_id());
/* Activate without holding the timer_base->lock */