[v6,15/21] timer: Add get next timer interrupt functionality for remote CPUs

Message ID 20230510072817.116056-16-anna-maria@linutronix.de
State New
Series timer: Move from a push remote at enqueue to a pull at expiry model

Commit Message

Anna-Maria Behnsen May 10, 2023, 7:28 a.m. UTC
To prepare for the conversion of the NOHZ timer placement to a pull at
expiry time model, it is required to have functionality available for
getting the next timer interrupt on a remote CPU.

Locking the timer bases and retrieving the information about the next
timer interrupt are split into separate functions. This is required to be
compliant with the lock ordering when the new model is in place.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
---
v6:
 - introduce timer_lock_remote_bases() to fix race
---
 kernel/time/tick-internal.h |  9 +++++
 kernel/time/timer.c         | 71 ++++++++++++++++++++++++++++++++++---
 2 files changed, 75 insertions(+), 5 deletions(-)
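
As an illustration only, here is a minimal sketch of how the two new helpers
are meant to be combined by a remote expiry path. The function name below is
made up for this sketch; the real caller, tmigr_handle_remote_cpu(), is only
introduced later in the series, and using jiffies/ktime_get() as the base
times is an assumption here.

/*
 * Sketch, not part of this patch: combine the split lock/fetch helpers.
 */
static void fetch_remote_expiry_sketch(unsigned int cpu)
{
	struct timer_events tevt;

	/* Lock both remote timer bases; interrupts are disabled afterwards */
	if (!timer_lock_remote_bases(cpu))
		return;

	/* Drops the base locks again before returning */
	fetch_next_timer_interrupt_remote(jiffies, ktime_get(), &tevt, cpu);

	/* The helper does not re-enable interrupts; the caller has to */
	local_irq_enable();

	/* tevt.local / tevt.global now hold the next expiries or KTIME_MAX */
}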
  

Comments

Frederic Weisbecker May 10, 2023, 10:16 a.m. UTC | #1
On Wed, May 10, 2023 at 09:28:11AM +0200, Anna-Maria Behnsen wrote:
> +/**
> + * fetch_next_timer_interrupt_remote() - Store next timer expiries of @cpu into @tevt
> + * @basej:	base time jiffies
> + * @basem:	base time clock monotonic
> + * @tevt:	Pointer to the storage for the expiry values
> + * @cpu:	Remote CPU
> + *
> + * Stores the next pending local and global timer expiry values in the
> + * struct pointed to by @tevt. If a queue is empty the corresponding
> + * field is set to KTIME_MAX. If local event expires before global
> + * event, global event is set to KTIME_MAX as well.
> + *
> + * Caller needs to make sure timer base locks are held (use
> + * timer_lock_remote_bases() for this purpose). Caller must make sure
> + * interrupts are re-enabled, if required.
> + */
> +void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem,
> +				       struct timer_events *tevt,
> +				       unsigned int cpu)
> +{
> +	struct timer_base *base_local, *base_global;
> +
> +	/* Preset local / global events */
> +	tevt->local = tevt->global = KTIME_MAX;
> +
> +	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
> +	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
> +
> +	lockdep_assert_held(&base_local->lock);
> +	lockdep_assert_held(&base_global->lock);
> +
> +	fetch_next_timer_interrupt(base_local, base_global, basej, basem, tevt);
> +
> +	raw_spin_unlock(&base_global->lock);
> +	raw_spin_unlock(&base_local->lock);

Oh so that makes:

LOCK(baseL)
LOCK(baseG)
LOCK(tmc)
UNLOCK(baseG)
UNLOCK(baseL)
UNLOCK(tmc)

I guess you can keep the bases locks locked until the end of
tmigr_handle_remote_cpu(). After all that's what get_next_timer_interrupt()
does. I'm not sure the above early release of bases locks will bring much
in case of contention...

Then a timer_unlock_remote_bases() would restore symmetry.
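
For reference, such a counterpart could look roughly like the sketch below;
it does not exist in this version of the patch, the name and shape simply
follow the suggestion above:

/*
 * Sketch of the suggested counterpart, not part of this patch version:
 * drop the remote base locks taken by timer_lock_remote_bases().
 * Interrupt state is left to the caller.
 */
void timer_unlock_remote_bases(unsigned int cpu)
{
	struct timer_base *base_local, *base_global;

	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);

	/* Unlock in reverse order of timer_lock_remote_bases() */
	raw_spin_unlock(&base_global->lock);
	raw_spin_unlock(&base_local->lock);
}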

> +/**
> + * timer_lock_remote_bases - lock timer bases of cpu
> + * @cpu:	Remote CPU
> + *
> + * Locks the remote timer bases.
> + *
> + * Returns false if cpu is offline, otherwise true is returned.
> + */
> +bool timer_lock_remote_bases(unsigned int cpu)
> +{
> +	struct timer_base *base_local, *base_global;
> +
> +	/*
> +	 * Pretend that there is no timer pending if the cpu is offline.
> +	 * Possible pending timers will be migrated later to an active cpu.
> +	 */
> +	if (cpu_is_offline(cpu))
> +		return false;

This value is never checked and the caller assumes the bases are
always locked upon calling this (more on this on the big patch).

Thanks.

> +
> +	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
> +	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
> +
> +	raw_spin_lock_irq(&base_local->lock);
> +	raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING);
> +
> +	return true;
> +}
  
Anna-Maria Behnsen May 11, 2023, 1:06 p.m. UTC | #2
On Wed, 10 May 2023, Frederic Weisbecker wrote:

> On Wed, May 10, 2023 at 09:28:11AM +0200, Anna-Maria Behnsen wrote:
> > +/**
> > + * fetch_next_timer_interrupt_remote() - Store next timer expiries of @cpu into @tevt
> > + * @basej:	base time jiffies
> > + * @basem:	base time clock monotonic
> > + * @tevt:	Pointer to the storage for the expiry values
> > + * @cpu:	Remote CPU
> > + *
> > + * Stores the next pending local and global timer expiry values in the
> > + * struct pointed to by @tevt. If a queue is empty the corresponding
> > + * field is set to KTIME_MAX. If local event expires before global
> > + * event, global event is set to KTIME_MAX as well.
> > + *
> > + * Caller needs to make sure timer base locks are held (use
> > + * timer_lock_remote_bases() for this purpose). Caller must make sure
> > + * interrupts are re-enabled, if required.
> > + */
> > +void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem,
> > +				       struct timer_events *tevt,
> > +				       unsigned int cpu)
> > +{
> > +	struct timer_base *base_local, *base_global;
> > +
> > +	/* Preset local / global events */
> > +	tevt->local = tevt->global = KTIME_MAX;
> > +
> > +	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
> > +	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
> > +
> > +	lockdep_assert_held(&base_local->lock);
> > +	lockdep_assert_held(&base_global->lock);
> > +
> > +	fetch_next_timer_interrupt(base_local, base_global, basej, basem, tevt);
> > +
> > +	raw_spin_unlock(&base_global->lock);
> > +	raw_spin_unlock(&base_local->lock);
> 
> Oh so that makes:
> 
> LOCK(baseL)
> LOCK(baseG)
> LOCK(tmc)
> UNLOCK(baseG)
> UNLOCK(baseL)
> UNLOCK(tmc)
> 
> I guess you can keep the bases locks locked until the end of
> tmigr_handle_remote_cpu(). After all that's what get_next_timer_interrupt()
> does. I'm not sure the above early release of bases locks will bring much
> in case of contention...
> 
> Then a timer_unlock_remote_bases() would restore symmetry.

When the walk of the hierarchy for updating the new timer is done,
additional locks have to be taken. So then 5 locks are held during the
update in lvl1 and higher: baseL, baseG, tmc, child, group.

I don't see a problem with releasing the baseL and baseG locks earlier. But
I will add an extra function for releasing the base locks to make it
clearer. The whole section is protected by local_irq_disable() and
interrupts are re-enabled when the tmc lock is unlocked.

No?
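
For illustration, the ordering described above could be sketched like this;
the tmc lock and the hierarchy walk are assumptions taken from this
discussion and from the rest of the series, not code from this patch:

/*
 * Rough sketch of the critical section discussed above; "tmc_lock" stands
 * in for the per-CPU timer migration lock from the rest of the series.
 */
static void remote_expiry_ordering_sketch(raw_spinlock_t *tmc_lock,
					  unsigned int cpu)
{
	struct timer_events tevt;

	/* Disables interrupts and locks both remote timer bases */
	if (!timer_lock_remote_bases(cpu))
		return;

	/* The tmc lock nests inside the base locks in this version */
	raw_spin_lock(tmc_lock);

	/* Drops the two base locks again; interrupts stay disabled */
	fetch_next_timer_interrupt_remote(jiffies, ktime_get(), &tevt, cpu);

	/*
	 * The walk of the hierarchy for propagating the new expiry would
	 * happen here, taking child/group locks while tmc is still held.
	 */

	/* Interrupts are re-enabled only when the tmc lock is dropped */
	raw_spin_unlock_irq(tmc_lock);
}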

> > +/**
> > + * timer_lock_remote_bases - lock timer bases of cpu
> > + * @cpu:	Remote CPU
> > + *
> > + * Locks the remote timer bases.
> > + *
> > + * Returns false if cpu is offline, otherwise true is returned.
> > + */
> > +bool timer_lock_remote_bases(unsigned int cpu)
> > +{
> > +	struct timer_base *base_local, *base_global;
> > +
> > +	/*
> > +	 * Pretend that there is no timer pending if the cpu is offline.
> > +	 * Possible pending timers will be migrated later to an active cpu.
> > +	 */
> > +	if (cpu_is_offline(cpu))
> > +		return false;
> 
> This value is never checked and the caller assumes the bases are
> always locked upon calling this (more on this on the big patch).
> 
> Thanks.
> 
> > +
> > +	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
> > +	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
> > +
> > +	raw_spin_lock_irq(&base_local->lock);
> > +	raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING);
> > +
> > +	return true;
> > +}
> 
>
  

Patch

diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 649f2b48e8f0..c7bba0bb19d9 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -8,6 +8,11 @@ 
 #include "timekeeping.h"
 #include "tick-sched.h"
 
+struct timer_events {
+	u64	local;
+	u64	global;
+};
+
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 
 # define TICK_DO_TIMER_NONE	-1
@@ -164,6 +169,10 @@  static inline void timers_update_nohz(void) { }
 DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
 
 extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+extern void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem,
+					      struct timer_events *tevt,
+					      unsigned int cpu);
+extern bool timer_lock_remote_bases(unsigned int cpu);
 void timer_clear_idle(void);
 
 #define CLOCK_SET_WALL							\
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 4bb6c168d106..2370c9a15ba6 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -221,11 +221,6 @@  struct timer_base {
 
 static DEFINE_PER_CPU(struct timer_base, timer_bases[NR_BASES]);
 
-struct timer_events {
-	u64	local;
-	u64	global;
-};
-
 #ifdef CONFIG_NO_HZ_COMMON
 
 static DEFINE_STATIC_KEY_FALSE(timers_nohz_active);
@@ -2013,6 +2008,72 @@  static unsigned long fetch_next_timer_interrupt(struct timer_base *base_local,
 	return local_first ? nextevt_local : nextevt_global;
 }
 
+/**
+ * fetch_next_timer_interrupt_remote() - Store next timer expiries of @cpu into @tevt
+ * @basej:	base time jiffies
+ * @basem:	base time clock monotonic
+ * @tevt:	Pointer to the storage for the expiry values
+ * @cpu:	Remote CPU
+ *
+ * Stores the next pending local and global timer expiry values in the
+ * struct pointed to by @tevt. If a queue is empty the corresponding
+ * field is set to KTIME_MAX. If local event expires before global
+ * event, global event is set to KTIME_MAX as well.
+ *
+ * Caller needs to make sure timer base locks are held (use
+ * timer_lock_remote_bases() for this purpose). Caller must make sure
+ * interrupts are re-enabled, if required.
+ */
+void fetch_next_timer_interrupt_remote(unsigned long basej, u64 basem,
+				       struct timer_events *tevt,
+				       unsigned int cpu)
+{
+	struct timer_base *base_local, *base_global;
+
+	/* Preset local / global events */
+	tevt->local = tevt->global = KTIME_MAX;
+
+	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
+	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
+
+	lockdep_assert_held(&base_local->lock);
+	lockdep_assert_held(&base_global->lock);
+
+	fetch_next_timer_interrupt(base_local, base_global, basej, basem, tevt);
+
+	raw_spin_unlock(&base_global->lock);
+	raw_spin_unlock(&base_local->lock);
+}
+
+
+/**
+ * timer_lock_remote_bases - lock timer bases of cpu
+ * @cpu:	Remote CPU
+ *
+ * Locks the remote timer bases.
+ *
+ * Returns false if cpu is offline, otherwise true is returned.
+ */
+bool timer_lock_remote_bases(unsigned int cpu)
+{
+	struct timer_base *base_local, *base_global;
+
+	/*
+	 * Pretend that there is no timer pending if the cpu is offline.
+	 * Possible pending timers will be migrated later to an active cpu.
+	 */
+	if (cpu_is_offline(cpu))
+		return false;
+
+	base_local = per_cpu_ptr(&timer_bases[BASE_LOCAL], cpu);
+	base_global = per_cpu_ptr(&timer_bases[BASE_GLOBAL], cpu);
+
+	raw_spin_lock_irq(&base_local->lock);
+	raw_spin_lock_nested(&base_global->lock, SINGLE_DEPTH_NESTING);
+
+	return true;
+}
+
 /*
  * Forward base clock is done only when @basej is past base->clk, otherwise
  * base-clk might be rewind.