[v6,5/6] timers: Add timer_shutdown() to be called before freeing timers

Message ID 20221110064147.529154710@goodmis.org
State New
Series timers: Use timer_shutdown*() before freeing timers

Commit Message

Steven Rostedt Nov. 10, 2022, 6:41 a.m. UTC
  From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

Before a timer is to be freed, it must be shut down. But there are some
locations where timer_shutdown_sync() cannot be called due to the context
the object that holds the timer is in when it is freed.

For cases where the logic should keep the timer from being re-armed but
still needs to be shutdown with a sync, a new API of timer_shutdown() is
available. This is the same as del_timer() except that after it is called,
the timer can not be re-armed. If it is, a WARN_ON_ONCE() will be
triggered.

The implementation of timer_shutdown() follows the timer_shutdown_sync()
method of using the same code as del_timer() but will pass in a boolean
that the timer is about to be freed, in which case the timer->function is
set to NULL, just like timer_shutdown_sync().

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 include/linux/timer.h | 35 ++++++++++++++++++++++++++++++++++-
 kernel/time/timer.c   | 21 ++++++++-------------
 2 files changed, 42 insertions(+), 14 deletions(-)
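
The difference between del_timer() and timer_shutdown() described above
can be sketched in a userspace toy model. This is not kernel code: every
model_* name is hypothetical, the lock is omitted, and both the warning
counter (standing in for WARN_ON_ONCE()) and the -1 returned for a refused
re-arm are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
	void (*function)(struct model_timer *);
	bool pending;              /* stands in for timer_pending() */
	unsigned long expires;
};

static int warn_count;             /* counts would-be WARN_ON_ONCE() hits */

static void model_timer_setup(struct model_timer *t,
			      void (*fn)(struct model_timer *))
{
	t->function = fn;
	t->pending = false;
	t->expires = 0;
}

/* Returns 1 if the timer was already pending, 0 if it was newly armed,
 * and -1 (an assumption of this model) if the timer was shut down. */
static int model_mod_timer(struct model_timer *t, unsigned long expires)
{
	int was_pending;

	if (!t->function) {        /* shut down: refuse to re-arm */
		warn_count++;
		return -1;
	}
	was_pending = t->pending;
	t->expires = expires;
	t->pending = true;
	return was_pending;
}

/* Shared body, mirroring the boolean parameter in the patch. */
static int model_delete(struct model_timer *t, bool shutdown)
{
	int ret = t->pending;      /* 1 if a pending timer was deactivated */

	t->pending = false;
	if (shutdown)
		t->function = NULL;
	return ret;
}

static int model_del_timer(struct model_timer *t)
{
	return model_delete(t, false);
}

static int model_timer_shutdown(struct model_timer *t)
{
	return model_delete(t, true);
}

static void noop_cb(struct model_timer *t) { (void)t; }

/* Walks through the life cycle and returns the number of warnings. */
int model_demo(void)
{
	struct model_timer t;
	int armed, deleted, rearmed, shut, refused, reused;

	model_timer_setup(&t, noop_cb);
	armed   = model_mod_timer(&t, 100);   /* newly armed */
	deleted = model_del_timer(&t);        /* was pending */
	rearmed = model_mod_timer(&t, 200);   /* re-arm after delete is fine */
	shut    = model_timer_shutdown(&t);   /* was pending, now shut down */
	refused = model_mod_timer(&t, 300);   /* refused, warning counted */
	model_timer_setup(&t, noop_cb);       /* re-initialize before reuse */
	reused  = model_mod_timer(&t, 400);   /* works again */

	assert(armed == 0 && deleted == 1 && rearmed == 0);
	assert(shut == 1 && refused == -1 && reused == 0);
	return warn_count;
}
```

Calling model_demo() once from a main() exercises the whole sequence. The
re-initialization step near the end reflects that a shut-down timer has
lost its callback and cannot simply be re-armed.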
  

Comments

Thomas Gleixner Nov. 13, 2022, 10:20 p.m. UTC | #1
On Thu, Nov 10 2022 at 01:41, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

$Subject: !!@*&^*&^@!

> Before a timer is to be freed, it must be shut down. But there are some
> locations where timer_shutdown_sync() cannot be called due to the context
> the object that holds the timer is in when it is freed.

locations? This is not about locations, it's about contexts. And please
provide a proper example for such a context.

> For cases where the logic should keep the timer from being re-armed but
> still needs to be shutdown with a sync, a new API of timer_shutdown() is
> available.

"Needs to shutdown with a sync"? "is available"? Try again with
comprehensible explanations.

> This is the same as del_timer() except that after it is called, the
> timer can not be re-armed. If it is, a WARN_ON_ONCE() will be
> triggered.
>
> The implementation of timer_shutdown() follows the timer_shutdown_sync()
> method of using the same code as del_timer() but will pass in a boolean
> that the timer is about to be freed, in which case the timer->function is
> set to NULL, just like timer_shutdown_sync().

That's completely useless information for a changelog. We can see that
from the patch itself, no?

Changelogs are about context and the problem the patch tries to solve,
not about implementation details.

> +/**
> + * del_timer - deactivate a timer.
> + * @timer: the timer to be deactivated

See previous comments about uppercase.

> + * del_timer() deactivates a timer - this works on both active and inactive
> + * timers.

How so? What "works"? What's the work done on an inactive timer? Also
this lacks documentation that this function is fundamentally racy
against a concurrent rearm.

> + * The function returns whether it has deactivated a pending timer or not.
> + * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
> + * active timer returns 1.)

See previous comment about return value documentation.

> + */
> +static inline int del_timer(struct timer_list *timer)
> +{
> +	return __del_timer(timer, false);
> +}
> +
> +/**
> + * timer_shutdown - deactivate a timer and shut it down
> + * @timer: the timer to be deactivated
> + *
> + * timer_shutdown() deactivates a timer - this works on both active
> + * and inactive timers, and will prevent it from being rearmed.

This needs some further explanation, especially vs. the function pointer
being set to NULL. That means that if the timer is not freed and is
reused later on, it needs to be initialized again. That explanation is,
btw, lacking from timer_shutdown_sync() too.

> + * The function returns whether it has deactivated a pending timer or not.
> + * (ie. timer_shutdown() of an inactive timer returns 0,
> + *   timer_shutdown() of an active timer returns 1.)
> + */
> +static inline int timer_shutdown(struct timer_list *timer)
> +{
> +	return __del_timer(timer, true);
> +}
> +
>  /*
>   * The jiffies value which is added to now, when there is no timer
>   * in the timer wheel:
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 111a3550b3f2..7c224766065e 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -1240,18 +1240,7 @@ void add_timer_on(struct timer_list *timer, int cpu)
>  }
>  EXPORT_SYMBOL_GPL(add_timer_on);
>  
> -/**
> - * del_timer - deactivate a timer.
> - * @timer: the timer to be deactivated
> - *
> - * del_timer() deactivates a timer - this works on both active and inactive
> - * timers.
> - *
> - * The function returns whether it has deactivated a pending timer or not.
> - * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
> - * active timer returns 1.)
> - */

Instead of blurbing about invoking __del_timer() with free=true in the
changelog you could have kept the kernel doc here and/or added some
useful comment to the code below.

But...

> -int del_timer(struct timer_list *timer)
> +int __del_timer(struct timer_list *timer, bool free)
>  {
>  	struct timer_base *base;
>  	unsigned long flags;
> @@ -1262,12 +1251,18 @@ int del_timer(struct timer_list *timer)
>  	if (timer_pending(timer)) {
>  		base = lock_timer_base(timer, &flags);
>  		ret = detach_if_pending(timer, base, true);
> +		if (free)
> +			timer->function = NULL;
> +		raw_spin_unlock_irqrestore(&base->lock, flags);
> +	} else if (free) {
> +		base = lock_timer_base(timer, &flags);
> +		timer->function = NULL;
>  		raw_spin_unlock_irqrestore(&base->lock, flags);
>  	}

... this function is a concurrency disaster:

CPU0                           		CPU1

timer_shutdown(timer)
  __del_timer(timer, free=true)
    // timer is not pending
    ....
    } else if (free)                    mod_timer()
                                          lock_timer(timer);
      lock_timer(timer)                   enqueue_timer(timer);
                                          unlock_timer(timer);
      timer->function = NULL;
      unlock_timer(timer);
                                        //timer expires
                                        lock_timer(timer);
                                        fn = timer->function;
                                        unlock_timer(timer);
                                        fn(timer); <--- NULL pointer dereference

So you "solve" the existing problem by introducing one which is even
more horrible to debug, right?

Let me go back to the timer_shutdown_sync() variant and figure out
whether that one is at least not borked in the same way.

Thanks,

        tglx
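
The interleaving in the diagram above can be replayed deterministically in
a userspace toy model. This is a sketch, not kernel code: all names are
hypothetical, the "CPUs" are just an ordering of function calls, and the
lock is represented only by where the shutdown is split into two steps.
The second variant, which takes the lock unconditionally and detaches
whatever is queued, is one possible way to close the window. It is not
claimed to be what the series ended up doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct model_timer {
	void (*function)(struct model_timer *);
	bool pending;
};

static void cb(struct model_timer *t) { (void)t; }

/* CPU1: mod_timer() under the lock, enqueues only if a callback is set. */
static void model_mod_timer(struct model_timer *t)
{
	if (t->function)
		t->pending = true;
}

/* Posted patch, step 1: the timer_pending() check done WITHOUT the lock. */
static bool racy_check_pending(const struct model_timer *t)
{
	return t->pending;
}

/* Posted patch, step 2: the part done under the lock, acting on the
 * possibly stale result of step 1. */
static int racy_finish(struct model_timer *t, bool saw_pending)
{
	int ret = 0;

	if (saw_pending) {
		ret = t->pending;      /* detach_if_pending() */
		t->pending = false;
	}
	/* The "else if (free)" branch never detaches. */
	t->function = NULL;
	return ret;
}

/* Alternative sketch: lock first, then detach and clear in one step. */
static int locked_shutdown(struct model_timer *t)
{
	int ret = t->pending;

	t->pending = false;
	t->function = NULL;
	return ret;
}

/* A queued timer with no callback is the NULL dereference at expiry. */
static bool would_crash(const struct model_timer *t)
{
	return t->pending && t->function == NULL;
}

/* Replays the diagram: CPU1 enqueues between CPU0's check and its lock. */
bool model_race_posted(void)
{
	struct model_timer t = { .function = cb, .pending = false };
	bool saw_pending = racy_check_pending(&t);   /* CPU0: not pending */

	model_mod_timer(&t);                         /* CPU1: enqueue */
	racy_finish(&t, saw_pending);                /* CPU0: function = NULL */
	return would_crash(&t);
}

/* With one locked step, mod_timer() runs entirely before or after it. */
bool model_race_locked(void)
{
	struct model_timer a = { .function = cb, .pending = false };
	struct model_timer b = { .function = cb, .pending = false };

	model_mod_timer(&a);           /* CPU1 first ... */
	locked_shutdown(&a);           /* ... then CPU0 detaches it */
	assert(!a.pending);

	locked_shutdown(&b);           /* CPU0 first ... */
	model_mod_timer(&b);           /* ... then CPU1 is refused */
	assert(!b.pending);

	return would_crash(&a) || would_crash(&b);
}
```

In this model, model_race_posted() leaves the timer queued with a NULL
callback, while neither ordering in model_race_locked() does.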
  

Patch

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 4d56e20613eb..0b959b52d0db 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -168,12 +168,45 @@  static inline int timer_pending(const struct timer_list * timer)
 	return !hlist_unhashed_lockless(&timer->entry);
 }
 
+extern int __del_timer(struct timer_list * timer, bool free);
+
 extern void add_timer_on(struct timer_list *timer, int cpu);
-extern int del_timer(struct timer_list * timer);
 extern int mod_timer(struct timer_list *timer, unsigned long expires);
 extern int mod_timer_pending(struct timer_list *timer, unsigned long expires);
 extern int timer_reduce(struct timer_list *timer, unsigned long expires);
 
+/**
+ * del_timer - deactivate a timer.
+ * @timer: the timer to be deactivated
+ *
+ * del_timer() deactivates a timer - this works on both active and inactive
+ * timers.
+ *
+ * The function returns whether it has deactivated a pending timer or not.
+ * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
+ * active timer returns 1.)
+ */
+static inline int del_timer(struct timer_list *timer)
+{
+	return __del_timer(timer, false);
+}
+
+/**
+ * timer_shutdown - deactivate a timer and shut it down
+ * @timer: the timer to be deactivated
+ *
+ * timer_shutdown() deactivates a timer - this works on both active
+ * and inactive timers, and will prevent it from being rearmed.
+ *
+ * The function returns whether it has deactivated a pending timer or not.
+ * (ie. timer_shutdown() of an inactive timer returns 0,
+ *   timer_shutdown() of an active timer returns 1.)
+ */
+static inline int timer_shutdown(struct timer_list *timer)
+{
+	return __del_timer(timer, true);
+}
+
 /*
  * The jiffies value which is added to now, when there is no timer
  * in the timer wheel:
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 111a3550b3f2..7c224766065e 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -1240,18 +1240,7 @@  void add_timer_on(struct timer_list *timer, int cpu)
 }
 EXPORT_SYMBOL_GPL(add_timer_on);
 
-/**
- * del_timer - deactivate a timer.
- * @timer: the timer to be deactivated
- *
- * del_timer() deactivates a timer - this works on both active and inactive
- * timers.
- *
- * The function returns whether it has deactivated a pending timer or not.
- * (ie. del_timer() of an inactive timer returns 0, del_timer() of an
- * active timer returns 1.)
- */
-int del_timer(struct timer_list *timer)
+int __del_timer(struct timer_list *timer, bool free)
 {
 	struct timer_base *base;
 	unsigned long flags;
@@ -1262,12 +1251,18 @@  int del_timer(struct timer_list *timer)
 	if (timer_pending(timer)) {
 		base = lock_timer_base(timer, &flags);
 		ret = detach_if_pending(timer, base, true);
+		if (free)
+			timer->function = NULL;
+		raw_spin_unlock_irqrestore(&base->lock, flags);
+	} else if (free) {
+		base = lock_timer_base(timer, &flags);
+		timer->function = NULL;
 		raw_spin_unlock_irqrestore(&base->lock, flags);
 	}
 
 	return ret;
 }
-EXPORT_SYMBOL(del_timer);
+EXPORT_SYMBOL(__del_timer);
 
 static int __try_to_del_timer_sync(struct timer_list *timer, bool free)
 {