[09/45] posix-cpu-timers: Fix posix_cpu_timer_get() behaviour

Message ID 20230606142031.532247561@linutronix.de
State New
Headers
Series posix-timers: Cure inconsistencies and the SIG_IGN mess |

Commit Message

Thomas Gleixner June 6, 2023, 2:37 p.m. UTC
  timer_gettime() must return the remaining time to the next expiry of a
timer or 0 if the timer is not armed and no signal pending.

This has to be correct also for interval timers even if the signal is
pending or the timer has been created with SIGEV_NONE.

The posix CPU timer implementation fails for both cases as it does not
forward the timer in posix_cpu_timer_get() before calculating the expiry
time.

It neither clears the expiry value when a oneshot SIGEV_NONE timer expired
and returns 1nsec instead, which is only correct for timers with signals
when the signal delivery did not happen yet.

Aside of that posix_cpu_timer_set() pointlessly arms SIGEV_NONE timers
which are later disarmed when the initial expiry happens. That's bogus and
just keeping the process wide timer active for nothing.

Cure this by:

     1) Avoiding to arm SIGEV_NONE timers

     2) Forwarding interval timers in posix_cpu_timer_get()

     3) Taking SIGEV_NONE into account when a oneshot timer has expired

Make the update logic a separate function so it can be reused to simplify
posix_cpu_timer_set().

Fixes: ae1a78eecc45 ("cpu-timers: Return correct previous timer reload value")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/time/posix-cpu-timers.c |   96 +++++++++++++++++++++++------------------
 1 file changed, 54 insertions(+), 42 deletions(-)
  

Comments

Frederic Weisbecker June 26, 2023, 10:46 p.m. UTC | #1
On Tue, Jun 06, 2023 at 04:37:33PM +0200, Thomas Gleixner wrote:
> timer_gettime() must return the remaining time to the next expiry of a
> timer or 0 if the timer is not armed and no signal pending.
> 
> This has to be correct also for interval timers even if the signal is
> pending or the timer has been created with SIGEV_NONE.
> 
> The posix CPU timer implementation fails for both cases as it does not
> forward the timer in posix_cpu_timer_get() before calculating the expiry
> time.
> 
> It neither clears the expiry value when a oneshot SIGEV_NONE timer expired
> and returns 1nsec instead, which is only correct for timers with signals
> when the signal delivery did not happen yet.
> 
> Aside of that posix_cpu_timer_set() pointlessly arms SIGEV_NONE timers
> which are later disarmed when the initial expiry happens. That's bogus and
> just keeping the process wide timer active for nothing.
> 
> Cure this by:
> 
>      1) Avoiding to arm SIGEV_NONE timers
> 
>      2) Forwarding interval timers in posix_cpu_timer_get()
> 
>      3) Taking SIGEV_NONE into account when a oneshot timer has expired

This patch does too many things at once...

> 
> Make the update logic a separate function so it can be reused to simplify
> posix_cpu_timer_set().
> 
> Fixes: ae1a78eecc45 ("cpu-timers: Return correct previous timer reload value")
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/posix-cpu-timers.c |   96 +++++++++++++++++++++++------------------
>  1 file changed, 54 insertions(+), 42 deletions(-)
> 
> --- a/kernel/time/posix-cpu-timers.c
> +++ b/kernel/time/posix-cpu-timers.c
> @@ -785,45 +782,60 @@ static int posix_cpu_timer_set(struct k_
>  	return ret;
>  }
>  
> -static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp)
> +static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp, u64 now)
>  {
> -	clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
> -	struct cpu_timer *ctmr = &timer->it.cpu;
> -	u64 now, expires = cpu_timer_getexpires(ctmr);
> -	struct task_struct *p;
> -
> -	rcu_read_lock();
> -	p = cpu_timer_task_rcu(timer);
> -	if (!p)
> -		goto out;
> +	bool sigev_none = timer->it_sigev_notify == SIGEV_NONE;
> +	u64 expires;
>  
>  	/*
> -	 * Easy part: convert the reload time.
> +	 * Make sure that interval timers are moved forward for the
> +	 * following cases:
> +	 *  - SIGEV_NONE timers which are never armed
> +	 *  - Timers which expired, but the signal has not yet been
> +	 *    delivered
>  	 */
> -	itp->it_interval = ktime_to_timespec64(timer->it_interval);
> -
> -	if (!expires)
> -		goto out;
> +	expires = bump_cpu_timer(timer, now);

What if the expiration has been reached but we arrived here before
handle_posix_cpu_timers() had a chance? In that case the call to
bump_cpu_timer() may forward the timer and artificially create an
overrun / missed event.

Also we are not holding the sighand lock here. So even though the timer
is forwarded, it may still be picked up afterward by check_thread_timers()
based on its stalled previous expires value... This can create a discrepancy
between the overrun count and the actual events received, and perhaps other
funny things...

Thanks.

>  
>  	/*
> -	 * Sample the clock to take the difference with the expiry time.
> +	 * Interval timers cannot have a remaining time <= 0 because the
> +	 * forwarding guarantees to move them forward so that the next
> +	 * timer expiry is > @now.
>  	 */
> -	if (CPUCLOCK_PERTHREAD(timer->it_clock))
> -		now = cpu_clock_sample(clkid, p);
> -	else
> -		now = cpu_clock_sample_group(clkid, p, false);
> -
>  	if (now < expires) {
>  		itp->it_value = ns_to_timespec64(expires - now);
>  	} else {
>  		/*
> -		 * The timer should have expired already, but the firing
> -		 * hasn't taken place yet.  Say it's just about to expire.
> +		 * A single shot SIGEV_NONE timer must return 0, when it is
> +		 * expired! Timers which have a real signal delivery mode
> +		 * must return a remaining time greater than 0 because the
> +		 * signal has not yet been delivered.
>  		 */
> -		itp->it_value.tv_nsec = 1;
> -		itp->it_value.tv_sec = 0;
> +		if (!sigev_none)
> +			itp->it_value.tv_nsec = 1;
> +	}
> +}
  
Thomas Gleixner June 29, 2023, 6:14 p.m. UTC | #2
On Tue, Jun 27 2023 at 00:46, Frederic Weisbecker wrote:> On Tue, Jun 06, 2023 at 04:37:33PM +0200, Thomas Gleixner wrote:
>> Aside of that posix_cpu_timer_set() pointlessly arms SIGEV_NONE timers
>> which are later disarmed when the initial expiry happens. That's bogus and
>> just keeping the process wide timer active for nothing.
>> 
>> Cure this by:
>> 
>>      1) Avoiding to arm SIGEV_NONE timers
>> 
>>      2) Forwarding interval timers in posix_cpu_timer_get()
>> 
>>      3) Taking SIGEV_NONE into account when a oneshot timer has expired
>
> This patch does too many things at once...

Let me try again to split it up. I failed before ...

>> -static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp)
>> +static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp, u64 now)
>>  {
>> -	clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
>> -	struct cpu_timer *ctmr = &timer->it.cpu;
>> -	u64 now, expires = cpu_timer_getexpires(ctmr);
>> -	struct task_struct *p;
>> -
>> -	rcu_read_lock();
>> -	p = cpu_timer_task_rcu(timer);
>> -	if (!p)
>> -		goto out;
>> +	bool sigev_none = timer->it_sigev_notify == SIGEV_NONE;
>> +	u64 expires;
>>  
>>  	/*
>> -	 * Easy part: convert the reload time.
>> +	 * Make sure that interval timers are moved forward for the
>> +	 * following cases:
>> +	 *  - SIGEV_NONE timers which are never armed
>> +	 *  - Timers which expired, but the signal has not yet been
>> +	 *    delivered
>>  	 */
>> -	itp->it_interval = ktime_to_timespec64(timer->it_interval);
>> -
>> -	if (!expires)
>> -		goto out;
>> +	expires = bump_cpu_timer(timer, now);
>
> What if the expiration has been reached but we arrived here before
> handle_posix_cpu_timers() had a chance? In that case the call to
> bump_cpu_timer() may forward the timer and artificially create an
> overrun / missed event.

Bah. This clearly misses some conditionals so that it actually handles
the cases described in the comment above it...

Thanks,

        tglx
  

Patch

--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -584,12 +584,7 @@  static void cpu_timer_fire(struct k_itim
 {
 	struct cpu_timer *ctmr = &timer->it.cpu;
 
-	if ((timer->it_sigev_notify & ~SIGEV_THREAD_ID) == SIGEV_NONE) {
-		/*
-		 * User don't want any signal.
-		 */
-		cpu_timer_setexpires(ctmr, 0);
-	} else if (unlikely(timer->sigq == NULL)) {
+	if (unlikely(timer->sigq == NULL)) {
 		/*
 		 * This a special case for clock_nanosleep,
 		 * not a normal timer from sys_timer_create.
@@ -623,6 +618,7 @@  static void cpu_timer_fire(struct k_itim
 static int posix_cpu_timer_set(struct k_itimer *timer, int timer_flags,
 			       struct itimerspec64 *new, struct itimerspec64 *old)
 {
+	bool sigev_none = timer->it_sigev_notify == SIGEV_NONE;
 	clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
 	u64 old_expires, new_expires, old_incr, val;
 	struct cpu_timer *ctmr = &timer->it.cpu;
@@ -686,7 +682,7 @@  static int posix_cpu_timer_set(struct k_
 	if (CPUCLOCK_PERTHREAD(timer->it_clock))
 		val = cpu_clock_sample(clkid, p);
 	else
-		val = cpu_clock_sample_group(clkid, p, true);
+		val = cpu_clock_sample_group(clkid, p, !sigev_none);
 
 	if (old) {
 		if (old_expires == 0) {
@@ -723,19 +719,20 @@  static int posix_cpu_timer_set(struct k_
 		goto out;
 	}
 
-	if (new_expires != 0 && !(timer_flags & TIMER_ABSTIME)) {
+	/* Convert relative expiry time to absolute */
+	if (new_expires && !(timer_flags & TIMER_ABSTIME))
 		new_expires += val;
-	}
+
+	/* Set the new expiry time (might be 0) */
+	cpu_timer_setexpires(ctmr, new_expires);
 
 	/*
-	 * Install the new expiry time (or zero).
-	 * For a timer with no notification action, we don't actually
-	 * arm the timer (we'll just fake it for timer_gettime).
+	 * Arm the timer if it is not disabled, the new expiry value has
+	 * not yet expired and the timer requires signal delivery.
+	 * SIGEV_NONE timers are never armed.
 	 */
-	cpu_timer_setexpires(ctmr, new_expires);
-	if (new_expires != 0 && val < new_expires) {
+	if (!sigev_none && new_expires && val < new_expires)
 		arm_timer(timer, p);
-	}
 
 	unlock_task_sighand(p, &flags);
 	/*
@@ -754,7 +751,7 @@  static int posix_cpu_timer_set(struct k_
 	timer->it_overrun_last = 0;
 	timer->it_overrun = -1;
 
-	if (val >= new_expires) {
+	if (!sigev_none && val >= new_expires) {
 		if (new_expires != 0) {
 			/*
 			 * The designated time already passed, so we notify
@@ -785,45 +782,60 @@  static int posix_cpu_timer_set(struct k_
 	return ret;
 }
 
-static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp)
+static void __posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp, u64 now)
 {
-	clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
-	struct cpu_timer *ctmr = &timer->it.cpu;
-	u64 now, expires = cpu_timer_getexpires(ctmr);
-	struct task_struct *p;
-
-	rcu_read_lock();
-	p = cpu_timer_task_rcu(timer);
-	if (!p)
-		goto out;
+	bool sigev_none = timer->it_sigev_notify == SIGEV_NONE;
+	u64 expires;
 
 	/*
-	 * Easy part: convert the reload time.
+	 * Make sure that interval timers are moved forward for the
+	 * following cases:
+	 *  - SIGEV_NONE timers which are never armed
+	 *  - Timers which expired, but the signal has not yet been
+	 *    delivered
 	 */
-	itp->it_interval = ktime_to_timespec64(timer->it_interval);
-
-	if (!expires)
-		goto out;
+	expires = bump_cpu_timer(timer, now);
 
 	/*
-	 * Sample the clock to take the difference with the expiry time.
+	 * Interval timers cannot have a remaining time <= 0 because the
+	 * forwarding guarantees to move them forward so that the next
+	 * timer expiry is > @now.
 	 */
-	if (CPUCLOCK_PERTHREAD(timer->it_clock))
-		now = cpu_clock_sample(clkid, p);
-	else
-		now = cpu_clock_sample_group(clkid, p, false);
-
 	if (now < expires) {
 		itp->it_value = ns_to_timespec64(expires - now);
 	} else {
 		/*
-		 * The timer should have expired already, but the firing
-		 * hasn't taken place yet.  Say it's just about to expire.
+		 * A single shot SIGEV_NONE timer must return 0, when it is
+		 * expired! Timers which have a real signal delivery mode
+		 * must return a remaining time greater than 0 because the
+		 * signal has not yet been delivered.
 		 */
-		itp->it_value.tv_nsec = 1;
-		itp->it_value.tv_sec = 0;
+		if (!sigev_none)
+			itp->it_value.tv_nsec = 1;
+	}
+}
+
+static void posix_cpu_timer_get(struct k_itimer *timer, struct itimerspec64 *itp)
+{
+	clockid_t clkid = CPUCLOCK_WHICH(timer->it_clock);
+	struct cpu_timer *ctmr = &timer->it.cpu;
+	struct task_struct *p;
+	u64 now;
+
+	rcu_read_lock();
+	p = cpu_timer_task_rcu(timer);
+	if (p) {
+		itp->it_interval = ktime_to_timespec64(timer->it_interval);
+
+		if (cpu_timer_getexpires(ctmr)) {
+			if (CPUCLOCK_PERTHREAD(timer->it_clock))
+				now = cpu_clock_sample(clkid, p);
+			else
+				now = cpu_clock_sample_group(clkid, p, false);
+
+			__posix_cpu_timer_get(timer, itp, now);
+		}
 	}
-out:
 	rcu_read_unlock();
 }