[REPOST,2/2] signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT.

Message ID 20230606085524.2049961-3-bigeasy@linutronix.de
State New
Series signal: Avoid preempt_disable() in ptrace_stop() on PREEMPT_RT

Commit Message

Sebastian Andrzej Siewior June 6, 2023, 8:55 a.m. UTC
  On PREEMPT_RT, keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires css_set_lock,
which is a sleeping lock on PREEMPT_RT and must not be acquired with preemption
disabled.
The preempt-disabled section exists only as a performance optimisation
and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/signal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)
  

Comments

Oleg Nesterov June 6, 2023, 11:04 a.m. UTC | #1
The patch LGTM, but I am a bit confused by the changelog/comments,
I guess I missed something...

On 06/06, Sebastian Andrzej Siewior wrote:
>
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
>  	 * The preempt-disable section ensures that there will be no preemption
>  	 * between unlock and schedule() and so improving the performance since
>  	 * the ptracer has no reason to sleep.
> +	 *
> +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> +	 * within the preempt-disable section.
>  	 */
> -	preempt_disable();
> +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> +		preempt_disable();

Not only do we have the problems with cgroup_enter_frozen(); afaics (please correct me)
this optimisation doesn't work on RT anyway?

IIUC, read_lock() on RT disables migration but not preemption, so it is simply
too late to do preempt_disable() before unlock/schedule. The tracer can preempt
the tracee right after do_notify_parent_cldstop().

Oleg.
  
Peter Zijlstra June 6, 2023, 11:14 a.m. UTC | #2
On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> The patch LGTM, but I am a bit confused by the changelog/comments,
> I guess I missed something...
> 
> On 06/06, Sebastian Andrzej Siewior wrote:
> >
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> >  	 * The preempt-disable section ensures that there will be no preemption
> >  	 * between unlock and schedule() and so improving the performance since
> >  	 * the ptracer has no reason to sleep.
> > +	 *
> > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > +	 * within the preempt-disable section.
> >  	 */
> > -	preempt_disable();
> > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > +		preempt_disable();
> 
> Not only do we have the problems with cgroup_enter_frozen(); afaics (please correct me)
> this optimisation doesn't work on RT anyway?
> 
> IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> the tracee right after do_notify_parent_cldstop().

Correct -- but I think you can disable preemption over what is
effectively rwsem_up_read(), but you can't over the effective
rtmutex_lock() that cgroup_enter_frozen() will then attempt.

(iow, unlock() doesn't tend to sleep, while lock() does)

But you're correct to point out that the whole preempt_disable() thing
is entirely pointless due to the whole task_lock region being
preemptible before it.
  
Oleg Nesterov June 6, 2023, 11:38 a.m. UTC | #3
On 06/06, Peter Zijlstra wrote:
>
> On Tue, Jun 06, 2023 at 01:04:48PM +0200, Oleg Nesterov wrote:
> > The patch LGTM, but I am a bit confused by the changelog/comments,
> > I guess I missed something...
> >
> > On 06/06, Sebastian Andrzej Siewior wrote:
> > >
> > > --- a/kernel/signal.c
> > > +++ b/kernel/signal.c
> > > @@ -2328,11 +2328,16 @@ static int ptrace_stop(int exit_code, int why, unsigned long message,
> > >  	 * The preempt-disable section ensures that there will be no preemption
> > >  	 * between unlock and schedule() and so improving the performance since
> > >  	 * the ptracer has no reason to sleep.
> > > +	 *
> > > +	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
> > > +	 * within the preempt-disable section.
> > >  	 */
> > > -	preempt_disable();
> > > +	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> > > +		preempt_disable();
> >
> > Not only do we have the problems with cgroup_enter_frozen(); afaics (please correct me)
> > this optimisation doesn't work on RT anyway?
> >
> > IIUC, read_lock() on RT disables migration but not preemption, so it is simply
> > too late to do preempt_disable() before unlock/schedule. The tracer can preempt
> > the tracee right after do_notify_parent_cldstop().
>
> Correct -- but I think you can disable preemption over what is
> effectively rwsem_up_read(), but you can't over the effective
> rtmutex_lock() that cgroup_enter_frozen() will then attempt.
>
> (iow, unlock() doesn't tend to sleep, while lock() does)
>
> But you're correct to point out that the whole preempt_disable() thing
> is entirely pointless due to the whole task_lock region being
> preemptible before it.

Thanks Peter.

So I think the comment should be updated. Otherwise it looks as if it makes
sense to try to move cgroup_enter_frozen() up before preempt_disable().

Oleg.
  

Patch

diff --git a/kernel/signal.c b/kernel/signal.c
index da017a5461163..9e07b3075c72e 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2328,11 +2328,16 @@  static int ptrace_stop(int exit_code, int why, unsigned long message,
 	 * The preempt-disable section ensures that there will be no preemption
 	 * between unlock and schedule() and so improving the performance since
 	 * the ptracer has no reason to sleep.
+	 *
+	 * This optimisation is not doable on PREEMPT_RT due to the spinlock_t
+	 * within the preempt-disable section.
 	 */
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_disable();
 	read_unlock(&tasklist_lock);
 	cgroup_enter_frozen();
-	preempt_enable_no_resched();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		preempt_enable_no_resched();
 	schedule();
 	cgroup_leave_frozen(true);