From patchwork Fri Feb 17 14:53:02 2023
X-Patchwork-Submitter: Sebastian Andrzej Siewior
X-Patchwork-Id: 58623
Date: Fri, 17 Feb 2023 15:53:02 +0100
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org
Cc: Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Ingo Molnar,
    Juri Lelli, Mel Gorman, Peter Zijlstra, Steven Rostedt, Thomas Gleixner,
    Valentin Schneider, Vincent Guittot
Subject: [PATCH] sched: Consider task_struct::saved_state in wait_task_inactive().

wait_task_inactive() waits for a thread to unschedule while it is in a
certain task state. On PREEMPT_RT that state may be stored in
task_struct::saved_state while the thread being waited for blocks on a
sleeping lock and task_struct::__state is set to TASK_RTLOCK_WAIT.
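For illustration only (not part of the patch): a minimal, standalone C model
of the state juggling described above. The names toy_task, rtlock_save_state
and rtlock_restore_state are made up for this sketch; in the kernel the
equivalent work is done on task_struct by the PREEMPT_RT lock slowpath
(current_save_and_set_rtlock_wait_state() / current_restore_rtlock_saved_state()),
under pi_lock and with the proper ordering, which the toy code deliberately
omits. A second sketch after the patch models the new matching helpers.

#include <stdio.h>

/* Flag-style values in the kernel's spirit; they only matter for the printout. */
#define TASK_RUNNING		0x0000
#define __TASK_TRACED		0x0008
#define TASK_RTLOCK_WAIT	0x1000

/* Toy stand-in for the two task_struct fields this patch cares about. */
struct toy_task {
	unsigned int state;		/* models task_struct::__state     */
	unsigned int saved_state;	/* models task_struct::saved_state */
};

/* Blocking on an rtlock: stash the special state, advertise TASK_RTLOCK_WAIT. */
static void rtlock_save_state(struct toy_task *t)
{
	t->saved_state = t->state;
	t->state = TASK_RTLOCK_WAIT;
}

/* rtlock acquired: restore the special state, clear saved_state. */
static void rtlock_restore_state(struct toy_task *t)
{
	t->state = t->saved_state;
	t->saved_state = TASK_RUNNING;
}

int main(void)
{
	struct toy_task t = { .state = __TASK_TRACED, .saved_state = TASK_RUNNING };

	rtlock_save_state(&t);
	/* A waiter matching on __TASK_TRACED now only finds it in saved_state. */
	printf("blocked on rtlock: __state=%#x saved_state=%#x\n", t.state, t.saved_state);

	rtlock_restore_state(&t);
	printf("rtlock acquired:   __state=%#x saved_state=%#x\n", t.state, t.saved_state);
	return 0;
}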
It is not possible to check only for TASK_RTLOCK_WAIT to be sure that the
task is blocked on a sleeping lock, because during wake up (after the
sleeping lock has been acquired) the task state is set to TASK_RUNNING.
Once the task is on the CPU and has acquired the pi_lock it will reset the
state accordingly, but until then TASK_RUNNING will be observed (with the
desired state saved in saved_state).

On PREEMPT_RT, also check task_struct::saved_state if the desired match
was not found in task_struct::__state. If the state was found in
saved_state, wait until the task is idle and the state becomes visible in
task_struct::__state.

Signed-off-by: Sebastian Andrzej Siewior
Reviewed-by: Valentin Schneider
Signed-off-by: Peter Zijlstra (Intel)
Tested-by: Sebastian Andrzej Siewior
---
Repost of https://lore.kernel.org/Yt%2FpQAFQ1xKNK0RY@linutronix.de

 kernel/sched/core.c | 81 ++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 76 insertions(+), 5 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3266,6 +3266,76 @@ int migrate_swap(struct task_struct *cur
 }
 #endif /* CONFIG_NUMA_BALANCING */
 
+#ifdef CONFIG_PREEMPT_RT
+
+/*
+ * Consider:
+ *
+ *	set_special_state(X);
+ *
+ *	do_things()
+ *		// Somewhere in there is an rtlock that can be contended:
+ *		current_save_and_set_rtlock_wait_state();
+ *		[...]
+ *		schedule_rtlock(); (A)
+ *		[...]
+ *		current_restore_rtlock_saved_state();
+ *
+ *	schedule(); (B)
+ *
+ * If p->saved_state is anything else than TASK_RUNNING, then p blocked on an
+ * rtlock (A) *before* voluntarily calling into schedule() (B) after setting its
+ * state to X. For things like ptrace (X=TASK_TRACED), the task could have more
+ * work to do upon acquiring the lock in do_things() before whoever called
+ * wait_task_inactive() should return. IOW, we have to wait for:
+ *
+ *   p.saved_state = TASK_RUNNING
+ *   p.__state     = X
+ *
+ * which implies the task isn't blocked on an RT lock and got to schedule() (B).
+ *
+ * Also see comments in ttwu_state_match().
+ */
+
+static __always_inline bool state_mismatch(struct task_struct *p, unsigned int match_state)
+{
+	unsigned long flags;
+	bool mismatch;
+
+	raw_spin_lock_irqsave(&p->pi_lock, flags);
+	if (READ_ONCE(p->__state) & match_state)
+		mismatch = false;
+	else if (READ_ONCE(p->saved_state) & match_state)
+		mismatch = false;
+	else
+		mismatch = true;
+
+	raw_spin_unlock_irqrestore(&p->pi_lock, flags);
+	return mismatch;
+}
+static __always_inline bool state_match(struct task_struct *p, unsigned int match_state,
+					bool *wait)
+{
+	if (READ_ONCE(p->__state) & match_state)
+		return true;
+	if (READ_ONCE(p->saved_state) & match_state) {
+		*wait = true;
+		return true;
+	}
+	return false;
+}
+#else
+static __always_inline bool state_mismatch(struct task_struct *p, unsigned int match_state)
+{
+	return !(READ_ONCE(p->__state) & match_state);
+}
+static __always_inline bool state_match(struct task_struct *p, unsigned int match_state,
+					bool *wait)
+{
+	return (READ_ONCE(p->__state) & match_state);
+}
+#endif
+
 /*
  * wait_task_inactive - wait for a thread to unschedule.
  *
@@ -3284,7 +3354,7 @@ int migrate_swap(struct task_struct *cur
  */
 unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
 {
-	int running, queued;
+	bool running, wait;
 	struct rq_flags rf;
 	unsigned long ncsw;
 	struct rq *rq;
@@ -3310,7 +3380,7 @@ unsigned long wait_task_inactive(struct
 	 * is actually now running somewhere else!
 	 */
 	while (task_on_cpu(rq, p)) {
-		if (!(READ_ONCE(p->__state) & match_state))
+		if (state_mismatch(p, match_state))
 			return 0;
 		cpu_relax();
 	}
@@ -3323,9 +3393,10 @@ unsigned long wait_task_inactive(struct
 	rq = task_rq_lock(p, &rf);
 	trace_sched_wait_task(p);
 	running = task_on_cpu(rq, p);
-	queued = task_on_rq_queued(p);
+	wait = task_on_rq_queued(p);
 	ncsw = 0;
-	if (READ_ONCE(p->__state) & match_state)
+
+	if (state_match(p, match_state, &wait))
 		ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
 	task_rq_unlock(rq, p, &rf);
 
@@ -3355,7 +3426,7 @@ unsigned long wait_task_inactive(struct
 	 * running right now), it's preempted, and we should
 	 * yield - it could be a while.
 	 */
-	if (unlikely(queued)) {
+	if (unlikely(wait)) {
 		ktime_t to = NSEC_PER_SEC / HZ;
 
 		set_current_state(TASK_UNINTERRUPTIBLE);
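As promised above, a second standalone sketch (again not kernel code;
toy_task and toy_state_match are hypothetical names) walks the three cases
the PREEMPT_RT variant of the matching helper distinguishes: a match in
__state, a match only in saved_state (which pushes the caller into the
wait-and-retry path of wait_task_inactive()), and no match at all.

#include <stdbool.h>
#include <stdio.h>

#define TASK_RUNNING		0x0000
#define __TASK_TRACED		0x0008
#define TASK_RTLOCK_WAIT	0x1000

struct toy_task {
	unsigned int state;		/* task_struct::__state     */
	unsigned int saved_state;	/* task_struct::saved_state */
};

/*
 * Mirrors the logic of the PREEMPT_RT state_match() above: true if either
 * field matches, and *wait is set when only saved_state matched, i.e. the
 * task is still blocked on an rtlock and the caller has to keep waiting
 * until the state shows up in __state again.
 */
static bool toy_state_match(const struct toy_task *t, unsigned int match_state,
			    bool *wait)
{
	if (t->state & match_state)
		return true;
	if (t->saved_state & match_state) {
		*wait = true;
		return true;
	}
	return false;
}

int main(void)
{
	const struct toy_task cases[] = {
		{ .state = __TASK_TRACED,    .saved_state = TASK_RUNNING  }, /* match          */
		{ .state = TASK_RTLOCK_WAIT, .saved_state = __TASK_TRACED }, /* match, wait    */
		{ .state = TASK_RUNNING,     .saved_state = TASK_RUNNING  }, /* mismatch       */
	};

	for (unsigned int i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		bool wait = false;
		bool match = toy_state_match(&cases[i], __TASK_TRACED, &wait);

		printf("case %u: match=%d wait=%d\n", i, match, wait);
	}
	return 0;
}

Running it prints match=1 wait=0, match=1 wait=1, match=0 wait=0, which is
exactly the distinction wait_task_inactive() needs in order to decide
whether to return, keep waiting, or bail out.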