From patchwork Mon Jun 5 19:16:21 2023
X-Patchwork-Id: 103465
Date: Mon, 05 Jun 2023 19:16:21 -0000
From: "tip-bot2 for Peter Zijlstra"
Sender: tip-bot2@linutronix.de
Reply-To: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Cc: "Peter Zijlstra (Intel)", x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [tip: sched/core] sched: Unconditionally use full-fat wait_task_inactive()
Message-ID: <168599258108.404.12910966685801669273.tip-bot2@tip-bot2>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:
Commit-ID:     d5e1586617be7093ea3419e3fa9387ed833cdbb1
Gitweb:        https://git.kernel.org/tip/d5e1586617be7093ea3419e3fa9387ed833cdbb1
Author:        Peter Zijlstra
AuthorDate:    Fri, 02 Jun 2023 10:42:53 +02:00
Committer:     Peter Zijlstra
CommitterDate: Mon, 05 Jun 2023 21:11:02 +02:00

sched: Unconditionally use full-fat wait_task_inactive()

While modifying wait_task_inactive() for PREEMPT_RT; the build robot
noted that UP got broken. This led to audit and consideration of the
UP implementation of wait_task_inactive().

It looks like the UP implementation is also broken for PREEMPT;
consider task_current_syscall() getting preempted between the two
calls to wait_task_inactive().

Therefore move the wait_task_inactive() implementation out of
CONFIG_SMP and unconditionally use it.

Signed-off-by: Peter Zijlstra (Intel)
Link: https://lkml.kernel.org/r/20230602103731.GA630648%40hirez.programming.kicks-ass.net
---
 include/linux/sched.h |   7 +-
 kernel/sched/core.c   | 216 ++++++++++++++++++++---------------------
 2 files changed, 110 insertions(+), 113 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index eed5d65..1292d38 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2006,15 +2006,12 @@ static __always_inline void scheduler_ipi(void)
          */
         preempt_fold_need_resched();
 }
-extern unsigned long wait_task_inactive(struct task_struct *, unsigned int match_state);
 #else
 static inline void scheduler_ipi(void) { }
-static inline unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
-{
-        return 1;
-}
 #endif
 
+extern unsigned long wait_task_inactive(struct task_struct *, unsigned int match_state);
+
 /*
  * Set thread flags in other task's structures.
  * See asm/thread_info.h for TIF_xxxx flags available:
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 944c3ae..810cf7d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2213,6 +2213,114 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
                 rq_clock_skip_update(rq);
 }
 
+/*
+ * wait_task_inactive - wait for a thread to unschedule.
+ *
+ * Wait for the thread to block in any of the states set in @match_state.
+ * If it changes, i.e. @p might have woken up, then return zero. When we
+ * succeed in waiting for @p to be off its CPU, we return a positive number
+ * (its total switch count). If a second call a short while later returns the
+ * same number, the caller can be sure that @p has remained unscheduled the
+ * whole time.
+ *
+ * The caller must ensure that the task *will* unschedule sometime soon,
+ * else this function might spin for a *long* time. This function can't
+ * be called with interrupts off, or it may introduce deadlock with
+ * smp_call_function() if an IPI is sent by the same process we are
+ * waiting to become inactive.
+ */
+unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
+{
+        int running, queued;
+        struct rq_flags rf;
+        unsigned long ncsw;
+        struct rq *rq;
+
+        for (;;) {
+                /*
+                 * We do the initial early heuristics without holding
+                 * any task-queue locks at all. We'll only try to get
+                 * the runqueue lock when things look like they will
+                 * work out!
+                 */
+                rq = task_rq(p);
+
+                /*
+                 * If the task is actively running on another CPU
+                 * still, just relax and busy-wait without holding
+                 * any locks.
+                 *
+                 * NOTE! Since we don't hold any locks, it's not
+                 * even sure that "rq" stays as the right runqueue!
+                 * But we don't care, since "task_on_cpu()" will
+                 * return false if the runqueue has changed and p
+                 * is actually now running somewhere else!
+                 */
+                while (task_on_cpu(rq, p)) {
+                        if (!(READ_ONCE(p->__state) & match_state))
+                                return 0;
+                        cpu_relax();
+                }
+
+                /*
+                 * Ok, time to look more closely! We need the rq
+                 * lock now, to be *sure*. If we're wrong, we'll
+                 * just go back and repeat.
+                 */
+                rq = task_rq_lock(p, &rf);
+                trace_sched_wait_task(p);
+                running = task_on_cpu(rq, p);
+                queued = task_on_rq_queued(p);
+                ncsw = 0;
+                if (READ_ONCE(p->__state) & match_state)
+                        ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
+                task_rq_unlock(rq, p, &rf);
+
+                /*
+                 * If it changed from the expected state, bail out now.
+                 */
+                if (unlikely(!ncsw))
+                        break;
+
+                /*
+                 * Was it really running after all now that we
+                 * checked with the proper locks actually held?
+                 *
+                 * Oops. Go back and try again..
+                 */
+                if (unlikely(running)) {
+                        cpu_relax();
+                        continue;
+                }
+
+                /*
+                 * It's not enough that it's not actively running,
+                 * it must be off the runqueue _entirely_, and not
+                 * preempted!
+                 *
+                 * So if it was still runnable (but just not actively
+                 * running right now), it's preempted, and we should
+                 * yield - it could be a while.
+                 */
+                if (unlikely(queued)) {
+                        ktime_t to = NSEC_PER_SEC / HZ;
+
+                        set_current_state(TASK_UNINTERRUPTIBLE);
+                        schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
+                        continue;
+                }
+
+                /*
+                 * Ahh, all good. It wasn't running, and it wasn't
+                 * runnable, which means that it will never become
+                 * running in the future either. We're all done!
+                 */
+                break;
+        }
+
+        return ncsw;
+}
+
 #ifdef CONFIG_SMP
 
 static void
@@ -3341,114 +3449,6 @@ out:
 }
 #endif /* CONFIG_NUMA_BALANCING */
 
-/*
- * wait_task_inactive - wait for a thread to unschedule.
- *
- * Wait for the thread to block in any of the states set in @match_state.
- * If it changes, i.e. @p might have woken up, then return zero. When we
- * succeed in waiting for @p to be off its CPU, we return a positive number
- * (its total switch count). If a second call a short while later returns the
- * same number, the caller can be sure that @p has remained unscheduled the
- * whole time.
- *
- * The caller must ensure that the task *will* unschedule sometime soon,
- * else this function might spin for a *long* time. This function can't
- * be called with interrupts off, or it may introduce deadlock with
- * smp_call_function() if an IPI is sent by the same process we are
- * waiting to become inactive.
- */
-unsigned long wait_task_inactive(struct task_struct *p, unsigned int match_state)
-{
-        int running, queued;
-        struct rq_flags rf;
-        unsigned long ncsw;
-        struct rq *rq;
-
-        for (;;) {
-                /*
-                 * We do the initial early heuristics without holding
-                 * any task-queue locks at all. We'll only try to get
-                 * the runqueue lock when things look like they will
-                 * work out!
-                 */
-                rq = task_rq(p);
-
-                /*
-                 * If the task is actively running on another CPU
-                 * still, just relax and busy-wait without holding
-                 * any locks.
-                 *
-                 * NOTE! Since we don't hold any locks, it's not
-                 * even sure that "rq" stays as the right runqueue!
-                 * But we don't care, since "task_on_cpu()" will
-                 * return false if the runqueue has changed and p
-                 * is actually now running somewhere else!
-                 */
-                while (task_on_cpu(rq, p)) {
-                        if (!(READ_ONCE(p->__state) & match_state))
-                                return 0;
-                        cpu_relax();
-                }
-
-                /*
-                 * Ok, time to look more closely! We need the rq
-                 * lock now, to be *sure*. If we're wrong, we'll
-                 * just go back and repeat.
-                 */
-                rq = task_rq_lock(p, &rf);
-                trace_sched_wait_task(p);
-                running = task_on_cpu(rq, p);
-                queued = task_on_rq_queued(p);
-                ncsw = 0;
-                if (READ_ONCE(p->__state) & match_state)
-                        ncsw = p->nvcsw | LONG_MIN; /* sets MSB */
-                task_rq_unlock(rq, p, &rf);
-
-                /*
-                 * If it changed from the expected state, bail out now.
-                 */
-                if (unlikely(!ncsw))
-                        break;
-
-                /*
-                 * Was it really running after all now that we
-                 * checked with the proper locks actually held?
-                 *
-                 * Oops. Go back and try again..
-                 */
-                if (unlikely(running)) {
-                        cpu_relax();
-                        continue;
-                }
-
-                /*
-                 * It's not enough that it's not actively running,
-                 * it must be off the runqueue _entirely_, and not
-                 * preempted!
-                 *
-                 * So if it was still runnable (but just not actively
-                 * running right now), it's preempted, and we should
-                 * yield - it could be a while.
-                 */
-                if (unlikely(queued)) {
-                        ktime_t to = NSEC_PER_SEC / HZ;
-
-                        set_current_state(TASK_UNINTERRUPTIBLE);
-                        schedule_hrtimeout(&to, HRTIMER_MODE_REL_HARD);
-                        continue;
-                }
-
-                /*
-                 * Ahh, all good. It wasn't running, and it wasn't
-                 * runnable, which means that it will never become
-                 * running in the future either. We're all done!
-                 */
-                break;
-        }
-
-        return ncsw;
-}
-
 /***
  * kick_process - kick a running thread to enter/exit the kernel
  * @p: the to-be-kicked thread
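
[Editor's note] The changelog's PREEMPT argument hinges on the two-call protocol
documented above wait_task_inactive(): a caller first waits for the task to be
off its CPU, inspects it, then calls wait_task_inactive() again and only trusts
the snapshot if the returned switch counts match. Below is a minimal caller-side
sketch of that pattern; the helper name probe_stopped_task() is hypothetical and
only for illustration, loosely modeled on the task_current_syscall() usage the
changelog cites, not taken from this patch.

#include <linux/sched.h>

/* Hypothetical illustration only; not part of this patch. */
static int probe_stopped_task(struct task_struct *target)
{
        unsigned int state = READ_ONCE(target->__state);
        unsigned long ncsw;

        if (!state)             /* target is runnable, nothing to wait for */
                return -EAGAIN;

        /* First call: wait for @target to be off its CPU; 0 means it changed state. */
        ncsw = wait_task_inactive(target, state);
        if (!ncsw)
                return -EAGAIN;

        /* ... inspect @target's stack/registers here ... */

        /*
         * Second call: if the switch count moved, @target ran in between and
         * the snapshot above cannot be trusted.
         */
        if (wait_task_inactive(target, state) != ncsw)
                return -EAGAIN;

        return 0;
}

With the old UP stub that unconditionally returned 1, the two calls always
"match" even when @target was preempted and ran between them, which is exactly
the hole the changelog describes; hence the stub is removed and the full
implementation is now used on UP as well.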