From patchwork Fri Oct 28 09:34:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vincent Guittot
X-Patchwork-Id: 12189
From: Vincent Guittot
To: mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 dietmar.eggemann@arm.com, rostedt@goodmis.org, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 linux-kernel@vger.kernel.org, parth@linux.ibm.com
Cc: qais.yousef@arm.com, chris.hyser@oracle.com, patrick.bellasi@matbug.net,
 David.Laight@aculab.com, pjt@google.com, pavel@ucw.cz, tj@kernel.org,
 qperret@google.com, tim.c.chen@linux.intel.com, joshdon@google.com,
 timj@gnu.org, kprateek.nayak@amd.com, yu.c.chen@intel.com,
 youssefesmat@chromium.org, joel@joelfernandes.org, Vincent Guittot
Subject: [PATCH v7 8/9] sched/fair: Add latency list
Date: Fri, 28 Oct 2022 11:34:02 +0200
Message-Id: <20221028093403.6673-9-vincent.guittot@linaro.org>
X-Mailer: git-send-email 2.17.1
In-Reply-To:
 <20221028093403.6673-1-vincent.guittot@linaro.org>
References: <20221028093403.6673-1-vincent.guittot@linaro.org>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Add an rb tree for latency-sensitive entities so that we can schedule the
most sensitive one first, even when it failed to preempt current at wakeup
or when it was quickly preempted by another entity of higher priority.

In order to keep fairness, the latency is used once at wakeup to get a
minimum slice, and not during the following scheduling slice, to prevent a
long-running entity from getting more running time than is allocated to its
nice priority.

The rb tree covers the last corner case, where a latency-sensitive entity
can't get scheduled quickly after the wakeup.

Signed-off-by: Vincent Guittot
---
 include/linux/sched.h |  2 +
 kernel/sched/fair.c   | 96 +++++++++++++++++++++++++++++++++++++++++--
 kernel/sched/sched.h  |  1 +
 3 files changed, 96 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index a74cad08e91e..0b92674e3664 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -547,6 +547,8 @@ struct sched_entity {
 	/* For load-balancing: */
 	struct load_weight load;
 	struct rb_node run_node;
+	struct rb_node latency_node;
+	unsigned int on_latency;
 	struct list_head group_node;
 	unsigned int on_rq;
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a7372f80b1ea..c28992b7d1a6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -664,7 +664,77 @@ struct sched_entity *__pick_last_entity(struct cfs_rq *cfs_rq)
 
 	return __node_2_se(last);
 }
+#endif
 
+/**************************************************************
+ * Scheduling class tree data structure manipulation methods:
+ * for latency
+ */
+
+static inline bool latency_before(struct sched_entity *a,
+				struct sched_entity *b)
+{
+	return (s64)(a->vruntime + a->latency_offset - b->vruntime - b->latency_offset) < 0;
+}
+
+#define __latency_node_2_se(node) \
+	rb_entry((node), struct sched_entity, latency_node)
+
+static inline bool __latency_less(struct rb_node *a, const struct rb_node *b)
+{
+	return latency_before(__latency_node_2_se(a), __latency_node_2_se(b));
+}
+
+/*
+ * Enqueue an entity into the latency rb-tree:
+ */
+static void __enqueue_latency(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
+{
+
+	/* Only latency sensitive entity can be added to the list */
+	if (se->latency_offset >= 0)
+		return;
+
+	if (se->on_latency)
+		return;
+
+	/*
+	 * An execution time less than sysctl_sched_min_granularity means that
+	 * the entity has been preempted by a higher sched class or an entity
+	 * with higher latency constraint.
+	 * Put it back in the list so it gets a chance to run 1st during the
+	 * next slice.
+	 */
+	if (!(flags & ENQUEUE_WAKEUP)) {
+		u64 delta_exec = se->sum_exec_runtime - se->prev_sum_exec_runtime;
+
+		if (delta_exec >= sysctl_sched_min_granularity)
+			return;
+	}
+
+	rb_add_cached(&se->latency_node, &cfs_rq->latency_timeline, __latency_less);
+	se->on_latency = 1;
+}
+
+static void __dequeue_latency(struct cfs_rq *cfs_rq, struct sched_entity *se)
+{
+	if (se->on_latency) {
+		rb_erase_cached(&se->latency_node, &cfs_rq->latency_timeline);
+		se->on_latency = 0;
+	}
+}
+
+static struct sched_entity *__pick_first_latency(struct cfs_rq *cfs_rq)
+{
+	struct rb_node *left = rb_first_cached(&cfs_rq->latency_timeline);
+
+	if (!left)
+		return NULL;
+
+	return __latency_node_2_se(left);
+}
+
+#ifdef CONFIG_SCHED_DEBUG
 /**************************************************************
  * Scheduling class statistics methods:
  */
@@ -4439,8 +4509,10 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	check_schedstat_required();
 	update_stats_enqueue_fair(cfs_rq, se, flags);
 	check_spread(cfs_rq, se);
-	if (!curr)
+	if (!curr) {
 		__enqueue_entity(cfs_rq, se);
+		__enqueue_latency(cfs_rq, se, flags);
+	}
 	se->on_rq = 1;
 
 	if (cfs_rq->nr_running == 1) {
@@ -4526,8 +4598,10 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 
 	clear_buddies(cfs_rq, se);
 
-	if (se != cfs_rq->curr)
+	if (se != cfs_rq->curr) {
 		__dequeue_entity(cfs_rq, se);
+		__dequeue_latency(cfs_rq, se);
+	}
 	se->on_rq = 0;
 	account_entity_dequeue(cfs_rq, se);
 
@@ -4616,6 +4690,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 */
 		update_stats_wait_end_fair(cfs_rq, se);
 		__dequeue_entity(cfs_rq, se);
+		__dequeue_latency(cfs_rq, se);
 		update_load_avg(cfs_rq, se, UPDATE_TG);
 	}
 
@@ -4654,7 +4729,7 @@ static struct sched_entity *
 pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 {
 	struct sched_entity *left = __pick_first_entity(cfs_rq);
-	struct sched_entity *se;
+	struct sched_entity *latency, *se;
 
 	/*
 	 * If curr is set we have to see if its left of the leftmost entity
@@ -4696,6 +4771,12 @@ pick_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 		se = cfs_rq->last;
 	}
 
+	/* Check for latency sensitive entity waiting for running */
+	latency = __pick_first_latency(cfs_rq);
+	if (latency && (latency != se) &&
+	    wakeup_preempt_entity(latency, se) < 1)
+		se = latency;
+
 	return se;
 }
 
@@ -4719,6 +4800,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 		update_stats_wait_start_fair(cfs_rq, prev);
 		/* Put 'current' back into the tree. */
 		__enqueue_entity(cfs_rq, prev);
+		__enqueue_latency(cfs_rq, prev, 0);
 		/* in !on_rq case, update occurred at dequeue */
 		update_load_avg(cfs_rq, prev, 0);
 	}
@@ -11712,6 +11794,7 @@ static void set_next_task_fair(struct rq *rq, struct task_struct *p, bool first)
 void init_cfs_rq(struct cfs_rq *cfs_rq)
 {
 	cfs_rq->tasks_timeline = RB_ROOT_CACHED;
+	cfs_rq->latency_timeline = RB_ROOT_CACHED;
 	u64_u32_store(cfs_rq->min_vruntime, (u64)(-(1LL << 20)));
 #ifdef CONFIG_SMP
 	raw_spin_lock_init(&cfs_rq->removed.lock);
@@ -12020,8 +12103,15 @@ int sched_group_set_latency(struct task_group *tg, s64 latency)
 
 	for_each_possible_cpu(i) {
 		struct sched_entity *se = tg->se[i];
+		struct rq *rq = cpu_rq(i);
+		struct rq_flags rf;
+
+		rq_lock_irqsave(rq, &rf);
 
+		__dequeue_latency(se->cfs_rq, se);
 		WRITE_ONCE(se->latency_offset, latency);
+
+		rq_unlock_irqrestore(rq, &rf);
 	}
 
 	mutex_unlock(&shares_mutex);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 95d4be4f3af6..91ec36c1158b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -599,6 +599,7 @@ struct cfs_rq {
 #endif
 
 	struct rb_root_cached tasks_timeline;
+	struct rb_root_cached latency_timeline;
 
 	/*
 	 * 'curr' points to currently running entity on this cfs_rq.
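
For readers who want to experiment with the ordering outside the kernel, here is a minimal userspace sketch of the two rules the patch relies on: the latency tree is keyed on vruntime + latency_offset using a wraparound-safe signed comparison, and an entity is only queued at wakeup or when it ran for less than the minimum granularity. The names toy_entity, toy_latency_before and toy_should_queue are hypothetical stand-ins and not kernel symbols; the on_latency check and the rb-tree itself are left out, so this is an illustration under those assumptions and not the scheduler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the few sched_entity fields the patch reads. */
struct toy_entity {
	uint64_t vruntime;		/* virtual runtime, allowed to wrap */
	int64_t latency_offset;		/* negative means latency sensitive */
	uint64_t sum_exec_runtime;	/* total execution time so far */
	uint64_t prev_sum_exec_runtime;	/* total when it was last picked */
};

/* Same ordering key as latency_before(): vruntime + latency_offset. */
static bool toy_latency_before(const struct toy_entity *a,
			       const struct toy_entity *b)
{
	uint64_t ka = a->vruntime + (uint64_t)a->latency_offset;
	uint64_t kb = b->vruntime + (uint64_t)b->latency_offset;

	/* The signed view of the unsigned difference survives wraparound. */
	return (int64_t)(ka - kb) < 0;
}

/*
 * Mirrors the filter at the top of __enqueue_latency(), minus the
 * on_latency check: only latency-sensitive entities qualify, and outside
 * of a wakeup only those that ran for less than min_granularity.
 */
static bool toy_should_queue(const struct toy_entity *se, bool wakeup,
			     uint64_t min_granularity)
{
	uint64_t delta_exec;

	if (se->latency_offset >= 0)
		return false;

	if (wakeup)
		return true;

	delta_exec = se->sum_exec_runtime - se->prev_sum_exec_runtime;
	return delta_exec < min_granularity;
}
```

With these helpers, an entity at vruntime 1000 with an offset of -500 sorts ahead of one at vruntime 800 with no offset, which is the effect the latency tree is meant to produce in pick_next_entity().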