Message ID | 20221128132100.30253-10-ricardo.neri-calderon@linux.intel.com
---|---
State | New
Series | sched: Introduce IPC classes for load balance
From | Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
To | "Peter Zijlstra (Intel)" <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>
Subject | [PATCH v2 09/22] sched/fair: Use IPC class score to select a busiest runqueue
Date | Mon, 28 Nov 2022 05:20:47 -0800
In-Reply-To | <20221128132100.30253-1-ricardo.neri-calderon@linux.intel.com>
References | <20221128132100.30253-1-ricardo.neri-calderon@linux.intel.com>
Commit Message
Ricardo Neri
Nov. 28, 2022, 1:20 p.m. UTC
For two runqueues of equal priority and equal number of running tasks,
select the one whose current task would have the highest IPC class score
if placed on the destination CPU.
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: Len Brown <len.brown@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim C. Chen <tim.c.chen@intel.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
* Fixed a bug when selecting a busiest runqueue: when comparing two
runqueues with equal nr_running, we must compute the IPCC score delta
of both.
* Renamed local variables to improve the layout of the code block.
(PeterZ)
* Used the new interface names.
---
kernel/sched/fair.c | 54 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 53 insertions(+), 1 deletion(-)
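As a concrete illustration of the selection policy above, here is a small standalone sketch that mimics the tie-break with a made-up score table. The classes, CPU numbers, and score values are all hypothetical; in the kernel the per-class, per-CPU scores come from arch_get_ipcc_score().

/*
 * Standalone illustration of the tie-break policy. All score values,
 * class IDs and CPU numbers here are hypothetical.
 */
#include <stdio.h>

/* Hypothetical score table: score[class][cpu], higher is better. */
static const int score[2][4] = {
        { 2, 2, 4, 4 },  /* class 0 gains little from CPUs 2-3 */
        { 1, 1, 8, 8 },  /* class 1 gains a lot from CPUs 2-3 */
};

/* Score delta a task of @class running on @src would see on @dst. */
static int score_delta(int class, int src, int dst)
{
        return score[class][dst] - score[class][src];
}

int main(void)
{
        /*
         * Two candidate runqueues with equal nr_running: rq A runs a
         * class-0 task on CPU 0, rq B runs a class-1 task on CPU 1.
         * The destination CPU of the load balance is CPU 2.
         */
        int delta_a = score_delta(0, 0, 2);     /* 4 - 2 = 2 */
        int delta_b = score_delta(1, 1, 2);     /* 8 - 1 = 7 */

        /* rq B's current task benefits more, so B is picked as busiest. */
        printf("busiest: rq %s\n", delta_b > delta_a ? "B" : "A");
        return 0;
}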
Comments
Hi,

On Monday 28 Nov 2022 at 05:20:47 (-0800), Ricardo Neri wrote:
> For two runqueues of equal priority and equal number of running of tasks,
> select the one whose current task would have the highest IPC class score
> if placed on the destination CPU.
>
[..]
> +static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
> +{
> +	unsigned long ipcc = p->ipcc;
> +
> +	if (!sched_ipcc_enabled())
> +		return INT_MIN;
> +
> +	return arch_get_ipcc_score(ipcc, alt_cpu) -
> +	       arch_get_ipcc_score(ipcc, task_cpu(p));

Nit: arch_get_ipcc_score() return values are never checked for error.

> +}
> +
> #else /* CONFIG_IPC_CLASSES */
> static void update_sg_lb_ipcc_stats(struct sg_lb_ipcc_stats *sgcs,
>				     struct rq *rq)
> @@ -9258,6 +9276,11 @@ static bool sched_asym_ipcc_pick(struct sched_group *a,
> 	return false;
> }
>
> +static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
> +{
> +	return INT_MIN;
> +}
> +
> #endif /* CONFIG_IPC_CLASSES */
>
> /**
> @@ -10419,8 +10442,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> {
> 	struct rq *busiest = NULL, *rq;
> 	unsigned long busiest_util = 0, busiest_load = 0, busiest_capacity = 1;
> +	int i, busiest_ipcc_delta = INT_MIN;
> 	unsigned int busiest_nr = 0;
> -	int i;
>
> 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
> 		unsigned long capacity, load, util;
> @@ -10526,8 +10549,37 @@ static struct rq *find_busiest_queue(struct lb_env *env,
>
> 		case migrate_task:
> 			if (busiest_nr < nr_running) {
> +				struct task_struct *curr;
> +
> 				busiest_nr = nr_running;
> 				busiest = rq;
> +
> +				/*
> +				 * Remember the IPC score delta of busiest::curr.
> +				 * We may need it to break a tie with other queues
> +				 * with equal nr_running.
> +				 */
> +				curr = rcu_dereference(busiest->curr);
> +				busiest_ipcc_delta = ipcc_score_delta(curr,
> +								      env->dst_cpu);
> +			/*
> +			 * If rq and busiest have the same number of running
> +			 * tasks, pick rq if doing so would give rq::curr a
> +			 * bigger IPC boost on dst_cpu.
> +			 */
> +			} else if (sched_ipcc_enabled() &&
> +				   busiest_nr == nr_running) {
> +				struct task_struct *curr;
> +				int delta;
> +
> +				curr = rcu_dereference(rq->curr);
> +				delta = ipcc_score_delta(curr, env->dst_cpu);
> +
> +				if (busiest_ipcc_delta < delta) {
> +					busiest_ipcc_delta = delta;
> +					busiest_nr = nr_running;
> +					busiest = rq;
> +				}
> 			}
> 			break;
>

While in the commit message you describe this as breaking a tie for
asym_packing, the code here does not only affect asym_packing. If
another architecture would have sched_ipcc_enabled() it would use this
as generic policy, and that might not be desired.

Hope it helps,
Ionela.

> --
> 2.25.1
>
On Thu, Dec 08, 2022 at 08:51:03AM +0000, Ionela Voinescu wrote:
> Hi,
>
> On Monday 28 Nov 2022 at 05:20:47 (-0800), Ricardo Neri wrote:
> > For two runqueues of equal priority and equal number of running of tasks,
> > select the one whose current task would have the highest IPC class score
> > if placed on the destination CPU.
> >
> [..]
> > +static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
> > +{
> > +	unsigned long ipcc = p->ipcc;
> > +
> > +	if (!sched_ipcc_enabled())
> > +		return INT_MIN;
> > +
> > +	return arch_get_ipcc_score(ipcc, alt_cpu) -
> > +	       arch_get_ipcc_score(ipcc, task_cpu(p));
>
> Nit: arch_get_ipcc_score() return values are never checked for error.

Fair point. I will handle error values.

> > +}
> > +
> > #else /* CONFIG_IPC_CLASSES */
> > static void update_sg_lb_ipcc_stats(struct sg_lb_ipcc_stats *sgcs,
> >				      struct rq *rq)
> > @@ -9258,6 +9276,11 @@ static bool sched_asym_ipcc_pick(struct sched_group *a,
> > 	return false;
> > }
> >
> > +static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
> > +{
> > +	return INT_MIN;
> > +}
> > +
> > #endif /* CONFIG_IPC_CLASSES */
> >
> > /**
> > @@ -10419,8 +10442,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > {
> > 	struct rq *busiest = NULL, *rq;
> > 	unsigned long busiest_util = 0, busiest_load = 0, busiest_capacity = 1;
> > +	int i, busiest_ipcc_delta = INT_MIN;
> > 	unsigned int busiest_nr = 0;
> > -	int i;
> >
> > 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
> > 		unsigned long capacity, load, util;
> > @@ -10526,8 +10549,37 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> >
> > 		case migrate_task:
> > 			if (busiest_nr < nr_running) {
> > +				struct task_struct *curr;
> > +
> > 				busiest_nr = nr_running;
> > 				busiest = rq;
> > +
> > +				/*
> > +				 * Remember the IPC score delta of busiest::curr.
> > +				 * We may need it to break a tie with other queues
> > +				 * with equal nr_running.
> > +				 */
> > +				curr = rcu_dereference(busiest->curr);
> > +				busiest_ipcc_delta = ipcc_score_delta(curr,
> > +								      env->dst_cpu);
> > +			/*
> > +			 * If rq and busiest have the same number of running
> > +			 * tasks, pick rq if doing so would give rq::curr a
> > +			 * bigger IPC boost on dst_cpu.
> > +			 */
> > +			} else if (sched_ipcc_enabled() &&
> > +				   busiest_nr == nr_running) {
> > +				struct task_struct *curr;
> > +				int delta;
> > +
> > +				curr = rcu_dereference(rq->curr);
> > +				delta = ipcc_score_delta(curr, env->dst_cpu);
> > +
> > +				if (busiest_ipcc_delta < delta) {
> > +					busiest_ipcc_delta = delta;
> > +					busiest_nr = nr_running;
> > +					busiest = rq;
> > +				}
> > 			}
> > 			break;
> >
>
> While in the commit message you describe this as breaking a tie for
> asym_packing,

Are you referring to the overall series or this specific patch? I checked
commit message and I do not see references to asym_packing.

> the code here does not only affect asym_packing. If
> another architecture would have sched_ipcc_enabled() it would use this
> as generic policy, and that might not be desired.

Indeed, the patchset implements support to use IPCC classes for asym_packing,
but it is not limited to it.

It is true that I don't check here for asym_packing, but it should not be a
problem, IMO. I compare two runqueues with equal nr_running, either runqueue
is a good choice. This tie breaker is an overall improvement, no?

Thanks and BR,
Ricardo
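A possible shape for the error handling Ricardo agrees to add. This sketch assumes arch_get_ipcc_score() reports failure as a negative value; that convention is an assumption made here for illustration, not something this patch specifies.

/*
 * Hedged sketch only. Assumes arch_get_ipcc_score() returns a negative
 * value on error. Any error makes the delta unusable, so fall back to
 * INT_MIN, which callers already treat as "no IPCC information".
 */
static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
{
        unsigned long ipcc = p->ipcc;
        int score_dst, score_src;

        if (!sched_ipcc_enabled())
                return INT_MIN;

        score_dst = arch_get_ipcc_score(ipcc, alt_cpu);
        if (score_dst < 0)
                return INT_MIN;

        score_src = arch_get_ipcc_score(ipcc, task_cpu(p));
        if (score_src < 0)
                return INT_MIN;

        return score_dst - score_src;
}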
Hi Ricardo,

On Tuesday 13 Dec 2022 at 16:32:43 (-0800), Ricardo Neri wrote:
[..]
> > > /**
> > > @@ -10419,8 +10442,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > {
> > > 	struct rq *busiest = NULL, *rq;
> > > 	unsigned long busiest_util = 0, busiest_load = 0, busiest_capacity = 1;
> > > +	int i, busiest_ipcc_delta = INT_MIN;
> > > 	unsigned int busiest_nr = 0;
> > > -	int i;
> > >
> > > 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
> > > 		unsigned long capacity, load, util;
> > > @@ -10526,8 +10549,37 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > >
> > > 		case migrate_task:
> > > 			if (busiest_nr < nr_running) {
> > > +				struct task_struct *curr;
> > > +
> > > 				busiest_nr = nr_running;
> > > 				busiest = rq;
> > > +
> > > +				/*
> > > +				 * Remember the IPC score delta of busiest::curr.
> > > +				 * We may need it to break a tie with other queues
> > > +				 * with equal nr_running.
> > > +				 */
> > > +				curr = rcu_dereference(busiest->curr);
> > > +				busiest_ipcc_delta = ipcc_score_delta(curr,
> > > +								      env->dst_cpu);
> > > +			/*
> > > +			 * If rq and busiest have the same number of running
> > > +			 * tasks, pick rq if doing so would give rq::curr a
> > > +			 * bigger IPC boost on dst_cpu.
> > > +			 */
> > > +			} else if (sched_ipcc_enabled() &&
> > > +				   busiest_nr == nr_running) {
> > > +				struct task_struct *curr;
> > > +				int delta;
> > > +
> > > +				curr = rcu_dereference(rq->curr);
> > > +				delta = ipcc_score_delta(curr, env->dst_cpu);
> > > +
> > > +				if (busiest_ipcc_delta < delta) {
> > > +					busiest_ipcc_delta = delta;
> > > +					busiest_nr = nr_running;
> > > +					busiest = rq;
> > > +				}
> > > 			}
> > > 			break;
> > >
> >
> > While in the commit message you describe this as breaking a tie for
> > asym_packing,
>
> Are you referring to the overall series or this specific patch? I checked
> commit message and I do not see references to asym_packing.

Sorry, my bad, I was thinking about the cover letter, not the commit
message. It's under "+++ Balancing load using classes of tasks. Theory
of operation".

>
> > the code here does not only affect asym_packing. If
> > another architecture would have sched_ipcc_enabled() it would use this
> > as generic policy, and that might not be desired.
>
> Indeed, the patchset implements support to use IPCC classes for asym_packing,
> but it is not limited to it.
>

So is your current intention to support IPC classes only for asym_packing
for now? What would be the impact on you if you were to limit the
functionality in this patch to asym_packing only?

> It is true that I don't check here for asym_packing, but it should not be a
> problem, IMO. I compare two runqueues with equal nr_running, either runqueue
> is a good choice. This tie breaker is an overall improvement, no?
>

It could be, but equally there could be other better policies as well -
other ways to consider IPC class information to break the tie.

If other architectures start having sched_ipcc_enabled() they would
automatically use the policy you've decided on here. If other policies
are better for those architectures this generic policy would be difficult
to modify to ensure there are no regressions for all other architectures
that use it, or it would be difficult to work around it.

For this and for future support of IPC classes I am just wondering if we
can better design how we enable different architectures to have different
policies.

Thanks,
Ionela.

> Thanks and BR,
> Ricardo
On Wed, Dec 14, 2022 at 11:16:39PM +0000, Ionela Voinescu wrote:
> Hi Ricardo,
>
> On Tuesday 13 Dec 2022 at 16:32:43 (-0800), Ricardo Neri wrote:
> [..]
> > > > /**
> > > > @@ -10419,8 +10442,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > > {
> > > > 	struct rq *busiest = NULL, *rq;
> > > > 	unsigned long busiest_util = 0, busiest_load = 0, busiest_capacity = 1;
> > > > +	int i, busiest_ipcc_delta = INT_MIN;
> > > > 	unsigned int busiest_nr = 0;
> > > > -	int i;
> > > >
> > > > 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
> > > > 		unsigned long capacity, load, util;
> > > > @@ -10526,8 +10549,37 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > >
> > > > 		case migrate_task:
> > > > 			if (busiest_nr < nr_running) {
> > > > +				struct task_struct *curr;
> > > > +
> > > > 				busiest_nr = nr_running;
> > > > 				busiest = rq;
> > > > +
> > > > +				/*
> > > > +				 * Remember the IPC score delta of busiest::curr.
> > > > +				 * We may need it to break a tie with other queues
> > > > +				 * with equal nr_running.
> > > > +				 */
> > > > +				curr = rcu_dereference(busiest->curr);
> > > > +				busiest_ipcc_delta = ipcc_score_delta(curr,
> > > > +								      env->dst_cpu);
> > > > +			/*
> > > > +			 * If rq and busiest have the same number of running
> > > > +			 * tasks, pick rq if doing so would give rq::curr a
> > > > +			 * bigger IPC boost on dst_cpu.
> > > > +			 */
> > > > +			} else if (sched_ipcc_enabled() &&
> > > > +				   busiest_nr == nr_running) {
> > > > +				struct task_struct *curr;
> > > > +				int delta;
> > > > +
> > > > +				curr = rcu_dereference(rq->curr);
> > > > +				delta = ipcc_score_delta(curr, env->dst_cpu);
> > > > +
> > > > +				if (busiest_ipcc_delta < delta) {
> > > > +					busiest_ipcc_delta = delta;
> > > > +					busiest_nr = nr_running;
> > > > +					busiest = rq;
> > > > +				}
> > > > 			}
> > > > 			break;
> > > >
> > >
> > > While in the commit message you describe this as breaking a tie for
> > > asym_packing,
> >
> > Are you referring to the overall series or this specific patch? I checked
> > commit message and I do not see references to asym_packing.
>
> Sorry, my bad, I was thinking about the cover letter, not the commit
> message. It's under "+++ Balancing load using classes of tasks. Theory
> of operation".
>
> >
> > > the code here does not only affect asym_packing. If
> > > another architecture would have sched_ipcc_enabled() it would use this
> > > as generic policy, and that might not be desired.
> >
> > Indeed, the patchset implements support to use IPCC classes for asym_packing,
> > but it is not limited to it.
> >
>
> So is your current intention to support IPC classes only for asym_packing
> for now?

My intention is to introduce IPC classes in general and make it available
to other policies or architectures. I use asym_packing as use case.

> What would be the impact on you if you were to limit the
> functionality in this patch to asym_packing only?

There would not be any adverse impact.

>
> > It is true that I don't check here for asym_packing, but it should not be a
> > problem, IMO. I compare two runqueues with equal nr_running, either runqueue
> > is a good choice. This tie breaker is an overall improvement, no?
> >
>
> It could be, but equally there could be other better policies as well -
> other ways to consider IPC class information to break the tie.
>
> If other architectures start having sched_ipcc_enabled() they would
> automatically use the policy you've decided on here. If other policies
> are better for those architectures this generic policy would be difficult
> to modify to ensure there are no regressions for all other architectures
> that use it, or it would be difficult to work around it.
>
> For this and for future support of IPC classes I am just wondering if we
> can better design how we enable different architectures to have different
> policies.

I see your point. I agree that other architectures may want to implement
policies differently. I'll add an extra check for env->sd & SD_ASYM_PACKING.

Thanks and BR,
Ricardo
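The extra check agreed on above could look roughly like the sketch below. The helper name is invented for illustration; only the SD_ASYM_PACKING flag test itself comes from the discussion, and its exact placement may differ in the next revision.

/*
 * Hypothetical helper (the name is made up): gate the IPCC tie-break
 * on the sched domain doing asym_packing, so that other architectures
 * with sched_ipcc_enabled() are not silently opted in to this policy.
 */
static bool ipcc_tie_break_allowed(struct lb_env *env)
{
        return sched_ipcc_enabled() && (env->sd->flags & SD_ASYM_PACKING);
}

With such a helper, the else-if in find_busiest_queue() would test ipcc_tie_break_allowed(env) && busiest_nr == nr_running rather than sched_ipcc_enabled() alone.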
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e8b181c31842..113470bbd7a5 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9233,6 +9233,24 @@ static bool sched_asym_ipcc_pick(struct sched_group *a,
 	return sched_asym_ipcc_prefer(a_stats, b_stats);
 }
 
+/**
+ * ipcc_score_delta - Get the IPCC score delta on a different CPU
+ * @p:		A task
+ * @alt_cpu:	A prospective CPU to place @p
+ *
+ * Returns: The IPCC score delta that @p would get if placed on @alt_cpu
+ */
+static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
+{
+	unsigned long ipcc = p->ipcc;
+
+	if (!sched_ipcc_enabled())
+		return INT_MIN;
+
+	return arch_get_ipcc_score(ipcc, alt_cpu) -
+	       arch_get_ipcc_score(ipcc, task_cpu(p));
+}
+
 #else /* CONFIG_IPC_CLASSES */
 static void update_sg_lb_ipcc_stats(struct sg_lb_ipcc_stats *sgcs,
 				    struct rq *rq)
@@ -9258,6 +9276,11 @@ static bool sched_asym_ipcc_pick(struct sched_group *a,
 	return false;
 }
 
+static int ipcc_score_delta(struct task_struct *p, int alt_cpu)
+{
+	return INT_MIN;
+}
+
 #endif /* CONFIG_IPC_CLASSES */
 
 /**
@@ -10419,8 +10442,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 {
 	struct rq *busiest = NULL, *rq;
 	unsigned long busiest_util = 0, busiest_load = 0, busiest_capacity = 1;
+	int i, busiest_ipcc_delta = INT_MIN;
 	unsigned int busiest_nr = 0;
-	int i;
 
 	for_each_cpu_and(i, sched_group_span(group), env->cpus) {
 		unsigned long capacity, load, util;
@@ -10526,8 +10549,37 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 
 		case migrate_task:
 			if (busiest_nr < nr_running) {
+				struct task_struct *curr;
+
 				busiest_nr = nr_running;
 				busiest = rq;
+
+				/*
+				 * Remember the IPC score delta of busiest::curr.
+				 * We may need it to break a tie with other queues
+				 * with equal nr_running.
+				 */
+				curr = rcu_dereference(busiest->curr);
+				busiest_ipcc_delta = ipcc_score_delta(curr,
+								      env->dst_cpu);
+			/*
+			 * If rq and busiest have the same number of running
+			 * tasks, pick rq if doing so would give rq::curr a
+			 * bigger IPC boost on dst_cpu.
+			 */
+			} else if (sched_ipcc_enabled() &&
+				   busiest_nr == nr_running) {
+				struct task_struct *curr;
+				int delta;
+
+				curr = rcu_dereference(rq->curr);
+				delta = ipcc_score_delta(curr, env->dst_cpu);
+
+				if (busiest_ipcc_delta < delta) {
+					busiest_ipcc_delta = delta;
+					busiest_nr = nr_running;
+					busiest = rq;
+				}
 			}
 			break;
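For readers of the archive, a toy sketch of the architecture hook the code above depends on. The signature is inferred from the call sites in ipcc_score_delta() and the table is invented; this is not the series' actual implementation, which derives scores from hardware feedback (Intel Thread Director on x86).

/*
 * Toy sketch, not the series' implementation. The signature is
 * inferred from the call sites above; EXAMPLE_NR_CLASSES and
 * example_scores[][] are hypothetical. A negative return signals an
 * error, matching the hedged error-handling sketch earlier in the
 * thread.
 */
#define EXAMPLE_NR_CLASSES	4

static int example_scores[EXAMPLE_NR_CLASSES][NR_CPUS];

int arch_get_ipcc_score(unsigned long ipcc, int cpu)
{
	if (ipcc >= EXAMPLE_NR_CLASSES || cpu < 0 || cpu >= nr_cpu_ids)
		return -EINVAL;

	return example_scores[ipcc][cpu];
}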