Message ID | 20231110125902.2152380-1-pierre.gondois@arm.com |
---|---|
State | New |
Headers |
From: Pierre Gondois <pierre.gondois@arm.com>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, Valentin Schneider <vschneid@redhat.com>
Subject: [PATCH] sched/fair: Use all little CPUs for CPU-bound workload
Date: Fri, 10 Nov 2023 13:59:02 +0100
Message-Id: <20231110125902.2152380-1-pierre.gondois@arm.com> |
Series |
sched/fair: Use all little CPUs for CPU-bound workload
|
|
Commit Message
Pierre Gondois
Nov. 10, 2023, 12:59 p.m. UTC
Running n CPU-bound tasks on an n-CPU platform with asymmetric CPU
capacity might result in a task placement where two tasks run on a
big CPU and none on a little CPU. A placement that uses all CPUs
would perform better.
Testing platform:
Juno-r2:
- 2 big CPUs (1-2), maximum capacity of 1024
- 4 little CPUs (0,3-5), maximum capacity of 383
Testing workload ([1]):
Spawn 6 CPU-bound tasks. During the first 100ms (step 1), each task
is affined to a CPU, except for:
- one little CPU, which is left idle.
- one big CPU, which has 2 tasks affined.
After the 100ms (step 2), remove the cpumask affinity.
Before patch:
During step 2, the load balancer running from the idle CPU tags sched
domains as:
- little CPUs: 'group_has_spare'. Indeed, 3 CPU-bound tasks run on a
4-CPU sched-domain, and the idle CPU provides enough spare
capacity.
- big CPUs: 'group_overloaded'. Indeed, 3 tasks run on a 2-CPU
sched-domain, so the following path is used:
group_is_overloaded()
\-if (sgs->sum_nr_running <= sgs->group_weight) return true;
The following path which would change the migration type to
'migrate_task' is not taken:
calculate_imbalance()
\-if (env->idle != CPU_NOT_IDLE && env->imbalance == 0)
as the local group has some spare capacity, so the imbalance
is not 0.
The migration type requested is 'migrate_util' and the busiest
runqueue is the big CPU's runqueue, which has 2 tasks (each with a
utilization of 512). The idle little CPU cannot pull one of these
tasks, as its capacity is too small for the task. The following path
is used:
detach_tasks()
\-case migrate_util:
\-if (util > env->imbalance) goto next;
After patch:
When the local group has spare capacity and the busiest group is at
least tagged as 'group_fully_busy', if the local group has more CPUs
than CFS tasks and the busiest group has more CFS tasks than CPUs,
request a 'migrate_task' type migration.
Improvement:
Running the testing workload [1] with the step 2 representing
a ~10s load for a big CPU:
Before patch: ~19.3s
After patch: ~18s (-6.7%)
The issue only happens at the DIE level on platforms that can have
'migrate_util' migration types, i.e. not on DynamIQ systems, where
SD_SHARE_PKG_RESOURCES is set.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
---
kernel/sched/fair.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
Comments
On 10/11/2023 13:59, Pierre Gondois wrote:
> Testing platform:
> Juno-r2:
> - 2 big CPUs (1-2), maximum capacity of 1024
> - 4 little CPUs (0,3-5), maximum capacity of 383
>
> Testing workload ([1]):
> Spawn 6 CPU-bound tasks. During the first 100ms (step 1), each task
> is affined to a CPU, except for:
> - one little CPU, which is left idle.
> - one big CPU, which has 2 tasks affined.
> After the 100ms (step 2), remove the cpumask affinity.

I used your workload on my Juno-r0 with LISA (rt-app overwrites the
mainline CPU capacity values [446 1024 1024 446 446 466] to
[675 1024 1024 675 675 675] to adapt to rt-app busy loop's instruction
mix). Here I can't see the issue you bring up. The two tasks sharing a
CPU have util_avg = ~512 and the load-balance of one task to the idle
little CPU is happening. I assume it's the diff in CPU capacity of the
little CPUs: 383 < 512 < 675?

> The idle little CPU cannot pull one of these tasks, as its capacity
> is too small for the task.

Ah, here you're describing the issue I mentioned above.

> The issue only happens at the DIE level on platforms that can have
> 'migrate_util' migration types, i.e. not on DynamIQ systems, where
> SD_SHARE_PKG_RESOURCES is set.

Right, mainline Arm DynamIQ should only have 1 SD level (MC). Android
might still be affected since they run MC and DIE.

[...]
Hi Pierre,

On Fri, 10 Nov 2023 at 13:59, Pierre Gondois <pierre.gondois@arm.com> wrote:
>
> - big CPUs: 'group_overloaded'. Indeed, 3 tasks run on a 2-CPU
> sched-domain, so the following path is used:
> group_is_overloaded()
> \-if (sgs->sum_nr_running <= sgs->group_weight) return true;

This reminds me of a similar discussion with Qais:
https://lore.kernel.org/lkml/20230716014125.139577-1-qyousef@layalina.io/

> After patch:
> When the local group has spare capacity and the busiest group is at
> least tagged as 'group_fully_busy', if the local group has more CPUs

the busiest group is more than 'group_fully_busy'

> than CFS tasks and the busiest group has more CFS tasks than CPUs,
> request a 'migrate_task' type migration.

[...]

> + * If local has an empty CPU and busiest is overloaded,

group_has_spare doesn't mean that the local group has an empty CPU, but
that one or more CPUs might be idle some of the time, which might not be
the case when the load balance happens.

> + if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
> +     local->sum_nr_running < local->group_weight &&
> +     busiest->sum_nr_running > busiest->group_weight) {
> +         env->migration_type = migrate_task;
> +         env->imbalance = 1;

I wonder if this is too aggressive. We can have cases where
(local->sum_nr_running < local->group_weight) at the time of the load
balance because one CPU can be shortly idle, and you will migrate a task
that will then compete with another one on a little core. So maybe you
should do something similar to the migrate_load case in detach_tasks(),
like:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fc8e9ced6aa8..3a04fa0f1eae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8977,7 +8977,7 @@ static int detach_tasks(struct lb_env *env)
                case migrate_util:
                        util = task_util_est(p);

-                       if (util > env->imbalance)
+                       if (shr_bound(util, env->sd->nr_balance_failed) > env->imbalance)
                                goto next;

                        env->imbalance -= util;
--

This should cover more intermediate cases and would benefit more
topologies and cases.
Hello Vincent,

On 11/17/23 15:17, Vincent Guittot wrote:
> This reminds me of a similar discussion with Qais:
> https://lore.kernel.org/lkml/20230716014125.139577-1-qyousef@layalina.io/

Yes indeed, this is exactly the same case, and the same backstory
actually.

> I wonder if this is too aggressive. We can have cases where
> (local->sum_nr_running < local->group_weight) at the time of the load
> balance because one CPU can be shortly idle, and you will migrate a
> task that will then compete with another one on a little core.

Ok right.

> So maybe you should do something similar to the migrate_load case in
> detach_tasks(), like:
>
> -                       if (util > env->imbalance)
> +                       if (shr_bound(util, env->sd->nr_balance_failed) > env->imbalance)
>
> This should cover more intermediate cases and would benefit more
> topologies and cases.

Your change also solves the issue. I'll try to see if this might raise
other issues.

Thanks for the suggestion,
Regards,
Pierre
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index df348aa55d3c..5a215c96d420 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10495,6 +10495,23 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s
 		env->imbalance = max(local->group_capacity, local->group_util) -
 				 local->group_util;
 
+		/*
+		 * On an asymmetric system with CPU-bound tasks, a
+		 * migrate_util balance might not be able to migrate a
+		 * task from a big to a little CPU, letting a little
+		 * CPU unused.
+		 * If local has an empty CPU and busiest is overloaded,
+		 * balance one task with a migrate_task migration type
+		 * instead.
+		 */
+		if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
+		    local->sum_nr_running < local->group_weight &&
+		    busiest->sum_nr_running > busiest->group_weight) {
+			env->migration_type = migrate_task;
+			env->imbalance = 1;
+			return;
+		}
+
 		/*
 		 * In some cases, the group's utilization is max or even
 		 * higher than capacity because of migrations but the