Message ID | b20517e3986bfdde8a605afa19d144ec411c7a42.1683156492.git.tim.c.chen@linux.intel.com
State | New
Headers |
From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tim C Chen <tim.c.chen@linux.intel.com>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Ricardo Neri <ricardo.neri@intel.com>, "Ravi V. Shankar" <ravi.v.shankar@intel.com>, Ben Segall <bsegall@google.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Len Brown <len.brown@intel.com>, Mel Gorman <mgorman@suse.de>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>, Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>, Steven Rostedt <rostedt@goodmis.org>, Valentin Schneider <vschneid@redhat.com>, Ionela Voinescu <ionela.voinescu@arm.com>, x86@kernel.org, linux-kernel@vger.kernel.org, Shrikanth Hegde <sshegde@linux.vnet.ibm.com>, Srikar Dronamraju <srikar@linux.vnet.ibm.com>, naveen.n.rao@linux.vnet.ibm.com, Yicong Yang <yangyicong@hisilicon.com>, Barry Song <v-songbaohua@oppo.com>, Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Subject: [PATCH 4/6] sched/fair: Skip prefer sibling move between SMT group and non-SMT group
Date: Thu, 4 May 2023 09:09:54 -0700
Message-Id: <b20517e3986bfdde8a605afa19d144ec411c7a42.1683156492.git.tim.c.chen@linux.intel.com>
In-Reply-To: <cover.1683156492.git.tim.c.chen@linux.intel.com>
References: <cover.1683156492.git.tim.c.chen@linux.intel.com>
Series | Enable Cluster Scheduling for x86 Hybrid CPUs
Commit Message
Tim Chen
May 4, 2023, 4:09 p.m. UTC
From: Tim C Chen <tim.c.chen@linux.intel.com>

Do not try to move tasks between a non-SMT sched group and an SMT sched
group for "prefer sibling" load balance.
Let asym_active_balance_busiest() handle that case properly.
Otherwise we could get tasks bouncing back and forth between
the SMT sched group and the non-SMT sched group.

Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/fair.c | 4 ++++
 1 file changed, 4 insertions(+)
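The new condition relies on an asymmetric_groups() helper introduced earlier in this series. For readers without the rest of the series at hand, a minimal sketch of its presumed shape, inferred from the discussion below (the authoritative version lives in the earlier patch): two groups are asymmetric when exactly one of them consists of SMT siblings, i.e. only one carries SD_SHARE_CPUCAPACITY in its group flags.

/*
 * Sketch only -- inferred from this thread; the real helper is added
 * by an earlier patch in the series.  Two sched groups are
 * "asymmetric" when exactly one of them is made up of SMT siblings,
 * i.e. only one has SD_SHARE_CPUCAPACITY set in its group flags.
 */
static inline bool asymmetric_groups(struct sched_group *sg1,
				     struct sched_group *sg2)
{
	if (!sg1 || !sg2)
		return false;

	return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
	       (sg2->flags & SD_SHARE_CPUCAPACITY);
}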
Comments
On Thu, May 04, 2023 at 09:09:54AM -0700, Tim Chen wrote:
> From: Tim C Chen <tim.c.chen@linux.intel.com>
>
> Do not try to move tasks between a non-SMT sched group and an SMT sched
> group for "prefer sibling" load balance.
> Let asym_active_balance_busiest() handle that case properly.
> Otherwise we could get tasks bouncing back and forth between
> the SMT sched group and the non-SMT sched group.
>
> Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  kernel/sched/fair.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8a325db34b02..58ef7d529731 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
>  	/*
>  	 * Try to move all excess tasks to a sibling domain of the busiest
>  	 * group's child domain.
> +	 *
> +	 * Do not try to move between non smt sched group and smt sched
> +	 * group. Let asym active balance properly handle that case.
>  	 */
>  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> +	    !asymmetric_groups(sds.busiest, sds.local) &&
>  	    busiest->sum_nr_running > local->sum_nr_running + 1)
>  		goto force_balance;

This seems to have the hidden assumption that a !SMT core is somehow
'less' than an SMT core. Should this not also look at
sched_asym_prefer() to establish this is so?

I mean, imagine I have a regular system and just offline one SMT sibling
for giggles.
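For context on the sched_asym_prefer() reference: it is the SD_ASYM_PACKING comparator that ranks two CPUs by their arch-defined priority. Roughly, in kernels of this vintage (quoted from kernel/sched/sched.h from memory; treat as approximate):

/* Approximate shape of the comparator Peter mentions (kernel/sched/sched.h). */
static inline bool sched_asym_prefer(int a, int b)
{
	return arch_asym_cpu_priority(a) > arch_asym_cpu_priority(b);
}

On Intel hybrid parts, arch_asym_cpu_priority() (driven by ITMT) is what encodes that one core type is preferred over another; that priority ordering is what Peter is asking the prefer-sibling path to consult rather than assume.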
On Fri, 2023-05-05 at 15:22 +0200, Peter Zijlstra wrote:
> On Thu, May 04, 2023 at 09:09:54AM -0700, Tim Chen wrote:
> > From: Tim C Chen <tim.c.chen@linux.intel.com>
> >
> > Do not try to move tasks between a non-SMT sched group and an SMT sched
> > group for "prefer sibling" load balance.
> > Let asym_active_balance_busiest() handle that case properly.
> > Otherwise we could get tasks bouncing back and forth between
> > the SMT sched group and the non-SMT sched group.
> >
> > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > ---
> >  kernel/sched/fair.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8a325db34b02..58ef7d529731 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> >  	/*
> >  	 * Try to move all excess tasks to a sibling domain of the busiest
> >  	 * group's child domain.
> > +	 *
> > +	 * Do not try to move between non smt sched group and smt sched
> > +	 * group. Let asym active balance properly handle that case.
> >  	 */
> >  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> > +	    !asymmetric_groups(sds.busiest, sds.local) &&
> >  	    busiest->sum_nr_running > local->sum_nr_running + 1)
> >  		goto force_balance;
>
> This seems to have the hidden assumption that a !SMT core is somehow
> 'less' than an SMT core. Should this not also look at
> sched_asym_prefer() to establish this is so?
>
> I mean, imagine I have a regular system and just offline one SMT sibling
> for giggles.

I don't quite follow your point, as asymmetric_groups() returns false even
when one SMT sibling is offlined.

Even if, say, sds.busiest has 1 SMT sibling and sds.local has 2, both sched
groups still have the SD_SHARE_CPUCAPACITY flag turned on. So
asymmetric_groups() returns false and the load balancing logic is not
changed for a regular non-hybrid system.

I may be missing something.

Tim
On Fri, May 05, 2023 at 04:07:39PM -0700, Tim Chen wrote:
> On Fri, 2023-05-05 at 15:22 +0200, Peter Zijlstra wrote:
> > On Thu, May 04, 2023 at 09:09:54AM -0700, Tim Chen wrote:
> > > From: Tim C Chen <tim.c.chen@linux.intel.com>
> > >
> > > Do not try to move tasks between a non-SMT sched group and an SMT sched
> > > group for "prefer sibling" load balance.
> > > Let asym_active_balance_busiest() handle that case properly.
> > > Otherwise we could get tasks bouncing back and forth between
> > > the SMT sched group and the non-SMT sched group.
> > >
> > > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > > ---
> > >  kernel/sched/fair.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 8a325db34b02..58ef7d529731 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> > >  	/*
> > >  	 * Try to move all excess tasks to a sibling domain of the busiest
> > >  	 * group's child domain.
> > > +	 *
> > > +	 * Do not try to move between non smt sched group and smt sched
> > > +	 * group. Let asym active balance properly handle that case.
> > >  	 */
> > >  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> > > +	    !asymmetric_groups(sds.busiest, sds.local) &&
> > >  	    busiest->sum_nr_running > local->sum_nr_running + 1)
> > >  		goto force_balance;
> >
> > This seems to have the hidden assumption that a !SMT core is somehow
> > 'less' than an SMT core. Should this not also look at
> > sched_asym_prefer() to establish this is so?
> >
> > I mean, imagine I have a regular system and just offline one SMT sibling
> > for giggles.
>
> I don't quite follow your point, as asymmetric_groups() returns false even
> when one SMT sibling is offlined.
>
> Even if, say, sds.busiest has 1 SMT sibling and sds.local has 2, both sched
> groups still have the SD_SHARE_CPUCAPACITY flag turned on. So
> asymmetric_groups() returns false and the load balancing logic is not
> changed for a regular non-hybrid system.
>
> I may be missing something.

What's the difference between the two cases? That is, if the remaining
sibling will have SD_SHARE_CPUCAPACITY from the degenerate SMT domain
that's been reaped, then why doesn't the same thing apply to the Atoms
in the hybrid muck?

Those two cases *should* be identical; in both cases you have cores with
and cores without SMT.
On Sat, May 06, 2023 at 01:38:36AM +0200, Peter Zijlstra wrote:
> On Fri, May 05, 2023 at 04:07:39PM -0700, Tim Chen wrote:
> > On Fri, 2023-05-05 at 15:22 +0200, Peter Zijlstra wrote:
> > > On Thu, May 04, 2023 at 09:09:54AM -0700, Tim Chen wrote:
> > > > From: Tim C Chen <tim.c.chen@linux.intel.com>
> > > >
> > > > Do not try to move tasks between a non-SMT sched group and an SMT sched
> > > > group for "prefer sibling" load balance.
> > > > Let asym_active_balance_busiest() handle that case properly.
> > > > Otherwise we could get tasks bouncing back and forth between
> > > > the SMT sched group and the non-SMT sched group.
> > > >
> > > > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > > > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > > > ---
> > > >  kernel/sched/fair.c | 4 ++++
> > > >  1 file changed, 4 insertions(+)
> > > >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 8a325db34b02..58ef7d529731 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> > > >  	/*
> > > >  	 * Try to move all excess tasks to a sibling domain of the busiest
> > > >  	 * group's child domain.
> > > > +	 *
> > > > +	 * Do not try to move between non smt sched group and smt sched
> > > > +	 * group. Let asym active balance properly handle that case.
> > > >  	 */
> > > >  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> > > > +	    !asymmetric_groups(sds.busiest, sds.local) &&
> > > >  	    busiest->sum_nr_running > local->sum_nr_running + 1)
> > > >  		goto force_balance;
> > >
> > > This seems to have the hidden assumption that a !SMT core is somehow
> > > 'less' than an SMT core. Should this not also look at
> > > sched_asym_prefer() to establish this is so?
> > >
> > > I mean, imagine I have a regular system and just offline one SMT sibling
> > > for giggles.
> >
> > I don't quite follow your point, as asymmetric_groups() returns false even
> > when one SMT sibling is offlined.
> >
> > Even if, say, sds.busiest has 1 SMT sibling and sds.local has 2, both sched
> > groups still have the SD_SHARE_CPUCAPACITY flag turned on. So
> > asymmetric_groups() returns false and the load balancing logic is not
> > changed for a regular non-hybrid system.
> >
> > I may be missing something.
>
> What's the difference between the two cases? That is, if the remaining
> sibling will have SD_SHARE_CPUCAPACITY from the degenerate SMT domain
> that's been reaped, then why doesn't the same thing apply to the Atoms
> in the hybrid muck?
>
> Those two cases *should* be identical; in both cases you have cores with
> and cores without SMT.

On my alderlake:

[  202.222019] CPU0 attaching sched-domain(s):
[  202.222509]  domain-0: span=0-1 level=SMT
[  202.222707]   groups: 0:{ span=0 }, 1:{ span=1 }
[  202.222945]  domain-1: span=0-23 level=MC
[  202.223148]   groups: 0:{ span=0-1 cap=2048 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }, 23:{ span=23 }
...
[  202.249979] CPU23 attaching sched-domain(s):
[  202.250127]  domain-0: span=0-23 level=MC
[  202.250198]   groups: 23:{ span=23 }, 0:{ span=0-1 cap=2048 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }

$ echo 0 > /sys/devices/system/cpu/cpu1/online

[  251.213848] CPU0 attaching sched-domain(s):
[  251.214376]  domain-0: span=0,2-23 level=MC
[  251.214580]   groups: 0:{ span=0 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }, 23:{ span=23 }
...
[  251.239511] CPU23 attaching sched-domain(s):
[  251.239656]  domain-0: span=0,2-23 level=MC
[  251.239727]   groups: 23:{ span=23 }, 0:{ span=0 }, 2:{ span=2-3 cap=2048 }, 4:{ span=4-5 cap=2048 }, 6:{ span=6-7 cap=2048 }, 8:{ span=8-9 cap=2048 }, 10:{ span=10-11 cap=2048 }, 12:{ span=12-13 cap=2048 }, 14:{ span=14-15 cap=2048 }, 16:{ span=16 }, 17:{ span=17 }, 18:{ span=18 }, 19:{ span=19 }, 20:{ span=20 }, 21:{ span=21 }, 22:{ span=22 }

$ cat /debug/sched/domains/cpu0/domain0/groups_flags
$ cat /debug/sched/domains/cpu23/domain0/groups_flags

IOW, neither the big core with SMT with one sibling offline, nor the
little core with no SMT on at all, has the relevant flags set on its
domain0 groups.

---
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 98bfc0f4ec94..e408b2889186 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -427,6 +427,7 @@ static void register_sd(struct sched_domain *sd, struct dentry *parent)
 #undef SDM
 
 	debugfs_create_file("flags", 0444, parent, &sd->flags, &sd_flags_fops);
+	debugfs_create_file("groups_flags", 0444, parent, &sd->groups->flags, &sd_flags_fops);
 }
 
 void update_sched_domain_debugfs(void)
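A note on the debugfs change above: the existing per-domain "flags" file shows sd->flags, while the new "groups_flags" file exposes sd->groups->flags, which is what asymmetric_groups() actually tests. That is what lets Peter verify directly that once the SMT domain degenerates, neither the big core with its sibling offlined nor the Atom core carries SD_SHARE_CPUCAPACITY on its MC-level group, so the two cases really are indistinguishable to the new check.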
On Thu, 4 May 2023 at 18:11, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>
> From: Tim C Chen <tim.c.chen@linux.intel.com>
>
> Do not try to move tasks between a non-SMT sched group and an SMT sched
> group for "prefer sibling" load balance.
> Let asym_active_balance_busiest() handle that case properly.
> Otherwise we could get tasks bouncing back and forth between
> the SMT sched group and the non-SMT sched group.
>
> Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  kernel/sched/fair.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 8a325db34b02..58ef7d529731 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
>  	/*
>  	 * Try to move all excess tasks to a sibling domain of the busiest
>  	 * group's child domain.
> +	 *
> +	 * Do not try to move between non smt sched group and smt sched
> +	 * group. Let asym active balance properly handle that case.
>  	 */
>  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> +	    !asymmetric_groups(sds.busiest, sds.local) &&

Can't you delete the SD_PREFER_SIBLING flag when building the topology,
like SD_ASYM_CPUCAPACITY does?

Generally speaking, SD_ASYM_CPUCAPACITY and SD_ASYM_PACKING are doing
quite similar things; it would be good to get one common solution
instead of 2 parallel paths.

>  	    busiest->sum_nr_running > local->sum_nr_running + 1)
>  		goto force_balance;
>
> --
> 2.32.0
>
On Tue, 2023-05-09 at 15:36 +0200, Vincent Guittot wrote:
> On Thu, 4 May 2023 at 18:11, Tim Chen <tim.c.chen@linux.intel.com> wrote:
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 8a325db34b02..58ef7d529731 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
> >  	/*
> >  	 * Try to move all excess tasks to a sibling domain of the busiest
> >  	 * group's child domain.
> > +	 *
> > +	 * Do not try to move between non smt sched group and smt sched
> > +	 * group. Let asym active balance properly handle that case.
> >  	 */
> >  	if (sds.prefer_sibling && local->group_type == group_has_spare &&
> > +	    !asymmetric_groups(sds.busiest, sds.local) &&
>
> Can't you delete the SD_PREFER_SIBLING flag when building the topology,
> like SD_ASYM_CPUCAPACITY does?

The sched domain actually can have a mixture of sched groups with Atom
modules and sched groups with SMT cores. When comparing a sched group of
one Atom core cluster with another Atom core cluster, or an SMT core with
another SMT core, I think we do want the prefer-sibling logic. It is only
when we are comparing an SMT core and an Atom core cluster that we want
to skip it. Ricardo, please correct me if I am wrong.

> Generally speaking, SD_ASYM_CPUCAPACITY and SD_ASYM_PACKING are doing
> quite similar things; it would be good to get one common solution
> instead of 2 parallel paths.

Okay. I'll see what I can do to merge the handling.

Tim

> >  	    busiest->sum_nr_running > local->sum_nr_running + 1)
> >  		goto force_balance;
> >
> > --
> > 2.32.0
> >
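To make the group-pair dependence concrete: on a hybrid part, one MC domain can contain both SMT-core groups and Atom-cluster groups, so whether prefer-sibling applies is a property of the pair of groups being compared, not of the domain, and clearing a per-domain flag at topology build time cannot express that. A small self-contained illustration of the pairwise decision (stand-in types and flag value, not kernel code):

#include <stdbool.h>
#include <stdio.h>

#define SD_SHARE_CPUCAPACITY 0x1	/* stand-in: set when a group is made of SMT siblings */

struct group { unsigned int flags; };

/* Same test as asymmetric_groups(): exactly one side is an SMT group. */
static bool asymmetric_groups(const struct group *a, const struct group *b)
{
	return (a->flags & SD_SHARE_CPUCAPACITY) !=
	       (b->flags & SD_SHARE_CPUCAPACITY);
}

int main(void)
{
	struct group smt  = { .flags = SD_SHARE_CPUCAPACITY };
	struct group atom = { .flags = 0 };

	/* Like groups keep the prefer-sibling spill... */
	printf("SMT  vs SMT:  skip=%d\n", asymmetric_groups(&smt, &smt));   /* 0 */
	printf("Atom vs Atom: skip=%d\n", asymmetric_groups(&atom, &atom)); /* 0 */
	/* ...only the mixed pairing defers to asym active balance. */
	printf("SMT  vs Atom: skip=%d\n", asymmetric_groups(&smt, &atom));  /* 1 */
	return 0;
}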
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8a325db34b02..58ef7d529731 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10411,8 +10411,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
 	/*
 	 * Try to move all excess tasks to a sibling domain of the busiest
 	 * group's child domain.
+	 *
+	 * Do not try to move between non smt sched group and smt sched
+	 * group. Let asym active balance properly handle that case.
 	 */
 	if (sds.prefer_sibling && local->group_type == group_has_spare &&
+	    !asymmetric_groups(sds.busiest, sds.local) &&
 	    busiest->sum_nr_running > local->sum_nr_running + 1)
 		goto force_balance;