Message ID: 04641eeb0e95c21224352f5743ecb93dfac44654.1688770494.git.tim.c.chen@linux.intel.com
State: New
Headers:
From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Tim C Chen <tim.c.chen@linux.intel.com>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Ricardo Neri <ricardo.neri@intel.com>, "Ravi V . Shankar" <ravi.v.shankar@intel.com>, Ben Segall <bsegall@google.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Len Brown <len.brown@intel.com>, Mel Gorman <mgorman@suse.de>, "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>, Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>, Steven Rostedt <rostedt@goodmis.org>, Valentin Schneider <vschneid@redhat.com>, Ionela Voinescu <ionela.voinescu@arm.com>, x86@kernel.org, linux-kernel@vger.kernel.org, Shrikanth Hegde <sshegde@linux.vnet.ibm.com>, Srikar Dronamraju <srikar@linux.vnet.ibm.com>, naveen.n.rao@linux.vnet.ibm.com, Yicong Yang <yangyicong@hisilicon.com>, Barry Song <v-songbaohua@oppo.com>, Chen Yu <yu.c.chen@intel.com>, Hillf Danton <hdanton@sina.com>
Subject: [Patch v3 2/6] sched/topology: Record number of cores in sched group
Date: Fri, 7 Jul 2023 15:57:01 -0700
Message-Id: <04641eeb0e95c21224352f5743ecb93dfac44654.1688770494.git.tim.c.chen@linux.intel.com>
In-Reply-To: <cover.1688770494.git.tim.c.chen@linux.intel.com>
References: <cover.1688770494.git.tim.c.chen@linux.intel.com>
List-ID: <linux-kernel.vger.kernel.org>
Series: Enable Cluster Scheduling for x86 Hybrid CPUs
Commit Message
Tim Chen
July 7, 2023, 10:57 p.m. UTC
From: Tim C Chen <tim.c.chen@linux.intel.com>

When balancing sibling domains that have different number of cores,
tasks in respective sibling domain should be proportional to the number
of cores in each domain. In preparation of implementing such a policy,
record the number of tasks in a scheduling group.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/sched.h    |  1 +
 kernel/sched/topology.c | 10 +++++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)
Comments
On 07/07/23 15:57, Tim Chen wrote:
> From: Tim C Chen <tim.c.chen@linux.intel.com>
>
> When balancing sibling domains that have different number of cores,
> tasks in respective sibling domain should be proportional to the number
> of cores in each domain. In preparation of implementing such a policy,
> record the number of tasks in a scheduling group.
>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  kernel/sched/sched.h    |  1 +
>  kernel/sched/topology.c | 10 +++++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 3d0eb36350d2..5f7f36e45b87 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1860,6 +1860,7 @@ struct sched_group {
>  	atomic_t		ref;
>
>  	unsigned int		group_weight;
> +	unsigned int		cores;
>  	struct sched_group_capacity *sgc;
>  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
>  	int			flags;
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6d5628fcebcf..6b099dbdfb39 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
>  {
>  	struct sched_group *sg = sd->groups;
> +	struct cpumask *mask = sched_domains_tmpmask2;
>
>  	WARN_ON(!sg);
>
>  	do {
> -		int cpu, max_cpu = -1;
> +		int cpu, cores = 0, max_cpu = -1;
>
>  		sg->group_weight = cpumask_weight(sched_group_span(sg));
>
> +		cpumask_copy(mask, sched_group_span(sg));
> +		for_each_cpu(cpu, mask) {
> +			cores++;
> +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> +		}

This rekindled my desire for an SMT core cpumask/iterator. I played around
with a global mask but that's a headache: what if we end up with a core
whose SMT threads are split across two exclusive cpusets?

I ended up necro'ing a patch from Peter [1], but didn't get anywhere nice
(the LLC shared storage caused me issues).

All that to say, I couldn't think of a nicer way :(

[1]: https://lore.kernel.org/all/20180530143106.082002139@infradead.org/#t
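The quoted counting loop is compact but subtle: each iteration counts one core, then strips that CPU's entire SMT sibling set from the mask so the siblings are never visited. The same idea can be sketched in userspace (this is NOT kernel code: a 64-bit word stands in for a cpumask, and `smt[]` is a hypothetical per-CPU sibling table standing in for `cpu_smt_mask()`):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch of the core-counting loop from the patch.
 * span  - bitmask of CPUs in the sched group (cpumask stand-in)
 * smt[] - per-CPU sibling mask; smt[cpu] includes cpu itself,
 *         just as cpu_smt_mask() does in the kernel.
 */
int count_cores(uint64_t span, const uint64_t *smt)
{
	uint64_t mask = span;			/* mirrors cpumask_copy()    */
	int cores = 0;

	while (mask) {				/* for_each_cpu(cpu, mask)   */
		int cpu = __builtin_ctzll(mask);/* lowest set CPU            */
		cores++;
		mask &= ~smt[cpu];		/* cpumask_andnot(mask, mask,
						 *   cpu_smt_mask(cpu))      */
	}
	return cores;
}
```

With 4 CPUs arranged as SMT2 siblings (0,1) and (2,3), a span of all four CPUs yields 2 cores, and a span covering one whole core plus a lone thread of the other also yields 2, which matches what `sg->cores` is meant to record.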
On Mon, 2023-07-10 at 21:33 +0100, Valentin Schneider wrote:
> On 07/07/23 15:57, Tim Chen wrote:
> > From: Tim C Chen <tim.c.chen@linux.intel.com>
> >
> > When balancing sibling domains that have different number of cores,
> > tasks in respective sibling domain should be proportional to the number
> > of cores in each domain. In preparation of implementing such a policy,
> > record the number of tasks in a scheduling group.
> >
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > ---
> >  kernel/sched/sched.h    |  1 +
> >  kernel/sched/topology.c | 10 +++++++++-
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 3d0eb36350d2..5f7f36e45b87 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -1860,6 +1860,7 @@ struct sched_group {
> >  	atomic_t		ref;
> >
> >  	unsigned int		group_weight;
> > +	unsigned int		cores;
> >  	struct sched_group_capacity *sgc;
> >  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
> >  	int			flags;
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index 6d5628fcebcf..6b099dbdfb39 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
> >  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> >  {
> >  	struct sched_group *sg = sd->groups;
> > +	struct cpumask *mask = sched_domains_tmpmask2;
> >
> >  	WARN_ON(!sg);
> >
> >  	do {
> > -		int cpu, max_cpu = -1;
> > +		int cpu, cores = 0, max_cpu = -1;
> >
> >  		sg->group_weight = cpumask_weight(sched_group_span(sg));
> >
> > +		cpumask_copy(mask, sched_group_span(sg));
> > +		for_each_cpu(cpu, mask) {
> > +			cores++;
> > +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> > +		}
>
> This rekindled my desire for an SMT core cpumask/iterator. I played around
> with a global mask but that's a headache: what if we end up with a core
> whose SMT threads are split across two exclusive cpusets?

Peter and I pondered that for a while. But it seems like partitioning
threads in a core between two different sched domains is not a very
reasonable thing to do.

https://lore.kernel.org/all/20230612112945.GK4253@hirez.programming.kicks-ass.net/

Tim

> I ended up necro'ing a patch from Peter [1], but didn't get anywhere nice
> (the LLC shared storage caused me issues).
>
> All that to say, I couldn't think of a nicer way :(
>
> [1]: https://lore.kernel.org/all/20180530143106.082002139@infradead.org/#t
On Fri, 2023-07-07 at 15:57 -0700, Tim Chen wrote:
> From: Tim C Chen <tim.c.chen@linux.intel.com>
>
> When balancing sibling domains that have different number of cores,
> tasks in respective sibling domain should be proportional to the number
> of cores in each domain. In preparation of implementing such a policy,
> record the number of tasks in a scheduling group.

Caught a typo. Should be "the number of cores" instead of
"the number of tasks" in a scheduling group.

Peter, should I send you another patch with the corrected commit log?

Tim

> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
>  kernel/sched/sched.h    |  1 +
>  kernel/sched/topology.c | 10 +++++++++-
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 3d0eb36350d2..5f7f36e45b87 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1860,6 +1860,7 @@ struct sched_group {
>  	atomic_t		ref;
>
>  	unsigned int		group_weight;
> +	unsigned int		cores;
>  	struct sched_group_capacity *sgc;
>  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
>  	int			flags;
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index 6d5628fcebcf..6b099dbdfb39 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
>  {
>  	struct sched_group *sg = sd->groups;
> +	struct cpumask *mask = sched_domains_tmpmask2;
>
>  	WARN_ON(!sg);
>
>  	do {
> -		int cpu, max_cpu = -1;
> +		int cpu, cores = 0, max_cpu = -1;
>
>  		sg->group_weight = cpumask_weight(sched_group_span(sg));
>
> +		cpumask_copy(mask, sched_group_span(sg));
> +		for_each_cpu(cpu, mask) {
> +			cores++;
> +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> +		}
> +		sg->cores = cores;
> +
>  		if (!(sd->flags & SD_ASYM_PACKING))
>  			goto next;
>
On Mon, Jul 10, 2023 at 03:40:34PM -0700, Tim Chen wrote:
> On Fri, 2023-07-07 at 15:57 -0700, Tim Chen wrote:
> > From: Tim C Chen <tim.c.chen@linux.intel.com>
> >
> > When balancing sibling domains that have different number of cores,
> > tasks in respective sibling domain should be proportional to the number
> > of cores in each domain. In preparation of implementing such a policy,
> > record the number of tasks in a scheduling group.
>
> Caught a typo. Should be "the number of cores" instead of
> "the number of tasks" in a scheduling group.
>
> Peter, should I send you another patch with the corrected commit log?

I'll fix it up, already had to fix the patch because due to robot
finding a compile fail for SCHED_SMT=n builds.

> > @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
> >  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> >  {
> >  	struct sched_group *sg = sd->groups;
> > +	struct cpumask *mask = sched_domains_tmpmask2;
> >
> >  	WARN_ON(!sg);
> >
> >  	do {
> > -		int cpu, max_cpu = -1;
> > +		int cpu, cores = 0, max_cpu = -1;
> >
> >  		sg->group_weight = cpumask_weight(sched_group_span(sg));
> >
> > +		cpumask_copy(mask, sched_group_span(sg));
> > +		for_each_cpu(cpu, mask) {
> > +			cores++;

#ifdef CONFIG_SCHED_SMT
> > +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
#else
			__cpumask_clear_cpu(cpu, mask);
#endif

or something along them lines -- should be in queue.git/sched/core
already.

> > +		}
> > +		sg->cores = cores;
> > +
> >  		if (!(sd->flags & SD_ASYM_PACKING))
> >  			goto next;
> >
On Tue, 2023-07-11 at 13:31 +0200, Peter Zijlstra wrote:
> On Mon, Jul 10, 2023 at 03:40:34PM -0700, Tim Chen wrote:
> > On Fri, 2023-07-07 at 15:57 -0700, Tim Chen wrote:
> > > From: Tim C Chen <tim.c.chen@linux.intel.com>
> > >
> > > When balancing sibling domains that have different number of cores,
> > > tasks in respective sibling domain should be proportional to the number
> > > of cores in each domain. In preparation of implementing such a policy,
> > > record the number of tasks in a scheduling group.
> >
> > Caught a typo. Should be "the number of cores" instead of
> > "the number of tasks" in a scheduling group.
> >
> > Peter, should I send you another patch with the corrected commit log?
>
> I'll fix it up, already had to fix the patch because due to robot
> finding a compile fail for SCHED_SMT=n builds.
>
> > > @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
> > >  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
> > >  {
> > >  	struct sched_group *sg = sd->groups;
> > > +	struct cpumask *mask = sched_domains_tmpmask2;
> > >
> > >  	WARN_ON(!sg);
> > >
> > >  	do {
> > > -		int cpu, max_cpu = -1;
> > > +		int cpu, cores = 0, max_cpu = -1;
> > >
> > >  		sg->group_weight = cpumask_weight(sched_group_span(sg));
> > >
> > > +		cpumask_copy(mask, sched_group_span(sg));
> > > +		for_each_cpu(cpu, mask) {
> > > +			cores++;
> #ifdef CONFIG_SCHED_SMT
> > > +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
> #else
> 			__cpumask_clear_cpu(cpu, mask);

Thanks for fixing up the non SCHED_SMT case. I think
"__cpumask_clear_cpu(cpu, mask);" can be removed. Since we have already
considered the CPU in the iterator, clearing it is unnecessary.

So effectively

	for_each_cpu(cpu, mask) {
		cores++;
	}

should be good enough for the non SCHED_SMT case.

Or replace the patch with the patch below so we don't have #ifdef in the
middle of the code body. Either way is fine.
---
From 9f19714db69739a7985e46bc1f8334d70a69cf2e Mon Sep 17 00:00:00 2001
Message-Id: <9f19714db69739a7985e46bc1f8334d70a69cf2e.1689092923.git.tim.c.chen@linux.intel.com>
In-Reply-To: <cover.1689092923.git.tim.c.chen@linux.intel.com>
References: <cover.1689092923.git.tim.c.chen@linux.intel.com>
From: Tim C Chen <tim.c.chen@linux.intel.com>
Date: Wed, 17 May 2023 09:09:54 -0700
Subject: [Patch v3 2/6] sched/topology: Record number of cores in sched group
To: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>,
    Vincent Guittot <vincent.guittot@linaro.org>,
    Ricardo Neri <ricardo.neri@intel.com>,
    Ravi V. Shankar <ravi.v.shankar@intel.com>,
    Ben Segall <bsegall@google.com>,
    Daniel Bristot de Oliveira <bristot@redhat.com>,
    Dietmar Eggemann <dietmar.eggemann@arm.com>,
    Len Brown <len.brown@intel.com>,
    Mel Gorman <mgorman@suse.de>,
    Rafael J. Wysocki <rafael.j.wysocki@intel.com>,
    Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
    Steven Rostedt <rostedt@goodmis.org>,
    Tim Chen <tim.c.chen@linux.intel.com>,
    Valentin Schneider <vschneid@redhat.com>,
    Ionela Voinescu <ionela.voinescu@arm.com>,
    x86@kernel.org, linux-kernel@vger.kernel.org,
    Shrikanth Hegde <sshegde@linux.vnet.ibm.com>,
    Srikar Dronamraju <srikar@linux.vnet.ibm.com>,
    naveen.n.rao@linux.vnet.ibm.com,
    Yicong Yang <yangyicong@hisilicon.com>,
    Barry Song <v-songbaohua@oppo.com>,
    Chen Yu <yu.c.chen@intel.com>,
    Hillf Danton <hdanton@sina.com>

When balancing sibling domains that have different number of cores,
tasks in respective sibling domain should be proportional to the number
of cores in each domain. In preparation of implementing such a policy,
record the number of cores in a scheduling group.

Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/sched.h    |  1 +
 kernel/sched/topology.c | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3d0eb36350d2..5f7f36e45b87 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1860,6 +1860,7 @@ struct sched_group {
 	atomic_t		ref;

 	unsigned int		group_weight;
+	unsigned int		cores;
 	struct sched_group_capacity *sgc;
 	int			asym_prefer_cpu;	/* CPU of highest priority in group */
 	int			flags;
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6d5628fcebcf..4ecdaef3f8ab 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1262,6 +1262,26 @@ build_sched_groups(struct sched_domain *sd, int cpu)
 	return 0;
 }

+#ifdef CONFIG_SCHED_SMT
+static inline int sched_group_cores(struct sched_group *sg)
+{
+	struct cpumask *mask = sched_domains_tmpmask2;
+	int cpu, cores = 0;
+
+	cpumask_copy(mask, sched_group_span(sg));
+	for_each_cpu(cpu, mask) {
+		cores++;
+		cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
+	}
+	return cores;
+}
+#else
+static inline int sched_group_cores(struct sched_group *sg)
+{
+	return sg->group_weight;
+}
+#endif
+
 /*
  * Initialize sched groups cpu_capacity.
  *
@@ -1282,6 +1302,7 @@ static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
 		int cpu, max_cpu = -1;

 		sg->group_weight = cpumask_weight(sched_group_span(sg));
+		sg->cores = sched_group_cores(sg);

 		if (!(sd->flags & SD_ASYM_PACKING))
 			goto next;
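In the !CONFIG_SCHED_SMT branch of the patch above, sched_group_cores() simply returns the group weight: without SMT every CPU is a core of its own, so the core count equals the number of CPUs in the span. In the same userspace terms as before (a 64-bit word standing in for a cpumask; hypothetical helper name, not kernel API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace sketch of the !CONFIG_SCHED_MT fallback: the core count
 * degenerates to the set-bit count of the group span, which is exactly
 * what sg->group_weight already holds in the kernel.
 */
int count_cores_nosmt(uint64_t span)
{
	return __builtin_popcountll(span);	/* == group_weight */
}
```

This is why dropping the #ifdef from the loop body is harmless: with one thread per core, counting CPUs and counting cores are the same operation.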
On 10/07/23 15:13, Tim Chen wrote:
> On Mon, 2023-07-10 at 21:33 +0100, Valentin Schneider wrote:
>> On 07/07/23 15:57, Tim Chen wrote:
>> > From: Tim C Chen <tim.c.chen@linux.intel.com>
>> >
>> > When balancing sibling domains that have different number of cores,
>> > tasks in respective sibling domain should be proportional to the number
>> > of cores in each domain. In preparation of implementing such a policy,
>> > record the number of tasks in a scheduling group.
>> >
>> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> > ---
>> >  kernel/sched/sched.h    |  1 +
>> >  kernel/sched/topology.c | 10 +++++++++-
>> >  2 files changed, 10 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> > index 3d0eb36350d2..5f7f36e45b87 100644
>> > --- a/kernel/sched/sched.h
>> > +++ b/kernel/sched/sched.h
>> > @@ -1860,6 +1860,7 @@ struct sched_group {
>> >  	atomic_t		ref;
>> >
>> >  	unsigned int		group_weight;
>> > +	unsigned int		cores;
>> >  	struct sched_group_capacity *sgc;
>> >  	int			asym_prefer_cpu;	/* CPU of highest priority in group */
>> >  	int			flags;
>> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> > index 6d5628fcebcf..6b099dbdfb39 100644
>> > --- a/kernel/sched/topology.c
>> > +++ b/kernel/sched/topology.c
>> > @@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
>> >  static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
>> >  {
>> >  	struct sched_group *sg = sd->groups;
>> > +	struct cpumask *mask = sched_domains_tmpmask2;
>> >
>> >  	WARN_ON(!sg);
>> >
>> >  	do {
>> > -		int cpu, max_cpu = -1;
>> > +		int cpu, cores = 0, max_cpu = -1;
>> >
>> >  		sg->group_weight = cpumask_weight(sched_group_span(sg));
>> >
>> > +		cpumask_copy(mask, sched_group_span(sg));
>> > +		for_each_cpu(cpu, mask) {
>> > +			cores++;
>> > +			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
>> > +		}
>>
>> This rekindled my desire for an SMT core cpumask/iterator. I played around
>> with a global mask but that's a headache: what if we end up with a core
>> whose SMT threads are split across two exclusive cpusets?
>
> Peter and I pondered that for a while. But it seems like partitioning
> threads in a core between two different sched domains is not a very
> reasonable thing to do.
>
> https://lore.kernel.org/all/20230612112945.GK4253@hirez.programming.kicks-ass.net/
>

Thanks for the link. I'll poke at this a bit more, but regardless:

Reviewed-by: Valentin Schneider <vschneid@redhat.com>
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3d0eb36350d2..5f7f36e45b87 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1860,6 +1860,7 @@ struct sched_group {
 	atomic_t		ref;

 	unsigned int		group_weight;
+	unsigned int		cores;
 	struct sched_group_capacity *sgc;
 	int			asym_prefer_cpu;	/* CPU of highest priority in group */
 	int			flags;
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 6d5628fcebcf..6b099dbdfb39 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1275,14 +1275,22 @@ build_sched_groups(struct sched_domain *sd, int cpu)
 static void init_sched_groups_capacity(int cpu, struct sched_domain *sd)
 {
 	struct sched_group *sg = sd->groups;
+	struct cpumask *mask = sched_domains_tmpmask2;

 	WARN_ON(!sg);

 	do {
-		int cpu, max_cpu = -1;
+		int cpu, cores = 0, max_cpu = -1;

 		sg->group_weight = cpumask_weight(sched_group_span(sg));

+		cpumask_copy(mask, sched_group_span(sg));
+		for_each_cpu(cpu, mask) {
+			cores++;
+			cpumask_andnot(mask, mask, cpu_smt_mask(cpu));
+		}
+		sg->cores = cores;
+
 		if (!(sd->flags & SD_ASYM_PACKING))
 			goto next;