Message ID | 20221122203532.15013-5-ricardo.neri-calderon@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp2425056wrr; Tue, 22 Nov 2022 12:30:35 -0800 (PST) X-Google-Smtp-Source: AA0mqf6+iRSDjPzdmXMw969XFqvFh2T7wrl/8YeBCKpHpMTx+D741FbUQAgAG2TR+sYw97eDbVKi X-Received: by 2002:a05:6402:1a2e:b0:461:2915:e41d with SMTP id be14-20020a0564021a2e00b004612915e41dmr22301700edb.184.1669149035320; Tue, 22 Nov 2022 12:30:35 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669149035; cv=none; d=google.com; s=arc-20160816; b=MRBf2tVJB7JRlmnpwLadhN/saTOKW/VhL6++HjJDbIANFxTwghqeTXJ2y3MfiWFgT8 w2CXFcVafDb2sv4Q2HDxJh/Rs62O8i83PBiai4v1Jv9dBTPU6mx4D3qB0fUd3fTbRYU8 Fk2RiKDUjiACxl3j6iwo7h8LKpOQH/cN7x8fCGWspuZccbQq1v4/PrMLns6VNmaVACsh 0znZQJQ5wyLH3rM7OiXAynUFbLHc8SkX91TElBa0LsWXd68thXAknmtTQAIq95GhSDHM D89KNz/nrrTEtIpMaU/7WBK5ghjTXDeCzhV+hB4EfeMPr2SsZGalvB6EASdMRb1eWTKp YByQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:references:in-reply-to:message-id:date:subject :cc:to:from:dkim-signature; bh=+hZ/T+LG2SEQUHOc50V28ZpAG9z7f3Qnk7J/1+X9MBw=; b=i6dJtMhgIcfQZZsj9sMzMlASU6tpMzQU2Z5MPfOrzVT3RDD7tI3GZ1dHqLeB4l59v0 Z1h7EDguWpLhI0y/jKcsPbpn+Fwvf1Uspp3pBgQr2W9YVhheerDX3SFUMNoD2uq8VoG8 Z7aitWWF6wlKDeegUkkP/Woc2U5eTmSsV4ea5pMNX8M24+csITW++0S2GOAGgS7hjO3t lLaSoUpkn7Wyzif9LOOxhC90RLbYpS35nH0fhOsm3CZhtSXYW+Vpsz7l2VliWyNgnXUT OtzArmEHlHTapmFHiPzn11UouzOc5SxpKts08o4sf/GcCjPmSc8OPNIcvj9GupX97MQE tS6w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YO5abYmc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i21-20020a170906251500b007ae52a5707bsi9192881ejb.95.2022.11.22.12.30.11; Tue, 22 Nov 2022 12:30:35 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YO5abYmc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234880AbiKVU2h (ORCPT <rfc822;cjcooper78@gmail.com> + 99 others); Tue, 22 Nov 2022 15:28:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234402AbiKVU2P (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 22 Nov 2022 15:28:15 -0500 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DC03C1B1D8 for <linux-kernel@vger.kernel.org>; Tue, 22 Nov 2022 12:28:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1669148894; x=1700684894; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=ZxiRvtsJwz9dFNPg5MHP4zjCzO9YSRx6bjDINyhvJ5o=; b=YO5abYmcCWkxOVmwE0RoQ8gdQ02Gl9vS8d2LzhQcaNrhIaclRTjZDDva w5ZimWTsceDPCaNUq56YuyLrJ8tzVeStPDv6caiGQTFbfd2pNFRc6Zy2D reQHuqADcqYYODzbEyaRuUVGxwJiw7DSV7WqyPBLunxBEVkDHmPrcLA1I UUY3hn5dNJFqTXvHa/SUhP3KGvaJaV4b1fi2wwDzyIm8bhFTnUresYOk5 j2whgaklJBIUugC1QQ6HyQ9yf+en30WsRKYO3LyW8ReX5RPzsZGvzdgp0 5SJWvlE6+5jr2zf4Ay+e+/CIo4vLXHms8hlce/MztAlwoGMTNeqowluFK g==; X-IronPort-AV: E=McAfee;i="6500,9779,10539"; a="293616509" X-IronPort-AV: E=Sophos;i="5.96,185,1665471600"; d="scan'208";a="293616509" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Nov 2022 12:28:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10539"; a="816228136" X-IronPort-AV: E=Sophos;i="5.96,185,1665471600"; d="scan'208";a="816228136" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by orsmga005.jf.intel.com with ESMTP; 22 Nov 2022 12:28:14 -0800 From: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> To: "Peter Zijlstra (Intel)" <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org> Cc: Ricardo Neri <ricardo.neri@intel.com>, "Ravi V. Shankar" <ravi.v.shankar@intel.com>, Ben Segall <bsegall@google.com>, Daniel Bristot de Oliveira <bristot@redhat.com>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Len Brown <len.brown@intel.com>, Mel Gorman <mgorman@suse.de>, "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>, Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>, Steven Rostedt <rostedt@goodmis.org>, Tim Chen <tim.c.chen@linux.intel.com>, Valentin Schneider <vschneid@redhat.com>, x86@kernel.org, linux-kernel@vger.kernel.org, Ricardo Neri <ricardo.neri-calderon@linux.intel.com>, "Tim C . Chen" <tim.c.chen@intel.com> Subject: [PATCH v2 4/7] sched/fair: Introduce sched_smt_siblings_idle() Date: Tue, 22 Nov 2022 12:35:29 -0800 Message-Id: <20221122203532.15013-5-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20221122203532.15013-1-ricardo.neri-calderon@linux.intel.com> References: <20221122203532.15013-1-ricardo.neri-calderon@linux.intel.com> X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_PASS, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750229618591661866?= X-GMAIL-MSGID: =?utf-8?q?1750229618591661866?= |
Series |
x86/sched: Avoid unnecessary migrations within SMT domains
|
|
Commit Message
Ricardo Neri
Nov. 22, 2022, 8:35 p.m. UTC
Architectures that implement arch_asym_cpu_priority() may need to know the
idle state of the SMT siblings of a CPU. The scheduler has this information
and functionality. Expose it.
Move the existing functionality outside of the NUMA code.
Cc: Ben Segall <bsegall@google.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tim C. Chen <tim.c.chen@intel.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Changes since v1:
* Introduced this patch.
---
include/linux/sched.h | 2 ++
kernel/sched/fair.c | 39 ++++++++++++++++++++++-----------------
2 files changed, 24 insertions(+), 17 deletions(-)
Comments
On 22/11/2022 21:35, Ricardo Neri wrote: > Architectures that implement arch_asym_cpu_priority() may need to know the > idle state of the SMT siblings of a CPU. The scheduler has this information > and functionality. Expose it. > > Move the existing functionality outside of the NUMA code. [...] > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 0e4251f83807..9517c48df50e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1052,6 +1052,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se) > * Scheduling class queueing methods: > */ > > +static inline bool is_core_idle(int cpu) > +{ > +#ifdef CONFIG_SCHED_SMT > + int sibling; > + > + for_each_cpu(sibling, cpu_smt_mask(cpu)) { > + if (cpu == sibling) > + continue; > + > + if (!idle_cpu(sibling)) > + return false; > + } > +#endif > + > + return true; > +} > + > +bool sched_smt_siblings_idle(int cpu) > +{ > + return is_core_idle(cpu); > +} Nitpick: Can we not just have one exported function for both use-cases: NUMA and x86 ITMT? [...]
On Tue, Dec 06, 2022 at 07:03:37PM +0100, Dietmar Eggemann wrote: > On 22/11/2022 21:35, Ricardo Neri wrote: > > Architectures that implement arch_asym_cpu_priority() may need to know the > > idle state of the SMT siblings of a CPU. The scheduler has this information > > and functionality. Expose it. > > > > Move the existing functionality outside of the NUMA code. > > [...] > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > > index 0e4251f83807..9517c48df50e 100644 > > --- a/kernel/sched/fair.c > > +++ b/kernel/sched/fair.c > > @@ -1052,6 +1052,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se) > > * Scheduling class queueing methods: > > */ > > > > +static inline bool is_core_idle(int cpu) > > +{ > > +#ifdef CONFIG_SCHED_SMT > > + int sibling; > > + > > + for_each_cpu(sibling, cpu_smt_mask(cpu)) { > > + if (cpu == sibling) > > + continue; > > + > > + if (!idle_cpu(sibling)) > > + return false; > > + } > > +#endif > > + > > + return true; > > +} > > + > > +bool sched_smt_siblings_idle(int cpu) > > +{ > > + return is_core_idle(cpu); > > +} > > Nitpick: Can we not just have one exported function for both use-cases: > NUMA and x86 ITMT? By adding a new function I intend to preserve the inlinig of is_core_idle() in update_numa_stats() (via numa_idle_core(), which is also inline). Do you think there is no value? A downside of having the new function is that now the code is duplicated in update_numa_stats() and sched_smt_siblings_idle(). I can take your suggestion if losing the inline is OK. Thanks and BR, Ricardo
On 12/12/2022 18:54, Ricardo Neri wrote: > On Tue, Dec 06, 2022 at 07:03:37PM +0100, Dietmar Eggemann wrote: >> On 22/11/2022 21:35, Ricardo Neri wrote: [...] >>> +bool sched_smt_siblings_idle(int cpu) >>> +{ >>> + return is_core_idle(cpu); >>> +} >> >> Nitpick: Can we not just have one exported function for both use-cases: >> NUMA and x86 ITMT? > > By adding a new function I intend to preserve the inlinig of is_core_idle() > in update_numa_stats() (via numa_idle_core(), which is also inline). Do you > think there is no value? OK. It's only used in NUMA balancing (task_numa_fault() -> ... -> update_numa_stats()). I can't see that this will have a noticeable perf impact but only benchmark can really tell. A `static inline bool sched_is_core_idle(int cpu)` via include/linux/sched/topology.h might work? We have similar functions (like sched_core_cookie_match()` but in the private scheduler header file kernel/sched/sched.h though. > A downside of having the new function is that now the code is duplicated > in update_numa_stats() and sched_smt_siblings_idle(). > > I can take your suggestion if losing the inline is OK. I doubt that it will have an impact but can't be sure. [...]
On 22/11/22 12:35, Ricardo Neri wrote: > Architectures that implement arch_asym_cpu_priority() may need to know the > idle state of the SMT siblings of a CPU. The scheduler has this information > and functionality. Expose it. > > Move the existing functionality outside of the NUMA code. > test_idle_cores() does something similar without an iteration, did you consider using that instead? > Cc: Ben Segall <bsegall@google.com> > Cc: Daniel Bristot de Oliveira <bristot@redhat.com> > Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> > Cc: Len Brown <len.brown@intel.com> > Cc: Mel Gorman <mgorman@suse.de> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: Tim C. Chen <tim.c.chen@intel.com> > Cc: Valentin Schneider <vschneid@redhat.com> > Cc: x86@kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> > --- > Changes since v1: > * Introduced this patch. > --- > include/linux/sched.h | 2 ++ > kernel/sched/fair.c | 39 ++++++++++++++++++++++----------------- > 2 files changed, 24 insertions(+), 17 deletions(-) > > diff --git a/include/linux/sched.h b/include/linux/sched.h > index ffb6eb55cd13..0d01c64ac737 100644 > --- a/include/linux/sched.h > +++ b/include/linux/sched.h > @@ -2426,4 +2426,6 @@ static inline void sched_core_fork(struct task_struct *p) { } > > extern void sched_set_stop_task(int cpu, struct task_struct *stop); > > +extern bool sched_smt_siblings_idle(int cpu); > + > #endif > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 0e4251f83807..9517c48df50e 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -1052,6 +1052,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se) > * Scheduling class queueing methods: > */ > > +static inline bool is_core_idle(int cpu) > +{ > +#ifdef CONFIG_SCHED_SMT > + int sibling; > + > + for_each_cpu(sibling, cpu_smt_mask(cpu)) { > + if (cpu == sibling) > + continue; > + > + if (!idle_cpu(sibling)) > + return false; > + } > +#endif > + > + return true; > +} > + > +bool sched_smt_siblings_idle(int cpu) > +{ > + return is_core_idle(cpu); > +} > + > #ifdef CONFIG_NUMA > #define NUMA_IMBALANCE_MIN 2 > > @@ -1691,23 +1713,6 @@ struct numa_stats { > int idle_cpu; > }; > > -static inline bool is_core_idle(int cpu) > -{ > -#ifdef CONFIG_SCHED_SMT > - int sibling; > - > - for_each_cpu(sibling, cpu_smt_mask(cpu)) { > - if (cpu == sibling) > - continue; > - > - if (!idle_cpu(sibling)) > - return false; > - } > -#endif > - > - return true; > -} > - > struct task_numa_env { > struct task_struct *p; > > -- > 2.25.1
On Thu, Dec 22, 2022 at 04:56:22PM +0000, Valentin Schneider wrote: > On 22/11/22 12:35, Ricardo Neri wrote: > > Architectures that implement arch_asym_cpu_priority() may need to know the > > idle state of the SMT siblings of a CPU. The scheduler has this information > > and functionality. Expose it. > > > > Move the existing functionality outside of the NUMA code. > > > > test_idle_cores() does something similar without an iteration, did you > consider using that instead? IIUC, test_idle_cores() returns true if there is at least one idle core in the package. In my case, I need to know the idle state of only the SMT siblings of a specific CPU. Am I missing something? Thanks and BR, Ricardo
On 2022-12-23 at 21:28:50 -0800, Ricardo Neri wrote: > On Thu, Dec 22, 2022 at 04:56:22PM +0000, Valentin Schneider wrote: > > On 22/11/22 12:35, Ricardo Neri wrote: > > > Architectures that implement arch_asym_cpu_priority() may need to know the > > > idle state of the SMT siblings of a CPU. The scheduler has this information > > > and functionality. Expose it. > > > > > > Move the existing functionality outside of the NUMA code. > > > > > > > test_idle_cores() does something similar without an iteration, did you > > consider using that instead? > > IIUC, test_idle_cores() returns true if there is at least one idle core in > the package. In my case, I need to know the idle state of only the SMT > siblings of a specific CPU. Am I missing something? > I guess a similar one is select_idle_core(), but it also consider the CPU with SCHED_IDLE task as idle. Is CPU with SCHED_IDLE task a candidate in your scenario? thanks, Chenyu > Thanks and BR, > Ricardo
On Wed, Dec 28, 2022 at 11:29:52PM +0800, Chen Yu wrote: > On 2022-12-23 at 21:28:50 -0800, Ricardo Neri wrote: > > On Thu, Dec 22, 2022 at 04:56:22PM +0000, Valentin Schneider wrote: > > > On 22/11/22 12:35, Ricardo Neri wrote: > > > > Architectures that implement arch_asym_cpu_priority() may need to know the > > > > idle state of the SMT siblings of a CPU. The scheduler has this information > > > > and functionality. Expose it. > > > > > > > > Move the existing functionality outside of the NUMA code. > > > > > > > > > > test_idle_cores() does something similar without an iteration, did you > > > consider using that instead? > > > > IIUC, test_idle_cores() returns true if there is at least one idle core in > > the package. In my case, I need to know the idle state of only the SMT > > siblings of a specific CPU. Am I missing something? > > > I guess a similar one is select_idle_core(), but it also consider the CPU with > SCHED_IDLE task as idle. Is CPU with SCHED_IDLE task a candidate in your > scenario? However, we are not looking for an idle CPU. We want to know the idle state of the siblings of a CPU. I see that select_idle_core() uses available_idle_cpu(), which in turn uses idle_cpu(). is_core_idle() also uses it. As per 943d355d7fee ("sched/core: Distinguish between idle_cpu() calls based on desired effect, introduce available_idle_cpu()") the load balancer can just call idle_cpu(). Thanks and BR, Ricardo
On 23/12/22 21:28, Ricardo Neri wrote: > On Thu, Dec 22, 2022 at 04:56:22PM +0000, Valentin Schneider wrote: >> On 22/11/22 12:35, Ricardo Neri wrote: >> > Architectures that implement arch_asym_cpu_priority() may need to know the >> > idle state of the SMT siblings of a CPU. The scheduler has this information >> > and functionality. Expose it. >> > >> > Move the existing functionality outside of the NUMA code. >> > >> >> test_idle_cores() does something similar without an iteration, did you >> consider using that instead? > > IIUC, test_idle_cores() returns true if there is at least one idle core in > the package. In my case, I need to know the idle state of only the SMT > siblings of a specific CPU. Am I missing something? > No, you're right, clearly I needed that end of the year break - sorry for the noise. > Thanks and BR, > Ricardo
diff --git a/include/linux/sched.h b/include/linux/sched.h index ffb6eb55cd13..0d01c64ac737 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -2426,4 +2426,6 @@ static inline void sched_core_fork(struct task_struct *p) { } extern void sched_set_stop_task(int cpu, struct task_struct *stop); +extern bool sched_smt_siblings_idle(int cpu); + #endif diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 0e4251f83807..9517c48df50e 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1052,6 +1052,28 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se) * Scheduling class queueing methods: */ +static inline bool is_core_idle(int cpu) +{ +#ifdef CONFIG_SCHED_SMT + int sibling; + + for_each_cpu(sibling, cpu_smt_mask(cpu)) { + if (cpu == sibling) + continue; + + if (!idle_cpu(sibling)) + return false; + } +#endif + + return true; +} + +bool sched_smt_siblings_idle(int cpu) +{ + return is_core_idle(cpu); +} + #ifdef CONFIG_NUMA #define NUMA_IMBALANCE_MIN 2 @@ -1691,23 +1713,6 @@ struct numa_stats { int idle_cpu; }; -static inline bool is_core_idle(int cpu) -{ -#ifdef CONFIG_SCHED_SMT - int sibling; - - for_each_cpu(sibling, cpu_smt_mask(cpu)) { - if (cpu == sibling) - continue; - - if (!idle_cpu(sibling)) - return false; - } -#endif - - return true; -} - struct task_numa_env { struct task_struct *p;