[3/6] sched/fair: Fix busiest group selection for asym groups

Message ID 2e2e6844fb3ed28594d86c5e45295df7c4335c08.1683156492.git.tim.c.chen@linux.intel.com
State New
Series Enable Cluster Scheduling for x86 Hybrid CPUs

Commit Message

Tim Chen May 4, 2023, 4:09 p.m. UTC
  From: Tim C Chen <tim.c.chen@linux.intel.com>

For two groups that have spare capacity and are partially busy, the
busier group should be the group with pure (non-SMT) CPUs rather than
the group with SMT CPUs.  We do not want to mark the SMT group as the
busiest and then pull a task off it, leaving the whole core empty.

Otherwise, suppose that in the search for the busiest group we first
encounter an SMT group with 1 task and set it as the busiest.  The
local group is an Atom cluster with 1 task, and we then encounter
another Atom cluster group with 3 tasks.  We will not pick this Atom
cluster group over the SMT group, even though we should.  As a result,
we do not load balance from the busier Atom cluster (with 3 tasks)
towards the local Atom cluster (with 1 task).  It also makes no sense
to pick the 1-task SMT group as the busier group, since we should not
pull the task off the SMT core towards the 1-task Atom cluster and
leave the SMT core completely empty.

Fix this case.

Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/fair.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
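
The asymmetric_groups() helper used by this patch is introduced earlier
in the series and is not shown on this page.  Based on how it is used in
the hunk below, it amounts to a check along these lines (a sketch, not
necessarily the exact code from that patch):

static inline bool asymmetric_groups(struct sched_group *sg1,
				     struct sched_group *sg2)
{
	if (!sg1 || !sg2)
		return false;

	/* One group spans SMT siblings (shared core capacity), the other does not. */
	return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
		(sg2->flags & SD_SHARE_CPUCAPACITY);
}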
  

Comments

Peter Zijlstra May 5, 2023, 1:19 p.m. UTC | #1
On Thu, May 04, 2023 at 09:09:53AM -0700, Tim Chen wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bde962aa160a..8a325db34b02 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -9548,6 +9548,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
>  		break;
>  
>  	case group_has_spare:
> +		/*
> +		 * Do not pick sg with SMT CPUs over sg with pure CPUs,
> +		 * as we do not want to pull task off half empty SMT core
> +		 * and make the core idle.
> +		 */

Comment says what the code does; not why.

> +		if (asymmetric_groups(sds->busiest, sg)) {
> +			if (sds->busiest->flags & SD_SHARE_CPUCAPACITY)
> +				return true;
> +			else
> +				return false;

			return (sds->busiest->flags & SD_SHARE_CPUCAPACITY)
> +		}

Also, should this not be part of the previous patch?
  
Tim Chen May 5, 2023, 10:36 p.m. UTC | #2
On Fri, 2023-05-05 at 15:19 +0200, Peter Zijlstra wrote:
> 
> >  
> >  	case group_has_spare:
> > +		/*
> > +		 * Do not pick sg with SMT CPUs over sg with pure CPUs,
> > +		 * as we do not want to pull task off half empty SMT core
> > +		 * and make the core idle.
> > +		 */
> 
> Comment says what the code does; not why.

Good point, will make the comment better.

> 
> > +		if (asymmetric_groups(sds->busiest, sg)) {
> > +			if (sds->busiest->flags & SD_SHARE_CPUCAPACITY)
> > +				return true;
> > +			else
> > +				return false;
> 
> 			return (sds->busiest->flags & SD_SHARE_CPUCAPACITY)
> > +		}
> 
> Also, should this not be part of the previous patch?

Sure, I can merge it with the previous patch.

Tim
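
The follow-up hunk itself does not appear in this thread.  Folding in
both of Peter's points (a direct return of the flag test and a comment
that explains the "why" rather than the "what"), it might end up looking
like the sketch below; the comment wording here is only illustrative:

	case group_has_spare:
		/*
		 * The comparisons below prefer the group that runs more
		 * tasks.  When one group spans SMT siblings and the other
		 * does not, that could pick the SMT group and pull a task
		 * off a half-busy core, idling the whole core while the
		 * non-SMT cluster stays loaded.  Always keep the non-SMT
		 * group as the busiest in that case.
		 */
		if (asymmetric_groups(sds->busiest, sg))
			return sds->busiest->flags & SD_SHARE_CPUCAPACITY;

Returning the flag test directly, as suggested, keeps the branch to two
lines without changing the behaviour of the original if/else.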
  

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bde962aa160a..8a325db34b02 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9548,6 +9548,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
 		break;
 
 	case group_has_spare:
+		/*
+		 * Do not pick sg with SMT CPUs over sg with pure CPUs,
+		 * as we do not want to pull task off half empty SMT core
+		 * and make the core idle.
+		 */
+		if (asymmetric_groups(sds->busiest, sg)) {
+			if (sds->busiest->flags & SD_SHARE_CPUCAPACITY)
+				return true;
+			else
+				return false;
+		}
+
 		/*
 		 * Select not overloaded group with lowest number of idle cpus
 		 * and highest number of running tasks. We could also compare
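
As a stand-alone illustration of the changelog scenario (a 1-task SMT
group versus a 3-task Atom cluster), the decision can be modelled in
user space.  The snippet below is only a toy model: it assumes that
asymmetric_groups() compares nothing but SD_SHARE_CPUCAPACITY, and it
uses made-up struct and function names rather than the kernel's.

#include <stdbool.h>
#include <stdio.h>

#define SD_SHARE_CPUCAPACITY 0x1	/* stand-in for the kernel flag */

struct group { int flags; int tasks; };

static bool asymmetric_groups(const struct group *a, const struct group *b)
{
	return (a->flags & SD_SHARE_CPUCAPACITY) !=
	       (b->flags & SD_SHARE_CPUCAPACITY);
}

/* Mirrors the new group_has_spare branch in update_sd_pick_busiest(). */
static bool pick_new_busiest(const struct group *busiest, const struct group *sg)
{
	if (asymmetric_groups(busiest, sg))
		return busiest->flags & SD_SHARE_CPUCAPACITY;
	return false;	/* the real code falls through to further checks */
}

int main(void)
{
	struct group smt  = { .flags = SD_SHARE_CPUCAPACITY, .tasks = 1 };
	struct group atom = { .flags = 0,                    .tasks = 3 };

	/* SMT group was picked first; the 3-task Atom cluster replaces it. */
	printf("atom replaces smt: %d\n", pick_new_busiest(&smt, &atom));
	/* Atom cluster already busiest; the 1-task SMT group does not. */
	printf("smt replaces atom: %d\n", pick_new_busiest(&atom, &smt));
	return 0;
}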