sched/fair: fix possible active balance misbehavior

Message ID 20230621065331.3793767-1-linmiaohe@huawei.com
State New
Series sched/fair: fix possible active balance misbehavior

Commit Message

Miaohe Lin June 21, 2023, 6:53 a.m. UTC
In the LBF_DST_PINNED case, env.dst_cpu is no longer equal to this_cpu. So
when need_active_balance() returns true, env.dst_cpu should be used for the
active balance instead of this_cpu.

Fixes: 88b8dac0a14c ("sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 kernel/sched/fair.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
  

Comments

Abel Wu June 21, 2023, 7:36 a.m. UTC | #1
Hi Miaohe,

On 6/21/23 2:53 PM, Miaohe Lin wrote:
> In the LBF_DST_PINNED case, env.dst_cpu is no longer equal to this_cpu. So
> when need_active_balance() returns true, env.dst_cpu should be used for the
> active balance instead of this_cpu.

Active LB is the last resort for balancing load: by the time we actually
do active LB, no task has been found that can be moved to the local group.
So I don't think there is much difference between this CPU and the
newly selected dst_cpu, as they are both in the local sched group and
the sched group is treated as a whole from the point of view of balancing.

Best,
	Abel

> 
> Fixes: 88b8dac0a14c ("sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>   kernel/sched/fair.c | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5e90e9658528..28ff831ee847 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10968,14 +10968,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>   			/*
>   			 * Don't kick the active_load_balance_cpu_stop,
>   			 * if the curr task on busiest CPU can't be
> -			 * moved to this_cpu:
> +			 * moved to env.dst_cpu:
>   			 */
> -			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> +			if (!cpumask_test_cpu(env.dst_cpu, busiest->curr->cpus_ptr)) {
>   				raw_spin_rq_unlock_irqrestore(busiest, flags);
>   				goto out_one_pinned;
>   			}
>   
> -			/* Record that we found at least one task that could run on this_cpu */
> +			/* Record that we found at least one task that could run on env.dst_cpu */
>   			env.flags &= ~LBF_ALL_PINNED;
>   
>   			/*
> @@ -10985,7 +10985,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>   			 */
>   			if (!busiest->active_balance) {
>   				busiest->active_balance = 1;
> -				busiest->push_cpu = this_cpu;
> +				busiest->push_cpu = env.dst_cpu;
>   				active_balance = 1;
>   			}
>   			raw_spin_rq_unlock_irqrestore(busiest, flags);
  

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5e90e9658528..28ff831ee847 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10968,14 +10968,14 @@  static int load_balance(int this_cpu, struct rq *this_rq,
 			/*
 			 * Don't kick the active_load_balance_cpu_stop,
 			 * if the curr task on busiest CPU can't be
-			 * moved to this_cpu:
+			 * moved to env.dst_cpu:
 			 */
-			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
+			if (!cpumask_test_cpu(env.dst_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_rq_unlock_irqrestore(busiest, flags);
 				goto out_one_pinned;
 			}
 
-			/* Record that we found at least one task that could run on this_cpu */
+			/* Record that we found at least one task that could run on env.dst_cpu */
 			env.flags &= ~LBF_ALL_PINNED;
 
 			/*
@@ -10985,7 +10985,7 @@  static int load_balance(int this_cpu, struct rq *this_rq,
 			 */
 			if (!busiest->active_balance) {
 				busiest->active_balance = 1;
-				busiest->push_cpu = this_cpu;
+				busiest->push_cpu = env.dst_cpu;
 				active_balance = 1;
 			}
 			raw_spin_rq_unlock_irqrestore(busiest, flags);
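
The behavioral difference the patch is after can be illustrated with a small userspace model. This is only a sketch, not kernel code: `toy_cpumask_t`, `struct toy_rq`, `toy_cpumask_test_cpu()` and `toy_kick_active_balance()` are hypothetical stand-ins for `busiest->curr->cpus_ptr`, the busiest runqueue, `cpumask_test_cpu()` and the hunk above. It assumes at most 64 CPUs and ignores locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is allowed. */
typedef uint64_t toy_cpumask_t;

/* Hypothetical stand-in for the fields of the busiest runqueue used here. */
struct toy_rq {
	toy_cpumask_t curr_allowed;	/* models busiest->curr->cpus_ptr */
	int active_balance;		/* models busiest->active_balance */
	int push_cpu;			/* models busiest->push_cpu */
};

/* Models cpumask_test_cpu() for the toy mask; assumes 0 <= cpu < 64. */
static bool toy_cpumask_test_cpu(int cpu, toy_cpumask_t mask)
{
	assert(cpu >= 0 && cpu < 64);
	return (mask >> cpu) & 1u;
}

/*
 * Models the patched hunk with the target CPU as a parameter.
 * Passing this_cpu mimics the old code, passing env.dst_cpu the new code.
 * Returns true if active balance was kicked toward target_cpu.
 */
static bool toy_kick_active_balance(struct toy_rq *busiest, int target_cpu)
{
	/* The running task cannot move to the target: "goto out_one_pinned". */
	if (!toy_cpumask_test_cpu(target_cpu, busiest->curr_allowed))
		return false;

	/* Only kick if no active balance is already pending. */
	if (!busiest->active_balance) {
		busiest->active_balance = 1;
		busiest->push_cpu = target_cpu;
		return true;
	}
	return false;
}
```

In this model, take this_cpu = 0 and a re-selected dst_cpu = 2, with the running task allowed only on CPU 2. Calling `toy_kick_active_balance()` with 0 bails out without kicking anything, while calling it with 2 sets `push_cpu` to 2. Whether that difference matters in practice is the question raised in comment #1.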