sched/fair: merge same code in enqueue_task_fair

Message ID 202312101719+0800-wangjinchao@xfusion.com
State New
Headers
Series sched/fair: merge same code in enqueue_task_fair |

Commit Message

Wang Jinchao Dec. 10, 2023, 9:21 a.m. UTC
  1. The code below is duplicated in two for loops and need to be
   consolidated
2. Fix the bug where a se's on_rq is true but its parent is not

```c
		cfs_rq->h_nr_running++;
		cfs_rq->idle_h_nr_running += idle_h_nr_running;

		if (cfs_rq_is_idle(cfs_rq))
			idle_h_nr_running = 1;

		/* end evaluation on encountering a throttled cfs_rq */
		if (cfs_rq_throttled(cfs_rq))
			goto enqueue_throttle;
```

Signed-off-by: WangJinchao <wangjinchao@xfusion.com>
---
 kernel/sched/fair.c | 31 ++++++++-----------------------
 1 file changed, 8 insertions(+), 23 deletions(-)
  

Comments

Vincent Guittot Dec. 11, 2023, 3:23 p.m. UTC | #1
On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
>
> 1. The code below is duplicated in two for loops and need to be
>    consolidated
> 2. Fix the bug where a se's on_rq is true but its parent is not

Could you clarify which bug you want to fix ?




>
> ```c
>                 cfs_rq->h_nr_running++;
>                 cfs_rq->idle_h_nr_running += idle_h_nr_running;
>
>                 if (cfs_rq_is_idle(cfs_rq))
>                         idle_h_nr_running = 1;
>
>                 /* end evaluation on encountering a throttled cfs_rq */
>                 if (cfs_rq_throttled(cfs_rq))
>                         goto enqueue_throttle;
> ```
>
> Signed-off-by: WangJinchao <wangjinchao@xfusion.com>
> ---
>  kernel/sched/fair.c | 31 ++++++++-----------------------
>  1 file changed, 8 insertions(+), 23 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d7a3c63a2171..e1373bfd4f2e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6681,30 +6681,15 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>                 cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
>
>         for_each_sched_entity(se) {
> -               if (se->on_rq)
> -                       break;
>                 cfs_rq = cfs_rq_of(se);
> -               enqueue_entity(cfs_rq, se, flags);
> -
> -               cfs_rq->h_nr_running++;
> -               cfs_rq->idle_h_nr_running += idle_h_nr_running;
> -
> -               if (cfs_rq_is_idle(cfs_rq))
> -                       idle_h_nr_running = 1;
> -
> -               /* end evaluation on encountering a throttled cfs_rq */
> -               if (cfs_rq_throttled(cfs_rq))
> -                       goto enqueue_throttle;
> -
> -               flags = ENQUEUE_WAKEUP;
> -       }
> -
> -       for_each_sched_entity(se) {
> -               cfs_rq = cfs_rq_of(se);
> -
> -               update_load_avg(cfs_rq, se, UPDATE_TG);
> -               se_update_runnable(se);
> -               update_cfs_group(se);
> +               if (se->on_rq) {
> +                       update_load_avg(cfs_rq, se, UPDATE_TG);
> +                       se_update_runnable(se);
> +                       update_cfs_group(se);
> +               } else {
> +                       enqueue_entity(cfs_rq, se, flags);
> +                       flags = ENQUEUE_WAKEUP;
> +               }
>
>                 cfs_rq->h_nr_running++;
>                 cfs_rq->idle_h_nr_running += idle_h_nr_running;
> --
> 2.40.0
>
  
Tim Chen Dec. 11, 2023, 10:22 p.m. UTC | #2
On Sun, 2023-12-10 at 17:21 +0800, WangJinchao wrote:
> 1. The code below is duplicated in two for loops and need to be
>    consolidated
> 2. Fix the bug where a se's on_rq is true but its parent is not

In the current code, If a se is already on a rq, all the parents would have already been
on rq too.  I don't think there's a case where se's on_rq and parent
is not for the current code before your patch.  Otherwise the child
would never get scheduled. I don't think we have seen such bug being
reported.

> 
> ```c
> 		cfs_rq->h_nr_running++;
> 		cfs_rq->idle_h_nr_running += idle_h_nr_running;
> 
> 		if (cfs_rq_is_idle(cfs_rq))
> 			idle_h_nr_running = 1;
> 
> 		/* end evaluation on encountering a throttled cfs_rq */
> 		if (cfs_rq_throttled(cfs_rq))
> 			goto enqueue_throttle;
> ```
> 
> Signed-off-by: WangJinchao <wangjinchao@xfusion.com>
> ---
>  kernel/sched/fair.c | 31 ++++++++-----------------------
>  1 file changed, 8 insertions(+), 23 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d7a3c63a2171..e1373bfd4f2e 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6681,30 +6681,15 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
>  		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
>  
>  	for_each_sched_entity(se) {
> -		if (se->on_rq)
> -			break;
>  		cfs_rq = cfs_rq_of(se);
> -		enqueue_entity(cfs_rq, se, flags);
> -
> -		cfs_rq->h_nr_running++;
> -		cfs_rq->idle_h_nr_running += idle_h_nr_running;
> -
> -		if (cfs_rq_is_idle(cfs_rq))
> -			idle_h_nr_running = 1;
> -
> -		/* end evaluation on encountering a throttled cfs_rq */
> -		if (cfs_rq_throttled(cfs_rq))
> -			goto enqueue_throttle;
> -
> -		flags = ENQUEUE_WAKEUP;
> -	}
> -
> -	for_each_sched_entity(se) {
> -		cfs_rq = cfs_rq_of(se);
> -
> -		update_load_avg(cfs_rq, se, UPDATE_TG);
> -		se_update_runnable(se);
> -		update_cfs_group(se);
> +		if (se->on_rq) {
> +			update_load_avg(cfs_rq, se, UPDATE_TG);
> +			se_update_runnable(se);
> +			update_cfs_group(se);
> +		} else {
> +			enqueue_entity(cfs_rq, se, flags);
> +			flags = ENQUEUE_WAKEUP;
> +		}

The code change looks like a reasonable simplification.  Logic
is the same as the old code, assuming that once a se's on_rq flag
is true, all its parents are too.

Thanks.

Tim

>  
>  		cfs_rq->h_nr_running++;
>  		cfs_rq->idle_h_nr_running += idle_h_nr_running;
  
Wang Jinchao Dec. 13, 2023, 7:03 a.m. UTC | #3
On Mon, Dec 11, 2023 at 04:23:52PM +0100, Vincent Guittot wrote:
> On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
> >
> > 1. The code below is duplicated in two for loops and need to be
> >    consolidated
> > 2. Fix the bug where a se's on_rq is true but its parent is not
> 
> Could you clarify which bug you want to fix ?
Taking into account the additional information provided by Tim,
this is not a bug. Therefore, this patch is merely a logical
simplification.

I will send out a v2 and update the description in it.
  
Vincent Guittot Dec. 13, 2023, 8:23 a.m. UTC | #4
On Wed, 13 Dec 2023 at 08:04, Wang Jinchao <wangjinchao@xfusion.com> wrote:
>
> On Mon, Dec 11, 2023 at 04:23:52PM +0100, Vincent Guittot wrote:
> > On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
> > >
> > > 1. The code below is duplicated in two for loops and need to be
> > >    consolidated
> > > 2. Fix the bug where a se's on_rq is true but its parent is not
> >
> > Could you clarify which bug you want to fix ?
> Taking into account the additional information provided by Tim,
> this is not a bug. Therefore, this patch is merely a logical
> simplification.

If there is no bug why changing it ?

The duplication is done in order to have the same pattern in :
enqueue_task_fair
dequeue_task_fair
throttle_cfs_rq
unthrottle_cfs_rq

so there is no need to change it

>
> I will send out a v2 and update the description in it.
  
Wang Jinchao Dec. 14, 2023, 9:47 a.m. UTC | #5
On Wed, Dec 13, 2023 at 09:23:46AM +0100, Vincent Guittot wrote:
> On Wed, 13 Dec 2023 at 08:04, Wang Jinchao <wangjinchao@xfusion.com> wrote:
> >
> > On Mon, Dec 11, 2023 at 04:23:52PM +0100, Vincent Guittot wrote:
> > > On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
> > > >
> > > > 1. The code below is duplicated in two for loops and need to be
> > > >    consolidated
> > > > 2. Fix the bug where a se's on_rq is true but its parent is not
> > >
> > > Could you clarify which bug you want to fix ?
> > Taking into account the additional information provided by Tim,
> > this is not a bug. Therefore, this patch is merely a logical
> > simplification.
> 
> If there is no bug why changing it ?
For two reasons:
1. (from Abel Wu) 
It doesn't need to, but it can actually bring some benefit from
the point of view of text size, especially in warehouse-scale
computers where icache is extremely contended.

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-56 (-56)
Function                                     old     new   delta
enqueue_task_fair                            936     880     -56
Total: Before=64899, After=64843, chg -0.09%

2. For better code comprehension
I became curious when I reached this part, wondering why there is a lot of 
repetition inside these two for-loops. Then I thought about 'do not repeat yourself,' 
and I feel that merging them would lead to a clearer understanding. Of course, 
it might be because I am just starting to read scheduler-related code and am not
yet familiar with the entire logic.
> 
> The duplication is done in order to have the same pattern in :
> enqueue_task_fair
> dequeue_task_fair
> throttle_cfs_rq
> unthrottle_cfs_rq
Due to the two points mentioned above, do we need to adjust all four functions?
> 
> so there is no need to change it
I plan to get familiar with the scheduler-related code first and then consider this.

Thanks
>
  
Abel Wu Dec. 14, 2023, 12:10 p.m. UTC | #6
On 12/14/23 5:47 PM, Wang Jinchao Wrote:
> On Wed, Dec 13, 2023 at 09:23:46AM +0100, Vincent Guittot wrote:
>> On Wed, 13 Dec 2023 at 08:04, Wang Jinchao <wangjinchao@xfusion.com> wrote:
>>>
>>> On Mon, Dec 11, 2023 at 04:23:52PM +0100, Vincent Guittot wrote:
>>>> On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
>>>>>
>>>>> 1. The code below is duplicated in two for loops and need to be
>>>>>     consolidated
>>>>> 2. Fix the bug where a se's on_rq is true but its parent is not
>>>>
>>>> Could you clarify which bug you want to fix ?
>>> Taking into account the additional information provided by Tim,
>>> this is not a bug. Therefore, this patch is merely a logical
>>> simplification.
>>
>> If there is no bug why changing it ?
> For two reasons:
> 1. (from Abel Wu)
> It doesn't need to, but it can actually bring some benefit from
> the point of view of text size, especially in warehouse-scale
> computers where icache is extremely contended.
> 
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-56 (-56)
> Function                                     old     new   delta
> enqueue_task_fair                            936     880     -56
> Total: Before=64899, After=64843, chg -0.09%

But TBH this benefit is kind of weak to argue about, given that you
don't have any data supporting it.

> 
> 2. For better code comprehension
> I became curious when I reached this part, wondering why there is a lot of
> repetition inside these two for-loops. Then I thought about 'do not repeat yourself,'
> and I feel that merging them would lead to a clearer understanding. Of course,
> it might be because I am just starting to read scheduler-related code and am not
> yet familiar with the entire logic.
>>
>> The duplication is done in order to have the same pattern in :
>> enqueue_task_fair
>> dequeue_task_fair
>> throttle_cfs_rq
>> unthrottle_cfs_rq
> Due to the two points mentioned above, do we need to adjust all four functions?
>>
>> so there is no need to change it
> I plan to get familiar with the scheduler-related code first and then consider this.
> 
> Thanks
>>
  
Wang Jinchao Dec. 14, 2023, 12:42 p.m. UTC | #7
On Thu, Dec 14, 2023 at 08:10:57PM +0800, Abel Wu wrote:
> On 12/14/23 5:47 PM, Wang Jinchao Wrote:
> > On Wed, Dec 13, 2023 at 09:23:46AM +0100, Vincent Guittot wrote:
> > > On Wed, 13 Dec 2023 at 08:04, Wang Jinchao <wangjinchao@xfusion.com> wrote:
> > > > 
> > > > On Mon, Dec 11, 2023 at 04:23:52PM +0100, Vincent Guittot wrote:
> > > > > On Sun, 10 Dec 2023 at 10:22, WangJinchao <wangjinchao@xfusion.com> wrote:
> > > > > > 
> > > > > > 1. The code below is duplicated in two for loops and need to be
> > > > > >     consolidated
> > > > > > 2. Fix the bug where a se's on_rq is true but its parent is not
> > > > > 
> > > > > Could you clarify which bug you want to fix ?
> > > > Taking into account the additional information provided by Tim,
> > > > this is not a bug. Therefore, this patch is merely a logical
> > > > simplification.
> > > 
> > > If there is no bug why changing it ?
> > For two reasons:
> > 1. (from Abel Wu)
> > It doesn't need to, but it can actually bring some benefit from
> > the point of view of text size, especially in warehouse-scale
> > computers where icache is extremely contended.
> > 
> > add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-56 (-56)
> > Function                                     old     new   delta
> > enqueue_task_fair                            936     880     -56
> > Total: Before=64899, After=64843, chg -0.09%
> 
> But TBH this benefit is kind of weak to argue about, given that you
> don't have any data supporting it.
Agree. My initinal target is clear comprehension.
And thanks for your numbers.
> 
> > 
> > 2. For better code comprehension
> > I became curious when I reached this part, wondering why there is a lot of
> > repetition inside these two for-loops. Then I thought about 'do not repeat yourself,'
> > and I feel that merging them would lead to a clearer understanding. Of course,
> > it might be because I am just starting to read scheduler-related code and am not
> > yet familiar with the entire logic.
> > > 
> > > The duplication is done in order to have the same pattern in :
> > > enqueue_task_fair
> > > dequeue_task_fair
> > > throttle_cfs_rq
> > > unthrottle_cfs_rq
> > Due to the two points mentioned above, do we need to adjust all four functions?
> > > 
> > > so there is no need to change it
> > I plan to get familiar with the scheduler-related code first and then consider this.
> > 
> > Thanks
> > >
  

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d7a3c63a2171..e1373bfd4f2e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6681,30 +6681,15 @@  enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		cpufreq_update_util(rq, SCHED_CPUFREQ_IOWAIT);
 
 	for_each_sched_entity(se) {
-		if (se->on_rq)
-			break;
 		cfs_rq = cfs_rq_of(se);
-		enqueue_entity(cfs_rq, se, flags);
-
-		cfs_rq->h_nr_running++;
-		cfs_rq->idle_h_nr_running += idle_h_nr_running;
-
-		if (cfs_rq_is_idle(cfs_rq))
-			idle_h_nr_running = 1;
-
-		/* end evaluation on encountering a throttled cfs_rq */
-		if (cfs_rq_throttled(cfs_rq))
-			goto enqueue_throttle;
-
-		flags = ENQUEUE_WAKEUP;
-	}
-
-	for_each_sched_entity(se) {
-		cfs_rq = cfs_rq_of(se);
-
-		update_load_avg(cfs_rq, se, UPDATE_TG);
-		se_update_runnable(se);
-		update_cfs_group(se);
+		if (se->on_rq) {
+			update_load_avg(cfs_rq, se, UPDATE_TG);
+			se_update_runnable(se);
+			update_cfs_group(se);
+		} else {
+			enqueue_entity(cfs_rq, se, flags);
+			flags = ENQUEUE_WAKEUP;
+		}
 
 		cfs_rq->h_nr_running++;
 		cfs_rq->idle_h_nr_running += idle_h_nr_running;