[2/2] block: adjust CFS request expire time

Message ID 20240220061542.489922-2-zhaoyang.huang@unisoc.com
State New
Series [1/2] sched: introduce helper function to calculate distribution over sched class

Commit Message

zhaoyang.huang Feb. 20, 2024, 6:15 a.m. UTC
  From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>

According to the current policy, CFS tasks may suffer involuntary IO
latency when preempted by RT/DL tasks or IRQs, since those hold privilege
over CFS in both the CPU and the IO scheduler. This commit introduces an
approximate and lightweight method to reduce this effect by scaling the
expire time by the CFS proportion of the whole CPU active time.
The average utilization of the CPU's run queue reflects the historical
active proportion of the different task types, which makes it valid for
this goal from the following three perspectives:

1. All sched classes' load (util) is tracked and calculated in the same
way (using the geometric series known as PELT; see the sketch after this
list).
2. The legacy policy is kept by NOT adjusting the rq's position in the
fifo_list; only the expire_time changes.
3. The fixed expire time (hundreds of ms) is in the same range as the CPU
avg_load accounting series (the utilization decays to 0.5 in 32 ms).
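
For reference, here is a minimal, hypothetical sketch of what the helper
introduced in patch [1/2] could look like (the real implementation is not
shown in this mail; the PELT field names follow the in-tree accounting,
and scaling @val by the CFS share is an assumption):

	/*
	 * Hypothetical sketch only -- the actual helper comes from patch
	 * [1/2] and may differ.  Scale @val by the proportion of CFS
	 * utilization among the run queue's whole PELT utilization.
	 * (IRQ time could be added via rq->avg_irq where available.)
	 */
	static unsigned long cfs_prop_by_util(struct task_struct *p,
					      unsigned long val)
	{
		struct rq *rq = task_rq(p);
		unsigned long cfs = READ_ONCE(rq->cfs.avg.util_avg);
		unsigned long rt = READ_ONCE(rq->avg_rt.util_avg);
		unsigned long dl = READ_ONCE(rq->avg_dl.util_avg);
		unsigned long total = cfs + rt + dl;

		if (!total)
			return val;

		/* e.g. val * 91 / 100 when CFS holds 91% of the load */
		return div64_ul((u64)val * cfs, total);
	}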

TaskA
sched in
|
|
|
submit_bio
|
|
|
fifo_time = jiffies + expire
(insert_request)

TaskB
sched in
|
|
vfs_xxx
|
|preempted by RT,DL,IRQ
|\
| This period of time is unfair to TaskB's IO request and should be adjusted
|/
|
submit_bio
|
|
|
fifo_time = jiffies + expire * CFS_PROPORTION(rq)
(insert_request)
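
Here CFS_PROPORTION(rq) stands for the CFS share of the run queue's PELT
utilization when the request is inserted. As a worked example with the
numbers from the trace in [2] of the discussion below: with expire = 1250
jiffies and a CFS proportion of about 92% (logged as prop 91), the request
gets fifo_time = jiffies + 1149 instead of jiffies + 1250, i.e. it is
treated as expired roughly 100 jiffies earlier to compensate for the CPU
time the task lost to preemption.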

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
---
 block/mq-deadline.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
  

Comments

Christoph Hellwig Feb. 20, 2024, 9:42 a.m. UTC | #1
On Tue, Feb 20, 2024 at 02:15:42PM +0800, zhaoyang.huang wrote:
> From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> 
> According to the current policy, CFS tasks may suffer involuntary IO
> latency when preempted by RT/DL tasks or IRQs, since those hold privilege
> over CFS in both the CPU and the IO scheduler.

What is 'current policy', what is CFS, what is RT/DL?  What privilege
is possessed?

> 1. All sched classes' load (util) is tracked and calculated in the same
> way (using the geometric series known as PELT).
> 2. The legacy policy is kept by NOT adjusting the rq's position in the
> fifo_list; only the expire_time changes.
> 3. The fixed expire time (hundreds of ms) is in the same range as the CPU
> avg_load accounting series (the utilization decays to 0.5 in 32 ms).

What problem does this fix, i.e. what performance number are improved
or what other effects does it have?

> +		 * The expire time is adjusted via calculating the proportion of
> +		 * CFS's activation among whole cpu time during last several
> +		 * dazen's ms.Whearas, this would NOT affect the rq's position in
> +		 * fifo_list but only take effect when this rq is checked for its
> +		 * expire time when at head.
>  		 */

Please spell check the comment, fix the formatting to have whitespace
after sentences, and never exceed 80 characters in block comments.
  
Zhaoyang Huang Feb. 20, 2024, 10:37 a.m. UTC | #2
On Tue, Feb 20, 2024 at 5:42 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Tue, Feb 20, 2024 at 02:15:42PM +0800, zhaoyang.huang wrote:
> > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com>
> >
> > According to the current policy, CFS tasks may suffer involuntary IO
> > latency when preempted by RT/DL tasks or IRQs, since those hold privilege
> > over CFS in both the CPU and the IO scheduler.
>
> What is 'current policy', what is CFS, what is RT/DL?  What privilege
> is possessed?
CFS and RT/DL are sched classes, among which CFS has the least
privilege in getting the CPU.
IMO, 'current policy' refers to two perspectives:
1. An RT task on the same core as a CFS task gets privileges over CFS in
both the CPU and the IO scheduler (deadline on duty). Could we make the
CFS requests' expire_time earlier than it is now?
2. In terms of the timing of inserting the request, preempted CFS tasks
involuntarily lose fairness compared with non-preempted CFS tasks. Could
we decrease this impact in some way?
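
For context, mq-deadline never reorders the FIFO here: fifo_time is only
consulted once a request reaches the head of the list. A sketch of that
check, modeled on deadline_check_fifo() in block/mq-deadline.c (the exact
form differs between kernel versions):

	/*
	 * Only the request at the head of the FIFO list has its expire
	 * time compared against jiffies, so scaling fifo_time never
	 * changes a request's position in the list.
	 */
	static inline int deadline_check_fifo(struct dd_per_prio *per_prio,
					      enum dd_data_dir data_dir)
	{
		struct request *rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);

		return time_is_before_eq_jiffies((unsigned long)rq->fifo_time);
	}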
>
> > 1. All sched classes' load (util) is tracked and calculated in the same
> > way (using the geometric series known as PELT).
> > 2. The legacy policy is kept by NOT adjusting the rq's position in the
> > fifo_list; only the expire_time changes.
> > 3. The fixed expire time (hundreds of ms) is in the same range as the CPU
> > avg_load accounting series (the utilization decays to 0.5 in 32 ms).
>
> What problem does this fix, i.e. what performance number are improved
> or what other effects does it have?
I have verified this commit with benchmark tools such as fio and
Androbench. Neither regression nor improvement was found. Analysing the
log below [2], I find that CFS occupies most of the CPU for the most
part, so it may make more sense to apply the adjustment only against a
preemption threshold, as in [1].

[1]
-		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];

+		/* adjust the expire time against a 50% CFS-preemption threshold */
+		fifo_expire = cfs_prop_by_util(current, 100) < 50 ?
+			dd->fifo_expire[data_dir] :
+			cfs_prop_by_util(current, dd->fifo_expire[data_dir]);
+		rq->fifo_time = jiffies + fifo_expire;
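
(Note that, as written, the ternary keeps the default expire time when
the CFS proportion drops below 50% and applies the proportional scaling
otherwise, so the adjustment shortens the expire time by at most half.)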

[2]
// prop is the proportion of CFS utilization, which is mostly above 90
// (i.e. 90%) during common benchmark tests
   kworker/u16:3-73      [000] ...1.   321.140143: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140414: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140505: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140574: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140630: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140682: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
   kworker/u16:3-73      [000] ...1.   321.140736: dd_insert_request: dir 1,cfs 513, prop 91, orig_expire 1250, expire 1149
              dd-7296    [006] ...1.   321.143139: dd_insert_request: dir 0,cfs 610, prop 92, orig_expire 125, expire 115
              dd-7296    [006] ...1.   321.143287: dd_insert_request: dir 0,cfs 610, prop 92, orig_expire 125, expire 115
              dd-7296    [004] ...1.   321.156074: dd_insert_request: dir 0,cfs 691, prop 97, orig_expire 125, expire 122
              dd-7296    [004] ...1.   321.156202: dd_insert_request: dir 0,cfs 691, prop 97, orig_expire 125, expire 122

>
> > +              * The expire time is adjusted via calculating the proportion of
> > +              * CFS's activation among whole cpu time during last several
> > +              * dazen's ms.Whearas, this would NOT affect the rq's position in
> > +              * fifo_list but only take effect when this rq is checked for its
> > +              * expire time when at head.
> >                */
>
> Please spell check the comment, fix the formatting to have whitespace
> after sentences, and never exceed 80 characters in block comments.
ok.
>
  

Patch

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index f958e79277b8..1e538cb2783b 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -839,8 +839,15 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 		/*
 		 * set expire time and add to fifo list
+		 * The expire time is adjusted by the proportion of CFS
+		 * activation among the whole CPU time during the last few
+		 * dozen milliseconds.  Whereas this does NOT affect the rq's
+		 * position in the fifo_list, it only takes effect when the
+		 * rq is checked for its expire time at the head.
 		 */
-		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
+		rq->fifo_time = jiffies +
+			cfs_prop_by_util(current, dd->fifo_expire[data_dir]);
+
 		insert_before = &per_prio->fifo_list[data_dir];
 #ifdef CONFIG_BLK_DEV_ZONED
 		/*