[v2,0/6] sched/deadline: cpuset: Rework DEADLINE bandwidth restoration

Message ID 20230503072228.115707-1-juri.lelli@redhat.com

Message

Juri Lelli May 3, 2023, 7:22 a.m. UTC
Qais reported [1] that, when root domains are rebuilt, iterating over
all tasks to find the DEADLINE ones whose bandwidth must be restored on
those root domains can be a costly operation (10+ ms delays on
suspend-resume). He proposed skipping the rebuild of root domains for
certain operations, but that approach seemed arch specific and possibly
error prone, as the paths that ultimately trigger a rebuild can be
quite convoluted (thanks Qais for spending time on this!).

This is v2 of an alternative approach (v1 at [3]) to fix the problem.

 01/06 - Rename functions dealing with DEADLINE accounting (cleanup
         suggested by Qais) - no functional change
 02/06 - Bring back cpuset_mutex (so that we have write access to cpusets
         from scheduler operations - and we also fix some problems
         associated with percpu_cpuset_rwsem)
 03/06 - Keep track of the number of DEADLINE tasks belonging to each cpuset
 04/06 - Use this information to only perform the costly iteration if
         DEADLINE tasks are actually present in the cpuset for which a
         corresponding root domain is being rebuilt
 05/06 - Create DL BW alloc, free & check overflow interface for bulk
         bandwidth allocation/removal - no functional change 
 06/06 - Fix bandwidth allocation handling for cgroup operation
         involving multiple tasks
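
As a rough illustration of the idea behind 03/06 and 04/06 (not the code
from the patches), the userspace sketch below keeps a per-cpuset count of
DEADLINE tasks and walks the task list only when that count is non-zero.
All names here (struct cpuset_sketch, sketch_inc_dl_tasks(),
sketch_restore_dl_bw(), tasks_visited) are hypothetical and exist only
for this example; locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct task_sketch {
	bool is_deadline;	/* true for a SCHED_DEADLINE task */
	unsigned long dl_bw;	/* bandwidth this task contributes */
};

struct cpuset_sketch {
	struct task_sketch *tasks;
	size_t nr_tasks;
	int nr_deadline_tasks;	/* updated when a DEADLINE task joins/leaves */
};

/* Instrumentation for the example: how many tasks we had to look at. */
static size_t tasks_visited;

static void sketch_inc_dl_tasks(struct cpuset_sketch *cs)
{
	cs->nr_deadline_tasks++;
}

static void sketch_dec_dl_tasks(struct cpuset_sketch *cs)
{
	assert(cs->nr_deadline_tasks > 0);
	cs->nr_deadline_tasks--;
}

/*
 * Return the DEADLINE bandwidth to be re-added for this cpuset after a
 * root domain rebuild. The task walk is skipped when the counter says
 * there is nothing to restore.
 */
static unsigned long sketch_restore_dl_bw(const struct cpuset_sketch *cs)
{
	unsigned long total = 0;

	if (cs->nr_deadline_tasks == 0)
		return 0;	/* fast path: no iteration at all */

	for (size_t i = 0; i < cs->nr_tasks; i++) {
		tasks_visited++;
		if (cs->tasks[i].is_deadline)
			total += cs->tasks[i].dl_bw;
	}

	return total;
}
```

The cost of the walk is paid only by cpusets that actually contain
DEADLINE tasks, at the price of keeping the counter in sync whenever a
task enters or leaves the class or moves between cpusets.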

Changes with respect to the v1 posting [3]:

 1 - rebase on top of Linus' tree as of today (865fdb08197e)
 2 - move patch 6 to position 4 (suggested by Qais)

As the rebase needed some work, I decided to drop the Tested-by and
Reviewed-by tags. Please take another look, just in case I messed
something up.

This set is also available from

https://github.com/jlelli/linux.git deadline/rework-cpusets

Best,
Juri

1 - https://lore.kernel.org/lkml/20230206221428.2125324-1-qyousef@layalina.io/
2 - RFC https://lore.kernel.org/lkml/20230315121812.206079-1-juri.lelli@redhat.com/
3 - v1  https://lore.kernel.org/lkml/20230329125558.255239-1-juri.lelli@redhat.com/

Dietmar Eggemann (2):
  sched/deadline: Create DL BW alloc, free & check overflow interface
  cgroup/cpuset: Free DL BW in case can_attach() fails

Juri Lelli (4):
  cgroup/cpuset: Rename functions dealing with DEADLINE accounting
  sched/cpuset: Bring back cpuset_mutex
  sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
  cgroup/cpuset: Iterate only if DEADLINE tasks are present

 include/linux/cpuset.h  |  12 +-
 include/linux/sched.h   |   4 +-
 kernel/cgroup/cgroup.c  |   4 +
 kernel/cgroup/cpuset.c  | 242 ++++++++++++++++++++++++++--------------
 kernel/sched/core.c     |  41 +++----
 kernel/sched/deadline.c |  67 ++++++++---
 kernel/sched/sched.h    |   2 +-
 7 files changed, 244 insertions(+), 128 deletions(-)
  

Comments

Peter Zijlstra May 4, 2023, 6:25 a.m. UTC | #1
On Wed, May 03, 2023 at 09:22:22AM +0200, Juri Lelli wrote:

> Dietmar Eggemann (2):
>   sched/deadline: Create DL BW alloc, free & check overflow interface
>   cgroup/cpuset: Free DL BW in case can_attach() fails
> 
> Juri Lelli (4):
>   cgroup/cpuset: Rename functions dealing with DEADLINE accounting
>   sched/cpuset: Bring back cpuset_mutex
>   sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
>   cgroup/cpuset: Iterate only if DEADLINE tasks are present
> 
>  include/linux/cpuset.h  |  12 +-
>  include/linux/sched.h   |   4 +-
>  kernel/cgroup/cgroup.c  |   4 +
>  kernel/cgroup/cpuset.c  | 242 ++++++++++++++++++++++++++--------------
>  kernel/sched/core.c     |  41 +++----
>  kernel/sched/deadline.c |  67 ++++++++---
>  kernel/sched/sched.h    |   2 +-
>  7 files changed, 244 insertions(+), 128 deletions(-)

Aside from a few niggles, these look fine to me. Who were you expecting
to merge these, tj or me?
  
Juri Lelli May 4, 2023, 8:17 a.m. UTC | #2
On 04/05/23 08:25, Peter Zijlstra wrote:
> On Wed, May 03, 2023 at 09:22:22AM +0200, Juri Lelli wrote:
> 
> > Dietmar Eggemann (2):
> >   sched/deadline: Create DL BW alloc, free & check overflow interface
> >   cgroup/cpuset: Free DL BW in case can_attach() fails
> > 
> > Juri Lelli (4):
> >   cgroup/cpuset: Rename functions dealing with DEADLINE accounting
> >   sched/cpuset: Bring back cpuset_mutex
> >   sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
> >   cgroup/cpuset: Iterate only if DEADLINE tasks are present
> > 
> >  include/linux/cpuset.h  |  12 +-
> >  include/linux/sched.h   |   4 +-
> >  kernel/cgroup/cgroup.c  |   4 +
> >  kernel/cgroup/cpuset.c  | 242 ++++++++++++++++++++++++++--------------
> >  kernel/sched/core.c     |  41 +++----
> >  kernel/sched/deadline.c |  67 ++++++++---
> >  kernel/sched/sched.h    |   2 +-
> >  7 files changed, 244 insertions(+), 128 deletions(-)
> 
> Aside from a few niggles, these look fine to me. Who were you expecting
> to merge these, tj or me?

Thanks for reviewing!

Not entirely sure, it's kind of split, but maybe the cgroup changes are
predominant (cpuset_mutex is probably contributing the most). So, maybe
tj? Assuming this looks good to him as well of course. :)

Thanks!
  
Tejun Heo May 5, 2023, 7:31 p.m. UTC | #3
On Thu, May 04, 2023 at 10:17:41AM +0200, Juri Lelli wrote:
> On 04/05/23 08:25, Peter Zijlstra wrote:
> > On Wed, May 03, 2023 at 09:22:22AM +0200, Juri Lelli wrote:
> > 
> > > Dietmar Eggemann (2):
> > >   sched/deadline: Create DL BW alloc, free & check overflow interface
> > >   cgroup/cpuset: Free DL BW in case can_attach() fails
> > > 
> > > Juri Lelli (4):
> > >   cgroup/cpuset: Rename functions dealing with DEADLINE accounting
> > >   sched/cpuset: Bring back cpuset_mutex
> > >   sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
> > >   cgroup/cpuset: Iterate only if DEADLINE tasks are present
> > > 
> > >  include/linux/cpuset.h  |  12 +-
> > >  include/linux/sched.h   |   4 +-
> > >  kernel/cgroup/cgroup.c  |   4 +
> > >  kernel/cgroup/cpuset.c  | 242 ++++++++++++++++++++++++++--------------
> > >  kernel/sched/core.c     |  41 +++----
> > >  kernel/sched/deadline.c |  67 ++++++++---
> > >  kernel/sched/sched.h    |   2 +-
> > >  7 files changed, 244 insertions(+), 128 deletions(-)
> > 
> > Aside from a few niggles, these look fine to me. Who were you expecting
> > to merge these, tj or me?
> 
> Thanks for reviewing!
> 
> Not entirely sure, it's kind of split, but maybe the cgroup changes are
> predominant (cpuset_mutex is probably contributing the most). So, maybe
> tj? Assuming this looks good to him as well of course. :)

Yeah, they all look sane to me and both Waiman and Peter seem okay with
them. If you post an updated version with the minor suggestions applied,
I'll route the series through the cgroup tree.

Thanks.
  
Juri Lelli May 8, 2023, 8:02 a.m. UTC | #4
Hi,

On 05/05/23 09:31, Tejun Heo wrote:
> On Thu, May 04, 2023 at 10:17:41AM +0200, Juri Lelli wrote:
> > On 04/05/23 08:25, Peter Zijlstra wrote:
> > > On Wed, May 03, 2023 at 09:22:22AM +0200, Juri Lelli wrote:
> > > 
> > > > Dietmar Eggemann (2):
> > > >   sched/deadline: Create DL BW alloc, free & check overflow interface
> > > >   cgroup/cpuset: Free DL BW in case can_attach() fails
> > > > 
> > > > Juri Lelli (4):
> > > >   cgroup/cpuset: Rename functions dealing with DEADLINE accounting
> > > >   sched/cpuset: Bring back cpuset_mutex
> > > >   sched/cpuset: Keep track of SCHED_DEADLINE task in cpusets
> > > >   cgroup/cpuset: Iterate only if DEADLINE tasks are present
> > > > 
> > > >  include/linux/cpuset.h  |  12 +-
> > > >  include/linux/sched.h   |   4 +-
> > > >  kernel/cgroup/cgroup.c  |   4 +
> > > >  kernel/cgroup/cpuset.c  | 242 ++++++++++++++++++++++++++--------------
> > > >  kernel/sched/core.c     |  41 +++----
> > > >  kernel/sched/deadline.c |  67 ++++++++---
> > > >  kernel/sched/sched.h    |   2 +-
> > > >  7 files changed, 244 insertions(+), 128 deletions(-)
> > > 
> > > Aside from a few niggles, these look fine to me. Who were you expecting
> > > to merge these, tj or me?
> > 
> > Thanks for reviewing!
> > 
> > Not entirely sure, it's kind of split, but maybe the cgroup changes are
> > predominant (cpuset_mutex is probably contributing the most). So, maybe
> > tj? Assuming this looks good to him as well of course. :)
> 
> Yeah, they all look sane to me and both Waiman and Peter seem okay with
> them. If you post an updated version with the minor suggestions applied,
> I'll route the series through the cgroup tree.

Thanks for reviewing and for taking care of the series. I have just
posted v3 (20230508075854.17215-1-juri.lelli@redhat.com).

Best,
Juri