[-tip] sched/fair: gracefully handle EEVDF scheduling failures

Message ID 20231208112100.18141-1-tiwei.btw@antgroup.com
State New
Series [-tip] sched/fair: gracefully handle EEVDF scheduling failures

Commit Message

Tiwei Bie Dec. 8, 2023, 11:20 a.m. UTC
  The EEVDF scheduling might fail due to unforeseen issues. Previously,
it handled such situations gracefully, which was helpful in identifying
problems, but it no longer does so. Therefore, it would be better to
restore its previous capability.

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
---
 kernel/sched/fair.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
  

Comments

Peter Zijlstra Dec. 8, 2023, 2:32 p.m. UTC | #1
On Fri, Dec 08, 2023 at 07:20:59PM +0800, Tiwei Bie wrote:
> The EEVDF scheduling might fail due to unforeseen issues. Previously,

I might also fly if I jump up. But is there any actual reason to believe
something like that will happen?
  
Tiwei Bie Dec. 9, 2023, 4:51 a.m. UTC | #2
On 12/8/23 10:32 PM, Peter Zijlstra wrote:
> On Fri, Dec 08, 2023 at 07:20:59PM +0800, Tiwei Bie wrote:
>> The EEVDF scheduling might fail due to unforeseen issues. Previously,
> 
> I might also fly if I jump up. But is there any actual reason to believe
> something like that will happen?

Thanks for the quick reply! Sorry, after re-reading the commit log,
I find it confusing as well. I didn't mean that something like that
will happen. I just thought it might be worthwhile to have a sanity
check on 'best', because 'best' is initialized to NULL and is only
conditionally updated. The added 'WARN_ONCE' on '!best' is more like
a 'default' case that catches an unreachable case in a 'switch' block.
There was a similar check in the past that was helpful, and there
seems to be no harm in adding it. If this is reasonable, I'd like to
submit a v2 patch.
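
To illustrate the "sanity check as a default case" idea above, here is a
minimal user-space sketch. It is not the kernel code: warn_once(),
warn_count, struct entity and pick() are hypothetical names invented for
this example, and the search is a plain linear scan for the first eligible
entry rather than the EEVDF heap search. It only shows the shape of the
check: a result pointer that starts as NULL, is updated conditionally, and
gets a warn-once fallback to the first entry if nothing was picked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the kernel's WARN_ONCE(): prints the
 * message only the first time the condition is true, and always returns the
 * condition so it can be used directly inside an if ().
 */
static bool warned_once;
static int warn_count;	/* number of messages actually printed */

static bool warn_once(bool cond, const char *msg)
{
	if (cond && !warned_once) {
		warned_once = true;
		warn_count++;
		fputs(msg, stderr);
	}
	return cond;
}

/* Hypothetical, heavily simplified entity: only what the sketch needs. */
struct entity {
	int id;
	bool eligible;
};

/*
 * Return the first eligible entity. If the scan unexpectedly finds none,
 * warn once and fall back to the first entity, much like a 'default' case
 * that catches a supposedly unreachable branch of a 'switch'.
 */
static struct entity *pick(struct entity *ents, size_t n)
{
	struct entity *first = n ? &ents[0] : NULL;
	struct entity *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (ents[i].eligible) {
			best = &ents[i];
			break;
		}
	}

	/* Sanity check: 'best' should never be NULL here. */
	if (warn_once(!best, "pick failed, picking first\n"))
		best = first;

	return best;
}
```

With this shape the fallback runs on every failed pick, but the message is
printed only once, so a persistent problem does not flood the log.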

PS. I just noticed that the subject line should start with an uppercase
letter according to the rules in the tip tree handbook [1]. The subject
line should be something like: "sched/fair: Sanity check best in pick_eevdf()".

[1] https://www.kernel.org/doc/html/next/process/maintainer-tip.html#patch-subject

Regards,
Tiwei
  

Patch

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index bcea3d55d95d..1b83b3a8e630 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -878,7 +878,7 @@  struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
 static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
 {
 	struct rb_node *node = cfs_rq->tasks_timeline.rb_root.rb_node;
-	struct sched_entity *se = __pick_first_entity(cfs_rq);
+	struct sched_entity *first = __pick_first_entity(cfs_rq);
 	struct sched_entity *curr = cfs_rq->curr;
 	struct sched_entity *best = NULL;
 
@@ -887,7 +887,7 @@  static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
 	 * in this cfs_rq, saving some cycles.
 	 */
 	if (cfs_rq->nr_running == 1)
-		return curr && curr->on_rq ? curr : se;
+		return curr && curr->on_rq ? curr : first;
 
 	if (curr && (!curr->on_rq || !entity_eligible(cfs_rq, curr)))
 		curr = NULL;
@@ -900,14 +900,15 @@  static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
 		return curr;
 
 	/* Pick the leftmost entity if it's eligible */
-	if (se && entity_eligible(cfs_rq, se)) {
-		best = se;
+	if (first && entity_eligible(cfs_rq, first)) {
+		best = first;
 		goto found;
 	}
 
 	/* Heap search for the EEVD entity */
 	while (node) {
 		struct rb_node *left = node->rb_left;
+		struct sched_entity *se;
 
 		/*
 		 * Eligible entities in left subtree are always better
@@ -937,6 +938,9 @@  static struct sched_entity *pick_eevdf(struct cfs_rq *cfs_rq)
 	if (!best || (curr && entity_before(curr, best)))
 		best = curr;
 
+	if (WARN_ONCE(!best, "EEVDF scheduling failed, picking leftmost\n"))
+		best = first;
+
 	return best;
 }