kernel/sched/core: adjust rt_priority accordingly when prio is changed

Message ID 1675245680-2811-1-git-send-email-chensong_2000@189.cn
State New

Commit Message

Song Chen Feb. 1, 2023, 10:01 a.m. UTC
  When a high-priority process tries to acquire an rtmutex that is held by a
low-priority process, the holder's priority is boosted by calling
rt_mutex_setprio() -> __setscheduler_prio().

However, only p->prio is changed; p->rt_priority is not. As a result, the
relation between prio and rt_priority no longer holds:

	prio = MAX_RT_PRIO - 1 - rt_priority

This is confusing to a user who calls sched_getparam(), which returns
only rt_priority.

This patch addresses the issue by updating rt_priority to match the
new value of prio. In addition, sched_getparam() now returns normal_prio for
CFS processes instead of just zero.

Signed-off-by: Song Chen <chensong_2000@189.cn>
---
 kernel/sched/core.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)
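
For illustration only, the relation quoted in the commit message can be
modelled in userspace. This is a minimal sketch, not kernel code:
struct task_model, boost() and the other helper names are hypothetical, and
MAX_RT_PRIO is redefined locally as 100 on the assumption that it matches
the mainline value. It shows that a boost which changes only prio leaves
rt_priority at its old value, so the relation stops holding.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's MAX_RT_PRIO; redefined for this model. */
#define MAX_RT_PRIO 100

/* Hypothetical stand-in for the two task_struct fields under discussion. */
struct task_model {
	int prio;		/* effective priority, lower value wins */
	int rt_priority;	/* value reported by sched_getparam() */
};

/* The relation from the commit message. */
static int rt_priority_to_prio(int rt_priority)
{
	return MAX_RT_PRIO - 1 - rt_priority;
}

static bool mapping_holds(const struct task_model *t)
{
	return t->prio == rt_priority_to_prio(t->rt_priority);
}

/* Model of a priority-inheritance boost: only prio is touched. */
static void boost(struct task_model *t, int donor_prio)
{
	if (donor_prio < t->prio)
		t->prio = donor_prio;
}

/* Usage: a holder at rt_priority 10 is boosted by a waiter at 50. */
static void model_example(void)
{
	struct task_model holder = {
		.prio = rt_priority_to_prio(10),
		.rt_priority = 10,
	};

	assert(mapping_holds(&holder));
	boost(&holder, rt_priority_to_prio(50));
	assert(holder.rt_priority == 10);	/* unchanged by the boost */
	assert(!mapping_holds(&holder));	/* relation no longer holds */
}
```

Whether the relation should hold for a boosted task at all is the point
under review; the model only makes the observed state concrete.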
  

Comments

Steven Rostedt Feb. 1, 2023, 3:56 p.m. UTC | #1
On Wed,  1 Feb 2023 18:01:20 +0800
Song Chen <chensong_2000@189.cn> wrote:

> When a high priority process is acquiring a rtmutex which is held by a
> low priority process, the latter's priority will be boosted up by calling
> rt_mutex_setprio->__setscheduler_prio.
> 
> However, p->prio is changed but p->rt_priority is not, as a result, the
> equation between prio and rt_priority is broken, which is:
> 
> 	prio = MAX_RT_PRIO - 1 - rt_priority
> 
> It's confusing to the user when it calls sched_getparam, which only
> returns rt_priority.

If it is boosted, then that's an internal implementation and not the real
priority of the task. It should not be exposed to a user interface. In
fact, there's discussion of implementing a "proxy" algorithm which will
make what the "priority" of a task is even more complicated when acquiring
mutexes.


> 
> This patch addresses this issue by adjusting rt_priority according to
> the new value of prio, what's more, it also returns normal_prio for
> CFS processes instead of just a zero.

The comment above sched_getparam() is:

/**
 * sys_sched_getparam - get the RT priority of a thread
 * @pid: the pid in question.
 * @param: structure containing the RT priority.
 *
 * Return: On success, 0 and the RT priority is in @param. Otherwise, an error
 * code.
 */

So returning the nice value is incorrect. If anything, perhaps it should
return -EINVAL if the task is not an RT task?

-- Steve
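
As a point of reference for the interface being discussed, a caller can
inspect what sched_getparam() reports from userspace along these lines.
This is only a sketch: query_rt_priority(), has_rt_policy() and
getparam_example() are hypothetical helpers, and the check assumes the
current behaviour, where a task without an RT policy gets 0 back.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fetch the priority reported for @pid (0 = caller).
 * Returns 0 and stores the value in *rt_priority, or a negative errno.
 */
static int query_rt_priority(pid_t pid, int *rt_priority)
{
	struct sched_param sp;

	if (sched_getparam(pid, &sp) == -1)
		return -errno;

	*rt_priority = sp.sched_priority;
	return 0;
}

/*
 * Hypothetical helper: nonzero if @pid runs under SCHED_FIFO or SCHED_RR.
 * Simplification: a SCHED_RESET_ON_FORK flag in the result is not masked.
 */
static int has_rt_policy(pid_t pid)
{
	int policy = sched_getscheduler(pid);

	return policy == SCHED_FIFO || policy == SCHED_RR;
}

/* Usage: a non-RT caller is expected to see 0 with today's behaviour. */
static void getparam_example(void)
{
	int prio = -1;

	assert(query_rt_priority(0, &prio) == 0);
	assert(has_rt_policy(0) || prio == 0);
}
```

A caller that needs to tell "not an RT task" apart from an RT priority has
to check the policy separately, as has_rt_policy() does here.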

> 
> Signed-off-by: Song Chen <chensong_2000@189.cn>
> ---
  

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index bb1ee6d7bdde..1c2c4ada08cc 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6933,14 +6933,16 @@  EXPORT_SYMBOL(default_wake_function);
 
 static void __setscheduler_prio(struct task_struct *p, int prio)
 {
+	p->prio = prio;
+
 	if (dl_prio(prio))
 		p->sched_class = &dl_sched_class;
-	else if (rt_prio(prio))
+	else if (rt_prio(prio)) {
+		p->rt_priority = MAX_RT_PRIO - 1 - prio;
 		p->sched_class = &rt_sched_class;
+	}
 	else
 		p->sched_class = &fair_sched_class;
-
-	p->prio = prio;
 }
 
 #ifdef CONFIG_RT_MUTEXES
@@ -8058,6 +8060,8 @@  SYSCALL_DEFINE2(sched_getparam, pid_t, pid, struct sched_param __user *, param)
 
 	if (task_has_rt_policy(p))
 		lp.sched_priority = p->rt_priority;
+	else
+		lp.sched_priority = normal_prio(p);
 	rcu_read_unlock();
 
 	/*