[RFC,40/86] context_tracking: add ct_state_cpu()

Message ID 20231107215742.363031-41-ankur.a.arora@oracle.com
State New
Series Make the kernel preemptible

Commit Message

Ankur Arora Nov. 7, 2023, 9:57 p.m. UTC
  While making up its mind about whether to reschedule a target
runqueue eagerly or lazily, resched_curr() needs to know if the
target is executing in the kernel or in userspace.

Add ct_state_cpu().

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>

---
Using context-tracking for this seems like overkill. Is there a better
way to achieve this? One problem with depending on user_enter() is that
it happens much too late for our purposes. From the scheduler's
point of view, the exit state has effectively transitioned once the
task leaves exit_to_user_loop(), so we will see stale state in the
window where the task is done with exit_to_user_loop() but has not
yet executed user_enter().
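
For illustration, a minimal sketch of how a caller on the resched path
might consume ct_state_cpu(); the wrapper name below is hypothetical,
and the eager/lazy policy applied to the result is left to later
patches in the series:

/* Hypothetical helper, not part of this patch: report whether the
 * remote CPU is currently executing in userspace. */
static bool cpu_in_userspace(int cpu)
{
	int state = ct_state_cpu(cpu);

	/* Context tracking may be compiled out or inactive on @cpu. */
	if (state == CONTEXT_DISABLED)
		return false;

	return state == CONTEXT_USER;
}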

---
 include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
 kernel/Kconfig.preempt                 |  1 +
 2 files changed, 22 insertions(+)
  

Comments

Peter Zijlstra Nov. 8, 2023, 9:16 a.m. UTC | #1
On Tue, Nov 07, 2023 at 01:57:26PM -0800, Ankur Arora wrote:
> While making up its mind about whether to reschedule a target
> runqueue eagerly or lazily, resched_curr() needs to know if the
> target is executing in the kernel or in userspace.
> 
> Add ct_state_cpu().
> 
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> 
> ---
> Using context-tracking for this seems like overkill. Is there a better
> way to achieve this? One problem with depending on user_enter() is that
> it happens much too late for our purposes. From the scheduler's
> point of view, the exit state has effectively transitioned once the
> task leaves exit_to_user_loop(), so we will see stale state in the
> window where the task is done with exit_to_user_loop() but has not
> yet executed user_enter().
> 
> ---
>  include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
>  kernel/Kconfig.preempt                 |  1 +
>  2 files changed, 22 insertions(+)
> 
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index bbff5f7f8803..6a8f1c7ba105 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
>  {
>  	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
>  }
> +
> +static __always_inline int __ct_state_cpu(int cpu)
> +{
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +
> +	return atomic_read(&ct->state) & CT_STATE_MASK;
> +}
>  #endif
>  
>  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
> @@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
>  	return ret;
>  }
>  
> +static __always_inline int ct_state_cpu(int cpu)
> +{
> +	int ret;
> +
> +	if (!context_tracking_enabled_cpu(cpu))
> +		return CONTEXT_DISABLED;
> +
> +	preempt_disable();
> +	ret = __ct_state_cpu(cpu);
> +	preempt_enable();
> +
> +	return ret;
> +}

Those preempt_disable/enable are pointless.

But this patch is problematic, you do *NOT* want to rely on context
tracking. Context tracking adds atomics to the entry path, this is slow
and even with CONFIG_CONTEXT_TRACKING it is disabled until you configure
the NOHZ_FULL nonsense.

This simply cannot be.
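
A sketch of the simplification the first comment implies, with the
redundant critical section dropped (not code posted in the thread);
the remote per-cpu read is racy either way, so local preemption state
does not change what the caller can rely on:

static __always_inline int ct_state_cpu(int cpu)
{
	/* The result is a racy snapshot of a remote CPU's state;
	 * disabling preemption locally adds nothing. */
	if (!context_tracking_enabled_cpu(cpu))
		return CONTEXT_DISABLED;

	return __ct_state_cpu(cpu);
}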
  
Ankur Arora Nov. 21, 2023, 6:32 a.m. UTC | #2
Peter Zijlstra <peterz@infradead.org> writes:

> On Tue, Nov 07, 2023 at 01:57:26PM -0800, Ankur Arora wrote:
>> While making up its mind about whether to reschedule a target
>> runqueue eagerly or lazily, resched_curr() needs to know if the
>> target is executing in the kernel or in userspace.
>>
>> Add ct_state_cpu().
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>
>> ---
>> Using context-tracking for this seems like overkill. Is there a better
>> way to achieve this? One problem with depending on user_enter() is that
>> it happens much too late for our purposes. From the scheduler's
>> point of view, the exit state has effectively transitioned once the
>> task leaves exit_to_user_loop(), so we will see stale state in the
>> window where the task is done with exit_to_user_loop() but has not
>> yet executed user_enter().
>>
>> ---
>>  include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
>>  kernel/Kconfig.preempt                 |  1 +
>>  2 files changed, 22 insertions(+)
>>
>> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
>> index bbff5f7f8803..6a8f1c7ba105 100644
>> --- a/include/linux/context_tracking_state.h
>> +++ b/include/linux/context_tracking_state.h
>> @@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
>>  {
>>  	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
>>  }
>> +
>> +static __always_inline int __ct_state_cpu(int cpu)
>> +{
>> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>> +
>> +	return atomic_read(&ct->state) & CT_STATE_MASK;
>> +}
>>  #endif
>>
>>  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
>> @@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
>>  	return ret;
>>  }
>>
>> +static __always_inline int ct_state_cpu(int cpu)
>> +{
>> +	int ret;
>> +
>> +	if (!context_tracking_enabled_cpu(cpu))
>> +		return CONTEXT_DISABLED;
>> +
>> +	preempt_disable();
>> +	ret = __ct_state_cpu(cpu);
>> +	preempt_enable();
>> +
>> +	return ret;
>> +}
>
> Those preempt_disable/enable are pointless.
>
> But this patch is problematic, you do *NOT* want to rely on context
> tracking. Context tracking adds atomics to the entry path, this is slow
> and even with CONFIG_CONTEXT_TRACKING it is disabled until you configure
> the NOHZ_FULL nonsense.

Yeah, I had missed the fact that even though ct->state is updated
for both ct->active and !ct->active, the static branch is only
enabled with NOHZ_FULL.

Will drop.

--
ankur
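
For reference, the gate Ankur refers to; this is a paraphrase of the
context_tracking_enabled*() helpers in
include/linux/context_tracking_state.h and may not match the exact
source, but it shows why ct->state is effectively unreachable until
the NOHZ_FULL static key is flipped:

static __always_inline bool context_tracking_enabled(void)
{
	/* Static key, only enabled once NO_HZ_FULL user context
	 * tracking is actually activated. */
	return static_branch_unlikely(&context_tracking_key);
}

static __always_inline bool context_tracking_enabled_cpu(int cpu)
{
	return context_tracking_enabled() && per_cpu(context_tracking.active, cpu);
}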
  

Patch

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index bbff5f7f8803..6a8f1c7ba105 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
 {
 	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
 }
+
+static __always_inline int __ct_state_cpu(int cpu)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+
+	return atomic_read(&ct->state) & CT_STATE_MASK;
+}
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
@@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
 	return ret;
 }
 
+static __always_inline int ct_state_cpu(int cpu)
+{
+	int ret;
+
+	if (!context_tracking_enabled_cpu(cpu))
+		return CONTEXT_DISABLED;
+
+	preempt_disable();
+	ret = __ct_state_cpu(cpu);
+	preempt_enable();
+
+	return ret;
+}
+
 #else
 static __always_inline bool context_tracking_enabled(void) { return false; }
 static __always_inline bool context_tracking_enabled_cpu(int cpu) { return false; }
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 715e7aebb9d8..aa87b5cd3ecc 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -80,6 +80,7 @@ config PREEMPT_COUNT
 config PREEMPTION
        bool
        select PREEMPT_COUNT
+       select CONTEXT_TRACKING_USER
 
 config SCHED_CORE
 	bool "Core Scheduling for SMT"