[03/30] thread_info: tif_need_resched() now takes resched_t as param

Message ID 20240213055554.1802415-4-ankur.a.arora@oracle.com
State New
Series PREEMPT_AUTO: support lazy rescheduling

Commit Message

Ankur Arora Feb. 13, 2024, 5:55 a.m. UTC
  tif_need_resched() now takes a resched_t parameter to decide the
immediacy of the need-resched.
Update need_resched() and should_resched() so they both check for
tif_need_resched(NR_now), which keeps the current semantics. Also
define need_resched_lazy() which, as the name suggests, checks for
tif_need_resched(NR_lazy).
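
For reference, a sketch of the resulting interface (this mirrors the
include/linux/sched.h hunk in the diff below; NR_now/NR_lazy are the
resched_t values introduced earlier in the series):

    static __always_inline bool need_resched(void)
    {
            return unlikely(tif_need_resched(NR_now));
    }

    static __always_inline bool need_resched_lazy(void)
    {
            return unlikely(tif_need_resched(NR_lazy));
    }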

Given that need_resched() (and should_resched() to a lesser extent)
are used extensively in the kernel, it is worth noting their common
uses and any changes to them:

 - preempt_count(): we only ever want to fold or make preemption
   decisions based on TIF_NEED_RESCHED, not TIF_NEED_RESCHED_LAZY.
   So, related logic now uses tif_need_resched(NR_now).

 - cond_resched_*(): checks should_resched() and preempts if
   TIF_NEED_RESCHED is set (and preempt_count() == offset).

   Hand-rolled versions typically check need_resched() first,
   which continues to check the same thing.

   So, in either case we relinquish resources only when immediate
   rescheduling is needed, not for lazy rescheduling (see the
   sketch after this list).

 - idle: run to completion is not meaningful for the idle task and
   so we always schedule out of idle whenever there is any work.

   Most idle code uses a mixture of tif_need_resched() and
   need_resched() (the former especially in the interfaces defined
   in sched/idle.h).

   This change moves all the idle code to need_resched().
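
As an illustration of the hand-rolled cond_resched-style pattern above
(hypothetical lock and work helpers, not taken from this series): the
resource is only given up when an immediate reschedule is pending.

    spin_lock(&lock);
    while (have_more_work()) {
            process_one_item();
            /* need_resched() checks NR_now only */
            if (need_resched() || spin_is_contended(&lock)) {
                    spin_unlock(&lock);
                    cond_resched();
                    spin_lock(&lock);
            }
    }
    spin_unlock(&lock);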

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
Note that the need_resched() semantics here are narrower than in
earlier versions of this code.

  In previous versions, need_resched() checked for
  (tif_need_resched(NR_now) || tif_need_resched(NR_lazy)).

  However, there's a fair bit of code which relinquishes resources
  temporarily based on need_resched()/should_resched() checks, or,
  as in mutex_can_spin_on_owner(), decides whether to spin
  optimistically based on need_resched().

  So, this version limits need_resched() to only checking
  TIF_NEED_RESCHED.
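
  As a simplified example of why the wider check would be a problem:
  optimistic spinning bails out on need_resched(). The sketch below
  condenses that check (it is not the actual mutex_can_spin_on_owner()
  body, just the part relevant here):

      static inline bool can_spin_on_owner_sketch(void)
      {
              /*
               * Spinning is only worthwhile if we are not about to be
               * preempted. With need_resched() limited to NR_now, a
               * pending lazy reschedule does not abort the spin.
               */
              return !need_resched();
      }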

 arch/s390/include/asm/preempt.h |  4 ++--
 drivers/acpi/processor_idle.c   |  2 +-
 include/asm-generic/preempt.h   |  4 ++--
 include/linux/preempt.h         |  2 +-
 include/linux/sched.h           |  7 ++++++-
 include/linux/sched/idle.h      |  8 ++++----
 include/linux/thread_info.h     | 25 +++++++++++++++++++------
 kernel/sched/idle.c             |  2 +-
 kernel/trace/trace.c            |  2 +-
 9 files changed, 37 insertions(+), 19 deletions(-)
  

Comments

kernel test robot Feb. 14, 2024, 3:17 a.m. UTC | #1
Hi Ankur,

kernel test robot noticed the following build warnings:

[auto build test WARNING on paulmck-rcu/dev]
[also build test WARNING on tip/core/entry linus/master v6.8-rc4 next-20240213]
[cannot apply to tip/sched/core]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ankur-Arora/preempt-introduce-CONFIG_PREEMPT_AUTO/20240213-140748
base:   https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
patch link:    https://lore.kernel.org/r/20240213055554.1802415-4-ankur.a.arora%40oracle.com
patch subject: [PATCH 03/30] thread_info: tif_need_resched() now takes resched_t as param
config: x86_64-buildonly-randconfig-002-20240213 (https://download.01.org/0day-ci/archive/20240214/202402141055.evTx8Q0K-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240214/202402141055.evTx8Q0K-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202402141055.evTx8Q0K-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> vmlinux.o: warning: objtool: mwait_idle+0xd: call to tif_resched.constprop.0() leaves .noinstr.text section
>> vmlinux.o: warning: objtool: poll_idle+0x2c: call to tif_resched.constprop.0() leaves .noinstr.text section


objdump-func vmlinux.o mwait_idle:
0000 0000000000000060 <mwait_idle>:
0000   60:	53                   	push   %rbx
0001   61:	48 8b 1c 25 00 00 00 00 	mov    0x0,%rbx	65: R_X86_64_32S	pcpu_hot
0009   69:	80 4b 02 20          	orb    $0x20,0x2(%rbx)
000d   6d:	e8 00 00 00 00       	call   72 <mwait_idle+0x12>	6e: R_X86_64_PC32	.text+0xfe73c
0012   72:	48 63 f0             	movslq %eax,%rsi
0015   75:	48 0f a3 33          	bt     %rsi,(%rbx)
0019   79:	72 3a                	jb     b5 <mwait_idle+0x55>
001b   7b:	48 8b 05 00 00 00 00 	mov    0x0(%rip),%rax        # 82 <mwait_idle+0x22>	7e: R_X86_64_PC32	boot_cpu_data+0x6c
0022   82:	48 0f ba e0 27       	bt     $0x27,%rax
0027   87:	72 46                	jb     cf <mwait_idle+0x6f>
0029   89:	31 d2                	xor    %edx,%edx
002b   8b:	48 89 d8             	mov    %rbx,%rax
002e   8e:	48 89 d1             	mov    %rdx,%rcx
0031   91:	0f 01 c8             	monitor %rax,%ecx,%edx
0034   94:	48 0f a3 33          	bt     %rsi,(%rbx)
0038   98:	72 1b                	jb     b5 <mwait_idle+0x55>
003a   9a:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # a0 <mwait_idle+0x40>	9c: R_X86_64_PC32	mds_idle_clear-0x4
0040   a0:	85 c0                	test   %eax,%eax
0042   a2:	7e 07                	jle    ab <mwait_idle+0x4b>
0044   a4:	0f 00 2d 00 00 00 00 	verw   0x0(%rip)        # ab <mwait_idle+0x4b>	a7: R_X86_64_PC32	.rodata+0x15cd2
004b   ab:	31 c0                	xor    %eax,%eax
004d   ad:	48 89 c1             	mov    %rax,%rcx
0050   b0:	fb                   	sti
0051   b1:	0f 01 c9             	mwait  %eax,%ecx
0054   b4:	fa                   	cli
0055   b5:	48 8b 04 25 00 00 00 00 	mov    0x0,%rax	b9: R_X86_64_32S	pcpu_hot
005d   bd:	80 60 02 df          	andb   $0xdf,0x2(%rax)
0061   c1:	5b                   	pop    %rbx
0062   c2:	31 c0                	xor    %eax,%eax
0064   c4:	31 d2                	xor    %edx,%edx
0066   c6:	31 c9                	xor    %ecx,%ecx
0068   c8:	31 f6                	xor    %esi,%esi
006a   ca:	e9 00 00 00 00       	jmp    cf <mwait_idle+0x6f>	cb: R_X86_64_PLT32	__x86_return_thunk-0x4
006f   cf:	0f ae f0             	mfence
0072   d2:	0f ae 3b             	clflush (%rbx)
0075   d5:	0f ae f0             	mfence
0078   d8:	eb af                	jmp    89 <mwait_idle+0x29>
007a   da:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  
Mark Rutland Feb. 14, 2024, 2:08 p.m. UTC | #2
On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
> tif_need_resched() now takes a resched_t parameter to decide the
> immediacy of the need-resched.

I see at the end of the series, most callers pass a constant:

[mark@lakrids:~/src/linux]% git grep -w tif_need_resched
arch/s390/include/asm/preempt.h:        return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
arch/s390/include/asm/preempt.h:                        tif_need_resched(NR_now));
include/asm-generic/preempt.h:  return !--*preempt_count_ptr() && tif_need_resched(NR_now);
include/asm-generic/preempt.h:                  tif_need_resched(NR_now));
include/linux/preempt.h:        if (tif_need_resched(NR_now)) \
include/linux/sched.h:  return unlikely(tif_need_resched(NR_now));
include/linux/sched.h:          unlikely(tif_need_resched(NR_lazy));
include/linux/thread_info.h:static __always_inline bool tif_need_resched(resched_t rs)
include/linux/thread_info.h:     * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
kernel/entry/common.c:          if (tif_need_resched(NR_now))
kernel/sched/debug.c:   nr = tif_need_resched(NR_now) ? "need_resched" : "need_resched_lazy";
kernel/trace/trace.c:   if (tif_need_resched(NR_now))
kernel/trace/trace.c:   if (tif_need_resched(NR_lazy))

I think it'd be clearer if we had tif_need_resched_now() and
tif_need_resched_lazy() wrappers rather than taking a parameter. I think that
if we did similar elsewhere (e.g. {set,test}_tsk_need_resched_{now,lazy}()),
it'd be a bit cleaner overall, since we can special-case the lazy behaviour
more easily/clearly.

> Update need_resched() and should_resched() so they both check for
> tif_need_resched(NR_now), which keeps the current semantics. Also
> define need_resched_lazy(), which as the name suggests, checks for
> tif_need_resched(NR_lazy).
>
> Given that need_resched() (and should_resched() to a lesser extent)
> are used extensively in the kernel, it is worth noting their common
> uses and any changes to them:
> 
>  - preempt_count(): we only ever want to fold or make preemption
>    decisions based on TIF_NEED_RESCHED, not TIF_NEED_RESCHED_LAZY.
>    So, related logic now uses tif_need_resched(NR_now).
> 
>  - cond_resched_*(): checks for should_resched() and preempts if
>    TIF_NEED_RESCHED were set (and if (preempt_count() == offset).
> 
>    Hand-rolled versions typically first check for need_resched()
>    which would also continue to check for the same thing.
> 
>    So, in either case relinquish resources only if immediate
>    rescheduling was needed, not for lazy-rescheduling.
> 
>  - idle: run to completion is not meaningful for the idle task and
>    so we always schedule out of idle whenever there is any work.
> 
>    Most idle code uses a mixture of tif_need_resched() and
>    need_resched() (the first one especially in the interfaces defined
>    in sched/idle.h.)
> 
>    This change moves all the idle code to need_resched().
> 
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Originally-by: Thomas Gleixner <tglx@linutronix.de>
> Link: https://lore.kernel.org/lkml/87jzshhexi.ffs@tglx/
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> Note that the need_resched() semantics here are narrower than
> earlier versions of this code.
> 
>   In previous versions, need_resched() checked for
>   (tif_need_resched(NR_now) || tif_need_resched(NR_lazy)).
> 
>   However, there's a fair bit of code which relinquishes resources
>   temporarily based on need_resched()/should_resched() checks. Or,
>   in the case of mutex_can_spin_on_owner() decides to optimistically
>   spin or not based on need_resched().
> 
>   So, this version limits need_resched() to only checking
>   TIF_NEED_RESCHED.
> 
>  arch/s390/include/asm/preempt.h |  4 ++--
>  drivers/acpi/processor_idle.c   |  2 +-
>  include/asm-generic/preempt.h   |  4 ++--
>  include/linux/preempt.h         |  2 +-
>  include/linux/sched.h           |  7 ++++++-
>  include/linux/sched/idle.h      |  8 ++++----
>  include/linux/thread_info.h     | 25 +++++++++++++++++++------
>  kernel/sched/idle.c             |  2 +-
>  kernel/trace/trace.c            |  2 +-
>  9 files changed, 37 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/s390/include/asm/preempt.h b/arch/s390/include/asm/preempt.h
> index bf15da0fedbc..97fda8e06b6c 100644
> --- a/arch/s390/include/asm/preempt.h
> +++ b/arch/s390/include/asm/preempt.h
> @@ -114,13 +114,13 @@ static inline void __preempt_count_sub(int val)
>  
>  static inline bool __preempt_count_dec_and_test(void)
>  {
> -	return !--S390_lowcore.preempt_count && tif_need_resched();
> +	return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
>  }
>  
>  static inline bool should_resched(int preempt_offset)
>  {
>  	return unlikely(preempt_count() == preempt_offset &&
> -			tif_need_resched());
> +			tif_need_resched(NR_now));
>  }
>  
>  #endif /* CONFIG_HAVE_MARCH_Z196_FEATURES */
> diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
> index 55437f5e0c3a..7fc47007b926 100644
> --- a/drivers/acpi/processor_idle.c
> +++ b/drivers/acpi/processor_idle.c
> @@ -108,7 +108,7 @@ static const struct dmi_system_id processor_power_dmi_table[] = {
>   */
>  static void __cpuidle acpi_safe_halt(void)
>  {
> -	if (!tif_need_resched()) {
> +	if (!need_resched()) {
>  		raw_safe_halt();
>  		raw_local_irq_disable();
>  	}
> diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
> index 51f8f3881523..ed98e6168b99 100644
> --- a/include/asm-generic/preempt.h
> +++ b/include/asm-generic/preempt.h
> @@ -66,7 +66,7 @@ static __always_inline bool __preempt_count_dec_and_test(void)
>  	 * operations; we cannot use PREEMPT_NEED_RESCHED because it might get
>  	 * lost.
>  	 */
> -	return !--*preempt_count_ptr() && tif_need_resched();
> +	return !--*preempt_count_ptr() && tif_need_resched(NR_now);
>  }
>  
>  /*
> @@ -75,7 +75,7 @@ static __always_inline bool __preempt_count_dec_and_test(void)
>  static __always_inline bool should_resched(int preempt_offset)
>  {
>  	return unlikely(preempt_count() == preempt_offset &&
> -			tif_need_resched());
> +			tif_need_resched(NR_now));
>  }
>  
>  #ifdef CONFIG_PREEMPTION
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 7233e9cf1bab..336cb76f0830 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -312,7 +312,7 @@ do { \
>  } while (0)
>  #define preempt_fold_need_resched() \
>  do { \
> -	if (tif_need_resched()) \
> +	if (tif_need_resched(NR_now)) \
>  		set_preempt_need_resched(); \
>  } while (0)
>  
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index cdb8ea53c365..7a22a56350bb 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2099,7 +2099,12 @@ static inline bool preempt_model_preemptible(void)
>  
>  static __always_inline bool need_resched(void)
>  {
> -	return unlikely(tif_need_resched());
> +	return unlikely(tif_need_resched(NR_now));
> +}
> +
> +static __always_inline bool need_resched_lazy(void)
> +{
> +	return unlikely(tif_need_resched(NR_lazy));
>  }
>  
>  /*
> diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
> index 478084f9105e..d8ce85dff47a 100644
> --- a/include/linux/sched/idle.h
> +++ b/include/linux/sched/idle.h
> @@ -63,7 +63,7 @@ static __always_inline bool __must_check current_set_polling_and_test(void)
>  	 */
>  	smp_mb__after_atomic();
>  
> -	return unlikely(tif_need_resched());
> +	return need_resched();
>  }
>  
>  static __always_inline bool __must_check current_clr_polling_and_test(void)
> @@ -76,7 +76,7 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
>  	 */
>  	smp_mb__after_atomic();
>  
> -	return unlikely(tif_need_resched());
> +	return unlikely(need_resched());
>  }
>  
>  #else
> @@ -85,11 +85,11 @@ static inline void __current_clr_polling(void) { }
>  
>  static inline bool __must_check current_set_polling_and_test(void)
>  {
> -	return unlikely(tif_need_resched());
> +	return unlikely(need_resched());
>  }
>  static inline bool __must_check current_clr_polling_and_test(void)
>  {
> -	return unlikely(tif_need_resched());
> +	return unlikely(need_resched());
>  }
>  #endif
>  
> diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
> index 99043cbbb6b0..8752dbc2dac7 100644
> --- a/include/linux/thread_info.h
> +++ b/include/linux/thread_info.h
> @@ -211,22 +211,35 @@ static __always_inline unsigned long read_ti_thread_flags(struct thread_info *ti
>  
>  #ifdef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H
>  
> -static __always_inline bool tif_need_resched(void)
> +static __always_inline bool __tif_need_resched(int nr_flag)
>  {
> -	return arch_test_bit(TIF_NEED_RESCHED,
> -			     (unsigned long *)(&current_thread_info()->flags));
> +	return arch_test_bit(nr_flag,
> +		     (unsigned long *)(&current_thread_info()->flags));
>  }
>  
>  #else
>  
> -static __always_inline bool tif_need_resched(void)
> +static __always_inline bool __tif_need_resched(int nr_flag)
>  {
> -	return test_bit(TIF_NEED_RESCHED,
> -			(unsigned long *)(&current_thread_info()->flags));
> +	return test_bit(nr_flag,
> +		(unsigned long *)(&current_thread_info()->flags));
>  }
>  
>  #endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
>  
> +static __always_inline bool tif_need_resched(resched_t rs)
> +{
> +	/*
> +	 * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
> +	 * as TIF_NEED_RESCHED (the TIF_NEED_RESCHED_LAZY flag is not
> +	 * defined). Return false in that case.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
> +		return __tif_need_resched(tif_resched(rs));
> +	else
> +		return false;
> +}

As above, I think this would be a bit simpler/clearer if we did:

static __always_inline bool tif_need_resched_now(void)
{
	return __tif_need_resched(TIF_NEED_RESCHED);
}

static __always_inline bool tif_need_resched_lazy(void)
{
	return IS_ENABLED(CONFIG_PREEMPT_AUTO) &&
	        __tif_need_resched(TIF_NEED_RESCHED_LAZY);
}

Mark.

> +
>  #ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES
>  static inline int arch_within_stack_frames(const void * const stack,
>  					   const void * const stackend,
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 31231925f1ec..be53d164e267 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -57,7 +57,7 @@ static noinline int __cpuidle cpu_idle_poll(void)
>  	ct_cpuidle_enter();
>  
>  	raw_local_irq_enable();
> -	while (!tif_need_resched() &&
> +	while (!need_resched() &&
>  	       (cpu_idle_force_poll || tick_check_broadcast_expired()))
>  		cpu_relax();
>  	raw_local_irq_disable();
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 2a7c6fd934e9..0a9642fba852 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2701,7 +2701,7 @@ unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status)
>  	if (softirq_count() >> (SOFTIRQ_SHIFT + 1))
>  		trace_flags |= TRACE_FLAG_BH_OFF;
>  
> -	if (tif_need_resched())
> +	if (tif_need_resched(NR_now))
>  		trace_flags |= TRACE_FLAG_NEED_RESCHED;
>  	if (test_preempt_need_resched())
>  		trace_flags |= TRACE_FLAG_PREEMPT_RESCHED;
> -- 
> 2.31.1
>
  
Ankur Arora Feb. 15, 2024, 4:08 a.m. UTC | #3
Mark Rutland <mark.rutland@arm.com> writes:

> On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
>> tif_need_resched() now takes a resched_t parameter to decide the
>> immediacy of the need-resched.
>
> I see at the end of the series, most callers pass a constant:
>
> [mark@lakrids:~/src/linux]% git grep -w tif_need_resched
> arch/s390/include/asm/preempt.h:        return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
> arch/s390/include/asm/preempt.h:                        tif_need_resched(NR_now));
> include/asm-generic/preempt.h:  return !--*preempt_count_ptr() && tif_need_resched(NR_now);
> include/asm-generic/preempt.h:                  tif_need_resched(NR_now));
> include/linux/preempt.h:        if (tif_need_resched(NR_now)) \
> include/linux/sched.h:  return unlikely(tif_need_resched(NR_now));
> include/linux/sched.h:          unlikely(tif_need_resched(NR_lazy));
> include/linux/thread_info.h:static __always_inline bool tif_need_resched(resched_t rs)
> include/linux/thread_info.h:     * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
> kernel/entry/common.c:          if (tif_need_resched(NR_now))
> kernel/sched/debug.c:   nr = tif_need_resched(NR_now) ? "need_resched" : "need_resched_lazy";
> kernel/trace/trace.c:   if (tif_need_resched(NR_now))
> kernel/trace/trace.c:   if (tif_need_resched(NR_lazy))
>
> I think it'd be clearer if we had tif_need_resched_now() and
> tif_need_resched_lazy() wrappers rather than taking a parameter. I think that
> if we did similar elsewhere (e.g. {set,test}_tsk_need_resched_{now,lazy}()),
> it'd be a bit cleaner overall, since we can special-case the lazy behaviour
> more easily/clearly.

So, we have three need-resched interfaces:

1. need_resched(), need_resched_lazy()
 These are used all over non-core (and idle) code, and I don't
 see a case where the user would find it useful to dynamically
 choose one or the other.
 So, here two separate interfaces, need_resched()/need_resched_lazy()
 make sense.

2. tif_need_resched()
 This is mostly used from preempt.h or scheduler adjacent code to drive
 preemption and at least in current uses, the resched_t param is a
 compile time constant.

 I think the scheduler might find it useful in some cases to parametrize
 it (e.g. maybe the scheduler knows which bit has been set for too long
 and wants to pass that on to resched_latency_warn()).

 But that's a contrived example. I think this one would be fine
 either way. Will try it out and see which (tif_need_resched(rs),
 or tif_need_resched_now()/tif_need_resched_lazy()) seems cleaner.

3. *_tsk_need_resched()
 This is used almost entirely from the scheduler and RCU.

 One place where I found the ability to parametrize quite useful
 was __resched_curr(). So this I would like to keep.

All of that said, I wonder if we need these new interfaces at all.
Most of the code only uses the NR_now interface. Only the scheduler and
the entry code need to distinguish between lazy and eager.
(Plus, this way lazy and eager becomes an implementation detail which
doesn't need to be known outside the scheduler. Which is also kind of
the point of PREEMPT_AUTO.)

Say something like the patch below (and similar for tif_need_resched(),
need_resched() etc.)

What do you think?

Ankur

---------
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 58e6ea7572a0..b836b238b117 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1953,7 +1953,7 @@ static inline bool test_tsk_thread_flag(struct task_struct *tsk, int flag)
  * tif_resched(NR_now). Add a check in the helpers below to ensure
  * we don't touch the tif_reshed(NR_now) bit unnecessarily.
  */
-static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
+static inline void __set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
 {
 	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
 		set_tsk_thread_flag(tsk, tif_resched(rs));
@@ -1964,6 +1964,11 @@ static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
 		BUG();
 }

+static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
+{
+	__set_tsk_need_resched(tsk, NR_now);
+}
+
 static inline void clear_tsk_need_resched(struct task_struct *tsk)
 {
 	clear_tsk_thread_flag(tsk, tif_resched(NR_now));
@@ -1972,7 +1977,7 @@ static inline void clear_tsk_need_resched(struct task_struct *tsk)
 		clear_tsk_thread_flag(tsk, tif_resched(NR_lazy));
 }

-static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
+static inline bool __test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
 {
 	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
 		return unlikely(test_tsk_thread_flag(tsk, tif_resched(rs)));
@@ -1980,6 +1985,11 @@ static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
 		return false;
 }

+static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
+{
+	return __test_tsk_need_resched(tsk, NR_now);
+}
+
 /*
  * cond_resched() and cond_resched_lock(): latency reduction via
  * explicit rescheduling in places that are safe. The return
  
Mark Rutland Feb. 19, 2024, 12:30 p.m. UTC | #4
On Wed, Feb 14, 2024 at 08:08:30PM -0800, Ankur Arora wrote:
> 
> Mark Rutland <mark.rutland@arm.com> writes:
> 
> > On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
> >> tif_need_resched() now takes a resched_t parameter to decide the
> >> immediacy of the need-resched.
> >
> > I see at the end of the series, most callers pass a constant:
> >
> > [mark@lakrids:~/src/linux]% git grep -w tif_need_resched
> > arch/s390/include/asm/preempt.h:        return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
> > arch/s390/include/asm/preempt.h:                        tif_need_resched(NR_now));
> > include/asm-generic/preempt.h:  return !--*preempt_count_ptr() && tif_need_resched(NR_now);
> > include/asm-generic/preempt.h:                  tif_need_resched(NR_now));
> > include/linux/preempt.h:        if (tif_need_resched(NR_now)) \
> > include/linux/sched.h:  return unlikely(tif_need_resched(NR_now));
> > include/linux/sched.h:          unlikely(tif_need_resched(NR_lazy));
> > include/linux/thread_info.h:static __always_inline bool tif_need_resched(resched_t rs)
> > include/linux/thread_info.h:     * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
> > kernel/entry/common.c:          if (tif_need_resched(NR_now))
> > kernel/sched/debug.c:   nr = tif_need_resched(NR_now) ? "need_resched" : "need_resched_lazy";
> > kernel/trace/trace.c:   if (tif_need_resched(NR_now))
> > kernel/trace/trace.c:   if (tif_need_resched(NR_lazy))
> >
> > I think it'd be clearer if we had tif_need_resched_now() and
> > tif_need_resched_lazy() wrappers rather than taking a parameter. I think that
> > if we did similar elsewhere (e.g. {set,test}_tsk_need_resched_{now,lazy}()),
> > it'd be a bit cleaner overall, since we can special-case the lazy behaviour
> > more easily/clearly.
> 
> So, we have three need-resched interfaces:
> 
> 1. need_resched(), need_resched_lazy()
>  These are used all over non-core (and idle) code, and I don't
>  see a case where the user would find it useful to dynamically
>  choose one or the other.
>  So, here two separate interfaces, need_resched()/need_resched_lazy()
>  make sense.
> 
> 2. tif_need_resched()
>  This is mostly used from preempt.h or scheduler adjacent code to drive
>  preemption and at least in current uses, the resched_t param is a
>  compile time constant.
> 
>  I think the scheduler might find it useful in some cases to parametrize
>  it (ex. maybe the scheduler knows how long which bit has been set for
>  over long and wants to pass that on to resched_latency_warn().)
> 
>  But that's a contrived example. I think this one would be fine
>  either way. Will try it out and see which (tif_need_resched(rs),
>  or tif_need_resched_now()/tif_need_resched_lazy()) seems cleaner.
> 
> 3. *_tsk_need_resched()
>  This is is used almost entirely from the scheduler and RCU.
> 
>  One place where I found the ability to parametrize quite useful
>  was __resched_curr(). So this I would like to keep.
> 
> All of that said, and I wonder if we need these new interfaces at all.
> Most of the code only uses the NR_now interface. Only the scheduler and
> the entry code need to distinguish between lazy and eager.
> (Plus, this way lazy and eager becomes an implementation detail which
> doesn't need to be known outside the scheduler. Which is also kind of
> the point of PREEMPT_AUTO.)
> 
> Say something like the patch below (and similar for tif_need_resched(),
> need_resched() etc.)
> 
> What do you think?
> 
> Ankur
> 
> ---------
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 58e6ea7572a0..b836b238b117 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1953,7 +1953,7 @@ static inline bool test_tsk_thread_flag(struct task_struct *tsk, int flag)
>   * tif_resched(NR_now). Add a check in the helpers below to ensure
>   * we don't touch the tif_reshed(NR_now) bit unnecessarily.
>   */
> -static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
> +static inline void __set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>  {
>  	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
>  		set_tsk_thread_flag(tsk, tif_resched(rs));
> @@ -1964,6 +1964,11 @@ static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>  		BUG();
>  }
> 
> +static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
> +{
> +	__set_tsk_need_resched(tsk, NR_now);
> +}

I assume for this (and test_tsk_need_resched() below), you mean to drop the
resched_t argument, i.e. this should be:

	static inline void set_tsk_need_resched(struct task_struct *tsk)
	{
		__set_tsk_need_resched(tsk, NR_now);
	}

Assuming so, this looks good to me!

Mark.

> +
>  static inline void clear_tsk_need_resched(struct task_struct *tsk)
>  {
>  	clear_tsk_thread_flag(tsk, tif_resched(NR_now));
> @@ -1972,7 +1977,7 @@ static inline void clear_tsk_need_resched(struct task_struct *tsk)
>  		clear_tsk_thread_flag(tsk, tif_resched(NR_lazy));
>  }
> 
> -static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
> +static inline bool __test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>  {
>  	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
>  		return unlikely(test_tsk_thread_flag(tsk, tif_resched(rs)));
> @@ -1980,6 +1985,11 @@ static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>  		return false;
>  }
> 
> +static inline bool test_tsk_need_resched(struct task_struct *tsk, resched_t rs)
> +{
> +	return __test_tsk_need_resched(tsk, NR_now);
> +}
> +
>  /*
>   * cond_resched() and cond_resched_lock(): latency reduction via
>   * explicit rescheduling in places that are safe. The return
  
Thomas Gleixner Feb. 19, 2024, 3:21 p.m. UTC | #5
On Wed, Feb 14 2024 at 14:08, Mark Rutland wrote:
> On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
>>  
>> -static __always_inline bool tif_need_resched(void)
>> +static __always_inline bool __tif_need_resched(int nr_flag)
>>  {
>> -	return test_bit(TIF_NEED_RESCHED,
>> -			(unsigned long *)(&current_thread_info()->flags));
>> +	return test_bit(nr_flag,
>> +		(unsigned long *)(&current_thread_info()->flags));
>>  }
>>  
>>  #endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
>>  
>> +static __always_inline bool tif_need_resched(resched_t rs)
>> +{
>> +	/*
>> +	 * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
>> +	 * as TIF_NEED_RESCHED (the TIF_NEED_RESCHED_LAZY flag is not
>> +	 * defined). Return false in that case.
>> +	 */
>> +	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
>> +		return __tif_need_resched(tif_resched(rs));
>> +	else
>> +		return false;
>> +}
>
> As above, I think this would be a bit simpler/clearer if we did:
>
> static __always_inline bool tif_need_resched_now(void)
> {
> 	return __tif_need_resched(TIF_NEED_RESCHED);
> }
>
> static __always_inline bool tif_need_resched_lazy(void)
> {
> 	return IS_ENABLED(CONFIG_PREEMPT_AUTO) &&
> 	        __tif_need_resched(TIF_NEED_RESCHED_LAZY);
> }

Yes please.
  
Ankur Arora Feb. 20, 2024, 10:09 p.m. UTC | #6
Mark Rutland <mark.rutland@arm.com> writes:

> On Wed, Feb 14, 2024 at 08:08:30PM -0800, Ankur Arora wrote:
>>
>> Mark Rutland <mark.rutland@arm.com> writes:
>>
>> > On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
>> >> tif_need_resched() now takes a resched_t parameter to decide the
>> >> immediacy of the need-resched.
>> >
>> > I see at the end of the series, most callers pass a constant:
>> >
>> > [mark@lakrids:~/src/linux]% git grep -w tif_need_resched
>> > arch/s390/include/asm/preempt.h:        return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
>> > arch/s390/include/asm/preempt.h:                        tif_need_resched(NR_now));
>> > include/asm-generic/preempt.h:  return !--*preempt_count_ptr() && tif_need_resched(NR_now);
>> > include/asm-generic/preempt.h:                  tif_need_resched(NR_now));
>> > include/linux/preempt.h:        if (tif_need_resched(NR_now)) \
>> > include/linux/sched.h:  return unlikely(tif_need_resched(NR_now));
>> > include/linux/sched.h:          unlikely(tif_need_resched(NR_lazy));
>> > include/linux/thread_info.h:static __always_inline bool tif_need_resched(resched_t rs)
>> > include/linux/thread_info.h:     * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
>> > kernel/entry/common.c:          if (tif_need_resched(NR_now))
>> > kernel/sched/debug.c:   nr = tif_need_resched(NR_now) ? "need_resched" : "need_resched_lazy";
>> > kernel/trace/trace.c:   if (tif_need_resched(NR_now))
>> > kernel/trace/trace.c:   if (tif_need_resched(NR_lazy))
>> >
>> > I think it'd be clearer if we had tif_need_resched_now() and
>> > tif_need_resched_lazy() wrappers rather than taking a parameter. I think that
>> > if we did similar elsewhere (e.g. {set,test}_tsk_need_resched_{now,lazy}()),
>> > it'd be a bit cleaner overall, since we can special-case the lazy behaviour
>> > more easily/clearly.
>>
>> So, we have three need-resched interfaces:
>>
>> 1. need_resched(), need_resched_lazy()
>>  These are used all over non-core (and idle) code, and I don't
>>  see a case where the user would find it useful to dynamically
>>  choose one or the other.
>>  So, here two separate interfaces, need_resched()/need_resched_lazy()
>>  make sense.
>>
>> 2. tif_need_resched()
>>  This is mostly used from preempt.h or scheduler adjacent code to drive
>>  preemption and at least in current uses, the resched_t param is a
>>  compile time constant.
>>
>>  I think the scheduler might find it useful in some cases to parametrize
>>  it (ex. maybe the scheduler knows how long which bit has been set for
>>  over long and wants to pass that on to resched_latency_warn().)
>>
>>  But that's a contrived example. I think this one would be fine
>>  either way. Will try it out and see which (tif_need_resched(rs),
>>  or tif_need_resched_now()/tif_need_resched_lazy()) seems cleaner.
>>
>> 3. *_tsk_need_resched()
>>  This is is used almost entirely from the scheduler and RCU.
>>
>>  One place where I found the ability to parametrize quite useful
>>  was __resched_curr(). So this I would like to keep.
>>
>> All of that said, and I wonder if we need these new interfaces at all.
>> Most of the code only uses the NR_now interface. Only the scheduler and
>> the entry code need to distinguish between lazy and eager.
>> (Plus, this way lazy and eager becomes an implementation detail which
>> doesn't need to be known outside the scheduler. Which is also kind of
>> the point of PREEMPT_AUTO.)
>>
>> Say something like the patch below (and similar for tif_need_resched(),
>> need_resched() etc.)
>>
>> What do you think?
>>
>> Ankur
>>
>> ---------
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 58e6ea7572a0..b836b238b117 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1953,7 +1953,7 @@ static inline bool test_tsk_thread_flag(struct task_struct *tsk, int flag)
>>   * tif_resched(NR_now). Add a check in the helpers below to ensure
>>   * we don't touch the tif_reshed(NR_now) bit unnecessarily.
>>   */
>> -static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>> +static inline void __set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>>  {
>>  	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
>>  		set_tsk_thread_flag(tsk, tif_resched(rs));
>> @@ -1964,6 +1964,11 @@ static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>>  		BUG();
>>  }
>>
>> +static inline void set_tsk_need_resched(struct task_struct *tsk, resched_t rs)
>> +{
>> +	__set_tsk_need_resched(tsk, NR_now);
>> +}
>
> I assume for this (and test_tsk_need_resched() below), you mean to drop the
> resched_t argument, i.e. this should be:
>
> 	static inline void set_tsk_need_resched(struct task_struct *tsk)
> 	{
> 		__set_tsk_need_resched(tsk, NR_now);
> 	}
>
> Assuming so, this looks good to me!

Yup. Great.

--
ankur
  
Ankur Arora Feb. 20, 2024, 10:21 p.m. UTC | #7
Thomas Gleixner <tglx@linutronix.de> writes:

> On Wed, Feb 14 2024 at 14:08, Mark Rutland wrote:
>> On Mon, Feb 12, 2024 at 09:55:27PM -0800, Ankur Arora wrote:
>>>
>>> -static __always_inline bool tif_need_resched(void)
>>> +static __always_inline bool __tif_need_resched(int nr_flag)
>>>  {
>>> -	return test_bit(TIF_NEED_RESCHED,
>>> -			(unsigned long *)(&current_thread_info()->flags));
>>> +	return test_bit(nr_flag,
>>> +		(unsigned long *)(&current_thread_info()->flags));
>>>  }
>>>
>>>  #endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
>>>
>>> +static __always_inline bool tif_need_resched(resched_t rs)
>>> +{
>>> +	/*
>>> +	 * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
>>> +	 * as TIF_NEED_RESCHED (the TIF_NEED_RESCHED_LAZY flag is not
>>> +	 * defined). Return false in that case.
>>> +	 */
>>> +	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
>>> +		return __tif_need_resched(tif_resched(rs));
>>> +	else
>>> +		return false;
>>> +}
>>
>> As above, I think this would be a bit simpler/clearer if we did:
>>
>> static __always_inline bool tif_need_resched_now(void)
>> {
>> 	return __tif_need_resched(TIF_NEED_RESCHED);
>> }
>>
>> static __always_inline bool tif_need_resched_lazy(void)
>> {
>> 	return IS_ENABLED(CONFIG_PREEMPT_AUTO) &&
>> 	        __tif_need_resched(TIF_NEED_RESCHED_LAZY);
>> }
>
> Yes please.

As I wrote to Mark in the sibling subthread, I think exposing
the lazy variants outside of the scheduler isn't really needed.

Non-scheduler/non-entry code only cares about checking whether it needs
to reschedule now. So, I think we can get away with just having:

    static __always_inline bool __tif_need_resched(resched_t rs)
    {
            /*
            * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
            * as TIF_NEED_RESCHED (the TIF_NEED_RESCHED_LAZY flag is not
            * defined). Return false in that case.
            */
            if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
                    return tif_need_resched_bitop(tif_resched(rs));
            else
                    return false;
    }

    static __always_inline bool tif_need_resched(void)
    {
            return __tif_need_resched(NR_now);
    }

(and similar for all the other interfaces.)

This way lazy and eager just become an implementation detail, which
also matches nicely with the point of PREEMPT_AUTO.

--
ankur
  
Thomas Gleixner Feb. 21, 2024, 5:07 p.m. UTC | #8
On Tue, Feb 20 2024 at 14:21, Ankur Arora wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
>>> static __always_inline bool tif_need_resched_lazy(void)
>>> {
>>> 	return IS_ENABLED(CONFIG_PREEMPT_AUTO) &&
>>> 	        __tif_need_resched(TIF_NEED_RESCHED_LAZY);
>>> }
>>
>> Yes please.
>
> As I wrote to Mark in the sibling subthread, I think exposing
> the lazy variants outside of the scheduler isn't really needed.

But having a proper wrapper inline in the scheduler code (local header)
still makes a lot of sense.

Thanks,

        tglx
  
Ankur Arora Feb. 21, 2024, 9:22 p.m. UTC | #9
Thomas Gleixner <tglx@linutronix.de> writes:

> On Tue, Feb 20 2024 at 14:21, Ankur Arora wrote:
>> Thomas Gleixner <tglx@linutronix.de> writes:
>>>> static __always_inline bool tif_need_resched_lazy(void)
>>>> {
>>>> 	return IS_ENABLED(CONFIG_PREEMPT_AUTO) &&
>>>> 	        __tif_need_resched(TIF_NEED_RESCHED_LAZY);
>>>> }
>>>
>>> Yes please.
>>
>> As I wrote to Mark in the sibling subthread, I think exposing
>> the lazy variants outside of the scheduler isn't really needed.
>
> But having a proper wrapper inline in the scheduler code (local header)
> makes still a lot of sense.

Yeah, there are a few callsites where this will help. Will do.

Thanks

--
ankur
  

Patch

diff --git a/arch/s390/include/asm/preempt.h b/arch/s390/include/asm/preempt.h
index bf15da0fedbc..97fda8e06b6c 100644
--- a/arch/s390/include/asm/preempt.h
+++ b/arch/s390/include/asm/preempt.h
@@ -114,13 +114,13 @@  static inline void __preempt_count_sub(int val)
 
 static inline bool __preempt_count_dec_and_test(void)
 {
-	return !--S390_lowcore.preempt_count && tif_need_resched();
+	return !--S390_lowcore.preempt_count && tif_need_resched(NR_now);
 }
 
 static inline bool should_resched(int preempt_offset)
 {
 	return unlikely(preempt_count() == preempt_offset &&
-			tif_need_resched());
+			tif_need_resched(NR_now));
 }
 
 #endif /* CONFIG_HAVE_MARCH_Z196_FEATURES */
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index 55437f5e0c3a..7fc47007b926 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -108,7 +108,7 @@  static const struct dmi_system_id processor_power_dmi_table[] = {
  */
 static void __cpuidle acpi_safe_halt(void)
 {
-	if (!tif_need_resched()) {
+	if (!need_resched()) {
 		raw_safe_halt();
 		raw_local_irq_disable();
 	}
diff --git a/include/asm-generic/preempt.h b/include/asm-generic/preempt.h
index 51f8f3881523..ed98e6168b99 100644
--- a/include/asm-generic/preempt.h
+++ b/include/asm-generic/preempt.h
@@ -66,7 +66,7 @@  static __always_inline bool __preempt_count_dec_and_test(void)
 	 * operations; we cannot use PREEMPT_NEED_RESCHED because it might get
 	 * lost.
 	 */
-	return !--*preempt_count_ptr() && tif_need_resched();
+	return !--*preempt_count_ptr() && tif_need_resched(NR_now);
 }
 
 /*
@@ -75,7 +75,7 @@  static __always_inline bool __preempt_count_dec_and_test(void)
 static __always_inline bool should_resched(int preempt_offset)
 {
 	return unlikely(preempt_count() == preempt_offset &&
-			tif_need_resched());
+			tif_need_resched(NR_now));
 }
 
 #ifdef CONFIG_PREEMPTION
diff --git a/include/linux/preempt.h b/include/linux/preempt.h
index 7233e9cf1bab..336cb76f0830 100644
--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -312,7 +312,7 @@  do { \
 } while (0)
 #define preempt_fold_need_resched() \
 do { \
-	if (tif_need_resched()) \
+	if (tif_need_resched(NR_now)) \
 		set_preempt_need_resched(); \
 } while (0)
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index cdb8ea53c365..7a22a56350bb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2099,7 +2099,12 @@  static inline bool preempt_model_preemptible(void)
 
 static __always_inline bool need_resched(void)
 {
-	return unlikely(tif_need_resched());
+	return unlikely(tif_need_resched(NR_now));
+}
+
+static __always_inline bool need_resched_lazy(void)
+{
+	return unlikely(tif_need_resched(NR_lazy));
 }
 
 /*
diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
index 478084f9105e..d8ce85dff47a 100644
--- a/include/linux/sched/idle.h
+++ b/include/linux/sched/idle.h
@@ -63,7 +63,7 @@  static __always_inline bool __must_check current_set_polling_and_test(void)
 	 */
 	smp_mb__after_atomic();
 
-	return unlikely(tif_need_resched());
+	return need_resched();
 }
 
 static __always_inline bool __must_check current_clr_polling_and_test(void)
@@ -76,7 +76,7 @@  static __always_inline bool __must_check current_clr_polling_and_test(void)
 	 */
 	smp_mb__after_atomic();
 
-	return unlikely(tif_need_resched());
+	return unlikely(need_resched());
 }
 
 #else
@@ -85,11 +85,11 @@  static inline void __current_clr_polling(void) { }
 
 static inline bool __must_check current_set_polling_and_test(void)
 {
-	return unlikely(tif_need_resched());
+	return unlikely(need_resched());
 }
 static inline bool __must_check current_clr_polling_and_test(void)
 {
-	return unlikely(tif_need_resched());
+	return unlikely(need_resched());
 }
 #endif
 
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 99043cbbb6b0..8752dbc2dac7 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -211,22 +211,35 @@  static __always_inline unsigned long read_ti_thread_flags(struct thread_info *ti
 
 #ifdef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H
 
-static __always_inline bool tif_need_resched(void)
+static __always_inline bool __tif_need_resched(int nr_flag)
 {
-	return arch_test_bit(TIF_NEED_RESCHED,
-			     (unsigned long *)(&current_thread_info()->flags));
+	return arch_test_bit(nr_flag,
+		     (unsigned long *)(&current_thread_info()->flags));
 }
 
 #else
 
-static __always_inline bool tif_need_resched(void)
+static __always_inline bool __tif_need_resched(int nr_flag)
 {
-	return test_bit(TIF_NEED_RESCHED,
-			(unsigned long *)(&current_thread_info()->flags));
+	return test_bit(nr_flag,
+		(unsigned long *)(&current_thread_info()->flags));
 }
 
 #endif /* _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H */
 
+static __always_inline bool tif_need_resched(resched_t rs)
+{
+	/*
+	 * With !PREEMPT_AUTO tif_need_resched(NR_lazy) is defined
+	 * as TIF_NEED_RESCHED (the TIF_NEED_RESCHED_LAZY flag is not
+	 * defined). Return false in that case.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_AUTO) || rs == NR_now)
+		return __tif_need_resched(tif_resched(rs));
+	else
+		return false;
+}
+
 #ifndef CONFIG_HAVE_ARCH_WITHIN_STACK_FRAMES
 static inline int arch_within_stack_frames(const void * const stack,
 					   const void * const stackend,
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 31231925f1ec..be53d164e267 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -57,7 +57,7 @@  static noinline int __cpuidle cpu_idle_poll(void)
 	ct_cpuidle_enter();
 
 	raw_local_irq_enable();
-	while (!tif_need_resched() &&
+	while (!need_resched() &&
 	       (cpu_idle_force_poll || tick_check_broadcast_expired()))
 		cpu_relax();
 	raw_local_irq_disable();
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2a7c6fd934e9..0a9642fba852 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2701,7 +2701,7 @@  unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status)
 	if (softirq_count() >> (SOFTIRQ_SHIFT + 1))
 		trace_flags |= TRACE_FLAG_BH_OFF;
 
-	if (tif_need_resched())
+	if (tif_need_resched(NR_now))
 		trace_flags |= TRACE_FLAG_NEED_RESCHED;
 	if (test_preempt_need_resched())
 		trace_flags |= TRACE_FLAG_PREEMPT_RESCHED;