rcu: Fix opposite might_sleep() check in rcu_blocking_is_gp()

Message ID 20221215035755.2820163-1-qiang1.zhang@intel.com
State New
Series rcu: Fix opposite might_sleep() check in rcu_blocking_is_gp()

Commit Message

Zqiang Dec. 15, 2022, 3:57 a.m. UTC
Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
synchronize_rcu_*() implies a grace period and returns directly, so
there is no sleep while waiting for a grace period to end. However, the
might_sleep() check is done in exactly that case and skipped otherwise,
which is the opposite of what is needed. Therefore, this commit puts the
might_sleep() check in the correct place.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tree.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
  

Comments

Paul E. McKenney Dec. 17, 2022, 1:03 a.m. UTC | #1
On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> synchronize_rcu_*() implies a grace period and returns directly, so
> there is no sleep while waiting for a grace period to end. However, the
> might_sleep() check is done in exactly that case and skipped otherwise,
> which is the opposite of what is needed. Therefore, this commit puts the
> might_sleep() check in the correct place.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

Queued for testing and review, thank you!

I was under the impression that might_sleep() did some lockdep-based
checking, but I am unable to find it.  If there really is such checking,
that would be a potential argument for leaving this code as it is.

But in the meantime, full speed ahead!  ;-)

						Thanx, Paul

> ---
>  kernel/rcu/tree.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index ee8a6a711719..65f3dd2fd3ae 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3379,9 +3379,10 @@ void __init kfree_rcu_scheduler_running(void)
>   */
>  static int rcu_blocking_is_gp(void)
>  {
> -	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
> +	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
> +		might_sleep();
>  		return false;
> -	might_sleep();  /* Check for RCU read-side critical section. */
> +	}
>  	return true;
>  }
>  
> -- 
> 2.25.1
>
  
Zqiang Dec. 17, 2022, 2:08 a.m. UTC | #2
On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> synchronize_rcu_*() implies a grace period and returns directly, so
> there is no sleep while waiting for a grace period to end. However, the
> might_sleep() check is done in exactly that case and skipped otherwise,
> which is the opposite of what is needed. Therefore, this commit puts the
> might_sleep() check in the correct place.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
>Queued for testing and review, thank you!
>
>I was under the impression that might_sleep() did some lockdep-based
>checking, but I am unable to find it.  If there really is such checking,
>that would be a potential argument for leaving this code as it is.
>

__might_sleep
   __might_resched(file, line, 0)
      rcu_sleep_check()

Does it refer to this rcu_sleep_check() ?

If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
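
As a rough illustration of that gating, here is a small user-space C model.
It is a sketch only, not kernel code: every model_* name is hypothetical, and
the condition inside model_debug_lockdep_rcu_enabled() is a simplified
assumption about what debug_lockdep_rcu_enabled() checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel state; the names only mirror the thread. */
enum model_sched_state {
	MODEL_SCHEDULER_INACTIVE,
	MODEL_SCHEDULER_INIT,
	MODEL_SCHEDULER_RUNNING,
};

struct model_kernel {
	enum model_sched_state rcu_scheduler_active;
	bool debug_locks;        /* assumed: lockdep has not been turned off */
	bool condition_violated; /* the condition the sleep check would flag */
};

/* Assumed shape: lockdep-RCU checking stays off until the scheduler starts. */
static bool model_debug_lockdep_rcu_enabled(const struct model_kernel *k)
{
	assert(k != NULL);
	return k->rcu_scheduler_active != MODEL_SCHEDULER_INACTIVE &&
	       k->debug_locks;
}

/* True if the modelled rcu_sleep_check() would actually emit a warning. */
static bool model_rcu_sleep_check_warns(const struct model_kernel *k)
{
	return model_debug_lockdep_rcu_enabled(k) && k->condition_violated;
}
```

With rcu_scheduler_active set to MODEL_SCHEDULER_INACTIVE, the second function
returns false whatever the other fields hold, which is the point made above.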

Thanks
Zqiang


>But in the meantime, full speed ahead!  ;-)
>
>						Thanx, Paul
>
> ---
>  kernel/rcu/tree.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index ee8a6a711719..65f3dd2fd3ae 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3379,9 +3379,10 @@ void __init kfree_rcu_scheduler_running(void)
>   */
>  static int rcu_blocking_is_gp(void)
>  {
> -	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
> +	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
> +		might_sleep();
>  		return false;
> -	might_sleep();  /* Check for RCU read-side critical section. */
> +	}
>  	return true;
>  }
>  
> -- 
> 2.25.1
>
  
Zqiang Dec. 17, 2022, 2:44 a.m. UTC | #3
On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> synchronize_rcu_*() implies a grace period and returns directly, so
> there is no sleep while waiting for a grace period to end. However, the
> might_sleep() check is done in exactly that case and skipped otherwise,
> which is the opposite of what is needed. Therefore, this commit puts the
> might_sleep() check in the correct place.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
>Queued for testing and review, thank you!
>
>I was under the impression that might_sleep() did some lockdep-based
>checking, but I am unable to find it.  If there really is such checking,
>that would be a potential argument for leaving this code as it is.
>
>
>__might_sleep
>   __might_resched(file, line, 0)
>      rcu_sleep_check()
>
>Does it refer to this rcu_sleep_check() ?
>
>If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
>returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
>

Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.
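
A minimal user-space sketch of that early return, for illustration only. The
model_* names are hypothetical, and the bail-out condition is reduced to the
single SYSTEM_BOOTING case discussed here; the real function tests more than
that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of system states. */
enum model_system_state {
	MODEL_SYSTEM_BOOTING,
	MODEL_SYSTEM_RUNNING,
};

struct model_ctx {
	enum model_system_state system_state;
	bool irqs_disabled;
	bool in_atomic;
};

struct model_report {
	int rcu_sleep_checks; /* times the RCU sleep check ran */
	int bug_splats;       /* times the "sleeping function" splat fired */
};

/* Models the order of operations: RCU check first, then the boot bail-out. */
static void model_might_resched(const struct model_ctx *c,
				struct model_report *r)
{
	assert(c != NULL && r != NULL);

	r->rcu_sleep_checks++;	/* always runs, even during boot */

	if (c->system_state == MODEL_SYSTEM_BOOTING)
		return;		/* no further diagnostics this early */

	if (c->irqs_disabled || c->in_atomic)
		r->bug_splats++;
}
```

In this model a call made while booting with interrupts disabled bumps only
rcu_sleep_checks, while the same call after boot also bumps bug_splats.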

Thanks
Zqiang

>Thanks
>Zqiang
>

>But in the meantime, full speed ahead!  ;-)
>
>						Thanx, Paul
>
> ---
>  kernel/rcu/tree.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index ee8a6a711719..65f3dd2fd3ae 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3379,9 +3379,10 @@ void __init kfree_rcu_scheduler_running(void)
>   */
>  static int rcu_blocking_is_gp(void)
>  {
> -	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
> +	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
> +		might_sleep();
>  		return false;
> -	might_sleep();  /* Check for RCU read-side critical section. */
> +	}
>  	return true;
>  }
>  
> -- 
> 2.25.1
>
  
Paul E. McKenney Dec. 17, 2022, 5:17 a.m. UTC | #4
On Sat, Dec 17, 2022 at 02:44:47AM +0000, Zhang, Qiang1 wrote:
> 
> On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> > Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> > synchronize_rcu_*() implies a grace period and returns directly, so
> > there is no sleep while waiting for a grace period to end. However, the
> > might_sleep() check is done in exactly that case and skipped otherwise,
> > which is the opposite of what is needed. Therefore, this commit puts the
> > might_sleep() check in the correct place.
> > 
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> >
> >Queued for testing and review, thank you!
> >
> >I was under the impression that might_sleep() did some lockdep-based
> >checking, but I am unable to find it.  If there really is such checking,
> >that would be a potential argument for leaving this code as it is.
> >
> >
> >__might_sleep
> >   __might_resched(file, line, 0)
> >      rcu_sleep_check()
> >
> >Does it refer to this rcu_sleep_check() ?
> >
> > >If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
> > >returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
> 
> > Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.

Very good, thank you!

Thoughts from others?

							Thanx, Paul

> Thanks
> Zqiang
> 
> >Thanks
> >Zqiang
> >
> 
> >But in the meantime, full speed ahead!  ;-)
> >
> >						Thanx, Paul
> >
> > ---
> >  kernel/rcu/tree.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index ee8a6a711719..65f3dd2fd3ae 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3379,9 +3379,10 @@ void __init kfree_rcu_scheduler_running(void)
> >   */
> >  static int rcu_blocking_is_gp(void)
> >  {
> > -	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
> > +	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
> > +		might_sleep();
> >  		return false;
> > -	might_sleep();  /* Check for RCU read-side critical section. */
> > +	}
> >  	return true;
> >  }
> >  
> > -- 
> > 2.25.1
> >
  
Joel Fernandes Dec. 18, 2022, 2:01 a.m. UTC | #5
On Fri, Dec 16, 2022 at 09:17:59PM -0800, Paul E. McKenney wrote:
> On Sat, Dec 17, 2022 at 02:44:47AM +0000, Zhang, Qiang1 wrote:
> > 
> > On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> > > Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> > > synchronize_rcu_*() implies a grace period and returns directly, so
> > > there is no sleep while waiting for a grace period to end. However, the
> > > might_sleep() check is done in exactly that case and skipped otherwise,
> > > which is the opposite of what is needed. Therefore, this commit puts the
> > > might_sleep() check in the correct place.
> > > 
> > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > >
> > >Queued for testing and review, thank you!
> > >
> > >I was under the impression that might_sleep() did some lockdep-based
> > >checking, but I am unable to find it.  If there really is such checking,
> > >that would be a potential argument for leaving this code as it is.
> > >
> > >
> > >__might_sleep
> > >   __might_resched(file, line, 0)
> > >      rcu_sleep_check()
> > >
> > >Does it refer to this rcu_sleep_check() ?
> > >
> > > >If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
> > > >returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
> > > 
> > > Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.
> 
> Very good, thank you!
> 
> Thoughts from others?

Please consider this as a best-effort comment that might be missing details:

The might_sleep() was added in 18fec7d8758d ("rcu: Improve synchronize_rcu()
diagnostics")

Since it is illegal to call a blocking API like synchronize_rcu() in a
non-preemptible section, is there any harm in just calling might_sleep()
unconditionally in rcu_blocking_is_gp()? I think it is a bit irrelevant
whether synchronize_rcu() is called from a call path before the scheduler is
initialized or after. The fact that it was even called from a
non-preemptible section is a red flag, considering that such a non-preemptible
section may call the synchronize_rcu() API in the future, after full boot up,
even if rarely.

For this reason, IMHO there is still value in doing the might_sleep() check
unconditionally. Say if a common code path is invoked both before
RCU_SCHEDULER_INIT and *very rarely* after RCU_SCHEDULER_INIT.

Or is there more of a point in doing this check if scheduler is initialized
from RCU perspective ?

If not, I would do something like this:

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 79aea7df4345..23c2303de9f4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
 {
 	int ret;
 
+	might_sleep();  /* Check for RCU read-side critical section. */
+
 	// Invoking preempt_model_*() too early gets a splat.
 	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
 	    preempt_model_full() || preempt_model_rt())
 		return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
-	might_sleep();  /* Check for RCU read-side critical section. */
 	preempt_disable();
 	/*
 	 * If the rcu_state.n_online_cpus counter is equal to one,
  
Paul E. McKenney Dec. 18, 2022, 6:06 p.m. UTC | #6
On Sun, Dec 18, 2022 at 02:01:11AM +0000, Joel Fernandes wrote:
> On Fri, Dec 16, 2022 at 09:17:59PM -0800, Paul E. McKenney wrote:
> > On Sat, Dec 17, 2022 at 02:44:47AM +0000, Zhang, Qiang1 wrote:
> > > 
> > > On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> > > > Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> > > > synchronize_rcu_*() implies a grace period and returns directly, so
> > > > there is no sleep while waiting for a grace period to end. However, the
> > > > might_sleep() check is done in exactly that case and skipped otherwise,
> > > > which is the opposite of what is needed. Therefore, this commit puts the
> > > > might_sleep() check in the correct place.
> > > > 
> > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > >
> > > >Queued for testing and review, thank you!
> > > >
> > > >I was under the impression that might_sleep() did some lockdep-based
> > > >checking, but I am unable to find it.  If there really is such checking,
> > > >that would be a potential argument for leaving this code as it is.
> > > >
> > > >
> > > >__might_sleep
> > > >   __might_resched(file, line, 0)
> > > >      rcu_sleep_check()
> > > >
> > > >Does it refer to this rcu_sleep_check() ?
> > > >
> > > > >If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
> > > > >returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
> > > > 
> > > > Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.
> > 
> > Very good, thank you!
> > 
> > Thoughts from others?
> 
> Please consider this as a best-effort comment that might be missing details:
> 
> The might_sleep() was added in 18fec7d8758d ("rcu: Improve synchronize_rcu()
> diagnostics")
> 
> Since it is illegal to call a blocking API like synchronize_rcu() in a
> non-preemptible section, is there any harm in just calling might_sleep()
> unconditionally in rcu_blocking_is_gp()? I think it is a bit irrelevant
> whether synchronize_rcu() is called from a call path before the scheduler is
> initialized or after. The fact that it was even called from a
> non-preemptible section is a red flag, considering that such a non-preemptible
> section may call the synchronize_rcu() API in the future, after full boot up,
> even if rarely.
> 
> For this reason, IMHO there is still value in doing the might_sleep() check
> unconditionally. Say if a common code path is invoked both before
> RCU_SCHEDULER_INIT and *very rarely* after RCU_SCHEDULER_INIT.
> 
> Or is there more of a point in doing this check if scheduler is initialized
> from RCU perspective ?

One advantage of its current placement would be if might_sleep() ever
unconditionally checks for interrupts being disabled.

I don't believe that might_sleep() will do that any time soon given the
likely fallout from code invoked at early boot as well as from runtime,
but why be in the way of that additional diagnostic check?

							Thanx, Paul

> If not, I would do something like this:
> 
> ---8<-----------------------
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 79aea7df4345..23c2303de9f4 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
>  {
>  	int ret;
>  
> +	might_sleep();  /* Check for RCU read-side critical section. */
> +
>  	// Invoking preempt_model_*() too early gets a splat.
>  	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
>  	    preempt_model_full() || preempt_model_rt())
>  		return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
> -	might_sleep();  /* Check for RCU read-side critical section. */
>  	preempt_disable();
>  	/*
>  	 * If the rcu_state.n_online_cpus counter is equal to one,
  
Joel Fernandes Dec. 18, 2022, 7:29 p.m. UTC | #7
On Sun, Dec 18, 2022 at 1:06 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Sun, Dec 18, 2022 at 02:01:11AM +0000, Joel Fernandes wrote:
> > On Fri, Dec 16, 2022 at 09:17:59PM -0800, Paul E. McKenney wrote:
> > > On Sat, Dec 17, 2022 at 02:44:47AM +0000, Zhang, Qiang1 wrote:
> > > >
> > > > On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> > > > > Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> > > > > synchronize_rcu_*() implies a grace period and returns directly, so
> > > > > there is no sleep while waiting for a grace period to end. However, the
> > > > > might_sleep() check is done in exactly that case and skipped otherwise,
> > > > > which is the opposite of what is needed. Therefore, this commit puts the
> > > > > might_sleep() check in the correct place.
> > > > >
> > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > >
> > > > >Queued for testing and review, thank you!
> > > > >
> > > > >I was under the impression that might_sleep() did some lockdep-based
> > > > >checking, but I am unable to find it.  If there really is such checking,
> > > > >that would be a potential argument for leaving this code as it is.
> > > > >
> > > > >
> > > > >__might_sleep
> > > > >   __might_resched(file, line, 0)
> > > > >      rcu_sleep_check()
> > > > >
> > > > >Does it refer to this rcu_sleep_check() ?
> > > > >
> > > > > >If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
> > > > > >returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
> > > > >
> > > > > Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.
> > >
> > > Very good, thank you!
> > >
> > > Thoughts from others?
> >
> > Please consider this as a best-effort comment that might be missing details:
> >
> > The might_sleep() was added in 18fec7d8758d ("rcu: Improve synchronize_rcu()
> > diagnostics")
> >
> > Since it is illegal to call a blocking API like synchronize_rcu() in a
> > non-preemptible section, is there any harm in just calling might_sleep()
> > unconditionally in rcu_blocking_is_gp()? I think it is a bit irrelevant
> > whether synchronize_rcu() is called from a call path before the scheduler is
> > initialized or after. The fact that it was even called from a
> > non-preemptible section is a red flag, considering that such a non-preemptible
> > section may call the synchronize_rcu() API in the future, after full boot up,
> > even if rarely.
> >
> > For this reason, IMHO there is still value in doing the might_sleep() check
> > unconditionally. Say if a common code path is invoked both before
> > RCU_SCHEDULER_INIT and *very rarely* after RCU_SCHEDULER_INIT.
> >
> > Or is there more of a point in doing this check if scheduler is initialized
> > from RCU perspective ?
>
> One advantage of its current placement would be if might_sleep() ever
> unconditionally checks for interrupts being disabled.
>
> I don't believe that might_sleep() will do that any time soon given the
> likely fallout from code invoked at early boot as well as from runtime,
> but why be in the way of that additional diagnostic check?

If I understand the current code, might_sleep() is invoked only if the
scheduler is INACTIVE from RCU's perspective, and I don't think there are
reports of fallout. That is the current code behavior.

Situation right now is: might_sleep() only if the state is INACTIVE.
Qiang's patch: might_sleep() only if the state is NOT INACTIVE.
My suggestion: might_sleep() regardless of the state.
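
To make the three options concrete, here is a small user-space C model of
the simple two-branch rcu_blocking_is_gp() from the patch under review. It
is a sketch only: the model_* names are hypothetical, and might_sleep() is
reduced to a flag recording whether it would have been called.

```c
#include <assert.h>
#include <stdbool.h>

/* The three placements of might_sleep() compared above. */
enum model_placement {
	MODEL_CURRENT,       /* check only when the scheduler is INACTIVE */
	MODEL_QIANG_PATCH,   /* check only when it is NOT INACTIVE */
	MODEL_UNCONDITIONAL, /* check regardless of the state */
};

struct model_result {
	bool blocking_is_gp;     /* value the function would return */
	bool might_sleep_called; /* whether the check ran on this call */
};

static struct model_result
model_rcu_blocking_is_gp(enum model_placement p, bool scheduler_inactive)
{
	/* In all three variants the return value depends only on the state. */
	struct model_result res = { scheduler_inactive, false };

	switch (p) {
	case MODEL_CURRENT:
		res.might_sleep_called = scheduler_inactive;
		break;
	case MODEL_QIANG_PATCH:
		res.might_sleep_called = !scheduler_inactive;
		break;
	case MODEL_UNCONDITIONAL:
		res.might_sleep_called = true;
		break;
	default:
		assert(!"unknown placement");
	}
	return res;
}
```

The model only shows when the check runs; it says nothing about whether
might_sleep() would complain in a given context.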

Is there a reason my suggestion will not work? Apologies if I
misunderstood something.

thanks,

 - Joel


>
>                                                         Thanx, Paul
>
> > If not, I would do something like this:
> >
> > ---8<-----------------------
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 79aea7df4345..23c2303de9f4 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
> >  {
> >       int ret;
> >
> > +     might_sleep();  /* Check for RCU read-side critical section. */
> > +
> >       // Invoking preempt_model_*() too early gets a splat.
> >       if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
> >           preempt_model_full() || preempt_model_rt())
> >               return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
> > -     might_sleep();  /* Check for RCU read-side critical section. */
> >       preempt_disable();
> >       /*
> >        * If the rcu_state.n_online_cpus counter is equal to one,
  
Paul E. McKenney Dec. 18, 2022, 7:44 p.m. UTC | #8
On Sun, Dec 18, 2022 at 02:29:10PM -0500, Joel Fernandes wrote:
> On Sun, Dec 18, 2022 at 1:06 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Sun, Dec 18, 2022 at 02:01:11AM +0000, Joel Fernandes wrote:
> > > On Fri, Dec 16, 2022 at 09:17:59PM -0800, Paul E. McKenney wrote:
> > > > On Sat, Dec 17, 2022 at 02:44:47AM +0000, Zhang, Qiang1 wrote:
> > > > >
> > > > > On Thu, Dec 15, 2022 at 11:57:55AM +0800, Zqiang wrote:
> > > > > > Currently, if the system is in the RCU_SCHEDULER_INACTIVE state, invoking
> > > > > > synchronize_rcu_*() implies a grace period and returns directly, so
> > > > > > there is no sleep while waiting for a grace period to end. However, the
> > > > > > might_sleep() check is done in exactly that case and skipped otherwise,
> > > > > > which is the opposite of what is needed. Therefore, this commit puts the
> > > > > > might_sleep() check in the correct place.
> > > > > >
> > > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > > >
> > > > > >Queued for testing and review, thank you!
> > > > > >
> > > > > >I was under the impression that might_sleep() did some lockdep-based
> > > > > >checking, but I am unable to find it.  If there really is such checking,
> > > > > >that would be a potential argument for leaving this code as it is.
> > > > > >
> > > > > >
> > > > > >__might_sleep
> > > > > >   __might_resched(file, line, 0)
> > > > > >      rcu_sleep_check()
> > > > > >
> > > > > >Does it refer to this rcu_sleep_check() ?
> > > > > >
> > > > > > >If so, in the RCU_SCHEDULER_INACTIVE state debug_lockdep_rcu_enabled() always
> > > > > > >returns false, so RCU_LOCKDEP_WARN() does not produce an actual warning either.
> > > > > >
> > > > > > Also, when system_state == SYSTEM_BOOTING, we just call rcu_sleep_check() and then return.
> > > >
> > > > Very good, thank you!
> > > >
> > > > Thoughts from others?
> > >
> > > Please consider this as a best-effort comment that might be missing details:
> > >
> > > The might_sleep() was added in 18fec7d8758d ("rcu: Improve synchronize_rcu()
> > > diagnostics")
> > >
> > > Since it is illegal to call a blocking API like synchronize_rcu() in a
> > > non-preemptible section, is there any harm in just calling might_sleep()
> > > unconditionally in rcu_blocking_is_gp()? I think it is a bit irrelevant
> > > whether synchronize_rcu() is called from a call path before the scheduler is
> > > initialized or after. The fact that it was even called from a
> > > non-preemptible section is a red flag, considering that such a non-preemptible
> > > section may call the synchronize_rcu() API in the future, after full boot up,
> > > even if rarely.
> > >
> > > For this reason, IMHO there is still value in doing the might_sleep() check
> > > unconditionally. Say if a common code path is invoked both before
> > > RCU_SCHEDULER_INIT and *very rarely* after RCU_SCHEDULER_INIT.
> > >
> > > Or is there more of a point in doing this check if scheduler is initialized
> > > from RCU perspective ?
> >
> > One advantage of its current placement would be if might_sleep() ever
> > unconditionally checks for interrupts being disabled.
> >
> > I don't believe that might_sleep() will do that any time soon given the
> > likely fallout from code invoked at early boot as well as from runtime,
> > but why be in the way of that additional diagnostic check?
> 
> If I understand the current code, might_sleep() is invoked only if the
> scheduler is INACTIVE from RCU's perspective, and I don't think there are
> reports of fallout. That is the current code behavior.
> 
> Situation right now is: might_sleep() only if the state is INACTIVE.
> Qiang's patch: might_sleep() only if the state is NOT INACTIVE.
> My suggestion: might_sleep() regardless of the state.
> 
> Is there a reason my suggestion will not work? Apologies if I
> misunderstood something.
> 
> thanks,
> 
>  - Joel
> 
> 
> >
> >                                                         Thanx, Paul
> >
> > > If not, I would do something like this:
> > >
> > > ---8<-----------------------
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 79aea7df4345..23c2303de9f4 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
> > >  {
> > >       int ret;
> > >
> > > +     might_sleep();  /* Check for RCU read-side critical section. */
> > > +
> > >       // Invoking preempt_model_*() too early gets a splat.
> > >       if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
> > >           preempt_model_full() || preempt_model_rt())
> > >               return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;

If the scheduler is inactive (early boot with interrupts disabled),
we return here.

> > > -     might_sleep();  /* Check for RCU read-side critical section. */

We get here only if the scheduler has started, and even then only in
preemption-disabled kernels.

Or is your concern that the might_sleep() never gets invoked in kernels
with preemption enabled?  Fixing that would require a slightly different
patch, though.
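
The control flow described above can be sketched as a user-space C model of
the early-return test in the quoted hunk. This is an illustration only: the
model_* names are hypothetical, and the preemption-model helpers are
replaced by plain flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for the state tested in the quoted hunk. */
struct model_cfg {
	bool scheduler_inactive; /* rcu_scheduler_active == RCU_SCHEDULER_INACTIVE */
	bool preempt_full;       /* stands in for preempt_model_full() */
	bool preempt_rt;         /* stands in for preempt_model_rt() */
};

/* True if control gets past the early return to the might_sleep() line. */
static bool model_reaches_might_sleep(const struct model_cfg *c)
{
	assert(c != NULL);

	if (c->scheduler_inactive || c->preempt_full || c->preempt_rt)
		return false;	/* early return, check is skipped */
	return true;
}
```

Only the all-false combination, meaning the scheduler has started on a
kernel that is neither fully preemptible nor RT, reaches the check.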

Or should I have waited until tomorrow to respond to this email?  ;-)

							Thanx, Paul

> > >       preempt_disable();
> > >       /*
> > >        * If the rcu_state.n_online_cpus counter is equal to one,
  
Joel Fernandes Dec. 18, 2022, 9:02 p.m. UTC | #9
On Sun, Dec 18, 2022 at 2:44 PM Paul E. McKenney <paulmck@kernel.org> wrote:
[...]
> > > > If not, I would do something like this:
> > > >
> > > > ---8<-----------------------
> > > >
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 79aea7df4345..23c2303de9f4 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
> > > >  {
> > > >       int ret;
> > > >
> > > > +     might_sleep();  /* Check for RCU read-side critical section. */
> > > > +
> > > >       // Invoking preempt_model_*() too early gets a splat.
> > > >       if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
> > > >           preempt_model_full() || preempt_model_rt())
> > > >               return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
>
> If the scheduler is inactive (early boot with interrupts disabled),
> we return here.
>
> > > > -     might_sleep();  /* Check for RCU read-side critical section. */
>
> We get here only if the scheduler has started, and even then only in
> preemption-disabled kernels.
>
> Or is your concern that the might_sleep() never gets invoked in kernels
> with preemption enabled?  Fixing that would require a slightly different
> patch, though.
>
> Or should I have waited until tomorrow to respond to this email?  ;-)

No, I think you are quite right. I was not referring to
rcu_sleep_check(), but rather the following prints in might_sleep(). I
see an unconditional call to might_sleep()  from kvfree_call_rcu() but
not one from synchronize_rcu() which can also sleep.

But I see your point, early boot code has interrupts disabled, but can
still totally call synchronize_rcu() when the scheduler is INACTIVE.
And might_sleep() might bitterly complain. Thanks for the
clarification.

pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
      file, line);
pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
      in_atomic(), irqs_disabled(), current->non_block_count,
      current->pid, current->comm);
pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
      offsets & MIGHT_RESCHED_PREEMPT_MASK);

Thanks,

 - Joel

> > > >       /*
> > > >        * If the rcu_state.n_online_cpus counter is equal to one,
  
Paul E. McKenney Dec. 18, 2022, 11:30 p.m. UTC | #10
On Sun, Dec 18, 2022 at 04:02:35PM -0500, Joel Fernandes wrote:
> On Sun, Dec 18, 2022 at 2:44 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> [...]
> > > > > If not, I would do something like this:
> > > > >
> > > > > ---8<-----------------------
> > > > >
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index 79aea7df4345..23c2303de9f4 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -3435,11 +3435,12 @@ static int rcu_blocking_is_gp(void)
> > > > >  {
> > > > >       int ret;
> > > > >
> > > > > +     might_sleep();  /* Check for RCU read-side critical section. */
> > > > > +
> > > > >       // Invoking preempt_model_*() too early gets a splat.
> > > > >       if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE ||
> > > > >           preempt_model_full() || preempt_model_rt())
> > > > >               return rcu_scheduler_active == RCU_SCHEDULER_INACTIVE;
> >
> > If the scheduler is inactive (early boot with interrupts disabled),
> > we return here.
> >
> > > > > -     might_sleep();  /* Check for RCU read-side critical section. */
> >
> > We get here only if the scheduler has started, and even then only in
> > preemption-disabled kernels.
> >
> > Or is your concern that the might_sleep() never gets invoked in kernels
> > with preemption enabled?  Fixing that would require a slightly different
> > patch, though.
> >
> > Or should I have waited until tomorrow to respond to this email?  ;-)
> 
> No, I think you are quite right. I was not referring to
> rcu_sleep_check(), but rather the following prints in might_sleep(). I
> see an unconditional call to might_sleep()  from kvfree_call_rcu() but
> not one from synchronize_rcu() which can also sleep.
> 
> But I see your point, early boot code has interrupts disabled, but can
> still totally call synchronize_rcu() when the scheduler is INACTIVE.
> And might_sleep() might bitterly complain. Thanks for the
> clarification.
> 
> pr_err("BUG: sleeping function called from invalid context at %s:%d\n",
>       file, line);
> pr_err("in_atomic(): %d, irqs_disabled(): %d, non_block: %d, pid: %d, name: %s\n",
>       in_atomic(), irqs_disabled(), current->non_block_count,
>       current->pid, current->comm);
> pr_err("preempt_count: %x, expected: %x\n", preempt_count(),
>       offsets & MIGHT_RESCHED_PREEMPT_MASK);

And I do not believe that we have defined whether or not it is OK to
invoke single-argument kvfree_rcu() before the scheduler has started.  ;-)

							Thanx, Paul
  

Patch

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ee8a6a711719..65f3dd2fd3ae 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3379,9 +3379,10 @@  void __init kfree_rcu_scheduler_running(void)
  */
 static int rcu_blocking_is_gp(void)
 {
-	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE)
+	if (rcu_scheduler_active != RCU_SCHEDULER_INACTIVE) {
+		might_sleep();
 		return false;
-	might_sleep();  /* Check for RCU read-side critical section. */
+	}
 	return true;
 }