[REPOST] srcu: Use try-lock lockdep annotation for NMI-safe access.

Message ID 20231121123315.egrgopGN@linutronix.de
State New
Series [REPOST] srcu: Use try-lock lockdep annotation for NMI-safe access.

Commit Message

Sebastian Andrzej Siewior Nov. 21, 2023, 12:33 p.m. UTC
It is claimed that srcu_read_lock_nmisafe() is NMI-safe. However, it
triggers a lockdep warning if used from NMI because lockdep expects a
deadlock, since nothing disables NMIs while the lock is acquired.

Use a try-lock annotation for srcu_read_lock_nmisafe() to avoid lockdep
complaints if used from NMI.

Fixes: f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies")
Link: https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---

This is a repost of
	https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de

Based on the discussion there I *think* this is preferred over the NMI
check in lock_acquire().
But then PeterZ also pointed out that he has a problem with
	f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies")

because of the trace_.*_rcuidle machinery. This looks okay because the _rcuidle
part is using SRCU and the rcu_dereference_raw() tracepoint_func is
using RCU + SRCU in its free part.

 include/linux/rcupdate.h |    6 ++++++
 include/linux/srcu.h     |    2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)
  

Comments

Boqun Feng Nov. 22, 2023, 6:08 p.m. UTC | #1
On Tue, Nov 21, 2023 at 01:33:15PM +0100, Sebastian Andrzej Siewior wrote:
> It is claimed that srcu_read_lock_nmisafe() is NMI-safe. However, it
> triggers a lockdep warning if used from NMI because lockdep expects a
> deadlock, since nothing disables NMIs while the lock is acquired.
> 


Thanks for reposting!

I would add a paragraph here explaining why the commit is the culprit:

This is because commit f0f44752f5f61 ("rcu: Annotate SRCU's update-side
lockdep dependencies") annotates synchronize_srcu() as a write lock
usage (so that a srcu_read_lock(); synchronize_srcu() deadlock can be
found). The side effect is that the srcu_struct lock now has a USED
usage in normal contexts, so it conflicts with a USED_READ usage in NMI.
But this shouldn't cause a real deadlock because the write lock usage
from synchronize_srcu() is a fake one and is only used for read/write
deadlock detection.

> Use a try-lock annotation for srcu_read_lock_nmisafe() to avoid lockdep
> complaints if used from NMI.
> 
> Fixes: f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies")
> Link: https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> 
> This is a repost of
> 	https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de
> 
> Based on the discussion there I *think* this is preferred over the NMI
> check in lock_acquire().
> But then PeterZ also pointed out that he has a problem with
> 	f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies")
> 
> because of the trace_.*_rcuidle machinery. This looks okay because the _rcuidle
> part is using SRCU and the rcu_dereference_raw() tracepoint_func is
> using RCU + SRCU in its free part.
> 

Yeah, I think we don't have more problems (famous last words).

Reviewed-by: Boqun Feng <boqun.feng@gmail.com>

Regards,
Boqun

>  include/linux/rcupdate.h |    6 ++++++
>  include/linux/srcu.h     |    2 +-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -301,6 +301,11 @@ static inline void rcu_lock_acquire(stru
>  	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
>  }
>  
> +static inline void rcu_try_lock_acquire(struct lockdep_map *map)
> +{
> +	lock_acquire(map, 0, 1, 2, 0, NULL, _THIS_IP_);
> +}
> +
>  static inline void rcu_lock_release(struct lockdep_map *map)
>  {
>  	lock_release(map, _THIS_IP_);
> @@ -315,6 +320,7 @@ int rcu_read_lock_any_held(void);
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  # define rcu_lock_acquire(a)		do { } while (0)
> +# define rcu_try_lock_acquire(a)	do { } while (0)
>  # define rcu_lock_release(a)		do { } while (0)
>  
>  static inline int rcu_read_lock_held(void)
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -229,7 +229,7 @@ static inline int srcu_read_lock_nmisafe
>  
>  	srcu_check_nmi_safety(ssp, true);
>  	retval = __srcu_read_lock_nmisafe(ssp);
> -	rcu_lock_acquire(&ssp->dep_map);
> +	rcu_try_lock_acquire(&ssp->dep_map);
>  	return retval;
>  }
>
  
Paul E. McKenney Dec. 5, 2023, 4:23 a.m. UTC | #2
On Thu, Nov 30, 2023 at 02:27:29PM +0100, Sebastian Andrzej Siewior wrote:
> It is claimed that srcu_read_lock_nmisafe() is NMI-safe. However, it
> triggers a lockdep warning if used from NMI because lockdep expects a
> deadlock, since nothing disables NMIs while the lock is acquired.
> 
> This is because commit f0f44752f5f61 ("rcu: Annotate SRCU's update-side
> lockdep dependencies") annotates synchronize_srcu() as a write lock
> usage. This helps to detect deadlocks such as
> 	srcu_read_lock();
> 	synchronize_srcu();
> 	srcu_read_unlock();
> 
> The side effect is that the lock srcu_struct now has a USED usage in normal
> contexts, so it conflicts with a USED_READ usage in NMI. But this shouldn't
> cause a real deadlock because the write lock usage from synchronize_srcu() is a
> fake one and only used for read/write deadlock detection.
> 
> Use a try-lock annotation for srcu_read_lock_nmisafe() to avoid lockdep
> complaints if used from NMI.
> 
> Fixes: f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies")
> Link: https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de
> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Queued for v6.9 along with further review and testing, thank you both!

							Thanx, Paul

> ---
>  include/linux/rcupdate.h | 6 ++++++
>  include/linux/srcu.h     | 2 +-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index f7206b2623c98..31d523c4e0893 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -301,6 +301,11 @@ static inline void rcu_lock_acquire(struct lockdep_map *map)
>  	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
>  }
>  
> +static inline void rcu_try_lock_acquire(struct lockdep_map *map)
> +{
> +	lock_acquire(map, 0, 1, 2, 0, NULL, _THIS_IP_);
> +}
> +
>  static inline void rcu_lock_release(struct lockdep_map *map)
>  {
>  	lock_release(map, _THIS_IP_);
> @@ -315,6 +320,7 @@ int rcu_read_lock_any_held(void);
>  #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
>  
>  # define rcu_lock_acquire(a)		do { } while (0)
> +# define rcu_try_lock_acquire(a)	do { } while (0)
>  # define rcu_lock_release(a)		do { } while (0)
>  
>  static inline int rcu_read_lock_held(void)
> diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> index 127ef3b2e6073..236610e4a8fa5 100644
> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -229,7 +229,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
>  
>  	srcu_check_nmi_safety(ssp, true);
>  	retval = __srcu_read_lock_nmisafe(ssp);
> -	rcu_lock_acquire(&ssp->dep_map);
> +	rcu_try_lock_acquire(&ssp->dep_map);
>  	return retval;
>  }
>  
> -- 
> 2.43.0
  

Patch

--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -301,6 +301,11 @@  static inline void rcu_lock_acquire(stru
 	lock_acquire(map, 0, 0, 2, 0, NULL, _THIS_IP_);
 }
 
+static inline void rcu_try_lock_acquire(struct lockdep_map *map)
+{
+	lock_acquire(map, 0, 1, 2, 0, NULL, _THIS_IP_);
+}
+
 static inline void rcu_lock_release(struct lockdep_map *map)
 {
 	lock_release(map, _THIS_IP_);
@@ -315,6 +320,7 @@  int rcu_read_lock_any_held(void);
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
 # define rcu_lock_acquire(a)		do { } while (0)
+# define rcu_try_lock_acquire(a)	do { } while (0)
 # define rcu_lock_release(a)		do { } while (0)
 
 static inline int rcu_read_lock_held(void)
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -229,7 +229,7 @@  static inline int srcu_read_lock_nmisafe
 
 	srcu_check_nmi_safety(ssp, true);
 	retval = __srcu_read_lock_nmisafe(ssp);
-	rcu_lock_acquire(&ssp->dep_map);
+	rcu_try_lock_acquire(&ssp->dep_map);
 	return retval;
 }