wifi: mac80211: tx: Add __must_hold() annotation

Message ID 20240113011145.10888-2-bpappas@pappasbrent.com
State New
Series wifi: mac80211: tx: Add __must_hold() annotation

Commit Message

Brent Pappas Jan. 13, 2024, 1:11 a.m. UTC
  Annotates ieee80211_set_beacon_cntdwn() with a __must_hold() annotation to
make it clear that ieee80211_set_beacon_cntdwn() is only intended to be
called when the caller has a lock on the argument "link."

Signed-off-by: Brent Pappas <bpappas@pappasbrent.com>
---

Currently, ieee80211_set_beacon_cntdwn() calls rcu_dereference(), but
without calling rcu_read_lock() beforehand and rcu_read_unlock()
afterward.  At first I thought this was a bug, since (if I understand the
RCU API correctly) rcu_dereference() should only be called in RCU
read-side critical sections. However, upon closer inspection of the code,
I realized that ieee80211_set_beacon_cntdwn() is only ever called inside
critical sections. Therefore it seems appropriate to me to annotate
ieee80211_set_beacon_cntdwn() with a __must_hold() annotation to make this
apparent precondition explicit.

This is my first time submitting an RCU-related patch so please tell me if
I am misunderstanding the RCU API.

 net/mac80211/tx.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Kalle Valo Jan. 13, 2024, 6:32 a.m. UTC | #1
Brent Pappas <bpappas@pappasbrent.com> writes:

> Annotates ieee80211_set_beacon_cntdwn() with a __must_hold() annotation to
> make it clear that ieee80211_set_beacon_cntdwn() is only intended to be
> called when the caller has a lock on the argument "link."
>
> Signed-off-by: Brent Pappas <bpappas@pappasbrent.com>
> ---
>
> Currently, ieee80211_set_beacon_cntdwn() calls rcu_dereference(), but
> without calling rcu_read_lock() beforehand and rcu_read_unlock()
> afterward.  At first I thought this was a bug, since (if I understand the
> RCU API correctly) rcu_dereference() should only be called in RCU
> read-side critical sections. However, upon closer inspection of the code,
> I realized that ieee80211_set_beacon_cntdwn() is only ever called inside
> critical sections. Therefore it seems appropriate to me to annotate
> ieee80211_set_beacon_cntdwn() with a __must_hold() annotation to make this
> apparent precondition explicit.
>
> This is my first time submitting an RCU-related patch so please tell me if
> I am misunderstanding the RCU API.
>
>  net/mac80211/tx.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
> index 314998fdb1a5..7245f2e641ba 100644
> --- a/net/mac80211/tx.c
> +++ b/net/mac80211/tx.c
> @@ -10,6 +10,7 @@
>   * Transmit and frame generation functions.
>   */
>  
> +#include "linux/compiler_types.h"
>  #include <linux/kernel.h>
>  #include <linux/slab.h>
>  #include <linux/skbuff.h>
> @@ -4974,6 +4975,7 @@ static int ieee80211_beacon_add_tim(struct ieee80211_sub_if_data *sdata,
>  static void ieee80211_set_beacon_cntdwn(struct ieee80211_sub_if_data *sdata,
>  					struct beacon_data *beacon,
>  					struct ieee80211_link_data *link)
> +	__must_hold(link)

Oh, never seen __must_hold() before and looks very useful. So does this
work with RCU, mutexes and spinlocks?

In case others are interested, here's the documentation I was able to find:

https://docs.kernel.org/dev-tools/sparse.html#using-sparse-for-lock-checking
  
Johannes Berg Jan. 15, 2024, 1:13 p.m. UTC | #2
On Sat, 2024-01-13 at 08:32 +0200, Kalle Valo wrote:
> 
> >  static void ieee80211_set_beacon_cntdwn(struct ieee80211_sub_if_data *sdata,
> >  					struct beacon_data *beacon,
> >  					struct ieee80211_link_data *link)
> > +	__must_hold(link)
> 
> Oh, never seen __must_hold() before and looks very useful. So does this
> work with RCU, mutexes and spinlocks?
> 
> In case others are interested, here's the documentation I was able to find:
> 
> https://docs.kernel.org/dev-tools/sparse.html#using-sparse-for-lock-checking
> 

Except it's not actually useful, and looks more useful than it is. IMHO
it's actually more harmful than anything else.

One might even consider this patch a good example! The function
ieee80211_set_beacon_cntdwn() is called from a number of places in this
file, some of which acquire RCU critical section, and some of which
acquire no locks nor RCU critical section at all. Most of them nest and
are called in RCU.

However, there's basically no way to get sparse to warn on this. Even
inserting a function

void test(void);
void test(void)
{
        ieee80211_set_beacon_cntdwn(NULL, NULL, NULL);
}

will not cause sparse to complain, even though this *clearly* doesn't hold
any locks.
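
As a minimal userspace sketch of why the compiler itself never objects to a
call like the one above: assuming the definition of __must_hold() found in
include/linux/compiler_types.h in kernels of this era, the annotation expands
to nothing unless sparse (__CHECKER__) is running. The names demo_link,
demo_set_cntdwn() and demo_unlocked_caller() are hypothetical and exist only
for this illustration.

```c
#include <assert.h>

/* Assumed to mirror include/linux/compiler_types.h (other versions may differ). */
#ifdef __CHECKER__
# define __must_hold(x)	__attribute__((context(x, 1, 1)))
#else
# define __must_hold(x)	/* a normal compiler build sees nothing at all */
#endif

/* Hypothetical stand-in for struct ieee80211_link_data. */
struct demo_link {
	int counter;
};

/* Annotated the same way as in the patch: "caller must hold 'link'". */
static int demo_set_cntdwn(struct demo_link *link)
	__must_hold(link)
{
	return link->counter;
}

/* No lock of any kind is taken here, yet the call builds and runs. */
static int demo_unlocked_caller(void)
{
	struct demo_link link = { .counter = 3 };
	int ret = demo_set_cntdwn(&link);

	assert(ret == 3);	/* nothing verified that anything was "held" */
	return ret;
}
```

Built with an ordinary C compiler, demo_unlocked_caller() compiles cleanly and
returns 3, which is consistent with the annotation acting as documentation
only in that configuration.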


Also, as we (should) all know, the argument to __acquires(),
__releases() and __must_hold() is pretty much ignored. I tried to fix
this in sparse many years ago, some code even got merged (and then
reverted), and if the experience tells me anything then that it's pretty
much not fixable.

__acquires() and __releases() at least are useful for tracking that you
don't have a mismatch, e.g. a function that __acquires() but then takes
a lock in most paths but forgot one, for example. With __must_hold(),
this really isn't the case.
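
A small userspace sketch of the kind of mismatch meant here, assuming the
sparse macro definitions from include/linux/compiler_types.h of this era. The
"lock" is just a nesting counter, and demo_depth, demo_lock(), demo_unlock(),
demo_update_ok() and demo_update_buggy() are hypothetical names. When run
through sparse, a function shaped like demo_update_buggy() is the sort of code
expected to trigger a context imbalance warning; with a normal compiler the
macros are no-ops.

```c
#include <assert.h>

/* Assumed to mirror include/linux/compiler_types.h (other versions may differ). */
#ifdef __CHECKER__
# define __acquires(x)	__attribute__((context(x, 0, 1)))
# define __releases(x)	__attribute__((context(x, 1, 0)))
# define __acquire(x)	__context__(x, 1)
# define __release(x)	__context__(x, -1)
#else
# define __acquires(x)
# define __releases(x)
# define __acquire(x)	(void)0
# define __release(x)	(void)0
#endif

/* Hypothetical lock: the nesting depth stands in for a real lock. */
static int demo_depth;

static void demo_lock(void) __acquires(demo_depth)
{
	__acquire(demo_depth);
	demo_depth++;
}

static void demo_unlock(void) __releases(demo_depth)
{
	assert(demo_depth > 0);	/* unlocking something not held is a bug */
	__release(demo_depth);
	demo_depth--;
}

/* Balanced: every path that takes the lock also drops it. */
static int demo_update_ok(int v)
{
	demo_lock();
	if (v < 0) {
		demo_unlock();
		return -1;
	}
	demo_unlock();
	return 0;
}

/* Unbalanced: the early return leaves the lock held. */
static int demo_update_buggy(int v)
{
	demo_lock();
	if (v < 0)
		return -1;	/* forgot demo_unlock() on this path */
	demo_unlock();
	return 0;
}
```

After demo_update_buggy(-1) returns, demo_depth is still 1, which is the
runtime symptom of the imbalance that the annotations are meant to help catch
statically.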

And then we could argue that at least it has a documentation effect, but
... what does it even mean to "hold 'link'"? There isn't even a lock,
mutex or otherwise, in the link. You can't "own" a reference to it, or
anything like that. The closest thing in current kernels would be to
maybe see if you have the wiphy mutex, but that's likely not the case in
these paths and RCU was used to get to the link struct ...


IOW, I find this lacking from an implementation/validation point of
view, and lacking if not outright confusing from a documentation point
of view. Much better to put something like lockdep_assert_held() or
similar into the right places.

As for your comment about RCU in ath11k (which points back to this
thread): I don't find

	RCU_LOCKDEP_WARN(!rcu_read_lock_held());
or
	WARN_ON_ONCE(!rcu_read_lock_held());

very persuasive; it's much better to have it checked with
rcu_dereference_protected(), rcu_dereference_check(), the condition
argument to list_for_each_entry_rcu(), or (in the case of wiphy) our
wrappers around these like wiphy_dereference(). I cannot think of any
case where you'd want to ensure that some code is in an RCU critical
section without it actually using RCU - and if it does you have
rcu_dereference() and all those things that (a) check anyway, and also
(b) serve as their own documentation.
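
The idea of letting the accessor carry the check can be sketched in plain
userspace C. This is a mock and NOT the kernel API: a counter stands in for
the RCU read-side nesting depth, a flag stands in for "the update-side mutex
is held", and a warning counter stands in for the lockdep report that the real
rcu_dereference_check() is expected to produce on a debug kernel. All demo_*
names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock state; none of this is real kernel infrastructure. */
static int demo_rcu_depth;	/* read-side nesting depth */
static bool demo_mutex_held;	/* stand-in for an update-side lock */
static int demo_warnings;	/* counts would-be lockdep reports */

static void demo_rcu_read_lock(void)
{
	demo_rcu_depth++;
}

static void demo_rcu_read_unlock(void)
{
	assert(demo_rcu_depth > 0);
	demo_rcu_depth--;
}

/*
 * In the spirit of rcu_dereference_check(p, c): the fetch is fine inside
 * a read-side section OR when the extra condition holds. The check lives
 * in the accessor, so callers need no separate assertion.
 */
static void *demo_dereference_check(void *p, bool cond)
{
	if (!(demo_rcu_depth > 0 || cond))
		demo_warnings++;	/* the kernel would warn here instead */
	return p;
}

struct demo_beacon {
	int cntdwn;
};

static struct demo_beacon *demo_current_beacon;

/* The accessor call documents the locking rule at the point of use. */
static int demo_read_cntdwn(void)
{
	struct demo_beacon *b;

	b = demo_dereference_check(demo_current_beacon, demo_mutex_held);
	return b ? b->cntdwn : -1;
}
```

Calling demo_read_cntdwn() with neither the mock read-side section nor the
mock mutex bumps demo_warnings; calling it under either one does not, with no
extra assertion needed in the caller.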


Anyway, long story short: I don't see value in this patch and won't be
applying it unless somebody here can convince me otherwise, ideally
addressing the concerns stated above.

johannes
  
Brent Pappas Jan. 17, 2024, 8 p.m. UTC | #3
Thanks for the feedback Johannes. As I mentioned in my original email, I'm still
learning the RCU API, so I appreciate the insight from someone more
knowledgeable.

> Much better to put something like lockdep_assert_held() or similar into the
> right places.

I'm not committed to using __must_hold(); would you be willing to accept this
patch if I change it to use lockdep_assert_held() instead?

> The function ieee80211_set_beacon_cntdwn() is called from a number of places
> in this file, some of which acquire RCU critical section, and some of which
> acquire no locks nor RCU critical section at all.

Grepping through tx.c, I see ieee80211_set_beacon_cntdwn() is invoked in three
places:

- Line 5285: Inside the definition of ieee80211_beacon_get_ap(), which is only
  invoked in critical sections (both directly and in another nested call).
- Line 5439: Directly inside a critical section.
- Line 5471: Directly inside a critical section (same as previous).

> I tried to fix this in sparse many years ago, some code even got merged (and
> then reverted), and if the experience tells me anything then that it's pretty
> much not fixable.

I'm sorry to hear that; a solution to this problem sounds very useful. I'm
currently working on making my own static analyzer for performing more checks
than what sparse currently provides. Since you've worked on this problem and
have deeper insight into it than I do, what sort of checks would you like to see
added to a tool like sparse (besides checking whether specific locks are held)?

Thank you,
Brent

  
Johannes Berg Jan. 17, 2024, 11:07 p.m. UTC | #4
Hi Brent,

On Wed, 2024-01-17 at 15:00 -0500, Brent Pappas wrote:
> Thanks for the feedback Johannes. As I mentioned in my original email, I'm still
> learning the RCU API, so I appreciate the insight from someone more
> knowledgeable.

Note this isn't really all that RCU related.

> > Much better to put something like lockdep_assert_held() or similar into the
> > right places.
> 
> I'm not committed to using __must_hold(); would you be willing to accept this
> patch if I change it to use lockdep_assert_held() instead?

I'm actually not sure what you're trying to check here. The
rcu_dereference() inside of it? But that'll already be checked at
runtime by lockdep without any further code.

So ... right now I don't see that there's any point in adding any
further annotations, but I'm also not sure what you're trying to
achieve.

> > The function ieee80211_set_beacon_cntdwn() is called from a number of places
> > in this file, some of which acquire RCU critical section, and some of which
> > acquire no locks nor RCU critical section at all.
> 
> Grepping through tx.c, I see ieee80211_set_beacon_cntdwn() is invoked in three
> places:
> 
> - Line 5285: Inside the definition of ieee80211_beacon_get_ap(), which is only
>   invoked in critical sections (both directly and in another nested call).
> - Line 5439: Directly inside a critical section.
> - Line 5471: Directly inside a critical section (same as previous).

Right.

> > I tried to fix this in sparse many years ago, some code even got merged (and
> > then reverted), and if the experience tells me anything then that it's pretty
> > much not fixable.
> 
> I'm sorry to hear that; a solution to this problem sounds very useful. I'm
> currently working on making my own static analyzer for performing more checks
> than what sparse currently provides. 

Are you aware of smatch?

> Since you've worked on this problem and
> have deeper insight into it than I do, what sort of checks would you like to see
> added to a tool like sparse (besides checking whether specific locks are held)?

I haven't really thought about that ... some better taint tracking would
be nice but that's _really_ hard ;-)

johannes
  

Patch

diff --git a/net/mac80211/tx.c b/net/mac80211/tx.c
index 314998fdb1a5..7245f2e641ba 100644
--- a/net/mac80211/tx.c
+++ b/net/mac80211/tx.c
@@ -10,6 +10,7 @@ 
  * Transmit and frame generation functions.
  */
 
+#include "linux/compiler_types.h"
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/skbuff.h>
@@ -4974,6 +4975,7 @@  static int ieee80211_beacon_add_tim(struct ieee80211_sub_if_data *sdata,
 static void ieee80211_set_beacon_cntdwn(struct ieee80211_sub_if_data *sdata,
 					struct beacon_data *beacon,
 					struct ieee80211_link_data *link)
+	__must_hold(link)
 {
 	u8 *beacon_data, count, max_count = 1;
 	struct probe_resp *resp;