[v6,1/5] jump_label: Prevent key->enabled int overflow
Commit Message
1. With CONFIG_JUMP_LABEL=n, static_key_slow_inc() doesn't have any
protection against key->enabled refcounter overflow.
2. With CONFIG_JUMP_LABEL=y, static_key_slow_inc_cpuslocked() may still
turn the refcounter negative, as (v + 1) may overflow.
key->enabled is indeed a refcounter, as documented in multiple places:
the top comment in jump_label.h, Documentation/staging/static-keys.rst,
etc.
As -1 is reserved for a static key that is in the process of being
enabled, these functions break with a negative key->enabled refcount:
- for CONFIG_JUMP_LABEL=n, a negative return from static_key_count()
breaks static_key_false() and static_key_true() (see the sketch below)
- the refcounter may wrap back to 0 from the negative side after enough
static_key_slow_inc() calls, leading to use-after-free issues.
As a result of these flaws, some users have to introduce an additional
mutex and guard the reference counter against overflow themselves; see
bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
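For illustration, that workaround follows roughly this pattern (a
simplified sketch of the kernel/bpf/syscall.c logic, not the literal
source):

static DEFINE_MUTEX(bpf_stats_enabled_mutex);

void bpf_enable_runtime_stats(void)
{
	mutex_lock(&bpf_stats_enabled_mutex);

	/* The mutex serializes writers, so a plain read-then-inc is safe. */
	if (atomic_read(&bpf_stats_enabled_key.key.enabled) <= INT_MAX / 2)
		static_key_slow_inc(&bpf_stats_enabled_key.key);

	mutex_unlock(&bpf_stats_enabled_mutex);
}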
Prevent the reference counter overflow by checking whether (v + 1) > 0.
Change the functions' API to return whether the increment was successful.
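With the new API, such callers can drop the extra guard and just check
the return value; a hypothetical caller (the helper name and error code
are illustrative, not part of this patch) could look like:

static int my_feature_enable(struct static_key *key)
{
	/* Fails instead of wrapping key->enabled past INT_MAX. */
	if (!static_key_slow_inc(key))
		return -EOVERFLOW;
	return 0;
}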
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/jump_label.h | 21 +++++++++++---
 kernel/jump_label.c        | 56 ++++++++++++++++++++++++++++++--------
 2 files changed, 61 insertions(+), 16 deletions(-)
Comments
On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
> 1. With CONFIG_JUMP_LABEL=n, static_key_slow_inc() doesn't have any
> protection against key->enabled refcounter overflow.
> 2. With CONFIG_JUMP_LABEL=y, static_key_slow_inc_cpuslocked() may still
> turn the refcounter negative, as (v + 1) may overflow.
>
> key->enabled is indeed a refcounter, as documented in multiple places:
> the top comment in jump_label.h, Documentation/staging/static-keys.rst,
> etc.
>
> As -1 is reserved for a static key that is in the process of being
> enabled, these functions break with a negative key->enabled refcount:
> - for CONFIG_JUMP_LABEL=n, a negative return from static_key_count()
> breaks static_key_false() and static_key_true()
> - the refcounter may wrap back to 0 from the negative side after enough
> static_key_slow_inc() calls, leading to use-after-free issues.
>
> As a result of these flaws, some users have to introduce an additional
> mutex and guard the reference counter against overflow themselves; see
> bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
>
> Prevent the reference counter overflow by checking whether (v + 1) > 0.
> Change the functions' API to return whether the increment was successful.
>
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> Acked-by: Jakub Kicinski <kuba@kernel.org>
This looks good to me:
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
What is the plan for merging this? I'm assuming it would want to go
through the network tree, but as already noted earlier it depends on a
patch I have in tip/locking/core.
Now I checked, tip/locking/core is *just* that one patch, so it might be
possible to merge that branch and this series into the network tree and
note that during the pull request to Linus.
On 11/25/22 07:59, Peter Zijlstra wrote:
> On Wed, Nov 23, 2022 at 05:38:55PM +0000, Dmitry Safonov wrote:
>> 1. With CONFIG_JUMP_LABEL=n, static_key_slow_inc() doesn't have any
>> protection against key->enabled refcounter overflow.
>> 2. With CONFIG_JUMP_LABEL=y, static_key_slow_inc_cpuslocked() may still
>> turn the refcounter negative, as (v + 1) may overflow.
>>
>> key->enabled is indeed a refcounter, as documented in multiple places:
>> the top comment in jump_label.h, Documentation/staging/static-keys.rst,
>> etc.
>>
>> As -1 is reserved for a static key that is in the process of being
>> enabled, these functions break with a negative key->enabled refcount:
>> - for CONFIG_JUMP_LABEL=n, a negative return from static_key_count()
>> breaks static_key_false() and static_key_true()
>> - the refcounter may wrap back to 0 from the negative side after enough
>> static_key_slow_inc() calls, leading to use-after-free issues.
>>
>> As a result of these flaws, some users have to introduce an additional
>> mutex and guard the reference counter against overflow themselves; see
>> bpf_enable_runtime_stats() checking the counter against INT_MAX / 2.
>>
>> Prevent the reference counter overflow by checking whether (v + 1) > 0.
>> Change the functions' API to return whether the increment was successful.
>>
>> Signed-off-by: Dmitry Safonov <dima@arista.com>
>> Acked-by: Jakub Kicinski <kuba@kernel.org>
>
> This looks good to me:
>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Thank you, Peter!
> What is the plan for merging this? I'm assuming it would want to go
> through the network tree, but as already noted earlier it depends on a
> patch I have in tip/locking/core.
>
> Now I checked, tip/locking/core is *just* that one patch, so it might be
> possible to merge that branch and this series into the network tree and
> note that during the pull request to Linus.
I initially thought it had to go through the tip tree because of the
dependency, but as you say, it's just one patch.
I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
get a go-ahead from him, I will send all 6 patches for inclusion into the
-net tree, if that's in time before the merge window.
Thanks again for the review and ack,
Dmitry
On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > What is the plan for merging this? I'm assuming it would want to go
> > through the network tree, but as already noted earlier it depends on a
> > patch I have in tip/locking/core.
> >
> > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > possible to merge that branch and this series into the network tree and
> > note that during the pull request to Linus.
>
> I initially thought it had to go through the tip tree because of the
> dependency, but as you say, it's just one patch.
>
> I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> get a go-ahead from him, I will send all 6 patches for inclusion into the
> -net tree, if that's in time before the merge window.
Looks like we're all set on the networking side (thanks Eric!!)
Should I pull Peter's branch? Or do you want to just resend the patch
Peter already queued? A bit of an unusual situation...
On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 25 Nov 2022 14:28:30 +0000 Dmitry Safonov wrote:
> > > What is the plan for merging this? I'm assuming it would want to go
> > > through the network tree, but as already noted earlier it depends on a
> > > patch I have in tip/locking/core.
> > >
> > > Now I checked, tip/locking/core is *just* that one patch, so it might be
> > > possible to merge that branch and this series into the network tree and
> > > note that during the pull request to Linus.
> >
> > I initially thought it had to go through the tip tree because of the
> > dependency, but as you say, it's just one patch.
> >
> > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > -net tree, if that's in time before the merge window.
>
> Looks like we're all set on the networking side (thanks Eric!!)
Thanks!
> Should I pull Peter's branch? Or do you want to just resend the patch
> Peter already queued? A bit of an unusual situation...
Either way would work for me.
I can send it in a couple of hours if you prefer that to pulling the branch.
Thank you,
Dmitry
On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
> > > I initially thought it had to go through the tip tree because of the
> > > dependency, but as you say, it's just one patch.
> > >
> > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > > -net tree, if that's in time before the merge window.
> >
> > Looks like we're all set on the networking side (thanks Eric!!)
>
> Thanks!
>
> > Should I pull Peter's branch? Or do you want to just resend the patch
> > Peter already queued? A bit of an unusual situation...
>
> Either way would work for me.
> I can send it in a couple of hours if you prefer that to pulling the branch.
I'd prefer to pull; it seems safer in case Peter does get another patch.
It's this one, right?
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core
I've pulled; I'll push out once the build is done.
On Thu, 1 Dec 2022 at 23:36, Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 1 Dec 2022 23:17:11 +0000 Dmitry Safonov wrote:
> > On Thu, 1 Dec 2022 at 22:31, Jakub Kicinski <kuba@kernel.org> wrote:
> > > > I initially thought it had to go through the tip tree because of the
> > > > dependency, but as you say, it's just one patch.
> > > >
> > > > I was also asked by Jakub on v4 to wait for Eric's Ack/Review, so once I
> > > > get a go-ahead from him, I will send all 6 patches for inclusion into the
> > > > -net tree, if that's in time before the merge window.
> > >
> > > Looks like we're all set on the networking side (thanks Eric!!)
> >
> > Thanks!
> >
> > > Should I pull Peter's branch? Or do you want to just resend the patch
> > > Peter already queued? A bit of an unusual situation...
> >
> > Either way would work for me.
> > I can send it in a couple of hours if you prefer that to pulling the branch.
>
> I'd prefer to pull; it seems safer in case Peter does get another patch.
>
> It's this one, right?
It is the correct one, yes.
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/core
>
> I've pulled; I'll push out once the build is done.
Thank you once again,
Dmitry
@@ -224,9 +224,10 @@ extern bool arch_jump_label_transform_queue(struct jump_entry *entry,
 					    enum jump_label_type type);
 extern void arch_jump_label_transform_apply(void);
 extern int jump_label_text_reserved(void *start, void *end);
-extern void static_key_slow_inc(struct static_key *key);
+extern bool static_key_slow_inc(struct static_key *key);
+extern bool static_key_fast_inc_not_disabled(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
-extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern bool static_key_slow_inc_cpuslocked(struct static_key *key);
 extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
@@ -278,11 +279,23 @@ static __always_inline bool static_key_true(struct static_key *key)
 	return false;
 }
 
-static inline void static_key_slow_inc(struct static_key *key)
+static inline bool static_key_fast_inc_not_disabled(struct static_key *key)
 {
+	int v;
+
 	STATIC_KEY_CHECK_USE(key);
-	atomic_inc(&key->enabled);
+	/*
+	 * Prevent key->enabled getting negative to follow the same semantics
+	 * as for CONFIG_JUMP_LABEL=y, see kernel/jump_label.c comment.
+	 */
+	v = atomic_read(&key->enabled);
+	do {
+		if (v < 0 || (v + 1) < 0)
+			return false;
+	} while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+	return true;
 }
+#define static_key_slow_inc(key) static_key_fast_inc_not_disabled(key)
 
 static inline void static_key_slow_dec(struct static_key *key)
 {
@@ -113,9 +113,40 @@ int static_key_count(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_count);
 
-void static_key_slow_inc_cpuslocked(struct static_key *key)
+/*
+ * static_key_fast_inc_not_disabled - adds a user for a static key
+ * @key: static key that must be already enabled
+ *
+ * The caller must make sure that the static key can't get disabled while
+ * in this function. It doesn't patch jump labels, only adds a user to
+ * an already enabled static key.
+ *
+ * Returns true if the increment was done. Unlike refcount_t the ref counter
+ * is not saturated, but will fail to increment on overflow.
+ */
+bool static_key_fast_inc_not_disabled(struct static_key *key)
 {
+	int v;
+
 	STATIC_KEY_CHECK_USE(key);
+	/*
+	 * Negative key->enabled has a special meaning: it sends
+	 * static_key_slow_inc() down the slow path, and it is non-zero
+	 * so it counts as "enabled" in jump_label_update(). Note that
+	 * atomic_inc_unless_negative() checks >= 0, so roll our own.
+	 */
+	v = atomic_read(&key->enabled);
+	do {
+		if (v <= 0 || (v + 1) < 0)
+			return false;
+	} while (!likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)));
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(static_key_fast_inc_not_disabled);
+
+bool static_key_slow_inc_cpuslocked(struct static_key *key)
+{
 	lockdep_assert_cpus_held();
 
 	/*
@@ -124,15 +155,9 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
 	 * jump_label_update() process. At the same time, however,
 	 * the jump_label_update() call below wants to see
 	 * static_key_enabled(&key) for jumps to be updated properly.
-	 *
-	 * So give a special meaning to negative key->enabled: it sends
-	 * static_key_slow_inc() down the slow path, and it is non-zero
-	 * so it counts as "enabled" in jump_label_update(). Note that
-	 * atomic_inc_unless_negative() checks >= 0, so roll our own.
 	 */
-	for (int v = atomic_read(&key->enabled); v > 0; )
-		if (likely(atomic_try_cmpxchg(&key->enabled, &v, v + 1)))
-			return;
+	if (static_key_fast_inc_not_disabled(key))
+		return true;
 
 	jump_label_lock();
 	if (atomic_read(&key->enabled) == 0) {
@@ -144,16 +169,23 @@ void static_key_slow_inc_cpuslocked(struct static_key *key)
 		 */
 		atomic_set_release(&key->enabled, 1);
 	} else {
-		atomic_inc(&key->enabled);
+		if (WARN_ON_ONCE(!static_key_fast_inc_not_disabled(key))) {
+			jump_label_unlock();
+			return false;
+		}
 	}
 	jump_label_unlock();
+	return true;
 }
 
-void static_key_slow_inc(struct static_key *key)
+bool static_key_slow_inc(struct static_key *key)
 {
+	bool ret;
+
 	cpus_read_lock();
-	static_key_slow_inc_cpuslocked(key);
+	ret = static_key_slow_inc_cpuslocked(key);
 	cpus_read_unlock();
+	return ret;
 }
 EXPORT_SYMBOL_GPL(static_key_slow_inc);
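As a rough usage sketch (hypothetical caller, not part of this patch),
the new helper is meant for code that knows the key is already enabled
and can't get disabled concurrently, and that only wants to add a user
without taking cpus_read_lock() or jump_label_lock():

/* Hypothetical example built on the helper added above. */
static bool my_key_get_extra_ref(struct static_key *key)
{
	/*
	 * Returns false if the key is disabled (enabled <= 0) or if
	 * one more user would overflow the refcounter; the counter is
	 * not saturated, the increment simply fails.
	 */
	return static_key_fast_inc_not_disabled(key);
}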