[net-next,12/24] seg6: Use nested-BH locking for seg6_bpf_srh_states.

Message ID 20231215171020.687342-13-bigeasy@linutronix.de
State New
Headers
Series locking: Introduce nested-BH locking. |

Commit Message

Sebastian Andrzej Siewior Dec. 15, 2023, 5:07 p.m. UTC
  The access to seg6_bpf_srh_states is protected by disabling preemption.
Based on the code, the entry point is input_action_end_bpf() and
every other function (the bpf helper functions bpf_lwt_seg6_*()), that
is accessing seg6_bpf_srh_states, should be called from within
input_action_end_bpf().

input_action_end_bpf() accesses seg6_bpf_srh_states first at the top of
the function and then disables preemption. This looks wrong because if
preemption needs to be disabled as part of the locking mechanism then
the variable shouldn't be accessed beforehand.

Looking at how it is used via test_lwt_seg6local.sh then
input_action_end_bpf() is always invoked from softirq context. If this
is always the case then the preempt_disable() statement is superfluous.
If this is not always invoked from softirq then disabling only
preemption is not sufficient.

Replace the preempt_disable() statement with nested-BH locking. This is
not an equivalent replacement as it assumes that the invocation of
input_action_end_bpf() always occurs in softirq context and thus the
preempt_disable() is superfluous.
Add a local_lock_t the data structure and use local_lock_nested_bh() in
guard notation for locking. Add lockdep_assert_held() to ensure the lock
is held while the per-CPU variable is referenced in the helper functions.

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: bpf@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/net/seg6_local.h |  1 +
 net/core/filter.c        |  3 ++
 net/ipv6/seg6_local.c    | 59 ++++++++++++++++++++++------------------
 3 files changed, 36 insertions(+), 27 deletions(-)
  

Comments

kernel test robot Dec. 16, 2023, 3:39 a.m. UTC | #1
Hi Sebastian,

kernel test robot noticed the following build warnings:

[auto build test WARNING on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Sebastian-Andrzej-Siewior/locking-local_lock-Introduce-guard-definition-for-local_lock/20231216-011911
base:   net-next/main
patch link:    https://lore.kernel.org/r/20231215171020.687342-13-bigeasy%40linutronix.de
patch subject: [PATCH net-next 12/24] seg6: Use nested-BH locking for seg6_bpf_srh_states.
config: x86_64-randconfig-r131-20231216 (https://download.01.org/0day-ci/archive/20231216/202312161151.k1MBvXUD-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231216/202312161151.k1MBvXUD-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312161151.k1MBvXUD-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> net/ipv6/seg6_local.c:1431:9: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct local_lock_t [usertype] *l @@     got struct local_lock_t [noderef] __percpu * @@
   net/ipv6/seg6_local.c:1431:9: sparse:     expected struct local_lock_t [usertype] *l
   net/ipv6/seg6_local.c:1431:9: sparse:     got struct local_lock_t [noderef] __percpu *

vim +1431 net/ipv6/seg6_local.c

  1410	
  1411	static int input_action_end_bpf(struct sk_buff *skb,
  1412					struct seg6_local_lwt *slwt)
  1413	{
  1414		struct seg6_bpf_srh_state *srh_state;
  1415		struct ipv6_sr_hdr *srh;
  1416		int ret;
  1417	
  1418		srh = get_and_validate_srh(skb);
  1419		if (!srh) {
  1420			kfree_skb(skb);
  1421			return -EINVAL;
  1422		}
  1423		advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
  1424	
  1425		/* The access to the per-CPU buffer srh_state is protected by running
  1426		 * always in softirq context (with disabled BH). On PREEMPT_RT the
  1427		 * required locking is provided by the following local_lock_nested_bh()
  1428		 * statement. It is also accessed by the bpf_lwt_seg6_* helpers via
  1429		 * bpf_prog_run_save_cb().
  1430		 */
> 1431		scoped_guard(local_lock_nested_bh, &seg6_bpf_srh_states.bh_lock) {
  1432			srh_state = this_cpu_ptr(&seg6_bpf_srh_states);
  1433			srh_state->srh = srh;
  1434			srh_state->hdrlen = srh->hdrlen << 3;
  1435			srh_state->valid = true;
  1436	
  1437			rcu_read_lock();
  1438			bpf_compute_data_pointers(skb);
  1439			ret = bpf_prog_run_save_cb(slwt->bpf.prog, skb);
  1440			rcu_read_unlock();
  1441	
  1442			switch (ret) {
  1443			case BPF_OK:
  1444			case BPF_REDIRECT:
  1445				break;
  1446			case BPF_DROP:
  1447				goto drop;
  1448			default:
  1449				pr_warn_once("bpf-seg6local: Illegal return value %u\n", ret);
  1450				goto drop;
  1451			}
  1452	
  1453			if (srh_state->srh && !seg6_bpf_has_valid_srh(skb))
  1454				goto drop;
  1455		}
  1456	
  1457		if (ret != BPF_REDIRECT)
  1458			seg6_lookup_nexthop(skb, NULL, 0);
  1459	
  1460		return dst_input(skb);
  1461	
  1462	drop:
  1463		kfree_skb(skb);
  1464		return -EINVAL;
  1465	}
  1466
  
Paolo Abeni Dec. 18, 2023, 8:33 a.m. UTC | #2
On Fri, 2023-12-15 at 18:07 +0100, Sebastian Andrzej Siewior wrote:
> The access to seg6_bpf_srh_states is protected by disabling preemption.
> Based on the code, the entry point is input_action_end_bpf() and
> every other function (the bpf helper functions bpf_lwt_seg6_*()), that
> is accessing seg6_bpf_srh_states, should be called from within
> input_action_end_bpf().
> 
> input_action_end_bpf() accesses seg6_bpf_srh_states first at the top of
> the function and then disables preemption. This looks wrong because if
> preemption needs to be disabled as part of the locking mechanism then
> the variable shouldn't be accessed beforehand.
> 
> Looking at how it is used via test_lwt_seg6local.sh then
> input_action_end_bpf() is always invoked from softirq context. If this
> is always the case then the preempt_disable() statement is superfluous.
> If this is not always invoked from softirq then disabling only
> preemption is not sufficient.
> 
> Replace the preempt_disable() statement with nested-BH locking. This is
> not an equivalent replacement as it assumes that the invocation of
> input_action_end_bpf() always occurs in softirq context and thus the
> preempt_disable() is superfluous.
> Add a local_lock_t the data structure and use local_lock_nested_bh() in
> guard notation for locking. Add lockdep_assert_held() to ensure the lock
> is held while the per-CPU variable is referenced in the helper functions.
> 
> Cc: Alexei Starovoitov <ast@kernel.org>
> Cc: Andrii Nakryiko <andrii@kernel.org>
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Hao Luo <haoluo@google.com>
> Cc: Jiri Olsa <jolsa@kernel.org>
> Cc: John Fastabend <john.fastabend@gmail.com>
> Cc: KP Singh <kpsingh@kernel.org>
> Cc: Martin KaFai Lau <martin.lau@linux.dev>
> Cc: Song Liu <song@kernel.org>
> Cc: Stanislav Fomichev <sdf@google.com>
> Cc: Yonghong Song <yonghong.song@linux.dev>
> Cc: bpf@vger.kernel.org
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  include/net/seg6_local.h |  1 +
>  net/core/filter.c        |  3 ++
>  net/ipv6/seg6_local.c    | 59 ++++++++++++++++++++++------------------
>  3 files changed, 36 insertions(+), 27 deletions(-)
> 
> diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h
> index 3fab9dec2ec45..0f22771359f4c 100644
> --- a/include/net/seg6_local.h
> +++ b/include/net/seg6_local.h
> @@ -20,6 +20,7 @@ extern bool seg6_bpf_has_valid_srh(struct sk_buff *skb);
>  
>  struct seg6_bpf_srh_state {
>  	struct ipv6_sr_hdr *srh;
> +	local_lock_t bh_lock;
>  	u16 hdrlen;
>  	bool valid;
>  };
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1737884be52f8..c8013f762524b 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -6384,6 +6384,7 @@ BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *, skb, u32, offset,
>  	void *srh_tlvs, *srh_end, *ptr;
>  	int srhoff = 0;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (srh == NULL)
>  		return -EINVAL;
>  
> @@ -6440,6 +6441,7 @@ BPF_CALL_4(bpf_lwt_seg6_action, struct sk_buff *, skb,
>  	int hdroff = 0;
>  	int err;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	switch (action) {
>  	case SEG6_LOCAL_ACTION_END_X:
>  		if (!seg6_bpf_has_valid_srh(skb))
> @@ -6516,6 +6518,7 @@ BPF_CALL_3(bpf_lwt_seg6_adjust_srh, struct sk_buff *, skb, u32, offset,
>  	int srhoff = 0;
>  	int ret;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (unlikely(srh == NULL))
>  		return -EINVAL;
>  
> diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
> index 24e2b4b494cb0..ed7278af321a2 100644
> --- a/net/ipv6/seg6_local.c
> +++ b/net/ipv6/seg6_local.c
> @@ -1380,7 +1380,9 @@ static int input_action_end_b6_encap(struct sk_buff *skb,
>  	return err;
>  }
>  
> -DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states);
> +DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states) = {
> +	.bh_lock	= INIT_LOCAL_LOCK(bh_lock),
> +};
>  
>  bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  {
> @@ -1388,6 +1390,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  		this_cpu_ptr(&seg6_bpf_srh_states);
>  	struct ipv6_sr_hdr *srh = srh_state->srh;
>  
> +	lockdep_assert_held(&srh_state->bh_lock);
>  	if (unlikely(srh == NULL))
>  		return false;
>  
> @@ -1408,8 +1411,7 @@ bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
>  static int input_action_end_bpf(struct sk_buff *skb,
>  				struct seg6_local_lwt *slwt)
>  {
> -	struct seg6_bpf_srh_state *srh_state =
> -		this_cpu_ptr(&seg6_bpf_srh_states);
> +	struct seg6_bpf_srh_state *srh_state;
>  	struct ipv6_sr_hdr *srh;
>  	int ret;
>  
> @@ -1420,41 +1422,44 @@ static int input_action_end_bpf(struct sk_buff *skb,
>  	}
>  	advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
>  
> -	/* preempt_disable is needed to protect the per-CPU buffer srh_state,
> -	 * which is also accessed by the bpf_lwt_seg6_* helpers
> +	/* The access to the per-CPU buffer srh_state is protected by running
> +	 * always in softirq context (with disabled BH). On PREEMPT_RT the
> +	 * required locking is provided by the following local_lock_nested_bh()
> +	 * statement. It is also accessed by the bpf_lwt_seg6_* helpers via
> +	 * bpf_prog_run_save_cb().
>  	 */
> -	preempt_disable();
> -	srh_state->srh = srh;
> -	srh_state->hdrlen = srh->hdrlen << 3;
> -	srh_state->valid = true;
> +	scoped_guard(local_lock_nested_bh, &seg6_bpf_srh_states.bh_lock) {
> +		srh_state = this_cpu_ptr(&seg6_bpf_srh_states);
> +		srh_state->srh = srh;
> +		srh_state->hdrlen = srh->hdrlen << 3;
> +		srh_state->valid = true;

Here the 'scoped_guard' usage adds a lot of noise to the patch, due to
the added indentation. What about using directly
local_lock_nested_bh()/local_unlock_nested_bh() ?

Cheers,

Paolo
  
Sebastian Andrzej Siewior Jan. 12, 2024, 11:23 a.m. UTC | #3
On 2023-12-18 09:33:39 [+0100], Paolo Abeni wrote:
> > --- a/net/ipv6/seg6_local.c
> > +++ b/net/ipv6/seg6_local.c
> > @@ -1420,41 +1422,44 @@ static int input_action_end_bpf(struct sk_buff *skb,
> >  	}
> >  	advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
> >  
> > -	/* preempt_disable is needed to protect the per-CPU buffer srh_state,
> > -	 * which is also accessed by the bpf_lwt_seg6_* helpers
> > +	/* The access to the per-CPU buffer srh_state is protected by running
> > +	 * always in softirq context (with disabled BH). On PREEMPT_RT the
> > +	 * required locking is provided by the following local_lock_nested_bh()
> > +	 * statement. It is also accessed by the bpf_lwt_seg6_* helpers via
> > +	 * bpf_prog_run_save_cb().
> >  	 */
> > -	preempt_disable();
> > -	srh_state->srh = srh;
> > -	srh_state->hdrlen = srh->hdrlen << 3;
> > -	srh_state->valid = true;
> > +	scoped_guard(local_lock_nested_bh, &seg6_bpf_srh_states.bh_lock) {
> > +		srh_state = this_cpu_ptr(&seg6_bpf_srh_states);
> > +		srh_state->srh = srh;
> > +		srh_state->hdrlen = srh->hdrlen << 3;
> > +		srh_state->valid = true;
> 
> Here the 'scoped_guard' usage adds a lot of noise to the patch, due to
> the added indentation. What about using directly
> local_lock_nested_bh()/local_unlock_nested_bh() ?

If this is preferred, sure.

> Cheers,
> 
> Paolo

Sebastian
  

Patch

diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h
index 3fab9dec2ec45..0f22771359f4c 100644
--- a/include/net/seg6_local.h
+++ b/include/net/seg6_local.h
@@ -20,6 +20,7 @@  extern bool seg6_bpf_has_valid_srh(struct sk_buff *skb);
 
 struct seg6_bpf_srh_state {
 	struct ipv6_sr_hdr *srh;
+	local_lock_t bh_lock;
 	u16 hdrlen;
 	bool valid;
 };
diff --git a/net/core/filter.c b/net/core/filter.c
index 1737884be52f8..c8013f762524b 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -6384,6 +6384,7 @@  BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *, skb, u32, offset,
 	void *srh_tlvs, *srh_end, *ptr;
 	int srhoff = 0;
 
+	lockdep_assert_held(&srh_state->bh_lock);
 	if (srh == NULL)
 		return -EINVAL;
 
@@ -6440,6 +6441,7 @@  BPF_CALL_4(bpf_lwt_seg6_action, struct sk_buff *, skb,
 	int hdroff = 0;
 	int err;
 
+	lockdep_assert_held(&srh_state->bh_lock);
 	switch (action) {
 	case SEG6_LOCAL_ACTION_END_X:
 		if (!seg6_bpf_has_valid_srh(skb))
@@ -6516,6 +6518,7 @@  BPF_CALL_3(bpf_lwt_seg6_adjust_srh, struct sk_buff *, skb, u32, offset,
 	int srhoff = 0;
 	int ret;
 
+	lockdep_assert_held(&srh_state->bh_lock);
 	if (unlikely(srh == NULL))
 		return -EINVAL;
 
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 24e2b4b494cb0..ed7278af321a2 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -1380,7 +1380,9 @@  static int input_action_end_b6_encap(struct sk_buff *skb,
 	return err;
 }
 
-DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states);
+DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states) = {
+	.bh_lock	= INIT_LOCAL_LOCK(bh_lock),
+};
 
 bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
 {
@@ -1388,6 +1390,7 @@  bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
 		this_cpu_ptr(&seg6_bpf_srh_states);
 	struct ipv6_sr_hdr *srh = srh_state->srh;
 
+	lockdep_assert_held(&srh_state->bh_lock);
 	if (unlikely(srh == NULL))
 		return false;
 
@@ -1408,8 +1411,7 @@  bool seg6_bpf_has_valid_srh(struct sk_buff *skb)
 static int input_action_end_bpf(struct sk_buff *skb,
 				struct seg6_local_lwt *slwt)
 {
-	struct seg6_bpf_srh_state *srh_state =
-		this_cpu_ptr(&seg6_bpf_srh_states);
+	struct seg6_bpf_srh_state *srh_state;
 	struct ipv6_sr_hdr *srh;
 	int ret;
 
@@ -1420,41 +1422,44 @@  static int input_action_end_bpf(struct sk_buff *skb,
 	}
 	advance_nextseg(srh, &ipv6_hdr(skb)->daddr);
 
-	/* preempt_disable is needed to protect the per-CPU buffer srh_state,
-	 * which is also accessed by the bpf_lwt_seg6_* helpers
+	/* The access to the per-CPU buffer srh_state is protected by running
+	 * always in softirq context (with disabled BH). On PREEMPT_RT the
+	 * required locking is provided by the following local_lock_nested_bh()
+	 * statement. It is also accessed by the bpf_lwt_seg6_* helpers via
+	 * bpf_prog_run_save_cb().
 	 */
-	preempt_disable();
-	srh_state->srh = srh;
-	srh_state->hdrlen = srh->hdrlen << 3;
-	srh_state->valid = true;
+	scoped_guard(local_lock_nested_bh, &seg6_bpf_srh_states.bh_lock) {
+		srh_state = this_cpu_ptr(&seg6_bpf_srh_states);
+		srh_state->srh = srh;
+		srh_state->hdrlen = srh->hdrlen << 3;
+		srh_state->valid = true;
 
-	rcu_read_lock();
-	bpf_compute_data_pointers(skb);
-	ret = bpf_prog_run_save_cb(slwt->bpf.prog, skb);
-	rcu_read_unlock();
+		rcu_read_lock();
+		bpf_compute_data_pointers(skb);
+		ret = bpf_prog_run_save_cb(slwt->bpf.prog, skb);
+		rcu_read_unlock();
 
-	switch (ret) {
-	case BPF_OK:
-	case BPF_REDIRECT:
-		break;
-	case BPF_DROP:
-		goto drop;
-	default:
-		pr_warn_once("bpf-seg6local: Illegal return value %u\n", ret);
-		goto drop;
+		switch (ret) {
+		case BPF_OK:
+		case BPF_REDIRECT:
+			break;
+		case BPF_DROP:
+			goto drop;
+		default:
+			pr_warn_once("bpf-seg6local: Illegal return value %u\n", ret);
+			goto drop;
+		}
+
+		if (srh_state->srh && !seg6_bpf_has_valid_srh(skb))
+			goto drop;
 	}
 
-	if (srh_state->srh && !seg6_bpf_has_valid_srh(skb))
-		goto drop;
-
-	preempt_enable();
 	if (ret != BPF_REDIRECT)
 		seg6_lookup_nexthop(skb, NULL, 0);
 
 	return dst_input(skb);
 
 drop:
-	preempt_enable();
 	kfree_skb(skb);
 	return -EINVAL;
 }