[net,v2] net: sched: ematch: reject invalid data

Message ID 20221214022058.3625300-1-jun.nie@linaro.org
State New
Series [net,v2] net: sched: ematch: reject invalid data

Commit Message

Jun Nie Dec. 14, 2022, 2:20 a.m. UTC
syzbot reported the bug below. Fix it by refusing to compare when the
ematch data is invalid (NULL).

general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.15.77-syzkaller-00764-g7048384c9872 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: wg-crypt-wg2 wg_packet_tx_worker
RIP: 0010:em_cmp_match+0x4e/0x5f0 net/sched/em_cmp.c:25
Call Trace:
 <TASK>
 tcf_em_match net/sched/ematch.c:492 [inline]
 __tcf_em_tree_match+0x194/0x720 net/sched/ematch.c:518
 tcf_em_tree_match include/net/pkt_cls.h:463 [inline]
 basic_classify+0xd8/0x250 net/sched/cls_basic.c:48
 __tcf_classify net/sched/cls_api.c:1549 [inline]
 tcf_classify+0x161/0x430 net/sched/cls_api.c:1589
 prio_classify net/sched/sch_prio.c:42 [inline]
 prio_enqueue+0x1d3/0x6a0 net/sched/sch_prio.c:75
 dev_qdisc_enqueue net/core/dev.c:3792 [inline]
 __dev_xmit_skb+0x35c/0x1650 net/core/dev.c:3876
 __dev_queue_xmit+0x8f3/0x1b50 net/core/dev.c:4193
 dev_queue_xmit+0x17/0x20 net/core/dev.c:4261
 neigh_hh_output include/net/neighbour.h:508 [inline]
 neigh_output include/net/neighbour.h:522 [inline]
 ip_finish_output2+0xc0f/0xf00 net/ipv4/ip_output.c:228
 __ip_finish_output+0x163/0x370
 ip_finish_output+0x20b/0x220 net/ipv4/ip_output.c:316
 NF_HOOK_COND include/linux/netfilter.h:299 [inline]
 ip_output+0x1e9/0x410 net/ipv4/ip_output.c:430
 dst_output include/net/dst.h:450 [inline]
 ip_local_out+0x92/0xb0 net/ipv4/ip_output.c:126
 iptunnel_xmit+0x4a2/0x890 net/ipv4/ip_tunnel_core.c:82
 udp_tunnel_xmit_skb+0x1b6/0x2c0 net/ipv4/udp_tunnel_core.c:175
 send4+0x78d/0xd20 drivers/net/wireguard/socket.c:85
 wg_socket_send_skb_to_peer+0xd5/0x1d0 drivers/net/wireguard/socket.c:175
 wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
 wg_packet_tx_worker+0x202/0x560 drivers/net/wireguard/send.c:276
 process_one_work+0x6db/0xc00 kernel/workqueue.c:2313
 worker_thread+0xb3e/0x1340 kernel/workqueue.c:2460
 kthread+0x41c/0x500 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298

Reported-by: syzbot+963f7637dae8becc038f@syzkaller.appspotmail.com
Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")
Signed-off-by: Jun Nie <jun.nie@linaro.org>
---
 net/sched/em_cmp.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
  

Comments

Paolo Abeni Dec. 15, 2022, 12:50 p.m. UTC | #1
On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> syzbot reported the bug below. Fix it by refusing to compare when the
> ematch data is invalid (NULL).
> 
> general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 0 PID: 6 Comm: kworker/0:0 Not tainted 5.15.77-syzkaller-00764-g7048384c9872 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> Workqueue: wg-crypt-wg2 wg_packet_tx_worker
> RIP: 0010:em_cmp_match+0x4e/0x5f0 net/sched/em_cmp.c:25
> Call Trace:
>  <TASK>
>  tcf_em_match net/sched/ematch.c:492 [inline]
>  __tcf_em_tree_match+0x194/0x720 net/sched/ematch.c:518
>  tcf_em_tree_match include/net/pkt_cls.h:463 [inline]
>  basic_classify+0xd8/0x250 net/sched/cls_basic.c:48
>  __tcf_classify net/sched/cls_api.c:1549 [inline]
>  tcf_classify+0x161/0x430 net/sched/cls_api.c:1589
>  prio_classify net/sched/sch_prio.c:42 [inline]
>  prio_enqueue+0x1d3/0x6a0 net/sched/sch_prio.c:75
>  dev_qdisc_enqueue net/core/dev.c:3792 [inline]
>  __dev_xmit_skb+0x35c/0x1650 net/core/dev.c:3876
>  __dev_queue_xmit+0x8f3/0x1b50 net/core/dev.c:4193
>  dev_queue_xmit+0x17/0x20 net/core/dev.c:4261
>  neigh_hh_output include/net/neighbour.h:508 [inline]
>  neigh_output include/net/neighbour.h:522 [inline]
>  ip_finish_output2+0xc0f/0xf00 net/ipv4/ip_output.c:228
>  __ip_finish_output+0x163/0x370
>  ip_finish_output+0x20b/0x220 net/ipv4/ip_output.c:316
>  NF_HOOK_COND include/linux/netfilter.h:299 [inline]
>  ip_output+0x1e9/0x410 net/ipv4/ip_output.c:430
>  dst_output include/net/dst.h:450 [inline]
>  ip_local_out+0x92/0xb0 net/ipv4/ip_output.c:126
>  iptunnel_xmit+0x4a2/0x890 net/ipv4/ip_tunnel_core.c:82
>  udp_tunnel_xmit_skb+0x1b6/0x2c0 net/ipv4/udp_tunnel_core.c:175
>  send4+0x78d/0xd20 drivers/net/wireguard/socket.c:85
>  wg_socket_send_skb_to_peer+0xd5/0x1d0 drivers/net/wireguard/socket.c:175
>  wg_packet_create_data_done drivers/net/wireguard/send.c:251 [inline]
>  wg_packet_tx_worker+0x202/0x560 drivers/net/wireguard/send.c:276
>  process_one_work+0x6db/0xc00 kernel/workqueue.c:2313
>  worker_thread+0xb3e/0x1340 kernel/workqueue.c:2460
>  kthread+0x41c/0x500 kernel/kthread.c:319
>  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
> 
> Reported-by: syzbot+963f7637dae8becc038f@syzkaller.appspotmail.com
> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel")

Very likely this is not the correct fixes tag.

> Signed-off-by: Jun Nie <jun.nie@linaro.org>
> ---
>  net/sched/em_cmp.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> index f17b049ea530..0284394be53f 100644
> --- a/net/sched/em_cmp.c
> +++ b/net/sched/em_cmp.c
> @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
>  			struct tcf_pkt_info *info)
>  {
>  	struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> -	unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> +	unsigned char *ptr;
>  	u32 val = 0;
>  
> +	if (!cmp)
> +		return 0;

It feels like this is papering over the real issue. Why is em->data
NULL here? Why are other ematches not affected by this issue?

Is em->data really NULL, or some small value instead? KASAN seems to
say it's a small value, not 0, so this patch should not avoid the
oops. Have you tested it against the reproducer?

Thanks,

Paolo
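
A note on the question above: the KASAN report itself gives enough to check whether the fault is compatible with a NULL em->data. On x86-64, generic KASAN maps each 8-byte granule of memory to one shadow byte at (addr >> 3) + 0xdffffc0000000000, so the faulting address 0xdffffc0000000001 is the shadow byte for addresses 0x8-0xf, which is the range KASAN prints. The sketch below redoes that arithmetic in userspace C. The struct is a hand-written mirror of struct tcf_em_cmp, with the layout assumed from the uapi header (two u32 fields, a u16 off, then two bytes holding the 4-bit fields); that layout and the helper names shadow_to_mem() and null_field_in_granule() are assumptions for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* x86-64 generic KASAN: shadow = (addr >> 3) + KASAN_SHADOW_OFFSET. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/*
 * Assumed userspace mirror of struct tcf_em_cmp. The two trailing bytes
 * each pack a pair of 4-bit fields in the kernel definition; they are
 * plain bytes here so that offsetof() can be applied to them.
 */
struct em_cmp_mirror {
	uint32_t val;
	uint32_t mask;
	uint16_t off;
	uint8_t align_flags; /* align:4, flags:4 */
	uint8_t layer_opnd;  /* layer:4, opnd:4 */
};

/* First byte of the 8-byte granule that a shadow address describes. */
static uint64_t shadow_to_mem(uint64_t shadow)
{
	assert(shadow >= KASAN_SHADOW_OFFSET);
	return (shadow - KASAN_SHADOW_OFFSET) << KASAN_SHADOW_SCALE_SHIFT;
}

/*
 * Non-zero when a field at 'offset' from a NULL base pointer falls in
 * the granule described by 'shadow'.
 */
static int null_field_in_granule(size_t offset, uint64_t shadow)
{
	uint64_t start = shadow_to_mem(shadow);

	return offset >= start && offset < start + 8;
}
```

Under that assumed layout, cmp->off and cmp->layer sit at offsets 8 and 11, so reading them through a NULL cmp touches addresses inside 0x8-0xf. That is consistent with em->data being NULL, though it does not rule out other small values.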
  
Jun Nie Dec. 15, 2022, 1:59 p.m. UTC | #2
Paolo Abeni <pabeni@redhat.com> wrote on Thu, Dec 15, 2022 at 20:50:
>
> On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> > diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> > index f17b049ea530..0284394be53f 100644
> > --- a/net/sched/em_cmp.c
> > +++ b/net/sched/em_cmp.c
> > @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
> >                       struct tcf_pkt_info *info)
> >  {
> >       struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> > -     unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> > +     unsigned char *ptr;
> >       u32 val = 0;
> >
> > +     if (!cmp)
> > +             return 0;
>
> It feels like this is papering over the real issue. Why is em->data
> NULL here? Why are other ematches not affected by this issue?
>
> Is em->data really NULL, or some small value instead? KASAN seems to
> say it's a small value, not 0, so this patch should not avoid the
> oops. Have you tested it against the reproducer?
>
> Thanks,
>
> Paolo
>

Testing with the reproducer [1] shows that the patch does resolve the
issue. em->data is NULL, so the patch avoids dereferencing cmp. I did not
investigate why em->data is NULL in the WireGuard secure network tunnel
case, as I am not familiar with the network stack. So you can also call
this patch a workaround.

[1]
https://syzkaller.appspot.com/bug?id=d96c4958dc8d4da11f56e18471dfc4f64d21ef6e

Regards,
Jun
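
To illustrate the ordering the patch relies on, here is a minimal userspace sketch of a match callback that checks its configuration pointer before the first dereference. The types mock_cmp and mock_ematch and the function mock_cmp_match() are hypothetical stand-ins, not the kernel's struct tcf_em_cmp, struct tcf_ematch, or em_cmp_match(), and the bounds check is only a placeholder for tcf_valid_offset().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ematch types. */
struct mock_cmp {
	uint8_t val;   /* byte value to compare against */
	uint16_t off;  /* offset into the packet */
};

struct mock_ematch {
	const void *data; /* per-match configuration, may be NULL */
};

/* Returns 1 on match, 0 on no match or when the match cannot be evaluated. */
static int mock_cmp_match(const unsigned char *pkt, size_t len,
			  const struct mock_ematch *em)
{
	const struct mock_cmp *cmp = em->data;

	assert(pkt != NULL);

	/* Guard first: nothing below may touch cmp until this has passed. */
	if (!cmp)
		return 0;

	/* Placeholder for tcf_valid_offset(): reject offsets outside the packet. */
	if (cmp->off >= len)
		return 0;

	return pkt[cmp->off] == cmp->val;
}
```

With data set to NULL the function reports "no match" instead of faulting, which is the behaviour the patch gives em_cmp_match(). The sketch says nothing about why the pointer is NULL in the first place, which is the open question in this thread.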
  
Cong Wang Dec. 17, 2022, 9:37 p.m. UTC | #3
On Thu, Dec 15, 2022 at 01:50:43PM +0100, Paolo Abeni wrote:
> On Wed, 2022-12-14 at 10:20 +0800, Jun Nie wrote:
> > ---
> >  net/sched/em_cmp.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
> > index f17b049ea530..0284394be53f 100644
> > --- a/net/sched/em_cmp.c
> > +++ b/net/sched/em_cmp.c
> > @@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
> >  			struct tcf_pkt_info *info)
> >  {
> >  	struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
> > -	unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
> > +	unsigned char *ptr;
> >  	u32 val = 0;
> >  
> > +	if (!cmp)
> > +		return 0;
> 
> > It feels like this is papering over the real issue. Why is em->data
> > NULL here? Why are other ematches not affected by this issue?
> >
> > Is em->data really NULL, or some small value instead? KASAN seems to
> > say it's a small value, not 0, so this patch should not avoid the
> > oops. Have you tested it against the reproducer?

Right. I think I have found the root cause. Let me test my patch to see
if it makes syzbot happy.

Thanks.
  

Patch

diff --git a/net/sched/em_cmp.c b/net/sched/em_cmp.c
index f17b049ea530..0284394be53f 100644
--- a/net/sched/em_cmp.c
+++ b/net/sched/em_cmp.c
@@ -22,9 +22,14 @@ static int em_cmp_match(struct sk_buff *skb, struct tcf_ematch *em,
 			struct tcf_pkt_info *info)
 {
 	struct tcf_em_cmp *cmp = (struct tcf_em_cmp *) em->data;
-	unsigned char *ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
+	unsigned char *ptr;
 	u32 val = 0;
 
+	if (!cmp)
+		return 0;
+
+	ptr = tcf_get_base_ptr(skb, cmp->layer) + cmp->off;
+
 	if (!tcf_valid_offset(skb, ptr, cmp->align))
 		return 0;