[v2] net: tun: Fix use-after-free in tun_detach()

Message ID 20221120090213.922567-1-syoshida@redhat.com
State New
Headers
Series [v2] net: tun: Fix use-after-free in tun_detach() |

Commit Message

Shigeru Yoshida Nov. 20, 2022, 9:02 a.m. UTC
  syzbot reported use-after-free in tun_detach() [1].  This causes call
trace like below:

==================================================================
BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673

CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:284 [inline]
 print_report+0x15e/0x461 mm/kasan/report.c:395
 kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
 notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
 call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
 call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
 call_netdevice_notifiers net/core/dev.c:1997 [inline]
 netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
 netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
 tun_detach drivers/net/tun.c:704 [inline]
 tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
 __fput+0x27c/0xa90 fs/file_table.c:320
 task_work_run+0x16f/0x270 kernel/task_work.c:179
 exit_task_work include/linux/task_work.h:38 [inline]
 do_exit+0xb3d/0x2a30 kernel/exit.c:820
 do_group_exit+0xd4/0x2a0 kernel/exit.c:950
 get_signal+0x21b1/0x2440 kernel/signal.c:2858
 arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
 exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
 exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
 do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The cause of the issue is that sock_put() from __tun_detach() drops
last reference count for struct net, and then notifier_call_chain()
from netdev_state_change() accesses that struct net.

This patch fixes the issue by calling sock_put() from tun_detach()
after all necessary accesses for the struct net has done.

Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe [1]
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
---
v2:
- Include symbolic stack trace
- Add Fixes and Reported-by tags
v1: https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
---
 drivers/net/tun.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
  

Comments

Eric Dumazet Nov. 20, 2022, 4:04 p.m. UTC | #1
On Sun, Nov 20, 2022 at 2:49 AM Hillf Danton <hdanton@sina.com> wrote:
>
> On 20 Nov 2022 18:02:13 +0900 Shigeru Yoshida <syoshida@redhat.com>
> > syzbot reported use-after-free in tun_detach() [1].  This causes call
> > trace like below:
> >
> > ==================================================================
> > BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
> > Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
> >
> > CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> > Call Trace:
> >  <TASK>
> >  __dump_stack lib/dump_stack.c:88 [inline]
> >  dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
> >  print_address_description mm/kasan/report.c:284 [inline]
> >  print_report+0x15e/0x461 mm/kasan/report.c:395
> >  kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
> >  notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
> >  call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
> >  call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
> >  call_netdevice_notifiers net/core/dev.c:1997 [inline]
> >  netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
> >  netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
> >  tun_detach drivers/net/tun.c:704 [inline]
> >  tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
> >  __fput+0x27c/0xa90 fs/file_table.c:320
> >  task_work_run+0x16f/0x270 kernel/task_work.c:179
> >  exit_task_work include/linux/task_work.h:38 [inline]
> >  do_exit+0xb3d/0x2a30 kernel/exit.c:820
> >  do_group_exit+0xd4/0x2a0 kernel/exit.c:950
> >  get_signal+0x21b1/0x2440 kernel/signal.c:2858
> >  arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
> >  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> >  exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> >  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> >  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> >  do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> >  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >
> > The cause of the issue is that sock_put() from __tun_detach() drops
> > last reference count for struct net, and then notifier_call_chain()
> > from netdev_state_change() accesses that struct net.
>
> Correct. IOW the race between netdev_run_todo() and cleanup_net() is behind
> the uaf report from syzbot.
>
> >
> > This patch fixes the issue by calling sock_put() from tun_detach()
> > after all necessary accesses for the struct net has done.
>
> Thanks for your fix.
>
> But tun is not special wrt netdev_run_todo() and call_netdevice_notifiers(),
> so the correct fix should be making netdev grab another hold on net and
> invoking put_net() in the path of netdev_run_todo().

Well, this is not going to work. Unless I am missing something.

1) A netns is destroyed when its refcount reaches zero.

   if (refcount_dec_and_test(&net->ns.count))
        __put_net(net);

    This transition is final.

    (It is illegal to attempt a refcount_inc() from this state)

2) All its netdevices are unregistered.

   Trying to get a reference on netns in netdevice dismantle path
would immediately trigger a refcount_t warning.
  
Eric Dumazet Nov. 21, 2022, 12:53 a.m. UTC | #2
On Sun, Nov 20, 2022 at 4:34 PM Hillf Danton <hdanton@sina.com> wrote:
>
> On 20 Nov 2022 08:04:13 -0800 Eric Dumazet <edumazet@google.com>
> > On Sun, Nov 20, 2022 at 2:49 AM Hillf Danton <hdanton@sina.com> wrote:
> > > On 20 Nov 2022 18:02:13 +0900 Shigeru Yoshida <syoshida@redhat.com>
> > > >
> > > > This patch fixes the issue by calling sock_put() from tun_detach()
> > > > after all necessary accesses for the struct net has done.
> > >
> > > Thanks for your fix.
> > >
> > > But tun is not special wrt netdev_run_todo() and call_netdevice_notifiers(),
> > > so the correct fix should be making netdev grab another hold on net and
> > > invoking put_net() in the path of netdev_run_todo().
> >
> > Well, this is not going to work. Unless I am missing something.
>
> Thanks for taking a look.
>
> I mean bump up refcount for net when updating netdev->nd_net in a bid to
> make dev_net() safe throught netdev's life span.

This would prevent netns deletion, as the following sequence would
then no longer work as intended.

ip netns add foo
ip netns add ip link set lo up
ip netns del foo

When a netns is deleted ("ip netns del" and no more refcounted sockets),
we have callbacks to unregister all devices tied to it.
  
Eric Dumazet Nov. 21, 2022, 4:47 p.m. UTC | #3
On Sun, Nov 20, 2022 at 1:02 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
>
> syzbot reported use-after-free in tun_detach() [1].  This causes call
> trace like below:
>
> ==================================================================
> BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
> Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
>
> CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
>  print_address_description mm/kasan/report.c:284 [inline]
>  print_report+0x15e/0x461 mm/kasan/report.c:395
>  kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
>  notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>  call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
>  call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
>  call_netdevice_notifiers net/core/dev.c:1997 [inline]
>  netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
>  netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
>  tun_detach drivers/net/tun.c:704 [inline]
>  tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
>  __fput+0x27c/0xa90 fs/file_table.c:320
>  task_work_run+0x16f/0x270 kernel/task_work.c:179
>  exit_task_work include/linux/task_work.h:38 [inline]
>  do_exit+0xb3d/0x2a30 kernel/exit.c:820
>  do_group_exit+0xd4/0x2a0 kernel/exit.c:950
>  get_signal+0x21b1/0x2440 kernel/signal.c:2858
>  arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>  exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>  do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
> The cause of the issue is that sock_put() from __tun_detach() drops
> last reference count for struct net, and then notifier_call_chain()
> from netdev_state_change() accesses that struct net.
>
> This patch fixes the issue by calling sock_put() from tun_detach()
> after all necessary accesses for the struct net has done.
>
> Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
> Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe [1]
> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
> ---
> v2:
> - Include symbolic stack trace
> - Add Fixes and Reported-by tags
> v1: https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
> ---
>  drivers/net/tun.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 7a3ab3427369..ce9fcf4c8ef4 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>                 if (tun)
>                         xdp_rxq_info_unreg(&tfile->xdp_rxq);
>                 ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
> -               sock_put(&tfile->sk);
>         }
>  }
>
> @@ -702,6 +701,11 @@ static void tun_detach(struct tun_file *tfile, bool clean)
>         if (dev)
>                 netdev_state_change(dev);
>         rtnl_unlock();
> +
> +       if (clean) {

Would you mind explaining (a comment would be nice) why this barrier is needed ?

Thanks.

> +               synchronize_rcu();
> +               sock_put(&tfile->sk);
> +       }
>  }
>
>  static void tun_detach_all(struct net_device *dev)
> --
> 2.38.1
>
  
Shigeru Yoshida Nov. 22, 2022, 6:10 p.m. UTC | #4
Hi Eric,

On Mon, 21 Nov 2022 08:47:17 -0800, Eric Dumazet wrote:
> On Sun, Nov 20, 2022 at 1:02 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
>>
>> syzbot reported use-after-free in tun_detach() [1].  This causes call
>> trace like below:
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>> Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
>>
>> CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
>> Call Trace:
>>  <TASK>
>>  __dump_stack lib/dump_stack.c:88 [inline]
>>  dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
>>  print_address_description mm/kasan/report.c:284 [inline]
>>  print_report+0x15e/0x461 mm/kasan/report.c:395
>>  kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
>>  notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>>  call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
>>  call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
>>  call_netdevice_notifiers net/core/dev.c:1997 [inline]
>>  netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
>>  netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
>>  tun_detach drivers/net/tun.c:704 [inline]
>>  tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
>>  __fput+0x27c/0xa90 fs/file_table.c:320
>>  task_work_run+0x16f/0x270 kernel/task_work.c:179
>>  exit_task_work include/linux/task_work.h:38 [inline]
>>  do_exit+0xb3d/0x2a30 kernel/exit.c:820
>>  do_group_exit+0xd4/0x2a0 kernel/exit.c:950
>>  get_signal+0x21b1/0x2440 kernel/signal.c:2858
>>  arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
>>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>  exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>>  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>>  do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>
>> The cause of the issue is that sock_put() from __tun_detach() drops
>> last reference count for struct net, and then notifier_call_chain()
>> from netdev_state_change() accesses that struct net.
>>
>> This patch fixes the issue by calling sock_put() from tun_detach()
>> after all necessary accesses for the struct net has done.
>>
>> Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
>> Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
>> Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe [1]
>> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
>> ---
>> v2:
>> - Include symbolic stack trace
>> - Add Fixes and Reported-by tags
>> v1: https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
>> ---
>>  drivers/net/tun.c | 6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>> index 7a3ab3427369..ce9fcf4c8ef4 100644
>> --- a/drivers/net/tun.c
>> +++ b/drivers/net/tun.c
>> @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>>                 if (tun)
>>                         xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>                 ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
>> -               sock_put(&tfile->sk);
>>         }
>>  }
>>
>> @@ -702,6 +701,11 @@ static void tun_detach(struct tun_file *tfile, bool clean)
>>         if (dev)
>>                 netdev_state_change(dev);
>>         rtnl_unlock();
>> +
>> +       if (clean) {
> 
> Would you mind explaining (a comment would be nice) why this barrier is needed ?

I thought that tfile is accessed with rcu_lock(), so I put
synchronize_rcu() here.  Please let me know if I misunderstand the
concept of rcu (I'm losing my confidence...).

Thanks,
Shigeru

> 
> Thanks.
> 
>> +               synchronize_rcu();
>> +               sock_put(&tfile->sk);
>> +       }
>>  }
>>
>>  static void tun_detach_all(struct net_device *dev)
>> --
>> 2.38.1
>>
>
  
Eric Dumazet Nov. 22, 2022, 6:47 p.m. UTC | #5
On Tue, Nov 22, 2022 at 10:10 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
>
> Hi Eric,
>
> On Mon, 21 Nov 2022 08:47:17 -0800, Eric Dumazet wrote:
> > On Sun, Nov 20, 2022 at 1:02 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
> >>
> >> syzbot reported use-after-free in tun_detach() [1].  This causes call
> >> trace like below:
> >>
> >> ==================================================================
> >> BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
> >> Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
> >>
> >> CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
> >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
> >> Call Trace:
> >>  <TASK>
> >>  __dump_stack lib/dump_stack.c:88 [inline]
> >>  dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
> >>  print_address_description mm/kasan/report.c:284 [inline]
> >>  print_report+0x15e/0x461 mm/kasan/report.c:395
> >>  kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
> >>  notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
> >>  call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
> >>  call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
> >>  call_netdevice_notifiers net/core/dev.c:1997 [inline]
> >>  netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
> >>  netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
> >>  tun_detach drivers/net/tun.c:704 [inline]
> >>  tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
> >>  __fput+0x27c/0xa90 fs/file_table.c:320
> >>  task_work_run+0x16f/0x270 kernel/task_work.c:179
> >>  exit_task_work include/linux/task_work.h:38 [inline]
> >>  do_exit+0xb3d/0x2a30 kernel/exit.c:820
> >>  do_group_exit+0xd4/0x2a0 kernel/exit.c:950
> >>  get_signal+0x21b1/0x2440 kernel/signal.c:2858
> >>  arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
> >>  exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
> >>  exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
> >>  __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
> >>  syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
> >>  do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
> >>  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >>
> >> The cause of the issue is that sock_put() from __tun_detach() drops
> >> last reference count for struct net, and then notifier_call_chain()
> >> from netdev_state_change() accesses that struct net.
> >>
> >> This patch fixes the issue by calling sock_put() from tun_detach()
> >> after all necessary accesses for the struct net has done.
> >>
> >> Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
> >> Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
> >> Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe [1]
> >> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
> >> ---
> >> v2:
> >> - Include symbolic stack trace
> >> - Add Fixes and Reported-by tags
> >> v1: https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
> >> ---
> >>  drivers/net/tun.c | 6 +++++-
> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> >> index 7a3ab3427369..ce9fcf4c8ef4 100644
> >> --- a/drivers/net/tun.c
> >> +++ b/drivers/net/tun.c
> >> @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
> >>                 if (tun)
> >>                         xdp_rxq_info_unreg(&tfile->xdp_rxq);
> >>                 ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
> >> -               sock_put(&tfile->sk);
> >>         }
> >>  }
> >>
> >> @@ -702,6 +701,11 @@ static void tun_detach(struct tun_file *tfile, bool clean)
> >>         if (dev)
> >>                 netdev_state_change(dev);
> >>         rtnl_unlock();
> >> +
> >> +       if (clean) {
> >
> > Would you mind explaining (a comment would be nice) why this barrier is needed ?
>
> I thought that tfile is accessed with rcu_lock(), so I put
> synchronize_rcu() here.  Please let me know if I misunderstand the
> concept of rcu (I'm losing my confidence...).
>

Addin Jason for comments.

If an RCU grace period was needed before commit 83c1f36f9880 ("tun:
send netlink notification when the device is modified"),
would we need another patch ?

Also sock_flag(sk, SOCK_RCU_FREE) would probably be better than adding
a synchronize_rcu() (if again a grace period is needed)



> Thanks,
> Shigeru
>
> >
> > Thanks.
> >
> >> +               synchronize_rcu();
> >> +               sock_put(&tfile->sk);
> >> +       }
> >>  }
> >>
> >>  static void tun_detach_all(struct net_device *dev)
> >> --
> >> 2.38.1
> >>
> >
>
  
Jason Wang Nov. 23, 2022, 4:20 a.m. UTC | #6
在 2022/11/23 02:47, Eric Dumazet 写道:
> On Tue, Nov 22, 2022 at 10:10 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
>> Hi Eric,
>>
>> On Mon, 21 Nov 2022 08:47:17 -0800, Eric Dumazet wrote:
>>> On Sun, Nov 20, 2022 at 1:02 AM Shigeru Yoshida <syoshida@redhat.com> wrote:
>>>> syzbot reported use-after-free in tun_detach() [1].  This causes call
>>>> trace like below:
>>>>
>>>> ==================================================================
>>>> BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>>>> Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
>>>>
>>>> CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
>>>> Call Trace:
>>>>   <TASK>
>>>>   __dump_stack lib/dump_stack.c:88 [inline]
>>>>   dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
>>>>   print_address_description mm/kasan/report.c:284 [inline]
>>>>   print_report+0x15e/0x461 mm/kasan/report.c:395
>>>>   kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
>>>>   notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>>>>   call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
>>>>   call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
>>>>   call_netdevice_notifiers net/core/dev.c:1997 [inline]
>>>>   netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
>>>>   netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
>>>>   tun_detach drivers/net/tun.c:704 [inline]
>>>>   tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
>>>>   __fput+0x27c/0xa90 fs/file_table.c:320
>>>>   task_work_run+0x16f/0x270 kernel/task_work.c:179
>>>>   exit_task_work include/linux/task_work.h:38 [inline]
>>>>   do_exit+0xb3d/0x2a30 kernel/exit.c:820
>>>>   do_group_exit+0xd4/0x2a0 kernel/exit.c:950
>>>>   get_signal+0x21b1/0x2440 kernel/signal.c:2858
>>>>   arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
>>>>   exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>>>   exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>>>>   __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>>>>   syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>>>>   do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>>>>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>>
>>>> The cause of the issue is that sock_put() from __tun_detach() drops
>>>> last reference count for struct net, and then notifier_call_chain()
>>>> from netdev_state_change() accesses that struct net.
>>>>
>>>> This patch fixes the issue by calling sock_put() from tun_detach()
>>>> after all necessary accesses for the struct net has done.
>>>>
>>>> Fixes: 83c1f36f9880 ("tun: send netlink notification when the device is modified")
>>>> Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
>>>> Link: https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe [1]
>>>> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
>>>> ---
>>>> v2:
>>>> - Include symbolic stack trace
>>>> - Add Fixes and Reported-by tags
>>>> v1: https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
>>>> ---
>>>>   drivers/net/tun.c | 6 +++++-
>>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>> index 7a3ab3427369..ce9fcf4c8ef4 100644
>>>> --- a/drivers/net/tun.c
>>>> +++ b/drivers/net/tun.c
>>>> @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
>>>>                  if (tun)
>>>>                          xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>>>                  ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
>>>> -               sock_put(&tfile->sk);
>>>>          }
>>>>   }
>>>>
>>>> @@ -702,6 +701,11 @@ static void tun_detach(struct tun_file *tfile, bool clean)
>>>>          if (dev)
>>>>                  netdev_state_change(dev);
>>>>          rtnl_unlock();
>>>> +
>>>> +       if (clean) {
>>> Would you mind explaining (a comment would be nice) why this barrier is needed ?
>> I thought that tfile is accessed with rcu_lock(), so I put
>> synchronize_rcu() here.  Please let me know if I misunderstand the
>> concept of rcu (I'm losing my confidence...).
>>
> Addin Jason for comments.
>
> If an RCU grace period was needed before commit 83c1f36f9880 ("tun:
> send netlink notification when the device is modified"),
> would we need another patch ?


I think we don't need another synchronization here. __tun_detach() has 
already done the necessary synchronization when it tries to modify 
tun->tfiles array and tfile->tun.

Thanks


>
> Also sock_flag(sk, SOCK_RCU_FREE) would probably be better than adding
> a synchronize_rcu() (if again a grace period is needed)
>
>
>
>> Thanks,
>> Shigeru
>>
>>> Thanks.
>>>
>>>> +               synchronize_rcu();
>>>> +               sock_put(&tfile->sk);
>>>> +       }
>>>>   }
>>>>
>>>>   static void tun_detach_all(struct net_device *dev)
>>>> --
>>>> 2.38.1
>>>>
  
Shigeru Yoshida Nov. 23, 2022, 4:08 p.m. UTC | #7
Hi Jason,

On Wed, 23 Nov 2022 12:20:47 +0800, Jason Wang wrote:
> 
> 在 2022/11/23 02:47, Eric Dumazet 写道:
>> On Tue, Nov 22, 2022 at 10:10 AM Shigeru Yoshida <syoshida@redhat.com>
>> wrote:
>>> Hi Eric,
>>>
>>> On Mon, 21 Nov 2022 08:47:17 -0800, Eric Dumazet wrote:
>>>> On Sun, Nov 20, 2022 at 1:02 AM Shigeru Yoshida <syoshida@redhat.com>
>>>> wrote:
>>>>> syzbot reported use-after-free in tun_detach() [1].  This causes call
>>>>> trace like below:
>>>>>
>>>>> ==================================================================
>>>>> BUG: KASAN: use-after-free in notifier_call_chain+0x1ee/0x200
>>>>> kernel/notifier.c:75
>>>>> Read of size 8 at addr ffff88807324e2a8 by task syz-executor.0/3673
>>>>>
>>>>> CPU: 0 PID: 3673 Comm: syz-executor.0 Not tainted
>>>>> 6.1.0-rc5-syzkaller-00044-gcc675d22e422 #0
>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>>>>> BIOS Google 10/26/2022
>>>>> Call Trace:
>>>>>   <TASK>
>>>>>   __dump_stack lib/dump_stack.c:88 [inline]
>>>>>   dump_stack_lvl+0xd1/0x138 lib/dump_stack.c:106
>>>>>   print_address_description mm/kasan/report.c:284 [inline]
>>>>>   print_report+0x15e/0x461 mm/kasan/report.c:395
>>>>>   kasan_report+0xbf/0x1f0 mm/kasan/report.c:495
>>>>>   notifier_call_chain+0x1ee/0x200 kernel/notifier.c:75
>>>>>   call_netdevice_notifiers_info+0x86/0x130 net/core/dev.c:1942
>>>>>   call_netdevice_notifiers_extack net/core/dev.c:1983 [inline]
>>>>>   call_netdevice_notifiers net/core/dev.c:1997 [inline]
>>>>>   netdev_wait_allrefs_any net/core/dev.c:10237 [inline]
>>>>>   netdev_run_todo+0xbc6/0x1100 net/core/dev.c:10351
>>>>>   tun_detach drivers/net/tun.c:704 [inline]
>>>>>   tun_chr_close+0xe4/0x190 drivers/net/tun.c:3467
>>>>>   __fput+0x27c/0xa90 fs/file_table.c:320
>>>>>   task_work_run+0x16f/0x270 kernel/task_work.c:179
>>>>>   exit_task_work include/linux/task_work.h:38 [inline]
>>>>>   do_exit+0xb3d/0x2a30 kernel/exit.c:820
>>>>>   do_group_exit+0xd4/0x2a0 kernel/exit.c:950
>>>>>   get_signal+0x21b1/0x2440 kernel/signal.c:2858
>>>>>   arch_do_signal_or_restart+0x86/0x2300 arch/x86/kernel/signal.c:869
>>>>>   exit_to_user_mode_loop kernel/entry/common.c:168 [inline]
>>>>>   exit_to_user_mode_prepare+0x15f/0x250 kernel/entry/common.c:203
>>>>>   __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
>>>>>   syscall_exit_to_user_mode+0x1d/0x50 kernel/entry/common.c:296
>>>>>   do_syscall_64+0x46/0xb0 arch/x86/entry/common.c:86
>>>>>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
>>>>>
>>>>> The cause of the issue is that sock_put() from __tun_detach() drops
>>>>> last reference count for struct net, and then notifier_call_chain()
>>>>> from netdev_state_change() accesses that struct net.
>>>>>
>>>>> This patch fixes the issue by calling sock_put() from tun_detach()
>>>>> after all necessary accesses for the struct net has done.
>>>>>
>>>>> Fixes: 83c1f36f9880 ("tun: send netlink notification when the device
>>>>> is modified")
>>>>> Reported-by: syzbot+106f9b687cd64ee70cd1@syzkaller.appspotmail.com
>>>>> Link:
>>>>> https://syzkaller.appspot.com/bug?id=96eb7f1ce75ef933697f24eeab928c4a716edefe
>>>>> [1]
>>>>> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
>>>>> ---
>>>>> v2:
>>>>> - Include symbolic stack trace
>>>>> - Add Fixes and Reported-by tags
>>>>> v1:
>>>>> https://lore.kernel.org/all/20221119075615.723290-1-syoshida@redhat.com/
>>>>> ---
>>>>>   drivers/net/tun.c | 6 +++++-
>>>>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
>>>>> index 7a3ab3427369..ce9fcf4c8ef4 100644
>>>>> --- a/drivers/net/tun.c
>>>>> +++ b/drivers/net/tun.c
>>>>> @@ -686,7 +686,6 @@ static void __tun_detach(struct tun_file *tfile,
>>>>> bool clean)
>>>>>                  if (tun)
>>>>>                          xdp_rxq_info_unreg(&tfile->xdp_rxq);
>>>>>                  ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
>>>>> -               sock_put(&tfile->sk);
>>>>>          }
>>>>>   }
>>>>>
>>>>> @@ -702,6 +701,11 @@ static void tun_detach(struct tun_file *tfile,
>>>>> bool clean)
>>>>>          if (dev)
>>>>>                  netdev_state_change(dev);
>>>>>          rtnl_unlock();
>>>>> +
>>>>> +       if (clean) {
>>>> Would you mind explaining (a comment would be nice) why this barrier
>>>> is needed ?
>>> I thought that tfile is accessed with rcu_lock(), so I put
>>> synchronize_rcu() here.  Please let me know if I misunderstand the
>>> concept of rcu (I'm losing my confidence...).
>>>
>> Addin Jason for comments.
>>
>> If an RCU grace period was needed before commit 83c1f36f9880 ("tun:
>> send netlink notification when the device is modified"),
>> would we need another patch ?
> 
> 
> I think we don't need another synchronization here. __tun_detach() has
> already done the necessary synchronization when it tries to modify
> tun->tfiles array and tfile->tun.

Thank you so much for your comment.  I'll prepare v3 patch to remove
calling synchronize_rcu().

Thanks,
Shigeru

> Thanks
> 
> 
>>
>> Also sock_flag(sk, SOCK_RCU_FREE) would probably be better than adding
>> a synchronize_rcu() (if again a grace period is needed)
>>
>>
>>
>>> Thanks,
>>> Shigeru
>>>
>>>> Thanks.
>>>>
>>>>> +               synchronize_rcu();
>>>>> +               sock_put(&tfile->sk);
>>>>> +       }
>>>>>   }
>>>>>
>>>>>   static void tun_detach_all(struct net_device *dev)
>>>>> --
>>>>> 2.38.1
>>>>>
>
  

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 7a3ab3427369..ce9fcf4c8ef4 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -686,7 +686,6 @@  static void __tun_detach(struct tun_file *tfile, bool clean)
 		if (tun)
 			xdp_rxq_info_unreg(&tfile->xdp_rxq);
 		ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free);
-		sock_put(&tfile->sk);
 	}
 }
 
@@ -702,6 +701,11 @@  static void tun_detach(struct tun_file *tfile, bool clean)
 	if (dev)
 		netdev_state_change(dev);
 	rtnl_unlock();
+
+	if (clean) {
+		synchronize_rcu();
+		sock_put(&tfile->sk);
+	}
 }
 
 static void tun_detach_all(struct net_device *dev)