[net-next] net: page_pool: fix null-ptr-deref in page_pool_unlist

Message ID 20231130073749.1329999-1-lizhi.xu@windriver.com
State New
Headers
Series [net-next] net: page_pool: fix null-ptr-deref in page_pool_unlist |

Commit Message

Lizhi Xu Nov. 30, 2023, 7:37 a.m. UTC
  [Syz report]
Illegal XDP return value 4294946546 on prog  (id 2) dev N/A, expect packet loss!
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __page_pool_destroy net/core/page_pool.c:851 [inline]
 page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
 page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
 xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
 bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
 bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
 bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
 __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
 __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
 __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486
 do_syscall_x64 arch/x86/entry/common.c:51 [inline]
 do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

[Analysis]
If "user.list" is initialized, the corresponding slow.netdev device must exist, 
so before recycling "user.list", it is necessary to confirm that the "slow.netdev"
device is valid.

[Fix]
Add slow.netdev != NULL check before delete "user.list".

Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
Reported-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
---
 net/core/page_pool_user.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Eric Dumazet Nov. 30, 2023, 8:29 a.m. UTC | #1
On Thu, Nov 30, 2023 at 8:37 AM Lizhi Xu <lizhi.xu@windriver.com> wrote:
>
> [Syz report]
> Illegal XDP return value 4294946546 on prog  (id 2) dev N/A, expect packet loss!
> general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
> RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
> RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
> RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
> Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
> RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
> RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
> R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
> R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
> FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  __page_pool_destroy net/core/page_pool.c:851 [inline]
>  page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
>  page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
>  xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
>  bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
>  bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
>  bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
>  __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
>  __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
>  __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
>  __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486
>  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
>  do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
>  entry_SYSCALL_64_after_hwframe+0x63/0x6b
>
> [Analysis]
> If "user.list" is initialized, the corresponding slow.netdev device must exist,
> so before recycling "user.list", it is necessary to confirm that the "slow.netdev"
> device is valid.
>
> [Fix]
> Add slow.netdev != NULL check before delete "user.list".
>
> Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
> Reported-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
> Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>



I sent a fix already ?

https://lore.kernel.org/netdev/CANn89i+6BuZA6AjocG_0zTkD1u=pNgZc_DpZMO=yUN=S1cHS3w@mail.gmail.com/

Please do not attribute to yourself work done by others, let me submit
the fix formally, thanks.





> ---
>  net/core/page_pool_user.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
> index 1426434a7e15..ca71f4103b3a 100644
> --- a/net/core/page_pool_user.c
> +++ b/net/core/page_pool_user.c
> @@ -339,7 +339,8 @@ void page_pool_unlist(struct page_pool *pool)
>         mutex_lock(&page_pools_lock);
>         netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_DEL_NTF);
>         xa_erase(&page_pools, pool->user.id);
> -       hlist_del(&pool->user.list);
> +       if (pool->slow.netdev)
> +               hlist_del(&pool->user.list);
>         mutex_unlock(&page_pools_lock);
>  }
>
> --
> 2.26.1
>
  
Lizhi Xu Nov. 30, 2023, 8:58 a.m. UTC | #2
On Thu, 30 Nov 2023 09:29:04 +0100, Eric Dumazet <edumazet@google.com> wrote:
> > [Syz report]
> > Illegal XDP return value 4294946546 on prog  (id 2) dev N/A, expect packet loss!
> > general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
> > KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> > CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
> > RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
> > RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
> > RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
> > Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
> > RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
> > RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
> > RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
> > R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
> > R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
> > FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  <TASK>
> >  __page_pool_destroy net/core/page_pool.c:851 [inline]
> >  page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
> >  page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
> >  xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
> >  bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
> >  bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
> >  bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
> >  __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
> >  __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
> >  __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
> >  __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486
> >  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> >  do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
> >  entry_SYSCALL_64_after_hwframe+0x63/0x6b
> >
> > [Analysis]
> > If "user.list" is initialized, the corresponding slow.netdev device must exist,
> > so before recycling "user.list", it is necessary to confirm that the "slow.netdev"
> > device is valid.
> >
> > [Fix]
> > Add slow.netdev != NULL check before delete "user.list".
> >
> > Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
> > Reported-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
> > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> 
> 
> 
> I sent a fix already ?
> 
> https://lore.kernel.org/netdev/CANn89i+6BuZA6AjocG_0zTkD1u=pNgZc_DpZMO=yUN=S1cHS3w@mail.gmail.com/
> 
> Please do not attribute to yourself work done by others, let me submit
> the fix formally, thanks.
What exists may not necessarily be right, and how do you prove that I saw your 
fix before fixing it?

You have only tested on syzbot, that's all. 
This does not mean that others should refer to you for repairs, nor does it 
prove that you have made repairs, and others cannot fix them.

Thanks
  
Eric Dumazet Nov. 30, 2023, 9:06 a.m. UTC | #3
On Thu, Nov 30, 2023 at 9:58 AM Lizhi Xu <lizhi.xu@windriver.com> wrote:
>
> On Thu, 30 Nov 2023 09:29:04 +0100, Eric Dumazet <edumazet@google.com> wrote:
> > > [Syz report]
> > > Illegal XDP return value 4294946546 on prog  (id 2) dev N/A, expect packet loss!
> > > general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
> > > KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> > > CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
> > > RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
> > > RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
> > > RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
> > > Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
> > > RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
> > > RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
> > > RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
> > > R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
> > > R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
> > > FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > >  <TASK>
> > >  __page_pool_destroy net/core/page_pool.c:851 [inline]
> > >  page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
> > >  page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
> > >  xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
> > >  bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
> > >  bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
> > >  bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
> > >  __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
> > >  __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
> > >  __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
> > >  __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486
> > >  do_syscall_x64 arch/x86/entry/common.c:51 [inline]
> > >  do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
> > >  entry_SYSCALL_64_after_hwframe+0x63/0x6b
> > >
> > > [Analysis]
> > > If "user.list" is initialized, the corresponding slow.netdev device must exist,
> > > so before recycling "user.list", it is necessary to confirm that the "slow.netdev"
> > > device is valid.
> > >
> > > [Fix]
> > > Add slow.netdev != NULL check before delete "user.list".
> > >
> > > Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
> > > Reported-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
> > > Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
> >
> >
> >
> > I sent a fix already ?
> >
> > https://lore.kernel.org/netdev/CANn89i+6BuZA6AjocG_0zTkD1u=pNgZc_DpZMO=yUN=S1cHS3w@mail.gmail.com/
> >
> > Please do not attribute to yourself work done by others, let me submit
> > the fix formally, thanks.
> What exists may not necessarily be right, and how do you prove that I saw your
> fix before fixing it?
>
> You have only tested on syzbot, that's all.
> This does not mean that others should refer to you for repairs, nor does it
> prove that you have made repairs, and others cannot fix them.

I am just saying I sent a fix already, and that it was sent a few
minutes after the syzbot report was available.

(You included syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com in
your report, meaning that you must have seen my patch)

It is not because I sleep during night time that you can decide to use
my work without any credits.
  

Patch

diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 1426434a7e15..ca71f4103b3a 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -339,7 +339,8 @@  void page_pool_unlist(struct page_pool *pool)
 	mutex_lock(&page_pools_lock);
 	netdev_nl_page_pool_event(pool, NETDEV_CMD_PAGE_POOL_DEL_NTF);
 	xa_erase(&page_pools, pool->user.id);
-	hlist_del(&pool->user.list);
+	if (pool->slow.netdev)
+		hlist_del(&pool->user.list);
 	mutex_unlock(&page_pools_lock);
 }