Message ID | 20230515021307.3072-1-dinghui@sangfor.com.cn |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp6610366vqo; Sun, 14 May 2023 19:16:29 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5WOSLqb1SuqV5Ru8zm5ROlQFCNRsBnbVd9s9niry3DdG3tsB+NuE0h24pPBNuwk2bHAoLj X-Received: by 2002:a17:903:1103:b0:1ab:d89:5ef6 with SMTP id n3-20020a170903110300b001ab0d895ef6mr39978722plh.68.1684116989216; Sun, 14 May 2023 19:16:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1684116989; cv=none; d=google.com; s=arc-20160816; b=fkRo4nz42usHPPYv+r/uJ0dqk3qR4MeFKryWqBfj7SQWufDpGvClzf3Lz8O3o+VGE+ w/t3/yU4djgp6alCfHKPi4emdqk//ygHxFCzGYFhgwcwgJygo0BWAN4QqsRUQY4cBnTs jVxcAgOx/aRStD8Bws21wq94RSZWrkF2CFQwAtjLdTF10LjRO/3uHau84Z0TZelZKHUp TV+hrAdXN+ZxIWgRP3QMnr6ik9Ul1zFnNStqJOanRYo6eTXXPx10zhppPWk/FmJnqfoT 4ucC8oci+nccg2+VmTZuc7/J70ApbnUdRR6ytE/+l5ekXxEMQEEpjsG/f+G7ZkqyE/Z5 kldA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:cc:to:from; bh=HYLKSULlZE8k3IBCb6t1q4e2NViOtrdwXLTTNenZXi8=; b=Ctz3IRtEAW0PQvdLBYfapIMSOqkjK0b/3PZnjbUc/juOtPugw00Q0hdhv0D06v+uQ8 Ez4PN6PBCFKXfujBWy5XdI4uY8bVL9ViAd5htDA1JCNeiCU+4MD5iOHeJfnYYybeYvh2 LwSxvU1uQtq5HW02gphlQwceT6eLEvq9vSSKosfrrJaxaEADCIYwxOV16Xgn8eVIpAN2 I/htMui5OhxL93/6+rr0h3UyOgZrsELilLNd0LHbArgIs8jySc9kXjche8H5JyMYnAQ7 9ja9B+bUIkd6UpGtya0NWPhLXXlBEpywx+v2IqArsopdAY5pVKLkJdThx/LCuk4Ud3NK E/aA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q15-20020a17090311cf00b001aaf45eb4cbsi4613976plh.163.2023.05.14.19.16.14; Sun, 14 May 2023 19:16:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233743AbjEOCNx (ORCPT <rfc822;peekingduck44@gmail.com> + 99 others); Sun, 14 May 2023 22:13:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229645AbjEOCNv (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sun, 14 May 2023 22:13:51 -0400 Received: from mail-m127104.qiye.163.com (mail-m127104.qiye.163.com [115.236.127.104]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B392110DB; Sun, 14 May 2023 19:13:48 -0700 (PDT) Received: from localhost.localdomain (unknown [IPV6:240e:3b7:3270:1980:719c:500e:9fa7:6718]) by mail-m127104.qiye.163.com (Hmail) with ESMTPA id 83E3DA40111; Mon, 15 May 2023 10:13:44 +0800 (CST) From: Ding Hui <dinghui@sangfor.com.cn> To: chuck.lever@oracle.com, jlayton@kernel.org, trond.myklebust@hammerspace.com, anna@kernel.org Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, linux-nfs@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, dinghui@sangfor.com.cn, stable@vger.kernel.org Subject: [PATCH] SUNRPC: Fix UAF in svc_tcp_listen_data_ready() Date: Mon, 15 May 2023 10:13:07 +0800 Message-Id: <20230515021307.3072-1-dinghui@sangfor.com.cn> X-Mailer: git-send-email 2.17.1 X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFITzdXWS1ZQUlXWQ8JGhUIEh9ZQVlCThkYVksYGkIeTkwYS0xLGFUTARMWGhIXJBQOD1 lXWRgSC1lBWUlPSx5BSBlMQUhJTEtBSkJDS0FMSkIYQU5LSx5BQh0aTEFNTEpDWVdZFhoPEhUdFF lBWU9LSFVKSktISkxVSktLVUtZBg++ X-HM-Tid: 0a881d2f6370b282kuuu83e3da40111 X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6MT46Hio5Mj0XNxksMEIOHR8U TD4wCSpVSlVKTUNPSkpNQ0lOSEJIVTMWGhIXVR8SFRwTDhI7CBoVHB0UCVUYFBZVGBVFWVdZEgtZ QVlJT0seQUgZTEFISUxLQUpCQ0tBTEpCGEFOS0seQUIdGkxBTUxKQ1lXWQgBWUFOT01NNwY+ X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1765226133538332441?= X-GMAIL-MSGID: =?utf-8?q?1765924656204826553?= |
Series |
SUNRPC: Fix UAF in svc_tcp_listen_data_ready()
|
|
Commit Message
Ding Hui
May 15, 2023, 2:13 a.m. UTC
After the listener svc_sock be freed, and before invoking svc_tcp_accept()
for the established child sock, there is a window that the newsock
retaining a freed listener svc_sock in sk_user_data which cloning from
parent. In the race windows if data is received on the newsock, we will
observe use-after-free report in svc_tcp_listen_data_ready().
Reproduce by two tasks:
1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done
KASAN report:
==================================================================
BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
Read of size 8 at addr ffff888139d96228 by task nc/102553
CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Call Trace:
<IRQ>
dump_stack_lvl+0x33/0x50
print_address_description.constprop.0+0x27/0x310
print_report+0x3e/0x70
kasan_report+0xae/0xe0
svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
tcp_data_queue+0x9f4/0x20e0
tcp_rcv_established+0x666/0x1f60
tcp_v4_do_rcv+0x51c/0x850
tcp_v4_rcv+0x23fc/0x2e80
ip_protocol_deliver_rcu+0x62/0x300
ip_local_deliver_finish+0x267/0x350
ip_local_deliver+0x18b/0x2d0
ip_rcv+0x2fb/0x370
__netif_receive_skb_one_core+0x166/0x1b0
process_backlog+0x24c/0x5e0
__napi_poll+0xa2/0x500
net_rx_action+0x854/0xc90
__do_softirq+0x1bb/0x5de
do_softirq+0xcb/0x100
</IRQ>
<TASK>
...
</TASK>
Allocated by task 102371:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
__kasan_kmalloc+0x7b/0x90
svc_setup_socket+0x52/0x4f0 [sunrpc]
svc_addsock+0x20d/0x400 [sunrpc]
__write_ports_addfd+0x209/0x390 [nfsd]
write_ports+0x239/0x2c0 [nfsd]
nfsctl_transaction_write+0xac/0x110 [nfsd]
vfs_write+0x1c3/0xae0
ksys_write+0xed/0x1c0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 102551:
kasan_save_stack+0x1e/0x40
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x50
__kasan_slab_free+0x106/0x190
__kmem_cache_free+0x133/0x270
svc_xprt_free+0x1e2/0x350 [sunrpc]
svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
nfsd_put+0x125/0x240 [nfsd]
nfsd_svc+0x2cb/0x3c0 [nfsd]
write_threads+0x1ac/0x2a0 [nfsd]
nfsctl_transaction_write+0xac/0x110 [nfsd]
vfs_write+0x1c3/0xae0
ksys_write+0xed/0x1c0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
if state != TCP_LISTEN, that will avoid dereferencing svsk for all
child socket.
Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Cc: <stable@vger.kernel.org>
---
net/sunrpc/svcsock.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
Comments
> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote: > > After the listener svc_sock be freed, and before invoking svc_tcp_accept() > for the established child sock, there is a window that the newsock > retaining a freed listener svc_sock in sk_user_data which cloning from > parent. Thank you, I will apply this (after testing it). The next step is to figure out why SUNRPC is trying to accept on a dead listener. Any thoughts about that? > In the race windows if data is received on the newsock, we will > observe use-after-free report in svc_tcp_listen_data_ready(). > > Reproduce by two tasks: > > 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done > 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done I will continue attempting to reproduce, as I would like a root cause for this issue. > KASAN report: > > ================================================================== > BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] > Read of size 8 at addr ffff888139d96228 by task nc/102553 > CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18 > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 > Call Trace: > <IRQ> > dump_stack_lvl+0x33/0x50 > print_address_description.constprop.0+0x27/0x310 > print_report+0x3e/0x70 > kasan_report+0xae/0xe0 > svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] > tcp_data_queue+0x9f4/0x20e0 > tcp_rcv_established+0x666/0x1f60 > tcp_v4_do_rcv+0x51c/0x850 > tcp_v4_rcv+0x23fc/0x2e80 > ip_protocol_deliver_rcu+0x62/0x300 > ip_local_deliver_finish+0x267/0x350 > ip_local_deliver+0x18b/0x2d0 > ip_rcv+0x2fb/0x370 > __netif_receive_skb_one_core+0x166/0x1b0 > process_backlog+0x24c/0x5e0 > __napi_poll+0xa2/0x500 > net_rx_action+0x854/0xc90 > __do_softirq+0x1bb/0x5de > do_softirq+0xcb/0x100 > </IRQ> > <TASK> > ... > </TASK> > > Allocated by task 102371: > kasan_save_stack+0x1e/0x40 > kasan_set_track+0x21/0x30 > __kasan_kmalloc+0x7b/0x90 > svc_setup_socket+0x52/0x4f0 [sunrpc] > svc_addsock+0x20d/0x400 [sunrpc] > __write_ports_addfd+0x209/0x390 [nfsd] > write_ports+0x239/0x2c0 [nfsd] > nfsctl_transaction_write+0xac/0x110 [nfsd] > vfs_write+0x1c3/0xae0 > ksys_write+0xed/0x1c0 > do_syscall_64+0x38/0x90 > entry_SYSCALL_64_after_hwframe+0x72/0xdc > > Freed by task 102551: > kasan_save_stack+0x1e/0x40 > kasan_set_track+0x21/0x30 > kasan_save_free_info+0x2a/0x50 > __kasan_slab_free+0x106/0x190 > __kmem_cache_free+0x133/0x270 > svc_xprt_free+0x1e2/0x350 [sunrpc] > svc_xprt_destroy_all+0x25a/0x440 [sunrpc] > nfsd_put+0x125/0x240 [nfsd] > nfsd_svc+0x2cb/0x3c0 [nfsd] > write_threads+0x1ac/0x2a0 [nfsd] > nfsctl_transaction_write+0xac/0x110 [nfsd] > vfs_write+0x1c3/0xae0 > ksys_write+0xed/0x1c0 > do_syscall_64+0x38/0x90 > entry_SYSCALL_64_after_hwframe+0x72/0xdc > > Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready() > if state != TCP_LISTEN, that will avoid dereferencing svsk for all > child socket. > > Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/ > Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding") > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> > Cc: <stable@vger.kernel.org> > --- > net/sunrpc/svcsock.c | 23 +++++++++++------------ > 1 file changed, 11 insertions(+), 12 deletions(-) > > diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c > index a51c9b989d58..9aca6e1e78e4 100644 > --- a/net/sunrpc/svcsock.c > +++ b/net/sunrpc/svcsock.c > @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk) > > trace_sk_data_ready(sk); > > - if (svsk) { > - /* Refer to svc_setup_socket() for details. */ > - rmb(); > - svsk->sk_odata(sk); > - } > - > /* > * This callback may called twice when a new connection > * is established as a child socket inherits everything > @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk) > * when one of child sockets become ESTABLISHED. > * 2) data_ready method of the child socket may be called > * when it receives data before the socket is accepted. > - * In case of 2, we should ignore it silently. > + * In case of 2, we should ignore it silently and DO NOT > + * dereference svsk. > */ > - if (sk->sk_state == TCP_LISTEN) { > - if (svsk) { > - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); > - svc_xprt_enqueue(&svsk->sk_xprt); > - } > + if (sk->sk_state != TCP_LISTEN) > + return; > + > + if (svsk) { > + /* Refer to svc_setup_socket() for details. */ > + rmb(); > + svsk->sk_odata(sk); > + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); > + svc_xprt_enqueue(&svsk->sk_xprt); > } > } > > -- > 2.17.1 > -- Chuck Lever
On 2023/5/15 20:01, Chuck Lever III wrote: > > >> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote: >> >> After the listener svc_sock be freed, and before invoking svc_tcp_accept() >> for the established child sock, there is a window that the newsock >> retaining a freed listener svc_sock in sk_user_data which cloning from >> parent. > > Thank you, I will apply this (after testing it). > > The next step is to figure out why SUNRPC is trying to accept > on a dead listener. Any thoughts about that? > A child sock is cloned from the listener sock, it inherits sk_data_ready and sk_user_data from its parent sock, which is svc_tcp_listen_data_ready() and listener svc_sock, the sk_state of the child becomes ESTABLISHED once after TCP handshake in protocol stack. Case 1: listener sock | child sock | nfsd thread =>sk_data_ready | =>sk_data_ready | -------------------+-----------------------+---------------------- svc_tcp_listen_data_ready svsk is listener svc_sock set_bit(XPT_CONN) svc_recv svc_tcp_accept(listener) kernel_accept get child sock as newsock svc_setup_socket(newsock) newsock->sk_data_ready=svc_data_ready newsock->sk_user_data=newsvsk svc_data_ready svsk is newsvsk Case 2: listener sock | child sock | nfsd thread =>sk_data_ready | =>sk_data_ready | -------------------+-----------------------+---------------------- svc_tcp_listen_data_ready svsk is listener svc_sock set_bit(XPT_CONN) svc_tcp_listen_data_ready svsk is listener svc_sock svc_recv svc_tcp_accept(listener) kernel_accept get the child sock as newsock svc_setup_socket(newsock) newsock->sk_data_ready=svc_data_ready newsock->sk_user_data=newsvsk svc_data_ready svsk is newsvsk The UAF case: listener sock | child sock | rpc.nfsd 0 =>sk_data_ready | =>sk_data_ready | -------------------+-----------------------+---------------------- svc_tcp_listen_data_ready svsk is listener svc_sock set_bit(XPT_CONN) svc_xprt_destroy_all svc_xprt_free kfree listener svc_sock // the child sock has not yet been accepted, // so it is not managed by SUNRPC for now. svc_tcp_listen_data_ready svsk is listener svc_sock svsk->sk_odata // UAF! > >> In the race windows if data is received on the newsock, we will >> observe use-after-free report in svc_tcp_listen_data_ready(). >> >> Reproduce by two tasks: >> >> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done >> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done > > I will continue attempting to reproduce, as I would like a > root cause for this issue. > > >> KASAN report: >> >> ================================================================== >> BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] >> Read of size 8 at addr ffff888139d96228 by task nc/102553 >> CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18 >> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >> Call Trace: >> <IRQ> >> dump_stack_lvl+0x33/0x50 >> print_address_description.constprop.0+0x27/0x310 >> print_report+0x3e/0x70 >> kasan_report+0xae/0xe0 >> svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] >> tcp_data_queue+0x9f4/0x20e0 >> tcp_rcv_established+0x666/0x1f60 >> tcp_v4_do_rcv+0x51c/0x850 >> tcp_v4_rcv+0x23fc/0x2e80 >> ip_protocol_deliver_rcu+0x62/0x300 >> ip_local_deliver_finish+0x267/0x350 >> ip_local_deliver+0x18b/0x2d0 >> ip_rcv+0x2fb/0x370 >> __netif_receive_skb_one_core+0x166/0x1b0 >> process_backlog+0x24c/0x5e0 >> __napi_poll+0xa2/0x500 >> net_rx_action+0x854/0xc90 >> __do_softirq+0x1bb/0x5de >> do_softirq+0xcb/0x100 >> </IRQ> >> <TASK> >> ... >> </TASK> >> >> Allocated by task 102371: >> kasan_save_stack+0x1e/0x40 >> kasan_set_track+0x21/0x30 >> __kasan_kmalloc+0x7b/0x90 >> svc_setup_socket+0x52/0x4f0 [sunrpc] >> svc_addsock+0x20d/0x400 [sunrpc] >> __write_ports_addfd+0x209/0x390 [nfsd] >> write_ports+0x239/0x2c0 [nfsd] >> nfsctl_transaction_write+0xac/0x110 [nfsd] >> vfs_write+0x1c3/0xae0 >> ksys_write+0xed/0x1c0 >> do_syscall_64+0x38/0x90 >> entry_SYSCALL_64_after_hwframe+0x72/0xdc >> >> Freed by task 102551: >> kasan_save_stack+0x1e/0x40 >> kasan_set_track+0x21/0x30 >> kasan_save_free_info+0x2a/0x50 >> __kasan_slab_free+0x106/0x190 >> __kmem_cache_free+0x133/0x270 >> svc_xprt_free+0x1e2/0x350 [sunrpc] >> svc_xprt_destroy_all+0x25a/0x440 [sunrpc] >> nfsd_put+0x125/0x240 [nfsd] >> nfsd_svc+0x2cb/0x3c0 [nfsd] >> write_threads+0x1ac/0x2a0 [nfsd] >> nfsctl_transaction_write+0xac/0x110 [nfsd] >> vfs_write+0x1c3/0xae0 >> ksys_write+0xed/0x1c0 >> do_syscall_64+0x38/0x90 >> entry_SYSCALL_64_after_hwframe+0x72/0xdc >> >> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready() >> if state != TCP_LISTEN, that will avoid dereferencing svsk for all >> child socket. >> >> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/ >> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding") >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> >> Cc: <stable@vger.kernel.org> >> --- >> net/sunrpc/svcsock.c | 23 +++++++++++------------ >> 1 file changed, 11 insertions(+), 12 deletions(-) >> >> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c >> index a51c9b989d58..9aca6e1e78e4 100644 >> --- a/net/sunrpc/svcsock.c >> +++ b/net/sunrpc/svcsock.c >> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk) >> >> trace_sk_data_ready(sk); >> >> - if (svsk) { >> - /* Refer to svc_setup_socket() for details. */ >> - rmb(); >> - svsk->sk_odata(sk); >> - } >> - >> /* >> * This callback may called twice when a new connection >> * is established as a child socket inherits everything >> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk) >> * when one of child sockets become ESTABLISHED. >> * 2) data_ready method of the child socket may be called >> * when it receives data before the socket is accepted. >> - * In case of 2, we should ignore it silently. >> + * In case of 2, we should ignore it silently and DO NOT >> + * dereference svsk. >> */ >> - if (sk->sk_state == TCP_LISTEN) { >> - if (svsk) { >> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); >> - svc_xprt_enqueue(&svsk->sk_xprt); >> - } >> + if (sk->sk_state != TCP_LISTEN) >> + return; >> + >> + if (svsk) { >> + /* Refer to svc_setup_socket() for details. */ >> + rmb(); >> + svsk->sk_odata(sk); >> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); >> + svc_xprt_enqueue(&svsk->sk_xprt); >> } >> } >> >> -- >> 2.17.1 >> > > -- > Chuck Lever > > >
> On May 15, 2023, at 10:55 AM, Ding Hui <dinghui@sangfor.com.cn> wrote: > > On 2023/5/15 20:01, Chuck Lever III wrote: >>> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote: >>> >>> After the listener svc_sock be freed, and before invoking svc_tcp_accept() >>> for the established child sock, there is a window that the newsock >>> retaining a freed listener svc_sock in sk_user_data which cloning from >>> parent. >> Thank you, I will apply this (after testing it). >> The next step is to figure out why SUNRPC is trying to accept >> on a dead listener. Any thoughts about that? > > A child sock is cloned from the listener sock, it inherits sk_data_ready > and sk_user_data from its parent sock, which is svc_tcp_listen_data_ready() > and listener svc_sock, the sk_state of the child becomes ESTABLISHED once > after TCP handshake in protocol stack. > > Case 1: > > listener sock | child sock | nfsd thread > =>sk_data_ready | =>sk_data_ready | > -------------------+-----------------------+---------------------- > svc_tcp_listen_data_ready > svsk is listener svc_sock > set_bit(XPT_CONN) > svc_recv > svc_tcp_accept(listener) > kernel_accept get child sock as newsock > svc_setup_socket(newsock) > newsock->sk_data_ready=svc_data_ready > newsock->sk_user_data=newsvsk > svc_data_ready > svsk is newsvsk > > > Case 2: > > listener sock | child sock | nfsd thread > =>sk_data_ready | =>sk_data_ready | > -------------------+-----------------------+---------------------- > svc_tcp_listen_data_ready > svsk is listener svc_sock > set_bit(XPT_CONN) > svc_tcp_listen_data_ready > svsk is listener svc_sock > svc_recv > svc_tcp_accept(listener) > kernel_accept get the child sock as newsock > svc_setup_socket(newsock) > newsock->sk_data_ready=svc_data_ready > newsock->sk_user_data=newsvsk > svc_data_ready > svsk is newsvsk > > > The UAF case: > > listener sock | child sock | rpc.nfsd 0 > =>sk_data_ready | =>sk_data_ready | > -------------------+-----------------------+---------------------- > svc_tcp_listen_data_ready > svsk is listener svc_sock > set_bit(XPT_CONN) > svc_xprt_destroy_all > svc_xprt_free > kfree listener svc_sock > // the child sock has not yet been accepted, > // so it is not managed by SUNRPC for now. > svc_tcp_listen_data_ready > svsk is listener svc_sock > svsk->sk_odata // UAF! Thanks for the ladder diagrams, that helps. >>> In the race windows if data is received on the newsock, we will >>> observe use-after-free report in svc_tcp_listen_data_ready(). >>> >>> Reproduce by two tasks: >>> >>> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done >>> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done >> I will continue attempting to reproduce, as I would like a >> root cause for this issue. svc_xprt shutdown seems to be unordered. I would think that it should unregister with the portmapper, close the listener so no new connections can be established, then close the children last. That doesn't appear to be how it works, though. But if the listener happens to be closed first, that frees the svc_sock that might be relied upon by children sockets. The shutdown ordering seems to be the crux of the problem, and pinning the svc_xprt won't help, as you pointed out. Making the children not rely on the listener does address the crash, but doesn't fix the design. But the design problem doesn't seem to be urgent. So I'm going to file a low priority bug to document the design issue. >>> KASAN report: >>> >>> ================================================================== >>> BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] >>> Read of size 8 at addr ffff888139d96228 by task nc/102553 >>> CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18 >>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020 >>> Call Trace: >>> <IRQ> >>> dump_stack_lvl+0x33/0x50 >>> print_address_description.constprop.0+0x27/0x310 >>> print_report+0x3e/0x70 >>> kasan_report+0xae/0xe0 >>> svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc] >>> tcp_data_queue+0x9f4/0x20e0 >>> tcp_rcv_established+0x666/0x1f60 >>> tcp_v4_do_rcv+0x51c/0x850 >>> tcp_v4_rcv+0x23fc/0x2e80 >>> ip_protocol_deliver_rcu+0x62/0x300 >>> ip_local_deliver_finish+0x267/0x350 >>> ip_local_deliver+0x18b/0x2d0 >>> ip_rcv+0x2fb/0x370 >>> __netif_receive_skb_one_core+0x166/0x1b0 >>> process_backlog+0x24c/0x5e0 >>> __napi_poll+0xa2/0x500 >>> net_rx_action+0x854/0xc90 >>> __do_softirq+0x1bb/0x5de >>> do_softirq+0xcb/0x100 >>> </IRQ> >>> <TASK> >>> ... >>> </TASK> >>> >>> Allocated by task 102371: >>> kasan_save_stack+0x1e/0x40 >>> kasan_set_track+0x21/0x30 >>> __kasan_kmalloc+0x7b/0x90 >>> svc_setup_socket+0x52/0x4f0 [sunrpc] >>> svc_addsock+0x20d/0x400 [sunrpc] >>> __write_ports_addfd+0x209/0x390 [nfsd] >>> write_ports+0x239/0x2c0 [nfsd] >>> nfsctl_transaction_write+0xac/0x110 [nfsd] >>> vfs_write+0x1c3/0xae0 >>> ksys_write+0xed/0x1c0 >>> do_syscall_64+0x38/0x90 >>> entry_SYSCALL_64_after_hwframe+0x72/0xdc >>> >>> Freed by task 102551: >>> kasan_save_stack+0x1e/0x40 >>> kasan_set_track+0x21/0x30 >>> kasan_save_free_info+0x2a/0x50 >>> __kasan_slab_free+0x106/0x190 >>> __kmem_cache_free+0x133/0x270 >>> svc_xprt_free+0x1e2/0x350 [sunrpc] >>> svc_xprt_destroy_all+0x25a/0x440 [sunrpc] >>> nfsd_put+0x125/0x240 [nfsd] >>> nfsd_svc+0x2cb/0x3c0 [nfsd] >>> write_threads+0x1ac/0x2a0 [nfsd] >>> nfsctl_transaction_write+0xac/0x110 [nfsd] >>> vfs_write+0x1c3/0xae0 >>> ksys_write+0xed/0x1c0 >>> do_syscall_64+0x38/0x90 >>> entry_SYSCALL_64_after_hwframe+0x72/0xdc >>> >>> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready() >>> if state != TCP_LISTEN, that will avoid dereferencing svsk for all >>> child socket. >>> >>> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/ >>> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding") >>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> >>> Cc: <stable@vger.kernel.org> >>> --- >>> net/sunrpc/svcsock.c | 23 +++++++++++------------ >>> 1 file changed, 11 insertions(+), 12 deletions(-) >>> >>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c >>> index a51c9b989d58..9aca6e1e78e4 100644 >>> --- a/net/sunrpc/svcsock.c >>> +++ b/net/sunrpc/svcsock.c >>> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk) >>> >>> trace_sk_data_ready(sk); >>> >>> - if (svsk) { >>> - /* Refer to svc_setup_socket() for details. */ >>> - rmb(); >>> - svsk->sk_odata(sk); >>> - } >>> - >>> /* >>> * This callback may called twice when a new connection >>> * is established as a child socket inherits everything >>> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk) >>> * when one of child sockets become ESTABLISHED. >>> * 2) data_ready method of the child socket may be called >>> * when it receives data before the socket is accepted. >>> - * In case of 2, we should ignore it silently. >>> + * In case of 2, we should ignore it silently and DO NOT >>> + * dereference svsk. >>> */ >>> - if (sk->sk_state == TCP_LISTEN) { >>> - if (svsk) { >>> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); >>> - svc_xprt_enqueue(&svsk->sk_xprt); >>> - } >>> + if (sk->sk_state != TCP_LISTEN) >>> + return; >>> + >>> + if (svsk) { >>> + /* Refer to svc_setup_socket() for details. */ >>> + rmb(); >>> + svsk->sk_odata(sk); >>> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); >>> + svc_xprt_enqueue(&svsk->sk_xprt); >>> } >>> } >>> >>> -- >>> 2.17.1 >>> >> -- >> Chuck Lever > > -- > Thanks, > -dinghui -- Chuck Lever
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index a51c9b989d58..9aca6e1e78e4 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk) trace_sk_data_ready(sk); - if (svsk) { - /* Refer to svc_setup_socket() for details. */ - rmb(); - svsk->sk_odata(sk); - } - /* * This callback may called twice when a new connection * is established as a child socket inherits everything @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk) * when one of child sockets become ESTABLISHED. * 2) data_ready method of the child socket may be called * when it receives data before the socket is accepted. - * In case of 2, we should ignore it silently. + * In case of 2, we should ignore it silently and DO NOT + * dereference svsk. */ - if (sk->sk_state == TCP_LISTEN) { - if (svsk) { - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); - svc_xprt_enqueue(&svsk->sk_xprt); - } + if (sk->sk_state != TCP_LISTEN) + return; + + if (svsk) { + /* Refer to svc_setup_socket() for details. */ + rmb(); + svsk->sk_odata(sk); + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); + svc_xprt_enqueue(&svsk->sk_xprt); } }