SUNRPC: Fix UAF in svc_tcp_listen_data_ready()

Message ID 20230515021307.3072-1-dinghui@sangfor.com.cn
State New
Headers
Series SUNRPC: Fix UAF in svc_tcp_listen_data_ready() |

Commit Message

Ding Hui May 15, 2023, 2:13 a.m. UTC
  After the listener svc_sock be freed, and before invoking svc_tcp_accept()
for the established child sock, there is a window that the newsock
retaining a freed listener svc_sock in sk_user_data which cloning from
parent. In the race windows if data is received on the newsock, we will
observe use-after-free report in svc_tcp_listen_data_ready().

Reproduce by two tasks:

1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done

KASAN report:

  ==================================================================
  BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
  Read of size 8 at addr ffff888139d96228 by task nc/102553
  CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
  Call Trace:
   <IRQ>
   dump_stack_lvl+0x33/0x50
   print_address_description.constprop.0+0x27/0x310
   print_report+0x3e/0x70
   kasan_report+0xae/0xe0
   svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
   tcp_data_queue+0x9f4/0x20e0
   tcp_rcv_established+0x666/0x1f60
   tcp_v4_do_rcv+0x51c/0x850
   tcp_v4_rcv+0x23fc/0x2e80
   ip_protocol_deliver_rcu+0x62/0x300
   ip_local_deliver_finish+0x267/0x350
   ip_local_deliver+0x18b/0x2d0
   ip_rcv+0x2fb/0x370
   __netif_receive_skb_one_core+0x166/0x1b0
   process_backlog+0x24c/0x5e0
   __napi_poll+0xa2/0x500
   net_rx_action+0x854/0xc90
   __do_softirq+0x1bb/0x5de
   do_softirq+0xcb/0x100
   </IRQ>
   <TASK>
   ...
   </TASK>

  Allocated by task 102371:
   kasan_save_stack+0x1e/0x40
   kasan_set_track+0x21/0x30
   __kasan_kmalloc+0x7b/0x90
   svc_setup_socket+0x52/0x4f0 [sunrpc]
   svc_addsock+0x20d/0x400 [sunrpc]
   __write_ports_addfd+0x209/0x390 [nfsd]
   write_ports+0x239/0x2c0 [nfsd]
   nfsctl_transaction_write+0xac/0x110 [nfsd]
   vfs_write+0x1c3/0xae0
   ksys_write+0xed/0x1c0
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

  Freed by task 102551:
   kasan_save_stack+0x1e/0x40
   kasan_set_track+0x21/0x30
   kasan_save_free_info+0x2a/0x50
   __kasan_slab_free+0x106/0x190
   __kmem_cache_free+0x133/0x270
   svc_xprt_free+0x1e2/0x350 [sunrpc]
   svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
   nfsd_put+0x125/0x240 [nfsd]
   nfsd_svc+0x2cb/0x3c0 [nfsd]
   write_threads+0x1ac/0x2a0 [nfsd]
   nfsctl_transaction_write+0xac/0x110 [nfsd]
   vfs_write+0x1c3/0xae0
   ksys_write+0xed/0x1c0
   do_syscall_64+0x38/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
if state != TCP_LISTEN, that will avoid dereferencing svsk for all
child socket.

Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Cc: <stable@vger.kernel.org>
---
 net/sunrpc/svcsock.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)
  

Comments

Chuck Lever May 15, 2023, 12:01 p.m. UTC | #1
> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote:
> 
> After the listener svc_sock be freed, and before invoking svc_tcp_accept()
> for the established child sock, there is a window that the newsock
> retaining a freed listener svc_sock in sk_user_data which cloning from
> parent.

Thank you, I will apply this (after testing it).

The next step is to figure out why SUNRPC is trying to accept
on a dead listener. Any thoughts about that?


> In the race windows if data is received on the newsock, we will
> observe use-after-free report in svc_tcp_listen_data_ready().
> 
> Reproduce by two tasks:
> 
> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done

I will continue attempting to reproduce, as I would like a
root cause for this issue.


> KASAN report:
> 
>  ==================================================================
>  BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>  Read of size 8 at addr ffff888139d96228 by task nc/102553
>  CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
>  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
>  Call Trace:
>   <IRQ>
>   dump_stack_lvl+0x33/0x50
>   print_address_description.constprop.0+0x27/0x310
>   print_report+0x3e/0x70
>   kasan_report+0xae/0xe0
>   svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>   tcp_data_queue+0x9f4/0x20e0
>   tcp_rcv_established+0x666/0x1f60
>   tcp_v4_do_rcv+0x51c/0x850
>   tcp_v4_rcv+0x23fc/0x2e80
>   ip_protocol_deliver_rcu+0x62/0x300
>   ip_local_deliver_finish+0x267/0x350
>   ip_local_deliver+0x18b/0x2d0
>   ip_rcv+0x2fb/0x370
>   __netif_receive_skb_one_core+0x166/0x1b0
>   process_backlog+0x24c/0x5e0
>   __napi_poll+0xa2/0x500
>   net_rx_action+0x854/0xc90
>   __do_softirq+0x1bb/0x5de
>   do_softirq+0xcb/0x100
>   </IRQ>
>   <TASK>
>   ...
>   </TASK>
> 
>  Allocated by task 102371:
>   kasan_save_stack+0x1e/0x40
>   kasan_set_track+0x21/0x30
>   __kasan_kmalloc+0x7b/0x90
>   svc_setup_socket+0x52/0x4f0 [sunrpc]
>   svc_addsock+0x20d/0x400 [sunrpc]
>   __write_ports_addfd+0x209/0x390 [nfsd]
>   write_ports+0x239/0x2c0 [nfsd]
>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>   vfs_write+0x1c3/0xae0
>   ksys_write+0xed/0x1c0
>   do_syscall_64+0x38/0x90
>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
>  Freed by task 102551:
>   kasan_save_stack+0x1e/0x40
>   kasan_set_track+0x21/0x30
>   kasan_save_free_info+0x2a/0x50
>   __kasan_slab_free+0x106/0x190
>   __kmem_cache_free+0x133/0x270
>   svc_xprt_free+0x1e2/0x350 [sunrpc]
>   svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
>   nfsd_put+0x125/0x240 [nfsd]
>   nfsd_svc+0x2cb/0x3c0 [nfsd]
>   write_threads+0x1ac/0x2a0 [nfsd]
>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>   vfs_write+0x1c3/0xae0
>   ksys_write+0xed/0x1c0
>   do_syscall_64+0x38/0x90
>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
> if state != TCP_LISTEN, that will avoid dereferencing svsk for all
> child socket.
> 
> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> Cc: <stable@vger.kernel.org>
> ---
> net/sunrpc/svcsock.c | 23 +++++++++++------------
> 1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index a51c9b989d58..9aca6e1e78e4 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
> 
> trace_sk_data_ready(sk);
> 
> - if (svsk) {
> - /* Refer to svc_setup_socket() for details. */
> - rmb();
> - svsk->sk_odata(sk);
> - }
> -
> /*
> * This callback may called twice when a new connection
> * is established as a child socket inherits everything
> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
> *    when one of child sockets become ESTABLISHED.
> * 2) data_ready method of the child socket may be called
> *    when it receives data before the socket is accepted.
> - * In case of 2, we should ignore it silently.
> + * In case of 2, we should ignore it silently and DO NOT
> + * dereference svsk.
> */
> - if (sk->sk_state == TCP_LISTEN) {
> - if (svsk) {
> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
> - svc_xprt_enqueue(&svsk->sk_xprt);
> - }
> + if (sk->sk_state != TCP_LISTEN)
> + return;
> +
> + if (svsk) {
> + /* Refer to svc_setup_socket() for details. */
> + rmb();
> + svsk->sk_odata(sk);
> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
> + svc_xprt_enqueue(&svsk->sk_xprt);
> }
> }
> 
> -- 
> 2.17.1
> 

--
Chuck Lever
  
Ding Hui May 15, 2023, 2:55 p.m. UTC | #2
On 2023/5/15 20:01, Chuck Lever III wrote:
> 
> 
>> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote:
>>
>> After the listener svc_sock be freed, and before invoking svc_tcp_accept()
>> for the established child sock, there is a window that the newsock
>> retaining a freed listener svc_sock in sk_user_data which cloning from
>> parent.
> 
> Thank you, I will apply this (after testing it).
> 
> The next step is to figure out why SUNRPC is trying to accept
> on a dead listener. Any thoughts about that?
> 

A child sock is cloned from the listener sock, it inherits sk_data_ready
and sk_user_data from its parent sock, which is svc_tcp_listen_data_ready()
and listener svc_sock, the sk_state of the child becomes ESTABLISHED once
after TCP handshake in protocol stack.

Case 1:

listener sock      | child sock            |   nfsd thread
=>sk_data_ready    | =>sk_data_ready       |
-------------------+-----------------------+----------------------
svc_tcp_listen_data_ready
   svsk is listener svc_sock
   set_bit(XPT_CONN)
                                              svc_recv
                                                svc_tcp_accept(listener)
                                                  kernel_accept get child sock as newsock
                                                  svc_setup_socket(newsock)
                                                    newsock->sk_data_ready=svc_data_ready
                                                    newsock->sk_user_data=newsvsk
                     svc_data_ready
                       svsk is newsvsk


Case 2:

listener sock      | child sock            |   nfsd thread
=>sk_data_ready    | =>sk_data_ready       |
-------------------+-----------------------+----------------------
svc_tcp_listen_data_ready
   svsk is listener svc_sock
   set_bit(XPT_CONN)
                     svc_tcp_listen_data_ready
                       svsk is listener svc_sock
                                              svc_recv
                                                svc_tcp_accept(listener)
                                                  kernel_accept get the child sock as newsock
                                                  svc_setup_socket(newsock)
                                                    newsock->sk_data_ready=svc_data_ready
                                                    newsock->sk_user_data=newsvsk
                     svc_data_ready
                       svsk is newsvsk


The UAF case:

listener sock      | child sock            |   rpc.nfsd 0
=>sk_data_ready    | =>sk_data_ready       |
-------------------+-----------------------+----------------------
svc_tcp_listen_data_ready
   svsk is listener svc_sock
   set_bit(XPT_CONN)
                                             svc_xprt_destroy_all
                                               svc_xprt_free
                                                 kfree listener svc_sock
                                             // the child sock has not yet been accepted,
                                             // so it is not managed by SUNRPC for now.
                     svc_tcp_listen_data_ready
                       svsk is listener svc_sock
                       svsk->sk_odata // UAF!

> 
>> In the race windows if data is received on the newsock, we will
>> observe use-after-free report in svc_tcp_listen_data_ready().
>>
>> Reproduce by two tasks:
>>
>> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
>> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done
> 
> I will continue attempting to reproduce, as I would like a
> root cause for this issue.
> 
> 
>> KASAN report:
>>
>>   ==================================================================
>>   BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>   Read of size 8 at addr ffff888139d96228 by task nc/102553
>>   CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
>>   Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
>>   Call Trace:
>>    <IRQ>
>>    dump_stack_lvl+0x33/0x50
>>    print_address_description.constprop.0+0x27/0x310
>>    print_report+0x3e/0x70
>>    kasan_report+0xae/0xe0
>>    svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>    tcp_data_queue+0x9f4/0x20e0
>>    tcp_rcv_established+0x666/0x1f60
>>    tcp_v4_do_rcv+0x51c/0x850
>>    tcp_v4_rcv+0x23fc/0x2e80
>>    ip_protocol_deliver_rcu+0x62/0x300
>>    ip_local_deliver_finish+0x267/0x350
>>    ip_local_deliver+0x18b/0x2d0
>>    ip_rcv+0x2fb/0x370
>>    __netif_receive_skb_one_core+0x166/0x1b0
>>    process_backlog+0x24c/0x5e0
>>    __napi_poll+0xa2/0x500
>>    net_rx_action+0x854/0xc90
>>    __do_softirq+0x1bb/0x5de
>>    do_softirq+0xcb/0x100
>>    </IRQ>
>>    <TASK>
>>    ...
>>    </TASK>
>>
>>   Allocated by task 102371:
>>    kasan_save_stack+0x1e/0x40
>>    kasan_set_track+0x21/0x30
>>    __kasan_kmalloc+0x7b/0x90
>>    svc_setup_socket+0x52/0x4f0 [sunrpc]
>>    svc_addsock+0x20d/0x400 [sunrpc]
>>    __write_ports_addfd+0x209/0x390 [nfsd]
>>    write_ports+0x239/0x2c0 [nfsd]
>>    nfsctl_transaction_write+0xac/0x110 [nfsd]
>>    vfs_write+0x1c3/0xae0
>>    ksys_write+0xed/0x1c0
>>    do_syscall_64+0x38/0x90
>>    entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>
>>   Freed by task 102551:
>>    kasan_save_stack+0x1e/0x40
>>    kasan_set_track+0x21/0x30
>>    kasan_save_free_info+0x2a/0x50
>>    __kasan_slab_free+0x106/0x190
>>    __kmem_cache_free+0x133/0x270
>>    svc_xprt_free+0x1e2/0x350 [sunrpc]
>>    svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
>>    nfsd_put+0x125/0x240 [nfsd]
>>    nfsd_svc+0x2cb/0x3c0 [nfsd]
>>    write_threads+0x1ac/0x2a0 [nfsd]
>>    nfsctl_transaction_write+0xac/0x110 [nfsd]
>>    vfs_write+0x1c3/0xae0
>>    ksys_write+0xed/0x1c0
>>    do_syscall_64+0x38/0x90
>>    entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>
>> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
>> if state != TCP_LISTEN, that will avoid dereferencing svsk for all
>> child socket.
>>
>> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
>> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
>> Cc: <stable@vger.kernel.org>
>> ---
>> net/sunrpc/svcsock.c | 23 +++++++++++------------
>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>
>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>> index a51c9b989d58..9aca6e1e78e4 100644
>> --- a/net/sunrpc/svcsock.c
>> +++ b/net/sunrpc/svcsock.c
>> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>>
>> trace_sk_data_ready(sk);
>>
>> - if (svsk) {
>> - /* Refer to svc_setup_socket() for details. */
>> - rmb();
>> - svsk->sk_odata(sk);
>> - }
>> -
>> /*
>> * This callback may called twice when a new connection
>> * is established as a child socket inherits everything
>> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>> *    when one of child sockets become ESTABLISHED.
>> * 2) data_ready method of the child socket may be called
>> *    when it receives data before the socket is accepted.
>> - * In case of 2, we should ignore it silently.
>> + * In case of 2, we should ignore it silently and DO NOT
>> + * dereference svsk.
>> */
>> - if (sk->sk_state == TCP_LISTEN) {
>> - if (svsk) {
>> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>> - svc_xprt_enqueue(&svsk->sk_xprt);
>> - }
>> + if (sk->sk_state != TCP_LISTEN)
>> + return;
>> +
>> + if (svsk) {
>> + /* Refer to svc_setup_socket() for details. */
>> + rmb();
>> + svsk->sk_odata(sk);
>> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>> + svc_xprt_enqueue(&svsk->sk_xprt);
>> }
>> }
>>
>> -- 
>> 2.17.1
>>
> 
> --
> Chuck Lever
> 
> 
>
  
Chuck Lever May 19, 2023, 2:54 p.m. UTC | #3
> On May 15, 2023, at 10:55 AM, Ding Hui <dinghui@sangfor.com.cn> wrote:
> 
> On 2023/5/15 20:01, Chuck Lever III wrote:
>>> On May 14, 2023, at 10:13 PM, Ding Hui <dinghui@sangfor.com.cn> wrote:
>>> 
>>> After the listener svc_sock be freed, and before invoking svc_tcp_accept()
>>> for the established child sock, there is a window that the newsock
>>> retaining a freed listener svc_sock in sk_user_data which cloning from
>>> parent.
>> Thank you, I will apply this (after testing it).
>> The next step is to figure out why SUNRPC is trying to accept
>> on a dead listener. Any thoughts about that?
> 
> A child sock is cloned from the listener sock, it inherits sk_data_ready
> and sk_user_data from its parent sock, which is svc_tcp_listen_data_ready()
> and listener svc_sock, the sk_state of the child becomes ESTABLISHED once
> after TCP handshake in protocol stack.
> 
> Case 1:
> 
> listener sock      | child sock            |   nfsd thread
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                                             svc_recv
>                                               svc_tcp_accept(listener)
>                                                 kernel_accept get child sock as newsock
>                                                 svc_setup_socket(newsock)
>                                                   newsock->sk_data_ready=svc_data_ready
>                                                   newsock->sk_user_data=newsvsk
>                    svc_data_ready
>                      svsk is newsvsk
> 
> 
> Case 2:
> 
> listener sock      | child sock            |   nfsd thread
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                    svc_tcp_listen_data_ready
>                      svsk is listener svc_sock
>                                             svc_recv
>                                               svc_tcp_accept(listener)
>                                                 kernel_accept get the child sock as newsock
>                                                 svc_setup_socket(newsock)
>                                                   newsock->sk_data_ready=svc_data_ready
>                                                   newsock->sk_user_data=newsvsk
>                    svc_data_ready
>                      svsk is newsvsk
> 
> 
> The UAF case:
> 
> listener sock      | child sock            |   rpc.nfsd 0
> =>sk_data_ready    | =>sk_data_ready       |
> -------------------+-----------------------+----------------------
> svc_tcp_listen_data_ready
>  svsk is listener svc_sock
>  set_bit(XPT_CONN)
>                                            svc_xprt_destroy_all
>                                              svc_xprt_free
>                                                kfree listener svc_sock
>                                            // the child sock has not yet been accepted,
>                                            // so it is not managed by SUNRPC for now.
>                    svc_tcp_listen_data_ready
>                      svsk is listener svc_sock
>                      svsk->sk_odata // UAF!

Thanks for the ladder diagrams, that helps.


>>> In the race windows if data is received on the newsock, we will
>>> observe use-after-free report in svc_tcp_listen_data_ready().
>>> 
>>> Reproduce by two tasks:
>>> 
>>> 1. while :; do rpc.nfsd 0 ; rpc.nfsd; done
>>> 2. while :; do echo "" | ncat -4 127.0.0.1 2049 ; done
>> I will continue attempting to reproduce, as I would like a
>> root cause for this issue.

svc_xprt shutdown seems to be unordered. I would think that
it should unregister with the portmapper, close the listener
so no new connections can be established, then close the
children last. That doesn't appear to be how it works, though.

But if the listener happens to be closed first, that frees
the svc_sock that might be relied upon by children sockets.
The shutdown ordering seems to be the crux of the problem,
and pinning the svc_xprt won't help, as you pointed out.

Making the children not rely on the listener does address
the crash, but doesn't fix the design. But the design
problem doesn't seem to be urgent. So I'm going to file a
low priority bug to document the design issue.


>>> KASAN report:
>>> 
>>>  ==================================================================
>>>  BUG: KASAN: slab-use-after-free in svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>>  Read of size 8 at addr ffff888139d96228 by task nc/102553
>>>  CPU: 7 PID: 102553 Comm: nc Not tainted 6.3.0+ #18
>>>  Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
>>>  Call Trace:
>>>   <IRQ>
>>>   dump_stack_lvl+0x33/0x50
>>>   print_address_description.constprop.0+0x27/0x310
>>>   print_report+0x3e/0x70
>>>   kasan_report+0xae/0xe0
>>>   svc_tcp_listen_data_ready+0x1cf/0x1f0 [sunrpc]
>>>   tcp_data_queue+0x9f4/0x20e0
>>>   tcp_rcv_established+0x666/0x1f60
>>>   tcp_v4_do_rcv+0x51c/0x850
>>>   tcp_v4_rcv+0x23fc/0x2e80
>>>   ip_protocol_deliver_rcu+0x62/0x300
>>>   ip_local_deliver_finish+0x267/0x350
>>>   ip_local_deliver+0x18b/0x2d0
>>>   ip_rcv+0x2fb/0x370
>>>   __netif_receive_skb_one_core+0x166/0x1b0
>>>   process_backlog+0x24c/0x5e0
>>>   __napi_poll+0xa2/0x500
>>>   net_rx_action+0x854/0xc90
>>>   __do_softirq+0x1bb/0x5de
>>>   do_softirq+0xcb/0x100
>>>   </IRQ>
>>>   <TASK>
>>>   ...
>>>   </TASK>
>>> 
>>>  Allocated by task 102371:
>>>   kasan_save_stack+0x1e/0x40
>>>   kasan_set_track+0x21/0x30
>>>   __kasan_kmalloc+0x7b/0x90
>>>   svc_setup_socket+0x52/0x4f0 [sunrpc]
>>>   svc_addsock+0x20d/0x400 [sunrpc]
>>>   __write_ports_addfd+0x209/0x390 [nfsd]
>>>   write_ports+0x239/0x2c0 [nfsd]
>>>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>>>   vfs_write+0x1c3/0xae0
>>>   ksys_write+0xed/0x1c0
>>>   do_syscall_64+0x38/0x90
>>>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>> 
>>>  Freed by task 102551:
>>>   kasan_save_stack+0x1e/0x40
>>>   kasan_set_track+0x21/0x30
>>>   kasan_save_free_info+0x2a/0x50
>>>   __kasan_slab_free+0x106/0x190
>>>   __kmem_cache_free+0x133/0x270
>>>   svc_xprt_free+0x1e2/0x350 [sunrpc]
>>>   svc_xprt_destroy_all+0x25a/0x440 [sunrpc]
>>>   nfsd_put+0x125/0x240 [nfsd]
>>>   nfsd_svc+0x2cb/0x3c0 [nfsd]
>>>   write_threads+0x1ac/0x2a0 [nfsd]
>>>   nfsctl_transaction_write+0xac/0x110 [nfsd]
>>>   vfs_write+0x1c3/0xae0
>>>   ksys_write+0xed/0x1c0
>>>   do_syscall_64+0x38/0x90
>>>   entry_SYSCALL_64_after_hwframe+0x72/0xdc
>>> 
>>> Fix the UAF by simply doing nothing in svc_tcp_listen_data_ready()
>>> if state != TCP_LISTEN, that will avoid dereferencing svsk for all
>>> child socket.
>>> 
>>> Link: https://lore.kernel.org/lkml/20230507091131.23540-1-dinghui@sangfor.com.cn/
>>> Fixes: fa9251afc33c ("SUNRPC: Call the default socket callbacks instead of open coding")
>>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
>>> Cc: <stable@vger.kernel.org>
>>> ---
>>> net/sunrpc/svcsock.c | 23 +++++++++++------------
>>> 1 file changed, 11 insertions(+), 12 deletions(-)
>>> 
>>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>>> index a51c9b989d58..9aca6e1e78e4 100644
>>> --- a/net/sunrpc/svcsock.c
>>> +++ b/net/sunrpc/svcsock.c
>>> @@ -825,12 +825,6 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>>> 
>>> trace_sk_data_ready(sk);
>>> 
>>> - if (svsk) {
>>> - /* Refer to svc_setup_socket() for details. */
>>> - rmb();
>>> - svsk->sk_odata(sk);
>>> - }
>>> -
>>> /*
>>> * This callback may called twice when a new connection
>>> * is established as a child socket inherits everything
>>> @@ -839,13 +833,18 @@ static void svc_tcp_listen_data_ready(struct sock *sk)
>>> *    when one of child sockets become ESTABLISHED.
>>> * 2) data_ready method of the child socket may be called
>>> *    when it receives data before the socket is accepted.
>>> - * In case of 2, we should ignore it silently.
>>> + * In case of 2, we should ignore it silently and DO NOT
>>> + * dereference svsk.
>>> */
>>> - if (sk->sk_state == TCP_LISTEN) {
>>> - if (svsk) {
>>> - set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>>> - svc_xprt_enqueue(&svsk->sk_xprt);
>>> - }
>>> + if (sk->sk_state != TCP_LISTEN)
>>> + return;
>>> +
>>> + if (svsk) {
>>> + /* Refer to svc_setup_socket() for details. */
>>> + rmb();
>>> + svsk->sk_odata(sk);
>>> + set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
>>> + svc_xprt_enqueue(&svsk->sk_xprt);
>>> }
>>> }
>>> 
>>> -- 
>>> 2.17.1
>>> 
>> --
>> Chuck Lever
> 
> -- 
> Thanks,
> -dinghui


--
Chuck Lever
  

Patch

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a51c9b989d58..9aca6e1e78e4 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -825,12 +825,6 @@  static void svc_tcp_listen_data_ready(struct sock *sk)
 
 	trace_sk_data_ready(sk);
 
-	if (svsk) {
-		/* Refer to svc_setup_socket() for details. */
-		rmb();
-		svsk->sk_odata(sk);
-	}
-
 	/*
 	 * This callback may called twice when a new connection
 	 * is established as a child socket inherits everything
@@ -839,13 +833,18 @@  static void svc_tcp_listen_data_ready(struct sock *sk)
 	 *    when one of child sockets become ESTABLISHED.
 	 * 2) data_ready method of the child socket may be called
 	 *    when it receives data before the socket is accepted.
-	 * In case of 2, we should ignore it silently.
+	 * In case of 2, we should ignore it silently and DO NOT
+	 * dereference svsk.
 	 */
-	if (sk->sk_state == TCP_LISTEN) {
-		if (svsk) {
-			set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
-			svc_xprt_enqueue(&svsk->sk_xprt);
-		}
+	if (sk->sk_state != TCP_LISTEN)
+		return;
+
+	if (svsk) {
+		/* Refer to svc_setup_socket() for details. */
+		rmb();
+		svsk->sk_odata(sk);
+		set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags);
+		svc_xprt_enqueue(&svsk->sk_xprt);
 	}
 }