From patchwork Thu Dec 21 03:08:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Ahelenia Ziemiańska
X-Patchwork-Id: 18397
Date: Thu, 21 Dec 2023 04:08:41 +0100
From: Ahelenia Ziemiańska
Cc: Jens Axboe, Christian Brauner, Alexander Viro,
 linux-fsdevel@vger.kernel.org, Greg Kroah-Hartman, Jiri Slaby,
 Miklos Szeredi, Vivek Goyal, Stefan Hajnoczi, Eric Dumazet,
 "David S. Miller", David Ahern, Jakub Kicinski, Paolo Abeni,
 Wenjia Zhang, Jan Karcher, "D.
 Wythe", Tony Lu, Wen Gu, Boris Pismenny, John Fastabend,
 David Howells, Shigeru Yoshida, Peilin Ye, Kuniyuki Iwashima,
 Alexander Mikhalitsyn, Daan De Meyer, linux-kernel@vger.kernel.org,
 linux-serial@vger.kernel.org, virtualization@lists.linux.dev,
 netdev@vger.kernel.org, linux-s390@vger.kernel.org, Alejandro Colomar,
 linux-man@vger.kernel.org
Subject: [PATCH v2 00/11] Avoid unprivileged splice(file->)/(->socket) pipe exclusion
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: NeoMutt/20231103-116-3b855e-dirty

Hi!

As it stands, splice(file -> pipe):
1. locks the pipe,
2. does a read from the file,
3. unlocks the pipe.

When the file resides on a normal filesystem, this isn't an issue
because the filesystem has been defined as trusted by root having
mounted it.

But when the file is actually IPC (FUSE) or is just IPC (sockets)
or is a tty, this means that the pipe lock will be held for an
attacker-controlled length of time, and during that time every
process trying to read from, write to, open, or close the pipe
enters an uninterruptible sleep, and will only exit it if the
splicing process is killed.

This trivially denies service to:
* any hypothetical pipe-based log collection system
* all nullmailer installations
* me, personally, when I'm pasting stuff into qemu -serial chardev:pipe

A symmetric situation happens for splice(pipe -> socket):
the pipe is locked for as long as the socket is full.

This follows:
1. https://lore.kernel.org/linux-fsdevel/qk6hjuam54khlaikf2ssom6custxf5is2ekkaequf4hvode3ls@zgf7j5j4ubvw/t/#u
2. a security@ thread rooted in
3. https://nabijaczleweli.xyz/content/blogn_t/011-linux-splice-exclusion.html
4. https://lore.kernel.org/lkml/cover.1697486714.git.nabijaczleweli@nabijaczleweli.xyz/t/#u (v1)
   https://lore.kernel.org/lkml/1cover.1697486714.git.nabijaczleweli@nabijaczleweli.xyz/t/#u (resend)
   https://lore.kernel.org/lkml/2cover.1697486714.git.nabijaczleweli@nabijaczleweli.xyz/t/#u (reresend)
5. https://lore.kernel.org/lkml/dtexwpw6zcdx7dkx3xj5gyjp5syxmyretdcbcdtvrnukd4vvuh@tarta.nabijaczleweli.xyz/t/#u
   (relay_file_splice_read removal)

1-7/11 request MSG_DONTWAIT (sockets)/IOCB_NOWAIT (generic) on the read
8/11 removes splice_read from tty completely
9/11 removes splice_read from FUSE filesystems
     (except virtiofs which has normal mounting security semantics,
      but is handled via FUSE code)
10/11 allows splice_read from FUSE filesystems mounted by real root
      (this matches the blessing received by non-FUSE network filesystems)
11/11 requests MSG_DONTWAIT for splice(pipe -> socket).
12/11 has the man-pages patch with draft wording.

All but 5/11 (AF_SMC) have been tested and embed shell programs to
repro them. AIUI I'd need an s390 machine for it? It's trivial.

6/11 (AF_KCM) also fixes kcm_splice_read() passing SPLICE_F_*-style
flags to skb_recv_datagram(), which takes MSG_*-style flags. I don't
think they did anything anyway? But.

There are two implementations that definitely sleep all the time
and I didn't touch them:
  tracing_splice_read_pipe
  tracing_buffers_splice_read (dropped in v2, v1 4/11)
the semantics are lost on me, but they're in debugfs/tracefs, so it
doesn't matter if they block so long as they work, and presumably they
do right now.

There is also relay_file_splice_read (dropped in v2, v1 5/11),
which isn't an implementation at all because it's dead code, broken,
and removed in -mm.

The diffs in 1-7,11/11 are unchanged, save for a rebase in 7/11.
8/11 replaces the file type test in v1 10/11.
9/11 and 10/11 are new in v2.
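
For readers unfamiliar with the flag, here is a minimal userspace sketch
of what MSG_DONTWAIT means for a single read. It is not code from the
series and does not touch splice(); the helper name
empty_socket_read_returns_eagain() is made up for illustration. It only
shows that a read with MSG_DONTWAIT on a socket with nothing queued
fails right away with EAGAIN instead of sleeping, which is the
behaviour 1-7/11 ask for while the pipe lock is held.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the series: performs one
 * MSG_DONTWAIT read on an empty AF_UNIX stream socket.
 * Returns 1 if the read failed immediately with EAGAIN/EWOULDBLOCK,
 * 0 if it did anything else, -1 if the socket pair could not be created.
 */
int empty_socket_read_returns_eagain(void)
{
	int sv[2];
	char buf[16];
	ssize_t n;
	int saved_errno;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Nothing has been written to sv[1], so sv[0] has no data queued. */
	n = recv(sv[0], buf, sizeof(buf), MSG_DONTWAIT);
	saved_errno = errno;	/* close() below may overwrite errno */

	close(sv[0]);
	close(sv[1]);

	return n < 0 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK);
}
```

Without MSG_DONTWAIT (and without O_NONBLOCK on the descriptor) the same
recv() would sleep until the peer wrote something or closed its end; the
series is about not doing that sleep with the pipe locked.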

Ahelenia Ziemiańska (11):
  splice: copy_splice_read: do the I/O with IOCB_NOWAIT
  af_unix: unix_stream_splice_read: always request MSG_DONTWAIT
  fuse: fuse_dev_splice_read: use nonblocking I/O
  net/smc: smc_splice_read: always request MSG_DONTWAIT
  kcm: kcm_splice_read: always request MSG_DONTWAIT
  tls/sw: tls_sw_splice_read: always request non-blocking I/O
  net/tcp: tcp_splice_read: always do non-blocking reads
  tty: splice_read: disable
  fuse: file: limit splice_read to virtiofs
  fuse: allow splicing from filesystems mounted by real root
  splice: splice_to_socket: always request MSG_DONTWAIT

 drivers/tty/tty_io.c |  2 --
 fs/fuse/dev.c        | 10 ++++++----
 fs/fuse/file.c       | 17 ++++++++++++++++-
 fs/fuse/fuse_i.h     |  4 ++++
 fs/fuse/inode.c      |  2 ++
 fs/fuse/virtio_fs.c  |  1 +
 fs/splice.c          |  5 ++---
 net/ipv4/tcp.c       | 32 +++-----------------------------
 net/kcm/kcmsock.c    |  2 +-
 net/smc/af_smc.c     |  6 +-----
 net/tls/tls_sw.c     |  5 ++---
 net/unix/af_unix.c   |  5 +----
 12 files changed, 39 insertions(+), 52 deletions(-)

base-commit: 2cf4f94d8e8646803f8fb0facf134b0cd7fb691a
---
2.39.2