Message ID | 20231023233704.1185154-2-asmadeus@codewreck.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce89:0:b0:403:3b70:6f57 with SMTP id p9csp1614827vqx; Mon, 23 Oct 2023 16:37:53 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGJT+xMzJmTOUYfibskbTaBl657ZeFT41IkJf+TV9rOkqikrJjIdCn/cMng7hb1rFxyCsJY X-Received: by 2002:a05:6a20:8401:b0:174:210c:34b2 with SMTP id c1-20020a056a20840100b00174210c34b2mr1315326pzd.19.1698104273340; Mon, 23 Oct 2023 16:37:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698104273; cv=none; d=google.com; s=arc-20160816; b=r7xVRsw/updOddNbllcV7R2WTMdgTuH8oAufqsWQlvOyp/NiOBvzsUgyW7m3XDiOqN vghhtdsn6j0oECJ2fDvDF1k2jHTyycNW+AqPNCaswiKRzWkc2WNlqHhCWk+konMoPCqX Oa5fplkTDxhuh7VzXfo4YpJzhgclPzzyhbamqwM5/FySq+EoEwxjnFqcgx/F3+zwWdVH CSpeYsNFSGthY2EURfDj589llrdrJUd5Md41euU11ci+QW59lCJUjnhXvRAI8ske9d6m rbOoE3FfjMks5k6btAgl+sA7DsouS9gis9wvA4GpepgEYhAid0TyasYESmghoi+amvfG 3/zw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature:dkim-signature; bh=/A34CTskJseJi+dcqA1uEIZC2XG2pVw2hHNklQEakQo=; fh=J+dhwGGb5LpH4sU3wwfEIZbHyopUZv821EE0W0h63fM=; b=xaCjS+DYiwvlpfJO6uQ/fC1ry5LacUur0EslD1pxyFggTb8bnQCruCM7vTEZQIxW5L Zw+uVfclqd6II/e+4mGBZwWTptU2gviFUF5ltm/QszPyARj3J1vWlm3SW85VwR7y9GIE sOvhLS+8Pv4p6jPuiS+dgDFFuHni8JJ4ZHTKGFN2RCEX3K9AT0MSw1ZfT7onUrMscYsg mVVyn5fJDNcbi5wq6F2vovye2jRqrOIRWNg0Y4Xi4nAoZKhlnfX4YIxqV4J80LAX8hgl VF4hqYL+0nwn7PeozeUW4AqGXS+FIhlX1Ma3dZN8Uor9SYcm2kxYlzA/cfFSH4+UiYEP Ov9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@codewreck.org header.s=2 header.b=2DLrhRRq; dkim=pass header.i=@codewreck.org header.s=2 header.b=H9q202+p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=codewreck.org Received: from pete.vger.email (pete.vger.email. [2620:137:e000::3:6]) by mx.google.com with ESMTPS id gt4-20020a17090af2c400b0027901ee93fbsi9579176pjb.156.2023.10.23.16.37.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 23 Oct 2023 16:37:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) client-ip=2620:137:e000::3:6; Authentication-Results: mx.google.com; dkim=pass header.i=@codewreck.org header.s=2 header.b=2DLrhRRq; dkim=pass header.i=@codewreck.org header.s=2 header.b=H9q202+p; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:6 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=codewreck.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 9169F809C4D5; Mon, 23 Oct 2023 16:37:49 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229487AbjJWXhf (ORCPT <rfc822;aposhian.dev@gmail.com> + 27 others); Mon, 23 Oct 2023 19:37:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40098 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231336AbjJWXh3 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 23 Oct 2023 19:37:29 -0400 Received: from nautica.notk.org (ipv6.notk.org [IPv6:2001:41d0:1:7a93::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 14CE6F9 for <linux-kernel@vger.kernel.org>; Mon, 23 Oct 2023 16:37:26 -0700 (PDT) Received: by nautica.notk.org (Postfix, from userid 108) id 70795C021; Tue, 24 Oct 2023 01:37:23 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=codewreck.org; s=2; t=1698104243; bh=/A34CTskJseJi+dcqA1uEIZC2XG2pVw2hHNklQEakQo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=2DLrhRRq6njg7SLrk3WbzcM8N/46q79mgTGyaP+Y4UsZA9IAsvLg8UrMUkQDnhLai 47r5jW+s0LYnQ+kUlpNRGbHdIW6ce4qjFqAWaNZGQb8wrYsLUmyBopuThwZQ4tzVLY ns++BcHak+wS9q+Mw+8M8i0lDqOVcr1rjCP1v236lW9x5gQ5pXmEqrhizKiIRIVnCf 1fT9nYAtamv3TQrZsgIRRhpgBjbiRhU+8KC+8nRYDmzXYaKdh3pWzZnbaT6IwiWe8a 28u3H1rA0xiZGPPfskQ8D76BQll/+zcjFRXNn8lP5TZktmJhEOLFMr+ojPRiiNtwgj iVL1rbm5tnGmw== X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 Received: from gaia (localhost [127.0.0.1]) by nautica.notk.org (Postfix) with ESMTPS id B3BC4C009; Tue, 24 Oct 2023 01:37:18 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=codewreck.org; s=2; t=1698104242; bh=/A34CTskJseJi+dcqA1uEIZC2XG2pVw2hHNklQEakQo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=H9q202+phVyt45nk/teha6weJN6/png6bkYoZNW6ao1p6BRyO4FgYWMGIk6gSm4yv x+Rdsf2lO5NyDeiJocV63AiF77nMyntK2AXb0yjd4QgRJaPRTcnGAizwfR8OFCU3Fh XGmc27IUykFBEgB2AYeA47YCFFx3H4+HUc2gTcEebovIS63OVAshRNGQGfhUF9UsAq VZKjILf2nDgY/sL0cr0KcwiBiGWQPEdquVObzzsD2EHShTwkze9tF78G9Qeztdu6lN OpODxoG+xIiGbgNx9U7Slo+cW1IeCiPRfOKXnae6VxgHyg2KvuSZJ1ZlcmsdakVcxP 7AD0G8iwggf8Q== Received: from gaia.codewreck.org (localhost.lan [::1]) by gaia (OpenSMTPD) with ESMTP id 1bf2bc7c; Mon, 23 Oct 2023 23:37:15 +0000 (UTC) From: Dominique Martinet <asmadeus@codewreck.org> To: v9fs@lists.linux.dev, ericvh@kernel.org, linux_oss@crudebyte.com Cc: lucho@ionkov.net, linux-kernel@vger.kernel.org, Marco Elver <elver@google.com>, syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com, Dominique Martinet <asmadeus@codewreck.org> Subject: [PATCH 1/3] 9p: Annotate data-racy writes to file::f_flags on fd mount Date: Tue, 24 Oct 2023 08:37:02 +0900 Message-ID: <20231023233704.1185154-2-asmadeus@codewreck.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20231023233704.1185154-1-asmadeus@codewreck.org> References: <20231023233704.1185154-1-asmadeus@codewreck.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Mon, 23 Oct 2023 16:37:49 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780591386090532760 X-GMAIL-MSGID: 1780591386090532760 |
Series |
Small patches for 6.7
|
|
Commit Message
Dominique Martinet
Oct. 23, 2023, 11:37 p.m. UTC
From: Marco Elver <elver@google.com> syzbot reported: | BUG: KCSAN: data-race in p9_fd_create / p9_fd_create | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | value changed: 0x00008002 -> 0x00008802 Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and write files. This may happen concurrently if e.g. mounting process modifies the fd in another thread. Mark the plain read-modify-writes as intentional data-races, with the assumption that the result of executing the accesses concurrently will always result in the same result despite the accesses themselves not being atomic. Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com Signed-off-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> --- Hi Marco, sorry for taking ages to process this patch. I've reworded the commit message a bit and added a comment, so given this has your name on it please have a look. I'm planning to send this to Linus during the merge window next week net/9p/trans_fd.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
Comments
On Tue, 24 Oct 2023 at 01:37, Dominique Martinet <asmadeus@codewreck.org> wrote: > > From: Marco Elver <elver@google.com> > > syzbot reported: > > | BUG: KCSAN: data-race in p9_fd_create / p9_fd_create > | > | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0: > | p9_fd_open net/9p/trans_fd.c:842 [inline] > | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 > | p9_client_create+0x595/0xa70 net/9p/client.c:1010 > | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 > | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 > | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 > | vfs_get_tree+0x51/0x190 fs/super.c:1519 > | do_new_mount+0x203/0x660 fs/namespace.c:3335 > | path_mount+0x496/0xb30 fs/namespace.c:3662 > | do_mount fs/namespace.c:3675 [inline] > | __do_sys_mount fs/namespace.c:3884 [inline] > | [...] > | > | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1: > | p9_fd_open net/9p/trans_fd.c:842 [inline] > | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 > | p9_client_create+0x595/0xa70 net/9p/client.c:1010 > | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 > | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 > | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 > | vfs_get_tree+0x51/0x190 fs/super.c:1519 > | do_new_mount+0x203/0x660 fs/namespace.c:3335 > | path_mount+0x496/0xb30 fs/namespace.c:3662 > | do_mount fs/namespace.c:3675 [inline] > | __do_sys_mount fs/namespace.c:3884 [inline] > | [...] > | > | value changed: 0x00008002 -> 0x00008802 > > Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and > write files. This may happen concurrently if e.g. mounting process > modifies the fd in another thread. > > Mark the plain read-modify-writes as intentional data-races, with the > assumption that the result of executing the accesses concurrently will > always result in the same result despite the accesses themselves not > being atomic. > > Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com > Signed-off-by: Marco Elver <elver@google.com> > Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com > Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> > --- > > Hi Marco, sorry for taking ages to process this patch. I've reworded the > commit message a bit and added a comment, so given this has your name on > it please have a look. > I'm planning to send this to Linus during the merge window next week No worries, and thank you! > net/9p/trans_fd.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c > index f226953577b2..d89c88986950 100644 > --- a/net/9p/trans_fd.c > +++ b/net/9p/trans_fd.c > @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) > goto out_free_ts; > if (!(ts->rd->f_mode & FMODE_READ)) > goto out_put_rd; > - /* prevent workers from hanging on IO when fd is a pipe */ > - ts->rd->f_flags |= O_NONBLOCK; > + /* Prevent workers from hanging on IO when fd is a pipe Add '.' at end of sentence(s)? > + * We don't support userspace messing with the fd after passing it > + * to mount, so flag possible data race for KCSAN */ The comment should explain why the data race is safe, independent of KCSAN. I still don't quite get why it's safe. The case that syzbot found was 2 concurrent mount. Is that also disallowed? Maybe something like: "We don't support userspace messing with the fd after passing it to the first mount. While it's not officially supported, concurrent modification of flags is unlikely to break this code. So that tooling (like KCSAN) knows about it, mark them as intentional data races." > + data_race(ts->rd->f_flags |= O_NONBLOCK); > ts->wr = fget(wfd); > if (!ts->wr) > goto out_put_rd; > if (!(ts->wr->f_mode & FMODE_WRITE)) > goto out_put_wr; > - ts->wr->f_flags |= O_NONBLOCK; > + data_race(ts->wr->f_flags |= O_NONBLOCK); > > client->trans = ts; > client->status = Connected; > -- > 2.41.0 >
Marco Elver wrote on Tue, Oct 24, 2023 at 09:12:56AM +0200: > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c > > index f226953577b2..d89c88986950 100644 > > --- a/net/9p/trans_fd.c > > +++ b/net/9p/trans_fd.c > > @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) > > goto out_free_ts; > > if (!(ts->rd->f_mode & FMODE_READ)) > > goto out_put_rd; > > - /* prevent workers from hanging on IO when fd is a pipe */ > > - ts->rd->f_flags |= O_NONBLOCK; > > + /* Prevent workers from hanging on IO when fd is a pipe > > Add '.' at end of sentence(s)? I don't think we have any rule about this in the 9p part of the tree, looking around there seem to be more comments without '.' than with, but it's never too late to start... I'll add some in a v2 after we've agreed with the rest. > > > + * We don't support userspace messing with the fd after passing it > > + * to mount, so flag possible data race for KCSAN */ > > The comment should explain why the data race is safe, independent of > KCSAN. I still don't quite get why it's safe. I guess it depends on what we call 'safe' here: if we agree the worst thing that can happen is weird flags being set when we didn't request them and socket operations behaving oddly (of the level of block when they shouldn't), we don't care because there's no way to make concurrent usage of the fd work anyway. If it's possible to get an invalid value there such that a socket operation ends up executing user-controlled code somewhere, then we've got a bigger problem and we should take some lock (presumably the same lock fcntl(F_SETFD) is taking, as that's got more potential for breakage than another mount in my opinon) > The case that syzbot found was 2 concurrent mount. Is that also disallowed? Yes, there's no way you'll get a working filesystem out of two mounts using the same fd as the protocol has no muxing > Maybe something like: "We don't support userspace messing with the fd > after passing it to the first mount. While it's not officially > supported, concurrent modification of flags is unlikely to break this > code. So that tooling (like KCSAN) knows about it, mark them as > intentional data races." I'd word this as much likely to break, how about something like this? /* Prevent workers from hanging on IO when fd is a pipe. * It's technically possible for userspace or concurrent mounts to * modify this flag concurrently, which will likely result in a * broken filesystem. However, just having bad flags here should * not crash the kernel or cause any other sort of bug, so mark this * particular data race as intentional so that tooling (like KCSAN) * can allow it and detect further problems. */ Thanks,
On Tue, 24 Oct 2023 at 09:44, Dominique Martinet <asmadeus@codewreck.org> wrote: > > Marco Elver wrote on Tue, Oct 24, 2023 at 09:12:56AM +0200: > > > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c > > > index f226953577b2..d89c88986950 100644 > > > --- a/net/9p/trans_fd.c > > > +++ b/net/9p/trans_fd.c > > > @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) > > > goto out_free_ts; > > > if (!(ts->rd->f_mode & FMODE_READ)) > > > goto out_put_rd; > > > - /* prevent workers from hanging on IO when fd is a pipe */ > > > - ts->rd->f_flags |= O_NONBLOCK; > > > + /* Prevent workers from hanging on IO when fd is a pipe > > > > Add '.' at end of sentence(s)? > > I don't think we have any rule about this in the 9p part of the tree, > looking around there seem to be more comments without '.' than with, but > it's never too late to start... I'll add some in a v2 after we've agreed > with the rest. Sounds good. I think if there's 1 short sentence (1 line) comment, it's more or less optional. But I'd insist on punctuation as soon as there are 2 or more sentences. > > > > > + * We don't support userspace messing with the fd after passing it > > > + * to mount, so flag possible data race for KCSAN */ > > > > The comment should explain why the data race is safe, independent of > > KCSAN. I still don't quite get why it's safe. > > I guess it depends on what we call 'safe' here: if we agree the worst > thing that can happen is weird flags being set when we didn't request > them and socket operations behaving oddly (of the level of block when > they shouldn't), we don't care because there's no way to make concurrent > usage of the fd work anyway. Yes, that's reasonable. > If it's possible to get an invalid value there such that a socket > operation ends up executing user-controlled code somewhere, then we've > got a bigger problem and we should take some lock (presumably the same > lock fcntl(F_SETFD) is taking, as that's got more potential for breakage > than another mount in my opinon) > > > The case that syzbot found was 2 concurrent mount. Is that also disallowed? > > Yes, there's no way you'll get a working filesystem out of two mounts > using the same fd as the protocol has no muxing > > > Maybe something like: "We don't support userspace messing with the fd > > after passing it to the first mount. While it's not officially > > supported, concurrent modification of flags is unlikely to break this > > code. So that tooling (like KCSAN) knows about it, mark them as > > intentional data races." > > I'd word this as much likely to break, how about something like this? > > /* Prevent workers from hanging on IO when fd is a pipe. > * It's technically possible for userspace or concurrent mounts to > * modify this flag concurrently, which will likely result in a > * broken filesystem. However, just having bad flags here should > * not crash the kernel or cause any other sort of bug, so mark this > * particular data race as intentional so that tooling (like KCSAN) > * can allow it and detect further problems. > */ I think this sounds much clearer. Thanks!
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index f226953577b2..d89c88986950 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -836,14 +836,16 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) goto out_free_ts; if (!(ts->rd->f_mode & FMODE_READ)) goto out_put_rd; - /* prevent workers from hanging on IO when fd is a pipe */ - ts->rd->f_flags |= O_NONBLOCK; + /* Prevent workers from hanging on IO when fd is a pipe + * We don't support userspace messing with the fd after passing it + * to mount, so flag possible data race for KCSAN */ + data_race(ts->rd->f_flags |= O_NONBLOCK); ts->wr = fget(wfd); if (!ts->wr) goto out_put_rd; if (!(ts->wr->f_mode & FMODE_WRITE)) goto out_put_wr; - ts->wr->f_flags |= O_NONBLOCK; + data_race(ts->wr->f_flags |= O_NONBLOCK); client->trans = ts; client->status = Connected;