Message ID | cover.1687898895.git.nabijaczleweli@nabijaczleweli.xyz |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp8471438vqr; Tue, 27 Jun 2023 14:08:01 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7pIm7LUlIpnZzRLSZ6r02UtFB212d8Hu+PpC2NeVJlKzEI+jcQwJAHk8SdF0cJuPVyIbeW X-Received: by 2002:a05:6a00:3a0a:b0:657:bdf1:cce1 with SMTP id fj10-20020a056a003a0a00b00657bdf1cce1mr43312091pfb.25.1687900080501; Tue, 27 Jun 2023 14:08:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687900080; cv=none; d=google.com; s=arc-20160816; b=yxm0qppu7m732C5I4a1/V7HPTdGgIZ65Ia75LaF/UWlu3MF4VmotGqhTCsH8gpou9c 0mWqMn37p5dVaOiSl3RuRdUUneVCod9GGcdpCLncs3DbkZXbvzoaTk9W5/fU47ci1dI8 A/fE2VTvMoahikNAHt9GOs0p7KsahJuWpDSItZ6bhhLwUt3bKvfUCJl8HDcsStxHhsvL 3su4qoi6NOX1lv+pYvZPrja6sqj8f5Xo9Inu8/TxRX32OKDu8yGqUzzcdPjhGasnLqOc 0uTopQmvkHPm/tJ3E1yaWVHzGme2TDdcVGpum03+/rNk/Qg8l+nIORA9ftG5WLwPYr40 +lmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=c7zI3Ikn18m+weO0Q9d4yhnigfqyd1iPv7UrHGPbFTU=; fh=EwHFucLs4bwDakTQKl/ndDudi5N9htFCanErUntz+qw=; b=BqV0oXEdKU467gX0Cq3NX9LZ65VIzlwRv1qqpIQu+boLGIcM8FLeTn4ouLkPLw6aej ah03dRC8l+ixVGTJqfNjn7tniZSBPh7M8Rl/p4vqdZ+FhivVjA+3imeJyGaFUkb5c56t q4EwDDZYykd2xK03woBqzWqY+/pvZmk7VKN9TSesKEn5REw34rf9dV7G0JHwOUZgySXl vDcgRD1c0lWPiNZG0d4GiJ4rJMfXfE29OwQDnkcAIeF7qJfj0+LfmW8fkl+uAho8yb2o c80tpOjhOxCupCh1IN73f0a8pWNdlWRgQSCwUZoWbydM2ii6QG7ZJwx+WGXyqlUJdc17 NKeQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@nabijaczleweli.xyz header.s=202305 header.b=Xz4XR+Xh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=nabijaczleweli.xyz Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r25-20020a6560d9000000b0053f0bfcd4fasi8070949pgv.173.2023.06.27.14.07.47; Tue, 27 Jun 2023 14:08:00 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@nabijaczleweli.xyz header.s=202305 header.b=Xz4XR+Xh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=nabijaczleweli.xyz Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231488AbjF0Uuy (ORCPT <rfc822;nicolai.engesland@gmail.com> + 99 others); Tue, 27 Jun 2023 16:50:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53914 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230113AbjF0Uuu (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 27 Jun 2023 16:50:50 -0400 Received: from tarta.nabijaczleweli.xyz (unknown [139.28.40.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 194BBED; Tue, 27 Jun 2023 13:50:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=nabijaczleweli.xyz; s=202305; t=1687899047; bh=zhLge2fIvDMjALdhMD05+tueCYzIk6/Kl0MFRGFrYIQ=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Xz4XR+XhvVLaJ53FyxpTAVg+9gCf9wr8ep7qkuA+qDeraK1HpJ1LFDrexTiH/h75M fiPAAfxv9YULL4NL+HtBNQ+TLY/cMJW9zfR72jw0vJGQdyX71kSN1Ja9b7DcQRIX6G uouB8Cz9aeDP7eBTLwVKcuV6rZujjsk3IK7oP6v2ncr7dQrxnzRF+ky+zdW1ZYlgDh gn4fp6+g/5C8ftj9KCKTfTjJAoUmRxsQblZGus9n5v4zfWvKkgCPmtOpi2KZTM+e3P qYtGNNIhsd0Rq2Jcr1BtxCpjBzPYw65ZiD+sf7wBX8HU7T8EClw/kigOXlHJ4y8Tlj P8WKYPLisSabw== Received: from tarta.nabijaczleweli.xyz (unknown [192.168.1.250]) by tarta.nabijaczleweli.xyz (Postfix) with ESMTPSA id 55DF412C2; Tue, 27 Jun 2023 22:50:47 +0200 (CEST) Date: Tue, 27 Jun 2023 22:50:46 +0200 From: Ahelenia =?utf-8?q?Ziemia=C5=84ska?= <nabijaczleweli@nabijaczleweli.xyz> To: Amir Goldstein <amir73il@gmail.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk>, Christian Brauner <brauner@kernel.org>, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Jan Kara <jack@suse.cz>, Chung-Chiang Cheng <cccheng@synology.com>, ltp@lists.linux.it Subject: [PATCH v4 0/3] fanotify accounting for fs/splice.c Message-ID: <cover.1687898895.git.nabijaczleweli@nabijaczleweli.xyz> References: <t5az5bvpfqd3rrwla43437r5vplmkujdytixcxgm7sc4hojspg@jcc63stk66hz> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="n5hw2fxp2pfe5yae" Content-Disposition: inline In-Reply-To: <t5az5bvpfqd3rrwla43437r5vplmkujdytixcxgm7sc4hojspg@jcc63stk66hz> User-Agent: NeoMutt/20230517 X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,PDS_RDNS_DYNAMIC_FP, RDNS_DYNAMIC,SPF_HELO_PASS,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769891514921608179?= X-GMAIL-MSGID: =?utf-8?q?1769891514921608179?= |
Commit Message
Ahelenia Ziemiańska
June 27, 2023, 8:50 p.m. UTC
Always generate modify out, access in for splice; this gets automatically merged with no ugly special cases. No changes to 2/3 or 3/3. Ahelenia Ziemiańska (3): splice: always fsnotify_access(in), fsnotify_modify(out) on success splice: fsnotify_access(fd)/fsnotify_modify(fd) in vmsplice splice: fsnotify_access(in), fsnotify_modify(out) on success in tee fs/splice.c | 38 ++++++++++++++++++++------------------ 1 file changed, 20 insertions(+), 18 deletions(-) Interdiff against v3:
Comments
On Tue, Jun 27, 2023 at 11:50 PM Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> wrote: > > Always generate modify out, access in for splice; > this gets automatically merged with no ugly special cases. Ahelenia, Obviously, you are new to sending patches to the kernel and you appear to be a very enthusiastic and fast learner, so I assume you won't mind getting some tips that you won't find in any document. a) CC LTP only on tests, not on the kernel patches b) Please don't post these "diff" patches. Developers (and bots) should be able to understand which upstream commit a patch is based on (git format-patch provides that info). I mean you can send those diff patches as part of a conversation to explain yourself, that's fine, just don't post them as if they are patches for review. c) When there are prospect reviewers that have not reviewed v1 (especially inotify maintainer), it is better to wait at least one day posting v2 and v3 and v4 ;), because: 1. It is better to accumulate review comments from several reviewers 2. Different reviewers may disagree, so if you are just following my advice you may need to go back and forth until everyone is happy 3. It's racy - reviewers may be in the middle of review of v1 without realizing that v2,v3,v4 is already in their inbox, so that's creating extra work for them - not a good outcome Going to review v4.... Thanks, Amir.
Hello! On Tue 27-06-23 22:50:46, Ahelenia Ziemiańska wrote: > Always generate modify out, access in for splice; > this gets automatically merged with no ugly special cases. > > No changes to 2/3 or 3/3. Thanks for the patches Ahelena! The code looks fine to me but to be honest I still have one unresolved question so let me think about it loud here for documentation purposes :). Do we want fsnotify (any filesystem notification framework like inotify or fanotify) to actually generate events on FIFOs? FIFOs are virtual objects and are not part of the filesystem as such (well, the inode itself and the name is), hence *filesystem* notification framework does not seem like a great fit to watch for changes or accesses there. And if we say "yes" for FIFOs, then why not AF_UNIX sockets? Where do we draw the line? And is it all worth the trouble? I understand the convenience of inotify working on FIFOs for the "tail -f" usecase but then wouldn't this better be fixed in tail(1) itself by using epoll(7) for FIFOs which, as I've noted in my other reply, does not have the problem that poll(2) has when there are no writers? Another issue with FIFOs is that they do not have a concept of file position. For hierarchical storage usecase we are introducing events that will report file ranges being modified / accessed and officially supporting FIFOs is one more special case to deal with. What is supporting your changes is that fsnotify mostly works for FIFOs already now (normal reads & writes generate notification) so splice not working could be viewed as an inconsistency. Sockets (although they are visible in the filesystem) cannot be open so for them the illusion of being a file is even weaker. So overall I guess I'm slightly in favor of making fsnotify generate events on FIFOs even with splice, provided Amir does not see a big trouble in supporting this with his upcoming HSM changes. Honza > Ahelenia Ziemiańska (3): > splice: always fsnotify_access(in), fsnotify_modify(out) on success > splice: fsnotify_access(fd)/fsnotify_modify(fd) in vmsplice > splice: fsnotify_access(in), fsnotify_modify(out) on success in tee > > fs/splice.c | 38 ++++++++++++++++++++------------------ > 1 file changed, 20 insertions(+), 18 deletions(-) > > Interdiff against v3: > diff --git a/fs/splice.c b/fs/splice.c > index 2ecfccbda956..bdbabc2ebfff 100644 > --- a/fs/splice.c > +++ b/fs/splice.c > @@ -1184,10 +1184,6 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out, > out->f_pos = offset; > else > *off_out = offset; > - > - // splice_write-> already marked out > - // as modified via vfs_iter_write() > - goto noaccessout; > } else if (opipe) { > if (off_out) > return -ESPIPE; > @@ -1211,11 +1207,10 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out, > } else > return -EINVAL; > > - if (ret > 0) > + if (ret > 0) { > fsnotify_modify(out); > -noaccessout: > - if (ret > 0) > fsnotify_access(in); > + } > > return ret; > } > -- > 2.39.2
On Wed, Jun 28, 2023 at 2:38 PM Jan Kara <jack@suse.cz> wrote: > > Hello! > > On Tue 27-06-23 22:50:46, Ahelenia Ziemiańska wrote: > > Always generate modify out, access in for splice; > > this gets automatically merged with no ugly special cases. > > > > No changes to 2/3 or 3/3. > > Thanks for the patches Ahelena! The code looks fine to me but to be honest > I still have one unresolved question so let me think about it loud here for > documentation purposes :). Do we want fsnotify (any filesystem > notification framework like inotify or fanotify) to actually generate > events on FIFOs? FIFOs are virtual objects and are not part of the > filesystem as such (well, the inode itself and the name is), hence > *filesystem* notification framework does not seem like a great fit to watch > for changes or accesses there. And if we say "yes" for FIFOs, then why not > AF_UNIX sockets? Where do we draw the line? And is it all worth the > trouble? > > I understand the convenience of inotify working on FIFOs for the "tail -f" > usecase but then wouldn't this better be fixed in tail(1) itself by using > epoll(7) for FIFOs which, as I've noted in my other reply, does not have > the problem that poll(2) has when there are no writers? > > Another issue with FIFOs is that they do not have a concept of file > position. For hierarchical storage usecase we are introducing events that > will report file ranges being modified / accessed and officially supporting > FIFOs is one more special case to deal with. > > What is supporting your changes is that fsnotify mostly works for FIFOs > already now (normal reads & writes generate notification) so splice not > working could be viewed as an inconsistency. Sockets (although they are > visible in the filesystem) cannot be open so for them the illusion of being > a file is even weaker. > > So overall I guess I'm slightly in favor of making fsnotify generate events > on FIFOs even with splice, provided Amir does not see a big trouble in > supporting this with his upcoming HSM changes. > I've also thought about this. The thing about the HSM events is that they are permission events and just like FAN_ACCESS_PERM, they originate from the common access control helpers {rw,remap}_verify_area(), which also happen to have the file range info (with ppos NULL for pipes). Ahelenia's patches do not add any new rw_verify_area() to pipes so no new FAN_ACCESS_PERM events were added. If we could go back to the design of fanotify we would have probably made it explicit that permission events are only allowed on regular files and dirs. For the new HSM events we can (and will) do that. In any case, the new events are supposed to be delivered with file access range records, so delivering HSM events on pipes wouldn't make any sense. So I do not see any problem with these patches wrt upcomping HSM events. However, note that these patches create more inconsistencies between IN_ACCESS and FAN_ACCESS_PERM on pipes. We can leave it at that if we want, but fixing the inconsistencies by adding more FAN_ACCESS_PERM events on pipes - this is not something that I wouldn't be comfortable with. If anything, we can remove FAN_ACCESS_PERM events from special files and see if anybody complains. I don't know of any users of FAN_ACCESS_PERM and even for FAN_OPEN_PERM, I don't think that AV-vendors have anything useful to do with open permission events on special files. Thanks, Amir.
Hi! On Wed, Jun 28, 2023 at 01:38:53PM +0200, Jan Kara wrote: > On Tue 27-06-23 22:50:46, Ahelenia Ziemiańska wrote: > > Always generate modify out, access in for splice; > > this gets automatically merged with no ugly special cases. > > > > No changes to 2/3 or 3/3. > Thanks for the patches Ahelena! The code looks fine to me but to be honest > I still have one unresolved question so let me think about it loud here for > documentation purposes :). Do we want fsnotify (any filesystem > notification framework like inotify or fanotify) to actually generate > events on FIFOs? FIFOs are virtual objects and are not part of the > filesystem as such (well, the inode itself and the name is), hence > *filesystem* notification framework does not seem like a great fit to watch > for changes or accesses there. And if we say "yes" for FIFOs, then why not > AF_UNIX sockets? Where do we draw the line? And is it all worth the > trouble? As a relative outsider (I haven't used inotify before this, and have not been subjected to it or its peripheries before), I interpreted inotify as being the Correct solution for: 1. stuff you can find in a normal (non-/dev, you don't want to touch devices) filesystem traversal 2. stuff you can open where, going down the list in inode(7): S_IFSOCK can't open S_IFLNK can't open S_IFREG yes! S_IFBLK it's a device S_IFDIR yes! S_IFCHR it's a device S_IFIFO yes! It appears that I'm not the only one who's interpreted it that way, especially since neither regular files nor pipes are pollable. (Though, under that same categorisation, I wouldn't be surprised if anonymous pipes had been refused, for example, since those are conventionally unnameable.) To this end, I'd say we're leaving the line precisely where it was drawn before, even if by accident. > I understand the convenience of inotify working on FIFOs for the "tail -f" > usecase but then wouldn't this better be fixed in tail(1) itself by using > epoll(7) for FIFOs which, as I've noted in my other reply, does not have > the problem that poll(2) has when there are no writers? Yes, epoll in ET mode returns POLLHUP only once, but you /also/ need the inotify anyway for regular files, which epoll refuses (and, with -F, you may want both epoll for a pipe and inotify for the directory it's contained in). Is it possible to do? yes. Is it more annoying than just having pipes report when they were written to? very much so. inotify actually working(*) is presumably why coreutils tail doesn't use epoll ‒ inotify already provides all required events(*), you can use the same code for regular files and fifos, and with one fewer level of indirection: there's just no need(*). (*: except with a magic syscall only I use apparently)
Hi! On Wed 28-06-23 20:54:28, Ahelenia Ziemiańska wrote: > On Wed, Jun 28, 2023 at 01:38:53PM +0200, Jan Kara wrote: > > On Tue 27-06-23 22:50:46, Ahelenia Ziemiańska wrote: > > > Always generate modify out, access in for splice; > > > this gets automatically merged with no ugly special cases. > > > > > > No changes to 2/3 or 3/3. > > Thanks for the patches Ahelena! The code looks fine to me but to be honest > > I still have one unresolved question so let me think about it loud here for > > documentation purposes :). Do we want fsnotify (any filesystem > > notification framework like inotify or fanotify) to actually generate > > events on FIFOs? FIFOs are virtual objects and are not part of the > > filesystem as such (well, the inode itself and the name is), hence > > *filesystem* notification framework does not seem like a great fit to watch > > for changes or accesses there. And if we say "yes" for FIFOs, then why not > > AF_UNIX sockets? Where do we draw the line? And is it all worth the > > trouble? > As a relative outsider (I haven't used inotify before this, and have not > been subjected to it or its peripheries before), > I interpreted inotify as being the Correct solution for: > 1. stuff you can find in a normal > (non-/dev, you don't want to touch devices) > filesystem traversal > 2. stuff you can open > where, going down the list in inode(7): > S_IFSOCK can't open > S_IFLNK can't open > S_IFREG yes! > S_IFBLK it's a device > S_IFDIR yes! > S_IFCHR it's a device > S_IFIFO yes! > > It appears that I'm not the only one who's interpreted it that way, > especially since neither regular files nor pipes are pollable. > (Though, under that same categorisation, I wouldn't be surprised > if anonymous pipes had been refused, for example, since those are > conventionally unnameable.) > > To this end, I'd say we're leaving the line precisely where it was drawn > before, even if by accident. I agree, although I'd note that there are S_IFREG inodes under /sys or /proc where it would be too difficult to provide fsnotify events (exactly because the file contents is not "data stored somewhere" but rather something "generated on the fly") so the illusion is not perfect already. > > I understand the convenience of inotify working on FIFOs for the "tail -f" > > usecase but then wouldn't this better be fixed in tail(1) itself by using > > epoll(7) for FIFOs which, as I've noted in my other reply, does not have > > the problem that poll(2) has when there are no writers? > Yes, epoll in ET mode returns POLLHUP only once, but you /also/ need the > inotify anyway for regular files, which epoll refuses > (and, with -F, you may want both epoll for a pipe and inotify for the > directory it's contained in). > Is it possible to do? yes. Is it more annoying than just having pipes > report when they were written to? very much so. > > inotify actually working(*) is presumably why coreutils tail doesn't use > epoll ‒ inotify already provides all required events(*), you can use the > same code for regular files and fifos, and with one fewer level of > indirection: there's just no need(*). > > (*: except with a magic syscall only I use apparently) Yeah, I've slept to this and I still think adding fsnotify events to splice is a nicer option so feel free to add: Acked-by: Jan Kara <jack@suse.cz> to all kernel patches in your series. Since the changes are in splice code, Christian or Al Viro (who you already have on CC list) should be merging this so please make sure to also include them in the v5 submission. Honza
diff --git a/fs/splice.c b/fs/splice.c index 2ecfccbda956..bdbabc2ebfff 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -1184,10 +1184,6 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out, out->f_pos = offset; else *off_out = offset; - - // splice_write-> already marked out - // as modified via vfs_iter_write() - goto noaccessout; } else if (opipe) { if (off_out) return -ESPIPE; @@ -1211,11 +1207,10 @@ long do_splice(struct file *in, loff_t *off_in, struct file *out, } else return -EINVAL; - if (ret > 0) + if (ret > 0) { fsnotify_modify(out); -noaccessout: - if (ret > 0) fsnotify_access(in); + } return ret; }