[RFC] fuse: disable support for file handle when FUSE_EXPORT_SUPPORT not configured
Message ID | 20240123093701.94166-1-jefflexu@linux.alibaba.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-35008-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:2553:b0:103:945f:af90 with SMTP id p19csp219764dyi; Tue, 23 Jan 2024 01:37:33 -0800 (PST) X-Google-Smtp-Source: AGHT+IHbyIBXlxofy9sDOWCEZrtPZev5x3edkbSqPDu1BrPFpiha7XM/OObL2ZQBoKBGdOmBwjAn X-Received: by 2002:a05:620a:14b4:b0:783:2d83:ee1d with SMTP id x20-20020a05620a14b400b007832d83ee1dmr5549997qkj.142.1706002653740; Tue, 23 Jan 2024 01:37:33 -0800 (PST) Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id w28-20020a05620a095c00b00783adf8c538si323698qkw.694.2024.01.23.01.37.33 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 23 Jan 2024 01:37:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-35008-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; arc=fail (body hash mismatch); spf=pass (google.com: domain of linux-kernel+bounces-35008-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-35008-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 860291C2205D for <ouuuleilei@gmail.com>; Tue, 23 Jan 2024 09:37:33 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 340015A0EF; Tue, 23 Jan 2024 09:37:16 +0000 (UTC) Received: from out30-111.freemail.mail.aliyun.com (out30-111.freemail.mail.aliyun.com [115.124.30.111]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 671A956478; Tue, 23 Jan 2024 09:37:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.111 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706002634; cv=none; b=nUA6aym/nynI3DBN0VLgtuuSqc0pFo7NnpITYUPBM57hTs3nsk8Z8IC6fSVzIK5rKt0DBGBk876zEjKX9S7p9u4qf+RT8JJ6+iU//f64zZRPQCKauLvahHiVSe8o/2co3OvoeN82d3NcFNORbAc97odYa29B40Ogh6c7NIJfPNE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706002634; c=relaxed/simple; bh=9L2HdofWJcLlzgURvijzTLaL93o4p+E1h/gJ1Az/+I8=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=UKvNi9dJRw8kuf72aujyN9PRAfgpjXvJd9cAlss9vwTmEfTsqtlc9CPq+q/sZ1X7MQBhAmQKBQzfJJZ4ocF2ovQiXGnjZWqjrGR7ZORagxcu8nXBDC8ApVlP4pgxNGxegTNE7RcAAVVQhtZWfAsJK8LhKHJuUxZicbwtseZPXVM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; arc=none smtp.client-ip=115.124.30.111 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=3;SR=0;TI=SMTPD_---0W.CRkwM_1706002621; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0W.CRkwM_1706002621) by smtp.aliyun-inc.com; Tue, 23 Jan 2024 17:37:02 +0800 From: Jingbo Xu <jefflexu@linux.alibaba.com> To: miklos@szeredi.hu, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org Subject: [RFC] fuse: disable support for file handle when FUSE_EXPORT_SUPPORT not configured Date: Tue, 23 Jan 2024 17:37:01 +0800 Message-Id: <20240123093701.94166-1-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.19.1.6.gb485710b Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1788873439052490850 X-GMAIL-MSGID: 1788873439052490850 |
Series |
[RFC] fuse: disable support for file handle when FUSE_EXPORT_SUPPORT not configured
|
|
Commit Message
Jingbo Xu
Jan. 23, 2024, 9:37 a.m. UTC
I think this is more of an issue reporter.
I'm not sure if it's a known issue, but we found that following a
successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE,
Stale file handle) with the given file handle when the fuse daemon is in
"cache= none" mode.
It can be reproduced by the examples from the man page of
name_to_handle_at(2) and open_by_handle_at(2) [1], along with the
virtiofsd daemon (C implementation) in "cache= none" mode.
```
/t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh
/t_open_by_handle_at < /tmp/fh
t_open_by_handle_at: open_by_handle_at: Stale file handle
```
After investigation into this issue, I found the root cause is that,
when virtiofsd is in "cache= none" mode, the entry_valid_timeout is
configured as 0. Thus the dput() called when name_to_handle_at(2)
finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent
to the daemon. The following open_by_handle_at(2) will trigger a new
FUSE_LOOKUP request when no cached inode is found with the given file
handle. And then the fuse daemon fails the FUSE_LOOKUP request with
-ENOENT as the cached metadata of the requested inode has already been
cleaned up among the previous FUSE_FORGET.
This indeed confuses the application, as open_by_handle_at(2) fails in
the condition of the previous name_to_handle_at(2) succeeds, given the
requested file is not deleted and ready there. It is acceptable for the
application folks to fail name_to_handle_at(2) early in this case, in
which they will fallback to open(2) to access files.
As for this RFC patch, the idea is that if the fuse daemon is configured
with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly
disabled and the following name_to_handle_at(2) will all fail as a
workaround of this issue.
[1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
fs/fuse/inode.c | 4 ++++
1 file changed, 4 insertions(+)
Comments
On Tue, Jan 23, 2024 at 11:37 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > I think this is more of an issue reporter. > > I'm not sure if it's a known issue, but we found that following a > successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE, > Stale file handle) with the given file handle when the fuse daemon is in > "cache= none" mode. > > It can be reproduced by the examples from the man page of > name_to_handle_at(2) and open_by_handle_at(2) [1], along with the > virtiofsd daemon (C implementation) in "cache= none" mode. > > ``` > ./t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh > ./t_open_by_handle_at < /tmp/fh > t_open_by_handle_at: open_by_handle_at: Stale file handle > ``` > > After investigation into this issue, I found the root cause is that, > when virtiofsd is in "cache= none" mode, the entry_valid_timeout is > configured as 0. Thus the dput() called when name_to_handle_at(2) > finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent > to the daemon. The following open_by_handle_at(2) will trigger a new > FUSE_LOOKUP request when no cached inode is found with the given file > handle. And then the fuse daemon fails the FUSE_LOOKUP request with > -ENOENT as the cached metadata of the requested inode has already been > cleaned up among the previous FUSE_FORGET. > > This indeed confuses the application, as open_by_handle_at(2) fails in > the condition of the previous name_to_handle_at(2) succeeds, given the > requested file is not deleted and ready there. It is acceptable for the > application folks to fail name_to_handle_at(2) early in this case, in > which they will fallback to open(2) to access files. > > > As for this RFC patch, the idea is that if the fuse daemon is configured > with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly > disabled and the following name_to_handle_at(2) will all fail as a > workaround of this issue. This will probably regress NFS export of (many) fuse servers that do not have FUSE_EXPORT_SUPPORT, even though you are right to point out that those NFS exports are of dubious quality. Not only can an NFS client get ESTALE for evicted fuse inodes, but it can also get a completely different object for the same file handle if that fuse server was restarted and re-exported to NFS. > > [1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html > > Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> > --- > fs/fuse/inode.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index 2a6d44f91729..9fed63be60fe 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, > static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, > struct inode *parent) > { > + struct fuse_conn *fc = get_fuse_conn(inode); > int len = parent ? 6 : 3; > u64 nodeid; > u32 generation; > @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, > return FILEID_INVALID; > } > > + if (!fc->export_support) > + return -EOPNOTSUPP; > + > nodeid = get_fuse_inode(inode)->nodeid; > generation = inode->i_generation; > If you somehow find a way to mitigate the regression for NFS export of old fuse servers (maybe an opt-in Kconfig?), your patch is also going to regress AT_HANDLE_FID functionality, which can be used by fanotify to monitor fuse. AT_HANDLE_FID flag to name_to_handle_at(2) means that open_by_handle_at(2) is not supposed to be called on that fh. The correct way to deal with that would be something like this: +static const struct export_operations fuse_fid_operations = { + .encode_fh = fuse_encode_fh, +}; + static const struct export_operations fuse_export_operations = { .fh_to_dentry = fuse_fh_to_dentry, .fh_to_parent = fuse_fh_to_parent, @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct fuse_attr *attr, static void fuse_sb_defaults(struct super_block *sb) { + struct fuse_mount *fm = get_fuse_mount_super(sb); + sb->s_magic = FUSE_SUPER_MAGIC; sb->s_op = &fuse_super_operations; sb->s_xattr = fuse_xattr_handlers; sb->s_maxbytes = MAX_LFS_FILESIZE; sb->s_time_gran = 1; - sb->s_export_op = &fuse_export_operations; + if (fm->fc->export_support) + sb->s_export_op = &fuse_export_operations; + else + sb->s_export_op = &fuse_fid_operations; sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; if (sb->s_user_ns != &init_user_ns) sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; --- This would make name_to_handle_at() without AT_HANDLE_FID fail and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. Thanks, Amir.
On 1/23/24 6:17 PM, Amir Goldstein wrote: > On Tue, Jan 23, 2024 at 11:37 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote: >> >> I think this is more of an issue reporter. >> >> I'm not sure if it's a known issue, but we found that following a >> successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE, >> Stale file handle) with the given file handle when the fuse daemon is in >> "cache= none" mode. >> >> It can be reproduced by the examples from the man page of >> name_to_handle_at(2) and open_by_handle_at(2) [1], along with the >> virtiofsd daemon (C implementation) in "cache= none" mode. >> >> ``` >> ./t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh >> ./t_open_by_handle_at < /tmp/fh >> t_open_by_handle_at: open_by_handle_at: Stale file handle >> ``` >> >> After investigation into this issue, I found the root cause is that, >> when virtiofsd is in "cache= none" mode, the entry_valid_timeout is >> configured as 0. Thus the dput() called when name_to_handle_at(2) >> finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent >> to the daemon. The following open_by_handle_at(2) will trigger a new >> FUSE_LOOKUP request when no cached inode is found with the given file >> handle. And then the fuse daemon fails the FUSE_LOOKUP request with >> -ENOENT as the cached metadata of the requested inode has already been >> cleaned up among the previous FUSE_FORGET. >> >> This indeed confuses the application, as open_by_handle_at(2) fails in >> the condition of the previous name_to_handle_at(2) succeeds, given the >> requested file is not deleted and ready there. It is acceptable for the >> application folks to fail name_to_handle_at(2) early in this case, in >> which they will fallback to open(2) to access files. >> >> >> As for this RFC patch, the idea is that if the fuse daemon is configured >> with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly >> disabled and the following name_to_handle_at(2) will all fail as a >> workaround of this issue. > > This will probably regress NFS export of (many) fuse servers that do > not have FUSE_EXPORT_SUPPORT, even though you are right to point > out that those NFS exports are of dubious quality. Yeah, the RFC itself is just for describing the problem, while the final fix (if any) needs further discussion. We even add an extra optional mount option, e.g "-o no_file_handle" to explicitly disable support for file handle in our internal product. > > Not only can an NFS client get ESTALE for evicted fuse inodes, but it > can also get a completely different object for the same file handle > if that fuse server was restarted and re-exported to NFS. > >> >> [1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html >> >> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> >> --- >> fs/fuse/inode.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c >> index 2a6d44f91729..9fed63be60fe 100644 >> --- a/fs/fuse/inode.c >> +++ b/fs/fuse/inode.c >> @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, >> static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, >> struct inode *parent) >> { >> + struct fuse_conn *fc = get_fuse_conn(inode); >> int len = parent ? 6 : 3; >> u64 nodeid; >> u32 generation; >> @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, >> return FILEID_INVALID; >> } >> >> + if (!fc->export_support) >> + return -EOPNOTSUPP; >> + >> nodeid = get_fuse_inode(inode)->nodeid; >> generation = inode->i_generation; >> > > If you somehow find a way to mitigate the regression for NFS export of > old fuse servers (maybe an opt-in Kconfig?), your patch is also going to > regress AT_HANDLE_FID functionality, which can be used by fanotify to > monitor fuse. > > AT_HANDLE_FID flag to name_to_handle_at(2) means that > open_by_handle_at(2) is not supposed to be called on that fh. > > The correct way to deal with that would be something like this: > > +static const struct export_operations fuse_fid_operations = { > + .encode_fh = fuse_encode_fh, > +}; > + > static const struct export_operations fuse_export_operations = { > .fh_to_dentry = fuse_fh_to_dentry, > .fh_to_parent = fuse_fh_to_parent, > @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct > fuse_attr *attr, > > static void fuse_sb_defaults(struct super_block *sb) > { > + struct fuse_mount *fm = get_fuse_mount_super(sb); > + > sb->s_magic = FUSE_SUPER_MAGIC; > sb->s_op = &fuse_super_operations; > sb->s_xattr = fuse_xattr_handlers; > sb->s_maxbytes = MAX_LFS_FILESIZE; > sb->s_time_gran = 1; > - sb->s_export_op = &fuse_export_operations; > + if (fm->fc->export_support) > + sb->s_export_op = &fuse_export_operations; > + else > + sb->s_export_op = &fuse_fid_operations; > sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; > if (sb->s_user_ns != &init_user_ns) > sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; > > --- > > This would make name_to_handle_at() without AT_HANDLE_FID fail > and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. > > Thanks, > Amir.
On 1/23/24 6:17 PM, Amir Goldstein wrote: > If you somehow find a way to mitigate the regression for NFS export of > old fuse servers (maybe an opt-in Kconfig?), your patch is also going to > regress AT_HANDLE_FID functionality, which can be used by fanotify to > monitor fuse. > > AT_HANDLE_FID flag to name_to_handle_at(2) means that > open_by_handle_at(2) is not supposed to be called on that fh. > > The correct way to deal with that would be something like this: > > +static const struct export_operations fuse_fid_operations = { > + .encode_fh = fuse_encode_fh, > +}; > + > static const struct export_operations fuse_export_operations = { > .fh_to_dentry = fuse_fh_to_dentry, > .fh_to_parent = fuse_fh_to_parent, > @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct > fuse_attr *attr, > > static void fuse_sb_defaults(struct super_block *sb) > { > + struct fuse_mount *fm = get_fuse_mount_super(sb); > + > sb->s_magic = FUSE_SUPER_MAGIC; > sb->s_op = &fuse_super_operations; > sb->s_xattr = fuse_xattr_handlers; > sb->s_maxbytes = MAX_LFS_FILESIZE; > sb->s_time_gran = 1; > - sb->s_export_op = &fuse_export_operations; > + if (fm->fc->export_support) > + sb->s_export_op = &fuse_export_operations; > + else > + sb->s_export_op = &fuse_fid_operations; > sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; > if (sb->s_user_ns != &init_user_ns) > sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; > > --- > > This would make name_to_handle_at() without AT_HANDLE_FID fail > and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. > Oh I didn't notice this. Many thanks!
On Tue, 23 Jan 2024 at 11:40, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > > > On 1/23/24 6:17 PM, Amir Goldstein wrote: > > If you somehow find a way to mitigate the regression for NFS export of > > old fuse servers (maybe an opt-in Kconfig?), Better would be if the server explicitly disabled export support with an INIT flag (FUSE_NO_EXPORT). Thanks, Miklos
On 1/23/24 6:46 PM, Miklos Szeredi wrote: > On Tue, 23 Jan 2024 at 11:40, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: >> >> >> >> On 1/23/24 6:17 PM, Amir Goldstein wrote: >>> If you somehow find a way to mitigate the regression for NFS export of >>> old fuse servers (maybe an opt-in Kconfig?), > > Better would be if the server explicitly disabled export support with > an INIT flag (FUSE_NO_EXPORT). > I would give it a try (as a mitigation) if it's on the right direction.
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 2a6d44f91729..9fed63be60fe 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, struct inode *parent) { + struct fuse_conn *fc = get_fuse_conn(inode); int len = parent ? 6 : 3; u64 nodeid; u32 generation; @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, return FILEID_INVALID; } + if (!fc->export_support) + return -EOPNOTSUPP; + nodeid = get_fuse_inode(inode)->nodeid; generation = inode->i_generation;