Message ID | 20230124193025.185781-4-jlayton@kernel.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp2335905wrn; Tue, 24 Jan 2023 11:32:39 -0800 (PST) X-Google-Smtp-Source: AMrXdXvsjklyWVcFHfVJ9dKHXTHPasn9cdCrFQKIOcruBfdMGxbA3epYoLbkaC/D1rzLVS9NPGZQ X-Received: by 2002:aa7:8f89:0:b0:58b:bce2:7eb7 with SMTP id t9-20020aa78f89000000b0058bbce27eb7mr40398548pfs.10.1674588759190; Tue, 24 Jan 2023 11:32:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674588759; cv=none; d=google.com; s=arc-20160816; b=b+S+vm7wq+ErHHCLPTYX8d3jDeLg/qmNfcNMSJ+toCWlEi+cjiNBlm0sYd/rFtLdOL J6v33R2NITsskjeuHrDGTQgVFYYMRmlRV6aGDHpWW3UjoT81vTWvIL43l2voMJnDrUfU l/QPVPYQ+SB9lRCa+4RjrvT3jaDOsgQ5MPrRJuyu4pZJeg1VPalXQr94IsEbxArcmthj zr6STbFOil+mbTljiWkmGcVFY+m1ofClFsmYoLCoE3k/6ql/EYm6UTwlwd5+PT/NwMwO fEsNaqd24z8fUf0ZNPvApT75Ehq0WxqFGFOz7xE6vLPP0DNd1bUduhUR22htOh8VlW3a odNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7QdByYsPLkBaqtl152WvMLjzuTMx78pbiHaAgRryPa4=; b=akQCUvxE0SWyd+zcqoklgROicoo9TJrBwlQFEwCihMp8T+UT5cwNrLnhOouHvqvgJ8 23J4zuL4o4XI6BorDZNSV/bHWT3dPxHq+zuMf6Mk/xztL+voeZ/7J/V3vQjnmqxjdCl1 ZuPatiOYapGpTPSGPtf8Mnj6z1Fn+dP1v31ula8n2TPTIr9RD+1y/gKp3ZI/a3szn3j8 itezcNe0eaBtR+8fSZNyOXhJGLlJDDK8DIYJAFKr7CLnPwCMkXHSrqgU/aIqf+0doOQ2 HAWDJHKTWrqTpWPVHKFPbHRCyZOaSh4NUlJeaPSM5HKEhXZQxUYDubpiyiLCbUO+gNVP 3g2g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=gQoRCGuZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 6-20020aa79106000000b005814aae3d2asi3029854pfh.312.2023.01.24.11.32.27; Tue, 24 Jan 2023 11:32:39 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=gQoRCGuZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234276AbjAXTar (ORCPT <rfc822;rust.linux@gmail.com> + 99 others); Tue, 24 Jan 2023 14:30:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234228AbjAXTak (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 24 Jan 2023 14:30:40 -0500 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 940974ED2E; Tue, 24 Jan 2023 11:30:38 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 4D573B816B3; Tue, 24 Jan 2023 19:30:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0CF72C4339E; Tue, 24 Jan 2023 19:30:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1674588636; bh=5Id/6GwyRzJyDNnF9NPtwFxMtQLo/dWq0B8EUtSBvlc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gQoRCGuZT1N1d1fzlet9LS8GB2TCHTEOvSrcGiubd243jqpXMQW5CtC3Dja2W9cBj 91gepbKSr6cYm9EIdmn+hoyU49hNcRgyumNXVq/MpanjjnMz+fRboohGbWu61yNp5+ g0lMM7M2Sw12tcENSTz0lcOaONtANcfmHiTZD2hjxSPWqYjbg9bXuorJHJAAZtcJm8 QEeIE3BWZwWUSpZz7V0uce/mnelakMARV0plE83eeqjzeukyd327IRYqJjjLwjtj5e 7JjCjL8ybHzlMmX3Ga9rE9Zj7K85ZwyS41HPb5qloH2zPNjb1LBOMzRVw9ExTvaPWx iLG0oVZxEOoeQ== From: Jeff Layton <jlayton@kernel.org> To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v8 RESEND 3/8] vfs: plumb i_version handling into struct kstat Date: Tue, 24 Jan 2023 14:30:20 -0500 Message-Id: <20230124193025.185781-4-jlayton@kernel.org> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230124193025.185781-1-jlayton@kernel.org> References: <20230124193025.185781-1-jlayton@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1755933582884719172?= X-GMAIL-MSGID: =?utf-8?q?1755933582884719172?= |
Series |
fs: clean up internal i_version handling
|
|
Commit Message
Jeff Layton
Jan. 24, 2023, 7:30 p.m. UTC
The NFS server has a lot of special handling for different types of change attribute access, depending on the underlying filesystem. In most cases, it's doing a getattr anyway and then fetching that value after the fact. Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a kernel-only symbol (for now). If requested and getattr can implement it, it can fill out this field. For IS_I_VERSION inodes, add a generic implementation in vfs_getattr_nosec. Take care to mask STATX_CHANGE_COOKIE off in requests from userland and in the result mask. Since not all filesystems can give the same guarantees of monotonicity, claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to indicate that they offer an i_version value that can never go backward. Eventually if we decide to make the i_version available to userland, we can just designate a field for it in struct statx, and move the STATX_CHANGE_COOKIE definition to the uapi header. Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> --- fs/stat.c | 17 +++++++++++++++-- include/linux/stat.h | 9 +++++++++ 2 files changed, 24 insertions(+), 2 deletions(-)
Comments
On Tue, Jan 24, 2023 at 02:30:20PM -0500, Jeff Layton wrote: > The NFS server has a lot of special handling for different types of > change attribute access, depending on the underlying filesystem. In > most cases, it's doing a getattr anyway and then fetching that value > after the fact. > > Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a > kernel-only symbol (for now). If requested and getattr can implement it, > it can fill out this field. For IS_I_VERSION inodes, add a generic > implementation in vfs_getattr_nosec. Take care to mask > STATX_CHANGE_COOKIE off in requests from userland and in the result > mask. > > Since not all filesystems can give the same guarantees of monotonicity, > claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to > indicate that they offer an i_version value that can never go backward. > > Eventually if we decide to make the i_version available to userland, we > can just designate a field for it in struct statx, and move the > STATX_CHANGE_COOKIE definition to the uapi header. > > Reviewed-by: NeilBrown <neilb@suse.de> > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > fs/stat.c | 17 +++++++++++++++-- > include/linux/stat.h | 9 +++++++++ > 2 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/fs/stat.c b/fs/stat.c > index d6cc74ca8486..f43afe0081fe 100644 > --- a/fs/stat.c > +++ b/fs/stat.c > @@ -18,6 +18,7 @@ > #include <linux/syscalls.h> > #include <linux/pagemap.h> > #include <linux/compat.h> > +#include <linux/iversion.h> > > #include <linux/uaccess.h> > #include <asm/unistd.h> > @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | > STATX_ATTR_DAX); > > + if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { > + stat->result_mask |= STATX_CHANGE_COOKIE; > + stat->change_cookie = inode_query_iversion(inode); > + } > + > mnt_userns = mnt_user_ns(path->mnt); > if (inode->i_op->getattr) > return inode->i_op->getattr(mnt_userns, path, stat, > @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) > > memset(&tmp, 0, sizeof(tmp)); > > - tmp.stx_mask = stat->result_mask; > + /* STATX_CHANGE_COOKIE is kernel-only for now */ > + tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; > tmp.stx_blksize = stat->blksize; > - tmp.stx_attributes = stat->attributes; > + /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ > + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; > tmp.stx_nlink = stat->nlink; > tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); > tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); > @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, > if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) > return -EINVAL; > > + /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests > + * from userland. > + */ > + mask &= ~STATX_CHANGE_COOKIE; > + > error = vfs_statx(dfd, filename, flags, &stat, mask); > if (error) > return error; > diff --git a/include/linux/stat.h b/include/linux/stat.h > index ff277ced50e9..52150570d37a 100644 > --- a/include/linux/stat.h > +++ b/include/linux/stat.h Sorry being late to the party once again... > @@ -52,6 +52,15 @@ struct kstat { > u64 mnt_id; > u32 dio_mem_align; > u32 dio_offset_align; > + u64 change_cookie; > }; > > +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ > + > +/* mask values */ > +#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */ > + > +/* file attribute values */ > +#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ maybe it would be better to copy what we do for SB_* vs SB_I_* flags and at least rename them to: STATX_I_CHANGE_COOKIE STATX_I_ATTR_CHANGE_MONOTONIC i_change_cookie to visually distinguish internal and external flags. And also if possible it might be useful to move STATX_I_* flags to the higher 32 bits and then one can use upper_32_bits to retrieve kernel internal flags and lower_32_bits for userspace flags in tiny wrappers. (I did something similar for clone3() a few years ago but there to distinguish between flags available both in clone() and clone3() and such that are only available in clone3().) But just a thought. I mostly worry about accidently leaking this to userspace so ideally we'd even have separate fields in struct kstat for internal and external attributes but that might bump kstat size, though I don't think struct kstat is actually ever really allocated all that much.
On Tue 24-01-23 14:30:20, Jeff Layton wrote: > The NFS server has a lot of special handling for different types of > change attribute access, depending on the underlying filesystem. In > most cases, it's doing a getattr anyway and then fetching that value > after the fact. > > Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a > kernel-only symbol (for now). If requested and getattr can implement it, > it can fill out this field. For IS_I_VERSION inodes, add a generic > implementation in vfs_getattr_nosec. Take care to mask > STATX_CHANGE_COOKIE off in requests from userland and in the result > mask. > > Since not all filesystems can give the same guarantees of monotonicity, > claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to > indicate that they offer an i_version value that can never go backward. > > Eventually if we decide to make the i_version available to userland, we > can just designate a field for it in struct statx, and move the > STATX_CHANGE_COOKIE definition to the uapi header. > > Reviewed-by: NeilBrown <neilb@suse.de> > Signed-off-by: Jeff Layton <jlayton@kernel.org> Looks good to me. Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/stat.c | 17 +++++++++++++++-- > include/linux/stat.h | 9 +++++++++ > 2 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/fs/stat.c b/fs/stat.c > index d6cc74ca8486..f43afe0081fe 100644 > --- a/fs/stat.c > +++ b/fs/stat.c > @@ -18,6 +18,7 @@ > #include <linux/syscalls.h> > #include <linux/pagemap.h> > #include <linux/compat.h> > +#include <linux/iversion.h> > > #include <linux/uaccess.h> > #include <asm/unistd.h> > @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | > STATX_ATTR_DAX); > > + if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { > + stat->result_mask |= STATX_CHANGE_COOKIE; > + stat->change_cookie = inode_query_iversion(inode); > + } > + > mnt_userns = mnt_user_ns(path->mnt); > if (inode->i_op->getattr) > return inode->i_op->getattr(mnt_userns, path, stat, > @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) > > memset(&tmp, 0, sizeof(tmp)); > > - tmp.stx_mask = stat->result_mask; > + /* STATX_CHANGE_COOKIE is kernel-only for now */ > + tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; > tmp.stx_blksize = stat->blksize; > - tmp.stx_attributes = stat->attributes; > + /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ > + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; > tmp.stx_nlink = stat->nlink; > tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); > tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); > @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, > if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) > return -EINVAL; > > + /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests > + * from userland. > + */ > + mask &= ~STATX_CHANGE_COOKIE; > + > error = vfs_statx(dfd, filename, flags, &stat, mask); > if (error) > return error; > diff --git a/include/linux/stat.h b/include/linux/stat.h > index ff277ced50e9..52150570d37a 100644 > --- a/include/linux/stat.h > +++ b/include/linux/stat.h > @@ -52,6 +52,15 @@ struct kstat { > u64 mnt_id; > u32 dio_mem_align; > u32 dio_offset_align; > + u64 change_cookie; > }; > > +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ > + > +/* mask values */ > +#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */ > + > +/* file attribute values */ > +#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ > + > #endif > -- > 2.39.1 >
On Wed, 2023-01-25 at 16:50 +0100, Christian Brauner wrote: > On Tue, Jan 24, 2023 at 02:30:20PM -0500, Jeff Layton wrote: > > The NFS server has a lot of special handling for different types of > > change attribute access, depending on the underlying filesystem. In > > most cases, it's doing a getattr anyway and then fetching that value > > after the fact. > > > > Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a > > kernel-only symbol (for now). If requested and getattr can implement it, > > it can fill out this field. For IS_I_VERSION inodes, add a generic > > implementation in vfs_getattr_nosec. Take care to mask > > STATX_CHANGE_COOKIE off in requests from userland and in the result > > mask. > > > > Since not all filesystems can give the same guarantees of monotonicity, > > claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to > > indicate that they offer an i_version value that can never go backward. > > > > Eventually if we decide to make the i_version available to userland, we > > can just designate a field for it in struct statx, and move the > > STATX_CHANGE_COOKIE definition to the uapi header. > > > > Reviewed-by: NeilBrown <neilb@suse.de> > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > fs/stat.c | 17 +++++++++++++++-- > > include/linux/stat.h | 9 +++++++++ > > 2 files changed, 24 insertions(+), 2 deletions(-) > > > > diff --git a/fs/stat.c b/fs/stat.c > > index d6cc74ca8486..f43afe0081fe 100644 > > --- a/fs/stat.c > > +++ b/fs/stat.c > > @@ -18,6 +18,7 @@ > > #include <linux/syscalls.h> > > #include <linux/pagemap.h> > > #include <linux/compat.h> > > +#include <linux/iversion.h> > > > > #include <linux/uaccess.h> > > #include <asm/unistd.h> > > @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > > stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | > > STATX_ATTR_DAX); > > > > + if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { > > + stat->result_mask |= STATX_CHANGE_COOKIE; > > + stat->change_cookie = inode_query_iversion(inode); > > + } > > + > > mnt_userns = mnt_user_ns(path->mnt); > > if (inode->i_op->getattr) > > return inode->i_op->getattr(mnt_userns, path, stat, > > @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) > > > > memset(&tmp, 0, sizeof(tmp)); > > > > - tmp.stx_mask = stat->result_mask; > > + /* STATX_CHANGE_COOKIE is kernel-only for now */ > > + tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; > > tmp.stx_blksize = stat->blksize; > > - tmp.stx_attributes = stat->attributes; > > + /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ > > + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; > > tmp.stx_nlink = stat->nlink; > > tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); > > tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); > > @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, > > if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) > > return -EINVAL; > > > > + /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests > > + * from userland. > > + */ > > + mask &= ~STATX_CHANGE_COOKIE; > > + > > error = vfs_statx(dfd, filename, flags, &stat, mask); > > if (error) > > return error; > > diff --git a/include/linux/stat.h b/include/linux/stat.h > > index ff277ced50e9..52150570d37a 100644 > > --- a/include/linux/stat.h > > +++ b/include/linux/stat.h > > Sorry being late to the party once again... > > > @@ -52,6 +52,15 @@ struct kstat { > > u64 mnt_id; > > u32 dio_mem_align; > > u32 dio_offset_align; > > + u64 change_cookie; > > }; > > > > +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ > > + > > +/* mask values */ > > +#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */ > > + > > +/* file attribute values */ > > +#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ > > maybe it would be better to copy what we do for SB_* vs SB_I_* flags and > at least rename them to: > > STATX_I_CHANGE_COOKIE > STATX_I_ATTR_CHANGE_MONOTONIC > i_change_cookie An "i_"/"I_" prefix says "inode" to me. Maybe I've been at this too long. ;) > > to visually distinguish internal and external flags. > > And also if possible it might be useful to move STATX_I_* flags to the > higher 32 bits and then one can use upper_32_bits to retrieve kernel > internal flags and lower_32_bits for userspace flags in tiny wrappers. > > (I did something similar for clone3() a few years ago but there to > distinguish between flags available both in clone() and clone3() and > such that are only available in clone3().) > > But just a thought. I mostly worry about accidently leaking this to > userspace so ideally we'd even have separate fields in struct kstat for > internal and external attributes but that might bump kstat size, though > I don't think struct kstat is actually ever really allocated all that > much. I'm not sure that the internal/external distinction matters much for filesystem providers or consumers of it. The place that it matters is at the userland interface level, where statx or something similar is called. At some point we may want to make STATX_CHANGE_COOKIE queryable via statx, at which point we'll have to have a big flag day where we do s/STATX_I_CHANGE_COOKIE/STATX_CHANGE_COOKIE/. I don't think it's worth it here.
On Wed, Jan 25, 2023 at 01:30:07PM -0500, Jeff Layton wrote: > On Wed, 2023-01-25 at 16:50 +0100, Christian Brauner wrote: > > On Tue, Jan 24, 2023 at 02:30:20PM -0500, Jeff Layton wrote: > > > The NFS server has a lot of special handling for different types of > > > change attribute access, depending on the underlying filesystem. In > > > most cases, it's doing a getattr anyway and then fetching that value > > > after the fact. > > > > > > Rather that do that, add a new STATX_CHANGE_COOKIE flag that is a > > > kernel-only symbol (for now). If requested and getattr can implement it, > > > it can fill out this field. For IS_I_VERSION inodes, add a generic > > > implementation in vfs_getattr_nosec. Take care to mask > > > STATX_CHANGE_COOKIE off in requests from userland and in the result > > > mask. > > > > > > Since not all filesystems can give the same guarantees of monotonicity, > > > claim a STATX_ATTR_CHANGE_MONOTONIC flag that filesystems can set to > > > indicate that they offer an i_version value that can never go backward. > > > > > > Eventually if we decide to make the i_version available to userland, we > > > can just designate a field for it in struct statx, and move the > > > STATX_CHANGE_COOKIE definition to the uapi header. > > > > > > Reviewed-by: NeilBrown <neilb@suse.de> > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > > --- > > > fs/stat.c | 17 +++++++++++++++-- > > > include/linux/stat.h | 9 +++++++++ > > > 2 files changed, 24 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/stat.c b/fs/stat.c > > > index d6cc74ca8486..f43afe0081fe 100644 > > > --- a/fs/stat.c > > > +++ b/fs/stat.c > > > @@ -18,6 +18,7 @@ > > > #include <linux/syscalls.h> > > > #include <linux/pagemap.h> > > > #include <linux/compat.h> > > > +#include <linux/iversion.h> > > > > > > #include <linux/uaccess.h> > > > #include <asm/unistd.h> > > > @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, > > > stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | > > > STATX_ATTR_DAX); > > > > > > + if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { > > > + stat->result_mask |= STATX_CHANGE_COOKIE; > > > + stat->change_cookie = inode_query_iversion(inode); > > > + } > > > + > > > mnt_userns = mnt_user_ns(path->mnt); > > > if (inode->i_op->getattr) > > > return inode->i_op->getattr(mnt_userns, path, stat, > > > @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) > > > > > > memset(&tmp, 0, sizeof(tmp)); > > > > > > - tmp.stx_mask = stat->result_mask; > > > + /* STATX_CHANGE_COOKIE is kernel-only for now */ > > > + tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; > > > tmp.stx_blksize = stat->blksize; > > > - tmp.stx_attributes = stat->attributes; > > > + /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ > > > + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; > > > tmp.stx_nlink = stat->nlink; > > > tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); > > > tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); > > > @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, > > > if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) > > > return -EINVAL; > > > > > > + /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests > > > + * from userland. > > > + */ > > > + mask &= ~STATX_CHANGE_COOKIE; > > > + > > > error = vfs_statx(dfd, filename, flags, &stat, mask); > > > if (error) > > > return error; > > > diff --git a/include/linux/stat.h b/include/linux/stat.h > > > index ff277ced50e9..52150570d37a 100644 > > > --- a/include/linux/stat.h > > > +++ b/include/linux/stat.h > > > > Sorry being late to the party once again... > > > > > @@ -52,6 +52,15 @@ struct kstat { > > > u64 mnt_id; > > > u32 dio_mem_align; > > > u32 dio_offset_align; > > > + u64 change_cookie; > > > }; > > > > > > +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ > > > + > > > +/* mask values */ > > > +#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */ > > > + > > > +/* file attribute values */ > > > +#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ > > > > maybe it would be better to copy what we do for SB_* vs SB_I_* flags and > > at least rename them to: > > > > STATX_I_CHANGE_COOKIE > > STATX_I_ATTR_CHANGE_MONOTONIC > > i_change_cookie > > An "i_"/"I_" prefix says "inode" to me. Maybe I've been at this too > long. ;) > > > > > to visually distinguish internal and external flags. > > > > And also if possible it might be useful to move STATX_I_* flags to the > > higher 32 bits and then one can use upper_32_bits to retrieve kernel > > internal flags and lower_32_bits for userspace flags in tiny wrappers. > > > > (I did something similar for clone3() a few years ago but there to > > distinguish between flags available both in clone() and clone3() and > > such that are only available in clone3().) > > > > But just a thought. I mostly worry about accidently leaking this to > > userspace so ideally we'd even have separate fields in struct kstat for > > internal and external attributes but that might bump kstat size, though > > I don't think struct kstat is actually ever really allocated all that > > much. > > I'm not sure that the internal/external distinction matters much for > filesystem providers or consumers of it. The place that it matters is at > the userland interface level, where statx or something similar is > called. > > At some point we may want to make STATX_CHANGE_COOKIE queryable via > statx, at which point we'll have to have a big flag day where we do > s/STATX_I_CHANGE_COOKIE/STATX_CHANGE_COOKIE/. > > I don't think it's worth it here. I'm not fond of internal and external STATX_* flags having the exact same name but fine. :)
diff --git a/fs/stat.c b/fs/stat.c index d6cc74ca8486..f43afe0081fe 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -18,6 +18,7 @@ #include <linux/syscalls.h> #include <linux/pagemap.h> #include <linux/compat.h> +#include <linux/iversion.h> #include <linux/uaccess.h> #include <asm/unistd.h> @@ -122,6 +123,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | STATX_ATTR_DAX); + if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) { + stat->result_mask |= STATX_CHANGE_COOKIE; + stat->change_cookie = inode_query_iversion(inode); + } + mnt_userns = mnt_user_ns(path->mnt); if (inode->i_op->getattr) return inode->i_op->getattr(mnt_userns, path, stat, @@ -602,9 +608,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) memset(&tmp, 0, sizeof(tmp)); - tmp.stx_mask = stat->result_mask; + /* STATX_CHANGE_COOKIE is kernel-only for now */ + tmp.stx_mask = stat->result_mask & ~STATX_CHANGE_COOKIE; tmp.stx_blksize = stat->blksize; - tmp.stx_attributes = stat->attributes; + /* STATX_ATTR_CHANGE_MONOTONIC is kernel-only for now */ + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_CHANGE_MONOTONIC; tmp.stx_nlink = stat->nlink; tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); @@ -643,6 +651,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; + /* STATX_CHANGE_COOKIE is kernel-only for now. Ignore requests + * from userland. + */ + mask &= ~STATX_CHANGE_COOKIE; + error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; diff --git a/include/linux/stat.h b/include/linux/stat.h index ff277ced50e9..52150570d37a 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -52,6 +52,15 @@ struct kstat { u64 mnt_id; u32 dio_mem_align; u32 dio_offset_align; + u64 change_cookie; }; +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ + +/* mask values */ +#define STATX_CHANGE_COOKIE 0x40000000U /* Want/got stx_change_attr */ + +/* file attribute values */ +#define STATX_ATTR_CHANGE_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ + #endif