[v2,0/7] filesystem visibililty ioctls

Message ID 20240206201858.952303-1-kent.overstreet@linux.dev
Headers
Series filesystem visibililty ioctls |

Message

Kent Overstreet Feb. 6, 2024, 8:18 p.m. UTC
  previous:
https://lore.kernel.org/linux-fsdevel/20240206-aufwuchs-atomkraftgegner-dc53ce1e435f@brauner/T/

Changes since v1:
 - super_set_uuid() helper, per Dave

 - nix FS_IOC_SETUUID - Al raised this and I'm in 100% agreement,
   changing a UUID on an existing filesystem is a rare operation that
   should only be done when the filesystem is offline; we'd need to
   audit/fix a bunch of stuff if we wanted to support this

 - fix iocl numberisng, no longer using btrfs's space

 - flags argument in struct fsuuid2 is gone; since we're no longer
   setting this is no longer needed. As discussed previously, this
   interface is only for exporting the public, user-changable UUID (and
   there's now a comment saying that this exports the same UUID that
   libblkid reports, per Darrick).

Darrick also noticed that fscrypt (!) is using sb->s_uuid, which looks
busted - they want to be using the "this can never change" UUID, but
that is not an item for this patchset.

 - FS_IOC_GETSYSFSNAME -> FS_IOC_GETSYSFSPATH, per Darrick (the commit
   messages didn't get updated, whoops); and there's now a comment to
   reflect that this patch is also for finding filesystem info under
   debugfs, if present.

Christain, if nothing else comes up, are you ready to take this?

Cheers,
Kent

Kent Overstreet (7):
  fs: super_set_uuid()
  overlayfs: Convert to super_set_uuid()
  fs: FS_IOC_GETUUID
  fat: Hook up sb->s_uuid
  fs: FS_IOC_GETSYSFSNAME
  xfs: add support for FS_IOC_GETSYSFSNAME
  bcachefs: add support for FS_IOC_GETSYSFSNAME

 fs/bcachefs/fs.c        |  3 ++-
 fs/ext4/super.c         |  2 +-
 fs/f2fs/super.c         |  2 +-
 fs/fat/inode.c          |  3 +++
 fs/gfs2/ops_fstype.c    |  2 +-
 fs/ioctl.c              | 33 +++++++++++++++++++++++++++++++++
 fs/kernfs/mount.c       |  4 +++-
 fs/ocfs2/super.c        |  4 ++--
 fs/overlayfs/util.c     | 14 +++++++++-----
 fs/ubifs/super.c        |  2 +-
 fs/xfs/xfs_mount.c      |  4 +++-
 include/linux/fs.h      | 10 ++++++++++
 include/uapi/linux/fs.h | 27 +++++++++++++++++++++++++++
 mm/shmem.c              |  4 +++-
 14 files changed, 99 insertions(+), 15 deletions(-)
  

Comments

Eric Biggers Feb. 7, 2024, 1:47 a.m. UTC | #1
On Tue, Feb 06, 2024 at 03:18:48PM -0500, Kent Overstreet wrote:
> 
> Darrick also noticed that fscrypt (!) is using sb->s_uuid, which looks
> busted - they want to be using the "this can never change" UUID, but
> that is not an item for this patchset.
> 

fscrypt only uses sb->s_uuid if FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64 or
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 is being used in the encryption policy.
These flags are only supported by ext4 and f2fs, and they are only useful when
the file contents encryption is being done with inline encryption hardware that
only allows 64-bits or less of the initialization vector to be specified and
that has poor performance when switching keys.  This hardware is currently only
known to be present on mobile and embedded systems that use eMMC or UFS storage.

Note that these settings assume the inode numbers are stable as well as the
UUID.  So, when they are in use, filesystem shrinking is prohibited as well as
changing the filesystem UUID.  (In ext4, both operations are forbidden using the
stable_inodes feature flag.  f2fs doesn't support either operation regardless.)

These restrictions are unfortunate, but so far they haven't been a problem for
the only known use case for these non-default settings.

In the case of s_uuid, for both ext4 and f2fs it's true that we could have used
s_encrypt_pw_salt instead, or added a new general-purpose internal UUID field
and used that.  Maybe we even should have, considering the precedent of ext4's
metadata_csum migrating away from using the UUID to its own internal seed.  I do
worry that relying on an internal UUID for these settings would make it easier
for people to create insecure setups where they're using the same fscrypt key on
multiple filesystems with the same internal UUID.  With the external UUID, such
misconfigurations are obvious and will be noticed and fixed.  With the internal
UUID, such vulnerabilities would not be noticed, as things will "just work".
Which is better?  It's not entirely clear to me.  We do encourage the use of
different fscrypt keys on different filesystems anyway, but this isn't required.

Of course, even if the usability improvement outweighs that concern, switching
these already-existing encryption settings over to use an internal UUID can't be
done trivially; it would have to be controlled by a new filesystem feature flag.
We probably shouldn't bother unless/until there's a clear use case for it.

If anyone does have any new use case for these weird and non-default encryption
settings (and I hope you don't!), I'd be interested to hear...

- Eric
  
Kent Overstreet Feb. 7, 2024, 2:09 a.m. UTC | #2
On Tue, Feb 06, 2024 at 05:47:29PM -0800, Eric Biggers wrote:
> On Tue, Feb 06, 2024 at 03:18:48PM -0500, Kent Overstreet wrote:
> > 
> > Darrick also noticed that fscrypt (!) is using sb->s_uuid, which looks
> > busted - they want to be using the "this can never change" UUID, but
> > that is not an item for this patchset.
> > 
> 
> fscrypt only uses sb->s_uuid if FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64 or
> FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 is being used in the encryption policy.
> These flags are only supported by ext4 and f2fs, and they are only useful when
> the file contents encryption is being done with inline encryption hardware that
> only allows 64-bits or less of the initialization vector to be specified and
> that has poor performance when switching keys.  This hardware is currently only
> known to be present on mobile and embedded systems that use eMMC or UFS storage.
> 
> Note that these settings assume the inode numbers are stable as well as the
> UUID.  So, when they are in use, filesystem shrinking is prohibited as well as
> changing the filesystem UUID.  (In ext4, both operations are forbidden using the
> stable_inodes feature flag.  f2fs doesn't support either operation regardless.)
> 
> These restrictions are unfortunate, but so far they haven't been a problem for
> the only known use case for these non-default settings.
> 
> In the case of s_uuid, for both ext4 and f2fs it's true that we could have used
> s_encrypt_pw_salt instead, or added a new general-purpose internal UUID field
> and used that.  Maybe we even should have, considering the precedent of ext4's
> metadata_csum migrating away from using the UUID to its own internal seed.  I do
> worry that relying on an internal UUID for these settings would make it easier
> for people to create insecure setups where they're using the same fscrypt key on
> multiple filesystems with the same internal UUID.  With the external UUID, such
> misconfigurations are obvious and will be noticed and fixed.  With the internal
> UUID, such vulnerabilities would not be noticed, as things will "just work".
> Which is better?  It's not entirely clear to me.  We do encourage the use of
> different fscrypt keys on different filesystems anyway, but this isn't required.

*nod* nonce reuse is a thorny issue - that makes using the changeable,
external UUID a bit more of a feature than a bug.

> Of course, even if the usability improvement outweighs that concern, switching
> these already-existing encryption settings over to use an internal UUID can't be
> done trivially; it would have to be controlled by a new filesystem feature flag.
> We probably shouldn't bother unless/until there's a clear use case for it.
> 
> If anyone does have any new use case for these weird and non-default encryption
> settings (and I hope you don't!), I'd be interested to hear...

I just wanted to make sure I wasn't breaking fscrypt by baking-in that
s_uuid is the user facing one - thanks for the info.

Can we get this documented in a code comment somewhere visible? It was
definitely a "wtf" moment when Darrick and I saw it, I want to make sure
people know what's going on later.
  
Theodore Ts'o Feb. 7, 2024, 5:40 p.m. UTC | #3
On Tue, Feb 06, 2024 at 03:18:48PM -0500, Kent Overstreet wrote:
> previous:
> https://lore.kernel.org/linux-fsdevel/20240206-aufwuchs-atomkraftgegner-dc53ce1e435f@brauner/T/
> 
> Changes since v1:
>  - super_set_uuid() helper, per Dave
> 
>  - nix FS_IOC_SETUUID - Al raised this and I'm in 100% agreement,
>    changing a UUID on an existing filesystem is a rare operation that
>    should only be done when the filesystem is offline; we'd need to
>    audit/fix a bunch of stuff if we wanted to support this

NAK.  First, this happens every single time a VM in the cloud starts
up, so it happens ZILLIONS of time a day.  Secondly, there is already
userspace programs --- to wit, tune2fs --- that uses this ioctl, so
nixing FS_IOC_SETUUID will break userspace programs.

							- Ted
  
Kent Overstreet Feb. 7, 2024, 8:26 p.m. UTC | #4
On Wed, Feb 07, 2024 at 12:40:09PM -0500, Theodore Ts'o wrote:
> On Tue, Feb 06, 2024 at 03:18:48PM -0500, Kent Overstreet wrote:
> > previous:
> > https://lore.kernel.org/linux-fsdevel/20240206-aufwuchs-atomkraftgegner-dc53ce1e435f@brauner/T/
> > 
> > Changes since v1:
> >  - super_set_uuid() helper, per Dave
> > 
> >  - nix FS_IOC_SETUUID - Al raised this and I'm in 100% agreement,
> >    changing a UUID on an existing filesystem is a rare operation that
> >    should only be done when the filesystem is offline; we'd need to
> >    audit/fix a bunch of stuff if we wanted to support this
> 
> NAK.  First, this happens every single time a VM in the cloud starts
> up, so it happens ZILLIONS of time a day.  Secondly, there is already
> userspace programs --- to wit, tune2fs --- that uses this ioctl, so
> nixing FS_IOC_SETUUID will break userspace programs.

You've still got the ext4 version, we're not taking that away. But I
don't think other filesystems will want to deal with the hassle of
changing UUIDs at runtime, since that's effectively used for API access
via sysfs and debugfs.
  
Christian Brauner Feb. 8, 2024, 9:01 a.m. UTC | #5
> don't think other filesystems will want to deal with the hassle of
> changing UUIDs at runtime, since that's effectively used for API access
> via sysfs and debugfs.

/me nods.
  
Christian Brauner Feb. 8, 2024, 9:48 a.m. UTC | #6
> Christain, if nothing else comes up, are you ready to take this?

I'm amazed how consistently you mistype my name. Sorry, I just read
that. Yep, I'm about to pick this up.
  
Kent Overstreet Feb. 8, 2024, 6:16 p.m. UTC | #7
On Thu, Feb 08, 2024 at 10:48:31AM +0100, Christian Brauner wrote:
> > Christain, if nothing else comes up, are you ready to take this?
> 
> I'm amazed how consistently you mistype my name. Sorry, I just read
> that. Yep, I'm about to pick this up.

Gah, am I becoming dyslexic in my old age...
  
Theodore Ts'o Feb. 12, 2024, 10:47 p.m. UTC | #8
On Wed, Feb 07, 2024 at 03:26:55PM -0500, Kent Overstreet wrote:
> You've still got the ext4 version, we're not taking that away. But I
> don't think other filesystems will want to deal with the hassle of
> changing UUIDs at runtime, since that's effectively used for API access
> via sysfs and debugfs.

Thanks. I misunderstood the log.  I didn't realize this was just about
not hoisting the ioctl to the VFS level, and dropping the generic uuid
set.

I'm not convinced that we should be using the UUID for kernel API
access, if for no other reason that not all file systems have UUID's.
Sure, modern file systems have UUID's, and individual file systems
might have to have specific features that don't play well with UUID's
changing while the file system is mounted.  But I'm hoping that we
don't add any new interfaces that rely on using the UUID for API
access at the VFS layer.  After all, ext2 (not just ext3 and ext4) has
supported changing the UUID while the file system has been mounted for
*decades*.

     	       	    	       	    	    - Ted
  
Kent Overstreet Feb. 12, 2024, 11:24 p.m. UTC | #9
On Mon, Feb 12, 2024 at 05:47:40PM -0500, Theodore Ts'o wrote:
> On Wed, Feb 07, 2024 at 03:26:55PM -0500, Kent Overstreet wrote:
> > You've still got the ext4 version, we're not taking that away. But I
> > don't think other filesystems will want to deal with the hassle of
> > changing UUIDs at runtime, since that's effectively used for API access
> > via sysfs and debugfs.
> 
> Thanks. I misunderstood the log.  I didn't realize this was just about
> not hoisting the ioctl to the VFS level, and dropping the generic uuid
> set.
> 
> I'm not convinced that we should be using the UUID for kernel API
> access, if for no other reason that not all file systems have UUID's.
> Sure, modern file systems have UUID's, and individual file systems
> might have to have specific features that don't play well with UUID's
> changing while the file system is mounted.  But I'm hoping that we
> don't add any new interfaces that rely on using the UUID for API
> access at the VFS layer.  After all, ext2 (not just ext3 and ext4) has
> supported changing the UUID while the file system has been mounted for
> *decades*.

*nod*

The intention isn't for every filesystem to be using the UUID for API
access - there's no reason to for single device filesystems, after all.

The intent is rather - for filesystems that _do_ need the UUID as a
stable identifier, let's try to standardize how's it's exposed and
used...