[RFC,v2,00/21] FUSE BPF: A Stacked Filesystem Extension for FUSE

Message ID 20221122021536.1629178-1-drosen@google.com

Message

Daniel Rosenberg Nov. 22, 2022, 2:15 a.m. UTC
  These patches extend FUSE to be able to act as a stacked filesystem. This
allows pure passthrough, where the fuse file system simply reflects the lower
filesystem, and also allows optional pre and post filtering in BPF and/or the
userspace daemon as needed. This can dramatically reduce or even eliminate
transitions to and from userspace.

For this patch set, I have removed the code related to the bpf side of things
since that is undergoing some large reworks to get it in line with the more
recent BPF developments. This set of patches implements direct passthrough to
the lower filesystem with no alteration. Looking at the v1 code should give a
pretty good idea of what the general shape of the bpf calls will look like.
Without the bpf side, it's like a less efficient bind mount. Not very useful
on its own, but still useful to get eyes on it since the backing calls will be
largely the same when bpf is in the mix.

This changes the format of adding a backing file/bpf slightly from v1. It's now
a bit more modular. You add a block of data at the end of a lookup response to
give the bpf fd and backing id, but there is now a type header to both blocks,
and a reserved value for future additions. In the future, we may allow for
multiple bpfs or backing files, and this will allow us to extend it without any
UAPI-breaking changes. Multiple BPFs would be useful for combining fuse-bpf
implementations without needing to manually combine bpf fragments. Multiple
backing files would allow implementing things like a limited overlayfs.
In this patch set, this is only a single block, with only backing supported,
although I've left the definitions reflecting the BPF case as well.
For bpf, the plan is to have two blocks, with the bpf one coming first.
Any further extensions are currently just speculative.
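
As a rough illustration of the block layout described above, here is a small
Python sketch that packs and parses typed blocks appended to a lookup reply.
Everything in it is an assumption made for illustration: the field widths
(u32 type, u32 reserved, u64 value), the type tags and the function names are
hypothetical and are not taken from the series' include/uapi/linux/fuse.h.

```python
import struct

# Hypothetical wire layout (NOT the series' real UAPI):
#   u32 type | u32 reserved (must be zero) | u64 fd or id
BLOCK_FMT = "<IIQ"
BLOCK_SIZE = struct.calcsize(BLOCK_FMT)

# Hypothetical type tags for the two kinds of block.
ENTRY_BACKING = 1
ENTRY_BPF = 2


def pack_block(entry_type, value):
    """Pack one typed block; the reserved field is always written as zero."""
    return struct.pack(BLOCK_FMT, entry_type, 0, value)


def append_blocks(lookup_reply, blocks):
    """Return the lookup reply bytes followed by the packed blocks, in order."""
    return lookup_reply + b"".join(pack_block(t, v) for t, v in blocks)


def parse_blocks(tail):
    """Parse the bytes that follow the regular lookup reply."""
    if len(tail) % BLOCK_SIZE:
        raise ValueError("trailing data is not a whole number of blocks")
    parsed = []
    for offset in range(0, len(tail), BLOCK_SIZE):
        entry_type, reserved, value = struct.unpack_from(BLOCK_FMT, tail, offset)
        if reserved != 0:
            # Rejecting non-zero reserved values keeps the field free for later use.
            raise ValueError("reserved field must be zero")
        parsed.append((entry_type, value))
    return parsed
```

With this layout, the single-block case in this patch set would be
append_blocks(reply, [(ENTRY_BACKING, fd)]), and the planned two-block case
would list the bpf block first. Because each block carries its own type, a
reader can tell the two cases apart without a change to the reply format.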

You can run this without needing to set up a userspace daemon by adding these
mount options: root_dir=[fd],no_daemon where fd is an open file descriptor
pointing to the folder you'd like to use as the root directory. The fd can be
immediately closed after mounting. This is useful for running various fs tests.
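
A minimal sketch of building that option string from Python, assuming the
root directory is opened in the same process that later issues the mount.
The helper names are hypothetical, and a real mount may well need other FUSE
mount options in addition to the two named here.

```python
import os


def open_root(path):
    """Open the directory that should become the root of the mount."""
    # O_DIRECTORY makes the open fail if path is not a directory.
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


def no_daemon_options(root_fd):
    """Build the option string from the two mount options named above."""
    if root_fd < 0:
        raise ValueError("root_fd must be an open file descriptor")
    return f"root_dir={root_fd},no_daemon"
```

The descriptor number only has meaning in the process that calls mount, so
the option string has to be built and used there. The fd can then be closed
with os.close() once the mount has succeeded.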

The main changes for v2:
-Refactored code to remove many of the ifdefs
-Adjusted attr related code per Amir's suggestions
-Added ioctl interface for responding to fuse requests (required for backing)
-Adjusted lookup add-on block for adding backing file/bpf
-Moved bpf related patches to the end of the stack (not included currently)

TODO:
-override_creds to interact with backing files in the same context the daemon
 would
-Implement backing calls for other FUSE operations (e.g. file locking, tmp files)
-Convert BPF over to more modern version

Alessio Balsini (1):
  fs: Generic function to convert iocb to rw flags

Daniel Rosenberg (20):
  fuse-bpf: Update fuse side uapi
  fuse-bpf: Prepare for fuse-bpf patch
  fuse: Add fuse-bpf, a stacked fs extension for FUSE
  fuse-bpf: Add ioctl interface for /dev/fuse
  fuse-bpf: Don't support export_operations
  fuse-bpf: Add support for FUSE_ACCESS
  fuse-bpf: Partially add mapping support
  fuse-bpf: Add lseek support
  fuse-bpf: Add support for fallocate
  fuse-bpf: Support file/dir open/close
  fuse-bpf: Support mknod/unlink/mkdir/rmdir
  fuse-bpf: Add support for read/write iter
  fuse-bpf: support FUSE_READDIR
  fuse-bpf: Add support for sync operations
  fuse-bpf: Add Rename support
  fuse-bpf: Add attr support
  fuse-bpf: Add support for FUSE_COPY_FILE_RANGE
  fuse-bpf: Add xattr support
  fuse-bpf: Add symlink/link support
  fuse-bpf: allow mounting with no userspace daemon

 fs/fuse/Kconfig           |    8 +
 fs/fuse/Makefile          |    1 +
 fs/fuse/backing.c         | 3118 +++++++++++++++++++++++++++++++++++++
 fs/fuse/control.c         |    2 +-
 fs/fuse/dev.c             |   83 +-
 fs/fuse/dir.c             |  326 ++--
 fs/fuse/file.c            |   62 +-
 fs/fuse/fuse_i.h          |  424 ++++-
 fs/fuse/inode.c           |  264 +++-
 fs/fuse/ioctl.c           |    2 +-
 fs/fuse/readdir.c         |    5 +
 fs/fuse/xattr.c           |   18 +
 fs/overlayfs/file.c       |   23 +-
 include/linux/fs.h        |    5 +
 include/uapi/linux/fuse.h |   24 +-
 15 files changed, 4154 insertions(+), 211 deletions(-)
 create mode 100644 fs/fuse/backing.c


base-commit: 23a60a03d9a9980d1e91190491ceea0dc58fae62
  

Comments

Amir Goldstein Nov. 22, 2022, 11:13 a.m. UTC | #1
On Tue, Nov 22, 2022 at 4:15 AM Daniel Rosenberg <drosen@google.com> wrote:
>
> You can run this without needing to set up a userspace daemon by adding these
> mount options: root_dir=[fd],no_daemon where fd is an open file descriptor
> pointing to the folder you'd like to use as the root directory. The fd can be
> immediately closed after mounting. This is useful for running various fs tests.
>

Which tests did you run?

My recommendation (if you haven't done that already):
Add a variant to libfuse test_passthrough (test_examples.py):
@pytest.mark.parametrize("name", ('passthrough', 'passthrough_plus',
                                  'passthrough_fh', 'passthrough_ll',
                                  'passthrough_bpf'))

and compose the no_daemon cmdline for the 'passthrough_bpf' mount.

This gives pretty good basic test coverage for FUSE passthrough operations.

I've extended test_passthrough_hp() for my libfuse_passthrough patches [1],
but it's the same principle.
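
The variant suggested above could be wired up roughly as follows. This is only
a sketch: launch_plan() and the dictionary it returns are hypothetical and do
not exist in libfuse's test_examples.py, and the layout of the example
binaries under an "example" directory is assumed.

```python
import os

# Existing example names plus the proposed no_daemon variant.
VARIANTS = ('passthrough', 'passthrough_plus',
            'passthrough_fh', 'passthrough_ll',
            'passthrough_bpf')


def launch_plan(name, src_fd, mnt_dir, example_dir="example"):
    """Describe how a test would bring up the given variant (hypothetical helper)."""
    if name not in VARIANTS:
        raise ValueError(f"unknown variant: {name}")
    if name == 'passthrough_bpf':
        # No daemon to start: the test mounts directly, passing the
        # root_dir/no_daemon options as the mount data string.
        return {"kind": "mount",
                "fstype": "fuse",
                "target": mnt_dir,
                "data": f"root_dir={src_fd},no_daemon"}
    # Other variants start an example daemon in the foreground (-f).
    return {"kind": "daemon",
            "argv": [os.path.join(example_dir, name), "-f", mnt_dir]}
```

The rest of the test body (create, read, write, unlink and so on under
mnt_dir) would stay the same for every variant; only the setup step differs.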

Thanks,
Amir.

[1] https://github.com/amir73il/libfuse/commits/fuse_passthrough
* 'passthrough_module' uses 'libfuse_passthrough' which enables
   Alessio's FUSE_DEV_IOC_PASSTHROUGH_OPEN by default.
  
Daniel Rosenberg Nov. 22, 2022, 8:56 p.m. UTC | #2
I've been running the generic xfstests against it, with some
modifications to do things like mount/unmount the lower and upper fs
at once. Most of the failures I see there are related to missing
opcodes, like FUSE_SETLK, FUSE_GETLK, and FUSE_IOCTL. The main failure
I have been seeing is generic/126, which is happening due to some
additional checks we're doing in fuse_open_backing. I figured at some
point we'd add some tests into libfuse, and that sounds like a good
place to start.

  
Bernd Schubert Nov. 22, 2022, 9:23 p.m. UTC | #3
On 11/22/22 21:56, Daniel Rosenberg wrote:
> I've been running the generic xfstests against it, with some
> modifications to do things like mount/unmount the lower and upper fs
> at once. Most of the failures I see there are related to missing
> opcodes, like FUSE_SETLK, FUSE_GETLK, and FUSE_IOCTL. The main failure
> I have been seeing is generic/126, which is happening due to some
> additional checks we're doing in fuse_open_backing. I figured at some
> point we'd add some tests into libfuse, and that sounds like a good
> place to start.


Here is a branch of xfstests that should work with fuse and should not 
run "rm -fr /" (we are going to give it more testing this week).

https://github.com/hbirth/xfstests


Bernd