Message ID | 20221122021536.1629178-1-drosen@google.com |
---|---|
Headers |
Date: Mon, 21 Nov 2022 18:15:15 -0800 Message-ID: <20221122021536.1629178-1-drosen@google.com> Subject: [RFC PATCH v2 00/21] FUSE BPF: A Stacked Filesystem Extension for FUSE From: Daniel Rosenberg <drosen@google.com> To: Miklos Szeredi <miklos@szeredi.hu> Cc: Amir Goldstein <amir73il@gmail.com>, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, kernel-team@android.com, Daniel Rosenberg <drosen@google.com> X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog List-ID: <linux-kernel.vger.kernel.org> |
Series |
FUSE BPF: A Stacked Filesystem Extension for FUSE
|
|
Message
Daniel Rosenberg
Nov. 22, 2022, 2:15 a.m. UTC
These patches extend FUSE to be able to act as a stacked filesystem. This
allows pure passthrough, where the fuse file system simply reflects the lower
filesystem, and also allows optional pre and post filtering in BPF and/or the
userspace daemon as needed. This can dramatically reduce or even eliminate
transitions to and from userspace.

For this patch set, I have removed the code related to the bpf side of things
since that is undergoing some large reworks to get it in line with the more
recent BPF developments. This set of patches implements direct passthrough to
the lower filesystem with no alteration. Looking at the v1 code should give a
pretty good idea of what the general shape of the bpf calls will look like.
Without the bpf side, it's like a less efficient bind mount. Not very useful
on its own, but still useful to get eyes on it since the backing calls will be
largely the same when bpf is in the mix.

This changes the format of adding a backing file/bpf slightly from v1. It's now
a bit more modular. You add a block of data at the end of a lookup response to
give the bpf fd and backing id, but there is now a type header to both blocks,
and a reserved value for future additions. In the future, we may allow for
multiple bpfs or backing files, and this will allow us to extend it without any
UAPI breaking changes. Multiple BPFs would be useful for combining fuse-bpf
implementations without needing to manually combine bpf fragments. Multiple
backing files would allow implementing things like a limited overlayfs.
In this patch set, this is only a single block, with only backing supported,
although I've left the definitions reflecting the BPF case as well.
For bpf, the plan is to have two blocks, with the bpf one coming first.
Any further extensions are currently just speculative.
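The typed add-on blocks described above can be pictured with a small sketch. It is only an illustration: the field widths, the type codes `ENTRY_BACKING` and `ENTRY_BPF`, and the helper names are assumptions made for this example and are not taken from the patch's include/uapi/linux/fuse.h. Each block is packed as a type, a reserved word, and an fd/id value, and a parser walks a run of such blocks as they might follow a lookup response.

```python
import struct

# Hypothetical layout: u32 type, u32 reserved (must be zero), u64 value.
BLOCK = struct.Struct("<IIQ")

# Hypothetical type codes, not the values used by the patch set.
ENTRY_BACKING = 1
ENTRY_BPF = 2


def pack_block(entry_type: int, value: int) -> bytes:
    """Pack one add-on block; the reserved word is always written as 0."""
    return BLOCK.pack(entry_type, 0, value)


def parse_blocks(payload: bytes) -> list:
    """Split the bytes that follow a lookup reply into (type, value) pairs."""
    if len(payload) % BLOCK.size:
        raise ValueError("trailing bytes do not form a whole block")
    blocks = []
    for offset in range(0, len(payload), BLOCK.size):
        entry_type, reserved, value = BLOCK.unpack_from(payload, offset)
        if reserved != 0:
            raise ValueError("reserved field must be zero")
        blocks.append((entry_type, value))
    return blocks


# bpf block first, backing block second, as planned in the cover letter.
tail = pack_block(ENTRY_BPF, 7) + pack_block(ENTRY_BACKING, 5)
print(parse_blocks(tail))
```

Because every block starts with its own type, a later revision could append further blocks without changing how the existing ones are read, which is the extensibility argument the cover letter makes.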

You can run this without needing to set up a userspace daemon by adding these
mount options: root_dir=[fd],no_daemon where fd is an open file descriptor
pointing to the folder you'd like to use as the root directory. The fd can be
immediately closed after mounting. This is useful for running various fs tests.

The main changes for v2:
-Refactored code to remove many of the ifdefs
-Adjusted attr related code per Amir's suggestions
-Added ioctl interface for responding to fuse requests (required for backing)
-Adjusted lookup add-on block for adding backing file/bpf
-Moved bpf related patches to the end of the stack (not included currently)

TODO:
override_creds to interact with backing files in the same context the daemon would
Implement backing calls for other FUSE operations (i.e. File Locking/tmp files)
Convert BPF over to more modern version

Alessio Balsini (1):
  fs: Generic function to convert iocb to rw flags

Daniel Rosenberg (20):
  fuse-bpf: Update fuse side uapi
  fuse-bpf: Prepare for fuse-bpf patch
  fuse: Add fuse-bpf, a stacked fs extension for FUSE
  fuse-bpf: Add ioctl interface for /dev/fuse
  fuse-bpf: Don't support export_operations
  fuse-bpf: Add support for FUSE_ACCESS
  fuse-bpf: Partially add mapping support
  fuse-bpf: Add lseek support
  fuse-bpf: Add support for fallocate
  fuse-bpf: Support file/dir open/close
  fuse-bpf: Support mknod/unlink/mkdir/rmdir
  fuse-bpf: Add support for read/write iter
  fuse-bpf: support FUSE_READDIR
  fuse-bpf: Add support for sync operations
  fuse-bpf: Add Rename support
  fuse-bpf: Add attr support
  fuse-bpf: Add support for FUSE_COPY_FILE_RANGE
  fuse-bpf: Add xattr support
  fuse-bpf: Add symlink/link support
  fuse-bpf: allow mounting with no userspace daemon

 fs/fuse/Kconfig           |    8 +
 fs/fuse/Makefile          |    1 +
 fs/fuse/backing.c         | 3118 +++++++++++++++++++++++++++++++++++++
 fs/fuse/control.c         |    2 +-
 fs/fuse/dev.c             |   83 +-
 fs/fuse/dir.c             |  326 ++--
 fs/fuse/file.c            |   62 +-
 fs/fuse/fuse_i.h          |  424 ++++-
 fs/fuse/inode.c           |  264 +++-
 fs/fuse/ioctl.c           |    2 +-
 fs/fuse/readdir.c         |    5 +
 fs/fuse/xattr.c           |   18 +
 fs/overlayfs/file.c       |   23 +-
 include/linux/fs.h        |    5 +
 include/uapi/linux/fuse.h |   24 +-
 15 files changed, 4154 insertions(+), 211 deletions(-)
 create mode 100644 fs/fuse/backing.c

base-commit: 23a60a03d9a9980d1e91190491ceea0dc58fae62
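As a minimal sketch of the no_daemon setup described in the cover letter, the snippet below opens a directory, builds the `root_dir=[fd],no_daemon` option string from its descriptor, and closes the descriptor afterwards. The helper names `open_root_dir` and `no_daemon_options` are hypothetical, and the mount call itself is deliberately left out: it needs a kernel with these patches and root privileges, and the cover letter does not spell out which other mount options are required.

```python
import os
import tempfile


def no_daemon_options(root_fd: int) -> str:
    """Build the option string quoted in the cover letter for a given fd."""
    return f"root_dir={root_fd},no_daemon"


def open_root_dir(path: str) -> int:
    """Open the directory that will serve as the root of the mount."""
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


with tempfile.TemporaryDirectory() as lower:
    fd = open_root_dir(lower)
    try:
        options = no_daemon_options(fd)
        # A real run would pass `options` to the mount call here; that step
        # is omitted because it needs the patched kernel and root.
        print(options)
    finally:
        # Per the cover letter, the fd can be closed right after mounting.
        os.close(fd)
```

The fd number appears literally in the option string, so the string has to be built in the process that holds the descriptor open at mount time.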
Comments
On Tue, Nov 22, 2022 at 4:15 AM Daniel Rosenberg <drosen@google.com> wrote:
>
> You can run this without needing to set up a userspace daemon by adding these
> mount options: root_dir=[fd],no_daemon where fd is an open file descriptor
> pointing to the folder you'd like to use as the root directory. The fd can be
> immediately closed after mounting. This is useful for running various fs tests.
>

Which tests did you run?

My recommendation (if you haven't done that already):
Add a variant to libfuse test_passthrough (test_examples.py):
@pytest.mark.parametrize("name", ('passthrough', 'passthrough_plus',
                           'passthrough_fh', 'passthrough_ll',
                           'passthrough_bpf'))

and compose the no_daemon cmdline for the 'passthrough_bpf' mount.

This gives pretty good basic test coverage for FUSE passthrough operations.

I've extended test_passthrough_hp() for my libfuse_passthrough patches [1],
but it's the same principle.

Thanks,
Amir.

[1] https://github.com/amir73il/libfuse/commits/fuse_passthrough
* 'passthrough_module' uses 'libfuse_passthrough' which enables
  Alessio's FUSE_DEV_IOC_PASSTHROUGH_OPEN by default.
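The suggested extra variant could branch roughly as sketched below. This is not the libfuse test code: `build_cmdline`, its parameters, and the example paths are hypothetical, and pytest is left out so the snippet runs on its own. The daemon-backed variants get an argv list for the example binary, while 'passthrough_bpf' gets the no_daemon option string from the cover letter instead.

```python
import os

# Existing libfuse example names plus the variant suggested in this reply.
PASSTHROUGH_VARIANTS = (
    "passthrough", "passthrough_plus", "passthrough_fh",
    "passthrough_ll", "passthrough_bpf",
)


def build_cmdline(name, example_dir, mnt_dir, root_fd=None):
    """Return what the test would launch or pass for one variant.

    Daemon-backed variants get an argv list for the example binary.
    'passthrough_bpf' has no daemon, so it gets the mount option string.
    """
    if name == "passthrough_bpf":
        if root_fd is None:
            raise ValueError("passthrough_bpf needs an open root directory fd")
        return f"root_dir={root_fd},no_daemon"
    return [os.path.join(example_dir, name), "-f", mnt_dir]


for variant in PASSTHROUGH_VARIANTS:
    print(variant, build_cmdline(variant, "example", "/mnt/test", root_fd=3))
```

In an actual test, the tuple would feed `@pytest.mark.parametrize("name", PASSTHROUGH_VARIANTS)` so that the same checks run against every variant.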
I've been running the generic xfstests against it, with some
modifications to do things like mount/unmount the lower and upper fs
at once. Most of the failures I see there are related to missing
opcodes, like FUSE_SETLK, FUSE_GETLK, and FUSE_IOCTL. The main failure
I have been seeing is generic/126, which is happening due to some
additional checks we're doing in fuse_open_backing. I figured at some
point we'd add some tests into libfuse, and that sounds like a good
place to start.

On Tue, Nov 22, 2022 at 3:13 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> Which tests did you run?
>
> My recommendation (if you haven't done that already):
> Add a variant to libfuse test_passthrough (test_examples.py):
> @pytest.mark.parametrize("name", ('passthrough', 'passthrough_plus',
>                            'passthrough_fh', 'passthrough_ll',
>                            'passthrough_bpf'))
>
> and compose the no_daemon cmdline for the 'passthrough_bpf' mount.
On 11/22/22 21:56, Daniel Rosenberg wrote:
> I've been running the generic xfstests against it, with some
> modifications to do things like mount/unmount the lower and upper fs
> at once. Most of the failures I see there are related to missing
> opcodes, like FUSE_SETLK, FUSE_GETLK, and FUSE_IOCTL. The main failure
> I have been seeing is generic/126, which is happening due to some
> additional checks we're doing in fuse_open_backing. I figured at some
> point we'd add some tests into libfuse, and that sounds like a good
> place to start.

Here is a branch of xfstests that should work with fuse and should not
run "rm -fr /" (we are going to give it more testing this week).

https://github.com/hbirth/xfstests

Bernd