[5/5] docs: fuse: improve FUSE consistency explanation

Message ID 20230711043405.66256-6-zhangjiachen.jaycee@bytedance.com
State New
Series FUSE consistency improvements

Commit Message

Jiachen Zhang July 11, 2023, 4:34 a.m. UTC
  Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
---
 Documentation/filesystems/fuse-io.rst | 32 +++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)
  

Comments

Randy Dunlap July 11, 2023, 4:42 a.m. UTC | #1
Hi--

On 7/10/23 21:34, Jiachen Zhang wrote:
> Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
> ---
>  Documentation/filesystems/fuse-io.rst | 32 +++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/filesystems/fuse-io.rst b/Documentation/filesystems/fuse-io.rst
> index 255a368fe534..cdd292dd2e9c 100644
> --- a/Documentation/filesystems/fuse-io.rst
> +++ b/Documentation/filesystems/fuse-io.rst

> @@ -24,7 +31,8 @@ after any writes to the file.  All mmap modes are supported.
>  The cached mode has two sub modes controlling how writes are handled.  The
>  write-through mode is the default and is supported on all kernels.  The
>  writeback-cache mode may be selected by the FUSE_WRITEBACK_CACHE flag in the
> -FUSE_INIT reply.
> +FUSE_INIT reply. In either modes, if the FOPEN_KEEP_CACHE flag is not set in

                       either mode,

> +the FUSE_OPEN, cached pages of the file will be invalidated immediatedly.

                                                               immediately.

>  
>  In write-through mode each write is immediately sent to userspace as one or more
>  WRITE requests, as well as updating any cached pages (and caching previously
> @@ -38,7 +46,27 @@ reclaim on memory pressure) or explicitly (invoked by close(2), fsync(2) and
>  when the last ref to the file is being released on munmap(2)).  This mode
>  assumes that all changes to the filesystem go through the FUSE kernel module
>  (size and atime/ctime/mtime attributes are kept up-to-date by the kernel), so
> -it's generally not suitable for network filesystems.  If a partial page is
> +it's generally not suitable for network filesystems (you can consider the
> +writeback-cache-v2 mode mentioned latter for them).  If a partial page is

                                     later

>  written, then the page needs to be first read from userspace.  This means, that
>  even for files opened for O_WRONLY it is possible that READ requests will be
>  generated by the kernel.
  
Jiachen Zhang July 11, 2023, 7:15 a.m. UTC | #2
On 2023/7/11 12:42, Randy Dunlap wrote:
> Hi--
> 
> On 7/10/23 21:34, Jiachen Zhang wrote:
>> Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
>> ---
>>   Documentation/filesystems/fuse-io.rst | 32 +++++++++++++++++++++++++--
>>   1 file changed, 30 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/filesystems/fuse-io.rst b/Documentation/filesystems/fuse-io.rst
>> index 255a368fe534..cdd292dd2e9c 100644
>> --- a/Documentation/filesystems/fuse-io.rst
>> +++ b/Documentation/filesystems/fuse-io.rst
> 
>> @@ -24,7 +31,8 @@ after any writes to the file.  All mmap modes are supported.
>>   The cached mode has two sub modes controlling how writes are handled.  The
>>   write-through mode is the default and is supported on all kernels.  The
>>   writeback-cache mode may be selected by the FUSE_WRITEBACK_CACHE flag in the
>> -FUSE_INIT reply.
>> +FUSE_INIT reply. In either modes, if the FOPEN_KEEP_CACHE flag is not set in
> 
>                         either mode,
> 
>> +the FUSE_OPEN, cached pages of the file will be invalidated immediatedly.
> 
>                                                                 immediately.
> 
>>   
>>   In write-through mode each write is immediately sent to userspace as one or more
>>   WRITE requests, as well as updating any cached pages (and caching previously
>> @@ -38,7 +46,27 @@ reclaim on memory pressure) or explicitly (invoked by close(2), fsync(2) and
>>   when the last ref to the file is being released on munmap(2)).  This mode
>>   assumes that all changes to the filesystem go through the FUSE kernel module
>>   (size and atime/ctime/mtime attributes are kept up-to-date by the kernel), so
>> -it's generally not suitable for network filesystems.  If a partial page is
>> +it's generally not suitable for network filesystems (you can consider the
>> +writeback-cache-v2 mode mentioned latter for them).  If a partial page is
> 
>                                       later
> 
>>   written, then the page needs to be first read from userspace.  This means, that
>>   even for files opened for O_WRONLY it is possible that READ requests will be
>>   generated by the kernel.
> 
> 


Thanks, Randy. I will fix them in the next version.

Jiachen
  

Patch

diff --git a/Documentation/filesystems/fuse-io.rst b/Documentation/filesystems/fuse-io.rst
index 255a368fe534..cdd292dd2e9c 100644
--- a/Documentation/filesystems/fuse-io.rst
+++ b/Documentation/filesystems/fuse-io.rst
@@ -10,6 +10,10 @@  Fuse supports the following I/O modes:
 - cached
   + write-through
   + writeback-cache
+  + writeback-cache-v2
+
+Direct-io Mode
+==============
 
 The direct-io mode can be selected with the FOPEN_DIRECT_IO flag in the
 FUSE_OPEN reply.
@@ -17,6 +21,9 @@  FUSE_OPEN reply.
 In direct-io mode the page cache is completely bypassed for reads and writes.
 No read-ahead takes place. Shared mmap is disabled.
 
+Cached Modes and Cache Coherence
+================================
+
 In cached mode reads may be satisfied from the page cache, and data may be
 read-ahead by the kernel to fill the cache.  The cache is always kept consistent
 after any writes to the file.  All mmap modes are supported.
@@ -24,7 +31,8 @@  after any writes to the file.  All mmap modes are supported.
 The cached mode has two sub modes controlling how writes are handled.  The
 write-through mode is the default and is supported on all kernels.  The
 writeback-cache mode may be selected by the FUSE_WRITEBACK_CACHE flag in the
-FUSE_INIT reply.
+FUSE_INIT reply. In either modes, if the FOPEN_KEEP_CACHE flag is not set in
+the FUSE_OPEN, cached pages of the file will be invalidated immediatedly.
 
 In write-through mode each write is immediately sent to userspace as one or more
 WRITE requests, as well as updating any cached pages (and caching previously
@@ -38,7 +46,27 @@  reclaim on memory pressure) or explicitly (invoked by close(2), fsync(2) and
 when the last ref to the file is being released on munmap(2)).  This mode
 assumes that all changes to the filesystem go through the FUSE kernel module
 (size and atime/ctime/mtime attributes are kept up-to-date by the kernel), so
-it's generally not suitable for network filesystems.  If a partial page is
+it's generally not suitable for network filesystems (you can consider the
+writeback-cache-v2 mode mentioned latter for them).  If a partial page is
 written, then the page needs to be first read from userspace.  This means, that
 even for files opened for O_WRONLY it is possible that READ requests will be
 generated by the kernel.
+
+Writeback-cache-v2 mode (enabled by the FUSE_WRITEBACK_CACHE_V2 flag) retains
+the dirty page management logic of the writeback-cache mode, which provides
+great write performance.  Furthermore, the v2 mode improves cache coherence for
+multiple FUSE mounts scenarios, especially for network filesystems. The kernel
+a/c/mtime and size attributes are allowed to be updated from the filesystem
+either on timeout or when they have been explicitly invalidated. Meanwhile, if
+ever updated by kernel locally, the attributes will not be propagated to the
+filesystem. In other words, the filesystem rather than kernel is considered the
+official source for generating these attributes.
+
+By combining the writeback-cache-v2 mode with the appropriate open flags
+(FOPEN_KEEP_CACHE and FOPEN_INVAL_ATTR for keeping page cache and invalidating
+attributes on FUSE_OPEN respectively), filesystems are able to implement the
+close-to-open (CTO) consistency semantics, which is widely supported by NFS
+client implementations. This allows for maintaining the writeback manner of
+dirty pages while ensuring cache coherence of attributes and file data if the
+operations among different FUSE mounts on a file are properly serialized by
+users using the open-after-close manner.
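
The close-to-open behaviour described in the last paragraph of the patch can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel or libfuse code: the `Server` and `Mount` classes are hypothetical, a single integer stands in for mtime, and the rule "drop cached pages when the mtime refetched at open differs from the cached one" is an assumption of the model rather than something the patch text specifies. The attribute refetch at open loosely mirrors what the patch describes for FOPEN_INVAL_ATTR, and keeping still-valid pages mirrors FOPEN_KEEP_CACHE; both that flag and FUSE_WRITEBACK_CACHE_V2 are names proposed by this series.

```python
class Server:
    """Hypothetical stand-in for the userspace filesystem.

    It is the authoritative source for file data and for mtime,
    as the v2 description in the patch suggests.
    """

    def __init__(self):
        self.data = b""
        self.mtime = 0


class Mount:
    """Hypothetical model of one FUSE mount with a writeback page cache."""

    def __init__(self, server):
        self.server = server
        self.pages = None         # cached clean data; None means not cached
        self.dirty = None         # data written locally but not yet flushed
        self.cached_mtime = None  # last mtime fetched from the server

    def open(self):
        # Modelled on FOPEN_INVAL_ATTR: always refetch attributes at open.
        mtime = self.server.mtime
        # Modelled on FOPEN_KEEP_CACHE: keep pages, but (model assumption)
        # only while the refetched mtime matches the cached one.
        if mtime != self.cached_mtime:
            self.pages = None
        self.cached_mtime = mtime

    def read(self):
        if self.dirty is not None:
            return self.dirty
        if self.pages is None:
            self.pages = self.server.data  # cache miss: fetch from server
        return self.pages

    def write(self, data):
        # Writeback behaviour: the write stays in the local cache for now.
        self.dirty = data

    def close(self):
        # Dirty data is flushed on close; the server generates the new mtime.
        if self.dirty is not None:
            self.server.data = self.dirty
            self.server.mtime += 1
            self.pages = self.dirty
            self.dirty = None
```

In this model, a reader on a second mount sees a writer's data only if the reader's open() happens after the writer's close(). A read through a file that was opened earlier may still return the old cached contents, which is the open-after-close serialization the patch asks users to follow.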