afs: Fix waiting for writeback then skipping folio

Message ID 20230607204120.89416-2-vishal.moola@gmail.com
State New
Series afs: Fix waiting for writeback then skipping folio

Commit Message

Vishal Moola June 7, 2023, 8:41 p.m. UTC
Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. The function waits for writeback on a folio, but then
proceeds to the rest of the batch without trying to write that folio
again. This patch has it attempt to write the folio again.

This has only been compile tested.

Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 fs/afs/write.c | 2 ++
 1 file changed, 2 insertions(+)
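
For orientation, here is a heavily simplified sketch of the per-folio loop in
afs_writepages_region() (paraphrased from the shape of the code around commit
acc8d8588cb7, not the literal kernel source; error handling, the fscache
ifdefs and the batch release are elided):

	/*
	 * Illustrative sketch only.  Each folio in the batch is locked
	 * and written back; a folio found under writeback is waited on
	 * when syncing.
	 */
	for (i = 0; i < folio_batch_count(&fbatch); i++) {
		struct folio *folio = fbatch.folios[i];

try_again:				/* label added by this patch */
		folio_lock(folio);

		if (folio_test_writeback(folio) ||
		    folio_test_fscache(folio)) {
			folio_unlock(folio);
			if (wbc->sync_mode != WB_SYNC_NONE) {
				folio_wait_writeback(folio);
				folio_wait_fscache(folio);
				goto try_again;	/* the fix: retry this folio */
			}
			start += folio_size(folio);
			continue;	/* pre-fix: reached even after waiting,
					 * so the folio was never rewritten */
		}

		/* locked and not under writeback: write it out */
		ret = afs_write_back_from_locked_folio(mapping, wbc, folio,
						       start, end);
	}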
  

Comments

Andrew Morton June 9, 2023, 12:50 a.m. UTC | #1
On Wed,  7 Jun 2023 13:41:20 -0700 "Vishal Moola (Oracle)" <vishal.moola@gmail.com> wrote:

> Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> folio batch. The function waits for writeback on a folio, but then
> proceeds to the rest of the batch without trying to write that folio
> again. This patch has it attempt to write the folio again.
> 
> This has only been compile tested.

This seems fairly serious?

> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  			 * (changing page->mapping to NULL), or even swizzled
>  			 * back from swapper_space to tmpfs file mapping
>  			 */
> +try_again:
>  			if (wbc->sync_mode != WB_SYNC_NONE) {
>  				ret = folio_lock_killable(folio);
>  				if (ret < 0) {
> @@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  #ifdef CONFIG_AFS_FSCACHE
>  					folio_wait_fscache(folio);
>  #endif
> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);
>  				}

From my reading, we'll fail to write out the dirty data.  Presumably
not easily observable, as it will get written out again later on.  But
we're also calling afs_write_back_from_locked_folio() with an unlocked
folio, which might cause mayhem.

So I'm suspecting that a cc:stable is needed.  David, could you please
take a look and perhaps retest?

Thanks.
  
David Howells June 16, 2023, 10:43 p.m. UTC | #2
Andrew Morton <akpm@linux-foundation.org> wrote:

> > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > folio batch. The function waits for writeback on a folio, but then
> > proceeds to the rest of the batch without trying to write that folio
> > again. This patch has it attempt to write the folio again.
> > 
> > This has only been compile tested.
> 
> This seems fairly serious?

We will try to write it again later, but sync()/fsync() might now have
skipped it.

> From my reading, we'll fail to write out the dirty data.  Presumably
> not easily observable, as it will get written out again later on.

As it's a network filesystem, interactions with third parties could cause
apparent corruption.  Closing a file will flush it - but if there's a
simultaneous op of some other kind, a bit of a flush or a sync may get missed
and the copy visible to another user be temporarily missing that bit.

> But we're also calling afs_write_back_from_locked_folio() with an unlocked
> folio, which might cause mayhem.

Without this patch, you mean?  There's a "continue" statement that should send
us back to the top of the loop before we get as far as
afs_write_back_from_locked_folio() - and then the folio_unlock() there would
go bang.

David
  
David Howells June 16, 2023, 10:43 p.m. UTC | #3
Vishal Moola (Oracle) <vishal.moola@gmail.com> wrote:

> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);

The "else" is then redundant.

David
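
With the goto in place, the sync-mode branch can never fall through, so the
second hunk could be flattened along these lines (an illustrative sketch of
the suggested cleanup, not a posted v2):

				if (wbc->sync_mode != WB_SYNC_NONE) {
					folio_wait_writeback(folio);
#ifdef CONFIG_AFS_FSCACHE
					folio_wait_fscache(folio);
#endif
					goto try_again;
				}
				start += folio_size(folio);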
  
Andrew Morton June 16, 2023, 11:22 p.m. UTC | #4
On Fri, 16 Jun 2023 23:43:02 +0100 David Howells <dhowells@redhat.com> wrote:

> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > > folio batch. The function waits for writeback on a folio, but then
> > > proceeds to the rest of the batch without trying to write that folio
> > > again. This patch has it attempt to write the folio again.
> > > 
> > > This has only been compile tested.
> > 
> > This seems fairly serious?
> 
> We will try to write it again later, but sync()/fsync() might now have
> skipped it.
> 
> > From my reading, we'll fail to write out the dirty data.  Presumably
> > not easily observable, as it will get written out again later on.
> 
> As it's a network filesystem, interactions with third parties could cause
> apparent corruption.  Closing a file will flush it - but if there's a
> simultaneous op of some other kind, a bit of a flush or a sync may get missed
> and the copy visible to another user be temporarily missing that bit.
> 
> > But we're also calling afs_write_back_from_locked_folio() with an unlocked
> > folio, which might cause mayhem.
> 
> Without this patch, you mean?  There's a "continue" statement that should send
> us back to the top of the loop before we get as far as
> afs_write_back_from_locked_folio() - and then the folio_unlock() there would
> go bang.
> 

Well, what I'm really asking is the thing I ask seven times a day:

- what are the end-user visible effects of the bug

- should the fix be backported into earlier kernels
  
David Howells June 16, 2023, 11:26 p.m. UTC | #5
Andrew Morton <akpm@linux-foundation.org> wrote:

> Well, what I'm really asking is the thing I ask seven times a day:
> 
> - what are the end-user visible effects of the bug

A third party might see an incomplete flush after they've done a sync - which
amounts to temporary file corruption.

> - should the fix be backported into earlier kernels

Yes.

David
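
To make that effect concrete, a hypothetical userspace sequence on one client
(the AFS path and buffer size are invented for illustration):

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		char buf[4096] = { 0 };
		int fd = open("/afs/example.com/shared/data", O_WRONLY);

		if (fd < 0)
			return 1;
		/* dirty some folios; some may already be under writeback */
		write(fd, buf, sizeof(buf));
		/* pre-fix: a folio this sync waited on could be skipped,
		 * so the flush may complete without writing it */
		fsync(fd);
		close(fd);
		/* a reader on another client may briefly see stale data */
		return 0;
	}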
  

Patch

diff --git a/fs/afs/write.c b/fs/afs/write.c
index a724228e4d94..18ccb613dff8 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
 			 * (changing page->mapping to NULL), or even swizzled
 			 * back from swapper_space to tmpfs file mapping
 			 */
+try_again:
 			if (wbc->sync_mode != WB_SYNC_NONE) {
 				ret = folio_lock_killable(folio);
 				if (ret < 0) {
@@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
 #ifdef CONFIG_AFS_FSCACHE
 					folio_wait_fscache(folio);
 #endif
+					goto try_again;
 				} else {
 					start += folio_size(folio);
 				}