netfs: Fix missing xas_retry() calls in xarray iteration
Commit Message
netfslib has a number of places in which it iterates over an xarray whilst
holding the RCU read lock. It *should* call xas_retry() as the first thing
inside each loop and "continue" if that returns true, in case the xarray
walker has passed out a special value indicating that the walk needs to be
redone from the root[*].
Fix this by adding the missing retry checks.
[*] I wonder if this should be done inside xas_find(), xas_next_node() and
suchlike, but I'm told that's not a simple change to effect.
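For reference, the canonical shape of such a loop - a minimal sketch of the
pattern being added, not lifted verbatim from the patch:

	rcu_read_lock();
	xas_for_each(&xas, folio, last_page) {
		/* The walker may hand out an internal retry entry;
		 * xas_retry() spots it and resets the walk state so that
		 * iteration restarts from the root.  The entry must not
		 * be dereferenced - just go round again.
		 */
		if (xas_retry(&xas, folio))
			continue;

		/* ... operate on the real folio ... */
	}
	rcu_read_unlock();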
This can cause an oops like the one below. Note the faulting address - this
is an internal entry value (low bits tagged |0x2) handed out by the xarray
rather than a real folio pointer (see the note after the trace).
BUG: kernel NULL pointer dereference, address: 0000000000000402
...
RIP: 0010:netfs_rreq_unlock+0xef/0x380 [netfs]
...
Call Trace:
netfs_rreq_assess+0xa6/0x240 [netfs]
netfs_readpage+0x173/0x3b0 [netfs]
? init_wait_var_entry+0x50/0x50
filemap_read_page+0x33/0xf0
filemap_get_pages+0x2f2/0x3f0
filemap_read+0xaa/0x320
? do_filp_open+0xb2/0x150
? rmqueue+0x3be/0xe10
ceph_read_iter+0x1fe/0x680 [ceph]
? new_sync_read+0x115/0x1a0
new_sync_read+0x115/0x1a0
vfs_read+0xf3/0x180
ksys_read+0x5f/0xe0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae
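For reference, the faulting address is exactly the encoding of
XA_RETRY_ENTRY: in include/linux/xarray.h, internal entries are pointers
with 0b10 tagged into the low bits:

	static inline void *xa_mk_internal(unsigned long v)
	{
		return (void *)((v << 2) | 2);
	}

	#define XA_RETRY_ENTRY	xa_mk_internal(256)	/* (256 << 2) | 2 == 0x402 */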
Fixes: 3d3c95046742 ("netfs: Provide readahead and readpage netfs helpers")
Reported-by: George Law <glaw@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
---
fs/netfs/buffered_read.c | 9 +++++++--
fs/netfs/io.c | 3 +++
2 files changed, 10 insertions(+), 2 deletions(-)
Comments
On Thu, Nov 03, 2022 at 09:33:28PM +0000, David Howells wrote:
> +++ b/fs/netfs/buffered_read.c
> @@ -46,10 +46,15 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
>
> rcu_read_lock();
> xas_for_each(&xas, folio, last_page) {
> - unsigned int pgpos = (folio_index(folio) - start_page) * PAGE_SIZE;
> - unsigned int pgend = pgpos + folio_size(folio);
> + unsigned int pgpos, pgend;
"unsigned int" assumes that the number of bytes isn't going to exceed 32
bits. I tend to err on the side of safety here and use size_t.
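A hypothetical illustration of the point, assuming 4KiB pages - a folio a
million pages past start_page already needs more than 32 bits once scaled to
bytes (variable names here are illustrative only):

	#include <stdio.h>
	#include <stddef.h>

	#define PAGE_SIZE 4096UL

	int main(void)
	{
		unsigned long delta = 0x100000;		/* pages past start_page */
		unsigned int bad = delta * PAGE_SIZE;	/* 2^32 truncates to 0 */
		size_t good = delta * PAGE_SIZE;	/* 4GiB, as intended on 64-bit */

		printf("bad=%u good=%zu\n", bad, good);
		return 0;
	}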
> bool pg_failed = false;
>
> + if (xas_retry(&xas, folio))
> + continue;
> +
> + pgpos = (folio_index(folio) - start_page) * PAGE_SIZE;
> + pgend = pgpos + folio_size(folio);
What happens if start_page is somewhere inside folio? Seems to me
that pgend ends up overhanging into the next folio?
Matthew Wilcox <willy@infradead.org> wrote:
> "unsigned int" assumes that the number of bytes isn't going to exceed 32
> bits. I tend to err on the side of safety here and use size_t.
Not unreasonable.
> > + pgpos = (folio_index(folio) - start_page) * PAGE_SIZE;
> > + pgend = pgpos + folio_size(folio);
>
> What happens if start_page is somewhere inside folio? Seems to me
> that pgend ends up overhanging into the next folio?
Yeah, I think my maths is dodgy. I should probably use folio_pos() and/or
offset_in_folio().
David
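A sketch of what that might look like, working in absolute file positions
via folio_pos() so that a request beginning partway into a folio can't drag
pgend past the folio's real extent (illustrative only, not the committed
fix; assumes rreq->start is the request's byte offset in the file):

	loff_t fstart = folio_pos(folio);	/* folio's byte position in the file */
	size_t fsize = folio_size(folio);
	size_t pgpos, pgend;

	/* Clamp the start so it can't underflow if the request begins
	 * inside this folio.
	 */
	pgpos = fstart > rreq->start ? fstart - rreq->start : 0;
	pgend = fstart + fsize - rreq->start;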
--- a/fs/netfs/buffered_read.c
+++ b/fs/netfs/buffered_read.c
@@ -46,10 +46,15 @@ void netfs_rreq_unlock_folios(struct netfs_io_request *rreq)
rcu_read_lock();
xas_for_each(&xas, folio, last_page) {
- unsigned int pgpos = (folio_index(folio) - start_page) * PAGE_SIZE;
- unsigned int pgend = pgpos + folio_size(folio);
+ unsigned int pgpos, pgend;
bool pg_failed = false;
+ if (xas_retry(&xas, folio))
+ continue;
+
+ pgpos = (folio_index(folio) - start_page) * PAGE_SIZE;
+ pgend = pgpos + folio_size(folio);
+
for (;;) {
if (!subreq) {
pg_failed = true;
--- a/fs/netfs/io.c
+++ b/fs/netfs/io.c
@@ -121,6 +121,9 @@ static void netfs_rreq_unmark_after_write(struct netfs_io_request *rreq,
XA_STATE(xas, &rreq->mapping->i_pages, subreq->start / PAGE_SIZE);
xas_for_each(&xas, folio, (subreq->start + subreq->len - 1) / PAGE_SIZE) {
+ if (xas_retry(&xas, folio))
+ continue;
+
/* We might have multiple writes from the same huge
* folio, but we mustn't unlock a folio more than once.
*/
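For context, one way to honour that comment is to remember the index of the
last folio dealt with and skip repeats - roughly (a sketch, assuming a
tracking pair like unlocked/have_unlocked in the enclosing function and the
fscache PG_private_2 helper folio_end_fscache()):

	if (have_unlocked && folio_index(folio) <= unlocked)
		continue;	/* already handled via an earlier subrequest */
	unlocked = folio_index(folio);
	have_unlocked = true;
	folio_end_fscache(folio);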