block: Fix page refcounts for unaligned buffers in __bio_release_pages()

Message ID 86e592a9-98d4-4cff-a646-0c0084328356@cybernetics.com
State New
Headers
Series block: Fix page refcounts for unaligned buffers in __bio_release_pages() |

Commit Message

Tony Battersby Feb. 29, 2024, 6:08 p.m. UTC
  Fix an incorrect number of pages being released for buffers that do not
start at the beginning of a page.

Fixes: 1b151e2435fc ("block: Remove special-casing of compound pages")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
---

Tested with 6.1.79.  The 6.1 backport can just use
folio_put_refs(fi.folio, nr_pages) instead of do {...} while.

 block/bio.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)


base-commit: d206a76d7d2726f3b096037f2079ce0bd3ba329b
  

Comments

Matthew Wilcox Feb. 29, 2024, 7:53 p.m. UTC | #1
On Thu, Feb 29, 2024 at 01:08:09PM -0500, Tony Battersby wrote:
> Fix an incorrect number of pages being released for buffers that do not
> start at the beginning of a page.

Oh, I see what I did.  Wouldn't a simpler fix be to just set "done" to
offset_in_page(fi.offset)?

> @@ -1152,7 +1152,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
>  
>  	bio_for_each_folio_all(fi, bio) {
>  		struct page *page;
> -		size_t done = 0;
> +		size_t nr_pages;
>  
>  		if (mark_dirty) {
>  			folio_lock(fi.folio);
> @@ -1160,10 +1160,11 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
>  			folio_unlock(fi.folio);
>  		}
>  		page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
> +		nr_pages = (fi.offset + fi.length - 1) / PAGE_SIZE -
> +			   fi.offset / PAGE_SIZE + 1;
>  		do {
>  			bio_release_page(bio, page++);
> -			done += PAGE_SIZE;
> -		} while (done < fi.length);
> +		} while (--nr_pages != 0);
>  	}
>  }
>  EXPORT_SYMBOL_GPL(__bio_release_pages);

The long-term path here, I think, is to replace this bio_release_page()
with a bio_release_folio(folio, offset, length) which calls into
a new unpin_user_folio(folio, nr) which calls gup_put_folio().
  
Tony Battersby Feb. 29, 2024, 8:40 p.m. UTC | #2
On 2/29/24 14:53, Matthew Wilcox wrote:
> On Thu, Feb 29, 2024 at 01:08:09PM -0500, Tony Battersby wrote:
>> Fix an incorrect number of pages being released for buffers that do not
>> start at the beginning of a page.
> Oh, I see what I did.  Wouldn't a simpler fix be to just set "done" to
> offset_in_page(fi.offset)?

Actually it would be:

ssize_t done = -offset_in_page(offset);

But then you have signed vs. unsigned comparison in the while(), or you
could rearrange the loop to avoid the negative, but then it gets clunky.

>
>> @@ -1152,7 +1152,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
>>  
>>  	bio_for_each_folio_all(fi, bio) {
>>  		struct page *page;
>> -		size_t done = 0;
>> +		size_t nr_pages;
>>  
>>  		if (mark_dirty) {
>>  			folio_lock(fi.folio);
>> @@ -1160,10 +1160,11 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
>>  			folio_unlock(fi.folio);
>>  		}
>>  		page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
>> +		nr_pages = (fi.offset + fi.length - 1) / PAGE_SIZE -
>> +			   fi.offset / PAGE_SIZE + 1;
>>  		do {
>>  			bio_release_page(bio, page++);
>> -			done += PAGE_SIZE;
>> -		} while (done < fi.length);
>> +		} while (--nr_pages != 0);
>>  	}
>>  }
>>  EXPORT_SYMBOL_GPL(__bio_release_pages);
> The long-term path here, I think, is to replace this bio_release_page()
> with a bio_release_folio(folio, offset, length) which calls into
> a new unpin_user_folio(folio, nr) which calls gup_put_folio().

I developed the patch with the 6.1 stable series, which just has:

nr_pages = (fi.offset + fi.length - 1) / PAGE_SIZE -
	   fi.offset / PAGE_SIZE + 1;
folio_put_refs(fi.folio, nr_pages);

Which is another reason that I went for the direct nr_pages
calculation.  Would you still prefer the negative offset_in_page() approach?

Tony
  
Greg Edwards Feb. 29, 2024, 10:56 p.m. UTC | #3
On Thu, Feb 29, 2024 at 01:08:09PM -0500, Tony Battersby wrote:
> Fix an incorrect number of pages being released for buffers that do not
> start at the beginning of a page.
>
> Fixes: 1b151e2435fc ("block: Remove special-casing of compound pages")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
> ---

This resolves the QEMU hugetlb issue I noted earlier today here [1].
I tested it on 6.1.79, 6.8-rc6 and linux-next-20240229.  Thank you!

Feel free to add a:

Tested-by: Greg Edwards <gedwards@ddn.com>

[1] https://lore.kernel.org/linux-block/20240229182513.GA17355@bobdog.home.arpa/
  

Patch

diff --git a/block/bio.c b/block/bio.c
index b9642a41f286..b52b56067e79 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1152,7 +1152,7 @@  void __bio_release_pages(struct bio *bio, bool mark_dirty)
 
 	bio_for_each_folio_all(fi, bio) {
 		struct page *page;
-		size_t done = 0;
+		size_t nr_pages;
 
 		if (mark_dirty) {
 			folio_lock(fi.folio);
@@ -1160,10 +1160,11 @@  void __bio_release_pages(struct bio *bio, bool mark_dirty)
 			folio_unlock(fi.folio);
 		}
 		page = folio_page(fi.folio, fi.offset / PAGE_SIZE);
+		nr_pages = (fi.offset + fi.length - 1) / PAGE_SIZE -
+			   fi.offset / PAGE_SIZE + 1;
 		do {
 			bio_release_page(bio, page++);
-			done += PAGE_SIZE;
-		} while (done < fi.length);
+		} while (--nr_pages != 0);
 	}
 }
 EXPORT_SYMBOL_GPL(__bio_release_pages);