[0/6] shmem: high order folios support in write path

Message ID 20230915095042.1320180-1-da.gomez@samsung.com

Daniel Gomez Sept. 15, 2023, 9:51 a.m. UTC
This series adds support for high order folios in the shmem write
path.

This is a continuation of the shmem work from Luis here [1]
following Matthew Wilcox's suggestion [2] regarding the path to take
for the folio allocation order calculation.

[1] RFC v2 add support for blocksize > PAGE_SIZE
https://lore.kernel.org/all/ZHBowMEDfyrAAOWH@bombadil.infradead.org/T/#md3e93ab46ce2ad9254e1eb54ffe71211988b5632
[2] https://lore.kernel.org/all/ZHD9zmIeNXICDaRJ@casper.infradead.org/
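For illustration, the conservative order calculation suggested in [2] can be sketched as follows (a standalone userspace sketch, not the kernel code; `write_folio_order`, its parameters, and the page-granular interface are made up for this example): the order only grows while a naturally aligned folio of the next order still fits entirely inside the write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical standalone sketch (not the kernel implementation):
 * choose the largest folio order such that a folio of that order,
 * naturally aligned at 'index', lies entirely within the write
 * ending at 'last' (both in pages, 'last' exclusive). */
static unsigned int write_folio_order(size_t index, size_t last,
				      unsigned int max_order)
{
	unsigned int order = 0;

	while (order < max_order) {
		unsigned int next = order + 1;
		size_t n = (size_t)1 << next;

		/* folio of the next order must be naturally aligned
		 * at 'index' and fully covered by the write */
		if (index % n != 0 || index + n > last)
			break;
		order = next;
	}
	return order;
}
```

For example, a 16-page write starting at index 0 allows order 4, while the same write starting at index 1 forces order 0 because no larger folio would be naturally aligned.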

The patches have been tested on and sent from next-230911. They also
apply cleanly to the latest next-230914.

fsx and fstests have been run on tmpfs with noswap, with the
following results:
- fsx: 2-day test, 21.5B
- fstests: Same result as baseline for next-230911 [3][4][5]

[3] Baseline next-230911 failures are: generic/080 generic/126
generic/193 generic/633 generic/689
[4] fstests logs baseline: https://gitlab.com/-/snippets/3598621
[5] fstests logs patches: https://gitlab.com/-/snippets/3598628

There are at least 2 cases/topics to handle that I'd appreciate
feedback on.
1. With the new strategy, you might end up with a folio order matching
HPAGE_PMD_ORDER. However, we won't respect the 'huge' flag anymore if
THP is enabled.
2. When the above (1.) occurs, the code skips the huge path, so
xa_find with hindex is skipped.

Daniel

Daniel Gomez (5):
  filemap: make the folio order calculation shareable
  shmem: drop BLOCKS_PER_PAGE macro
  shmem: add order parameter support to shmem_alloc_folio
  shmem: add file length in shmem_get_folio path
  shmem: add large folios support to the write path

Luis Chamberlain (1):
  shmem: account for large order folios

 fs/iomap/buffered-io.c   |  6 ++-
 include/linux/pagemap.h  | 42 ++++++++++++++++---
 include/linux/shmem_fs.h |  2 +-
 mm/filemap.c             |  8 ----
 mm/khugepaged.c          |  2 +-
 mm/shmem.c               | 91 +++++++++++++++++++++++++---------------
 6 files changed, 100 insertions(+), 51 deletions(-)

--
2.39.2
  

Comments

David Hildenbrand Sept. 15, 2023, 3:36 p.m. UTC | #1
On 15.09.23 17:34, Matthew Wilcox wrote:
> On Fri, Sep 15, 2023 at 05:29:51PM +0200, David Hildenbrand wrote:
>> On 15.09.23 11:51, Daniel Gomez wrote:
>>> This series add support for high order folios in shmem write
>>> path.
>>> There are at least 2 cases/topics to handle that I'd appreciate
>>> feedback.
>>> 1. With the new strategy, you might end up with a folio order matching
>>> HPAGE_PMD_ORDER. However, we won't respect the 'huge' flag anymore if
>>> THP is enabled.
>>> 2. When the above (1.) occurs, the code skips the huge path, so
>>> xa_find with hindex is skipped.
>>
>> Similar to large anon folios (but different to large non-shmem folios in the
>> pagecache), this can result in memory waste.
> 
> No, it can't.  This patchset triggers only on write, not on read or page
> fault, and it's conservative, so it will only allocate folios which are
> entirely covered by the write.  IOW this is memory we must allocate in
> order to satisfy the write; we're just allocating it in larger chunks
> when we can.

Oh, good! I was assuming you would eventually over-allocate on the write 
path.
  
Matthew Wilcox Sept. 15, 2023, 3:40 p.m. UTC | #2
On Fri, Sep 15, 2023 at 05:36:27PM +0200, David Hildenbrand wrote:
> On 15.09.23 17:34, Matthew Wilcox wrote:
> > No, it can't.  This patchset triggers only on write, not on read or page
> > fault, and it's conservative, so it will only allocate folios which are
> > entirely covered by the write.  IOW this is memory we must allocate in
> > order to satisfy the write; we're just allocating it in larger chunks
> > when we can.
> 
> Oh, good! I was assuming you would eventually over-allocate on the write
> path.

We might!  But that would be a different patchset, and it would be
subject to its own discussion.

Something else I've been wondering about is possibly reallocating the
pages on a write.  This would apply to both normal files and shmem.
If you read in a file one byte at a time, then overwrite a big chunk of
it with a large single write, that seems like a good signal that maybe
we should manage that part of the file as a single large chunk instead
of individual pages.  Maybe.

Lots of things for people who are obsessed with performance to play
with ;-)
  
David Hildenbrand Sept. 15, 2023, 3:43 p.m. UTC | #3
On 15.09.23 17:40, Matthew Wilcox wrote:
> On Fri, Sep 15, 2023 at 05:36:27PM +0200, David Hildenbrand wrote:
>> On 15.09.23 17:34, Matthew Wilcox wrote:
>>> No, it can't.  This patchset triggers only on write, not on read or page
>>> fault, and it's conservative, so it will only allocate folios which are
>>> entirely covered by the write.  IOW this is memory we must allocate in
>>> order to satisfy the write; we're just allocating it in larger chunks
>>> when we can.
>>
>> Oh, good! I was assuming you would eventually over-allocate on the write
>> path.
> 
> We might!  But that would be a different patchset, and it would be
> subject to its own discussion.
> 
> Something else I've been wondering about is possibly reallocating the
> pages on a write.  This would apply to both normal files and shmem.
> If you read in a file one byte at a time, then overwrite a big chunk of
> it with a large single write, that seems like a good signal that maybe
> we should manage that part of the file as a single large chunk instead
> of individual pages.  Maybe.
> 
> Lots of things for people who are obsessed with performance to play
> with ;-)

:) Absolutely. ... because if nobody will be consuming that written 
memory any time soon, it might also be the wrong place for a large/huge 
folio.
  
Daniel Gomez Sept. 18, 2023, 7:32 a.m. UTC | #4
On Fri, Sep 15, 2023 at 05:29:51PM +0200, David Hildenbrand wrote:
> On 15.09.23 11:51, Daniel Gomez wrote:
> > This series add support for high order folios in shmem write
> > path.
> >
> > This is a continuation of the shmem work from Luis here [1]
> > following Matthew Wilcox's suggestion [2] regarding the path to take
> > for the folio allocation order calculation.
> >
> > [1] RFC v2 add support for blocksize > PAGE_SIZE
> > https://lore.kernel.org/all/ZHBowMEDfyrAAOWH@bombadil.infradead.org/T/#md3e93ab46ce2ad9254e1eb54ffe71211988b5632
> > [2] https://lore.kernel.org/all/ZHD9zmIeNXICDaRJ@casper.infradead.org/
> >
> > Patches have been tested and sent from next-230911. They do apply
> > cleanly to the latest next-230914.
> >
> > fsx and fstests has been performed on tmpfs with noswap with the
> > following results:
> > - fsx: 2d test, 21,5B
> > - fstests: Same result as baseline for next-230911 [3][4][5]
> >
> > [3] Baseline next-230911 failures are: generic/080 generic/126
> > generic/193 generic/633 generic/689
> > [4] fstests logs baseline: https://gitlab.com/-/snippets/3598621
> > [5] fstests logs patches: https://gitlab.com/-/snippets/3598628
> >
> > There are at least 2 cases/topics to handle that I'd appreciate
> > feedback.
> > 1. With the new strategy, you might end up with a folio order matching
> > HPAGE_PMD_ORDER. However, we won't respect the 'huge' flag anymore if
> > THP is enabled.
> > 2. When the above (1.) occurs, the code skips the huge path, so
> > xa_find with hindex is skipped.
>
> Similar to large anon folios (but different to large non-shmem folios in the
> pagecache), this can result in memory waste.
>
> We discussed that topic in the last bi-weekly mm meeting, and also how to
> eventually configure that for shmem.
>
> Refer to [1] for a summary.
>
> [1] https://lkml.kernel.org/r/4966f496-9f71-460c-b2ab-8661384ce626@arm.com

Thanks for the summary, David (I was missing linux-mm from kvack in lei).

I think capping the max at PMD_ORDER-1 would suffice here to
honor/respect the huge flag, although we would then end up with a
different max value than pagecache/readahead.
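A minimal sketch of that cap (hypothetical helper, not the series code; the HPAGE_PMD_ORDER value is assumed for x86-64 with 4 KiB pages): clamping the computed write-path order below the PMD order would leave the existing huge path as the only source of PMD-sized folios.

```c
#include <assert.h>

#define HPAGE_PMD_ORDER 9 /* assumption: x86-64 with 4 KiB pages */

/* Hypothetical helper: cap a computed write-path folio order to
 * PMD_ORDER - 1 so the 'huge' flag keeps sole control over
 * PMD-sized allocations. */
static unsigned int cap_write_order(unsigned int order)
{
	if (order >= HPAGE_PMD_ORDER)
		order = HPAGE_PMD_ORDER - 1;
	return order;
}
```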
>
> --
> Cheers,
>
> David / dhildenb
>