mm: Wire up tail page poisoning over ->mappings
Commit Message
Tail pages have a sanity check on ->mapping fields, not all of them but
only upon index>2, for now. It's because we reused ->mapping fields of the
tail pages index=1,2 for other things.
Define a macro for "max index of tail pages that got ->mapping field
reused" on top of folio definition, because when we grow folio tail pages
we'd want to boost this too together.
Then wire everything up using that macro.
Don't try to poison the ->mapping field in prep_compound_tail() for tail
pages <=TAIL_MAPPING_REUSED_MAX because it's wrong. For example, the 1st
tail page already reused ->mapping field as _nr_pages_mapped. It didn't
already blow up only because we luckily always prepare tail pages before
preparing the head, then prep_compound_head() will update
folio->_nr_pages_mapped so as to void the poisoning. This should make it
always safe again, even e.g. if we prep the head first.
Clean up free_tail_page_prepare() along the way on checking ->mapping
poisoning to also leverage the new macro.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
I split this out from another rfc series. Removed RFC tag because it
wasn't for this patch but for the documentation updates. I'll post the rfc
part alone. Comments welcomed, thanks.
---
include/linux/mm_types.h | 11 +++++++++++
mm/huge_memory.c | 6 +++---
mm/internal.h | 3 ++-
mm/page_alloc.c | 28 +++++++++++-----------------
4 files changed, 27 insertions(+), 21 deletions(-)
Comments
On Mon, Aug 21, 2023 at 12:48:18PM -0400, Peter Xu wrote:
> On Mon, Aug 21, 2023 at 03:29:01AM +0100, Matthew Wilcox wrote:
> > On Sun, Aug 20, 2023 at 09:13:55PM -0400, Peter Xu wrote:
> > > Setting tail mapping for tail 1/2 is even wrong, which part of this patch
> > > fixes:
> > >
> > > @@ -428,7 +428,8 @@ static inline void prep_compound_tail(struct page *head, int tail_idx)
> > > {
> > > struct page *p = head + tail_idx;
> > >
> > > - p->mapping = TAIL_MAPPING;
> > > + if (tail_idx > TAIL_MAPPING_REUSED_MAX)
> > > + p->mapping = TAIL_MAPPING;
> > > set_compound_head(p, head);
> > > set_page_private(p, 0);
> > > }
> >
> > I didn't see this. This is wrong. tail->mapping is only reused for
> > large rmappable pages. It's not reused for other compound pages.
>
> Just to make sure we're on the same page: I think it's not only
> _deferred_list (of tail page 2) that reused the mapping field (word offset
> 3), but also _nr_pages_mapped (of tail page 1)?
I don't see how this comment is related to the part of the email you're
replying to. But yes, prep_large_rmappable overwrites ->mapping in
two tail pages.
> > However, I have a small patch series which enables 'allnoconfig' to
> > build after renaming page->mapping to page->_mapping. Aside from fs/
> > there are *very* few places which look at page->mapping [1]. I'll post
> > that patch series tomorrow.
>
> Assuming it's still not yet posted; I can wait and read it.
I sent out a few patches. Some have made it to -next already because
they're almost trivial. Nobody's commented on the difficult one.
https://lore.kernel.org/linux-mm/20230821202016.2910321-1-willy@infradead.org/
> If you plan to remove the whole TAIL_MAPPING in a few days then I agree
> this patch is not needed, but so far I don't know when it'll land and also
> why, before that it does sound like we can still keep this patch.
This patch is putting fresh paint on a condemned building. Just stop
it.
> Regarding the question on "why removing TAIL_MAPPING": poisoning an unused
> field is always helpful to me even if not referenced with "page->mapping".
> So I don't see an immediate benefit from removing the poisoning if it's
> already there; OTOH not sure whether poison more unused fields will be more
> helpful in general?
You are wrong.
@@ -248,6 +248,17 @@ static inline struct page *encoded_page_ptr(struct encoded_page *page)
return (struct page *)(~ENCODE_PAGE_BITS & (unsigned long)page);
}
+/*
+ * This macro defines the maximum tail pages (of a folio) that can have the
+ * page->mapping field reused.
+ *
+ * When the tail page's mapping field reused, it'll be exempted from
+ * ->mapping poisoning and checks. Also see the macro TAIL_MAPPING.
+ *
+ * When grow the folio struct, please consider growing this too.
+ */
+#define TAIL_MAPPING_REUSED_MAX (2)
+
/**
* struct folio - Represents a contiguous set of bytes.
* @flags: Identical to the page flags.
@@ -2444,9 +2444,9 @@ static void __split_huge_page_tail(struct page *head, int tail,
(1L << PG_dirty) |
LRU_GEN_MASK | LRU_REFS_MASK));
- /* ->mapping in first and second tail page is replaced by other uses */
- VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
- page_tail);
+ /* ->mapping in <=TAIL_MAPPING_REUSED_MAX tail pages are reused */
+ VM_BUG_ON_PAGE(tail > TAIL_MAPPING_REUSED_MAX &&
+ page_tail->mapping != TAIL_MAPPING, page_tail);
page_tail->mapping = head->mapping;
page_tail->index = head->index + tail;
@@ -428,7 +428,8 @@ static inline void prep_compound_tail(struct page *head, int tail_idx)
{
struct page *p = head + tail_idx;
- p->mapping = TAIL_MAPPING;
+ if (tail_idx > TAIL_MAPPING_REUSED_MAX)
+ p->mapping = TAIL_MAPPING;
set_compound_head(p, head);
set_page_private(p, 0);
}
@@ -990,7 +990,7 @@ static inline bool is_check_pages_enabled(void)
static int free_tail_page_prepare(struct page *head_page, struct page *page)
{
struct folio *folio = (struct folio *)head_page;
- int ret = 1;
+ int ret = 1, index = page - head_page;
/*
* We rely page->lru.next never has bit 0 set, unless the page
@@ -1002,9 +1002,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
ret = 0;
goto out;
}
- switch (page - head_page) {
- case 1:
- /* the first tail page: these may be in place of ->mapping */
+
+ /* Sanity check the first tail page */
+ if (index == 1) {
if (unlikely(folio_entire_mapcount(folio))) {
bad_page(page, "nonzero entire_mapcount");
goto out;
@@ -1017,20 +1017,14 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
bad_page(page, "nonzero pincount");
goto out;
}
- break;
- case 2:
- /*
- * the second tail page: ->mapping is
- * deferred_list.next -- ignore value.
- */
- break;
- default:
- if (page->mapping != TAIL_MAPPING) {
- bad_page(page, "corrupted mapping in tail page");
- goto out;
- }
- break;
}
+
+ /* Sanity check the rest tail pages over ->mapping */
+ if (index > TAIL_MAPPING_REUSED_MAX && page->mapping != TAIL_MAPPING) {
+ bad_page(page, "corrupted mapping in tail page");
+ goto out;
+ }
+
if (unlikely(!PageTail(page))) {
bad_page(page, "PageTail not set");
goto out;