[2/2] mm: pgtable: remove unnecessary split ptlock for kernel PMD page
Commit Message
For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
there is no need to allocate and initialize the split ptlock for kernel
PMD page.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
include/asm-generic/pgalloc.h | 10 ++++++++--
include/linux/mm.h | 21 ++++++++++++++++-----
2 files changed, 24 insertions(+), 7 deletions(-)
Comments
> On Feb 1, 2024, at 16:05, Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
> there is no need to allocate and initialize the split ptlock for kernel
> PMD page.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Thanks.
On Thu, Feb 01, 2024 at 04:05:41PM +0800, Qi Zheng wrote:
> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
> there is no need to allocate and initialize the split ptlock for kernel
> PMD page.
I don't think this is a great idea. Maybe there's no need to initialise
it, but keeping things the same between kernel & user page tables is
usually better. We don't normally allocate memory for the spinlock;
it's only in debugging scenarios like LOCKDEP. I would drop this unless
you have a really compelling argument to make.
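To make the point about allocation concrete, here is a minimal userspace sketch of the sizing rule, assuming the kernel decides via a size comparison along the lines of ALLOC_SPLIT_PTLOCKS (lock size versus the size of a long). The struct layouts and the NEEDS_SEPARATE_ALLOC macro are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a production spinlock: a single word. */
struct plain_lock {
	unsigned int raw;
};

/* Hypothetical stand-in for a debug spinlock carrying extra tracking state. */
struct debug_lock {
	unsigned int raw;
	unsigned int magic;
	void *owner;
	void *dep_map[4];
};

/*
 * Models the rule (an assumption about the kernel's check): the lock is
 * embedded in the page table descriptor when it fits in one long, and
 * allocated separately only when it does not.
 */
#define NEEDS_SEPARATE_ALLOC(type) (sizeof(type) > sizeof(long))

static bool plain_needs_alloc(void)
{
	return NEEDS_SEPARATE_ALLOC(struct plain_lock);
}

static bool debug_needs_alloc(void)
{
	return NEEDS_SEPARATE_ALLOC(struct debug_lock);
}
```

Under this model only the debug layout pays for a separate allocation, which is why the saving from skipping it for kernel PMD pages is limited to debugging configurations.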
Hi Matthew,
On 2024/2/5 02:54, Matthew Wilcox wrote:
> On Thu, Feb 01, 2024 at 04:05:41PM +0800, Qi Zheng wrote:
>> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
>> there is no need to allocate and initialize the split ptlock for kernel
>> PMD page.
>
> I don't think this is a great idea. Maybe there's no need to initialise
> it, but keeping things the same between kernel & user page tables is
> usually better. We don't normally allocate memory for the spinlock;
> it's only in debugging scenarios like LOCKDEP. I would drop this unless
> you have a really compelling argument to make.
The reason I first noticed this is that we don't allocate and
initialize the ptlock in __pte_alloc_one_kernel(). So at the PTE
level, the implementation of kernel & user page tables is already
different.
Thanks.
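For readers following the discussion, the split proposed by the patch can be modelled in userspace roughly as follows. This is only a sketch: struct model_ptdesc, the model_* helpers, and the nr_pagetable counter are hypothetical stand-ins for struct ptdesc, the pagetable_pmd_ctor()/pagetable_pmd_dtor() family, and the NR_PAGETABLE statistic, and a plain heap allocation stands in for the split ptlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct ptdesc. */
struct model_ptdesc {
	int *ptl;        /* separately allocated lock, NULL when none exists */
	bool is_pgtable; /* models the folio's page-table flag */
};

static long nr_pagetable; /* models the NR_PAGETABLE accounting */

/* Common part: flag and accounting only, no lock (like __pagetable_pmd_ctor). */
static void model_pmd_ctor_common(struct model_ptdesc *pt)
{
	pt->is_pgtable = true;
	nr_pagetable++;
}

/* Full constructor: set up the lock first, then do the common part. */
static bool model_pmd_ctor(struct model_ptdesc *pt)
{
	pt->ptl = malloc(sizeof(*pt->ptl));
	if (!pt->ptl)
		return false;
	*pt->ptl = 0;
	model_pmd_ctor_common(pt);
	return true;
}

static void model_pmd_dtor_common(struct model_ptdesc *pt)
{
	pt->is_pgtable = false;
	nr_pagetable--;
}

static void model_pmd_dtor(struct model_ptdesc *pt)
{
	free(pt->ptl);
	pt->ptl = NULL;
	model_pmd_dtor_common(pt);
}

/* Kernel-owned tables skip the lock; user tables take the full path. */
static struct model_ptdesc *model_pmd_alloc_one(bool kernel_mm)
{
	struct model_ptdesc *pt = calloc(1, sizeof(*pt));

	if (!pt)
		return NULL;
	if (kernel_mm) {
		model_pmd_ctor_common(pt);
	} else if (!model_pmd_ctor(pt)) {
		free(pt);
		return NULL;
	}
	return pt;
}

static void model_pmd_free(struct model_ptdesc *pt, bool kernel_mm)
{
	assert(pt != NULL);
	if (kernel_mm)
		model_pmd_dtor_common(pt);
	else
		model_pmd_dtor(pt);
	free(pt);
}
```

The cost of the split is that the caller must pass the same kernel/user distinction to both the allocation and the free path, which is the asymmetry the review comment above is concerned about.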
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -139,7 +139,10 @@ static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 	ptdesc = pagetable_alloc(gfp, 0);
 	if (!ptdesc)
 		return NULL;
-	if (!pagetable_pmd_ctor(ptdesc)) {
+
+	if (mm == &init_mm) {
+		__pagetable_pmd_ctor(ptdesc);
+	} else if (!pagetable_pmd_ctor(ptdesc)) {
 		pagetable_free(ptdesc);
 		return NULL;
 	}
@@ -153,7 +156,10 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 	struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
 
 	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
-	pagetable_pmd_dtor(ptdesc);
+	if (mm == &init_mm)
+		__pagetable_pmd_dtor(ptdesc);
+	else
+		pagetable_pmd_dtor(ptdesc);
 	pagetable_free(ptdesc);
 }
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3048,26 +3048,37 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
 	return ptl;
 }
 
-static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
+static inline void __pagetable_pmd_ctor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	if (!pmd_ptlock_init(ptdesc))
-		return false;
 	__folio_set_pgtable(folio);
 	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+}
+
+static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
+{
+	if (!pmd_ptlock_init(ptdesc))
+		return false;
+
+	__pagetable_pmd_ctor(ptdesc);
 	return true;
 }
 
-static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
+static inline void __pagetable_pmd_dtor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	pmd_ptlock_free(ptdesc);
 	__folio_clear_pgtable(folio);
 	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
 }
 
+static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
+{
+	pmd_ptlock_free(ptdesc);
+	__pagetable_pmd_dtor(ptdesc);
+}
+
 /*
  * No scalability reason to split PUD locks yet, but follow the same pattern
  * as the PMD locks to make it easier if we decide to. The VM should not be