[2/2] mm: pgtable: remove unnecessary split ptlock for kernel PMD page

Message ID 63f0b3d2f9124ae5076963fb5505bd36daba0393.1706774109.git.zhengqi.arch@bytedance.com
State New
Series [1/2] mm: pgtable: add missing flag and statistics for kernel PTE page

Commit Message

Qi Zheng Feb. 1, 2024, 8:05 a.m. UTC
  For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
there is no need to allocate and initialize the split ptlock for kernel
PMD page.

Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/asm-generic/pgalloc.h | 10 ++++++++--
 include/linux/mm.h            | 21 ++++++++++++++++-----
 2 files changed, 24 insertions(+), 7 deletions(-)
  

Comments

Muchun Song Feb. 2, 2024, 3:16 a.m. UTC | #1
> On Feb 1, 2024, at 16:05, Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> 
> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
> there is no need to allocate and initialize the split ptlock for kernel
> PMD page.
> 
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.
  
Matthew Wilcox Feb. 4, 2024, 6:54 p.m. UTC | #2
On Thu, Feb 01, 2024 at 04:05:41PM +0800, Qi Zheng wrote:
> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
> there is no need to allocate and initialize the split ptlock for kernel
> PMD page.

I don't think this is a great idea.  Maybe there's no need to initialise
it, but keeping things the same between kernel & user page tables is
usually better.  We don't normally allocate memory for the spinlock,
it's only in debugging scenarios like LOCKDEP.  I would drop this unless
you have a really compelling argument to make.
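
For context on the allocation point: as far as I recall, the kernel only allocates the split ptlock separately when spinlock_t is larger than a long (the ALLOC_SPLIT_PTLOCKS test compares the spinlock size with BITS_PER_LONG/8), which in practice means debug configurations. A minimal userspace sketch of that size rule follows; the helper names and the example sizes are assumptions, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Models the size comparison behind ALLOC_SPLIT_PTLOCKS: a lock that fits
 * in a long can be embedded in the page-table descriptor, a larger one
 * needs its own allocation. The helper name is hypothetical.
 */
static bool model_alloc_split_ptlocks(size_t spinlock_size)
{
	return spinlock_size > sizeof(long);
}

/* Illustrative sizes only: a small lock is embedded, a large debug lock is not. */
static void model_examples(void)
{
	assert(!model_alloc_split_ptlocks(4));
	assert(model_alloc_split_ptlocks(64));
}
```

Under this rule, skipping the ptlock for kernel PMD pages saves memory only in configurations where the second case applies.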
  
Qi Zheng Feb. 5, 2024, 2:14 a.m. UTC | #3
Hi Matthew,

On 2024/2/5 02:54, Matthew Wilcox wrote:
> On Thu, Feb 01, 2024 at 04:05:41PM +0800, Qi Zheng wrote:
>> For kernel PMD entry, we use init_mm.page_table_lock to protect it, so
>> there is no need to allocate and initialize the split ptlock for kernel
>> PMD page.
> 
> I don't think this is a great idea.  Maybe there's no need to initialise
> it, but keeping things the same between kernel & user page tables is
> usually better.  We don't normally allocate memory for the spinlock,
> it's only in debugging scenarios like LOCKDEP.  I would drop this unless
> you have a really compelling argument to make.

The reason I first noticed this is that we don't allocate and
initialize the ptlock in __pte_alloc_one_kernel(). So at the PTE
level, the implementation of kernel & user page tables is already
different.

Thanks.
  

Patch

diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 908bd9140ac2..57bd41adf760 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -139,7 +139,10 @@  static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 	ptdesc = pagetable_alloc(gfp, 0);
 	if (!ptdesc)
 		return NULL;
-	if (!pagetable_pmd_ctor(ptdesc)) {
+
+	if (mm == &init_mm) {
+		__pagetable_pmd_ctor(ptdesc);
+	} else if (!pagetable_pmd_ctor(ptdesc)) {
 		pagetable_free(ptdesc);
 		return NULL;
 	}
@@ -153,7 +156,10 @@  static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 	struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
 
 	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
-	pagetable_pmd_dtor(ptdesc);
+	if (mm == &init_mm)
+		__pagetable_pmd_dtor(ptdesc);
+	else
+		pagetable_pmd_dtor(ptdesc);
 	pagetable_free(ptdesc);
 }
 #endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e37db032764e..68ca71407177 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3048,26 +3048,37 @@  static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
 	return ptl;
 }
 
-static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
+static inline void __pagetable_pmd_ctor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	if (!pmd_ptlock_init(ptdesc))
-		return false;
 	__folio_set_pgtable(folio);
 	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+}
+
+static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
+{
+	if (!pmd_ptlock_init(ptdesc))
+		return false;
+
+	__pagetable_pmd_ctor(ptdesc);
 	return true;
 }
 
-static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
+static inline void __pagetable_pmd_dtor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	pmd_ptlock_free(ptdesc);
 	__folio_clear_pgtable(folio);
 	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
 }
 
+static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
+{
+	pmd_ptlock_free(ptdesc);
+	__pagetable_pmd_dtor(ptdesc);
+}
+
 /*
  * No scalability reason to split PUD locks yet, but follow the same pattern
  * as the PMD locks to make it easier if we decide to.  The VM should not be