Message ID: f023a6687b9f2109401e7522b727aa4708dc05f1.1706774109.git.zhengqi.arch@bytedance.com
State: New
Headers:
From: Qi Zheng <zhengqi.arch@bytedance.com>
To: akpm@linux-foundation.org, arnd@arndb.de
Cc: muchun.song@linux.dev, david@redhat.com, willy@infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, Qi Zheng <zhengqi.arch@bytedance.com>
Subject: [PATCH 1/2] mm: pgtable: add missing flag and statistics for kernel PTE page
Date: Thu, 1 Feb 2024 16:05:40 +0800
Message-Id: <f023a6687b9f2109401e7522b727aa4708dc05f1.1706774109.git.zhengqi.arch@bytedance.com>
Series: [1/2] mm: pgtable: add missing flag and statistics for kernel PTE page
Commit Message
Qi Zheng
Feb. 1, 2024, 8:05 a.m. UTC
For kernel PTE pages, we do not need to allocate and initialize the split
ptlock, but since they are still page table pages, it is still necessary
to set the PG_table flag and update the NR_PAGETABLE statistics for them.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 include/asm-generic/pgalloc.h |  7 ++++++-
 include/linux/mm.h            | 21 ++++++++++++++++-----
 2 files changed, 22 insertions(+), 6 deletions(-)
Comments
> On Feb 1, 2024, at 16:05, Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
> For kernel PTE page, we do not need to allocate and initialize its split
> ptlock, but as a page table page, it's still necessary to add PG_table
> flag and NR_PAGETABLE statistics for it.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.
On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
> For kernel PTE page, we do not need to allocate and initialize its split
> ptlock, but as a page table page, it's still necessary to add PG_table
> flag and NR_PAGETABLE statistics for it.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
>  include/asm-generic/pgalloc.h |  7 ++++++-
>  include/linux/mm.h            | 21 ++++++++++++++++-----
>  2 files changed, 22 insertions(+), 6 deletions(-)

This should also update the architectures that define
__HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, otherwise the NR_PAGETABLE counts will
get wrong.

Another related thing is that many architectures have custom allocations
for early page tables, and these would also benefit from NR_PAGETABLE
accounting.

> diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
> index 879e5f8aa5e9..908bd9140ac2 100644
> --- a/include/asm-generic/pgalloc.h
> +++ b/include/asm-generic/pgalloc.h
> @@ -23,6 +23,8 @@ static inline pte_t *__pte_alloc_one_kernel(struct mm_struct *mm)
>
>  	if (!ptdesc)
>  		return NULL;
> +
> +	__pagetable_pte_ctor(ptdesc);
>  	return ptdesc_address(ptdesc);
>  }
>
> @@ -46,7 +48,10 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
>   */
>  static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
>  {
> -	pagetable_free(virt_to_ptdesc(pte));
> +	struct ptdesc *ptdesc = virt_to_ptdesc(pte);
> +
> +	__pagetable_pte_dtor(ptdesc);
> +	pagetable_free(ptdesc);
>  }
>
>  /**
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e442fd0efdd9..e37db032764e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2922,26 +2922,37 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; }
>  static inline void ptlock_free(struct ptdesc *ptdesc) {}
>  #endif /* USE_SPLIT_PTE_PTLOCKS */
>
> -static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
> +static inline void __pagetable_pte_ctor(struct ptdesc *ptdesc)
>  {
>  	struct folio *folio = ptdesc_folio(ptdesc);
>
> -	if (!ptlock_init(ptdesc))
> -		return false;
>  	__folio_set_pgtable(folio);
>  	lruvec_stat_add_folio(folio, NR_PAGETABLE);
> +}
> +
> +static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
> +{
> +	if (!ptlock_init(ptdesc))
> +		return false;
> +
> +	__pagetable_pte_ctor(ptdesc);
>  	return true;
>  }
>
> -static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
> +static inline void __pagetable_pte_dtor(struct ptdesc *ptdesc)
>  {
>  	struct folio *folio = ptdesc_folio(ptdesc);
>
> -	ptlock_free(ptdesc);
>  	__folio_clear_pgtable(folio);
>  	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
>  }
>
> +static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
> +{
> +	ptlock_free(ptdesc);
> +	__pagetable_pte_dtor(ptdesc);
> +}
> +
>  pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);
>  static inline pte_t *pte_offset_map(pmd_t *pmd, unsigned long addr)
>  {
> --
> 2.30.2
Hi Mike,

On 2024/2/4 18:58, Mike Rapoport wrote:
> On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
>> For kernel PTE page, we do not need to allocate and initialize its split
>> ptlock, but as a page table page, it's still necessary to add PG_table
>> flag and NR_PAGETABLE statistics for it.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> ---
>>  include/asm-generic/pgalloc.h |  7 ++++++-
>>  include/linux/mm.h            | 21 ++++++++++++++++-----
>>  2 files changed, 22 insertions(+), 6 deletions(-)
>
> This should also update the architectures that define
> __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, otherwise NR_PAGETABLE counts will get
> wrong.

Yes, this patchset only focuses on the generic implementation. Among the
architectures that define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, some reuse
the generic __pte_alloc_one_kernel(), but some have their own customized
implementations, which indeed need to be fixed.

I wasn't familiar with those architectures and didn't investigate why
they couldn't reuse the generic __pte_alloc_one_kernel(), so I didn't
fix them. It would be better if the maintainers of those architectures
could help fix them. After all, they have a better understanding of the
historical background and have a testing environment. ;)

> Another related thing is that many architectures have custom allocations
> for early page tables and these would also benefit from NR_PAGETABLE
> accounting.

Indeed, this is also a point that can be optimized.

Thanks.

>
>> diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
>> index 879e5f8aa5e9..908bd9140ac2 100644
>> --- a/include/asm-generic/pgalloc.h
>> +++ b/include/asm-generic/pgalloc.h
>> @@ -23,6 +23,8 @@ static inline pte_t *__pte_alloc_one_kernel(struct mm_struct *mm)
>>
>>  	if (!ptdesc)
>>  		return NULL;
>> +
>> +	__pagetable_pte_ctor(ptdesc);
>>  	return ptdesc_address(ptdesc);
>>  }
>>
>> @@ -46,7 +48,10 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
>>   */
>>  static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
>>  {
>> -	pagetable_free(virt_to_ptdesc(pte));
>> +	struct ptdesc *ptdesc = virt_to_ptdesc(pte);
>> +
>> +	__pagetable_pte_dtor(ptdesc);
>> +	pagetable_free(ptdesc);
>>  }
>>
>>  /**
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index e442fd0efdd9..e37db032764e 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2922,26 +2922,37 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; }
>>  static inline void ptlock_free(struct ptdesc *ptdesc) {}
>>  #endif /* USE_SPLIT_PTE_PTLOCKS */
>>
>> -static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
>> +static inline void __pagetable_pte_ctor(struct ptdesc *ptdesc)
>>  {
>>  	struct folio *folio = ptdesc_folio(ptdesc);
>>
>> -	if (!ptlock_init(ptdesc))
>> -		return false;
>>  	__folio_set_pgtable(folio);
>>  	lruvec_stat_add_folio(folio, NR_PAGETABLE);
>> +}
>> +
>> +static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
>> +{
>> +	if (!ptlock_init(ptdesc))
>> +		return false;
>> +
>> +	__pagetable_pte_ctor(ptdesc);
>>  	return true;
>>  }
>>
>> -static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
>> +static inline void __pagetable_pte_dtor(struct ptdesc *ptdesc)
>>  {
>>  	struct folio *folio = ptdesc_folio(ptdesc);
>>
>> -	ptlock_free(ptdesc);
>>  	__folio_clear_pgtable(folio);
>>  	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
>>  }
>>
>> +static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
>> +{
>> +	ptlock_free(ptdesc);
>> +	__pagetable_pte_dtor(ptdesc);
>> +}
>> +
>>  pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);
>>  static inline pte_t *pte_offset_map(pmd_t *pmd, unsigned long addr)
>>  {
>> --
>> 2.30.2
>>
>>
>
On Sun, Feb 04, 2024 at 07:39:38PM +0800, Qi Zheng wrote:
> Hi Mike,
>
> On 2024/2/4 18:58, Mike Rapoport wrote:
> > On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
> > > For kernel PTE page, we do not need to allocate and initialize its split
> > > ptlock, but as a page table page, it's still necessary to add PG_table
> > > flag and NR_PAGETABLE statistics for it.
> > >
> > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> > > ---
> > >  include/asm-generic/pgalloc.h |  7 ++++++-
> > >  include/linux/mm.h            | 21 ++++++++++++++++-----
> > >  2 files changed, 22 insertions(+), 6 deletions(-)
> >
> > This should also update the architectures that define
> > __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, otherwise NR_PAGETABLE counts will get
> > wrong.
>
> Yes, this patchset only focuses on the generic implementation. For those
> architectures that define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, some reuse
> the generic __pte_alloc_one_kernel(), but some have their own customized
> implementations, which indeed need to be fixed.
>
> I wasn't familiar with those architectures and didn't investigate why
> they couldn't reuse the generic __pte_alloc_one_kernel(), so I didn't
> fix them.

But with your patch NR_PAGETABLE will underflow e.g. on arm, and it'd be
a regression for no good reason.

> It would be better if there are maintainers corresponding to
> the architecture who can help fix it. After all, they have a better
> understanding of the historical background and have a testing
> environment. ;)
On 2024/2/4 20:15, Mike Rapoport wrote:
> On Sun, Feb 04, 2024 at 07:39:38PM +0800, Qi Zheng wrote:
>> Hi Mike,
>>
>> On 2024/2/4 18:58, Mike Rapoport wrote:
>>> On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
>>>> For kernel PTE page, we do not need to allocate and initialize its split
>>>> ptlock, but as a page table page, it's still necessary to add PG_table
>>>> flag and NR_PAGETABLE statistics for it.
>>>>
>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>> ---
>>>>  include/asm-generic/pgalloc.h |  7 ++++++-
>>>>  include/linux/mm.h            | 21 ++++++++++++++++-----
>>>>  2 files changed, 22 insertions(+), 6 deletions(-)
>>>
>>> This should also update the architectures that define
>>> __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, otherwise NR_PAGETABLE counts will get
>>> wrong.
>>
>> Yes, this patchset only focuses on the generic implementation. For those
>> architectures that define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, some reuse
>> the generic __pte_alloc_one_kernel(), but some have their own customized
>> implementations, which indeed need to be fixed.
>>
>> I wasn't familiar with those architectures and didn't investigate why
>> they couldn't reuse the generic __pte_alloc_one_kernel(), so I didn't
>> fix them.
>
> But with your patch NR_PAGETABLE will underflow e.g. on arm and it'd be a
> regression for no good reason.

Oh, I see. Some architectures implement their own pte_alloc_one_kernel()
and do not call the generic __pte_alloc_one_kernel(), but still reuse
the generic pte_free_kernel(). So they need to be fixed together.

I will try to fix them and send a v2. But since I'm on vacation at the
moment, updates may not be quick.

Hi Andrew, please help to temporarily remove this patchset from
mm-unstable. Thanks!

>
>> It would be better if there are maintainers corresponding to
>> the architecture who can help fix it. After all, they have a better
>> understanding of the historical background and have a testing
>> environment. ;)
>
On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
> For kernel PTE page, we do not need to allocate and initialize its split
> ptlock, but as a page table page, it's still necessary to add PG_table
> flag and NR_PAGETABLE statistics for it.

No, this is wrong.

We do not account _kernel_ page tables to the _user_. Just because the
kernel, say, called vmalloc() doesn't mean we should charge the task for
it. Moreover, one task may call vmalloc() and a different task would
then call vfree().

This is a can of worms you don't want to open. Why did you want to do
this?
Hi Matthew,

On 2024/2/5 02:51, Matthew Wilcox wrote:
> On Thu, Feb 01, 2024 at 04:05:40PM +0800, Qi Zheng wrote:
>> For kernel PTE page, we do not need to allocate and initialize its split
>> ptlock, but as a page table page, it's still necessary to add PG_table
>> flag and NR_PAGETABLE statistics for it.
>
> No, this is wrong.
>
> We do not account _kernel_ page tables to the _user_. Just because
> the kernel, say, called vmalloc() doesn't mean we should charge the
> task for it. Moreover, one task may call vmalloc() and a different task
> would then call vfree().

Got it. Thanks for providing this information!

> This is a can of worms you don't want to open. Why did you want to do
> this?

Ah, just because the generic {pmd|pud}_alloc_one() has already opened
it. ;) When I looked through the commits (e.g. commit 1d40a5ea01d5), I
couldn't find the information you provided above. That is why I CC'd
you, to double-check this in case I had overlooked some important
background information.

So we should actually fix the generic {pmd|pud}_alloc_one() (and maybe
some implementations in the arch code), right? And it would be better to
add some comments to clarify this.

Thanks.
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 879e5f8aa5e9..908bd9140ac2 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -23,6 +23,8 @@ static inline pte_t *__pte_alloc_one_kernel(struct mm_struct *mm)
 
 	if (!ptdesc)
 		return NULL;
+
+	__pagetable_pte_ctor(ptdesc);
 	return ptdesc_address(ptdesc);
 }
 
@@ -46,7 +48,10 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
  */
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
-	pagetable_free(virt_to_ptdesc(pte));
+	struct ptdesc *ptdesc = virt_to_ptdesc(pte);
+
+	__pagetable_pte_dtor(ptdesc);
+	pagetable_free(ptdesc);
 }
 
 /**
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e442fd0efdd9..e37db032764e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2922,26 +2922,37 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; }
 static inline void ptlock_free(struct ptdesc *ptdesc) {}
 #endif /* USE_SPLIT_PTE_PTLOCKS */
 
-static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
+static inline void __pagetable_pte_ctor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	if (!ptlock_init(ptdesc))
-		return false;
 	__folio_set_pgtable(folio);
 	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+}
+
+static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
+{
+	if (!ptlock_init(ptdesc))
+		return false;
+
+	__pagetable_pte_ctor(ptdesc);
 	return true;
 }
 
-static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
+static inline void __pagetable_pte_dtor(struct ptdesc *ptdesc)
 {
 	struct folio *folio = ptdesc_folio(ptdesc);
 
-	ptlock_free(ptdesc);
 	__folio_clear_pgtable(folio);
 	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
 }
 
+static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
+{
+	ptlock_free(ptdesc);
+	__pagetable_pte_dtor(ptdesc);
+}
+
 pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);
 static inline pte_t *pte_offset_map(pmd_t *pmd, unsigned long addr)
 {