Message ID | 20230626171430.3167004-9-ryan.roberts@arm.com |
---|---|
State | New |
Headers |
From: Ryan Roberts <ryan.roberts@arm.com> To: Andrew Morton <akpm@linux-foundation.org>, "Matthew Wilcox (Oracle)" <willy@infradead.org>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, Yin Fengwei <fengwei.yin@intel.com>, David Hildenbrand <david@redhat.com>, Yu Zhao <yuzhao@google.com>, Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Geert Uytterhoeven <geert@linux-m68k.org>, Christian Borntraeger <borntraeger@linux.ibm.com>, Sven Schnelle <svens@linux.ibm.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com> Cc: Ryan Roberts <ryan.roberts@arm.com>, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 08/10] mm: Kconfig hooks to determine max anon folio allocation order Date: Mon, 26 Jun 2023 18:14:28 +0100 Message-Id: <20230626171430.3167004-9-ryan.roberts@arm.com> In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> |
Series | variable-order, large folios for anonymous memory |
Commit Message
Ryan Roberts
June 26, 2023, 5:14 p.m. UTC
For variable-order anonymous folios, we need to determine the order that
we will allocate. From a SW perspective, the higher the order we
allocate, the less overhead we will have; fewer faults, fewer folios in
lists, etc. But of course there will also be more memory wastage as the
order increases.
From a HW perspective, there are memory block sizes that can be
beneficial to reducing TLB pressure. arm64, for example, has the ability
to map "contpte" sized chunks (64K for a 4K base page, 2M for 16K and
64K base pages) such that one of these chunks only uses a single TLB
entry.
So we let the architecture specify the order of the maximally beneficial
mapping unit when PTE-mapped. Furthermore, because in some cases, this
order may be quite big (and therefore potentially wasteful of memory),
allow the arch to specify 2 values: one is the max order for a mapping
that _would not_ use THP if all size and alignment constraints were met,
and the other is the max order for a mapping that _would_ use THP if all
those constraints were met.
Implement this with Kconfig by introducing some new options to allow the
architecture to declare that it supports large anonymous folios along
with these 2 preferred max order values. Then introduce a user-facing
option, LARGE_ANON_FOLIO, which defaults to disabled and can only be
enabled if the architecture has declared its support. When disabled, it
forces the max order values, LARGE_ANON_FOLIO_NOTHP_ORDER_MAX and
LARGE_ANON_FOLIO_THP_ORDER_MAX to 0, meaning only a single page is ever
allocated.
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
mm/Kconfig | 39 +++++++++++++++++++++++++++++++++++++++
mm/memory.c | 8 ++++++++
2 files changed, 47 insertions(+)
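The contpte sizes quoted in the commit message translate into folio orders by simple arithmetic: an order-n folio spans (base page size << n) bytes. The following is a minimal user-space sketch of that conversion, not code from the patch; the helper name size_to_order() is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): return the smallest
 * order such that (page_size << order) covers block_size bytes.
 */
static unsigned int size_to_order(size_t block_size, size_t page_size)
{
	unsigned int order = 0;

	/* A block can never be smaller than one base page. */
	assert(page_size != 0 && block_size >= page_size);

	while ((page_size << order) < block_size)
		order++;

	return order;
}
```

With the figures from the commit message, size_to_order(64K, 4K) gives 4, size_to_order(2M, 16K) gives 7 and size_to_order(2M, 64K) gives 5, which is why a single arch-wide constant would not fit every base page size.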
Comments
On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> [...]
>
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 7672a22647b4..f4ba48c37b75 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -1208,4 +1208,43 @@ config PER_VMA_LOCK
>
> source "mm/damon/Kconfig"
>
> +config ARCH_SUPPORTS_LARGE_ANON_FOLIO
> +	def_bool n
> +	help
> +	  An arch should select this symbol if wants to allow LARGE_ANON_FOLIO
> +	  to be enabled. It must also set the following integer values:
> +	  - ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +	  - ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
> +
> +config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +	int
> +	help
> +	  The maximum size of folio to allocate for an anonymous VMA PTE-mapping
> +	  that does not have the MADV_HUGEPAGE hint set.
> +
> +config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
> +	int
> +	help
> +	  The maximum size of folio to allocate for an anonymous VMA PTE-mapping
> +	  that has the MADV_HUGEPAGE hint set.
> +
> +config LARGE_ANON_FOLIO
> +	bool "Allocate large folios for anonymous memory"
> +	depends on ARCH_SUPPORTS_LARGE_ANON_FOLIO
> +	default n
> +	help
> +	  Use large (bigger than order-0) folios to back anonymous memory where
> +	  possible. This reduces the number of page faults, as well as other
> +	  per-page overheads to improve performance for many workloads.
> +
> +config LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +	int
> +	default 0 if !LARGE_ANON_FOLIO
> +	default ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +
> +config LARGE_ANON_FOLIO_THP_ORDER_MAX
> +	int
> +	default 0 if !LARGE_ANON_FOLIO
> +	default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
> +
> endmenu

I don't think an MVP should add this many Kconfigs. One Kconfig sounds
reasonable to me for now.
On 27/06/2023 03:47, Yu Zhao wrote:
> On Mon, Jun 26, 2023 at 11:15 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> [...]
>>
>> +config LARGE_ANON_FOLIO_THP_ORDER_MAX
>> +	int
>> +	default 0 if !LARGE_ANON_FOLIO
>> +	default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
>> +
>> endmenu
>
> I don't think an MVP should add this many Kconfigs. One Kconfig sounds
> reasonable to me for now.

If we move to arch_wants_pte_order() as you suggested (in your response to
patch 3) then I agree we can remove most of these. I still think we might want
2 though. For an arch that does not implement arch_wants_pte_order() we
wouldn't want LARGE_ANON_FOLIO to show up in menuconfig so we would still need
ARCH_SUPPORTS_LARGE_ANON_FOLIO:

config ARCH_SUPPORTS_LARGE_ANON_FOLIO
	def_bool n
	help
	  An arch should select this symbol if wants to allow LARGE_ANON_FOLIO
	  to be enabled. In this case, It must also define arch_wants_pte_order()

config LARGE_ANON_FOLIO
	bool "Allocate large folios for anonymous memory"
	depends on ARCH_SUPPORTS_LARGE_ANON_FOLIO
	default n
	help
	  Use large (bigger than order-0) folios to back anonymous memory where
	  possible. This reduces the number of page faults, as well as other
	  per-page overheads to improve performance for many workloads.

What do you think?
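For readers unfamiliar with the pattern being discussed, here is a rough user-space sketch of how an arch hook with a generic fallback is commonly wired up in C. The thread names arch_wants_pte_order() but does not give its signature, so the zero-argument form, the DEMO_* constants and the demo_anon_folio_order() caller below are all assumptions made for illustration; they are not taken from this series.

```c
#include <assert.h>

/* --- What a hypothetical arch header might provide ------------------- */

#define DEMO_PAGE_SHIFT		12	/* assumed: 4K base pages */
#define DEMO_CONTPTE_SHIFT	16	/* 64K contpte block, per the commit message */

/* Assumed signature: returns the order the hardware prefers for PTE mappings. */
static inline int arch_wants_pte_order(void)
{
	return DEMO_CONTPTE_SHIFT - DEMO_PAGE_SHIFT;
}
/* Self-referential define tells generic code that an override exists. */
#define arch_wants_pte_order arch_wants_pte_order

/* --- Generic fallback, compiled only when no arch override exists ---- */

#ifndef arch_wants_pte_order
static inline int arch_wants_pte_order(void)
{
	return 0;	/* order-0: single pages only */
}
#endif

/* --- Hypothetical core-mm caller -------------------------------------- */

static inline int demo_anon_folio_order(int large_anon_folio_enabled)
{
	/* With the feature disabled, fall back to single-page allocation. */
	return large_anon_folio_enabled ? arch_wants_pte_order() : 0;
}
```

In this arrangement the preferred order lives in code rather than in Kconfig integers, so only the on/off switch (and possibly the arch capability symbol Ryan describes) would need to remain as config options.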
On Mon, Jun 26, 2023 at 10:15 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> [...]
>
> +config LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +	int
> +	default 0 if !LARGE_ANON_FOLIO
> +	default ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
> +
> +config LARGE_ANON_FOLIO_THP_ORDER_MAX
> +	int
> +	default 0 if !LARGE_ANON_FOLIO
> +	default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
> +

IMHO I don't think we need all of the new kconfigs. Ideally the large
anon folios could be supported by all arches, although some of them
may not benefit from larger TLB entries due to lack of hardware
support.

For now with a minimum implementation, I think you could define a
macro or a function that returns the hardware preferred order.

> endmenu
> diff --git a/mm/memory.c b/mm/memory.c
> index 9165ed1b9fc2..a8f7e2b28d7a 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> [...]
On 29/06/2023 02:38, Yang Shi wrote:
> On Mon, Jun 26, 2023 at 10:15 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> [...]
>>
>> +config LARGE_ANON_FOLIO_THP_ORDER_MAX
>> +	int
>> +	default 0 if !LARGE_ANON_FOLIO
>> +	default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
>> +
>
> IMHO I don't think we need all of the new kconfigs. Ideally the large
> anon folios could be supported by all arches, although some of them
> may not benefit from larger TLB entries due to lack of hardware
> support.
>
> For now with a minimum implementation, I think you could define a
> macro or a function that returns the hardware preferred order.

Thanks for the feedback - that aligns with what Yu Zhao suggested. I'm
implementing it for v2.

Thanks,
Ryan
diff --git a/mm/Kconfig b/mm/Kconfig
index 7672a22647b4..f4ba48c37b75 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1208,4 +1208,43 @@ config PER_VMA_LOCK
 
 source "mm/damon/Kconfig"
 
+config ARCH_SUPPORTS_LARGE_ANON_FOLIO
+	def_bool n
+	help
+	  An arch should select this symbol if wants to allow LARGE_ANON_FOLIO
+	  to be enabled. It must also set the following integer values:
+	  - ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
+	  - ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
+
+config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
+	int
+	help
+	  The maximum size of folio to allocate for an anonymous VMA PTE-mapping
+	  that does not have the MADV_HUGEPAGE hint set.
+
+config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
+	int
+	help
+	  The maximum size of folio to allocate for an anonymous VMA PTE-mapping
+	  that has the MADV_HUGEPAGE hint set.
+
+config LARGE_ANON_FOLIO
+	bool "Allocate large folios for anonymous memory"
+	depends on ARCH_SUPPORTS_LARGE_ANON_FOLIO
+	default n
+	help
+	  Use large (bigger than order-0) folios to back anonymous memory where
+	  possible. This reduces the number of page faults, as well as other
+	  per-page overheads to improve performance for many workloads.
+
+config LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
+	int
+	default 0 if !LARGE_ANON_FOLIO
+	default ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX
+
+config LARGE_ANON_FOLIO_THP_ORDER_MAX
+	int
+	default 0 if !LARGE_ANON_FOLIO
+	default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX
+
 endmenu
diff --git a/mm/memory.c b/mm/memory.c
index 9165ed1b9fc2..a8f7e2b28d7a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3153,6 +3153,14 @@ static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma,
 	return vma_alloc_movable_folio(vma, vaddr, 0, zeroed);
 }
 
+static inline int max_anon_folio_order(struct vm_area_struct *vma)
+{
+	if (hugepage_vma_check(vma, vma->vm_flags, false, true, true))
+		return CONFIG_LARGE_ANON_FOLIO_THP_ORDER_MAX;
+	else
+		return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX;
+}
+
 /*
  * Handle write page faults for pages that can be reused in the current vma
  *
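
To make the interaction between the Kconfig defaults and max_anon_folio_order() easier to follow, the selection logic can be modelled in user space roughly as below. This is only a sketch under stated assumptions: the real patch uses compile-time CONFIG_* constants and hugepage_vma_check(), whereas this model uses a runtime struct and a plain boolean so that both the enabled and disabled cases can be exercised. The struct, the demo_* names and the order values 4 and 6 are illustrative and do not come from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Runtime stand-in for the Kconfig symbols (assumption: real ones are compile-time). */
struct demo_config {
	bool large_anon_folio;		/* models LARGE_ANON_FOLIO */
	int arch_nothp_order_max;	/* models ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX */
	int arch_thp_order_max;		/* models ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX */
};

/*
 * Models the two "default" lines of each *_ORDER_MAX symbol:
 *   default 0 if !LARGE_ANON_FOLIO
 *   default ARCH_..._ORDER_MAX
 */
static int demo_order_max(const struct demo_config *cfg, int arch_value)
{
	return cfg->large_anon_folio ? arch_value : 0;
}

/*
 * Models max_anon_folio_order(); thp_eligible stands in for the result
 * of hugepage_vma_check() on the faulting VMA.
 */
static int demo_max_anon_folio_order(const struct demo_config *cfg,
				     bool thp_eligible)
{
	if (thp_eligible)
		return demo_order_max(cfg, cfg->arch_thp_order_max);

	return demo_order_max(cfg, cfg->arch_nothp_order_max);
}
```

With the feature disabled both paths collapse to order 0, matching the commit message's statement that only a single page is ever allocated in that case.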