From patchwork Mon Jun 26 17:14:21 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113063
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 01/10] mm: Expose clear_huge_page() unconditionally
Date: Mon, 26 Jun 2023 18:14:21 +0100
Message-Id: <20230626171430.3167004-2-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

In preparation for extending vma_alloc_zeroed_movable_folio() to allocate an
arbitrary order folio, expose clear_huge_page() unconditionally, so that it
can be used to zero the allocated folio in the generic implementation of
vma_alloc_zeroed_movable_folio().
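As a minimal sketch of why the unconditional export matters (illustrative
only; zero_folio() is a hypothetical helper, and the real call site is added
by the next patch in the series): generic code can now zero a folio of any
order with clear_huge_page(), even on configurations without THP or hugetlbfs.

/*
 * Illustrative sketch only: zero an arbitrary-order folio with
 * clear_huge_page(), which this patch makes visible even when
 * CONFIG_TRANSPARENT_HUGEPAGE and CONFIG_HUGETLBFS are disabled.
 * zero_folio() is a hypothetical helper, not part of the patch.
 */
#include <linux/mm.h>

static void zero_folio(struct folio *folio, unsigned long vaddr)
{
	/* pages_per_huge_page is simply the number of pages in the folio. */
	clear_huge_page(&folio->page, vaddr, folio_nr_pages(folio));
}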
Signed-off-by: Ryan Roberts --- include/linux/mm.h | 3 ++- mm/memory.c | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 7f1741bd870a..7e3bf45e6491 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3684,10 +3684,11 @@ enum mf_action_page_type { */ extern const struct attribute_group memory_failure_attr_group; -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) extern void clear_huge_page(struct page *page, unsigned long addr_hint, unsigned int pages_per_huge_page); + +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) int copy_user_large_folio(struct folio *dst, struct folio *src, unsigned long addr_hint, struct vm_area_struct *vma); diff --git a/mm/memory.c b/mm/memory.c index fb30f7523550..3d4ea668c4d1 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5741,7 +5741,6 @@ void __might_fault(const char *file, int line) EXPORT_SYMBOL(__might_fault); #endif -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) /* * Process all subpages of the specified huge page with the specified * operation. The target subpage will be processed last to keep its @@ -5839,6 +5838,7 @@ void clear_huge_page(struct page *page, process_huge_page(addr_hint, pages_per_huge_page, clear_subpage, page); } +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS) static int copy_user_gigantic_page(struct folio *dst, struct folio *src, unsigned long addr, struct vm_area_struct *vma,
From patchwork Mon Jun 26 17:14:22 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113056
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 02/10] mm: pass gfp flags and order to vma_alloc_zeroed_movable_folio()
Date: Mon, 26 Jun 2023 18:14:22 +0100
Message-Id: <20230626171430.3167004-3-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Allow allocation of large folios with vma_alloc_zeroed_movable_folio(). This
prepares the ground for large anonymous folios. The generic implementation of
vma_alloc_zeroed_movable_folio() now uses clear_huge_page() to zero the
allocated folio since it may now be a non-0 order.

Currently the function is always called with order 0 and no extra gfp flags,
so no functional change intended. But a subsequent commit will take advantage
of the new parameters to allocate large folios. The extra gfp flags will be
used to control the reclaim policy.
Signed-off-by: Ryan Roberts --- arch/alpha/include/asm/page.h | 5 +++-- arch/arm64/include/asm/page.h | 3 ++- arch/arm64/mm/fault.c | 7 ++++--- arch/ia64/include/asm/page.h | 5 +++-- arch/m68k/include/asm/page_no.h | 7 ++++--- arch/s390/include/asm/page.h | 5 +++-- arch/x86/include/asm/page.h | 5 +++-- include/linux/highmem.h | 23 +++++++++++++---------- mm/memory.c | 5 +++-- 9 files changed, 38 insertions(+), 27 deletions(-) diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h index 4db1ebc0ed99..6fc7fe91b6cb 100644 --- a/arch/alpha/include/asm/page.h +++ b/arch/alpha/include/asm/page.h @@ -17,8 +17,9 @@ extern void clear_page(void *page); #define clear_user_page(page, vaddr, pg) clear_page(page) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) extern void copy_page(void * _to, void * _from); #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h index 2312e6ee595f..47710852f872 100644 --- a/arch/arm64/include/asm/page.h +++ b/arch/arm64/include/asm/page.h @@ -30,7 +30,8 @@ void copy_highpage(struct page *to, struct page *from); #define __HAVE_ARCH_COPY_HIGHPAGE struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr); + unsigned long vaddr, + gfp_t gfp, int order); #define vma_alloc_zeroed_movable_folio vma_alloc_zeroed_movable_folio void tag_clear_highpage(struct page *to); diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index 6045a5117ac1..0a43c3b3f190 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -961,9 +961,10 @@ NOKPROBE_SYMBOL(do_debug_exception); * Used during anonymous page fault handling. 
*/ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr) + unsigned long vaddr, + gfp_t gfp, int order) { - gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO; + gfp_t flags = GFP_HIGHUSER_MOVABLE | __GFP_ZERO | gfp; /* * If the page is mapped with PROT_MTE, initialise the tags at the @@ -973,7 +974,7 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, if (vma->vm_flags & VM_MTE) flags |= __GFP_ZEROTAGS; - return vma_alloc_folio(flags, 0, vma, vaddr, false); + return vma_alloc_folio(flags, order, vma, vaddr, false); } void tag_clear_highpage(struct page *page) diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h index 310b09c3342d..ebdf04274023 100644 --- a/arch/ia64/include/asm/page.h +++ b/arch/ia64/include/asm/page.h @@ -82,10 +82,11 @@ do { \ } while (0) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ ({ \ struct folio *folio = vma_alloc_folio( \ - GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false); \ + GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false); \ if (folio) \ flush_dcache_folio(folio); \ folio; \ diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h index 060e4c0e7605..4a2fe57fef5e 100644 --- a/arch/m68k/include/asm/page_no.h +++ b/arch/m68k/include/asm/page_no.h @@ -3,7 +3,7 @@ #define _M68K_PAGE_NO_H #ifndef __ASSEMBLY__ - + extern unsigned long memory_start; extern unsigned long memory_end; @@ -13,8 +13,9 @@ extern unsigned long memory_end; #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) #define __pa(vaddr) ((unsigned long)(vaddr)) #define __va(paddr) ((void *)((unsigned long)(paddr))) diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h index 8a2a3b5d1e29..b749564140f1 100644 --- a/arch/s390/include/asm/page.h +++ b/arch/s390/include/asm/page.h @@ -73,8 +73,9 @@ static inline void copy_page(void *to, void *from) #define clear_user_page(page, vaddr, pg) clear_page(page) #define copy_user_page(to, from, vaddr, pg) copy_page(to, from) -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) /* * These are used to make use of C type-checking.. 
diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h index d18e5c332cb9..34deab1a8dae 100644 --- a/arch/x86/include/asm/page.h +++ b/arch/x86/include/asm/page.h @@ -34,8 +34,9 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr, copy_page(to, from); } -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \ - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr, false) +#define vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order) \ + vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO | (gfp), \ + order, vma, vaddr, false) #ifndef __pa #define __pa(x) __phys_addr((unsigned long)(x)) diff --git a/include/linux/highmem.h b/include/linux/highmem.h index 4de1dbcd3ef6..b9a9b0340557 100644 --- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -209,26 +209,29 @@ static inline void clear_user_highpage(struct page *page, unsigned long vaddr) #ifndef vma_alloc_zeroed_movable_folio /** - * vma_alloc_zeroed_movable_folio - Allocate a zeroed page for a VMA. - * @vma: The VMA the page is to be allocated for. - * @vaddr: The virtual address the page will be inserted into. - * - * This function will allocate a page suitable for inserting into this - * VMA at this virtual address. It may be allocated from highmem or + * vma_alloc_zeroed_movable_folio - Allocate a zeroed folio for a VMA. + * @vma: The start VMA the folio is to be allocated for. + * @vaddr: The virtual address the folio will be inserted into. + * @gfp: Additional gfp falgs to mix in or 0. + * @order: The order of the folio (2^order pages). + * + * This function will allocate a folio suitable for inserting into this + * VMA starting at this virtual address. It may be allocated from highmem or * the movable zone. An architecture may provide its own implementation. * - * Return: A folio containing one allocated and zeroed page or NULL if + * Return: A folio containing 2^order allocated and zeroed pages or NULL if * we are out of memory. */ static inline struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma, - unsigned long vaddr) + unsigned long vaddr, gfp_t gfp, int order) { struct folio *folio; - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr, false); + folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, + order, vma, vaddr, false); if (folio) - clear_user_highpage(&folio->page, vaddr); + clear_huge_page(&folio->page, vaddr, 1U << order); return folio; } diff --git a/mm/memory.c b/mm/memory.c index 3d4ea668c4d1..367bbbb29d91 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3073,7 +3073,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) goto oom; if (is_zero_pfn(pte_pfn(vmf->orig_pte))) { - new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); + new_folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, + 0, 0); if (!new_folio) goto oom; } else { @@ -4087,7 +4088,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) /* Allocate our own private page. 
*/ if (unlikely(anon_vma_prepare(vma))) goto oom; - folio = vma_alloc_zeroed_movable_folio(vma, vmf->address); + folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0); if (!folio) goto oom;
From patchwork Mon Jun 26 17:14:23 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113064
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 03/10] mm: Introduce try_vma_alloc_movable_folio()
Date: Mon, 26 Jun 2023 18:14:23 +0100
Message-Id: <20230626171430.3167004-4-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Opportunistically attempt to allocate high-order folios in highmem, optionally
zeroed. Retry with lower orders all the way to order-0, until success. Of
note, order-1 allocations are skipped, since a large folio must be at least
order-2 to work with the THP machinery. The user must check what they got with
folio_order().

This will be used to opportunistically allocate large folios for anonymous
memory with a sensible fallback under memory pressure.

For attempts to allocate non-0 orders, we set __GFP_NORETRY to prevent high
latency due to reclaim, instead preferring to just try for a lower order. The
same approach is used by the readahead code when allocating large folios.
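A hedged sketch of the calling convention described above (example_fault() is
a hypothetical caller, not part of this patch): the caller asks for the
largest order it would like, then checks folio_order() to see what it actually
got, since the allocator may have fallen back to a smaller order or to order-0.

/*
 * Hypothetical caller sketch, not part of the patch: request an order-3
 * zeroed folio, but cope with whatever order the opportunistic allocator
 * managed to provide.
 */
static vm_fault_t example_fault(struct vm_area_struct *vma, unsigned long addr)
{
	struct folio *folio;

	folio = try_vma_alloc_movable_folio(vma, addr, 3, true);
	if (!folio)
		return VM_FAULT_OOM;

	/* May be order 3, 2 or 0, but never order 1. */
	pr_debug("got order-%u folio (%ld pages)\n",
		 folio_order(folio), folio_nr_pages(folio));

	/* ... map folio_nr_pages(folio) ptes starting at the folio base ... */
	return 0;
}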
Signed-off-by: Ryan Roberts --- mm/memory.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index 367bbbb29d91..53896d46e686 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3001,6 +3001,39 @@ static vm_fault_t fault_dirty_shared_page(struct vm_fault *vmf) return 0; } +static inline struct folio *vma_alloc_movable_folio(struct vm_area_struct *vma, + unsigned long vaddr, int order, bool zeroed) +{ + gfp_t gfp = order > 0 ? __GFP_NORETRY | __GFP_NOWARN : 0; + + if (zeroed) + return vma_alloc_zeroed_movable_folio(vma, vaddr, gfp, order); + else + return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | gfp, order, vma, + vaddr, false); +} + +/* + * Opportunistically attempt to allocate high-order folios, retrying with lower + * orders all the way to order-0, until success. order-1 allocations are skipped + * since a folio must be at least order-2 to work with the THP machinery. The + * user must check what they got with folio_order(). vaddr can be any virtual + * address that will be mapped by the allocated folio. + */ +static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma, + unsigned long vaddr, int order, bool zeroed) +{ + struct folio *folio; + + for (; order > 1; order--) { + folio = vma_alloc_movable_folio(vma, vaddr, order, zeroed); + if (folio) + return folio; + } + + return vma_alloc_movable_folio(vma, vaddr, 0, zeroed); +} + /* * Handle write page faults for pages that can be reused in the current vma *
From patchwork Mon Jun 26 17:14:24 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113061
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 04/10] mm: Implement folio_add_new_anon_rmap_range()
Date: Mon, 26 Jun 2023 18:14:24 +0100
Message-Id: <20230626171430.3167004-5-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Like folio_add_new_anon_rmap() but batch-rmaps a range of pages belonging to a
folio, for efficiency savings. All pages are accounted as small pages.
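A hedged illustration of the intended call pattern (map_new_anon_folio() is a
hypothetical call site, not part of the patch): rather than calling the
existing per-page interface once per page, a fault handler can rmap every page
of a freshly allocated, exclusively owned folio in one call.

/*
 * Hypothetical call-site sketch, not part of the patch: batch-rmap all
 * pages of a new anonymous folio with small-page (non-THP) accounting.
 * addr is the user virtual address of the folio's first page.
 */
static void map_new_anon_folio(struct folio *folio,
			       struct vm_area_struct *vma, unsigned long addr)
{
	int nr = folio_nr_pages(folio);

	folio_add_new_anon_rmap_range(folio, &folio->page, nr, vma, addr);
	folio_add_lru_vma(folio, vma);
}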
Signed-off-by: Ryan Roberts --- include/linux/rmap.h | 2 ++ mm/rmap.c | 43 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index a3825ce81102..15433a3d0cbf 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -196,6 +196,8 @@ void page_add_new_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address); void folio_add_new_anon_rmap(struct folio *, struct vm_area_struct *, unsigned long address); +void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma, unsigned long address); void page_add_file_rmap(struct page *, struct vm_area_struct *, bool compound); void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr, diff --git a/mm/rmap.c b/mm/rmap.c index 1d8369549424..4050bcea7ae7 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1305,6 +1305,49 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, __page_set_anon_rmap(folio, &folio->page, vma, address, 1); } +/** + * folio_add_new_anon_rmap_range - Add mapping to a set of pages within a new + * anonymous potentially large folio. + * @folio: The folio containing the pages to be mapped + * @page: First page in the folio to be mapped + * @nr: Number of pages to be mapped + * @vma: the vm area in which the mapping is added + * @address: the user virtual address of the first page to be mapped + * + * Like folio_add_new_anon_rmap() but batch-maps a range of pages within a folio + * using non-THP accounting. Like folio_add_new_anon_rmap(), the inc-and-test is + * bypassed and the folio does not have to be locked. All pages in the folio are + * individually accounted. + * + * As the folio is new, it's assumed to be mapped exclusively by a single + * process. 
+ */ +void folio_add_new_anon_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma, unsigned long address) +{ + int i; + + VM_BUG_ON_VMA(address < vma->vm_start || + address + (nr << PAGE_SHIFT) > vma->vm_end, vma); + __folio_set_swapbacked(folio); + + if (folio_test_large(folio)) { + /* increment count (starts at 0) */ + atomic_set(&folio->_nr_pages_mapped, nr); + } + + for (i = 0; i < nr; i++) { + /* increment count (starts at -1) */ + atomic_set(&page->_mapcount, 0); + __page_set_anon_rmap(folio, page, vma, address, 1); + page++; + address += PAGE_SIZE; + } + + __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); + +} + /** * folio_add_file_rmap_range - add pte mapping to page range of a folio * @folio: The folio to add the mapping to
From patchwork Mon Jun 26 17:14:25 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113059
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 05/10] mm: Implement folio_remove_rmap_range()
Date: Mon, 26 Jun 2023 18:14:25 +0100
Message-Id: <20230626171430.3167004-6-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

Like page_remove_rmap() but batch-removes the rmap for a range of pages
belonging to a folio, for efficiency savings. All pages are accounted as small
pages.
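A hedged sketch of the intended usage (unmap_folio_range() is a hypothetical
caller, not part of the patch): with the pte lock held, the rmap for a run of
pages in one folio is dropped with a single call instead of one
page_remove_rmap() call per page.

/*
 * Hypothetical caller sketch, not part of the patch. The pte lock must
 * be held, exactly as for page_remove_rmap().
 */
static void unmap_folio_range(struct folio *folio, struct page *first_page,
			      int nr, struct vm_area_struct *vma)
{
	/* One batched rmap update; all pages accounted as small pages. */
	folio_remove_rmap_range(folio, first_page, nr, vma);
}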
Signed-off-by: Ryan Roberts --- include/linux/rmap.h | 2 ++ mm/rmap.c | 62 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 15433a3d0cbf..50f50e4cb0f8 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -204,6 +204,8 @@ void folio_add_file_rmap_range(struct folio *, struct page *, unsigned int nr, struct vm_area_struct *, bool compound); void page_remove_rmap(struct page *, struct vm_area_struct *, bool compound); +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma); void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *, unsigned long address, rmap_t flags); diff --git a/mm/rmap.c b/mm/rmap.c index 4050bcea7ae7..ac1d93d43f2b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1434,6 +1434,68 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, folio_add_file_rmap_range(folio, page, nr_pages, vma, compound); } +/* + * folio_remove_rmap_range - take down pte mappings from a range of pages + * belonging to a folio. All pages are accounted as small pages. + * @folio: folio that all pages belong to + * @page: first page in range to remove mapping from + * @nr: number of pages in range to remove mapping from + * @vma: the vm area from which the mapping is removed + * + * The caller needs to hold the pte lock. + */ +void folio_remove_rmap_range(struct folio *folio, struct page *page, + int nr, struct vm_area_struct *vma) +{ + atomic_t *mapped = &folio->_nr_pages_mapped; + int nr_unmapped = 0; + int nr_mapped; + bool last; + enum node_stat_item idx; + + VM_BUG_ON_FOLIO(folio_test_hugetlb(folio), folio); + + if (!folio_test_large(folio)) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + nr_unmapped = last; + } else { + for (; nr != 0; nr--, page++) { + /* Is this the page's last map to be removed? */ + last = atomic_add_negative(-1, &page->_mapcount); + if (last) { + /* Page still mapped if folio mapped entirely */ + nr_mapped = atomic_dec_return_relaxed(mapped); + if (nr_mapped < COMPOUND_MAPPED) + nr_unmapped++; + } + } + } + + if (nr_unmapped) { + idx = folio_test_anon(folio) ? NR_ANON_MAPPED : NR_FILE_MAPPED; + __lruvec_stat_mod_folio(folio, idx, -nr_unmapped); + + /* + * Queue anon THP for deferred split if we have just unmapped at + * least 1 page, while at least 1 page remains mapped. + */ + if (folio_test_large(folio) && folio_test_anon(folio)) + if (nr_mapped) + deferred_split_folio(folio); + } + + /* + * It would be tidy to reset folio_test_anon mapping when fully + * unmapped, but that might overwrite a racing page_add_anon_rmap + * which increments mapcount after us but sets mapping before us: + * so leave the reset to free_pages_prepare, and remember that + * it's only reliable while mapped. 
+ */ + + munlock_vma_folio(folio, vma, false); +} + /** * page_remove_rmap - take down pte mapping from a page * @page: page to remove mapping from
From patchwork Mon Jun 26 17:14:26 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113062
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 06/10] mm: Allow deferred splitting of arbitrary large anon folios
Date: Mon, 26 Jun 2023 18:14:26 +0100
Message-Id: <20230626171430.3167004-7-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

With the introduction of large folios for anonymous memory, we would like to
be able to split them when they have unmapped subpages, in order to free those
unused pages under memory pressure. So remove the artificial requirement that
the large folio needed to be at least PMD-sized.

Signed-off-by: Ryan Roberts Reviewed-by: Yu Zhao Reviewed-by: Yin Fengwei --- mm/rmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/rmap.c b/mm/rmap.c index ac1d93d43f2b..3d11c5fb6090 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1567,7 +1567,7 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma, * page of the folio is unmapped and at least one page * is still mapped.
*/ - if (folio_test_pmd_mappable(folio) && folio_test_anon(folio)) + if (folio_test_large(folio) && folio_test_anon(folio)) if (!compound || nr < nr_pmdmapped) deferred_split_folio(folio); }
From patchwork Mon Jun 26 17:14:27 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 113066
From: Ryan Roberts
To: Andrew Morton, "Matthew Wilcox (Oracle)", "Kirill A. Shutemov", Yin Fengwei,
    David Hildenbrand, Yu Zhao, Catalin Marinas, Will Deacon, Geert Uytterhoeven,
    Christian Borntraeger, Sven Schnelle, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
    linux-s390@vger.kernel.org
Subject: [PATCH v1 07/10] mm: Batch-zap large anonymous folio PTE mappings
Date: Mon, 26 Jun 2023 18:14:27 +0100
Message-Id: <20230626171430.3167004-8-ryan.roberts@arm.com>
In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com>
References: <20230626171430.3167004-1-ryan.roberts@arm.com>

This allows batching the rmap removal with folio_remove_rmap_range(), which
means we avoid spuriously adding a partially unmapped folio to the deferred
split queue in the common case, reducing split queue lock contention.

Previously each page was removed from the rmap individually with
page_remove_rmap(). If the first page belonged to a large folio, this would
cause page_remove_rmap() to conclude that the folio was now partially mapped
and add the folio to the deferred split queue. But subsequent calls would
cause the folio to become fully unmapped, meaning there was no value in adding
it to the split queue.
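To make the before/after concrete, here is a hedged sketch (illustrative only;
the real change is the zap_anon_pte_range()/folio_remove_rmap_range() code in
the diff below, and both helper names here are hypothetical):

/*
 * Illustrative comparison only, not taken from the patch.
 *
 * Old behaviour: one rmap update per pte. The first call sees a large
 * folio become partially mapped and queues it for deferred splitting,
 * even though the loop goes on to unmap the folio completely.
 */
static void zap_folio_ptes_old(struct page *page, int nr,
			       struct vm_area_struct *vma)
{
	int i;

	for (i = 0; i < nr; i++)
		page_remove_rmap(page + i, vma, false);
}

/*
 * New behaviour: the whole range is accounted in one call, so a folio
 * that ends up fully unmapped is never added to the split queue.
 */
static void zap_folio_ptes_new(struct folio *folio, struct page *page,
			       int nr, struct vm_area_struct *vma)
{
	folio_remove_rmap_range(folio, page, nr, vma);
}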
Signed-off-by: Ryan Roberts --- mm/memory.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 119 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index 53896d46e686..9165ed1b9fc2 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -914,6 +914,57 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma return 0; } +static inline unsigned long page_addr(struct page *page, + struct page *anchor, unsigned long anchor_addr) +{ + unsigned long offset; + unsigned long addr; + + offset = (page_to_pfn(page) - page_to_pfn(anchor)) << PAGE_SHIFT; + addr = anchor_addr + offset; + + if (anchor > page) { + if (addr > anchor_addr) + return 0; + } else { + if (addr < anchor_addr) + return ULONG_MAX; + } + + return addr; +} + +static int calc_anon_folio_map_pgcount(struct folio *folio, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end) +{ + pte_t ptent; + int floops; + int i; + unsigned long pfn; + + end = min(page_addr(&folio->page + folio_nr_pages(folio), page, addr), + end); + floops = (end - addr) >> PAGE_SHIFT; + pfn = page_to_pfn(page); + pfn++; + pte++; + + for (i = 1; i < floops; i++) { + ptent = ptep_get(pte); + + if (!pte_present(ptent) || + pte_pfn(ptent) != pfn) { + return i; + } + + pfn++; + pte++; + } + + return floops; +} + /* * Copy one pte. Returns 0 if succeeded, or -EAGAIN if one preallocated page * is required to copy this pte. @@ -1379,6 +1430,44 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma, pte_install_uffd_wp_if_needed(vma, addr, pte, pteval); } +static unsigned long zap_anon_pte_range(struct mmu_gather *tlb, + struct vm_area_struct *vma, + struct page *page, pte_t *pte, + unsigned long addr, unsigned long end, + bool *full_out) +{ + struct folio *folio = page_folio(page); + struct mm_struct *mm = tlb->mm; + pte_t ptent; + int pgcount; + int i; + bool full; + + pgcount = calc_anon_folio_map_pgcount(folio, page, pte, addr, end); + + for (i = 0; i < pgcount;) { + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); + tlb_remove_tlb_entry(tlb, pte, addr); + full = __tlb_remove_page(tlb, page, 0); + + if (unlikely(page_mapcount(page) < 1)) + print_bad_pte(vma, addr, ptent, page); + + i++; + page++; + pte++; + addr += PAGE_SIZE; + + if (unlikely(full)) + break; + } + + folio_remove_rmap_range(folio, page - i, i, vma); + + *full_out = full; + return i; +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1415,6 +1504,36 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb, page = vm_normal_page(vma, addr, ptent); if (unlikely(!should_zap_page(details, page))) continue; + + /* + * Batch zap large anonymous folio mappings. This allows + * batching the rmap removal, which means we avoid + * spuriously adding a partially unmapped folio to the + * deferrred split queue in the common case, which + * reduces split queue lock contention. Require the VMA + * to be anonymous to ensure that none of the PTEs in + * the range require zap_install_uffd_wp_if_needed(). 
+ */ + if (page && PageAnon(page) && vma_is_anonymous(vma)) { + bool full; + int pgcount; + + pgcount = zap_anon_pte_range(tlb, vma, + page, pte, addr, end, &full); + + rss[mm_counter(page)] -= pgcount; + pgcount--; + pte += pgcount; + addr += pgcount << PAGE_SHIFT; + + if (unlikely(full)) { + force_flush = 1; + addr += PAGE_SIZE; + break; + } + continue; + } + ptent = ptep_get_and_clear_full(mm, addr, pte, tlb->fullmm); tlb_remove_tlb_entry(tlb, pte, addr); From patchwork Mon Jun 26 17:14:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 113058 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp7636937vqr; Mon, 26 Jun 2023 10:22:50 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7kmey6v7CtXnr3iETawXiZBCEuHxKB/kbnFNSww3jiMbUvtC9ZjZr0OkRhixnL4JydjAci X-Received: by 2002:a17:907:25c1:b0:974:e755:9fde with SMTP id ae1-20020a17090725c100b00974e7559fdemr25676977ejc.19.1687800169986; Mon, 26 Jun 2023 10:22:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687800169; cv=none; d=google.com; s=arc-20160816; b=suQDhaFGqvQXmwGSE6soss8LwpUIbrsGye1LD2DApMC3klJ+/3Wlx1SavmRdJ2M2pu cMfKLigQqr4Wm3ntAR+bYR3yUpO21oLhhLNAQQ6F4k8fhsEzMNqfawVYVRZEXt3JVRbf s2CFK+M5N/+Jt/dARKr8kTbRDLkHjeul1GkO08T7EjIAV+xLbBJtMTzaQVeqHRSpeK8W gXTE8MgYaBVLEwxdZQR+5g8oSCz7J4bG3LCSK49xRCOwkCA/Nwqsm7QTjTcf14i+ApAC b+KgmjUM4yr5EIEjSe/lfVONoNCYD5FHAN6zqKgLSdpJq386xiVARh0LYs02uHD5gb/i tWBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=AKvNVAfIA7FhYjb4NI628m6rn2oSwSbs9tNtgVulh4k=; fh=H2MVjBlipHVEN6kEAh1RDhnPLB9jpPNjGExTmo1/EvA=; b=r9iS21Hgen7Sl+BB/KQrciRXILM2FtVfSW4pMDYgKeHnxAsq4zyi/jxFjr9vG3W1XT mN590fWefwE4VI3km/FDd95zv2X+MqUBAWxB5FV7xB1KlUOXo94LxiSgVr2PFT1efbPK 8u21/zjQM5V7e0DqNPhsRpoyxEAEmjoF2XYVYSZcTfPmg/7Fjlqmh2mm15JIeTnCt9DV h6ZNYQoLSrk7Zmq425ZWzWz/wfXtbTigfbWXL5jGIkgT3O7cUyRT5cTKEjua+YPaT3GD rG+f9A2iOHVhkGfyfBUYU4TLToTvpaRMitsV5IY1tYxr9MZkeNjrIDwKiFoxCkuHb76Q agaw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id xa10-20020a170907b9ca00b00986486cb8d6si2930756ejc.705.2023.06.26.10.22.24; Mon, 26 Jun 2023 10:22:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229597AbjFZRQH (ORCPT + 99 others); Mon, 26 Jun 2023 13:16:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230497AbjFZRPH (ORCPT ); Mon, 26 Jun 2023 13:15:07 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id E080D10C0; Mon, 26 Jun 2023 10:15:05 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B2AB21596; Mon, 26 Jun 2023 10:15:49 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C8DE63F663; Mon, 26 Jun 2023 10:15:02 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 08/10] mm: Kconfig hooks to determine max anon folio allocation order Date: Mon, 26 Jun 2023 18:14:28 +0100 Message-Id: <20230626171430.3167004-9-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769786750697218625?= X-GMAIL-MSGID: =?utf-8?q?1769786750697218625?= For variable-order anonymous folios, we need to determine the order that we will allocate. From a SW perspective, the higher the order we allocate, the less overhead we will have; fewer faults, fewer folios in lists, etc. But of course there will also be more memory wastage as the order increases. From a HW perspective, there are memory block sizes that can be beneficial to reducing TLB pressure. arm64, for example, has the ability to map "contpte" sized chunks (64K for a 4K base page, 2M for 16K and 64K base pages) such that one of these chunks only uses a single TLB entry. So we let the architecture specify the order of the maximally beneficial mapping unit when PTE-mapped. 
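As rough arithmetic behind this software-side trade-off (back-of-the-envelope figures only, not part of the patch; the 2M region size is just an example), the sketch below shows, for 4K base pages, how the number of faults needed to populate a region and the worst-case per-folio over-allocation both move with the order:

#include <stdio.h>

/* Back-of-the-envelope numbers for the order trade-off (illustrative only). */
int main(void)
{
	unsigned long region = 2UL * 1024 * 1024;	/* populate 2M of anon memory */
	unsigned long page = 4096;			/* 4K base pages */

	for (int order = 0; order <= 4; order += 2) {
		unsigned long pages = 1UL << order;	/* pages per folio */

		printf("order %d: %4lu faults to populate 2M, worst-case waste per folio %2luK\n",
		       order, region / (page * pages),
		       (pages - 1) * page / 1024);
	}
	return 0;
}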
Furthermore, because in some cases this order may be quite big (and therefore potentially wasteful of memory), allow the arch to specify two values: one is the max order for a mapping that _would not_ use THP if all size and alignment constraints were met, and the other is the max order for a mapping that _would_ use THP if all those constraints were met. Implement this with Kconfig by introducing some new options to allow the architecture to declare that it supports large anonymous folios along with these two preferred max order values. Then introduce a user-facing option, LARGE_ANON_FOLIO, which defaults to disabled and can only be enabled if the architecture has declared its support. When disabled, it forces both max order values, LARGE_ANON_FOLIO_NOTHP_ORDER_MAX and LARGE_ANON_FOLIO_THP_ORDER_MAX, to 0, meaning only a single page is ever allocated. Signed-off-by: Ryan Roberts --- mm/Kconfig | 39 +++++++++++++++++++++++++++++++++++++++ mm/memory.c | 8 ++++++++ 2 files changed, 47 insertions(+) diff --git a/mm/Kconfig b/mm/Kconfig index 7672a22647b4..f4ba48c37b75 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1208,4 +1208,43 @@ config PER_VMA_LOCK source "mm/damon/Kconfig" +config ARCH_SUPPORTS_LARGE_ANON_FOLIO + def_bool n + help + An arch should select this symbol if it wants to allow LARGE_ANON_FOLIO + to be enabled. It must also set the following integer values: + - ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + - ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + +config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + help + The maximum size of folio to allocate for an anonymous VMA PTE-mapping + that does not have the MADV_HUGEPAGE hint set. + +config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + int + help + The maximum size of folio to allocate for an anonymous VMA PTE-mapping + that has the MADV_HUGEPAGE hint set. + +config LARGE_ANON_FOLIO + bool "Allocate large folios for anonymous memory" + depends on ARCH_SUPPORTS_LARGE_ANON_FOLIO + default n + help + Use large (bigger than order-0) folios to back anonymous memory where + possible. This reduces the number of page faults, as well as other + per-page overheads, to improve performance for many workloads.
+ +config LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + default 0 if !LARGE_ANON_FOLIO + default ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + +config LARGE_ANON_FOLIO_THP_ORDER_MAX + int + default 0 if !LARGE_ANON_FOLIO + default ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + endmenu diff --git a/mm/memory.c b/mm/memory.c index 9165ed1b9fc2..a8f7e2b28d7a 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3153,6 +3153,14 @@ static struct folio *try_vma_alloc_movable_folio(struct vm_area_struct *vma, return vma_alloc_movable_folio(vma, vaddr, 0, zeroed); } +static inline int max_anon_folio_order(struct vm_area_struct *vma) +{ + if (hugepage_vma_check(vma, vma->vm_flags, false, true, true)) + return CONFIG_LARGE_ANON_FOLIO_THP_ORDER_MAX; + else + return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX; +} + /* * Handle write page faults for pages that can be reused in the current vma * From patchwork Mon Jun 26 17:14:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 113057 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp7634772vqr; Mon, 26 Jun 2023 10:18:55 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7PE3R65MUjgLg4f8q5a7ly1B7ObHu/RnR56aDiueXRKlYA9pPY3J/qeVp1pMBvGxLbTuYa X-Received: by 2002:a17:906:d542:b0:982:7505:fafa with SMTP id cr2-20020a170906d54200b009827505fafamr23377947ejc.47.1687799935337; Mon, 26 Jun 2023 10:18:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687799935; cv=none; d=google.com; s=arc-20160816; b=CkPqGtrgW2FA2kMDJoyw6efwbDpHSKivZxgxz+xmuGxPiOpFkndcDlscpX7kS6N9tu pOoMV4Q4OKOmdiFhO9GtrPR8huRaGHbKKuvVXEaqPiR0655mRbDgnJSJoqSYewrL8C6Q 4tZ6hxnjkY/omzURvYftrZTaob7yf0zhJaKc2X64lN/PuYp1hx568WPzZl57r10xUHgx f5wf0p0mooyXvwHG3iTeYyjIed8cPiPMS47WKAtHPqwQUd6fhz2we1lfSvRNIxaNo8+X eGWK87E/iROML9DHnly7glEVa2b+RyjAHhO8XmcKB7Seo1J7j7PJgN4it3jz8oOcGJ6D fmgQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=6MmZkyfi0JXPji2JdUjitVD1cfMLJ46y/AmVM9gAT7Q=; fh=H2MVjBlipHVEN6kEAh1RDhnPLB9jpPNjGExTmo1/EvA=; b=OA0TucdmyBCC8HSWF7S5YZiNWk8OBdd0/7eqYGmGMrW/S2khS355Nq/JSBlUEdVfyn mXubXW28AsqXsNR2NejUba2MuBwtMO9WVQhe7BtZjVk+WsPA2iv6M+0UVeyFyPNThXqb P7pq7TO2Fo2ZeTpGH99Qas2va6E0q3q7/b4KRA64rAXlLf+GYSWLhb+WE9kNMpA1+BcI XSnWgJfV7BrwQTgkN448Eu2HIwXdm01/iXbbDErW9AW37ZxlAE8G9xVZFFCY6A660Xx0 uIeu3vcygX5BMlmwrE2QZl5M3V+LlndEkzZoMQSGo+61b51xlR87/DU9+XmoHesXzOTd H9Tw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id ca4-20020a170906a3c400b009887c7b0206si2972737ejb.632.2023.06.26.10.18.29; Mon, 26 Jun 2023 10:18:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231424AbjFZRQS (ORCPT + 99 others); Mon, 26 Jun 2023 13:16:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51586 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231127AbjFZRPK (ORCPT ); Mon, 26 Jun 2023 13:15:10 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id EBAFE10D5; Mon, 26 Jun 2023 10:15:08 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C111215A1; Mon, 26 Jun 2023 10:15:52 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id D7EBC3F663; Mon, 26 Jun 2023 10:15:05 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 09/10] arm64: mm: Declare support for large anonymous folios Date: Mon, 26 Jun 2023 18:14:29 +0100 Message-Id: <20230626171430.3167004-10-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769786504817859760?= X-GMAIL-MSGID: =?utf-8?q?1769786504817859760?= For the unhinted case, when THP is not permitted for the vma, don't allow anything bigger than 64K. This means we don't waste too much memory. Additionally, for 4K pages this is the contpte size, and for 16K, this is (usually) the HPA size when the uarch feature is implemented. For the hinted case, when THP is permitted for the vma, allow the contpte size for all page size configurations; 64K for 4K, 2M for 16K and 2M for 64K. 
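The arm64 Kconfig defaults in the diff below follow directly from these sizes. As a small stand-alone check (illustrative arithmetic only; order_for() is an invented helper, not a kernel function), the orders are just log2(cap / page size): the unhinted 64K cap gives orders 4/2/0 for 4K/16K/64K pages, and the hinted contpte cap gives orders 4/7/5.

#include <stdio.h>

/* Smallest order such that page_size << order reaches the target size. */
static int order_for(unsigned long target, unsigned long page_size)
{
	int order = 0;

	while ((page_size << order) < target)
		order++;

	return order;
}

int main(void)
{
	/* contpte block size per base page size: 64K for 4K, 2M for 16K and 64K. */
	struct { unsigned long page; unsigned long contpte; } cfg[] = {
		{  4096,       64 * 1024 },
		{ 16384, 2 * 1024 * 1024 },
		{ 65536, 2 * 1024 * 1024 },
	};

	for (int i = 0; i < 3; i++)
		printf("%3luK pages: unhinted 64K cap -> order %d, hinted contpte cap -> order %d\n",
		       cfg[i].page / 1024,
		       order_for(64 * 1024, cfg[i].page),
		       order_for(cfg[i].contpte, cfg[i].page));
	return 0;
}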
Signed-off-by: Ryan Roberts --- arch/arm64/Kconfig | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 343e1e1cae10..0e91b5bc8cd9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -243,6 +243,7 @@ config ARM64 select TRACE_IRQFLAGS_SUPPORT select TRACE_IRQFLAGS_NMI_SUPPORT select HAVE_SOFTIRQ_ON_OWN_STACK + select ARCH_SUPPORTS_LARGE_ANON_FOLIO help ARM 64-bit (AArch64) Linux support. @@ -281,6 +282,18 @@ config ARM64_CONT_PMD_SHIFT default 5 if ARM64_16K_PAGES default 4 +config ARCH_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX + int + default 0 if ARM64_64K_PAGES # 64K (1 page) + default 2 if ARM64_16K_PAGES # 64K (4 pages; benefits from HPA where HW supports it) + default 4 if ARM64_4K_PAGES # 64K (16 pages; eligible for contpte-mapping) + +config ARCH_LARGE_ANON_FOLIO_THP_ORDER_MAX + int + default 5 if ARM64_64K_PAGES # 2M (32 page; eligible for contpte-mapping) + default 7 if ARM64_16K_PAGES # 2M (128 pages; eligible for contpte-mapping) + default 4 if ARM64_4K_PAGES # 64K (16 pages; eligible for contpte-mapping) + config ARCH_MMAP_RND_BITS_MIN default 14 if ARM64_64K_PAGES default 16 if ARM64_16K_PAGES From patchwork Mon Jun 26 17:14:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ryan Roberts X-Patchwork-Id: 113060 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp7638405vqr; Mon, 26 Jun 2023 10:25:32 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6XgWbN0CsOs70G+D9rgAt1IzVLPH/BreRzf0CKKlMPWyAMEXghZPi85NQ9tOgDwAfMpPEi X-Received: by 2002:a05:6402:456:b0:50b:c693:70af with SMTP id p22-20020a056402045600b0050bc69370afmr19408742edw.2.1687800331945; Mon, 26 Jun 2023 10:25:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687800331; cv=none; d=google.com; s=arc-20160816; b=a7oeYtZkrcKVds110nwPZQD5nUD6soqdphr1zkJFhMSmIqJJvDIVoT0qrTCRdZ+ix8 mo95TFnwsGIrNqW7OQid7oSBHthpYcpTNi4Xae6BCZzwPwliJaMlaGTfoCLfCpQGjfD+ VmiDeGNkNFtiMlukJuAR4LkxA3xHgUxII3cHVXkwQ2bQ41nYLo8Dp+Rgt5sP9xR6EAwd 1ypdmkJkZkqTOYnIMp6hP2KyaGSpXzppDblbWruu13ZpHpFdiql/hQpNCFymygGg/q8L iblQq1lea9xcPDrmWO9YQ3sF2OArF/GKnnuDM2JkDcya/F3cqJiRqZC3TPHyNFRo+vsG iPYw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=Ayx1XUOP8Jqvje8vyoH06qGKySRPMl/T4IoLYSq146A=; fh=H2MVjBlipHVEN6kEAh1RDhnPLB9jpPNjGExTmo1/EvA=; b=Z9+po/9tisfEE97H/JZvCCbPd4Z8ZKSNU6e1814n7rcTB5GjHfJOQI6gDu2IvRjbUB 4p7AgBlR0tc7GqPYe1rPBy0d4/JtKWZO2lbgGxWENqgMzwAvM9XT1qp9eudQjzr5tTaP J8u8xQa/yRR/alXQ6b1A5Z6StTC2i0eWYqM7plZ9yuhUZJ1PfxJEdtw1qAp2dDoeZudp IYoPUAFNj7++lGM/wl6DAGI1ZbU1UnrNkv6pQ6hjFGh2iuUKreG++jbbpZU1w7U4KTnL WThvR68CP0LTlFmVbY7Hg5NBi9UFD/DTtDm6L4uSoKNqZk1CTseE9xBqxhyl4Vd/L3VV VcRA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id y12-20020aa7c24c000000b0051bfba09815si3013394edo.358.2023.06.26.10.25.07; Mon, 26 Jun 2023 10:25:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231169AbjFZRQa (ORCPT + 99 others); Mon, 26 Jun 2023 13:16:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231163AbjFZRPO (ORCPT ); Mon, 26 Jun 2023 13:15:14 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 1418510E6; Mon, 26 Jun 2023 10:15:12 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D0EE615DB; Mon, 26 Jun 2023 10:15:55 -0700 (PDT) Received: from e125769.cambridge.arm.com (e125769.cambridge.arm.com [10.1.196.26]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E66B63F663; Mon, 26 Jun 2023 10:15:08 -0700 (PDT) From: Ryan Roberts To: Andrew Morton , "Matthew Wilcox (Oracle)" , "Kirill A. Shutemov" , Yin Fengwei , David Hildenbrand , Yu Zhao , Catalin Marinas , Will Deacon , Geert Uytterhoeven , Christian Borntraeger , Sven Schnelle , Thomas Gleixner , Ingo Molnar , Borislav Petkov , Dave Hansen , "H. Peter Anvin" Cc: Ryan Roberts , linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-alpha@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-s390@vger.kernel.org Subject: [PATCH v1 10/10] mm: Allocate large folios for anonymous memory Date: Mon, 26 Jun 2023 18:14:30 +0100 Message-Id: <20230626171430.3167004-11-ryan.roberts@arm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230626171430.3167004-1-ryan.roberts@arm.com> References: <20230626171430.3167004-1-ryan.roberts@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769786921194141271?= X-GMAIL-MSGID: =?utf-8?q?1769786921194141271?= With all of the enabler patches in place, modify the anonymous memory write allocation path so that it opportunistically attempts to allocate a large folio up to `max_anon_folio_order()` size (This value is ultimately configured by the architecture). This reduces the number of page faults, reduces the size of (e.g. LRU) lists, and generally improves performance by batching what were per-page operations into per-(large)-folio operations. If CONFIG_LARGE_ANON_FOLIO is not enabled (the default) then `max_anon_folio_order()` always returns 0, meaning we get the existing allocation behaviour. 
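The diff below does this with an unlocked estimate of a suitable order that is then re-validated under the PTE lock, falling back to a lower order on a race. As a stand-alone model of the selection policy (illustrative only; the array-based page-table model and the pick_order() name are invented for the example), the sketch picks the largest naturally aligned window around the faulting entry that fits in the range and contains only empty entries, skipping order-1 just as the patch does:

#include <stdio.h>

/*
 * Stand-alone model of the order-selection policy (illustrative only).
 * A page table is modelled as an array where a non-zero entry means the
 * pte is already populated. Starting from the requested maximum, pick the
 * largest order whose naturally aligned window around the faulting index
 * lies inside [start, end) and contains only empty entries; order-1 is
 * skipped, matching the patch.
 */
static int pick_order(const int *ptes, int start, int end, int fault, int max_order)
{
	for (int order = max_order; order > 1; order--) {
		int nr = 1 << order;
		int base = fault & ~(nr - 1);	/* naturally aligned start index */
		int ok = 1;

		if (base < start || base + nr > end)
			continue;

		for (int i = base; i < base + nr; i++) {
			if (ptes[i]) {
				ok = 0;
				break;
			}
		}

		if (ok)
			return order;
	}

	return 0;
}

int main(void)
{
	int ptes[32] = { 0 };

	ptes[20] = 1;	/* one pte in the range was populated by someone else */

	/* Fault at index 5: the order-4 window [0, 16) is empty -> order 4. */
	printf("fault at 5  -> order %d\n", pick_order(ptes, 0, 32, 5, 4));

	/* Fault at index 18: [16, 32) and [16, 24) are blocked -> order 2. */
	printf("fault at 18 -> order %d\n", pick_order(ptes, 0, 32, 18, 4));
	return 0;
}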
Signed-off-by: Ryan Roberts --- mm/memory.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 144 insertions(+), 15 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index a8f7e2b28d7a..d23c44cc5092 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3161,6 +3161,90 @@ static inline int max_anon_folio_order(struct vm_area_struct *vma) return CONFIG_LARGE_ANON_FOLIO_NOTHP_ORDER_MAX; } +/* + * Returns index of first pte that is not none, or nr if all are none. + */ +static inline int check_ptes_none(pte_t *pte, int nr) +{ + int i; + + for (i = 0; i < nr; i++) { + if (!pte_none(ptep_get(pte++))) + return i; + } + + return nr; +} + +static int calc_anon_folio_order_alloc(struct vm_fault *vmf, int order) +{ + /* + * The aim here is to determine what size of folio we should allocate + * for this fault. Factors include: + * - Order must not be higher than `order` upon entry + * - Folio must be naturally aligned within VA space + * - Folio must not breach boundaries of vma + * - Folio must be fully contained inside one pmd entry + * - Folio must not overlap any non-none ptes + * + * Additionally, we do not allow order-1 since this breaks assumptions + * elsewhere in the mm; THP pages must be at least order-2 (since they + * store state up to the 3rd struct page subpage), and these pages must + * be THP in order to correctly use pre-existing THP infrastructure such + * as folio_split(). + * + * As a consequence of relying on the THP infrastructure, if the system + * does not support THP, we always fallback to order-0. + * + * Note that the caller may or may not choose to lock the pte. If + * unlocked, the calculation should be considered an estimate that will + * need to be validated under the lock. + */ + + struct vm_area_struct *vma = vmf->vma; + int nr; + unsigned long addr; + pte_t *pte; + pte_t *first_set = NULL; + int ret; + + if (has_transparent_hugepage()) { + order = min(order, PMD_SHIFT - PAGE_SHIFT); + + for (; order > 1; order--) { + nr = 1 << order; + addr = ALIGN_DOWN(vmf->address, nr << PAGE_SHIFT); + pte = vmf->pte - ((vmf->address - addr) >> PAGE_SHIFT); + + /* Check vma bounds. */ + if (addr < vma->vm_start || + addr + (nr << PAGE_SHIFT) > vma->vm_end) + continue; + + /* Ptes covered by order already known to be none. */ + if (pte + nr <= first_set) + break; + + /* Already found set pte in range covered by order. */ + if (pte <= first_set) + continue; + + /* Need to check if all the ptes are none. */ + ret = check_ptes_none(pte, nr); + if (ret == nr) + break; + + first_set = pte + ret; + } + + if (order == 1) + order = 0; + } else + order = 0; + + return order; +} + /* * Handle write page faults for pages that can be reused in the current vma * @@ -4201,6 +4285,9 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) struct folio *folio; vm_fault_t ret = 0; pte_t entry; + unsigned long addr; + int order = uffd_wp ? 0 : max_anon_folio_order(vma); + int pgcount = BIT(order); /* File mapping without ->vm_ops ? */ if (vma->vm_flags & VM_SHARED) @@ -4242,24 +4329,44 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) pte_unmap_unlock(vmf->pte, vmf->ptl); return handle_userfault(vmf, VM_UFFD_MISSING); } - goto setpte; + if (uffd_wp) + entry = pte_mkuffd_wp(entry); + set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + + /* No need to invalidate - it was non-present before */ + update_mmu_cache(vma, vmf->address, vmf->pte); + goto unlock; } - /* Allocate our own private page. */ +retry: + /* + * Estimate the folio order to allocate. 
We are not under the ptl here + * so this estimate needs to be re-checked later once we have the lock. + */ + vmf->pte = pte_offset_map(vmf->pmd, vmf->address); + order = calc_anon_folio_order_alloc(vmf, order); + pte_unmap(vmf->pte); + + /* Allocate our own private folio. */ if (unlikely(anon_vma_prepare(vma))) goto oom; - folio = vma_alloc_zeroed_movable_folio(vma, vmf->address, 0, 0); + folio = try_vma_alloc_movable_folio(vma, vmf->address, order, true); if (!folio) goto oom; + /* We may have been granted less than we asked for. */ + order = folio_order(folio); + pgcount = BIT(order); + addr = ALIGN_DOWN(vmf->address, pgcount << PAGE_SHIFT); + if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL)) goto oom_free_page; folio_throttle_swaprate(folio, GFP_KERNEL); /* * The memory barrier inside __folio_mark_uptodate makes sure that - * preceding stores to the page contents become visible before - * the set_pte_at() write. + * preceding stores to the folio contents become visible before + * the set_ptes() write. */ __folio_mark_uptodate(folio); @@ -4268,11 +4375,31 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (vma->vm_flags & VM_WRITE) entry = pte_mkwrite(pte_mkdirty(entry)); - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, - &vmf->ptl); - if (vmf_pte_changed(vmf)) { - update_mmu_tlb(vma, vmf->address, vmf->pte); - goto release; + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); + + /* + * Ensure our estimate above is still correct; we could have raced with + * another thread to service a fault in the region. + */ + if (order == 0) { + if (vmf_pte_changed(vmf)) { + update_mmu_tlb(vma, vmf->address, vmf->pte); + goto release; + } + } else if (check_ptes_none(vmf->pte, pgcount) != pgcount) { + pte_t *pte = vmf->pte + ((vmf->address - addr) >> PAGE_SHIFT); + + /* If faulting pte was allocated by another, exit early. */ + if (!pte_none(ptep_get(pte))) { + update_mmu_tlb(vma, vmf->address, pte); + goto release; + } + + /* Else try again, with a lower order. */ + pte_unmap_unlock(vmf->pte, vmf->ptl); + folio_put(folio); + order--; + goto retry; } ret = check_stable_address_space(vma->vm_mm); @@ -4286,16 +4413,18 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) return handle_userfault(vmf, VM_UFFD_MISSING); } - inc_mm_counter(vma->vm_mm, MM_ANONPAGES); - folio_add_new_anon_rmap(folio, vma, vmf->address); + folio_ref_add(folio, pgcount - 1); + + add_mm_counter(vma->vm_mm, MM_ANONPAGES, pgcount); + folio_add_new_anon_rmap_range(folio, &folio->page, pgcount, vma, addr); folio_add_lru_vma(folio, vma); -setpte: + if (uffd_wp) entry = pte_mkuffd_wp(entry); - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + set_ptes(vma->vm_mm, addr, vmf->pte, entry, pgcount); /* No need to invalidate - it was non-present before */ - update_mmu_cache(vma, vmf->address, vmf->pte); + update_mmu_cache_range(vma, addr, vmf->pte, pgcount); unlock: pte_unmap_unlock(vmf->pte, vmf->ptl); return ret;