From patchwork Thu Aug 10 14:29:38 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 134020
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 1/5] mm: Allow deferred splitting of arbitrary large anon folios
Date: Thu, 10 Aug 2023 15:29:38 +0100
Message-Id: <20230810142942.3169679-2-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

In preparation for the introduction of large folios for anonymous memory, we
would like to be able to split them when they have unmapped subpages, in
order to free those unused pages under memory pressure. So remove the
artificial requirement that the large folio needed to be at least PMD-sized.

Reviewed-by: Yu Zhao
Reviewed-by: Yin Fengwei
Reviewed-by: Matthew Wilcox (Oracle)
Reviewed-by: David Hildenbrand
Signed-off-by: Ryan Roberts
---
 mm/rmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 1f04debdc87a..769fcabc6c56 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1446,11 +1446,11 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 		__lruvec_stat_mod_folio(folio, idx, -nr);
 
 		/*
-		 * Queue anon THP for deferred split if at least one
+		 * Queue anon large folio for deferred split if at least one
 		 * page of the folio is unmapped and at least one page
 		 * is still mapped.
 		 */
-		if (folio_test_pmd_mappable(folio) && folio_test_anon(folio))
+		if (folio_test_large(folio) && folio_test_anon(folio))
			if (!compound || nr < nr_pmdmapped)
				deferred_split_folio(folio);
 	}
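To illustrate the situation this change targets (illustration only, not part
of the patch): a minimal user-space sketch that leaves a large anon folio
partially unmapped. It assumes the region really was backed by a large folio,
which the later patches in this series enable; with this change, such a folio
becomes a deferred-split candidate so the discarded subpages can be freed
under memory pressure.

#include <string.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 2UL * 1024 * 1024;
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 1;

	memset(p, 1, len);                  /* fault in; may be backed by large folios */
	madvise(p, len / 2, MADV_DONTNEED); /* unmap some of the subpages */

	/* The partially mapped large folio is now queued for deferred split. */
	return 0;
}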
From patchwork Thu Aug 10 14:29:39 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 134053
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 2/5] mm: Non-pmd-mappable, large folios for folio_add_new_anon_rmap()
Date: Thu, 10 Aug 2023 15:29:39 +0100
Message-Id: <20230810142942.3169679-3-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

In preparation for LARGE_ANON_FOLIO support, improve folio_add_new_anon_rmap()
to allow a non-pmd-mappable, large folio to be passed to it. In this case, all
contained pages are accounted using the order-0 folio (or base page) scheme.

Reviewed-by: Yu Zhao
Reviewed-by: Yin Fengwei
Signed-off-by: Ryan Roberts
---
 mm/rmap.c | 27 ++++++++++++++++++++-------
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 769fcabc6c56..d1ff92b4bf6b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1266,31 +1266,44 @@ void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma,
  * This means the inc-and-test can be bypassed.
  * The folio does not have to be locked.
  *
- * If the folio is large, it is accounted as a THP. As the folio
+ * If the folio is pmd-mappable, it is accounted as a THP. As the folio
  * is new, it's assumed to be mapped exclusively by a single process.
  */
 void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long address)
 {
-	int nr;
+	int nr = folio_nr_pages(folio);
 
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
+	VM_BUG_ON_VMA(address < vma->vm_start ||
+			address + (nr << PAGE_SHIFT) > vma->vm_end, vma);
 	__folio_set_swapbacked(folio);
 
-	if (likely(!folio_test_pmd_mappable(folio))) {
+	if (likely(!folio_test_large(folio))) {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_mapcount, 0);
-		nr = 1;
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
+	} else if (!folio_test_pmd_mappable(folio)) {
+		int i;
+
+		for (i = 0; i < nr; i++) {
+			struct page *page = folio_page(folio, i);
+
+			/* increment count (starts at -1) */
+			atomic_set(&page->_mapcount, 0);
+			__page_set_anon_rmap(folio, page, vma,
+					address + (i << PAGE_SHIFT), 1);
+		}
+
+		atomic_set(&folio->_nr_pages_mapped, nr);
 	} else {
 		/* increment count (starts at -1) */
 		atomic_set(&folio->_entire_mapcount, 0);
 		atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED);
-		nr = folio_nr_pages(folio);
+		__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 		__lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr);
 	}
 
 	__lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr);
-	__page_set_anon_rmap(folio, &folio->page, vma, address, 1);
 }
 
 /**
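For context, the caller pattern this enables looks roughly like the following
(lifted from the fault-path changes in patch 3 of this series; addr is the
folio-aligned fault address and nr_pages the number of pages in the folio):

	folio_ref_add(folio, nr_pages - 1);
	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
	folio_add_new_anon_rmap(folio, vma, addr); /* per-page accounting for non-pmd-mappable folios */
	folio_add_lru_vma(folio, vma);
	...
	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);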
From patchwork Thu Aug 10 14:29:40 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 134017
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 3/5] mm: LARGE_ANON_FOLIO for improved performance
Date: Thu, 10 Aug 2023 15:29:40 +0100
Message-Id: <20230810142942.3169679-4-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

Introduce the LARGE_ANON_FOLIO feature, which allows anonymous memory to be
allocated in large folios of a determined order. All pages of the large folio
are pte-mapped during the same page fault, significantly reducing the number
of page faults. The number of per-page operations (e.g. ref counting, rmap
management, lru list management) is also significantly reduced, since those
operations now become per-folio.

The new behaviour is hidden behind the new LARGE_ANON_FOLIO Kconfig option,
which defaults to disabled for now; the long-term aim is for this to default
to enabled, but there are some risks around internal fragmentation that need
to be better understood first.

Large anonymous folio (LAF) allocation is integrated with the existing
(PMD-order) THP and single (S) page allocation according to the policy below,
where fallback (>) is performed for various reasons, such as the proposed
folio order not fitting within the bounds of the VMA:

                | prctl=dis | prctl=ena   | prctl=ena     | prctl=ena
                | sysfs=X   | sysfs=never | sysfs=madvise | sysfs=always
----------------|-----------|-------------|---------------|--------------
no hint         |     S     |    LAF>S    |     LAF>S     |   THP>LAF>S
MADV_HUGEPAGE   |     S     |    LAF>S    |   THP>LAF>S   |   THP>LAF>S
MADV_NOHUGEPAGE |     S     |      S      |       S       |       S

This approach ensures that we don't violate existing hints to only allocate
single pages - this is required for QEMU's VM live migration implementation
to work correctly - while allowing us to use LAF independently of THP (when
sysfs=never). This makes wide-scale performance characterization simpler,
while avoiding exposing any new ABI to user space.
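For reference, the knobs in the table above are driven through the existing
THP interfaces; a minimal user-space sketch follows (no new ABI is introduced
by this series; it assumes <sys/prctl.h> exposes PR_SET_THP_DISABLE, as
glibc's does):

#include <stddef.h>
#include <sys/mman.h>
#include <sys/prctl.h>

void thp_hint_examples(void *addr, size_t len)
{
	/* "prctl=dis" column: per-process opt-out of huge/large allocations. */
	prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);

	/* Table rows: per-VMA hints. */
	madvise(addr, len, MADV_HUGEPAGE);
	madvise(addr, len, MADV_NOHUGEPAGE);

	/*
	 * Table columns: the global policy is selected via
	 * /sys/kernel/mm/transparent_hugepage/enabled (never/madvise/always).
	 */
}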
When using LAF for allocation, the folio order is determined as follows: the
return value of arch_wants_pte_order() is used. For vmas that have not
explicitly opted in to transparent hugepages (e.g. where sysfs=madvise and
the vma does not have MADV_HUGEPAGE, or where sysfs=never),
arch_wants_pte_order() is limited to 64K (or PAGE_SIZE, whichever is bigger).
This allows for a performance boost without requiring any explicit opt-in
from the workload, while limiting internal fragmentation.

If the preferred order can't be used (e.g. because the folio would breach the
bounds of the vma, or because ptes in the region are already mapped) then we
fall back to a suitable lower order; first PAGE_ALLOC_COSTLY_ORDER, then
order-0.

arch_wants_pte_order() can be overridden by the architecture if desired. Some
architectures (e.g. arm64) can coalesce TLB entries if a contiguous set of
ptes map physically contiguous, naturally aligned memory, so this mechanism
allows the architecture to optimize as required.

Here we add the default implementation of arch_wants_pte_order(), used when
the architecture does not define it, which returns -1, implying that the HW
has no preference. In this case, mm will choose its own default order.
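As an illustration of such an override (a hypothetical sketch, not part of
this patch; it assumes arm64's contiguous-PTE geometry, where
CONT_PTE_SHIFT - PAGE_SHIFT is the order of a naturally aligned range that
the MMU can cover with a single TLB entry):

/* e.g. in arch/arm64/include/asm/pgtable.h (illustrative only) */
#define arch_wants_pte_order arch_wants_pte_order
static inline int arch_wants_pte_order(void)
{
	/* 64K worth of 4K pages -> order 4 */
	return CONT_PTE_SHIFT - PAGE_SHIFT;
}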
Signed-off-by: Ryan Roberts
---
 include/linux/pgtable.h |  13 ++++
 mm/Kconfig              |  10 +++
 mm/memory.c             | 144 +++++++++++++++++++++++++++++++++++++---
 3 files changed, 158 insertions(+), 9 deletions(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 222a33b9600d..4b488cc66ddc 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -369,6 +369,19 @@ static inline bool arch_has_hw_pte_young(void)
 }
 #endif
 
+#ifndef arch_wants_pte_order
+/*
+ * Returns preferred folio order for pte-mapped memory. Must be in range [0,
+ * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
+ * to be at least order-2. Negative value implies that the HW has no preference
+ * and mm will choose it's own default order.
+ */
+static inline int arch_wants_pte_order(void)
+{
+	return -1;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
					unsigned long address,
diff --git a/mm/Kconfig b/mm/Kconfig
index 721dc88423c7..a1e28b8ddc24 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1243,4 +1243,14 @@ config LOCK_MM_AND_FIND_VMA
 
 source "mm/damon/Kconfig"
 
+config LARGE_ANON_FOLIO
+	bool "Allocate large folios for anonymous memory"
+	depends on TRANSPARENT_HUGEPAGE
+	default n
+	help
+	  Use large (bigger than order-0) folios to back anonymous memory where
+	  possible, even for pte-mapped memory. This reduces the number of page
+	  faults, as well as other per-page overheads to improve performance for
+	  many workloads.
+
 endmenu
diff --git a/mm/memory.c b/mm/memory.c
index d003076b218d..bbc7d4ce84f7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4073,6 +4073,123 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	return ret;
 }
 
+static bool vmf_pte_range_changed(struct vm_fault *vmf, int nr_pages)
+{
+	int i;
+
+	if (nr_pages == 1)
+		return vmf_pte_changed(vmf);
+
+	for (i = 0; i < nr_pages; i++) {
+		if (!pte_none(ptep_get_lockless(vmf->pte + i)))
+			return true;
+	}
+
+	return false;
+}
+
+#ifdef CONFIG_LARGE_ANON_FOLIO
+#define ANON_FOLIO_MAX_ORDER_UNHINTED \
+		(ilog2(max_t(unsigned long, SZ_64K, PAGE_SIZE)) - PAGE_SHIFT)
+
+static int anon_folio_order(struct vm_area_struct *vma)
+{
+	int order;
+
+	/*
+	 * If the vma is eligible for thp, allocate a large folio of the size
+	 * preferred by the arch. Or if the arch requested a very small size or
+	 * didn't request a size, then use PAGE_ALLOC_COSTLY_ORDER, which still
+	 * meets the arch's requirements but means we still take advantage of SW
+	 * optimizations (e.g. fewer page faults).
+	 *
+	 * If the vma isn't eligible for thp, take the arch-preferred size and
+	 * limit it to ANON_FOLIO_MAX_ORDER_UNHINTED. This ensures workloads
+	 * that have not explicitly opted-in take benefit while capping the
+	 * potential for internal fragmentation.
+	 */
+
+	order = max(arch_wants_pte_order(), PAGE_ALLOC_COSTLY_ORDER);
+
+	if (!hugepage_vma_check(vma, vma->vm_flags, false, true, true))
+		order = min(order, ANON_FOLIO_MAX_ORDER_UNHINTED);
+
+	return order;
+}
+
+static struct folio *alloc_anon_folio(struct vm_fault *vmf)
+{
+	int i;
+	gfp_t gfp;
+	pte_t *pte;
+	unsigned long addr;
+	struct folio *folio;
+	struct vm_area_struct *vma = vmf->vma;
+	int prefer = anon_folio_order(vma);
+	int orders[] = {
+		prefer,
+		prefer > PAGE_ALLOC_COSTLY_ORDER ? PAGE_ALLOC_COSTLY_ORDER : 0,
+		0,
+	};
+
+	/*
+	 * If uffd is active for the vma we need per-page fault fidelity to
+	 * maintain the uffd semantics.
+	 */
+	if (userfaultfd_armed(vma))
+		goto fallback;
+
+	/*
+	 * If hugepages are explicitly disabled for the vma (either
+	 * MADV_NOHUGEPAGE or prctl) fallback to order-0. Failure to do this
+	 * breaks correctness for user space. We ignore the sysfs global knob.
+	 */
+	if (!hugepage_vma_check(vma, vma->vm_flags, false, true, false))
+		goto fallback;
+
+	for (i = 0; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		if (addr >= vma->vm_start &&
+		    addr + (PAGE_SIZE << orders[i]) <= vma->vm_end)
+			break;
+	}
+
+	if (!orders[i])
+		goto fallback;
+
+	pte = pte_offset_map(vmf->pmd, vmf->address & PMD_MASK);
+	if (!pte)
+		return ERR_PTR(-EAGAIN);
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		vmf->pte = pte + pte_index(addr);
+		if (!vmf_pte_range_changed(vmf, 1 << orders[i]))
+			break;
+	}
+
+	vmf->pte = NULL;
+	pte_unmap(pte);
+
+	gfp = vma_thp_gfp_mask(vma);
+
+	for (; orders[i]; i++) {
+		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << orders[i]);
+		folio = vma_alloc_folio(gfp, orders[i], vma, addr, true);
+		if (folio) {
+			clear_huge_page(&folio->page, addr, 1 << orders[i]);
+			return folio;
+		}
+	}
+
+fallback:
+	return vma_alloc_zeroed_movable_folio(vma, vmf->address);
+}
+#else
+#define alloc_anon_folio(vmf) \
+		vma_alloc_zeroed_movable_folio((vmf)->vma, (vmf)->address)
+#endif
+
 /*
  * We enter with non-exclusive mmap_lock (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
@@ -4080,6 +4197,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
  */
 static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 {
+	int i;
+	int nr_pages = 1;
+	unsigned long addr = vmf->address;
 	bool uffd_wp = vmf_orig_pte_uffd_wp(vmf);
 	struct vm_area_struct *vma = vmf->vma;
 	struct folio *folio;
@@ -4124,10 +4244,15 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	/* Allocate our own private page. */
 	if (unlikely(anon_vma_prepare(vma)))
 		goto oom;
-	folio = vma_alloc_zeroed_movable_folio(vma, vmf->address);
+	folio = alloc_anon_folio(vmf);
+	if (IS_ERR(folio))
+		return 0;
 	if (!folio)
 		goto oom;
 
+	nr_pages = folio_nr_pages(folio);
+	addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE);
+
 	if (mem_cgroup_charge(folio, vma->vm_mm, GFP_KERNEL))
 		goto oom_free_page;
 	folio_throttle_swaprate(folio, GFP_KERNEL);
@@ -4144,12 +4269,12 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	if (vma->vm_flags & VM_WRITE)
 		entry = pte_mkwrite(pte_mkdirty(entry));
 
-	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address,
-			&vmf->ptl);
+	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
 	if (!vmf->pte)
 		goto release;
-	if (vmf_pte_changed(vmf)) {
-		update_mmu_tlb(vma, vmf->address, vmf->pte);
+	if (vmf_pte_range_changed(vmf, nr_pages)) {
+		for (i = 0; i < nr_pages; i++)
+			update_mmu_tlb(vma, addr + PAGE_SIZE * i, vmf->pte + i);
 		goto release;
 	}
 
@@ -4164,16 +4289,17 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 		return handle_userfault(vmf, VM_UFFD_MISSING);
 	}
 
-	inc_mm_counter(vma->vm_mm, MM_ANONPAGES);
-	folio_add_new_anon_rmap(folio, vma, vmf->address);
+	folio_ref_add(folio, nr_pages - 1);
+	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
+	folio_add_new_anon_rmap(folio, vma, addr);
 	folio_add_lru_vma(folio, vma);
 setpte:
 	if (uffd_wp)
 		entry = pte_mkuffd_wp(entry);
-	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
+	set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
 
 	/* No need to invalidate - it was non-present before */
-	update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1);
+	update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
 unlock:
 	if (vmf->pte)
 		pte_unmap_unlock(vmf->pte, vmf->ptl);
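As a rough worked example of the intended saving (assuming 4K pages and
arch_wants_pte_order() returning 4, i.e. 64K folios): faulting sequentially
over a 2M VMA takes 2M / 64K = 32 page faults instead of 2M / 4K = 512, and
each fault installs 16 ptes with a single set_ptes() call rather than one
set_pte_at() per page.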
From patchwork Thu Aug 10 14:29:41 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 134033
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 4/5] selftests/mm/cow: Generalize do_run_with_thp() helper
Date: Thu, 10 Aug 2023 15:29:41 +0100
Message-Id: <20230810142942.3169679-5-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

do_run_with_thp() prepares THP memory into different states before running
tests. We would like to reuse this logic to also test large anon folios. So
let's add a size parameter which tells the function what size of memory it
should operate on.

Remove references to THP and replace them with LARGE, and fix up all existing
call sites to pass thpsize as the required size.

No functional change intended here, but a separate commit will add new large
anon folio tests that use this new capability.
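The resulting call-site change is mechanical; for example:

	/* before */
	do_run_with_thp(fn, THP_RUN_PTE);

	/* after: existing THP callers pass thpsize explicitly ... */
	do_run_with_large(fn, LARGE_RUN_PTE, thpsize);

	/* ... so the next patch can reuse the helper with the smaller, probed lafsize */
	do_run_with_large(fn, LARGE_RUN_PTE, lafsize);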
Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 118 ++++++++++++++++---------------
 1 file changed, 61 insertions(+), 57 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 7324ce5363c0..304882bf2e5d 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -723,25 +723,25 @@ static void run_with_base_page_swap(test_fn fn, const char *desc)
 	do_run_with_base_page(fn, true);
 }
 
-enum thp_run {
-	THP_RUN_PMD,
-	THP_RUN_PMD_SWAPOUT,
-	THP_RUN_PTE,
-	THP_RUN_PTE_SWAPOUT,
-	THP_RUN_SINGLE_PTE,
-	THP_RUN_SINGLE_PTE_SWAPOUT,
-	THP_RUN_PARTIAL_MREMAP,
-	THP_RUN_PARTIAL_SHARED,
+enum large_run {
+	LARGE_RUN_PMD,
+	LARGE_RUN_PMD_SWAPOUT,
+	LARGE_RUN_PTE,
+	LARGE_RUN_PTE_SWAPOUT,
+	LARGE_RUN_SINGLE_PTE,
+	LARGE_RUN_SINGLE_PTE_SWAPOUT,
+	LARGE_RUN_PARTIAL_MREMAP,
+	LARGE_RUN_PARTIAL_SHARED,
 };
 
-static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
+static void do_run_with_large(test_fn fn, enum large_run large_run, size_t size)
 {
 	char *mem, *mmap_mem, *tmp, *mremap_mem = MAP_FAILED;
-	size_t size, mmap_size, mremap_size;
+	size_t mmap_size, mremap_size;
 	int ret;
 
-	/* For alignment purposes, we need twice the thp size. */
-	mmap_size = 2 * thpsize;
+	/* For alignment purposes, we need twice the requested size. */
+	mmap_size = 2 * size;
 	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 	if (mmap_mem == MAP_FAILED) {
@@ -749,36 +749,40 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
 		return;
 	}
 
-	/* We need a THP-aligned memory area. */
-	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	/* We need to naturally align the memory area. */
+	mem = (char *)(((uintptr_t)mmap_mem + size) & ~(size - 1));
 
-	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	ret = madvise(mem, size, MADV_HUGEPAGE);
 	if (ret) {
 		ksft_test_result_fail("MADV_HUGEPAGE failed\n");
 		goto munmap;
 	}
 
 	/*
-	 * Try to populate a THP. Touch the first sub-page and test if we get
-	 * another sub-page populated automatically.
+	 * Try to populate a large folio. Touch the first sub-page and test if
+	 * we get the last sub-page populated automatically.
 	 */
 	mem[0] = 0;
-	if (!pagemap_is_populated(pagemap_fd, mem + pagesize)) {
-		ksft_test_result_skip("Did not get a THP populated\n");
+	if (!pagemap_is_populated(pagemap_fd, mem + size - pagesize)) {
+		ksft_test_result_skip("Did not get fully populated\n");
 		goto munmap;
 	}
-	memset(mem, 0, thpsize);
+	memset(mem, 0, size);
 
-	size = thpsize;
-	switch (thp_run) {
-	case THP_RUN_PMD:
-	case THP_RUN_PMD_SWAPOUT:
+	switch (large_run) {
+	case LARGE_RUN_PMD:
+	case LARGE_RUN_PMD_SWAPOUT:
+		if (size != thpsize) {
+			ksft_test_result_fail("test bug: can't PMD-map size\n");
+			goto munmap;
+		}
 		break;
-	case THP_RUN_PTE:
-	case THP_RUN_PTE_SWAPOUT:
+	case LARGE_RUN_PTE:
+	case LARGE_RUN_PTE_SWAPOUT:
 		/*
-		 * Trigger PTE-mapping the THP by temporarily mapping a single
-		 * subpage R/O.
+		 * Trigger PTE-mapping the large folio by temporarily mapping a
+		 * single subpage R/O. This is a noop if the large-folio is not
+		 * thpsize (and therefore already PTE-mapped).
		 */
 		ret = mprotect(mem + pagesize, pagesize, PROT_READ);
 		if (ret) {
@@ -791,25 +795,25 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run)
			goto munmap;
 		}
 		break;
-	case THP_RUN_SINGLE_PTE:
-	case THP_RUN_SINGLE_PTE_SWAPOUT:
+	case LARGE_RUN_SINGLE_PTE:
+	case LARGE_RUN_SINGLE_PTE_SWAPOUT:
 		/*
-		 * Discard all but a single subpage of that PTE-mapped THP. What
What - * remains is a single PTE mapping a single subpage. + * Discard all but a single subpage of that PTE-mapped large + * folio. What remains is a single PTE mapping a single subpage. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTNEED); + ret = madvise(mem + pagesize, size - pagesize, MADV_DONTNEED); if (ret) { ksft_test_result_fail("MADV_DONTNEED failed\n"); goto munmap; } size = pagesize; break; - case THP_RUN_PARTIAL_MREMAP: + case LARGE_RUN_PARTIAL_MREMAP: /* - * Remap half of the THP. We need some new memory location - * for that. + * Remap half of the lareg folio. We need some new memory + * location for that. */ - mremap_size = thpsize / 2; + mremap_size = size / 2; mremap_mem = mmap(NULL, mremap_size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); if (mem == MAP_FAILED) { @@ -824,13 +828,13 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) } size = mremap_size; break; - case THP_RUN_PARTIAL_SHARED: + case LARGE_RUN_PARTIAL_SHARED: /* - * Share the first page of the THP with a child and quit the - * child. This will result in some parts of the THP never - * have been shared. + * Share the first page of the large folio with a child and quit + * the child. This will result in some parts of the large folio + * never have been shared. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DONTFORK); + ret = madvise(mem + pagesize, size - pagesize, MADV_DONTFORK); if (ret) { ksft_test_result_fail("MADV_DONTFORK failed\n"); goto munmap; @@ -844,7 +848,7 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) } wait(&ret); /* Allow for sharing all pages again. */ - ret = madvise(mem + pagesize, thpsize - pagesize, MADV_DOFORK); + ret = madvise(mem + pagesize, size - pagesize, MADV_DOFORK); if (ret) { ksft_test_result_fail("MADV_DOFORK failed\n"); goto munmap; @@ -854,10 +858,10 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) assert(false); } - switch (thp_run) { - case THP_RUN_PMD_SWAPOUT: - case THP_RUN_PTE_SWAPOUT: - case THP_RUN_SINGLE_PTE_SWAPOUT: + switch (large_run) { + case LARGE_RUN_PMD_SWAPOUT: + case LARGE_RUN_PTE_SWAPOUT: + case LARGE_RUN_SINGLE_PTE_SWAPOUT: madvise(mem, size, MADV_PAGEOUT); if (!range_is_swapped(mem, size)) { ksft_test_result_skip("MADV_PAGEOUT did not work, is swap enabled?\n"); @@ -878,49 +882,49 @@ static void do_run_with_thp(test_fn fn, enum thp_run thp_run) static void run_with_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD); + do_run_with_large(fn, LARGE_RUN_PMD, thpsize); } static void run_with_thp_swap(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with swapped-out THP\n", desc); - do_run_with_thp(fn, THP_RUN_PMD_SWAPOUT); + do_run_with_large(fn, LARGE_RUN_PMD_SWAPOUT, thpsize); } static void run_with_pte_mapped_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE); + do_run_with_large(fn, LARGE_RUN_PTE, thpsize); } static void run_with_pte_mapped_thp_swap(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... with swapped-out, PTE-mapped THP\n", desc); - do_run_with_thp(fn, THP_RUN_PTE_SWAPOUT); + do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, thpsize); } static void run_with_single_pte_of_thp(test_fn fn, const char *desc) { ksft_print_msg("[RUN] %s ... 
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, thpsize);
 }
 
 static void run_with_single_pte_of_thp_swap(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_SINGLE_PTE_SWAPOUT);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, thpsize);
 }
 
 static void run_with_partial_mremap_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially mremap()'ed THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_MREMAP);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, thpsize);
 }
 
 static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 {
 	ksft_print_msg("[RUN] %s ... with partially shared THP\n", desc);
-	do_run_with_thp(fn, THP_RUN_PARTIAL_SHARED);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
@@ -1338,7 +1342,7 @@ static void run_anon_thp_test_cases(void)
 		struct test_case const *test_case = &anon_thp_test_cases[i];
 
 		ksft_print_msg("[RUN] %s\n", test_case->desc);
-		do_run_with_thp(test_case->fn, THP_RUN_PMD);
+		do_run_with_large(test_case->fn, LARGE_RUN_PMD, thpsize);
 	}
 }

From patchwork Thu Aug 10 14:29:42 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 134039
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand, Yu Zhao,
    Catalin Marinas, Anshuman Khandual, Yang Shi, "Huang, Ying", Zi Yan,
    Luis Chamberlain, Itaru Kitayama, "Kirill A. Shutemov"
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 5/5] selftests/mm/cow: Add large anon folio tests
Date: Thu, 10 Aug 2023 15:29:42 +0100
Message-Id: <20230810142942.3169679-6-ryan.roberts@arm.com>
In-Reply-To: <20230810142942.3169679-1-ryan.roberts@arm.com>
References: <20230810142942.3169679-1-ryan.roberts@arm.com>

Add tests similar to the existing THP tests, but which operate on memory
backed by large anonymous folios, which are smaller than THP. This reuses all
the existing infrastructure. If the test suite detects that large anonymous
folios are not supported by the kernel, the new tests are skipped.
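As a worked example of the detection heuristic used below (assuming 4K pages
and 2M THPs): the test maps 6M, aligns a pointer to a 2M boundary and then
offsets it by 1M so a PMD mapping is impossible, touches the first byte, and
walks /proc/self/pagemap at offsets 4K, 8K, 16K, ... until it finds an
unpopulated page. With 64K large anon folios the walk stops at order 4, so
64K is reported; if only the first page is populated, large anon folios are
treated as unsupported and the new tests are skipped.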
Signed-off-by: Ryan Roberts
---
 tools/testing/selftests/mm/cow.c | 111 +++++++++++++++++++++++++++++--
 1 file changed, 106 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/mm/cow.c b/tools/testing/selftests/mm/cow.c
index 304882bf2e5d..932242c965a4 100644
--- a/tools/testing/selftests/mm/cow.c
+++ b/tools/testing/selftests/mm/cow.c
@@ -33,6 +33,7 @@
 static size_t pagesize;
 static int pagemap_fd;
 static size_t thpsize;
+static size_t lafsize;
 static int nr_hugetlbsizes;
 static size_t hugetlbsizes[10];
 static int gup_fd;
@@ -927,6 +928,42 @@ static void run_with_partial_shared_thp(test_fn fn, const char *desc)
 	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, thpsize);
 }
 
+static void run_with_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE, lafsize);
+}
+
+static void run_with_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_single_pte_of_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE, lafsize);
+}
+
+static void run_with_single_pte_of_laf_swap(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with single PTE of swapped-out large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_SINGLE_PTE_SWAPOUT, lafsize);
+}
+
+static void run_with_partial_mremap_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially mremap()'ed large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_MREMAP, lafsize);
+}
+
+static void run_with_partial_shared_laf(test_fn fn, const char *desc)
+{
+	ksft_print_msg("[RUN] %s ... with partially shared large anon folio\n", desc);
+	do_run_with_large(fn, LARGE_RUN_PARTIAL_SHARED, lafsize);
+}
+
 static void run_with_hugetlb(test_fn fn, const char *desc, size_t hugetlbsize)
 {
 	int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB;
@@ -1105,6 +1142,14 @@ static void run_anon_test_case(struct test_case const *test_case)
 		run_with_partial_mremap_thp(test_case->fn, test_case->desc);
 		run_with_partial_shared_thp(test_case->fn, test_case->desc);
 	}
+	if (lafsize) {
+		run_with_laf(test_case->fn, test_case->desc);
+		run_with_laf_swap(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf(test_case->fn, test_case->desc);
+		run_with_single_pte_of_laf_swap(test_case->fn, test_case->desc);
+		run_with_partial_mremap_laf(test_case->fn, test_case->desc);
+		run_with_partial_shared_laf(test_case->fn, test_case->desc);
+	}
 	for (i = 0; i < nr_hugetlbsizes; i++)
 		run_with_hugetlb(test_case->fn, test_case->desc,
				 hugetlbsizes[i]);
@@ -1126,6 +1171,8 @@ static int tests_per_anon_test_case(void)
 
 	if (thpsize)
 		tests += 8;
+	if (lafsize)
+		tests += 6;
 	return tests;
 }
 
@@ -1680,15 +1727,74 @@ static int tests_per_non_anon_test_case(void)
 	return tests;
 }
 
+static size_t large_anon_folio_size(void)
+{
+	/*
+	 * There is no interface to query this. But we know that it must be less
+	 * than thpsize. So we map a thpsize area, aligned to thpsize offset by
+	 * thpsize/2 (to avoid a hugepage being allocated), then touch the first
+	 * page and see how many pages get faulted in.
+	 */
+
+	int max_order = __builtin_ctz(thpsize);
+	size_t mmap_size = thpsize * 3;
+	char *mmap_mem = NULL;
+	int order = 0;
+	char *mem;
+	size_t offset;
+	int ret;
+
+	/* For alignment purposes, we need 2.5x the requested size. */
+	mmap_mem = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
+			MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (mmap_mem == MAP_FAILED)
+		goto out;
+
+	/* Align the memory area to thpsize then offset it by thpsize/2. */
+	mem = (char *)(((uintptr_t)mmap_mem + thpsize) & ~(thpsize - 1));
+	mem += thpsize / 2;
+
+	/* We might get a bigger large anon folio when MADV_HUGEPAGE is set. */
+	ret = madvise(mem, thpsize, MADV_HUGEPAGE);
+	if (ret)
+		goto out;
+
+	/* Probe the memory to see how much is populated. */
+	mem[0] = 0;
+	for (order = 0; order < max_order; order++) {
+		offset = (1 << order) * pagesize;
+		if (!pagemap_is_populated(pagemap_fd, mem + offset))
+			break;
+	}
+
+out:
+	if (mmap_mem)
+		munmap(mmap_mem, mmap_size);
+
+	if (order == 0)
+		return 0;
+
+	return offset;
+}
+
 int main(int argc, char **argv)
 {
 	int err;
 
+	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
+	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
+	if (pagemap_fd < 0)
+		ksft_exit_fail_msg("opening pagemap failed\n");
+
 	pagesize = getpagesize();
 	thpsize = read_pmd_pagesize();
 	if (thpsize)
		ksft_print_msg("[INFO] detected THP size: %zu KiB\n",
			       thpsize / 1024);
+	lafsize = large_anon_folio_size();
+	if (lafsize)
+		ksft_print_msg("[INFO] detected large anon folio size: %zu KiB\n",
+			       lafsize / 1024);
 	nr_hugetlbsizes = detect_hugetlb_page_sizes(hugetlbsizes,
						    ARRAY_SIZE(hugetlbsizes));
 	detect_huge_zeropage();
@@ -1698,11 +1804,6 @@ int main(int argc, char **argv)
		      ARRAY_SIZE(anon_thp_test_cases) * tests_per_anon_thp_test_case() +
		      ARRAY_SIZE(non_anon_test_cases) * tests_per_non_anon_test_case());
 
-	gup_fd = open("/sys/kernel/debug/gup_test", O_RDWR);
-	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
-	if (pagemap_fd < 0)
-		ksft_exit_fail_msg("opening pagemap failed\n");
-
 	run_anon_test_cases();
 	run_anon_thp_test_cases();
 	run_non_anon_test_cases();