Message ID | 20240202161554.565023-2-zi.yan@sent.com |
---|---|
State | New |
Headers |
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Zi Yan <ziy@nvidia.com>, "Huang, Ying" <ying.huang@intel.com>, Ryan Roberts <ryan.roberts@arm.com>, Andrew Morton <akpm@linux-foundation.org>, "Matthew Wilcox (Oracle)" <willy@infradead.org>, David Hildenbrand <david@redhat.com>, "Yin, Fengwei" <fengwei.yin@intel.com>, Yu Zhao <yuzhao@google.com>, Vlastimil Babka <vbabka@suse.cz>, "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>, Johannes Weiner <hannes@cmpxchg.org>, Baolin Wang <baolin.wang@linux.alibaba.com>, Kemeng Shi <shikemeng@huaweicloud.com>, Mel Gorman <mgorman@techsingularity.net>, Rohan Puri <rohan.puri15@gmail.com>, Mcgrof Chamberlain <mcgrof@kernel.org>, Adam Manzanares <a.manzanares@samsung.com>, "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Subject: [PATCH v3 1/3] mm/compaction: enable compacting >0 order folios.
Date: Fri, 2 Feb 2024 11:15:52 -0500
Message-ID: <20240202161554.565023-2-zi.yan@sent.com>
In-Reply-To: <20240202161554.565023-1-zi.yan@sent.com>
References: <20240202161554.565023-1-zi.yan@sent.com>
Reply-To: Zi Yan <ziy@nvidia.com> |
Series | Enable >0 order folio memory compaction |
Commit Message
Zi Yan
Feb. 2, 2024, 4:15 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

migrate_pages() supports >0 order folio migration and during compaction,
even if compaction_alloc() cannot provide >0 order free pages,
migrate_pages() can split the source page and try to migrate the base pages
from the split. It can be a baseline and start point for adding support for
compacting >0 order folios.

Suggested-by: Huang Ying <ying.huang@intel.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
 1 file changed, 35 insertions(+), 8 deletions(-)
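To make the fallback described above concrete, here is a minimal userspace C sketch (not kernel code) of the interaction the commit message relies on: when the allocation callback cannot offer a destination for a large folio, the migration loop splits the source and retries with base pages. All names here (folio_model, model_compaction_alloc, model_migrate_one) are made-up stand-ins for illustration, not the real mm/compaction APIs.

	#include <stdio.h>
	#include <stdbool.h>

	/* Simplified stand-in for a folio; not the real kernel structure. */
	struct folio_model { unsigned int order; };

	/*
	 * Models compaction_alloc() as changed by this patch: with only
	 * order-0 free pages on hand, it refuses large sources ("no
	 * destination"), which in the kernel means returning NULL.
	 */
	static bool model_compaction_alloc(const struct folio_model *src)
	{
		return src->order == 0;	/* only order-0 destinations available */
	}

	/*
	 * Models the migrate_pages() retry behaviour: if no destination is
	 * found for a large folio, split it and migrate the base pages.
	 */
	static void model_migrate_one(struct folio_model src)
	{
		if (model_compaction_alloc(&src)) {
			printf("order-%u folio migrated directly\n", src.order);
			return;
		}

		unsigned long nr_base_pages = 1UL << src.order;
		printf("no order-%u destination: split into %lu base pages and retry\n",
		       src.order, nr_base_pages);

		for (unsigned long i = 0; i < nr_base_pages; i++) {
			struct folio_model base = { .order = 0 };
			model_compaction_alloc(&base);	/* always succeeds here */
		}
		printf("%lu base pages migrated after split\n", nr_base_pages);
	}

	int main(void)
	{
		model_migrate_one((struct folio_model){ .order = 0 });
		model_migrate_one((struct folio_model){ .order = 4 });
		return 0;
	}

The real code achieves this effect simply by having compaction_alloc() return NULL for large folios, which migrate_pages() already treats as "split the source and retry", as the comment in the patch notes.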
Comments
On 2/2/24 17:15, Zi Yan wrote:
> From: Zi Yan <ziy@nvidia.com>
>
> migrate_pages() supports >0 order folio migration and during compaction,
> even if compaction_alloc() cannot provide >0 order free pages,
> migrate_pages() can split the source page and try to migrate the base pages
> from the split. It can be a baseline and start point for adding support for
> compacting >0 order folios.
>
> Suggested-by: Huang Ying <ying.huang@intel.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/compaction.c | 43 +++++++++++++++++++++++++++++++++++--------
>  1 file changed, 35 insertions(+), 8 deletions(-)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 4add68d40e8d..e43e898d2c77 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
>  	return too_many;
>  }
>
> +/*
> + * 1. if the page order is larger than or equal to target_order (i.e.,
> + * cc->order and when it is not -1 for global compaction), skip it since
> + * target_order already indicates no free page with larger than target_order
> + * exists and later migrating it will most likely fail;
> + *
> + * 2. compacting > pageblock_order pages does not improve memory fragmentation,
> + * skip them;
> + */
> +static bool skip_isolation_on_order(int order, int target_order)
> +{
> +	return (target_order != -1 && order >= target_order) ||
> +		order >= pageblock_order;
> +}
> +
>  /**
>   * isolate_migratepages_block() - isolate all migrate-able pages within
>   * a single pageblock
> @@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  		/*
>  		 * Regardless of being on LRU, compound pages such as THP and
>  		 * hugetlbfs are not to be compacted unless we are attempting
> -		 * an allocation much larger than the huge page size (eg CMA).
> +		 * an allocation larger than the compound page size.
>  		 * We can potentially save a lot of iterations if we skip them
>  		 * at once. The check is racy, but we can consider only valid
>  		 * values and the only danger is skipping too much.
> @@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  		if (PageCompound(page) && !cc->alloc_contig) {
>  			const unsigned int order = compound_order(page);
>
> -			if (likely(order <= MAX_PAGE_ORDER)) {
> -				low_pfn += (1UL << order) - 1;
> -				nr_scanned += (1UL << order) - 1;
> +			/*
> +			 * Skip based on page order and compaction target order
> +			 * and skip hugetlbfs pages.
> +			 */
> +			if (skip_isolation_on_order(order, cc->order) ||
> +			    PageHuge(page)) {

Hm I'd try to avoid a new PageHuge() test here.

Earlier we have a block that does
	if (PageHuge(page) && cc->alloc_contig) {
	...

I think I'd rather rewrite it to handle the PageHuge() case completely and
just make it skip the 1UL << order pages there for !cc->alloc_contig. Even
if it means duplicating a bit of the low_pfn and nr_scanned bumping code.

Which reminds me the PageHuge() check there is probably still broken ATM:

https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/

Even better reason not to add another one.
If the huge page materialized since the first check, we should bail out when
testing PageLRU later anyway.

> +				if (order <= MAX_PAGE_ORDER) {
> +					low_pfn += (1UL << order) - 1;
> +					nr_scanned += (1UL << order) - 1;
> +				}
> +				goto isolate_fail;
>  			}
> -			goto isolate_fail;
>  		}
>
>  		/*
> @@ -1165,10 +1187,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
>  			}
>
>  			/*
> -			 * folio become large since the non-locked check,
> -			 * and it's on LRU.
> +			 * Check LRU folio order under the lock
>  			 */
> -			if (unlikely(folio_test_large(folio) && !cc->alloc_contig)) {
> +			if (unlikely(skip_isolation_on_order(folio_order(folio),
> +							cc->order) &&
> +					!cc->alloc_contig)) {
>  				low_pfn += folio_nr_pages(folio) - 1;
>  				nr_scanned += folio_nr_pages(folio) - 1;
>  				folio_set_lru(folio);
> @@ -1786,6 +1809,10 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>  	struct compact_control *cc = (struct compact_control *)data;
>  	struct folio *dst;
>
> +	/* this makes migrate_pages() split the source page and retry */
> +	if (folio_test_large(src) > 0)
> +		return NULL;
> +
>  	if (list_empty(&cc->freepages)) {
>  		isolate_freepages(cc);
>
On 9 Feb 2024, at 9:32, Vlastimil Babka wrote:
> On 2/2/24 17:15, Zi Yan wrote:
>
>> [...]
>>
>> +			/*
>> +			 * Skip based on page order and compaction target order
>> +			 * and skip hugetlbfs pages.
>> +			 */
>> +			if (skip_isolation_on_order(order, cc->order) ||
>> +			    PageHuge(page)) {
>
> Hm I'd try to avoid a new PageHuge() test here.
>
> Earlier we have a block that does
> 	if (PageHuge(page) && cc->alloc_contig) {
> 	...
>
> I think I'd rather rewrite it to handle the PageHuge() case completely and
> just make it skip the 1UL << order pages there for !cc->alloc_contig. Even
> if it means duplicating a bit of the low_pfn and nr_scanned bumping code.
>
> Which reminds me the PageHuge() check there is probably still broken ATM:
>
> https://lore.kernel.org/all/8fa1c95c-4749-33dd-42ba-243e492ab109@suse.cz/
>
> Even better reason not to add another one.
> If the huge page materialized since the first check, we should bail out when
> testing PageLRU later anyway.
OK, so basically something like:

	if (PageHuge(page)) {
		if (cc->alloc_contig) {
			// existing code for PageHuge(page) && cc->alloc_contig
		} else {
			const unsigned int order = compound_order(page);

			if (order <= MAX_PAGE_ORDER) {
				low_pfn += (1UL << order) - 1;
				nr_scanned += (1UL << order) - 1;
			}
			goto isolate_fail;
		}
	}

	...

	if (PageCompound(page) && !cc->alloc_contig) {
		const unsigned int order = compound_order(page);

		/* Skip based on page order and compaction target order. */
		if (skip_isolation_on_order(order, cc->order)) {
			if (order <= MAX_PAGE_ORDER) {
				low_pfn += (1UL << order) - 1;
				nr_scanned += (1UL << order) - 1;
			}
			goto isolate_fail;
		}
	}

[...]

--
Best Regards,
Yan, Zi
On 2/9/24 20:25, Zi Yan wrote:
> On 9 Feb 2024, at 9:32, Vlastimil Babka wrote:
>> On 2/2/24 17:15, Zi Yan wrote:
>
> [...]
>
> OK, so basically something like:
>
> if (PageHuge(page)) {
> 	if (cc->alloc_contig) {

Yeah but I'd handle the !cc->alloc_contig first as that ends with a goto,
and then the rest doesn't need to be "} else { ... }" with extra indentation

> 		// existing code for PageHuge(page) && cc->alloc_contig
> 	} else {
> 		const unsigned int order = compound_order(page);
>
> 		if (order <= MAX_PAGE_ORDER) {
> 			low_pfn += (1UL << order) - 1;
> 			nr_scanned += (1UL << order) - 1;
> 		}
> 		goto isolate_fail;
> 	}
> }
On 9 Feb 2024, at 15:43, Vlastimil Babka wrote:
> On 2/9/24 20:25, Zi Yan wrote:
>> On 9 Feb 2024, at 9:32, Vlastimil Babka wrote:
>>> On 2/2/24 17:15, Zi Yan wrote:
>
> [...]
>
>>> If the huge page materialized since the first check, we should bail out when
>>> testing PageLRU later anyway.
>>
>> OK, so basically something like:
>>
>> if (PageHuge(page)) {
>> 	if (cc->alloc_contig) {
>
> Yeah but I'd handle the !cc->alloc_contig first as that ends with a goto,
> and then the rest doesn't need to be "} else { ... }" with extra indentation

OK. No problem.

--
Best Regards,
Yan, Zi
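As an aside for readers following the thread, the shape the discussion converges on (handle the bail-out path first so the remaining code needs no else branch or extra indentation) can be played with in the small userspace model below. It is only a sketch of the agreed control flow under simplified, made-up types (model_ctx, a plain MAX_PAGE_ORDER macro, model_isolate_huge); it is not the code that was eventually merged.

	#include <stdio.h>
	#include <stdbool.h>

	/* Hypothetical stand-ins; the real code works on struct page and cc. */
	#define MAX_PAGE_ORDER 10

	struct model_ctx { bool alloc_contig; };

	/*
	 * Models the ordering discussed above: the !alloc_contig case is
	 * handled first because it ends with a goto, so the alloc_contig
	 * path that follows needs no "else" block.
	 */
	static bool model_isolate_huge(struct model_ctx *cc, unsigned int order,
				       unsigned long *low_pfn,
				       unsigned long *nr_scanned)
	{
		if (!cc->alloc_contig) {
			if (order <= MAX_PAGE_ORDER) {
				*low_pfn += (1UL << order) - 1;
				*nr_scanned += (1UL << order) - 1;
			}
			goto isolate_fail;
		}

		/* existing PageHuge() && cc->alloc_contig handling goes here */
		return true;

	isolate_fail:
		return false;
	}

	int main(void)
	{
		struct model_ctx cc = { .alloc_contig = false };
		unsigned long low_pfn = 0, nr_scanned = 0;

		bool isolated = model_isolate_huge(&cc, 9, &low_pfn, &nr_scanned);
		printf("isolated=%d low_pfn+=%lu nr_scanned+=%lu\n",
		       isolated, low_pfn, nr_scanned);
		return 0;
	}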
diff --git a/mm/compaction.c b/mm/compaction.c
index 4add68d40e8d..e43e898d2c77 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -816,6 +816,21 @@ static bool too_many_isolated(struct compact_control *cc)
 	return too_many;
 }
 
+/*
+ * 1. if the page order is larger than or equal to target_order (i.e.,
+ * cc->order and when it is not -1 for global compaction), skip it since
+ * target_order already indicates no free page with larger than target_order
+ * exists and later migrating it will most likely fail;
+ *
+ * 2. compacting > pageblock_order pages does not improve memory fragmentation,
+ * skip them;
+ */
+static bool skip_isolation_on_order(int order, int target_order)
+{
+	return (target_order != -1 && order >= target_order) ||
+		order >= pageblock_order;
+}
+
 /**
  * isolate_migratepages_block() - isolate all migrate-able pages within
  * a single pageblock
@@ -1010,7 +1025,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		/*
 		 * Regardless of being on LRU, compound pages such as THP and
 		 * hugetlbfs are not to be compacted unless we are attempting
-		 * an allocation much larger than the huge page size (eg CMA).
+		 * an allocation larger than the compound page size.
 		 * We can potentially save a lot of iterations if we skip them
 		 * at once. The check is racy, but we can consider only valid
 		 * values and the only danger is skipping too much.
@@ -1018,11 +1033,18 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		if (PageCompound(page) && !cc->alloc_contig) {
 			const unsigned int order = compound_order(page);
 
-			if (likely(order <= MAX_PAGE_ORDER)) {
-				low_pfn += (1UL << order) - 1;
-				nr_scanned += (1UL << order) - 1;
+			/*
+			 * Skip based on page order and compaction target order
+			 * and skip hugetlbfs pages.
+			 */
+			if (skip_isolation_on_order(order, cc->order) ||
+			    PageHuge(page)) {
+				if (order <= MAX_PAGE_ORDER) {
+					low_pfn += (1UL << order) - 1;
+					nr_scanned += (1UL << order) - 1;
+				}
+				goto isolate_fail;
 			}
-			goto isolate_fail;
 		}
 
 		/*
@@ -1165,10 +1187,11 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			}
 
 			/*
-			 * folio become large since the non-locked check,
-			 * and it's on LRU.
+			 * Check LRU folio order under the lock
 			 */
-			if (unlikely(folio_test_large(folio) && !cc->alloc_contig)) {
+			if (unlikely(skip_isolation_on_order(folio_order(folio),
+							cc->order) &&
+					!cc->alloc_contig)) {
 				low_pfn += folio_nr_pages(folio) - 1;
 				nr_scanned += folio_nr_pages(folio) - 1;
 				folio_set_lru(folio);
@@ -1786,6 +1809,10 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
 	struct compact_control *cc = (struct compact_control *)data;
 	struct folio *dst;
 
+	/* this makes migrate_pages() split the source page and retry */
+	if (folio_test_large(src) > 0)
+		return NULL;
+
 	if (list_empty(&cc->freepages)) {
 		isolate_freepages(cc);
 
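For anyone who wants to see how the new skip_isolation_on_order() helper behaves for different page orders and compaction targets without building a kernel, its two-clause test can be copied into a trivial userspace program like the one below. Treat it as a scratch model: pageblock_order is hard-coded to 9 here (a common value with 4KB base pages), which is an assumption of this sketch rather than anything the patch dictates.

	#include <stdio.h>
	#include <stdbool.h>

	/* Assumed constant for this sketch; in the kernel it is derived
	 * from the configuration. */
	static const int pageblock_order = 9;

	/* Same logic as the helper added by the patch. */
	static bool skip_isolation_on_order(int order, int target_order)
	{
		return (target_order != -1 && order >= target_order) ||
			order >= pageblock_order;
	}

	int main(void)
	{
		const int orders[] = { 0, 2, 4, 8, 9 };
		const int targets[] = { -1, 4, 9 };

		/* Print the decision table: -1 models global compaction. */
		for (unsigned t = 0; t < sizeof(targets) / sizeof(targets[0]); t++)
			for (unsigned o = 0; o < sizeof(orders) / sizeof(orders[0]); o++)
				printf("target_order=%2d order=%d -> %s\n",
				       targets[t], orders[o],
				       skip_isolation_on_order(orders[o], targets[t]) ?
				       "skip" : "isolate");
		return 0;
	}

With a target order of 4, for example, order-4 and order-8 folios are skipped while order-2 folios remain isolation candidates, which matches the reasoning in the comment above the helper.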