Message ID | 20221227002859.27740-3-ying.huang@intel.com |
---|---|
State | New |
Series | migrate_pages(): batch TLB flushing |
Commit Message
Huang, Ying
Dec. 27, 2022, 12:28 a.m. UTC
This is a preparation patch to batch the folio unmapping and moving
for non-hugetlb folios. Based on that, we can batch the TLB shootdown
during folio migration and make it possible to use a hardware
accelerator for folio copying.

In this patch, the migration of hugetlb folios and non-hugetlb folios
is separated in migrate_pages() to make it easy to change the
non-hugetlb folio migration implementation.
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Bharata B Rao <bharata@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: haoxin <xhao@linux.alibaba.com>
---
mm/migrate.c | 114 ++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 99 insertions(+), 15 deletions(-)
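
Schematically, the patch turns migrate_pages() into a two-phase
operation: a new migrate_hugetlbs() helper fully handles hugetlb folios
first, and only then does the existing retry loop run over what
remains. The outline below is a simplified sketch of the resulting
shape, drawn from the diff at the bottom of this page (declarations,
statistics, folio splitting, and most error handling elided):

	memset(&stats, 0, sizeof(stats));
	/* Phase 1: migrate hugetlb folios, with their own retry loop. */
	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private,
			      mode, reason, &stats, &ret_folios);
	if (rc < 0)		/* -ENOMEM: abort the whole operation */
		goto out;
	nr_failed = rc;		/* number of hugetlb folios that failed */

	/* Phase 2: the pre-existing retry loop, now non-hugetlb only. */
	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
		list_for_each_entry_safe(folio, folio2, from, lru) {
			/* hugetlb folios left here failed all phase-1 retries */
			if (folio_test_hugetlb(folio)) {
				list_move_tail(&folio->lru, &ret_folios);
				continue;
			}
			rc = unmap_and_move(get_new_page, put_new_page,
					    private, folio, pass > 2, mode,
					    reason, &ret_folios);
			/* ... same rc handling as before ... */
		}
	}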
Comments
On Tue, 27 Dec 2022 08:28:53 +0800 Huang Ying <ying.huang@intel.com> wrote:

> This is a preparation patch to batch the folio unmapping and moving
> for non-hugetlb folios. Based on that, we can batch the TLB shootdown
> during folio migration and make it possible to use a hardware
> accelerator for folio copying.
>
> In this patch, the migration of hugetlb folios and non-hugetlb folios
> is separated in migrate_pages() to make it easy to change the
> non-hugetlb folio migration implementation.
>
> ...
>
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1404,6 +1404,87 @@ struct migrate_pages_stats {
>  	int nr_thp_split;
>  };
>
> +static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
> +			    free_page_t put_new_page, unsigned long private,
> +			    enum migrate_mode mode, int reason,
> +			    struct migrate_pages_stats *stats,
> +			    struct list_head *ret_folios)
> +{
> +	int retry = 1;
> +	int nr_failed = 0;
> +	int nr_retry_pages = 0;
> +	int pass = 0;
> +	struct folio *folio, *folio2;
> +	int rc = 0, nr_pages;
> +
> +	for (pass = 0; pass < 10 && retry; pass++) {

Why 10?

> [...]
>
> +out:
> +	nr_failed += retry;
> +	stats->nr_failed_pages += nr_retry_pages;
> +	if (rc != -ENOMEM)
> +		rc = nr_failed;
> +
> +	return rc;
> +}

The interpretation of the return value of this function is somewhat
unobvious.

I suggest that this function be fully commented.

Why does a retry contribute to nr_failed? What is the interpretation of
nr_failed? Etcetera.
Andrew Morton <akpm@linux-foundation.org> writes:

> On Tue, 27 Dec 2022 08:28:53 +0800 Huang Ying <ying.huang@intel.com> wrote:
>
> [...]
>
>> +	for (pass = 0; pass < 10 && retry; pass++) {
>
> Why 10?

This is inherited from the original max pass number in migrate_pages(),
which was introduced in commit 49d2e9cc4544 ("[PATCH] Swap Migration
V5: migrate_pages() function"). From the code and commit message, I
cannot find out why. I guess that we need some magic number anyway.

Now, because the magic number is used in 2 places (migrate_pages() and
migrate_hugetlbs()), would it be better to define it as a constant
macro?

> [...]
>
> The interpretation of the return value of this function is somewhat
> unobvious.
>
> I suggest that this function be fully commented.
>
> Why does a retry contribute to nr_failed? What is the interpretation of
> nr_failed? Etcetera.

Sure. Will do that in the next version.

Best Regards,
Huang, Ying
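
As a concrete illustration of the constant-macro suggestion above, the
shared retry bound could become a named constant along the following
lines. This is only a sketch; the macro name is hypothetical and not
something this patch introduces:

/* Hypothetical name for the shared maximum number of retry passes. */
#define NR_MAX_MIGRATE_PAGES_RETRY	10

	/* Used identically in migrate_hugetlbs() and migrate_pages(): */
	for (pass = 0; pass < NR_MAX_MIGRATE_PAGES_RETRY && retry; pass++) {
		/* ... */
	}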
Huang Ying <ying.huang@intel.com> writes:

> This is a preparation patch to batch the folio unmapping and moving
> for non-hugetlb folios. Based on that, we can batch the TLB shootdown
> during folio migration and make it possible to use a hardware
> accelerator for folio copying.
>
> In this patch, the migration of hugetlb folios and non-hugetlb folios
> is separated in migrate_pages() to make it easy to change the
> non-hugetlb folio migration implementation.
>
> [...]
>
> +			/*
> +			 * The rules are:
> +			 *	Success: hugetlb folio will be put back
> +			 *	-EAGAIN: stay on the from list
> +			 *	-ENOMEM: stay on the from list
> +			 *	-ENOSYS: stay on the from list
> +			 *	Other errno: put on ret_folios list
> +			 */
> +			switch(rc) {
> +			case -ENOSYS:
> +				/* Hugetlb migration is unsupported */
> +				nr_failed++;
> +				stats->nr_failed_pages += nr_pages;
> +				list_move_tail(&folio->lru, ret_folios);
> +				break;
> +			case -ENOMEM:
> +				/*
> +				 * When memory is low, don't bother to try to migrate
> +				 * other folios, just exit.
> +				 */
> +				nr_failed++;

This currently isn't relevant for -ENOMEM and I think it would be
clearer if it was dropped.

> +				stats->nr_failed_pages += nr_pages;

Makes sense not to continue migration with low memory, but shouldn't we
add the remaining unmigrated hugetlb folios to stats->nr_failed_pages
as well? I.e. don't we still have to continue the iteration to find and
account for these?

> +				goto out;

Given this is the only use of the out label, and that there is a
special case for -ENOMEM there anyway, I think it would be clearer to
return directly.

> [...]
>
> @@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
>  		nr_retry_pages = 0;
>
>  		list_for_each_entry_safe(folio, folio2, from, lru) {
> +			if (folio_test_hugetlb(folio)) {

How do we hit this case? Shouldn't migrate_hugetlbs() have already
moved any hugetlb folios off the from list?

> +				list_move_tail(&folio->lru, &ret_folios);
> +				continue;
> +			}
>
> [...]
Alistair Popple <apopple@nvidia.com> writes:

> Huang Ying <ying.huang@intel.com> writes:
>
> [...]
>
>> +			case -ENOMEM:
>> +				/*
>> +				 * When memory is low, don't bother to try to migrate
>> +				 * other folios, just exit.
>> +				 */
>> +				nr_failed++;
>
> This currently isn't relevant for -ENOMEM and I think it would be
> clearer if it was dropped.

OK.

>> +				stats->nr_failed_pages += nr_pages;
>
> Makes sense not to continue migration with low memory, but shouldn't we
> add the remaining unmigrated hugetlb folios to stats->nr_failed_pages
> as well? I.e. don't we still have to continue the iteration to find and
> account for these?

I think nr_failed_pages only counts tried pages. IIUC, it's the
original behavior, and the behavior for non-hugetlb pages too.

>> +				goto out;
>
> Given this is the only use of the out label, and that there is a
> special case for -ENOMEM there anyway, I think it would be clearer to
> return directly.

Sounds good. Will do that in the next version.

>> +			if (folio_test_hugetlb(folio)) {
>
> How do we hit this case? Shouldn't migrate_hugetlbs() have already
> moved any hugetlb folios off the from list?

Retried hugetlb folios will be kept in the from list.

Best Regards,
Huang, Ying
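
Taken together, the two changes agreed above (dropping the nr_failed++
and returning directly instead of jumping to the single-use out label)
could make the -ENOMEM case look roughly like the sketch below. Folding
nr_retry_pages into the failure count on the way out is one plausible
way to preserve the accounting the out label used to do; this is an
illustration, not code from the series:

			case -ENOMEM:
				/*
				 * When memory is low, don't bother to try
				 * to migrate other folios, just bail out.
				 * Charge the pages retried so far as failed,
				 * as the out label previously did.
				 */
				stats->nr_failed_pages += nr_pages + nr_retry_pages;
				return -ENOMEM;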
"Huang, Ying" <ying.huang@intel.com> writes: > Alistair Popple <apopple@nvidia.com> writes: > >> Huang Ying <ying.huang@intel.com> writes: >> >>> This is a preparation patch to batch the folio unmapping and moving >>> for the non-hugetlb folios. Based on that we can batch the TLB >>> shootdown during the folio migration and make it possible to use some >>> hardware accelerator for the folio copying. >>> >>> In this patch the hugetlb folios and non-hugetlb folios migration is >>> separated in migrate_pages() to make it easy to change the non-hugetlb >>> folios migration implementation. >>> >>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com> >>> Cc: Zi Yan <ziy@nvidia.com> >>> Cc: Yang Shi <shy828301@gmail.com> >>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> >>> Cc: Oscar Salvador <osalvador@suse.de> >>> Cc: Matthew Wilcox <willy@infradead.org> >>> Cc: Bharata B Rao <bharata@amd.com> >>> Cc: Alistair Popple <apopple@nvidia.com> >>> Cc: haoxin <xhao@linux.alibaba.com> >>> --- >>> mm/migrate.c | 114 ++++++++++++++++++++++++++++++++++++++++++++------- >>> 1 file changed, 99 insertions(+), 15 deletions(-) >>> >>> diff --git a/mm/migrate.c b/mm/migrate.c >>> index ec9263a33d38..bdbe73fe2eb7 100644 >>> --- a/mm/migrate.c >>> +++ b/mm/migrate.c >>> @@ -1404,6 +1404,87 @@ struct migrate_pages_stats { >>> int nr_thp_split; >>> }; >>> >>> +static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page, >>> + free_page_t put_new_page, unsigned long private, >>> + enum migrate_mode mode, int reason, >>> + struct migrate_pages_stats *stats, >>> + struct list_head *ret_folios) >>> +{ >>> + int retry = 1; >>> + int nr_failed = 0; >>> + int nr_retry_pages = 0; >>> + int pass = 0; >>> + struct folio *folio, *folio2; >>> + int rc = 0, nr_pages; >>> + >>> + for (pass = 0; pass < 10 && retry; pass++) { >>> + retry = 0; >>> + nr_retry_pages = 0; >>> + >>> + list_for_each_entry_safe(folio, folio2, from, lru) { >>> + if (!folio_test_hugetlb(folio)) >>> + continue; >>> + >>> + nr_pages = folio_nr_pages(folio); >>> + >>> + cond_resched(); >>> + >>> + rc = unmap_and_move_huge_page(get_new_page, >>> + put_new_page, private, >>> + &folio->page, pass > 2, mode, >>> + reason, ret_folios); >>> + /* >>> + * The rules are: >>> + * Success: hugetlb folio will be put back >>> + * -EAGAIN: stay on the from list >>> + * -ENOMEM: stay on the from list >>> + * -ENOSYS: stay on the from list >>> + * Other errno: put on ret_folios list >>> + */ >>> + switch(rc) { >>> + case -ENOSYS: >>> + /* Hugetlb migration is unsupported */ >>> + nr_failed++; >>> + stats->nr_failed_pages += nr_pages; >>> + list_move_tail(&folio->lru, ret_folios); >>> + break; >>> + case -ENOMEM: >>> + /* >>> + * When memory is low, don't bother to try to migrate >>> + * other folios, just exit. >>> + */ >>> + nr_failed++; >> >> This currently isn't relevant for -ENOMEM and I think it would be >> clearer if it was dropped. > > OK. > >>> + stats->nr_failed_pages += nr_pages; >> >> Makes sense not to continue migration with low memory, but shouldn't we >> add the remaining unmigrated hugetlb folios to stats->nr_failed_pages as >> well? Ie. don't we still have to continue the iteration to to find and >> account for these? > > I think nr_failed_pages only counts tried pages. IIUC, it's the > original behavior and behavior for non-hugetlb pages too. Hmm, I agree it seems this is the original behavior but that behaviour seems arbitrary and wrong IMHO. The page failed to migrate, therefore it should count as such. 
The fact we didn't even try seems irrelevant. Indeed it looks like this was introduced because it was confusing to see no failures even though migrate_pages() was called - see dfef2ef4027b ("mm, migrate: increment fail count on ENOMEM"). But that seems inconsistent - why count this one folio as failed because of the allocation failure while other folios which would also likely cause allocation failures don't get counted? Fixing it is probably outside the scope of this series so I won't insist, but it would be nice as it could still lead to confusion in some scenarios. [...] >>> @@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> nr_retry_pages = 0; >>> >>> list_for_each_entry_safe(folio, folio2, from, lru) { >>> + if (folio_test_hugetlb(folio)) { >> >> How do we hit this case? Shouldn't migrate_hugetlbs() have already moved >> any hugetlb folios off the from list? > > Retried hugetlb folios will be kept in from list. Couldn't migrate_hugetlbs() remove the failing retried pages from the list on the final pass? That seems cleaner to me. >>> + list_move_tail(&folio->lru, &ret_folios); >>> + continue; >>> + } >>> + >>> /* >>> * Large folio statistics is based on the source large >>> * folio. Capture required information that might get >>> * lost during migration. >>> */ >>> - is_large = folio_test_large(folio) && !folio_test_hugetlb(folio); >>> + is_large = folio_test_large(folio); >>> is_thp = is_large && folio_test_pmd_mappable(folio); >>> nr_pages = folio_nr_pages(folio); >>> + >>> cond_resched(); >>> >>> - if (folio_test_hugetlb(folio)) >>> - rc = unmap_and_move_huge_page(get_new_page, >>> - put_new_page, private, >>> - &folio->page, pass > 2, mode, >>> - reason, >>> - &ret_folios); >>> - else >>> - rc = unmap_and_move(get_new_page, put_new_page, >>> - private, folio, pass > 2, mode, >>> - reason, &ret_folios); >>> + rc = unmap_and_move(get_new_page, put_new_page, >>> + private, folio, pass > 2, mode, >>> + reason, &ret_folios); >>> /* >>> * The rules are: >>> - * Success: non hugetlb folio will be freed, hugetlb >>> - * folio will be put back >>> + * Success: folio will be freed >>> * -EAGAIN: stay on the from list >>> * -ENOMEM: stay on the from list >>> * -ENOSYS: stay on the from list >>> @@ -1512,7 +1597,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>> stats.nr_thp_split += is_thp; >>> break; >>> } >>> - /* Hugetlb migration is unsupported */ >>> } else if (!no_split_folio_counting) { >>> nr_failed++; >>> } > > Best Regards, > Huang, Ying
[snip]

>>>> +			if (folio_test_hugetlb(folio)) {
>>>
>>> How do we hit this case? Shouldn't migrate_hugetlbs() have already
>>> moved any hugetlb folios off the from list?
>>
>> Retried hugetlb folios will be kept in the from list.
>
> Couldn't migrate_hugetlbs() remove the failing retried pages from the
> list on the final pass? That seems cleaner to me.

To do that, we need to go through the folio list again to remove all
hugetlb pages. It could be time-consuming in some cases. So I think
that it's better to keep this.

Best Regards,
Huang, Ying
"Huang, Ying" <ying.huang@intel.com> writes: > [snip] > >> >>>>> @@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>>>> nr_retry_pages = 0; >>>>> >>>>> list_for_each_entry_safe(folio, folio2, from, lru) { >>>>> + if (folio_test_hugetlb(folio)) { >>>> >>>> How do we hit this case? Shouldn't migrate_hugetlbs() have already moved >>>> any hugetlb folios off the from list? >>> >>> Retried hugetlb folios will be kept in from list. >> >> Couldn't migrate_hugetlbs() remove the failing retried pages from the >> list on the final pass? That seems cleaner to me. > > To do that, we need to go through the folio list again to remove all > hugetlb pages. It could be time-consuming in some cases. So I think > that it's better to keep this. Why? Couldn't we test pass == 9 and remove it from the list if it fails the final retry in migrate_hugetlbs()? In any case if it's on the list due to failed retries we have already passed over it 10 times, so the extra loop hardly seems like a problem. - Alistair > Best Regards, > Huang, Ying > >>>>> + list_move_tail(&folio->lru, &ret_folios); >>>>> + continue; >>>>> + } >>>>> + >>>>> /* >>>>> * Large folio statistics is based on the source large >>>>> * folio. Capture required information that might get >>>>> * lost during migration. >>>>> */ >>>>> - is_large = folio_test_large(folio) && !folio_test_hugetlb(folio); >>>>> + is_large = folio_test_large(folio); >>>>> is_thp = is_large && folio_test_pmd_mappable(folio); >>>>> nr_pages = folio_nr_pages(folio); >>>>> + >>>>> cond_resched(); >>>>> >>>>> - if (folio_test_hugetlb(folio)) >>>>> - rc = unmap_and_move_huge_page(get_new_page, >>>>> - put_new_page, private, >>>>> - &folio->page, pass > 2, mode, >>>>> - reason, >>>>> - &ret_folios); >>>>> - else >>>>> - rc = unmap_and_move(get_new_page, put_new_page, >>>>> - private, folio, pass > 2, mode, >>>>> - reason, &ret_folios); >>>>> + rc = unmap_and_move(get_new_page, put_new_page, >>>>> + private, folio, pass > 2, mode, >>>>> + reason, &ret_folios); >>>>> /* >>>>> * The rules are: >>>>> - * Success: non hugetlb folio will be freed, hugetlb >>>>> - * folio will be put back >>>>> + * Success: folio will be freed >>>>> * -EAGAIN: stay on the from list >>>>> * -ENOMEM: stay on the from list >>>>> * -ENOSYS: stay on the from list >>>>> @@ -1512,7 +1597,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>>>> stats.nr_thp_split += is_thp; >>>>> break; >>>>> } >>>>> - /* Hugetlb migration is unsupported */ >>>>> } else if (!no_split_folio_counting) { >>>>> nr_failed++; >>>>> }
Alistair Popple <apopple@nvidia.com> writes:

> "Huang, Ying" <ying.huang@intel.com> writes:
>
> [snip]
>
>> To do that, we need to go through the folio list again to remove all
>> hugetlb pages. It could be time-consuming in some cases. So I think
>> that it's better to keep this.
>
> Why? Couldn't we test pass == 9 and remove it from the list if it fails
> the final retry in migrate_hugetlbs()? In any case if it's on the list
> due to failed retries we have already passed over it 10 times, so the
> extra loop hardly seems like a problem.

Yes, that's possible. But "test pass == 9" looks more tricky than the
current code.

Feel free to change the code as you suggested on top of this series. If
no one else objects, I'm OK with that. OK?

Best Regards,
Huang, Ying
"Huang, Ying" <ying.huang@intel.com> writes: > Alistair Popple <apopple@nvidia.com> writes: > >> "Huang, Ying" <ying.huang@intel.com> writes: >> >>> [snip] >>> >>>> >>>>>>> @@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>>>>>> nr_retry_pages = 0; >>>>>>> >>>>>>> list_for_each_entry_safe(folio, folio2, from, lru) { >>>>>>> + if (folio_test_hugetlb(folio)) { >>>>>> >>>>>> How do we hit this case? Shouldn't migrate_hugetlbs() have already moved >>>>>> any hugetlb folios off the from list? >>>>> >>>>> Retried hugetlb folios will be kept in from list. >>>> >>>> Couldn't migrate_hugetlbs() remove the failing retried pages from the >>>> list on the final pass? That seems cleaner to me. >>> >>> To do that, we need to go through the folio list again to remove all >>> hugetlb pages. It could be time-consuming in some cases. So I think >>> that it's better to keep this. >> >> Why? Couldn't we test pass == 9 and remove it from the list if it fails >> the final retry in migrate_hugetlbs()? In any case if it's on the list >> due to failed retries we have already passed over it 10 times, so the >> extra loop hardly seems like a problem. > > Yes. That's possible. But "test pass == 9" looks more tricky than the > current code. > > Feel free to change the code as you suggested on top this series. If no > others object, I'm OK with that. OK? Sure. Part of my problem when reviewing this series is that everytime I look at migrate_pages(), and in particular the number of conditionals that are sufficiently non-obvious to require extensive comments, I can't help but think it all needs some refactoring before making it any more complicated. However perhaps I am alone in that. Either way this kind of refactoring has been on my TODO list for a while - I have a WIP series to converge some of the migrate_device.c code which I will need to rebase on this anyway so as you suggest I could make a lot of my suggested changes on top of this series. Regards, Alistair > Best Regards, > Huang, Ying > >>> >>>>>>> + list_move_tail(&folio->lru, &ret_folios); >>>>>>> + continue; >>>>>>> + } >>>>>>> + >>>>>>> /* >>>>>>> * Large folio statistics is based on the source large >>>>>>> * folio. Capture required information that might get >>>>>>> * lost during migration. 
>>>>>>> */ >>>>>>> - is_large = folio_test_large(folio) && !folio_test_hugetlb(folio); >>>>>>> + is_large = folio_test_large(folio); >>>>>>> is_thp = is_large && folio_test_pmd_mappable(folio); >>>>>>> nr_pages = folio_nr_pages(folio); >>>>>>> + >>>>>>> cond_resched(); >>>>>>> >>>>>>> - if (folio_test_hugetlb(folio)) >>>>>>> - rc = unmap_and_move_huge_page(get_new_page, >>>>>>> - put_new_page, private, >>>>>>> - &folio->page, pass > 2, mode, >>>>>>> - reason, >>>>>>> - &ret_folios); >>>>>>> - else >>>>>>> - rc = unmap_and_move(get_new_page, put_new_page, >>>>>>> - private, folio, pass > 2, mode, >>>>>>> - reason, &ret_folios); >>>>>>> + rc = unmap_and_move(get_new_page, put_new_page, >>>>>>> + private, folio, pass > 2, mode, >>>>>>> + reason, &ret_folios); >>>>>>> /* >>>>>>> * The rules are: >>>>>>> - * Success: non hugetlb folio will be freed, hugetlb >>>>>>> - * folio will be put back >>>>>>> + * Success: folio will be freed >>>>>>> * -EAGAIN: stay on the from list >>>>>>> * -ENOMEM: stay on the from list >>>>>>> * -ENOSYS: stay on the from list >>>>>>> @@ -1512,7 +1597,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page, >>>>>>> stats.nr_thp_split += is_thp; >>>>>>> break; >>>>>>> } >>>>>>> - /* Hugetlb migration is unsupported */ >>>>>>> } else if (!no_split_folio_counting) { >>>>>>> nr_failed++; >>>>>>> }
diff --git a/mm/migrate.c b/mm/migrate.c
index ec9263a33d38..bdbe73fe2eb7 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1404,6 +1404,87 @@ struct migrate_pages_stats {
 	int nr_thp_split;
 };
 
+static int migrate_hugetlbs(struct list_head *from, new_page_t get_new_page,
+			    free_page_t put_new_page, unsigned long private,
+			    enum migrate_mode mode, int reason,
+			    struct migrate_pages_stats *stats,
+			    struct list_head *ret_folios)
+{
+	int retry = 1;
+	int nr_failed = 0;
+	int nr_retry_pages = 0;
+	int pass = 0;
+	struct folio *folio, *folio2;
+	int rc = 0, nr_pages;
+
+	for (pass = 0; pass < 10 && retry; pass++) {
+		retry = 0;
+		nr_retry_pages = 0;
+
+		list_for_each_entry_safe(folio, folio2, from, lru) {
+			if (!folio_test_hugetlb(folio))
+				continue;
+
+			nr_pages = folio_nr_pages(folio);
+
+			cond_resched();
+
+			rc = unmap_and_move_huge_page(get_new_page,
+						      put_new_page, private,
+						      &folio->page, pass > 2, mode,
+						      reason, ret_folios);
+			/*
+			 * The rules are:
+			 *	Success: hugetlb folio will be put back
+			 *	-EAGAIN: stay on the from list
+			 *	-ENOMEM: stay on the from list
+			 *	-ENOSYS: stay on the from list
+			 *	Other errno: put on ret_folios list
+			 */
+			switch(rc) {
+			case -ENOSYS:
+				/* Hugetlb migration is unsupported */
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				list_move_tail(&folio->lru, ret_folios);
+				break;
+			case -ENOMEM:
+				/*
+				 * When memory is low, don't bother to try to migrate
+				 * other folios, just exit.
+				 */
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				goto out;
+			case -EAGAIN:
+				retry++;
+				nr_retry_pages += nr_pages;
+				break;
+			case MIGRATEPAGE_SUCCESS:
+				stats->nr_succeeded += nr_pages;
+				break;
+			default:
+				/*
+				 * Permanent failure (-EBUSY, etc.):
+				 * unlike -EAGAIN case, the failed folio is
+				 * removed from migration folio list and not
+				 * retried in the next outer loop.
+				 */
+				nr_failed++;
+				stats->nr_failed_pages += nr_pages;
+				break;
+			}
+		}
+	}
+out:
+	nr_failed += retry;
+	stats->nr_failed_pages += nr_retry_pages;
+	if (rc != -ENOMEM)
+		rc = nr_failed;
+
+	return rc;
+}
+
 /*
  * migrate_pages - migrate the folios specified in a list, to the free folios
  * supplied as the target for the page migration
@@ -1437,7 +1518,7 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	int retry = 1;
 	int large_retry = 1;
 	int thp_retry = 1;
-	int nr_failed = 0;
+	int nr_failed;
 	int nr_retry_pages = 0;
 	int nr_large_failed = 0;
 	int pass = 0;
@@ -1454,6 +1535,12 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 	trace_mm_migrate_pages_start(mode, reason);
 
 	memset(&stats, 0, sizeof(stats));
+	rc = migrate_hugetlbs(from, get_new_page, put_new_page, private, mode, reason,
+			      &stats, &ret_folios);
+	if (rc < 0)
+		goto out;
+	nr_failed = rc;
+
 split_folio_migration:
 	for (pass = 0; pass < 10 && (retry || large_retry); pass++) {
 		retry = 0;
@@ -1462,30 +1549,28 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 		nr_retry_pages = 0;
 
 		list_for_each_entry_safe(folio, folio2, from, lru) {
+			if (folio_test_hugetlb(folio)) {
+				list_move_tail(&folio->lru, &ret_folios);
+				continue;
+			}
+
 			/*
 			 * Large folio statistics is based on the source large
 			 * folio. Capture required information that might get
 			 * lost during migration.
 			 */
-			is_large = folio_test_large(folio) && !folio_test_hugetlb(folio);
+			is_large = folio_test_large(folio);
 			is_thp = is_large && folio_test_pmd_mappable(folio);
 			nr_pages = folio_nr_pages(folio);
+
 			cond_resched();
 
-			if (folio_test_hugetlb(folio))
-				rc = unmap_and_move_huge_page(get_new_page,
-						put_new_page, private,
-						&folio->page, pass > 2, mode,
-						reason,
-						&ret_folios);
-			else
-				rc = unmap_and_move(get_new_page, put_new_page,
-						private, folio, pass > 2, mode,
-						reason, &ret_folios);
+			rc = unmap_and_move(get_new_page, put_new_page,
+					    private, folio, pass > 2, mode,
+					    reason, &ret_folios);
 			/*
 			 * The rules are:
-			 *	Success: non hugetlb folio will be freed, hugetlb
-			 *	folio will be put back
+			 *	Success: folio will be freed
 			 *	-EAGAIN: stay on the from list
 			 *	-ENOMEM: stay on the from list
 			 *	-ENOSYS: stay on the from list
@@ -1512,7 +1597,6 @@ int migrate_pages(struct list_head *from, new_page_t get_new_page,
 					stats.nr_thp_split += is_thp;
 					break;
 				}
-				/* Hugetlb migration is unsupported */
 			} else if (!no_split_folio_counting) {
 				nr_failed++;
 			}