Message ID | 20230802095346.87449-2-wangkefeng.wang@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp347452vqx; Wed, 2 Aug 2023 03:14:28 -0700 (PDT) X-Google-Smtp-Source: APBJJlGWbpjY+EjhrA5ls/MWV2MQmUsToPcLlFZQkBUeIt5G9af18+4nbx0kLonK/Pgvv0VzuKfg X-Received: by 2002:a17:906:518a:b0:99b:fdbb:31f1 with SMTP id y10-20020a170906518a00b0099bfdbb31f1mr4858386ejk.16.1690971268623; Wed, 02 Aug 2023 03:14:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690971268; cv=none; d=google.com; s=arc-20160816; b=u7kQUcsZt/xFdklpeBwIBARtjeeT8Xs7M4RDZ5p8926MOqJwPxcS7s4y5PKcBitY/H 8KIaaGq9fV1vjKMb4I/1GUvuKJOvc6wT3+0nnaVE3zgQEZirNl/Y7Pa5FwMj+nTkRyDa zGKVUeL7rnzyKOi42hcpp+dI73+uBOdEEi1LRtiNavuBXpcUGiPyVGOM7BUcwdk62ZKD f1x6hyUElKLMoCUIPbIiP5NXV6Ny/oDv1DZ+KO1qjpaxjoiuKt4QiD7XP3sUqZf7+VlB LCB7J0VuvbgNyjwDWDwZDFFki0G8qBU4BNnybzVaWxF9Sw/z+SQshssXT5kvPHfH6evX v3Dw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=sKA63lWYPN2MF+P3BEUtg/sR22zgj0WfA+RU7kxoC38=; fh=frTJ5b0OruHscvMLDfKfibXYJSJ4/6aFlw/4ZMUelj0=; b=CvA/hyHxFxZ93BT/ZdeV9ABOBXiOo5X4yD0NZrFgwiL0oB1nXqUJow7vPHEINMMq1a cJSKk1ArH6iTl0nhuYevH4pODzQejTTmD9xiqg8XJp/Es20kQ1cwRSBxCw8tN6gylZaA p6rGBTx2rpDF2oTOQWgUqRvEZeQ1Pjdb4CjIXOciEG3vlarynm7DtE9cG3QtDkXsjwZh EFiac8xbal//hZbCM5U+ySHupp1bGeVzFEFky77jSBEToZrYEPiVxGdOfwBE04C4mr7m +Rp6RChF8hxrWIeCwwc06V0kN/E3Xdg1Ntpn5JjRCm7txpvgeFMaPYKWBWjeVoCx/Vhr z7rA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f16-20020a1709067f9000b009893fe84e74si10427757ejr.484.2023.08.02.03.14.04; Wed, 02 Aug 2023 03:14:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233857AbjHBJpf (ORCPT <rfc822;maxi.paulin@gmail.com> + 99 others); Wed, 2 Aug 2023 05:45:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230494AbjHBJpc (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 2 Aug 2023 05:45:32 -0400 Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 902512D53 for <linux-kernel@vger.kernel.org>; Wed, 2 Aug 2023 02:45:21 -0700 (PDT) Received: from dggpemm100001.china.huawei.com (unknown [172.30.72.54]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4RG6TT1cQ1zLnxK; Wed, 2 Aug 2023 17:42:37 +0800 (CST) Received: from localhost.localdomain.localdomain (10.175.113.25) by dggpemm100001.china.huawei.com (7.185.36.93) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Wed, 2 Aug 2023 17:45:19 +0800 From: Kefeng Wang <wangkefeng.wang@huawei.com> To: Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org> CC: Huang Ying <ying.huang@intel.com>, David Hildenbrand <david@redhat.com>, Matthew Wilcox <willy@infradead.org>, Kefeng Wang <wangkefeng.wang@huawei.com> Subject: [PATCH 1/4] mm: migrate: use a folio in add_page_for_migration() Date: Wed, 2 Aug 2023 17:53:43 +0800 Message-ID: <20230802095346.87449-2-wangkefeng.wang@huawei.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230802095346.87449-1-wangkefeng.wang@huawei.com> References: <20230802095346.87449-1-wangkefeng.wang@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.175.113.25] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpemm100001.china.huawei.com (7.185.36.93) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773111888484406168 X-GMAIL-MSGID: 1773111888484406168 |
Series |
mm: migrate: more folio conversion
|
|
Commit Message
Kefeng Wang
Aug. 2, 2023, 9:53 a.m. UTC
Use a folio in add_page_for_migration() to save compound_head() calls.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/migrate.c | 44 +++++++++++++++++++++-----------------------
1 file changed, 21 insertions(+), 23 deletions(-)
Comments
On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: > err = -EACCES; > - if (page_mapcount(page) > 1 && !migrate_all) > - goto out_putpage; > + if (folio_estimated_sharers(folio) > 1 && !migrate_all) > + goto out_putfolio; I do not think this is the correct change. Maybe leave this line alone. > - if (PageHuge(page)) { > - if (PageHead(page)) { > - isolated = isolate_hugetlb(page_folio(page), pagelist); > + if (folio_test_hugetlb(folio)) { > + if (folio_test_large(folio)) { This makes no sense when you read it. All hugetlb folios are large, by definition. Think about what this code used to do, and what it should be changed to. > + isolated = isolate_hugetlb(folio, pagelist); > err = isolated ? 1 : -EBUSY; > } > } else { > - struct page *head; > - > - head = compound_head(page); > - isolated = isolate_lru_page(head); > + isolated = folio_isolate_lru(folio); > if (!isolated) { > err = -EBUSY; > - goto out_putpage; > + goto out_putfolio; > } > > err = 1; > - list_add_tail(&head->lru, pagelist); > - mod_node_page_state(page_pgdat(head), > - NR_ISOLATED_ANON + page_is_file_lru(head), > - thp_nr_pages(head)); > + list_add_tail(&folio->lru, pagelist); > + node_stat_mod_folio(folio, > + NR_ISOLATED_ANON + folio_is_file_lru(folio), > + folio_nr_pages(folio)); > }
On 2023/8/2 20:21, Matthew Wilcox wrote: > On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: >> err = -EACCES; >> - if (page_mapcount(page) > 1 && !migrate_all) >> - goto out_putpage; >> + if (folio_estimated_sharers(folio) > 1 && !migrate_all) >> + goto out_putfolio; > > I do not think this is the correct change. Maybe leave this line > alone. Ok, I am aware of the discussion about this in other mail, will not change it(also the next two patch about this function), or wait the new work of David. > >> - if (PageHuge(page)) { >> - if (PageHead(page)) { >> - isolated = isolate_hugetlb(page_folio(page), pagelist); >> + if (folio_test_hugetlb(folio)) { >> + if (folio_test_large(folio)) { > > This makes no sense when you read it. All hugetlb folios are large, > by definition. Think about what this code used to do, and what it > should be changed to. hugetlb folio is self large folio, will drop redundant check
On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: > > > On 2023/8/2 20:21, Matthew Wilcox wrote: > > On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: > > > err = -EACCES; > > > - if (page_mapcount(page) > 1 && !migrate_all) > > > - goto out_putpage; > > > + if (folio_estimated_sharers(folio) > 1 && !migrate_all) > > > + goto out_putfolio; > > > > I do not think this is the correct change. Maybe leave this line > > alone. > > Ok, I am aware of the discussion about this in other mail, will not > change it(also the next two patch about this function), or wait the > new work of David. > > > > > - if (PageHuge(page)) { > > > - if (PageHead(page)) { > > > - isolated = isolate_hugetlb(page_folio(page), pagelist); > > > + if (folio_test_hugetlb(folio)) { > > > + if (folio_test_large(folio)) { > > > > This makes no sense when you read it. All hugetlb folios are large, > > by definition. Think about what this code used to do, and what it > > should be changed to. > > hugetlb folio is self large folio, will drop redundant check No, that's not the difference. Keep thinking about it. This is not a mechanical translation!
On 2023/8/3 20:30, Matthew Wilcox wrote: > On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >> >> >> On 2023/8/2 20:21, Matthew Wilcox wrote: >>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: >>>> err = -EACCES; >>>> - if (page_mapcount(page) > 1 && !migrate_all) >>>> - goto out_putpage; >>>> + if (folio_estimated_sharers(folio) > 1 && !migrate_all) >>>> + goto out_putfolio; >>> >>> I do not think this is the correct change. Maybe leave this line >>> alone. >> >> Ok, I am aware of the discussion about this in other mail, will not >> change it(also the next two patch about this function), or wait the >> new work of David. >>> >>>> - if (PageHuge(page)) { >>>> - if (PageHead(page)) { >>>> - isolated = isolate_hugetlb(page_folio(page), pagelist); >>>> + if (folio_test_hugetlb(folio)) { >>>> + if (folio_test_large(folio)) { >>> >>> This makes no sense when you read it. All hugetlb folios are large, >>> by definition. Think about what this code used to do, and what it >>> should be changed to. >> >> hugetlb folio is self large folio, will drop redundant check > > No, that's not the difference. Keep thinking about it. This is not > a mechanical translation! if (PageHuge(page)) // page must be a hugetlb page if (PageHead(page)) // page must be a head page, not tail isolate_hugetlb() // isolate the hugetlb page if head After using folio, if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not I don't check the page is head or not, since the follow_page could return a sub-page, so the check PageHead need be retained, right?
On 3 Aug 2023, at 21:45, Kefeng Wang wrote: > On 2023/8/3 20:30, Matthew Wilcox wrote: >> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >>> >>> >>> On 2023/8/2 20:21, Matthew Wilcox wrote: >>>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: >>>>> err = -EACCES; >>>>> - if (page_mapcount(page) > 1 && !migrate_all) >>>>> - goto out_putpage; >>>>> + if (folio_estimated_sharers(folio) > 1 && !migrate_all) >>>>> + goto out_putfolio; >>>> >>>> I do not think this is the correct change. Maybe leave this line >>>> alone. >>> >>> Ok, I am aware of the discussion about this in other mail, will not >>> change it(also the next two patch about this function), or wait the >>> new work of David. >>>> >>>>> - if (PageHuge(page)) { >>>>> - if (PageHead(page)) { >>>>> - isolated = isolate_hugetlb(page_folio(page), pagelist); >>>>> + if (folio_test_hugetlb(folio)) { >>>>> + if (folio_test_large(folio)) { >>>> >>>> This makes no sense when you read it. All hugetlb folios are large, >>>> by definition. Think about what this code used to do, and what it >>>> should be changed to. >>> >>> hugetlb folio is self large folio, will drop redundant check >> >> No, that's not the difference. Keep thinking about it. This is not >> a mechanical translation! > > > if (PageHuge(page)) // page must be a hugetlb page > if (PageHead(page)) // page must be a head page, not tail > isolate_hugetlb() // isolate the hugetlb page if head > > After using folio, > > if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not > > I don't check the page is head or not, since the follow_page could > return a sub-page, so the check PageHead need be retained, right? Right. It will prevent the kernel from trying to isolate the same hugetlb page twice when two pages are in the same hugetlb folio. But looking at the code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb() would return false, no error would show up. But it changes err value from -EACCES to -EBUSY and user will see a different page status than before. I wonder why we do not have follow_folio() and returns -ENOENT error pointer when addr points to a non head page. It would make this patch more folio if follow_folio() can be used in place of follow_page(). One caveat is that user will see -ENOENT instead of -EACCES after this change. -- Best Regards, Yan, Zi
On 2023/8/4 10:42, Zi Yan wrote: > On 3 Aug 2023, at 21:45, Kefeng Wang wrote: > >> On 2023/8/3 20:30, Matthew Wilcox wrote: >>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >>>> >>>> >>>> On 2023/8/2 20:21, Matthew Wilcox wrote: >>>>> On Wed, Aug 02, 2023 at 05:53:43PM +0800, Kefeng Wang wrote: >>>>>> err = -EACCES; >>>>>> - if (page_mapcount(page) > 1 && !migrate_all) >>>>>> - goto out_putpage; >>>>>> + if (folio_estimated_sharers(folio) > 1 && !migrate_all) >>>>>> + goto out_putfolio; >>>>> >>>>> I do not think this is the correct change. Maybe leave this line >>>>> alone. >>>> >>>> Ok, I am aware of the discussion about this in other mail, will not >>>> change it(also the next two patch about this function), or wait the >>>> new work of David. >>>>> >>>>>> - if (PageHuge(page)) { >>>>>> - if (PageHead(page)) { >>>>>> - isolated = isolate_hugetlb(page_folio(page), pagelist); >>>>>> + if (folio_test_hugetlb(folio)) { >>>>>> + if (folio_test_large(folio)) { >>>>> >>>>> This makes no sense when you read it. All hugetlb folios are large, >>>>> by definition. Think about what this code used to do, and what it >>>>> should be changed to. >>>> >>>> hugetlb folio is self large folio, will drop redundant check >>> >>> No, that's not the difference. Keep thinking about it. This is not >>> a mechanical translation! >> >> >> if (PageHuge(page)) // page must be a hugetlb page >> if (PageHead(page)) // page must be a head page, not tail >> isolate_hugetlb() // isolate the hugetlb page if head >> >> After using folio, >> >> if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not >> >> I don't check the page is head or not, since the follow_page could >> return a sub-page, so the check PageHead need be retained, right? > > Right. It will prevent the kernel from trying to isolate the same hugetlb page > twice when two pages are in the same hugetlb folio. But looking at the > code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb() > would return false, no error would show up. But it changes err value > from -EACCES to -EBUSY and user will see a different page status than before. When check man[1], the current -EACCES is not right, -EBUSY is not precise but more suitable for this scenario, -EACCES The page is mapped by multiple processes and can be moved only if MPOL_MF_MOVE_ALL is specified. -EBUSY The page is currently busy and cannot be moved. Try again later. This occurs if a page is undergoing I/O or another kernel subsystem is holding a reference to the page. -ENOENT The page is not present. > > I wonder why we do not have follow_folio() and returns -ENOENT error pointer > when addr points to a non head page. It would make this patch more folio if > follow_folio() can be used in place of follow_page(). One caveat is that > user will see -ENOENT instead of -EACCES after this change. > -ENOENT is ok, but maybe the man need to be updated too. [1] https://man7.org/linux/man-pages/man2/move_pages.2.html > > -- > Best Regards, > Yan, Zi
Hi Zi Yan and Matthew and Naoya, On 2023/8/4 13:54, Kefeng Wang wrote: > > > On 2023/8/4 10:42, Zi Yan wrote: >> On 3 Aug 2023, at 21:45, Kefeng Wang wrote: >> >>> On 2023/8/3 20:30, Matthew Wilcox wrote: >>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >>>>> ... >>> >>> >>> if (PageHuge(page)) // page must be a hugetlb page >>> if (PageHead(page)) // page must be a head page, not tail >>> isolate_hugetlb() // isolate the hugetlb page if head >>> >>> After using folio, >>> >>> if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not >>> >>> I don't check the page is head or not, since the follow_page could >>> return a sub-page, so the check PageHead need be retained, right? >> >> Right. It will prevent the kernel from trying to isolate the same >> hugetlb page >> twice when two pages are in the same hugetlb folio. But looking at the >> code, if you try to isolate an already-isolated hugetlb folio, >> isolate_hugetlb() >> would return false, no error would show up. But it changes err value >> from -EACCES to -EBUSY and user will see a different page status than >> before. > Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") in v4.0, follow_page() will return NULL on tail page for Huagetlb page, and move_pages() will return -ENOENT errno,but after that commit, -EACCES is returned, which not match the manual, > > When check man[1], the current -EACCES is not right, -EBUSY is not > precise but more suitable for this scenario, > > -EACCES > The page is mapped by multiple processes and can be moved > only if MPOL_MF_MOVE_ALL is specified. > > -EBUSY The page is currently busy and cannot be moved. Try again > later. This occurs if a page is undergoing I/O or another > kernel subsystem is holding a reference to the page. > -ENOENT > The page is not present. > >> >> I wonder why we do not have follow_folio() and returns -ENOENT error >> pointer >> when addr points to a non head page. It would make this patch more >> folio if >> follow_folio() can be used in place of follow_page(). One caveat is that >> user will see -ENOENT instead of -EACCES after this change. >> > > -ENOENT is ok, but maybe the man need to be updated too. According to above analysis, -ENOENT is suitable when introduce the follow_folio(), but when THP migrate support is introduced by e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in v4.14, the tail page will be turned into head page and return -EBUSY, So should we unify errno(maybe use -ENOENT) about the tail page? > > > > [1] https://man7.org/linux/man-pages/man2/move_pages.2.html
On 7 Aug 2023, at 8:20, Kefeng Wang wrote: > Hi Zi Yan and Matthew and Naoya, > > On 2023/8/4 13:54, Kefeng Wang wrote: >> >> >> On 2023/8/4 10:42, Zi Yan wrote: >>> On 3 Aug 2023, at 21:45, Kefeng Wang wrote: >>> >>>> On 2023/8/3 20:30, Matthew Wilcox wrote: >>>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >>>>>> > > ... > >>>> >>>> >>>> if (PageHuge(page)) // page must be a hugetlb page >>>> if (PageHead(page)) // page must be a head page, not tail >>>> isolate_hugetlb() // isolate the hugetlb page if head >>>> >>>> After using folio, >>>> >>>> if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not >>>> >>>> I don't check the page is head or not, since the follow_page could >>>> return a sub-page, so the check PageHead need be retained, right? >>> >>> Right. It will prevent the kernel from trying to isolate the same hugetlb page >>> twice when two pages are in the same hugetlb folio. But looking at the >>> code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb() >>> would return false, no error would show up. But it changes err value >>> from -EACCES to -EBUSY and user will see a different page status than before. >> > > Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") > in v4.0, follow_page() will return NULL on tail page for Huagetlb page, > and move_pages() will return -ENOENT errno,but after that commit, > -EACCES is returned, which not match the manual, > >> >> When check man[1], the current -EACCES is not right, -EBUSY is not >> precise but more suitable for this scenario, >> >> -EACCES >> The page is mapped by multiple processes and can be moved >> only if MPOL_MF_MOVE_ALL is specified. >> >> -EBUSY The page is currently busy and cannot be moved. Try again >> later. This occurs if a page is undergoing I/O or another >> kernel subsystem is holding a reference to the page. >> -ENOENT >> The page is not present. >> >>> >>> I wonder why we do not have follow_folio() and returns -ENOENT error pointer >>> when addr points to a non head page. It would make this patch more folio if >>> follow_folio() can be used in place of follow_page(). One caveat is that >>> user will see -ENOENT instead of -EACCES after this change. >>> >> >> -ENOENT is ok, but maybe the man need to be updated too. > > According to above analysis, -ENOENT is suitable when introduce the > follow_folio(), but when THP migrate support is introduced by > e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in > v4.14, the tail page will be turned into head page and return -EBUSY, > > So should we unify errno(maybe use -ENOENT) about the tail page? > > >> >> >> >> [1] https://man7.org/linux/man-pages/man2/move_pages.2.html I think so. I think -EBUSY is more reasonable for tail pages. But there is some subtle difference between THP and hugetlb from current code: For THP, compound_head() is used to get the head page for isolation, this means if user specifies a tail page address in move_pages(), the whole THP can be migrated. For hugetlb, only if user specifies the head page address of a hugetlb page, the hugetlb page will be migrated. Otherwise, an error would show up. Cc Mike to help us clarify the expected behavior of hugetlb. Hi Mike, what is the expected behavior, if a user tries to use move_pages() to migrate a non head page of a hugetlb page? -- Best Regards, Yan, Zi
Hi Mike On 2023/8/8 2:45, Zi Yan wrote: > On 7 Aug 2023, at 8:20, Kefeng Wang wrote: > >> Hi Zi Yan and Matthew and Naoya, >> >> On 2023/8/4 13:54, Kefeng Wang wrote: >>> >>> >>> On 2023/8/4 10:42, Zi Yan wrote: >>>> On 3 Aug 2023, at 21:45, Kefeng Wang wrote: >>>> >>>>> On 2023/8/3 20:30, Matthew Wilcox wrote: >>>>>> On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: >>>>>>> >> >> ... >> >>>>> >>>>> >>>>> if (PageHuge(page)) // page must be a hugetlb page >>>>> if (PageHead(page)) // page must be a head page, not tail >>>>> isolate_hugetlb() // isolate the hugetlb page if head >>>>> >>>>> After using folio, >>>>> >>>>> if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not >>>>> >>>>> I don't check the page is head or not, since the follow_page could >>>>> return a sub-page, so the check PageHead need be retained, right? >>>> >>>> Right. It will prevent the kernel from trying to isolate the same hugetlb page >>>> twice when two pages are in the same hugetlb folio. But looking at the >>>> code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb() >>>> would return false, no error would show up. But it changes err value >>>> from -EACCES to -EBUSY and user will see a different page status than before. >>> >> >> Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") >> in v4.0, follow_page() will return NULL on tail page for Huagetlb page, >> and move_pages() will return -ENOENT errno,but after that commit, >> -EACCES is returned, which not match the manual, >> >>> >>> When check man[1], the current -EACCES is not right, -EBUSY is not >>> precise but more suitable for this scenario, >>> >>> -EACCES >>> The page is mapped by multiple processes and can be moved >>> only if MPOL_MF_MOVE_ALL is specified. >>> >>> -EBUSY The page is currently busy and cannot be moved. Try again >>> later. This occurs if a page is undergoing I/O or another >>> kernel subsystem is holding a reference to the page. >>> -ENOENT >>> The page is not present. >>> >>>> >>>> I wonder why we do not have follow_folio() and returns -ENOENT error pointer >>>> when addr points to a non head page. It would make this patch more folio if >>>> follow_folio() can be used in place of follow_page(). One caveat is that >>>> user will see -ENOENT instead of -EACCES after this change. >>>> >>> >>> -ENOENT is ok, but maybe the man need to be updated too. >> >> According to above analysis, -ENOENT is suitable when introduce the >> follow_folio(), but when THP migrate support is introduced by >> e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in >> v4.14, the tail page will be turned into head page and return -EBUSY, >> >> So should we unify errno(maybe use -ENOENT) about the tail page? >> >> >>> >>> >>> >>> [1] https://man7.org/linux/man-pages/man2/move_pages.2.html > > I think so. I think -EBUSY is more reasonable for tail pages. But there is > some subtle difference between THP and hugetlb from current code: > > For THP, compound_head() is used to get the head page for isolation, this means > if user specifies a tail page address in move_pages(), the whole THP can be > migrated. > > For hugetlb, only if user specifies the head page address of a hugetlb page, > the hugetlb page will be migrated. Otherwise, an error would show up. > > Cc Mike to help us clarify the expected behavior of hugetlb. > > Hi Mike, what is the expected behavior, if a user tries to use move_pages() > to migrate a non head page of a hugetlb page? Could you give some advise, thanks > > -- > Best Regards, > Yan, Zi
On 08/09/23 20:37, Kefeng Wang wrote: > Hi Mike > > On 2023/8/8 2:45, Zi Yan wrote: > > On 7 Aug 2023, at 8:20, Kefeng Wang wrote: > > > > > Hi Zi Yan and Matthew and Naoya, > > > > > > On 2023/8/4 13:54, Kefeng Wang wrote: > > > > > > > > > > > > On 2023/8/4 10:42, Zi Yan wrote: > > > > > On 3 Aug 2023, at 21:45, Kefeng Wang wrote: > > > > > > > > > > > On 2023/8/3 20:30, Matthew Wilcox wrote: > > > > > > > On Thu, Aug 03, 2023 at 03:13:21PM +0800, Kefeng Wang wrote: > > > > > > > > > > > > > > ... > > > > > > > > > > > > > > > > > > > > > if (PageHuge(page)) // page must be a hugetlb page > > > > > > if (PageHead(page)) // page must be a head page, not tail > > > > > > isolate_hugetlb() // isolate the hugetlb page if head > > > > > > > > > > > > After using folio, > > > > > > > > > > > > if (folio_test_hugetlb(folio)) // only check folio is hugetlb or not > > > > > > > > > > > > I don't check the page is head or not, since the follow_page could > > > > > > return a sub-page, so the check PageHead need be retained, right? > > > > > > > > > > Right. It will prevent the kernel from trying to isolate the same hugetlb page > > > > > twice when two pages are in the same hugetlb folio. But looking at the > > > > > code, if you try to isolate an already-isolated hugetlb folio, isolate_hugetlb() > > > > > would return false, no error would show up. But it changes err value > > > > > from -EACCES to -EBUSY and user will see a different page status than before. > > > > > > > > > > Before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") > > > in v4.0, follow_page() will return NULL on tail page for Huagetlb page, > > > and move_pages() will return -ENOENT errno,but after that commit, > > > -EACCES is returned, which not match the manual, > > > > > > > > > > > When check man[1], the current -EACCES is not right, -EBUSY is not > > > > precise but more suitable for this scenario, > > > > > > > > -EACCES > > > > The page is mapped by multiple processes and can be moved > > > > only if MPOL_MF_MOVE_ALL is specified. > > > > > > > > -EBUSY The page is currently busy and cannot be moved. Try again > > > > later. This occurs if a page is undergoing I/O or another > > > > kernel subsystem is holding a reference to the page. > > > > -ENOENT > > > > The page is not present. > > > > > > > > > > > > > > I wonder why we do not have follow_folio() and returns -ENOENT error pointer > > > > > when addr points to a non head page. It would make this patch more folio if > > > > > follow_folio() can be used in place of follow_page(). One caveat is that > > > > > user will see -ENOENT instead of -EACCES after this change. > > > > > > > > > > > > > -ENOENT is ok, but maybe the man need to be updated too. > > > > > > According to above analysis, -ENOENT is suitable when introduce the > > > follow_folio(), but when THP migrate support is introduced by > > > e8db67eb0ded ("mm: migrate: move_pages() supports thp migration") in > > > v4.14, the tail page will be turned into head page and return -EBUSY, > > > > > > So should we unify errno(maybe use -ENOENT) about the tail page? > > > > > > > > > > > > > > > > > > > > > > [1] https://man7.org/linux/man-pages/man2/move_pages.2.html > > > > I think so. I think -EBUSY is more reasonable for tail pages. But there is > > some subtle difference between THP and hugetlb from current code: > > > > For THP, compound_head() is used to get the head page for isolation, this means > > if user specifies a tail page address in move_pages(), the whole THP can be > > migrated. > > > > For hugetlb, only if user specifies the head page address of a hugetlb page, > > the hugetlb page will be migrated. Otherwise, an error would show up. > > > > Cc Mike to help us clarify the expected behavior of hugetlb. > > > > Hi Mike, what is the expected behavior, if a user tries to use move_pages() > > to migrate a non head page of a hugetlb page? > > Could you give some advise, thanks > Sorry, I was away for a while. It seems unfortunate that move_pages says the passed user addresses should be aligned to page boundaries. However, IIUC this is not checked or enforced. Otherwise, passing a hugetlb page should return the same error. One thought would be that hugetlb mappings should behave the same non-hugetlb mappings. If passed the address of a hugetlb tail page, align the address to a hugetlb boundary and migrate the page. This changes the existing behavior. However, it would be hard to imagine anyone depending on this. After taking a closer look at the add_page_for_migration(), it seems to just ignore passed tail pages and do nothing for such passed addresses. Correct? Or, am I missing something? Perhaps that is behavior we want/ need to preserve?
On 08/09/23 13:53, Mike Kravetz wrote: > On 08/09/23 20:37, Kefeng Wang wrote: > > > > > > Cc Mike to help us clarify the expected behavior of hugetlb. > > > > > > Hi Mike, what is the expected behavior, if a user tries to use move_pages() > > > to migrate a non head page of a hugetlb page? > > > > Could you give some advise, thanks > > > > Sorry, I was away for a while. > > It seems unfortunate that move_pages says the passed user addresses > should be aligned to page boundaries. However, IIUC this is not checked > or enforced. Otherwise, passing a hugetlb page should return the same > error. > > One thought would be that hugetlb mappings should behave the same > non-hugetlb mappings. If passed the address of a hugetlb tail page, align > the address to a hugetlb boundary and migrate the page. This changes the > existing behavior. However, it would be hard to imagine anyone depending > on this. > > After taking a closer look at the add_page_for_migration(), it seems to > just ignore passed tail pages and do nothing for such passed addresses. > Correct? Or, am I missing something? Perhaps that is behavior we want/ > need to preserve? My mistake, status -EACCES is returned when passing a tail page of a hugetlb page. Back to the question of 'What is the expected behavior if a tail page is passed?'. I do not think we have defined an expected behavior. If anything is 'expected' I would say it is -EACCES as returned today. BTW - hugetlb pages not migrated due to passing a tail page does not seem to contribute to a 'Positive return value' indicating the number of non-migrated pages.
On 2023/8/10 6:44, Mike Kravetz wrote: > On 08/09/23 13:53, Mike Kravetz wrote: >> On 08/09/23 20:37, Kefeng Wang wrote: >>>> >>>> Cc Mike to help us clarify the expected behavior of hugetlb. >>>> >>>> Hi Mike, what is the expected behavior, if a user tries to use move_pages() >>>> to migrate a non head page of a hugetlb page? >>> >>> Could you give some advise, thanks >>> >> >> Sorry, I was away for a while. >> >> It seems unfortunate that move_pages says the passed user addresses >> should be aligned to page boundaries. However, IIUC this is not checked >> or enforced. Otherwise, passing a hugetlb page should return the same >> error. >> >> One thought would be that hugetlb mappings should behave the same >> non-hugetlb mappings. If passed the address of a hugetlb tail page, align >> the address to a hugetlb boundary and migrate the page. This changes the >> existing behavior. However, it would be hard to imagine anyone depending >> on this. >> >> After taking a closer look at the add_page_for_migration(), it seems to >> just ignore passed tail pages and do nothing for such passed addresses. >> Correct? Or, am I missing something? Perhaps that is behavior we want/ >> need to preserve? > > My mistake, status -EACCES is returned when passing a tail page of a > hugetlb page. > As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") in v4.0, follow_page() will return NULL on tail page for Huagetlb page, so move_pages() will return -ENOENT errno, but after that commit, -EACCES is returned. Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will be migrated on a tail page, but HUGETLB will return -EACCES(after v4.0) or -ENOENT(before v4.0) on tail page. > Back to the question of 'What is the expected behavior if a tail page is > passed?'. I do not think we have defined an expected behavior. If > anything is 'expected' I would say it is -EACCES as returned today. > My question is, Should we keep seem behavior between HUGETLB and THP, or only change the errno from -EACCES to -ENOENT/-EBUSY. I would like to drop PageHead() check for Hugetlb to keep seem behavior, which will keep seem error code if isolate fail or success on head/tail page. Thanks. > BTW - hugetlb pages not migrated due to passing a tail page does not > seem to contribute to a 'Positive return value' indicating the number of > non-migrated pages.
On 08/10/23 09:49, Kefeng Wang wrote: > > > On 2023/8/10 6:44, Mike Kravetz wrote: > > On 08/09/23 13:53, Mike Kravetz wrote: > > > On 08/09/23 20:37, Kefeng Wang wrote: > > > > > > > > > > Cc Mike to help us clarify the expected behavior of hugetlb. > > > > > > > > > > Hi Mike, what is the expected behavior, if a user tries to use move_pages() > > > > > to migrate a non head page of a hugetlb page? > > > > > > > > Could you give some advise, thanks > > > > > > > > > > Sorry, I was away for a while. > > > > > > It seems unfortunate that move_pages says the passed user addresses > > > should be aligned to page boundaries. However, IIUC this is not checked > > > or enforced. Otherwise, passing a hugetlb page should return the same > > > error. > > > > > > One thought would be that hugetlb mappings should behave the same > > > non-hugetlb mappings. If passed the address of a hugetlb tail page, align > > > the address to a hugetlb boundary and migrate the page. This changes the > > > existing behavior. However, it would be hard to imagine anyone depending > > > on this. > > > > > > After taking a closer look at the add_page_for_migration(), it seems to > > > just ignore passed tail pages and do nothing for such passed addresses. > > > Correct? Or, am I missing something? Perhaps that is behavior we want/ > > > need to preserve? > > > > My mistake, status -EACCES is returned when passing a tail page of a > > hugetlb page. > > > > As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take > page table lock in follow_huge_pmd()") in v4.0, follow_page() will > return NULL on tail page for Huagetlb page, so move_pages() will return > -ENOENT errno, but after that commit, -EACCES is returned. > > Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will be > migrated on a tail page, but HUGETLB will return -EACCES(after v4.0) > or -ENOENT(before v4.0) on tail page. > > > Back to the question of 'What is the expected behavior if a tail page is > > passed?'. I do not think we have defined an expected behavior. If > > anything is 'expected' I would say it is -EACCES as returned today. > > > > My question is, > > Should we keep seem behavior between HUGETLB and THP, or only change the > errno from -EACCES to -ENOENT/-EBUSY. Just to be clear. When you say "keep seem behavior between HUGETLB and THP", are you saying that you would like hugetlb to perform migration of the entire hugetlb page if a tail page is passed? IMO, this would be ideal as it would mean that hugetlb and THP behave the same when passed the address of a tail page. The fewer places where hugetlb behavior diverges, the better. However, this does change behavior. As mentioned above, I have a hard time imagining someone depending on the behavior that passing the address of a hugetlb tail page returns error. But, this is almost impossible to predict. Thoughts from others?
On 2023/8/16 5:12, Mike Kravetz wrote: > On 08/15/23 11:58, Huang, Ying wrote: >> Mike Kravetz <mike.kravetz@oracle.com> writes: >> >>> On 08/10/23 09:49, Kefeng Wang wrote: >>>> >>>> >>>> On 2023/8/10 6:44, Mike Kravetz wrote: >>>>> On 08/09/23 13:53, Mike Kravetz wrote: >>>>>> On 08/09/23 20:37, Kefeng Wang wrote: >>>>>>>> >>>>>>>> Cc Mike to help us clarify the expected behavior of hugetlb. >>>>>>>> >>>>>>>> Hi Mike, what is the expected behavior, if a user tries to use move_pages() >>>>>>>> to migrate a non head page of a hugetlb page? >>>>>>> >>>>>>> Could you give some advise, thanks >>>>>>> >>>>>> >>>>>> Sorry, I was away for a while. >>>>>> >>>>>> It seems unfortunate that move_pages says the passed user addresses >>>>>> should be aligned to page boundaries. However, IIUC this is not checked >>>>>> or enforced. Otherwise, passing a hugetlb page should return the same >>>>>> error. >>>>>> >>>>>> One thought would be that hugetlb mappings should behave the same >>>>>> non-hugetlb mappings. If passed the address of a hugetlb tail page, align >>>>>> the address to a hugetlb boundary and migrate the page. This changes the >>>>>> existing behavior. However, it would be hard to imagine anyone depending >>>>>> on this. >>>>>> >>>>>> After taking a closer look at the add_page_for_migration(), it seems to >>>>>> just ignore passed tail pages and do nothing for such passed addresses. >>>>>> Correct? Or, am I missing something? Perhaps that is behavior we want/ >>>>>> need to preserve? >>>>> >>>>> My mistake, status -EACCES is returned when passing a tail page of a >>>>> hugetlb page. >>>>> >>>> >>>> As mentioned in previous mail, before e66f17ff7177 ("mm/hugetlb: take >>>> page table lock in follow_huge_pmd()") in v4.0, follow_page() will >>>> return NULL on tail page for Huagetlb page, so move_pages() will return >>>> -ENOENT errno, but after that commit, -EACCES is returned. >>>> >>>> Meanwhile, the behavior of THP/HUGETLB is different, the whole THP will be >>>> migrated on a tail page, but HUGETLB will return -EACCES(after v4.0) >>>> or -ENOENT(before v4.0) on tail page. >>>> >>>>> Back to the question of 'What is the expected behavior if a tail page is >>>>> passed?'. I do not think we have defined an expected behavior. If >>>>> anything is 'expected' I would say it is -EACCES as returned today. >>>>> >>>> >>>> My question is, >>>> >>>> Should we keep seem behavior between HUGETLB and THP, or only change the >>>> errno from -EACCES to -ENOENT/-EBUSY. >>> >>> Just to be clear. When you say "keep seem behavior between HUGETLB and THP", >>> are you saying that you would like hugetlb to perform migration of the entire >>> hugetlb page if a tail page is passed? >>> >>> IMO, this would be ideal as it would mean that hugetlb and THP behave the same >>> when passed the address of a tail page. The fewer places where hugetlb >>> behavior diverges, the better. However, this does change behavior. >> >> A separate patch will be needed for behavior change. >> > > Correct. > > Since the goal of this series is to convert to folios, we should maintain the > existing behavior and errno (-EACCES). In a subsequent patch, we can > change behavior. > > That would be my suggestion. Thanks all, will following the suggestion and re-post.
diff --git a/mm/migrate.c b/mm/migrate.c index e21d5a7e7447..b0c318bc531c 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -2057,6 +2057,7 @@ static int add_page_for_migration(struct mm_struct *mm, const void __user *p, struct vm_area_struct *vma; unsigned long addr; struct page *page; + struct folio *folio; int err; bool isolated; @@ -2079,45 +2080,42 @@ static int add_page_for_migration(struct mm_struct *mm, const void __user *p, if (!page) goto out; - if (is_zone_device_page(page)) - goto out_putpage; + folio = page_folio(page); + if (folio_is_zone_device(folio)) + goto out_putfolio; err = 0; - if (page_to_nid(page) == node) - goto out_putpage; + if (folio_nid(folio) == node) + goto out_putfolio; err = -EACCES; - if (page_mapcount(page) > 1 && !migrate_all) - goto out_putpage; + if (folio_estimated_sharers(folio) > 1 && !migrate_all) + goto out_putfolio; - if (PageHuge(page)) { - if (PageHead(page)) { - isolated = isolate_hugetlb(page_folio(page), pagelist); + if (folio_test_hugetlb(folio)) { + if (folio_test_large(folio)) { + isolated = isolate_hugetlb(folio, pagelist); err = isolated ? 1 : -EBUSY; } } else { - struct page *head; - - head = compound_head(page); - isolated = isolate_lru_page(head); + isolated = folio_isolate_lru(folio); if (!isolated) { err = -EBUSY; - goto out_putpage; + goto out_putfolio; } err = 1; - list_add_tail(&head->lru, pagelist); - mod_node_page_state(page_pgdat(head), - NR_ISOLATED_ANON + page_is_file_lru(head), - thp_nr_pages(head)); + list_add_tail(&folio->lru, pagelist); + node_stat_mod_folio(folio, + NR_ISOLATED_ANON + folio_is_file_lru(folio), + folio_nr_pages(folio)); } -out_putpage: +out_putfolio: /* - * Either remove the duplicate refcount from - * isolate_lru_page() or drop the page ref if it was - * not isolated. + * Either remove the duplicate refcount from folio_isolate_lru() + * or drop the folio ref if it was not isolated. */ - put_page(page); + folio_put(folio); out: mmap_read_unlock(mm); return err;