Message ID | 20231010064544.4162286-2-wangkefeng.wang@huawei.com |
---|---|
State | New |
Headers |
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: willy@infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, ying.huang@intel.com, david@redhat.com, Zi Yan <ziy@nvidia.com>, Kefeng Wang <wangkefeng.wang@huawei.com>
Subject: [PATCH -next 1/7] mm_types: add _last_cpupid into folio
Date: Tue, 10 Oct 2023 14:45:38 +0800
Message-ID: <20231010064544.4162286-2-wangkefeng.wang@huawei.com>
In-Reply-To: <20231010064544.4162286-1-wangkefeng.wang@huawei.com>
References: <20231010064544.4162286-1-wangkefeng.wang@huawei.com> |
Series | mm: convert page cpupid functions to folios |
Commit Message
Kefeng Wang
Oct. 10, 2023, 6:45 a.m. UTC
At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, and none of
them supports NUMA balancing. Since struct page is aligned to
_struct_page_alignment, it is safe to move _last_cpupid before
'virtual' in struct page; meanwhile, expose it in struct folio, which
lets us use folio->_last_cpupid directly.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/mm_types.h | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
Comments
Kefeng Wang <wangkefeng.wang@huawei.com> writes:

> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
> them don't support numa balancing, and the page struct is aligned
> to _struct_page_alignment, it is safe to move _last_cpupid before
> 'virtual' in page, meanwhile, add it into folio, which make us to
> use folio->_last_cpupid directly.

Add BUILD_BUG_ON() to check this automatically?

--
Best Regards,
Huang, Ying
On 2023/10/10 16:17, Huang, Ying wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>
>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
>> them don't support numa balancing, and the page struct is aligned
>> to _struct_page_alignment, it is safe to move _last_cpupid before
>> 'virtual' in page, meanwhile, add it into folio, which make us to
>> use folio->_last_cpupid directly.
>
> Add BUILD_BUG_ON() to check this automatically?

WANT_PAGE_VIRTUAL and LAST_CPUPID_NOT_IN_PAGE_FLAGS do not conflict; the check is only to make sure that reordering 'virtual' and _last_cpupid has minimal impact. There is already a build warning in mm/memory.c when LAST_CPUPID_NOT_IN_PAGE_FLAGS is enabled, so I don't think we need a new BUILD_BUG_ON here.

Thanks.

> --
> Best Regards,
> Huang, Ying
On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
> them don't support numa balancing, and the page struct is aligned
> to _struct_page_alignment, it is safe to move _last_cpupid before
> 'virtual' in page, meanwhile, add it into folio, which make us to
> use folio->_last_cpupid directly.

What do you mean by "safe"? I think you mean "Does not increase the
size of struct page", but if that is what you mean, why not just say so?
If there's something else you mean, please explain.

In any event, I'd like to see some reasoning that _last_cpupid is actually
information which is logically maintained on a per-allocation basis,
not a per-page basis (I think this is true, but I honestly don't know).

And looking at all this, I think it makes sense to move _last_cpupid
before the kmsan garbage, then add both 'virtual' and '_last_cpupid'
to folio.
On 2023/10/10 20:33, Matthew Wilcox wrote:
> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
>> them don't support numa balancing, and the page struct is aligned
>> to _struct_page_alignment, it is safe to move _last_cpupid before
>> 'virtual' in page, meanwhile, add it into folio, which make us to
>> use folio->_last_cpupid directly.
>
> What do you mean by "safe"? I think you mean "Does not increase the
> size of struct page", but if that is what you mean, why not just say so?
> If there's something else you mean, please explain.

It doesn't increase the size of struct page, and it doesn't change the actual layout of struct page on the three archs above, since they don't support NUMA balancing.

> In any event, I'd like to see some reasoning that _last_cpupid is actually
> information which is logically maintained on a per-allocation basis,
> not a per-page basis (I think this is true, but I honestly don't know)

_last_cpupid is updated in should_numa_migrate_memory() from the NUMA fault path (do_numa_page and do_huge_pmd_numa_page), so it is per-page (normal page and PMD-mapped page). Maybe I misunderstood your meaning; please correct me.

> And looking at all this, I think it makes sense to move _last_cpupid
> before the kmsan garbage, then add both 'virtual' and '_last_cpupid'
> to folio.

Sure, I will add both of them into folio and leave 'virtual' and '_last_cpupid' in their current order.
Kefeng Wang <wangkefeng.wang@huawei.com> writes:

> On 2023/10/10 20:33, Matthew Wilcox wrote:
>> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
>>> them don't support numa balancing, and the page struct is aligned
>>> to _struct_page_alignment, it is safe to move _last_cpupid before
>>> 'virtual' in page, meanwhile, add it into folio, which make us to
>>> use folio->_last_cpupid directly.
>>
>> What do you mean by "safe"? I think you mean "Does not increase the
>> size of struct page", but if that is what you mean, why not just say so?
>> If there's something else you mean, please explain.
>
> Don't increase size of struct page and don't impact the real order of
> struct page as the above three archs without numa balancing support.
>
>> In any event, I'd like to see some reasoning that _last_cpupid is actually
>> information which is logically maintained on a per-allocation basis,
>> not a per-page basis (I think this is true, but I honestly don't know)
>
> The _last_cpupid is updated in should_numa_migrate_memory() from numa
> fault(do_numa_page, and do_huge_pmd_numa_page), it is per-page(normal
> page and PMD-mapped page). Maybe I misunderstand your mean, please
> correct me.

Because a PTE-mapped THP will not be migrated, according to the comments
and the folio_test_large() test in do_numa_page(), only the _last_cpupid
of the head page will be used (that is, on a per-allocation basis).
Although in change_pte_range() in mprotect.c the _last_cpupid of tail
pages may be changed, they are not actually used. All in all,
_last_cpupid is on a per-allocation basis for now.

In the future, it's hard to say. PTE-mapped THPs or large folios give
us an opportunity to check whether different parts of a folio are
accessed by multiple sockets, so that we should split the folio. But
this is just a possibility for the future.

--
Best Regards,
Huang, Ying
On 2023/10/11 13:55, Huang, Ying wrote:
> Kefeng Wang <wangkefeng.wang@huawei.com> writes:
>
>> On 2023/10/10 20:33, Matthew Wilcox wrote:
>>> On Tue, Oct 10, 2023 at 02:45:38PM +0800, Kefeng Wang wrote:
>>>> At present, only arc/sparc/m68k define WANT_PAGE_VIRTUAL, both of
>>>> them don't support numa balancing, and the page struct is aligned
>>>> to _struct_page_alignment, it is safe to move _last_cpupid before
>>>> 'virtual' in page, meanwhile, add it into folio, which make us to
>>>> use folio->_last_cpupid directly.
>>> What do you mean by "safe"? I think you mean "Does not increase the
>>> size of struct page", but if that is what you mean, why not just say so?
>>> If there's something else you mean, please explain.
>>
>> Don't increase size of struct page and don't impact the real order of
>> struct page as the above three archs without numa balancing support.
>>
>>> In any event, I'd like to see some reasoning that _last_cpupid is actually
>>> information which is logically maintained on a per-allocation basis,
>>> not a per-page basis (I think this is true, but I honestly don't know)
>>
>> The _last_cpupid is updated in should_numa_migrate_memory() from numa
>> fault(do_numa_page, and do_huge_pmd_numa_page), it is per-page(normal
>> page and PMD-mapped page). Maybe I misunderstand your mean, please
>> correct me.
>
> Because PTE mapped THP will not be migrated according to comments and
> folio_test_large() test in do_numa_page(). Only _last_cpupid of the head
> page will be used (that is, on per-allocation basis). Although in
> change_pte_range() in mprotect.c, _last_cpupid of tail pages may be
> changed, they are not used actually. All in all, _last_cpupid is on
> per-allocation basis for now.

Thanks for the clarification; yes, that's what I meant too.

> In the future, it's hard to say. PTE-mapped THPs or large folios give
> us an opportunity to check whether the different parts of a folio are
> accessed by multiple sockets, so that we should split the folio. But
> this is just some possibility in the future.

It depends on the memory access behavior of the application: if multiple
sockets access a large folio / PTE-mapped THP frequently, splitting may
be better; otherwise it is enough to just migrate the entire folio.

> --
> Best Regards,
> Huang, Ying
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 36c5b43999e6..32af41160109 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -183,6 +183,9 @@ struct page {
 #ifdef CONFIG_MEMCG
 	unsigned long memcg_data;
 #endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+	int _last_cpupid;
+#endif

 	/*
 	 * On machines where all RAM is mapped into kernel address space,
@@ -210,10 +213,6 @@ struct page {
 	struct page *kmsan_shadow;
 	struct page *kmsan_origin;
 #endif
-
-#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
-	int _last_cpupid;
-#endif
 } _struct_page_alignment;

 /*
@@ -317,6 +316,9 @@ struct folio {
 			atomic_t _refcount;
 #ifdef CONFIG_MEMCG
 			unsigned long memcg_data;
+#endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+			int _last_cpupid;
 #endif
 	/* private: the union with struct page is transitional */
 		};
@@ -373,6 +375,9 @@ FOLIO_MATCH(_refcount, _refcount);
 #ifdef CONFIG_MEMCG
 FOLIO_MATCH(memcg_data, memcg_data);
 #endif
+#ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
+FOLIO_MATCH(_last_cpupid, _last_cpupid);
+#endif
 #undef FOLIO_MATCH
 #define FOLIO_MATCH(pg, fl)	\
 	static_assert(offsetof(struct folio, fl) ==	\