[v2,1/2] mm: compaction: consider the number of scanning compound pages in isolate fail path
Message ID | 73d6250a90707649cc010731aedc27f946d722ed.1678962352.git.baolin.wang@linux.alibaba.com |
---|---|
State | New |
Headers | From: Baolin Wang <baolin.wang@linux.alibaba.com> To: akpm@linux-foundation.org Cc: mgorman@techsingularity.net, osalvador@suse.de, vbabka@suse.cz, william.lam@bytedance.com, mike.kravetz@oracle.com, baolin.wang@linux.alibaba.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH v2 1/2] mm: compaction: consider the number of scanning compound pages in isolate fail path Date: Thu, 16 Mar 2023 19:06:46 +0800 Message-Id: <73d6250a90707649cc010731aedc27f946d722ed.1678962352.git.baolin.wang@linux.alibaba.com> X-Mailer: git-send-email 2.27.0 List-ID: <linux-kernel.vger.kernel.org> |
Series |
[v2,1/2] mm: compaction: consider the number of scanning compound pages in isolate fail path
|
|
Commit Message
Baolin Wang
March 16, 2023, 11:06 a.m. UTC
Commit b717d6b93b54 ("mm: compaction: include compound page count
for scanning in pageblock isolation") added compound page statistics
for scanning in pageblock isolation, to make sure the number of scanned
pages is always larger than the number of isolated pages when isolating
a migratable or free pageblock.

However, when page isolation fails while scanning a migratable or free
pageblock, the isolation failure path does not account for the compound
pages that were scanned. This can show an incorrect number of scanned
pages in the tracepoints or vmstats, which confuses people about the
page scanning pressure in memory compaction.

Thus we should take the number of scanned pages into account when
isolating compound pages fails, to make the statistics accurate.

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
Changes from v1:
- Move the compound page statistics after the order sanity check.
---
mm/compaction.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
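
The undercount described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `toy_scan_block()`, `TOY_MAX_ORDER` and the `orders[]` layout are hypothetical names invented for illustration, and every compound page is assumed to hit the isolation failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cap, standing in for the kernel's MAX_ORDER sanity check. */
#define TOY_MAX_ORDER 11

/*
 * Toy model of a scanner walking a pageblock (not kernel code).
 * orders[i] is the order of the page whose head sits at index i;
 * entries covered by the tail of a compound page are never read.
 * Every compound page is treated as an isolation failure and skipped.
 */
static unsigned long toy_scan_block(const unsigned int *orders,
				    size_t nr_pages, bool count_tails)
{
	unsigned long nr_scanned = 0;
	size_t pfn;

	for (pfn = 0; pfn < nr_pages; pfn++) {
		unsigned int order = orders[pfn];

		nr_scanned++;			/* the page actually examined */
		if (order == 0)
			continue;

		assert(order < TOY_MAX_ORDER);
		pfn += (1UL << order) - 1;	/* skip over the tail pages */
		if (count_tails)		/* the accounting the patch adds */
			nr_scanned += (1UL << order) - 1;
	}
	return nr_scanned;
}
```

For a 16-page block holding an order-2 page at index 0 and an order-3 page at index 8, calling `toy_scan_block(block, 16, false)` gives 6, while `toy_scan_block(block, 16, true)` gives 16, the full span the scanner moved across.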
Comments
On 3/16/23 12:06, Baolin Wang wrote:
> The commit b717d6b93b54 ("mm: compaction: include compound page count
> for scanning in pageblock isolation") had added compound page statistics
> for scanning in pageblock isolation, to make sure the number of scanned
> pages are always larger than the number of isolated pages when isolating
> migratable or free pageblock.
>
> However, when failed to isolate the pages when scanning the migratable or
> free pageblock, the isolation failure path did not consider the scanning
> statistics of the compound pages, which can show the incorrect number of
> scanned pages in tracepoints or the vmstats to make people confusing about
> the page scanning pressure in memory compaction.
>
> Thus we should take into account the number of scanning pages when failed
> to isolate the compound pages to make the statistics accurate.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Thanks!

On Thu, Mar 16, 2023 at 07:06:46PM +0800, Baolin Wang wrote:
> The commit b717d6b93b54 ("mm: compaction: include compound page count
> for scanning in pageblock isolation") had added compound page statistics
> for scanning in pageblock isolation, to make sure the number of scanned
> pages are always larger than the number of isolated pages when isolating
> migratable or free pageblock.
>
> However, when failed to isolate the pages when scanning the migratable or
> free pageblock, the isolation failure path did not consider the scanning
> statistics of the compound pages, which can show the incorrect number of
> scanned pages in tracepoints or the vmstats to make people confusing about
> the page scanning pressure in memory compaction.
>
> Thus we should take into account the number of scanning pages when failed
> to isolate the compound pages to make the statistics accurate.
>
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

However, the patch highlights weakness in the tracepoints and how
useful they are.

Minimally, I think that the change might be misleading when comparing
tracepoints across kernel versions as it'll be necessary to check the exact
meaning of nr_scanned for a given kernel version. That's not a killer problem
as such, just a hazard if using an analysis tool comparing kernel versions.

As an example, consider this

	if (PageCompound(page)) {
		const unsigned int order = compound_order(page);

		if (likely(order < MAX_ORDER)) {
			blockpfn += (1UL << order) - 1;
			cursor += (1UL << order) - 1;
			nr_scanned += compound_nr(page) - 1;	<<< patch adds
		}
		goto isolate_fail;
	}

Only the head page is "scanned", the tail pages are not scanned so
accounting for them as "scanned" is not an accurate reflection of the
amount of work done. Isolation is different because the compound pages
isolated is a prediction of how much work is necessary to migrate that
page as it's obviously more work to copy 2M of data than 4K. The migrated
pages combined with isolation then can measure efficiency of isolation
vs migration although imperfectly as isolation is a span while migration
probably fails at the head page.

The same applies when skipping buddies, the tail pages are not scanned
so skipping them is not additional work.

Everything depends on what the tracepoint is being used for. If it's a
measure of work done, then accounting for skipped tail pages over-estimates
the amount of work. However, if the intent is to measure efficiency of
isolation vs migration then the "span" scanned is more useful.

None of this kills the patch, it only notes that the tracepoints as-is
probably cannot answer all relevant questions, most of which are only
relevant when making a modification to compaction in general. The patch
means that an unspecified pressure metric can be derived (maybe interesting
to sysadmins) but loses a metric about time spent on scanning (maybe
interesting to developers writing a patch). Of those concerns, sysadmins
are probably more common so the patch is acceptable but some care will be
needed if modifying the tracepoints further if it enables one type of
analysis at the cost of another.

On 4/5/2023 6:31 PM, Mel Gorman wrote:
> On Thu, Mar 16, 2023 at 07:06:46PM +0800, Baolin Wang wrote:
>> The commit b717d6b93b54 ("mm: compaction: include compound page count
>> for scanning in pageblock isolation") had added compound page statistics
>> for scanning in pageblock isolation, to make sure the number of scanned
>> pages are always larger than the number of isolated pages when isolating
>> migratable or free pageblock.
>>
>> However, when failed to isolate the pages when scanning the migratable or
>> free pageblock, the isolation failure path did not consider the scanning
>> statistics of the compound pages, which can show the incorrect number of
>> scanned pages in tracepoints or the vmstats to make people confusing about
>> the page scanning pressure in memory compaction.
>>
>> Thus we should take into account the number of scanning pages when failed
>> to isolate the compound pages to make the statistics accurate.
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>
> Acked-by: Mel Gorman <mgorman@techsingularity.net>

Thanks Mel.

>
> However, the patch highlights weakness in the tracepoints and how
> useful they are.
>
> Minimally, I think that the change might be misleading when comparing
> tracepoints across kernel versions as it'll be necessary to check the exact
> meaning of nr_scanned for a given kernel version. That's not a killer problem
> as such, just a hazard if using an analysis tool comparing kernel versions.
>
> As an example, consider this
>
> 	if (PageCompound(page)) {
> 		const unsigned int order = compound_order(page);
>
> 		if (likely(order < MAX_ORDER)) {
> 			blockpfn += (1UL << order) - 1;
> 			cursor += (1UL << order) - 1;
> 			nr_scanned += compound_nr(page) - 1;	<<< patch adds
> 		}
> 		goto isolate_fail;
> 	}
>
> Only the head page is "scanned", the tail pages are not scanned so
> accounting for them as "scanned" is not an accurate reflection of the
> amount of work done. Isolation is different because the compound pages
> isolated is a prediction of how much work is necessary to migrate that
> page as it's obviously more work to copy 2M of data than 4K. The migrated
> pages combined with isolation then can measure efficiency of isolation
> vs migration although imperfectly as isolation is a span while migration
> probably fails at the head page.
>
> The same applies when skipping buddies, the tail pages are not scanned
> so skipping them is not additional work.
>
> Everything depends on what the tracepoint is being used for. If it's a
> measure of work done, then accounting for skipped tail pages over-estimates
> the amount of work. However, if the intent is to measure efficiency of
> isolation vs migration then the "span" scanned is more useful.

Yes, we are more concerned about the efficiency of isolation vs migration.

> None of this kills the patch, it only notes that the tracepoints as-is
> probably cannot answer all relevant questions, most of which are only
> relevant when making a modification to compaction in general. The patch
> means that an unspecified pressure metric can be derived (maybe interesting
> to sysadmins) but loses a metric about time spent on scanning (maybe
> interesting to developers writing a patch). Of those concerns, sysadmins
> are probably more common so the patch is acceptable but some care will be
> needed if modifying the tracepoints further if it enables one type of
> analysis at the cost of another.

Understood, and thanks for your excellent explanation.

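
The "efficiency of isolation vs migration" reading discussed in this thread can be made concrete with a small calculation. The helper below is a hypothetical illustration, not something the kernel or its tracepoints provide: `toy_isolation_pct()` is an invented name, and the scenario (two order-9 compound pages, the first isolated, the second failing) is assumed for the sake of the arithmetic.

```c
/*
 * Hypothetical helper (not a kernel API): isolated pages as an integer
 * percentage of scanned pages. Returns 0 when nothing was scanned.
 */
static unsigned long toy_isolation_pct(unsigned long nr_isolated,
				       unsigned long nr_scanned)
{
	if (nr_scanned == 0)
		return 0;
	return (nr_isolated * 100) / nr_scanned;
}

/*
 * Assumed scenario: two order-9 compound pages (512 base pages each).
 * The first is isolated and counted as 512 scanned + 512 isolated.
 * The second fails isolation:
 *   - head-only accounting on failure: nr_scanned = 512 + 1   = 513
 *   - span accounting on failure:      nr_scanned = 512 + 512 = 1024
 */
```

Under these assumptions `toy_isolation_pct(512, 513)` evaluates to 99, while `toy_isolation_pct(512, 1024)` evaluates to 50. The second figure reflects that half of the span yielded isolated pages, which is the ratio the span-based count is meant to preserve; the first figure is closer to the "work done" view, since only one extra page was actually examined.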
diff --git a/mm/compaction.c b/mm/compaction.c
index 5a9501e0ae01..7e645cdfc2e9 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -586,6 +586,7 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 			if (likely(order < MAX_ORDER)) {
 				blockpfn += (1UL << order) - 1;
 				cursor += (1UL << order) - 1;
+				nr_scanned += (1UL << order) - 1;
 			}
 			goto isolate_fail;
 		}
@@ -904,6 +905,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 				if (ret == -EBUSY)
 					ret = 0;
 				low_pfn += compound_nr(page) - 1;
+				nr_scanned += compound_nr(page) - 1;
 				goto isolate_fail;
 			}
 
@@ -938,8 +940,10 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			 * a valid page order. Consider only values in the
 			 * valid order range to prevent low_pfn overflow.
 			 */
-			if (freepage_order > 0 && freepage_order < MAX_ORDER)
+			if (freepage_order > 0 && freepage_order < MAX_ORDER) {
 				low_pfn += (1UL << freepage_order) - 1;
+				nr_scanned += (1UL << freepage_order) - 1;
+			}
 			continue;
 		}
 
@@ -954,8 +958,10 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		if (PageCompound(page) && !cc->alloc_contig) {
 			const unsigned int order = compound_order(page);
 
-			if (likely(order < MAX_ORDER))
+			if (likely(order < MAX_ORDER)) {
 				low_pfn += (1UL << order) - 1;
+				nr_scanned += (1UL << order) - 1;
+			}
 			goto isolate_fail;
 		}
 
@@ -1077,6 +1083,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			 */
 			if (unlikely(PageCompound(page) && !cc->alloc_contig)) {
 				low_pfn += compound_nr(page) - 1;
+				nr_scanned += compound_nr(page) - 1;
 				SetPageLRU(page);
 				goto isolate_fail_put;
 			}