Message ID | 1687861992-8722-1-git-send-email-quic_charante@quicinc.com |
---|---|
State | New |
Headers |
From: Charan Teja Kalla <quic_charante@quicinc.com>
To: <akpm@linux-foundation.org>, <surenb@google.com>, <hannes@cmpxchg.org>, <minchan@kernel.org>
CC: <quic_pkondeti@quicinc.com>, <quic_smanapra@quicinc.com>, <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>, Charan Teja Kalla <quic_charante@quicinc.com>
Subject: [PATCH V2] mm: madvise: fix uneven accounting of psi
Date: Tue, 27 Jun 2023 16:03:12 +0530
Message-ID: <1687861992-8722-1-git-send-email-quic_charante@quicinc.com>
X-Mailer: git-send-email 2.7.4
MIME-Version: 1.0
Content-Type: text/plain
List-ID: <linux-kernel.vger.kernel.org> |
Series |
[V2] mm: madvise: fix uneven accounting of psi
|
|
Commit Message
Charan Teja Kalla
June 27, 2023, 10:33 a.m. UTC
A folio becomes a Workingset folio when:
1) shrink_active_list() moves the folio from the active to the inactive list.
2) A workingset transition happens during the refault of the folio.
Once Workingset is set on a folio, PSI for memory can be accounted
a) while that folio is being reclaimed and b) on refault of that folio.
This accounting of PSI for memory is not consistent in the cases where
clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
folio:
a) A folio starts on the inactive list and moves to the active list as a
result of accesses. Workingset is absent on the folio, so
madvise(MADV_PAGEOUT) doesn't account such folios for PSI.
b) The same folio transitions from inactive->active and then back to
inactive through shrink_active_list(). Workingset is set on the folio,
so madvise(MADV_PAGEOUT) accounts such folios for PSI.
c) The same folio is placed on the active list directly as a result of
a refault, and it was a workingset folio prior to eviction.
Workingset is set on the folio, so both MADV_PAGEOUT and reclaim of a
folio operated on by MADV_COLD account for PSI.
d) madvise(MADV_COLD) moves the folio from the active list to the inactive
list. Such folios may not have Workingset set, so reclaim of such a
folio doesn't account for PSI.
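The four cases above can be traced with a small userspace model. This is only a sketch: `struct toy_folio` and every `toy_*` helper are hypothetical stand-ins for the flag transitions described in the commit message, not kernel code, and the model assumes PSI accounting depends solely on the Workingset flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two folio flags discussed above (NOT struct folio). */
struct toy_folio {
	bool active;     /* folio sits on the active LRU list */
	bool workingset; /* Workingset flag */
};

/* Access-driven promotion: inactive -> active, Workingset untouched. */
void toy_activate(struct toy_folio *f)
{
	f->active = true;
}

/* Models shrink_active_list(): active -> inactive, Workingset gets set. */
void toy_shrink_active(struct toy_folio *f)
{
	assert(f->active);
	f->active = false;
	f->workingset = true;
}

/* Models case c): a refault straight onto the active list, carrying over
 * whether the folio was a workingset folio before eviction. */
void toy_refault(struct toy_folio *f, bool was_workingset)
{
	f->active = true;
	f->workingset = was_workingset;
}

/* Models madvise(MADV_COLD) before the patch: deactivate, flag untouched. */
void toy_madv_cold_unpatched(struct toy_folio *f)
{
	f->active = false;
}

/* Assumption of this model: reclaim/refault is accounted for PSI only
 * when Workingset is set on the folio. */
bool toy_psi_accounted(const struct toy_folio *f)
{
	return f->workingset;
}
```

Walking a folio through `toy_activate()` alone reproduces case a) (flag clear), adding `toy_shrink_active()` reproduces b), `toy_refault(f, true)` reproduces c), and `toy_activate()` followed by `toy_madv_cold_unpatched()` reproduces d).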
As described above, MADV_PAGEOUT on a folio accounts for memory PSI in
b) and c) but not in a). Reclaim of a folio on which MADV_COLD was
performed accounts for memory PSI in c) but not in d), which is
inconsistent behaviour. Make this PSI accounting always consistent by
turning a folio into a workingset one whenever it leaves the active
list. Also, accounting PSI on a folio whenever it leaves the
active list as part of a MADV_COLD/PAGEOUT operation helps users tell
whether they are operating on the proper folios[1].
[1] https://lore.kernel.org/all/20230605180013.GD221380@cmpxchg.org/
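For readers who want to watch the memory PSI numbers that this accounting feeds, the following is a minimal sketch of reading `/proc/pressure/memory` from userspace. The struct and function names (`psi_line`, `psi_parse_line`, `psi_read_memory_some`) are hypothetical; the line layout `some avg10=... avg60=... avg300=... total=...` is the one described in the kernel's PSI documentation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a PSI file; names here are illustrative only. */
struct psi_line {
	char kind[8];              /* "some" or "full" */
	double avg10, avg60, avg300;
	unsigned long long total;  /* cumulative stall time, microseconds */
};

/* Parse one PSI line. Returns 0 on success, -1 on malformed input. */
int psi_parse_line(const char *line, struct psi_line *out)
{
	assert(line && out);
	int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		       out->kind, &out->avg10, &out->avg60, &out->avg300,
		       &out->total);
	return n == 5 ? 0 : -1;
}

/* Read the "some" line of /proc/pressure/memory.
 * Returns 0 on success, -1 if the file is missing or has no such line
 * (for example on kernels built without CONFIG_PSI). */
int psi_read_memory_some(struct psi_line *out)
{
	char buf[128];
	int ret = -1;
	FILE *fp = fopen("/proc/pressure/memory", "r");

	if (!fp)
		return -1;
	while (fgets(buf, sizeof(buf), fp)) {
		if (psi_parse_line(buf, out) == 0 &&
		    strcmp(out->kind, "some") == 0) {
			ret = 0;
			break;
		}
	}
	fclose(fp);
	return ret;
}
```

Sampling `total` before and after a madvise() plus re-access cycle is one way a user could check whether the folios they advised were counted as workingset refaults.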
Suggested-by: Suren Baghdasaryan <surenb@google.com>
Reported-by: Sai Manobhiram Manapragada <quic_smanapra@quicinc.com>
Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
V2: Made changes as per the comments from Johannes/Suren.
V1: https://lore.kernel.org/all/1685531374-6091-1-git-send-email-quic_charante@quicinc.com/
mm/madvise.c | 2 ++
1 file changed, 2 insertions(+)
Comments
On Tue, Jun 27, 2023 at 04:03:12PM +0530, Charan Teja Kalla wrote:
> A folio turns into a Workingset during:
> 1) shrink_active_list() placing the folio from active to inactive list.
> 2) When a workingset transition is happening during the folio refault.
>
> And when Workingset is set on a folio, PSI for memory can be accounted
> during a) That folio is being reclaimed and b) Refault of that folio.
>

Please help me understand why PSI for memory (I understood it as the
time spent in psi_memstall_enter() to psi_memstall_leave()) would be
accounted in (a), i.e. during reclaim. I understand that when a working

The (b) part is very clear.

> This accounting of PSI for memory is not consistent in the cases where
> clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
> folio:
> a) A folio started at inactive and moved to active as part of accesses.
> Workingset is absent on the folio thus madvise(MADV_PAGEOUT) don't
> account such folios for PSI.
>
> b) When the same folio transition from inactive->active and then to
> inactive through shrink_active_list(). Workingset is set on the folio
> thus madvise(MADV_PAGEOUT) account such folios for PSI.
>
> c) When the same folio is part of active list directly as a result of
> folio refault and this was a workingset folio prior to eviction.
> Workingset is set on the folio thus both the operations of MADV_PAGEOUT
> and reclaim of the MADV_COLD operated folio account for PSI.
>
> d) madvise(MADV_COLD) transfers the folio from active list to inactive
> list. Such folios may not have the Workingset thus reclaim operation
> on such folio doesn't account for PSI.
>
> As said above, the MADV_PAGEOUT on a folio is accounts for memory PSI in
> b) and c) but not in a). Reclaim of a folio on which MADV_COLD is
> performed accounts memory PSI in c) but not in d) which is an
> inconsistent behaviour. Make this PSI accounting always consistent by
> turning a folio into a workingset one whenever it is leaving the active
> list. Also, accounting of PSI on a folio whenever it leaves the
> active list as part of the MADV_COLD/PAGEOUT operation helps the users
> whether they are operating on proper folios[1].

I understood the problem from the V1 discussions. But the references to
"madvise account such folios for PSI" are confusing. Why would
madvise(PAGEOUT) be accounting anything related to PSI? I get that
madvise() is messing up PSI accuracy indirectly.

>
> [1] https://lore.kernel.org/all/20230605180013.GD221380@cmpxchg.org/
>
> Suggested-by: Suren Baghdasaryan <surenb@google.com>
> Reported-by: Sai Manobhiram Manapragada <quic_smanapra@quicinc.com>
> Reported-by: Pavan Kondeti <quic_pkondeti@quicinc.com>
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
> ---
> V2: Made changes as per the comments from Johannes/Suren.
>
> V1: https://lore.kernel.org/all/1685531374-6091-1-git-send-email-quic_charante@quicinc.com/
>
>  mm/madvise.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index d9e7b42..76fb31f 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -413,6 +413,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>
>  		folio_clear_referenced(folio);
>  		folio_test_clear_young(folio);
> +		folio_set_workingset(folio);
>  		if (pageout) {
>  			if (folio_isolate_lru(folio)) {
>  				if (folio_test_unevictable(folio))
> @@ -512,6 +513,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>  		 */
>  		folio_clear_referenced(folio);
>  		folio_test_clear_young(folio);
> +		folio_set_workingset(folio);
>  		if (pageout) {
>  			if (folio_isolate_lru(folio)) {
>  				if (folio_test_unevictable(folio))
> --
> 2.7.4
>

This is not limited to madvise(PAGEOUT), right? Anywhere an active page
is reclaimed we have the same problem. For example: damon_pa_pageout() and
__alloc_contig_migrate_range()->reclaim_clean_pages_from_list().

If that is the case, can we mark a folio as a workingset when it is
activated? That way, we don't have to make madvise() a special case.

Thanks,
Pavan
Hi Charan,

thanks for fixing this. One comment:

On Tue, Jun 27, 2023 at 04:03:12PM +0530, Charan Teja Kalla wrote:
> @@ -413,6 +413,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>
>  		folio_clear_referenced(folio);
>  		folio_test_clear_young(folio);
> +		folio_set_workingset(folio);

Unless I'm missing something, this also includes inactive pages, which
is undesirable. Shouldn't this be:

	if (folio_test_active(folio))
		folio_set_workingset(folio);

> @@ -512,6 +513,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
>  		 */
>  		folio_clear_referenced(folio);
>  		folio_test_clear_young(folio);
> +		folio_set_workingset(folio);

Here as well.
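The difference between the hunk as posted and the guarded form suggested in this review can be illustrated with a toy model. This is a sketch under stated assumptions: `struct toy_folio`, `toy_mark_v2()` and `toy_mark_guarded()` are hypothetical names that only mimic the flag logic of folio_test_active() and folio_set_workingset(); they are not the kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the relevant folio flags (NOT struct folio). */
struct toy_folio {
	bool active;     /* folio sits on the active LRU list */
	bool workingset; /* Workingset flag */
};

/* V2 as posted: every folio reaching this point is marked,
 * including folios that were already inactive. */
void toy_mark_v2(struct toy_folio *f)
{
	assert(f != NULL);
	f->workingset = true;
}

/* Form suggested in review: mark only folios that are leaving
 * the active list. */
void toy_mark_guarded(struct toy_folio *f)
{
	assert(f != NULL);
	if (f->active)
		f->workingset = true;
}
```

For a folio with `active == false`, `toy_mark_v2()` sets the flag while `toy_mark_guarded()` leaves it clear, which is the undesirable inclusion of inactive pages pointed out above; for an active folio both variants behave the same.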
Hi Pavan,

On 6/27/2023 7:26 PM, Pavan Kondeti wrote:
>> A folio turns into a Workingset during:
>> 1) shrink_active_list() placing the folio from active to inactive list.
>> 2) When a workingset transition is happening during the folio refault.
>>
>> And when Workingset is set on a folio, PSI for memory can be accounted
>> during a) That folio is being reclaimed and b) Refault of that folio.
>>
> Please help me understand why PSI for memory (I understood it as the
> time spent in psi_memstall_enter() to psi_memstall_leave()) would be
> accounted in (a) i.e during reclaim. I understand that when a working
>
> The (b) part is very clear.
>

I meant to say that, for usual reclaim, PSI is accounted on a folio both
during reclaim and during the refault operation when Workingset is set
on the folio, i.e., both the a) and b) cases above.

>> This accounting of PSI for memory is not consistent in the cases where
>> clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
>> folio:

Seems I need to be explicit here. How about the below?

This accounting of PSI for memory is not consistent for reclaim +
refault operation between usual reclaim and madvise(COLD/PAGEOUT) which
deactivate or proactively reclaim a folio:

Let me know if you have a better rephrasing.

>> a) A folio started at inactive and moved to active as part of accesses.
>> Workingset is absent on the folio thus madvise(MADV_PAGEOUT) don't
>> account such folios for PSI.
>>
>> b) When the same folio transition from inactive->active and then to
>> inactive through shrink_active_list(). Workingset is set on the folio
>> thus madvise(MADV_PAGEOUT) account such folios for PSI.
>>
>> c) When the same folio is part of active list directly as a result of
>> folio refault and this was a workingset folio prior to eviction.
>> Workingset is set on the folio thus both the operations of MADV_PAGEOUT
>> and reclaim of the MADV_COLD operated folio account for PSI.
>>
>> d) madvise(MADV_COLD) transfers the folio from active list to inactive
>> list. Such folios may not have the Workingset thus reclaim operation
>> on such folio doesn't account for PSI.
> This is not limited to madvise(PAGEOUT) right, anywhere an active page
> is reclaimed we have the same problem. For ex: damon_pa_pageout() and
> __alloc_contig_migrate_range()->reclaim_clean_pages_from_list().
>> If that is the case, can we set mark a folio as a workingset when it is
> activated? That way, we don't have make madvise() as a special case?

I think marking the folio as a workingset while it sits on the active
list is not the correct thing to do. For the same example you mentioned,
a simple CMA allocation will drop the clean pages instead of migrating
them. PSI accounting on refault of those pages doesn't reveal anything
to the user.

Whereas in the madvise() cases, this PSI tells the user about the type
of pages that he is working on.[1]

BTW, damon_pa_pageout() seems a valid case above. Let me fix it in the
next patch.

[1] https://lore.kernel.org/all/20230605180013.GD221380@cmpxchg.org/
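For context on the user side of this discussion, here is a minimal userspace sketch of a client that advises a region cold or paged out and then touches it again (the re-access is where a refault would happen if the pages were actually reclaimed). The function name `pageout_and_refault()` is hypothetical. The fallback values 20 and 21 for `MADV_COLD` and `MADV_PAGEOUT` are taken from the generic Linux UAPI header and are an assumption for toolchains whose headers predate these flags. Whether anything is really reclaimed depends on the kernel version and swap configuration, so the sketch reports the madvise() errno rather than assuming success.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Fallbacks for older libc headers (values from asm-generic UAPI). */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Map and dirty an anonymous region, apply 'advice' to it, then read
 * every byte back. Returns the byte sum, or -1 if mmap() failed.
 * *madv_errno is set to 0 on madvise() success, else to its errno. */
long pageout_and_refault(size_t len, int advice, int *madv_errno)
{
	long sum = 0;
	unsigned char *p;

	assert(madv_errno != NULL);
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAB, len);  /* fault the pages in and dirty them */
	*madv_errno = madvise(p, len, advice) == 0 ? 0 : errno;

	/* Touch the region again; contents must be unchanged whether or
	 * not the pages were reclaimed in between. */
	for (size_t i = 0; i < len; i++)
		sum += p[i];

	munmap(p, len);
	return sum;
}
```

A caller might invoke `pageout_and_refault(16384, MADV_PAGEOUT, &err)` and inspect `err` to see whether the kernel accepted the advice.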
Thanks Johannes!!

On 6/27/2023 8:16 PM, Johannes Weiner wrote:
> Unless I'm missing something, this also includes inactive pages, which
> is undesirable. Shouldn't this be:
>
> 	if (folio_test_active(folio))

My bad. Let me fix it.

> 		folio_set_workingset(folio);
On Wed, Jun 28, 2023 at 04:19:01PM +0530, Charan Teja Kalla wrote:
> Hi Pavan,
>
> On 6/27/2023 7:26 PM, Pavan Kondeti wrote:
> >> A folio turns into a Workingset during:
> >> 1) shrink_active_list() placing the folio from active to inactive list.
> >> 2) When a workingset transition is happening during the folio refault.
> >>
> >> And when Workingset is set on a folio, PSI for memory can be accounted
> >> during a) That folio is being reclaimed and b) Refault of that folio.
> >>
> > Please help me understand why PSI for memory (I understood it as the
> > time spent in psi_memstall_enter() to psi_memstall_leave()) would be
> > accounted in (a) i.e during reclaim. I understand that when a working
> >
> > The (b) part is very clear.
> >
> I meant to say, for usual reclaim, PSI is accounted on a folio for both
> reclaim and as well during the refault operation when Workingset is set
> on a folio i.e., both a) and b) cases above.
>

Got it.

> >> This accounting of PSI for memory is not consistent in the cases where
> >> clients use madvise(COLD/PAGEOUT) to deactivate or proactively reclaim a
> >> folio:
>
> Seems I need to be explicit here. How about the below?
>
> This accounting of PSI for memory is not consistent for reclaim +
> refault operation between usual reclaim and madvise(COLD/PAGEOUT) which
> deactivate or proactively reclaim a folio:
>

Looks good.

> lmk for any better rephrasing?
> >> a) A folio started at inactive and moved to active as part of accesses.
> >> Workingset is absent on the folio thus madvise(MADV_PAGEOUT) don't
> >> account such folios for PSI.
> >>
> >> b) When the same folio transition from inactive->active and then to
> >> inactive through shrink_active_list(). Workingset is set on the folio
> >> thus madvise(MADV_PAGEOUT) account such folios for PSI.
> >>
> >> c) When the same folio is part of active list directly as a result of
> >> folio refault and this was a workingset folio prior to eviction.
> >> Workingset is set on the folio thus both the operations of MADV_PAGEOUT
> >> and reclaim of the MADV_COLD operated folio account for PSI.
> >>
> >> d) madvise(MADV_COLD) transfers the folio from active list to inactive
> >> list. Such folios may not have the Workingset thus reclaim operation
> >> on such folio doesn't account for PSI.
> > This is not limited to madvise(PAGEOUT) right, anywhere an active page
> > is reclaimed we have the same problem. For ex: damon_pa_pageout() and
> > __alloc_contig_migrate_range()->reclaim_clean_pages_from_list().
> >> If that is the case, can we set mark a folio as a workingset when it is
> > activated? That way, we don't have make madvise() as a special case?
> I think marking the folio as a workingset when it sits on the active is
> not a correct thing. For the same example you mentioned, a simple CMA
> allocation will be dropping the clean pages instead of migration. PSI
> accounting on refault of those pages don't reveal anything to the user.
>

Agreed. Thanks for the clarification.

> Where as in the madvise() cases, this PSI tells the user about the type
> of pages that he is working on.[1]
>
> BTW, damon_pa_pageout() seems a valid case above. let me fix it in the
> next patch.

Looks good.

Thanks,
Pavan
Hi Pavan,

On 6/28/2023 4:19 PM, Charan Teja Kalla wrote:
> I think marking the folio as a workingset when it sits on the active is
> not a correct thing. For the same example you mentioned, a simple CMA
> allocation will be dropping the clean pages instead of migration. PSI
> accounting on refault of those pages don't reveal anything to the user.
>
> Where as in the madvise() cases, this PSI tells the user about the type
> of pages that he is working on.[1]
>
> BTW, damon_pa_pageout() seems a valid case above. let me fix it in the
> next patch.

I looked a little more at the DAMON code and, IIUC:
DAMON monitors the ranges it is asked to operate on as regions and
operates (reclaims) on the regions that have fewer accesses. IOW,
DAMON won't do a pageout operation on a folio if it is really in use,
CMIW.

This is unlike the madvise() operation, where Workingset helps in
accounting PSI, which tells the user the type of folios he is
operating on. Assume that DAMON is operating on the wrong set of regions
and Workingset helps in producing PSI. This is of no help to the user and
just exposes the internals of DAMON. No?

Having said that, theoretically it seems correct to me to set Workingset
on folios as they leave the active list, but I don't have any strong
reason to say what happens if we don't. Moreover, this patch is mostly
about madvise()-operated folios not being in line with usual reclaim.
Maybe a separate change can be raised for DAMON-operated folios once we
agree on the importance of Workingset to these folios. WDYT?

Thanks,
diff --git a/mm/madvise.c b/mm/madvise.c
index d9e7b42..76fb31f 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -413,6 +413,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 
 		folio_clear_referenced(folio);
 		folio_test_clear_young(folio);
+		folio_set_workingset(folio);
 		if (pageout) {
 			if (folio_isolate_lru(folio)) {
 				if (folio_test_unevictable(folio))
@@ -512,6 +513,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		 */
 		folio_clear_referenced(folio);
 		folio_test_clear_young(folio);
+		folio_set_workingset(folio);
 		if (pageout) {
 			if (folio_isolate_lru(folio)) {
 				if (folio_test_unevictable(folio))