Message ID | 20231122100156.6568-3-ddrokosov@salutedevices.com |
---|---|
State | New |
Headers |
From: Dmitry Rokosov <ddrokosov@salutedevices.com>
To: rostedt@goodmis.org, mhiramat@kernel.org, hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev, akpm@linux-foundation.org
Cc: kernel@sberdevices.ru, rockosov@gmail.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org, Dmitry Rokosov <ddrokosov@salutedevices.com>
Subject: [PATCH v2 2/2] mm: memcg: introduce new event to trace shrink_memcg
Date: Wed, 22 Nov 2023 13:01:56 +0300
Message-ID: <20231122100156.6568-3-ddrokosov@salutedevices.com>
In-Reply-To: <20231122100156.6568-1-ddrokosov@salutedevices.com>
References: <20231122100156.6568-1-ddrokosov@salutedevices.com> |
Series | mm: memcg: improve vmscan tracepoints |
Commit Message
Dmitry Rokosov
Nov. 22, 2023, 10:01 a.m. UTC
The shrink_memcg flow plays a crucial role in memcg reclamation.
Currently, it is not possible to trace this point from non-direct
reclaim paths. However, direct reclaim has its own tracepoint, so there
is no issue there. In certain cases, when debugging memcg pressure,
developers may need to identify every source of memcg reclamation,
including requests coming from kswapd. The patchset introduces the
tracepoints mm_vmscan_memcg_shrink_{begin|end}() to address this problem.
Example of output in the kswapd context (non-direct reclaim):
kswapd0-39 [001] ..... 240.356378: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.356396: mm_vmscan_memcg_shrink_end: nr_reclaimed=0 memcg=test
kswapd0-39 [001] ..... 240.356420: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.356454: mm_vmscan_memcg_shrink_end: nr_reclaimed=1 memcg=test
kswapd0-39 [001] ..... 240.356479: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.356506: mm_vmscan_memcg_shrink_end: nr_reclaimed=4 memcg=test
kswapd0-39 [001] ..... 240.356525: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.356593: mm_vmscan_memcg_shrink_end: nr_reclaimed=11 memcg=test
kswapd0-39 [001] ..... 240.356614: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.356738: mm_vmscan_memcg_shrink_end: nr_reclaimed=25 memcg=test
kswapd0-39 [001] ..... 240.356790: mm_vmscan_memcg_shrink_begin: order=0 gfp_flags=GFP_KERNEL memcg=test
kswapd0-39 [001] ..... 240.357125: mm_vmscan_memcg_shrink_end: nr_reclaimed=53 memcg=test
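Each begin/end pair above brackets one pass of shrink_node_memcgs() over the
'test' cgroup, so the timestamp delta and nr_reclaimed give a per-pass latency
and yield. As a worked example from the last pair: 240.357125 - 240.356790 is
roughly 0.34 ms to reclaim 53 pages (about 212 KiB, assuming 4 KiB pages) from
that cgroup during kswapd reclaim.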
Signed-off-by: Dmitry Rokosov <ddrokosov@salutedevices.com>
---
include/trace/events/vmscan.h | 14 ++++++++++++++
mm/vmscan.c | 11 +++++++++++
2 files changed, 25 insertions(+)
Comments
On Wed 22-11-23 13:01:56, Dmitry Rokosov wrote:
> The shrink_memcg flow plays a crucial role in memcg reclamation.
> Currently, it is not possible to trace this point from non-direct
> reclaim paths.

Is this really true? AFAICS we have
	mm_vmscan_lru_isolate
	mm_vmscan_lru_shrink_active
	mm_vmscan_lru_shrink_inactive
which are in the very core of the memory reclaim. Sure post processing
those is some work.

[...]

> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 45780952f4b5..6d89b39d9a91 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -6461,6 +6461,12 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> 	 */
> 	cond_resched();
>
> +#ifdef CONFIG_MEMCG
> +	trace_mm_vmscan_memcg_shrink_begin(sc->order,
> +					   sc->gfp_mask,
> +					   memcg);
> +#endif

this is a common code path for node and direct reclaim which means that
we will have multiple begin/end tracepoints covering similar operations.
To me that sounds excessive. If you are missing a cumulative kswapd
alternative to
	mm_vmscan_direct_reclaim_begin
	mm_vmscan_direct_reclaim_end
	mm_vmscan_memcg_reclaim_begin
	mm_vmscan_memcg_reclaim_end
	mm_vmscan_memcg_softlimit_reclaim_begin
	mm_vmscan_memcg_softlimit_reclaim_end
	mm_vmscan_node_reclaim_begin
	mm_vmscan_node_reclaim_end
then place it into kswapd path. But it would be really great to
elaborate some more why this is really needed. Cannot you simply
aggregate stats for kswapd from existing tracepoints?
Hello Michal,

Thank you for the quick review!

On Wed, Nov 22, 2023 at 11:23:24AM +0100, Michal Hocko wrote:
> On Wed 22-11-23 13:01:56, Dmitry Rokosov wrote:
> > The shrink_memcg flow plays a crucial role in memcg reclamation.
> > Currently, it is not possible to trace this point from non-direct
> > reclaim paths.
>
> Is this really true? AFAICS we have
> 	mm_vmscan_lru_isolate
> 	mm_vmscan_lru_shrink_active
> 	mm_vmscan_lru_shrink_inactive
> which are in the very core of the memory reclaim. Sure post processing
> those is some work.

Sure, you are absolutely right. In the usual scenario, the memcg
shrinker utilizes two sub-shrinkers: slab and LRU. We can enable the
tracepoints you mentioned and analyze them. However, there is one
potential issue. Enabling these tracepoints will trigger the reclaim
events for all pages. Although we can filter them per pid, we cannot
filter them per cgroup. Nevertheless, there are times when it would be
extremely beneficial to comprehend the effectiveness of the reclaim
process within the relevant cgroup. For this reason, I am adding the
cgroup name to the memcg tracepoints and implementing a cumulative
tracepoint for memcg shrink (LRU + slab).

> [...]
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 45780952f4b5..6d89b39d9a91 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -6461,6 +6461,12 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
> > 	 */
> > 	cond_resched();
> >
> > +#ifdef CONFIG_MEMCG
> > +	trace_mm_vmscan_memcg_shrink_begin(sc->order,
> > +					   sc->gfp_mask,
> > +					   memcg);
> > +#endif
>
> this is a common code path for node and direct reclaim which means that
> we will have multiple begin/end tracepoints covering similar operations.
> To me that sounds excessive. If you are missing a cumulative kswapd
> alternative to
> 	mm_vmscan_direct_reclaim_begin
> 	mm_vmscan_direct_reclaim_end
> 	mm_vmscan_memcg_reclaim_begin
> 	mm_vmscan_memcg_reclaim_end
> 	mm_vmscan_memcg_softlimit_reclaim_begin
> 	mm_vmscan_memcg_softlimit_reclaim_end
> 	mm_vmscan_node_reclaim_begin
> 	mm_vmscan_node_reclaim_end
> then place it into kswapd path. But it would be really great to
> elaborate some more why this is really needed. Cannot you simply
> aggregate stats for kswapd from existing tracepoints?
>
> --
> Michal Hocko
> SUSE Labs
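To make the cumulative "LRU + slab" point concrete, below is a condensed
sketch of the per-memcg loop in shrink_node_memcgs() with the proposed pair in
place. It is simplified from the hunks at the bottom of this page; the
protection checks, proactive-reclaim accounting and #ifdef guards are left
out, so it is an illustration rather than the patched function itself.

/* Condensed sketch of the per-memcg loop; details of the real function elided. */
static void shrink_node_memcgs_sketch(pg_data_t *pgdat, struct scan_control *sc)
{
	struct mem_cgroup *target_memcg = sc->target_mem_cgroup;
	struct mem_cgroup *memcg;

	memcg = mem_cgroup_iter(target_memcg, NULL, NULL);
	do {
		unsigned long reclaimed = sc->nr_reclaimed;

		/* one event pair per cgroup, regardless of reclaim context */
		trace_mm_vmscan_memcg_shrink_begin(sc->order, sc->gfp_mask, memcg);

		/* sub-shrinker #1: LRU pages of this cgroup on this node */
		shrink_lruvec(mem_cgroup_lruvec(memcg, pgdat), sc);

		/* sub-shrinker #2: slab objects charged to this cgroup */
		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg, sc->priority);

		trace_mm_vmscan_memcg_shrink_end(sc->nr_reclaimed - reclaimed,
						 memcg);
	} while ((memcg = mem_cgroup_iter(target_memcg, memcg, NULL)));
}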
On Wed 22-11-23 13:58:36, Dmitry Rokosov wrote:
> Hello Michal,
>
> Thank you for the quick review!
>
> On Wed, Nov 22, 2023 at 11:23:24AM +0100, Michal Hocko wrote:
> > On Wed 22-11-23 13:01:56, Dmitry Rokosov wrote:
> > > The shrink_memcg flow plays a crucial role in memcg reclamation.
> > > Currently, it is not possible to trace this point from non-direct
> > > reclaim paths.
> >
> > Is this really true? AFAICS we have
> > 	mm_vmscan_lru_isolate
> > 	mm_vmscan_lru_shrink_active
> > 	mm_vmscan_lru_shrink_inactive
> > which are in the very core of the memory reclaim. Sure post processing
> > those is some work.
>
> Sure, you are absolutely right. In the usual scenario, the memcg
> shrinker utilizes two sub-shrinkers: slab and LRU. We can enable the
> tracepoints you mentioned and analyze them. However, there is one
> potential issue. Enabling these tracepoints will trigger the reclaim
> events for all pages. Although we can filter them per pid, we cannot
> filter them per cgroup. Nevertheless, there are times when it would be
> extremely beneficial to comprehend the effectiveness of the reclaim
> process within the relevant cgroup. For this reason, I am adding the
> cgroup name to the memcg tracepoints and implementing a cumulative
> tracepoint for memcg shrink (LRU + slab).

I can see how printing memcg in mm_vmscan_memcg_reclaim_begin makes it
easier to postprocess per memcg reclaim. But you could do that just by
adding that to mm_vmscan_memcg_reclaim_{begin, end}, no? Why exactly
does this matter for kswapd and other global reclaim contexts?
On Wed, Nov 22, 2023 at 02:24:59PM +0100, Michal Hocko wrote:
> On Wed 22-11-23 13:58:36, Dmitry Rokosov wrote:
> > [...]
>
> I can see how printing memcg in mm_vmscan_memcg_reclaim_begin makes it
> easier to postprocess per memcg reclaim. But you could do that just by
> adding that to mm_vmscan_memcg_reclaim_{begin, end}, no? Why exactly
> does this matter for kswapd and other global reclaim contexts?

From my point of view, kswapd and other non-direct reclaim paths are
important for memcg analysis because they also influence the memcg
reclaim statistics.

The tracepoint mm_vmscan_memcg_reclaim_{begin, end} is called from the
direct memcg reclaim flow, such as:
  - a direct write to the 'reclaim' node
  - changing 'max' and 'high' thresholds
  - raising the 'force_empty' mechanism
  - the charge path
  - etc.

However, it doesn't cover global reclaim contexts, so it doesn't provide
us with the full memcg reclaim statistics.
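For orientation, the comment block below sketches where the two tracepoint
families sit in the reclaim call paths. The chains are assumed from mm/vmscan.c
of that period and are simplified; they are not quoted from the patch.

/*
 * Assumed call-path sketch (simplified, not part of the posted series):
 *
 * Direct / limit-driven memcg reclaim -- covered by the existing pair:
 *   try_to_free_mem_cgroup_pages()
 *     trace_mm_vmscan_memcg_reclaim_begin(...)
 *     do_try_to_free_pages() -> shrink_node() -> shrink_node_memcgs()
 *     trace_mm_vmscan_memcg_reclaim_end(...)
 *
 * Global reclaim -- reaches the same per-memcg loop but bypasses that pair:
 *   kswapd() -> balance_pgdat() -> shrink_node() -> shrink_node_memcgs()
 *
 * The proposed mm_vmscan_memcg_shrink_{begin,end} pair lives inside
 * shrink_node_memcgs() itself, so both paths emit per-cgroup events.
 */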
Michal, Shakeel,

Sorry for pinging you here, but I don't quite understand your decision
on this patchset.

Is it a NAK or not? If it's not, should I consider redesigning
something? For instance, introducing stub functions to remove ifdefs
from shrink_node_memcgs().

Thank you for taking the time to look into this!

On Wed, Nov 22, 2023 at 09:57:27PM +0300, Dmitry Rokosov wrote:
> On Wed, Nov 22, 2023 at 02:24:59PM +0100, Michal Hocko wrote:
> > [...]
>
> From my point of view, kswapd and other non-direct reclaim paths are
> important for memcg analysis because they also influence the memcg
> reclaim statistics.
>
> The tracepoint mm_vmscan_memcg_reclaim_{begin, end} is called from the
> direct memcg reclaim flow, such as:
>   - a direct write to the 'reclaim' node
>   - changing 'max' and 'high' thresholds
>   - raising the 'force_empty' mechanism
>   - the charge path
>   - etc.
>
> However, it doesn't cover global reclaim contexts, so it doesn't provide
> us with the full memcg reclaim statistics.
>
> --
> Thank you,
> Dmitry
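One possible shape of the stub-function idea mentioned above is sketched
below. The helper names are invented for illustration and are not part of the
posted series; the point is only that with !CONFIG_MEMCG the wrappers compile
to nothing, so shrink_node_memcgs() could call them without inline #ifdef
blocks.

/*
 * Hypothetical helpers (mm/vmscan.c) for the "remove the ifdefs" idea.
 * Names and placement are assumptions, not part of the posted patch.
 */
#ifdef CONFIG_MEMCG
static inline void memcg_shrink_begin(struct scan_control *sc,
				      struct mem_cgroup *memcg)
{
	trace_mm_vmscan_memcg_shrink_begin(sc->order, sc->gfp_mask, memcg);
}

static inline void memcg_shrink_end(struct scan_control *sc,
				    unsigned long reclaimed,
				    struct mem_cgroup *memcg)
{
	trace_mm_vmscan_memcg_shrink_end(sc->nr_reclaimed - reclaimed, memcg);
}
#else
static inline void memcg_shrink_begin(struct scan_control *sc,
				      struct mem_cgroup *memcg)
{
}

static inline void memcg_shrink_end(struct scan_control *sc,
				    unsigned long reclaimed,
				    struct mem_cgroup *memcg)
{
}
#endif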
On Thu 23-11-23 14:26:29, Dmitry Rokosov wrote:
> Michal, Shakeel,
>
> Sorry for pinging you here, but I don't quite understand your decision
> on this patchset.
>
> Is it a NAK or not? If it's not, should I consider redesigning
> something? For instance, introducing stub functions to remove ifdefs
> from shrink_node_memcgs().
>
> Thank you for taking the time to look into this!

Sorry for a late reply. I have noticed you have posted a new version.
Let me have a look and comment there.
Michal,

On Mon, Nov 27, 2023 at 10:25:12AM +0100, Michal Hocko wrote:
> On Thu 23-11-23 14:26:29, Dmitry Rokosov wrote:
> > Michal, Shakeel,
> >
> > Sorry for pinging you here, but I don't quite understand your decision
> > on this patchset.
> >
> > Is it a NAK or not? If it's not, should I consider redesigning
> > something? For instance, introducing stub functions to remove ifdefs
> > from shrink_node_memcgs().
> >
> > Thank you for taking the time to look into this!
>
> Sorry for a late reply. I have noticed you have posted a new version.
> Let me have a look and comment there.

No problem! Thanks a lot for your time and attention! Let's continue in
the next version thread.
diff --git a/include/trace/events/vmscan.h b/include/trace/events/vmscan.h
index 9b49cd120ae9..cb39b4f0dca9 100644
--- a/include/trace/events/vmscan.h
+++ b/include/trace/events/vmscan.h
@@ -182,6 +182,13 @@ DEFINE_EVENT(mm_vmscan_memcg_reclaim_begin_template, mm_vmscan_memcg_softlimit_r
 	TP_ARGS(order, gfp_flags, memcg)
 );
 
+DEFINE_EVENT(mm_vmscan_memcg_reclaim_begin_template, mm_vmscan_memcg_shrink_begin,
+
+	TP_PROTO(int order, gfp_t gfp_flags, const struct mem_cgroup *memcg),
+
+	TP_ARGS(order, gfp_flags, memcg)
+);
+
 #endif /* CONFIG_MEMCG */
 
 DECLARE_EVENT_CLASS(mm_vmscan_direct_reclaim_end_template,
@@ -247,6 +254,13 @@ DEFINE_EVENT(mm_vmscan_memcg_reclaim_end_template, mm_vmscan_memcg_softlimit_rec
 	TP_ARGS(nr_reclaimed, memcg)
 );
 
+DEFINE_EVENT(mm_vmscan_memcg_reclaim_end_template, mm_vmscan_memcg_shrink_end,
+
+	TP_PROTO(unsigned long nr_reclaimed, const struct mem_cgroup *memcg),
+
+	TP_ARGS(nr_reclaimed, memcg)
+);
+
 #endif /* CONFIG_MEMCG */
 
 TRACE_EVENT(mm_shrink_slab_start,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 45780952f4b5..6d89b39d9a91 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6461,6 +6461,12 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		 */
 		cond_resched();
 
+#ifdef CONFIG_MEMCG
+		trace_mm_vmscan_memcg_shrink_begin(sc->order,
+						   sc->gfp_mask,
+						   memcg);
+#endif
+
 		mem_cgroup_calculate_protection(target_memcg, memcg);
 
 		if (mem_cgroup_below_min(target_memcg, memcg)) {
@@ -6491,6 +6497,11 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
 		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
 			    sc->priority);
 
+#ifdef CONFIG_MEMCG
+		trace_mm_vmscan_memcg_shrink_end(sc->nr_reclaimed - reclaimed,
+						 memcg);
+#endif
+
 		/* Record the group's reclaim efficiency */
 		if (!sc->proactive)
 			vmpressure(sc->gfp_mask, memcg, false,