Message ID | 20231129032154.3710765-1-yosryahmed@google.com |
---|---|
Headers |
Date: Wed, 29 Nov 2023 03:21:48 +0000
Message-ID: <20231129032154.3710765-1-yosryahmed@google.com>
Subject: [mm-unstable v4 0/5] mm: memcg: subtree stats flushing and thresholds
From: Yosry Ahmed <yosryahmed@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>, Michal Hocko <mhocko@kernel.org>, Roman Gushchin <roman.gushchin@linux.dev>, Shakeel Butt <shakeelb@google.com>, Muchun Song <muchun.song@linux.dev>, Ivan Babrou <ivan@cloudflare.com>, Tejun Heo <tj@kernel.org>, Michal Koutný <mkoutny@suse.com>, Waiman Long <longman@redhat.com>, kernel-team@cloudflare.com, Wei Xu <weixugc@google.com>, Greg Thelen <gthelen@google.com>, Domenico Cerasuolo <cerasuolodomenico@gmail.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed <yosryahmed@google.com>
X-Mailer: git-send-email 2.43.0.rc1.413.gea7ed67945-goog
List-ID: <linux-kernel.vger.kernel.org> |
Series |
mm: memcg: subtree stats flushing and thresholds
|
|
Message
Yosry Ahmed
Nov. 29, 2023, 3:21 a.m. UTC
This series attempts to address shortcomings in today's approach to memcg stats flushing, namely occasionally stale or expensive stat reads. The series does so by changing the threshold that we use to decide whether to trigger a flush to be per memcg instead of global (patch 3), and then changing flushing to be per memcg (i.e. subtree flushes) instead of global (patch 5).

Patches 3 & 5 are the core of the series, and they include more details and testing results. The rest are either cleanups or prep work.

This series replaces the "memcg: more sophisticated stats flushing" series [1], which itself replaced another series, in a long list of attempts to improve memcg stats flushing. It is not a new version of the same patchset, as it takes a completely different approach. It is based on feedback collected from discussions on lkml in all previous attempts. Hopefully, this is the final attempt.

There was a reported regression in v2 [2] for the will-it-scale::fallocate benchmark. I believe this regression should not affect production workloads. This specific benchmark allocates and frees memory (using fallocate/ftruncate) at a rate far too fast to make actual use of the memory. Testing this series on 100+ machines running production workloads did not show any practical regressions in page fault latency or allocation latency, but it showed great improvements in stats read time. I do not have numbers about the exact improvements for this series, but combined with another optimization for cgroup v1 [3] we see 5-10x improvements. A significant chunk of that is coming from the cgroup v1 optimization, but this series also made an improvement as reported by Domenico [4].

v3 -> v4:
- Rebased on top of mm-unstable + "workload-specific and memory pressure-driven zswap writeback" series to fix conflicts [5].

v3: https://lore.kernel.org/all/20231116022411.2250072-1-yosryahmed@google.com/

[1] https://lore.kernel.org/lkml/20230913073846.1528938-1-yosryahmed@google.com/
[2] https://lore.kernel.org/lkml/202310202303.c68e7639-oliver.sang@intel.com/
[3] https://lore.kernel.org/lkml/20230803185046.1385770-1-yosryahmed@google.com/
[4] https://lore.kernel.org/lkml/CAFYChMv_kv_KXOMRkrmTN-7MrfgBHMcK3YXv0dPYEL7nK77e2A@mail.gmail.com/
[5] https://lore.kernel.org/all/20231127234600.2971029-1-nphamcs@gmail.com/

Yosry Ahmed (5):
  mm: memcg: change flush_next_time to flush_last_time
  mm: memcg: move vmstats structs definition above flushing code
  mm: memcg: make stats flushing threshold per-memcg
  mm: workingset: move the stats flush into workingset_test_recent()
  mm: memcg: restore subtree stats flushing

 include/linux/memcontrol.h |   8 +-
 mm/memcontrol.c            | 272 +++++++++++++++++++++----------------
 mm/vmscan.c                |   2 +-
 mm/workingset.c            |  42 ++++--
 4 files changed, 188 insertions(+), 136 deletions(-)
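
For readers unfamiliar with the idea, the two core changes (a per-memcg flush threshold and subtree-only flushing) can be illustrated with a toy userspace model. This is only a sketch under stated assumptions and is not the kernel implementation: the `Group` class, `FLUSH_THRESHOLD` and its value, and the functions `record_update`, `flush_subtree` and `read_stat` are all hypothetical names invented for this illustration, and the model ignores per-CPU counters, locking and periodic flushing entirely.

```python
# Toy model only: every name and the threshold value are hypothetical.
FLUSH_THRESHOLD = 64  # assumed per-group limit on unflushed update magnitude


class Group:
    """One node in a cgroup-like hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.pending = 0   # delta not yet folded into this group's stat
        self.stat = 0      # flushed value, covering this group's subtree
        self.updates = 0   # magnitude of unflushed updates in this subtree
        if parent is not None:
            parent.children.append(self)


def record_update(group, delta):
    """Queue a stat delta and charge its magnitude to every ancestor."""
    group.pending += delta
    node = group
    while node is not None:
        node.updates += abs(delta)
        node = node.parent


def flush_subtree(group):
    """Fold pending deltas of the subtree into its stats; return the total."""
    delta = group.pending
    group.pending = 0
    for child in group.children:
        delta += flush_subtree(child)
    group.stat += delta
    group.updates = 0
    return delta


def read_stat(group):
    """Flush only this group's subtree, and only if its own threshold is hit."""
    if group.updates > FLUSH_THRESHOLD:
        delta = flush_subtree(group)
        if group.parent is not None:
            # Hand the flushed delta upward so ancestors see it on their flush.
            group.parent.pending += delta
    return group.stat
```

In this model, a reader of a quiet group never pays for updates in a busy sibling, because both the decision to flush and the flush itself are scoped to the subtree being read. The cost is that a read below the threshold may return a slightly stale value.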
Comments
On Wed, Nov 29, 2023 at 03:21:48AM +0000, Yosry Ahmed wrote:
> [...]
> 4 files changed, 188 insertions(+), 136 deletions(-)
>

No regressions when booting the kernel with this series applied.

Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>