Message ID: 20230922175741.635002-3-yosryahmed@google.com
State: New
Series: mm: memcg: fix tracking of pending stats updates values
Commit Message
Yosry Ahmed
Sept. 22, 2023, 5:57 p.m. UTC
memcg_rstat_updated() uses the value of the state update to keep track
of the magnitude of pending updates, so that we only do a stats flush
when it's worth the work. Most values passed into memcg_rstat_updated()
are in pages; however, a few of them are actually in bytes or KBs.
To put this into perspective, a 512 byte slab allocation today would
look the same as allocating 512 pages. This may result in premature
flushes, which means unnecessary work and latency.
Normalize all the state values passed into memcg_rstat_updated() to
pages. Round up non-zero sub-page updates to 1 page, because
memcg_rstat_updated() ignores 0 page updates.
Fixes: 5b3be698a872 ("memcg: better bounds on the memcg stats updates")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
mm/memcontrol.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
Comments
On Fri, Sep 22, 2023 at 05:57:40PM +0000, Yosry Ahmed wrote:
> memcg_rstat_updated() uses the value of the state update to keep track
> of the magnitude of pending updates, so that we only do a stats flush
> when it's worth the work. Most values passed into memcg_rstat_updated()
> are in pages, however, a few of them are actually in bytes or KBs.
>
> To put this into perspective, a 512 byte slab allocation today would
> look the same as allocating 512 pages. This may result in premature
> flushes, which means unnecessary work and latency.

Yikes.

I'm somewhat less concerned about the performance as I am about the
variance in flushing cost that could be quite difficult to pinpoint.
IMO this is a correctness fix and a code cleanup, not a performance
thing.

> Normalize all the state values passed into memcg_rstat_updated() to
> pages. Round up non-zero sub-page to 1 page, because
> memcg_rstat_updated() ignores 0 page updates.
>
> Fixes: 5b3be698a872 ("memcg: better bounds on the memcg stats updates")
> Signed-off-by: Yosry Ahmed <yosryahmed@google.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>
On Tue, Oct 3, 2023 at 6:13 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Fri, Sep 22, 2023 at 05:57:40PM +0000, Yosry Ahmed wrote:
> > memcg_rstat_updated() uses the value of the state update to keep track
> > of the magnitude of pending updates, so that we only do a stats flush
> > when it's worth the work. Most values passed into memcg_rstat_updated()
> > are in pages, however, a few of them are actually in bytes or KBs.
> >
> > To put this into perspective, a 512 byte slab allocation today would
> > look the same as allocating 512 pages. This may result in premature
> > flushes, which means unnecessary work and latency.
>
> Yikes.
>
> I'm somewhat less concerned about the performance as I am about the
> variance in flushing cost that could be quite difficult to pinpoint.
> IMO this is a correctness fix and a code cleanup, not a performance
> thing.

Agreed, the code right now has a subtle mistake.

> > Normalize all the state values passed into memcg_rstat_updated() to
> > pages. Round up non-zero sub-page to 1 page, because
> > memcg_rstat_updated() ignores 0 page updates.
> >
> > Fixes: 5b3be698a872 ("memcg: better bounds on the memcg stats updates")
> > Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks for taking a look!
On Fri, Sep 22, 2023 at 05:57:40PM +0000, Yosry Ahmed <yosryahmed@google.com> wrote:
> memcg_rstat_updated() uses the value of the state update to keep track
> of the magnitude of pending updates, so that we only do a stats flush
> when it's worth the work. Most values passed into memcg_rstat_updated()
> are in pages, however, a few of them are actually in bytes or KBs.
>
> To put this into perspective, a 512 byte slab allocation today would
> look the same as allocating 512 pages. This may result in premature
> flushes, which means unnecessary work and latency.
>
> Normalize all the state values passed into memcg_rstat_updated() to
> pages.

I've dreamed about such normalization since error estimates were
introduced :-)

(As touched in the previous patch) it makes me wonder whether it makes
sense to add up state and event counters (apples and oranges).

Shouldn't with this approach events: a) have a separate counter, b)
weight with zero and rely on time-based flushing only?

Thanks,
Michal
On Tue, Oct 3, 2023 at 11:22 AM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Fri, Sep 22, 2023 at 05:57:40PM +0000, Yosry Ahmed <yosryahmed@google.com> wrote:
> > memcg_rstat_updated() uses the value of the state update to keep track
> > of the magnitude of pending updates, so that we only do a stats flush
> > when it's worth the work. Most values passed into memcg_rstat_updated()
> > are in pages, however, a few of them are actually in bytes or KBs.
> >
> > To put this into perspective, a 512 byte slab allocation today would
> > look the same as allocating 512 pages. This may result in premature
> > flushes, which means unnecessary work and latency.
> >
> > Normalize all the state values passed into memcg_rstat_updated() to
> > pages.
>
> I've dreamed about such normalization since error estimates were
> introduced :-)
>
> (As touched in the previous patch) it makes me wonder whether it makes
> sense to add up state and event counters (apples and oranges).

I conceptually agree that we are adding apples and oranges, but in
practice if you look at memcg_vm_event_stat, most stat updates
correspond to 1 page worth of something. I am guessing the implicit
assumption here is that event updates correspond roughly to page-sized
state updates. It's not perfect, but perhaps it's acceptable.

> Shouldn't with this approach events: a) have a separate counter, b)
> weight with zero and rely on time-based flushing only?

(a) If we have separate counters we need to eventually sum them to
figure out if the total magnitude of pending updates is worth flushing
anyway, right?

(b) I would be more nervous to rely on time-based flushing only as I
don't know how this can affect workloads.

> Thanks,
> Michal
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 308cc7353ef0..d1a322a75172 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -763,6 +763,22 @@ unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx)
 	return x;
 }
 
+static int memcg_page_state_unit(int item);
+
+/*
+ * Normalize the value passed into memcg_rstat_updated() to be in pages. Round
+ * up non-zero sub-page updates to 1 page as zero page updates are ignored.
+ */
+static int memcg_state_val_in_pages(int idx, int val)
+{
+	int unit = memcg_page_state_unit(idx);
+
+	if (!val || unit == PAGE_SIZE)
+		return val;
+	else
+		return max(val * unit / PAGE_SIZE, 1UL);
+}
+
 /**
  * __mod_memcg_state - update cgroup memory statistics
  * @memcg: the memory cgroup
@@ -775,7 +791,7 @@ void __mod_memcg_state(struct mem_cgroup *memcg, int idx, int val)
 		return;
 
 	__this_cpu_add(memcg->vmstats_percpu->state[idx], val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 }
 
 /* idx can be of type enum memcg_stat_item or node_stat_item. */
@@ -826,7 +842,7 @@ void __mod_memcg_lruvec_state(struct lruvec *lruvec, enum node_stat_item idx,
 
 	/* Update lruvec */
 	__this_cpu_add(pn->lruvec_stats_percpu->state[idx], val);
-	memcg_rstat_updated(memcg, val);
+	memcg_rstat_updated(memcg, memcg_state_val_in_pages(idx, val));
 	memcg_stats_unlock();
 }