Message ID | 20230608171256.17827-1-mkoutny@suse.com |
---|---|
State | New |
Headers |
From: Michal Koutný <mkoutny@suse.com> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: Andrew Morton <akpm@linux-foundation.org>, Christian Brauner <brauner@kernel.org>, "Liam R . Howlett" <Liam.Howlett@oracle.com>, Suren Baghdasaryan <surenb@google.com>, "Michael S . Tsirkin" <mst@redhat.com>, Mike Christie <michael.christie@oracle.com>, Andrei Vagin <avagin@gmail.com>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, Nicholas Piggin <npiggin@gmail.com>, Peter Zijlstra <peterz@infradead.org>, Shakeel Butt <shakeelb@google.com>, Adam Majer <amajer@suse.com>, Jan Kara <jack@suse.cz>, Michal Hocko <mhocko@kernel.org> Subject: [RFC PATCH] mm: Sync percpu mm RSS counters before querying Date: Thu, 8 Jun 2023 19:12:56 +0200 Message-Id: <20230608171256.17827-1-mkoutny@suse.com> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit |
Series |
[RFC] mm: Sync percpu mm RSS counters before querying
|
|
Commit Message
Michal Koutný
June 8, 2023, 5:12 p.m. UTC
An issue was observed with stats collected in struct rusage on ppc64le
with 64kB pages. The percpu counters use batching with
	percpu_counter_batch = max(32, nr*2) # in PAGE_SIZE
i.e. with larger pages but similar RSS consumption (bytes), there will be
fewer flushes and the error becomes more noticeable.

In this particular case (getting the consumption of an exited child), we
can request a flush of the percpu counters without worrying about
contention with updaters.

Fortunately, commit f1a7941243c1 ("mm: convert mm's rss stats into
percpu_counter") didn't eradicate all traces of SPLIT_RSS_COUNTING and
this mechanism already provided some synchronization points before
reading stats.
Therefore, use sync_mm_rss as the carrier for percpu counter refreshes and
drop the SPLIT_RSS_COUNTING macro for good.

Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
Reported-by: Adam Majer <amajer@suse.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
include/linux/mm.h | 6 ++----
kernel/fork.c | 4 ----
2 files changed, 2 insertions(+), 8 deletions(-)
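
To put rough numbers on the batching argument in the commit message, here is a small userspace sketch (not kernel code). It assumes that "nr" in the formula is the number of online CPUs and that each CPU can hold at most batch - 1 pages locally before they are folded into the shared count. The helper names model_batch and model_max_error_bytes are made up for this illustration.

```c
#include <assert.h>

/* Batch size as given in the commit message: max(32, nr * 2).
 * Assumption: nr is the number of online CPUs. */
static long model_batch(long nr_cpus)
{
	long b = nr_cpus * 2;

	assert(nr_cpus > 0);
	return b > 32 ? b : 32;
}

/* Upper bound, in bytes, on what a cheap (unsynced) read can miss.
 * Assumption: every CPU holds up to (batch - 1) pages that have not
 * yet been folded into the shared count. */
static long long model_max_error_bytes(long nr_cpus, long page_size)
{
	return (long long)nr_cpus * (model_batch(nr_cpus) - 1) * page_size;
}
```

With 8 CPUs the model gives a bound of 8 * 31 = 248 pages in both configurations, which is roughly 1 MB with 4kB pages but roughly 16 MB with 64kB pages. The bound in pages is the same; only the page size makes the error in bytes 16 times larger.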
Comments
On Thu, Jun 08, 2023 at 07:12:56PM +0200, Michal Koutný wrote:
> An issue was observed with stats collected in struct rusage on ppc64le
> with 64kB pages. The percpu counters use batching with
> 	percpu_counter_batch = max(32, nr*2) # in PAGE_SIZE
> i.e. with larger pages but similar RSS consumption (bytes), there will be
> fewer flushes and the error becomes more noticeable.
>
> In this particular case (getting the consumption of an exited child), we
> can request a flush of the percpu counters without worrying about
> contention with updaters.
>
> Fortunately, commit f1a7941243c1 ("mm: convert mm's rss stats into
> percpu_counter") didn't eradicate all traces of SPLIT_RSS_COUNTING and
> this mechanism already provided some synchronization points before
> reading stats.
> Therefore, use sync_mm_rss as the carrier for percpu counter refreshes and
> drop the SPLIT_RSS_COUNTING macro for good.
>
> Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
> Reported-by: Adam Majer <amajer@suse.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>

The patch seems reasonable to me. Are any of the callsites of sync_mm_rss
performance sensitive?

> ---
> include/linux/mm.h | 6 ++----
> kernel/fork.c | 4 ----
> 2 files changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 27ce77080c79..30cfde88d5b2 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2547,13 +2547,11 @@ static inline void setmax_mm_hiwater_rss(unsigned long *maxrss,
>  		*maxrss = hiwater_rss;
>  }
>
> -#if defined(SPLIT_RSS_COUNTING)
> -void sync_mm_rss(struct mm_struct *mm);
> -#else
>  static inline void sync_mm_rss(struct mm_struct *mm)
>  {
> +	for (int i = 0; i < NR_MM_COUNTERS; ++i)
> +		percpu_counter_sum(&mm->rss_stat[i]);
>  }
> -#endif
>
>  #ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
>  static inline int pte_special(pte_t pte)
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 81cba91f30bb..e030eb902e4b 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2412,10 +2412,6 @@ __latent_entropy struct task_struct *copy_process(
>  	p->io_uring = NULL;
>  #endif
>
> -#if defined(SPLIT_RSS_COUNTING)
> -	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
> -#endif
> -
>  	p->default_timer_slack_ns = current->timer_slack_ns;
>
>  #ifdef CONFIG_PSI
> --
> 2.40.1
>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..30cfde88d5b2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2547,13 +2547,11 @@ static inline void setmax_mm_hiwater_rss(unsigned long *maxrss,
 		*maxrss = hiwater_rss;
 }
 
-#if defined(SPLIT_RSS_COUNTING)
-void sync_mm_rss(struct mm_struct *mm);
-#else
 static inline void sync_mm_rss(struct mm_struct *mm)
 {
+	for (int i = 0; i < NR_MM_COUNTERS; ++i)
+		percpu_counter_sum(&mm->rss_stat[i]);
 }
-#endif
 
 #ifndef CONFIG_ARCH_HAS_PTE_SPECIAL
 static inline int pte_special(pte_t pte)
diff --git a/kernel/fork.c b/kernel/fork.c
index 81cba91f30bb..e030eb902e4b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2412,10 +2412,6 @@ __latent_entropy struct task_struct *copy_process(
 	p->io_uring = NULL;
 #endif
 
-#if defined(SPLIT_RSS_COUNTING)
-	memset(&p->rss_stat, 0, sizeof(p->rss_stat));
-#endif
-
 	p->default_timer_slack_ns = current->timer_slack_ns;
 
 #ifdef CONFIG_PSI
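
For readers unfamiliar with batched counters, the following toy model in plain userspace C may help show why a cheap read can lag behind an exact sum. It is only a sketch of the general technique, not the kernel's percpu_counter implementation: the struct and the functions model_add, model_read and model_sum are hypothetical names, and the fixed MODEL_CPUS value is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_CPUS 4	/* arbitrary size for this illustration */

struct model_counter {
	long count;		/* shared value, cheap to read */
	long local[MODEL_CPUS];	/* per-CPU deltas not yet folded in */
	long batch;		/* fold threshold */
};

/* Add to one CPU's local delta; fold into the shared count only once
 * the local delta reaches the batch threshold. */
static void model_add(struct model_counter *c, int cpu, long amount)
{
	long v;

	assert(cpu >= 0 && cpu < MODEL_CPUS);
	v = c->local[cpu] + amount;
	if (labs(v) >= c->batch) {
		c->count += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: looks at the shared count only, so it may be stale. */
static long model_read(const struct model_counter *c)
{
	return c->count;
}

/* Exact value: shared count plus every CPU's pending delta. */
static long model_sum(const struct model_counter *c)
{
	long ret = c->count;

	for (int i = 0; i < MODEL_CPUS; i++)
		ret += c->local[i];
	return ret;
}
```

In this model, adding 31 on each of the four CPUs with batch = 32 leaves model_read() at 0 while model_sum() reports 124. The gap disappears only for the CPUs whose local delta has crossed the threshold and been folded.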