Message ID | 20230801135632.1768830-1-hannes@cmpxchg.org |
---|---|
State | New |
Headers | From: Johannes Weiner <hannes@cmpxchg.org>; To: Andrew Morton <akpm@linux-foundation.org>; Cc: Roman Gushchin <roman.gushchin@linux.dev>, Michal Hocko <mhocko@suse.com>, "Paul E. McKenney" <paulmck@kernel.org>, linux-kselftest@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org; Subject: [PATCH] selftests: cgroup: fix test_kmem_basic false positives; Date: Tue, 1 Aug 2023 09:56:32 -0400; Message-ID: <20230801135632.1768830-1-hannes@cmpxchg.org>; X-Mailer: git-send-email 2.41.0 |
Series | selftests: cgroup: fix test_kmem_basic false positives |
Commit Message
Johannes Weiner
Aug. 1, 2023, 1:56 p.m. UTC
This test fails routinely in our prod testing environment, and I can
reproduce it locally as well.

The test allocates dcache inside a cgroup, then drops the memory limit
and checks that usage drops correspondingly. The reason it fails is
because dentries are freed with an RCU delay - a debugging sleep shows
that usage drops as expected shortly after.

Insert a 1s sleep after dropping the limit. This should be good
enough, assuming that machines running those tests are otherwise not
very busy.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
tools/testing/selftests/cgroup/test_kmem.c | 4 ++++
1 file changed, 4 insertions(+)
Comments
On Tue, Aug 01, 2023 at 09:56:32AM -0400, Johannes Weiner wrote:
> This test fails routinely in our prod testing environment, and I can
> reproduce it locally as well.
>
> The test allocates dcache inside a cgroup, then drops the memory limit
> and checks that usage drops correspondingly. The reason it fails is
> because dentries are freed with an RCU delay - a debugging sleep shows
> that usage drops as expected shortly after.
>
> Insert a 1s sleep after dropping the limit. This should be good
> enough, assuming that machines running those tests are otherwise not
> very busy.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

I am putting together something more formal, but this will certainly
improve things, as Johannes says, assuming the system goes mostly
idle during that one-second wait. So:

Acked-by: Paul E. McKenney <paulmck@kernel.org>

Yes, there are corner cases, such as the system having millions of
RCU callbacks queued and being unable to invoke them all during that
one-second interval. But that is a corner case, and that is exactly
why I will be putting together something more formal. ;-)

Thanx, Paul

> ---
>  tools/testing/selftests/cgroup/test_kmem.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
> index 258ddc565deb..1b2cec9d18a4 100644
> --- a/tools/testing/selftests/cgroup/test_kmem.c
> +++ b/tools/testing/selftests/cgroup/test_kmem.c
> @@ -70,6 +70,10 @@ static int test_kmem_basic(const char *root)
>  		goto cleanup;
>
>  	cg_write(cg, "memory.high", "1M");
> +
> +	/* wait for RCU freeing */
> +	sleep(1);
> +
>  	slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
>  	if (slab1 <= 0)
>  		goto cleanup;
> --
> 2.41.0
>
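The corner case described above (a fixed one-second wait that may be too short on a busy machine) can be illustrated with a bounded polling loop. Neither the patch nor the thread proposes this; it is only a sketch of an alternative technique, and every name in it (`read_fn`, `wait_for_drop`, `read_and_halve`) is hypothetical rather than part of the cgroup selftest helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback type: returns the current value of the metric being watched. */
typedef long (*read_fn)(void *ctx);

/*
 * Call read_value() up to max_tries times, sleeping interval_ms between
 * attempts, and stop as soon as it returns a value <= limit.
 * Returns true if the value reached the limit within the allowed tries.
 */
static bool wait_for_drop(read_fn read_value, void *ctx, long limit,
			  int max_tries, long interval_ms)
{
	struct timespec ts = {
		.tv_sec = interval_ms / 1000,
		.tv_nsec = (interval_ms % 1000) * 1000000L,
	};

	for (int i = 0; i < max_tries; i++) {
		if (read_value(ctx) <= limit)
			return true;
		nanosleep(&ts, NULL);
	}
	return false;
}

/* Demo reader for illustration only: returns the counter, then halves it. */
static long read_and_halve(void *ctx)
{
	long *v = ctx;
	long cur = *v;

	*v /= 2;
	return cur;
}
```

In a real test the callback would presumably wrap a read of the "slab " key from memory.stat. The loop bounds the total wait instead of assuming one second is enough, but it is still a heuristic and does not guarantee that all RCU callbacks have run.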
On Tue, Aug 01, 2023 at 09:39:28AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 01, 2023 at 09:56:32AM -0400, Johannes Weiner wrote:
> > This test fails routinely in our prod testing environment, and I can
> > reproduce it locally as well.
> >
> > The test allocates dcache inside a cgroup, then drops the memory limit
> > and checks that usage drops correspondingly. The reason it fails is
> > because dentries are freed with an RCU delay - a debugging sleep shows
> > that usage drops as expected shortly after.
> >
> > Insert a 1s sleep after dropping the limit. This should be good
> > enough, assuming that machines running those tests are otherwise not
> > very busy.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>
> I am putting together something more formal, but this will certainly
> improve things, as Johannes says, assuming the system goes mostly
> idle during that one-second wait. So:
>
> Acked-by: Paul E. McKenney <paulmck@kernel.org>
>
> Yes, there are corner cases, such as the system having millions of
> RCU callbacks queued and being unable to invoke them all during that
> one-second interval. But that is a corner case, and that is exactly
> why I will be putting together something more formal. ;-)
>
> Thanx, Paul
>
> > ---
> >  tools/testing/selftests/cgroup/test_kmem.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
> > index 258ddc565deb..1b2cec9d18a4 100644
> > --- a/tools/testing/selftests/cgroup/test_kmem.c
> > +++ b/tools/testing/selftests/cgroup/test_kmem.c
> > @@ -70,6 +70,10 @@ static int test_kmem_basic(const char *root)
> >  		goto cleanup;
> >
> >  	cg_write(cg, "memory.high", "1M");
> > +
> > +	/* wait for RCU freeing */
> > +	sleep(1);
> > +
> >  	slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
> >  	if (slab1 <= 0)
> >  		goto cleanup;
> > --
> > 2.41.0
> >

The same issue exists in the test case test_kmem_memcg_deletion. I
wouldn't mind posting the patch, but it seems you want to propose
something more formal. Let me know your opinion.

Thanks,
Lucas
On Thu, Aug 03, 2023 at 12:13:26PM -0400, Lucas Karpinski wrote:
> On Tue, Aug 01, 2023 at 09:39:28AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 01, 2023 at 09:56:32AM -0400, Johannes Weiner wrote:
> > > This test fails routinely in our prod testing environment, and I can
> > > reproduce it locally as well.
> > >
> > > The test allocates dcache inside a cgroup, then drops the memory limit
> > > and checks that usage drops correspondingly. The reason it fails is
> > > because dentries are freed with an RCU delay - a debugging sleep shows
> > > that usage drops as expected shortly after.
> > >
> > > Insert a 1s sleep after dropping the limit. This should be good
> > > enough, assuming that machines running those tests are otherwise not
> > > very busy.
> > >
> > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> >
> > I am putting together something more formal, but this will certainly
> > improve things, as Johannes says, assuming the system goes mostly
> > idle during that one-second wait. So:
> >
> > Acked-by: Paul E. McKenney <paulmck@kernel.org>
> >
> > Yes, there are corner cases, such as the system having millions of
> > RCU callbacks queued and being unable to invoke them all during that
> > one-second interval. But that is a corner case, and that is exactly
> > why I will be putting together something more formal. ;-)
> >
> > Thanx, Paul
> >
> > > ---
> > >  tools/testing/selftests/cgroup/test_kmem.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
> > > index 258ddc565deb..1b2cec9d18a4 100644
> > > --- a/tools/testing/selftests/cgroup/test_kmem.c
> > > +++ b/tools/testing/selftests/cgroup/test_kmem.c
> > > @@ -70,6 +70,10 @@ static int test_kmem_basic(const char *root)
> > >  		goto cleanup;
> > >
> > >  	cg_write(cg, "memory.high", "1M");
> > > +
> > > +	/* wait for RCU freeing */
> > > +	sleep(1);
> > > +
> > >  	slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
> > >  	if (slab1 <= 0)
> > >  		goto cleanup;
> > > --
> > > 2.41.0
> > >
>
> The same issue exists in the test case test_kmem_memcg_deletion. I
> wouldn't mind posting the patch, but it seems you want to propose
> something more formal. Let me know your opinion.

I am proposing a /sys/module/rcutree/parameters/do_rcu_barrier file.
Writing a "1" into this file results in an rcu_barrier() in the kernel,
but set up so that there is no more than a single rcu_barrier() call
per second. So you could do the following:

	run-a-test
	echo 1 > /sys/module/rcutree/parameters/do_rcu_barrier # As root
	# All RCU callbacks from run-a-test have now been invoked
	run-another-test

Please note that this handles only RCU, as in call_rcu(), and not SRCU,
Tasks RCU, and so on.

Thanx, Paul
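The `echo 1 > ...` step described above could also be issued from a C test program. The following is a minimal sketch under stated assumptions: the path is taken verbatim from the message above, where the file is only a proposal, so it may not exist on a given kernel; the helper names `write_one` and `request_rcu_barrier` are hypothetical and not part of the selftest helpers; and the write needs root.

```c
#include <fcntl.h>
#include <unistd.h>

/* Path as given in the thread; the file is a proposal and may be absent. */
#define DO_RCU_BARRIER_PATH "/sys/module/rcutree/parameters/do_rcu_barrier"

/*
 * Write the single character "1" to an existing file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_one(const char *path)
{
	int fd = open(path, O_WRONLY);
	ssize_t n;

	if (fd < 0)
		return -1;

	n = write(fd, "1", 1);
	close(fd);

	return n == 1 ? 0 : -1;
}

/* Hypothetical wrapper: ask the kernel for an rcu_barrier() via the proposed file. */
static int request_rcu_barrier(void)
{
	return write_one(DO_RCU_BARRIER_PATH);
}
```

A caller would likely treat a -1 return as "not supported here" and fall back to the fixed sleep from the patch. As noted in the message, this covers only call_rcu() callbacks, not SRCU or Tasks RCU.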
diff --git a/tools/testing/selftests/cgroup/test_kmem.c b/tools/testing/selftests/cgroup/test_kmem.c
index 258ddc565deb..1b2cec9d18a4 100644
--- a/tools/testing/selftests/cgroup/test_kmem.c
+++ b/tools/testing/selftests/cgroup/test_kmem.c
@@ -70,6 +70,10 @@ static int test_kmem_basic(const char *root)
 		goto cleanup;
 
 	cg_write(cg, "memory.high", "1M");
+
+	/* wait for RCU freeing */
+	sleep(1);
+
 	slab1 = cg_read_key_long(cg, "memory.stat", "slab ");
 	if (slab1 <= 0)
 		goto cleanup;