Message ID | 20230101061555.278129-1-joel@joelfernandes.org |
---|---|
State | New |
Headers |
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
To: linux-kernel@vger.kernel.org
Cc: Paul McKenney <paulmck@kernel.org>, Frederic Weisbecker <fweisbec@gmail.com>, Zhouyi Zhou <zhouzhouyi@gmail.com>, stable@vger.kernel.org, Davidlohr Bueso <dave@stgolabs.net>, Josh Triplett <josh@joshtriplett.org>
Subject: [PATCH] torture: Fix hang during kthread shutdown phase
Date: Sun, 1 Jan 2023 06:15:55 +0000
Message-Id: <20230101061555.278129-1-joel@joelfernandes.org> |
Series |
torture: Fix hang during kthread shutdown phase
Commit Message
Joel Fernandes
Jan. 1, 2023, 6:15 a.m. UTC
During shutdown of rcutorture, the shutdown thread in
rcu_torture_cleanup() calls torture_cleanup_begin(), which sets fullstop
to FULLSTOP_RMMOD. This is enough to cause the rcutorture reader and
fakewriter threads to break out of their main while loop and start
shutting down.
Once out of their main loop, they call torture_kthread_stopping(),
which in turn waits for kthread_stop() to be called. However,
rcu_torture_cleanup() has not yet called kthread_stop() on those
threads; it does that a bit later. Before it gets a chance to do so,
torture_kthread_stopping() calls schedule_timeout_uninterruptible(1)
in a tight loop. Tracing confirmed that this makes the timer softirq
execute timer callbacks constantly, never returning to the softirq
exit path, so it is essentially "locked up" because of that. If the
softirq preempts the shutdown thread, kthread_stop() may never be
called.
This commit improves the situation dramatically by increasing the
timeout passed to schedule_timeout_uninterruptible() to 1/20th of a
second. This keeps the timer softirq from locking up a CPU, and
everything works fine. Testing has shown 100 runs of TREE07 passing
reliably, which was not the case before because of RCU stalls.
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: <stable@vger.kernel.org> # 6.0.x
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
kernel/torture.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Sun, Jan 1, 2023 at 1:16 AM Joel Fernandes (Google) <joel@joelfernandes.org> wrote:
> [commit message quoted in full]
> Cc: <stable@vger.kernel.org> # 6.0.x

Question for stable maintainers: this patch is for mainline and 6.0 stable. However, it should also go to 6.1 stable. How do we tag it to do that? I did not know how to tag two stable versions. I guess the above implies > 6.0?

Thanks,
- Joel
On Sun, Jan 01, 2023 at 01:20:01AM -0500, Joel Fernandes wrote:
> Question for stable maintainers: this patch is for mainline and 6.0
> stable. However, it should also go to 6.1 stable. How do we tag it to
> do that? I did not know how to tag two stable versions. I guess the
> above implies > 6.0?

The above already implies 6.0 and newer, which includes 6.1, so all is good.
On Sun, Jan 1, 2023 at 2:16 PM Joel Fernandes (Google) <joel@joelfernandes.org> wrote:
> [commit message quoted in full]

On my Dell PowerEdge R720 with two Intel(R) Xeon(R) CPU E5-2660 and 128G of memory:
1) Before this patch: 3 of 80 rounds failed with "rcu: INFO: rcu_sched detected stalls on CPUs/tasks" [1]
2) After this patch: all 80 rounds passed.

Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>

Thanks
Zhouyi

[1] http://154.220.3.115/logs/20230101/console.log
> On Jan 1, 2023, at 3:48 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> The above already implies 6.0 and newer, which includes 6.1, so all is good.

Thanks a lot for the clarification,
- Joel
> On Jan 1, 2023, at 8:02 AM, Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> [test report quoted: 3 of 80 rounds failed before the patch, all 80 passed after]
> Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>

Thanks! Glad to see your tests look good now.

- Joel
On Sun, 01 Jan 2023, Joel Fernandes (Google) wrote:
> [commit message quoted in full]

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
On Mon, Jan 02, 2023 at 08:43:10AM -0800, Davidlohr Bueso wrote:
> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>

Queued for further review and testing, thank you all!

One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each timer handler must have consumed many tens of microseconds in order to keep the system busy, which seems a bit longer than it should be. Or am I underestimating the number of tasks involved?

Thanx, Paul
On Tue, Jan 03, 2023 at 10:04:04AM -0800, Paul E. McKenney wrote:
> One thing still puzzles me. Assuming HZ=1000 and given 16 CPUs, each
> timer handler must have consumed many tens of microseconds in order
> to keep the system busy, which seems a bit longer than it should be.
> Or am I underestimating the number of tasks involved?

Here are the traces between successive calls to process_timeout(), which is the timer callback handler:

[ 1320.444210] <idle>-0 0dNs.. 314229620us : __run_timers: Calling timerfn 5: process_timeout
[ 1320.444215] <idle>-0 0dNs.. 314229620us : sched_waking: comm=rcu_torture_fak pid=145 prio=139 target_cpu=008
[ 1320.463393] <idle>-0 7d.... 314229655us : sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_torture_wri next_pid=144 next_prio=120
[ 1320.478870] rcu_tort-144 7d.... 314229658us : sched_switch: prev_comm=rcu_torture_wri prev_pid=144 prev_prio=120 prev_state=D ==> next_comm=swapper/7 next_pid=0 next_prio=120
[ 1320.494324] <idle>-0 0dNs.. 314229738us : __run_timers: Calling timerfn 6: process_timeout

The time delta in the above occurrence is 118 microseconds between 2 timer callbacks. It does appear to be doing a cross-CPU wake-up. Maybe that adds to the long time?

Here are the full logs with traces (in case it helps, search for "=D" for the D-state sched_switch event before the "panic now" trace happens):
http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.0.y/26/artifact/tools/testing/selftests/rcutorture/res/2022.12.31-23.04.42/TREE07.2/console.log

thanks,
- Joel
On Tue, Jan 03, 2023 at 06:24:23PM +0000, Joel Fernandes wrote:
> The time delta in the above occurrence is 118 microseconds between 2
> timer callbacks. It does appear to be doing a cross-CPU wake-up. Maybe
> that adds to the long time?

118 microseconds would do it!

Still seems excessive to me, but what do I know?

Thanx, Paul
On Tue, Jan 3, 2023 at 7:00 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> 118 microseconds would do it!
>
> Still seems excessive to me, but what do I know?

True, and my machine is a bit slow, so maybe it took more CPU cycles than usual. Still, if it happened to me and Zhouyi, it could happen to anyone ;-)

thanks,
- Joel
diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfe..d024f3b7181f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ/20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);