Message ID | 20221117031551.1142289-3-joel@joelfernandes.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp189794wrr; Wed, 16 Nov 2022 19:19:15 -0800 (PST) X-Google-Smtp-Source: AA0mqf7dbTKJ3DJQFZ7RU3Ddyn5ryAYHDMR9AJsj6pUAjfId3rCu5xmcqKXu8oo8zENIHqrZNG2M X-Received: by 2002:a17:903:c5:b0:188:bb79:4892 with SMTP id x5-20020a17090300c500b00188bb794892mr874326plc.60.1668655154903; Wed, 16 Nov 2022 19:19:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668655154; cv=none; d=google.com; s=arc-20160816; b=twBBVAbFC7mzG5P9U/EtujJNOn+spJX1i/gDVribCUJOziUAGQaQ2kBraoajMbRTHT HtTJQAMGeajQ3lSVom4qmS6v3alM/iNJKgRnqcBg17wSmXBLhqD/m/RO9JEiYZiT5MAH 802usiRiJoI1fPVE63DPgvpgPOvTdsL9yv3jlR85HQtrAkLcTmrv4AUTjFA9fClqwYQt N1lb3O0vnMkBkyNgZ7rSJeHFni3sHTR3MvMmbvPDIUqLhYS49URUJy/Q9F+B7JCoojqS C8S9O1jWBN/EOkmjjCm115DJ0/06tcEzjE2zwjRvirGVYxdgqlrY5dzeyN8Zmvns4tOa OwiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=X9LBEXDD+Ph4SF0Q02heQyIynGLkO+luTsSJyIxQLi8=; b=apa+VmOBcMaH1ngRByUqgIJ0PxUYCBnP6It61yZDliZe4fIdKRT2xUnt9F3hKxvwqu C1tUPopJFQV7NwrhJV54494qO/1r4eoXuAqIi6ridjBL2Gf7/yuL1D6fvWgjcPmDSWIJ tOW6Q5VfayFaVht1I1AcJHwdzrru7UX7dOrZ/ygkM+6MgA9xQN7txy5+tZDAfkWGyUhs kdWEdjdsHC1HjvA7ZfFkg0Y9j9AzbLFho1WJ96mizBvM48gma2RMKUyxkbPkhvYJogQS nUCk2rT78aCNZRzUZBjzqI//sJwzDH4mSTwghCMHfPdaOVT7oGoWRcDcb6rK85rdh5kJ XhuQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=JHB8aiQg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t10-20020a63f34a000000b00476cd35390esi1352874pgj.238.2022.11.16.19.19.01; Wed, 16 Nov 2022 19:19:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@joelfernandes.org header.s=google header.b=JHB8aiQg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234040AbiKQDQR (ORCPT <rfc822;just.gull.subs@gmail.com> + 99 others); Wed, 16 Nov 2022 22:16:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234364AbiKQDQH (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 16 Nov 2022 22:16:07 -0500 Received: from mail-qt1-x831.google.com (mail-qt1-x831.google.com [IPv6:2607:f8b0:4864:20::831]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 950C564A31 for <linux-kernel@vger.kernel.org>; Wed, 16 Nov 2022 19:16:06 -0800 (PST) Received: by mail-qt1-x831.google.com with SMTP id w9so359114qtv.13 for <linux-kernel@vger.kernel.org>; Wed, 16 Nov 2022 19:16:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=X9LBEXDD+Ph4SF0Q02heQyIynGLkO+luTsSJyIxQLi8=; b=JHB8aiQgqZr6T2488gjy0+mFOhHh2VyJvRopV5+pPWMJXZhxAwmnceQy5QEv3N8tOk Ne1NmLfUHfz0FDXUP8hNcvKOK5FPq4UpnuFxw8/1IO6+cRNfmUgFz3ciXPbZVkiiHxIV xS+aaBeY1m0r9PJS6zCglk21/419o+83mDIcQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=X9LBEXDD+Ph4SF0Q02heQyIynGLkO+luTsSJyIxQLi8=; b=o2ac17Kkt0TI3hKWKjlTBQCYic08ZszD60V1t7Y691kMYA99K4VybgwqgytQ0Aalj7 D6FbaiIfjS2gYRwcjwXDbkUVNJnxtq/+gKtjwY8cSxCCc/QZhe8ThGfC4Ca0SJtui1PB srYlbm7hq7jp0lDcKLoJUqkC7Qc/1HV08UNrVGsqAK1AHMBN7T/+eobp8hdMgjhzjXHf eipkUW+Qs/CttOReKXROb+s+HMy8l3VIeL4rfJObTnfvl9JWE7l9SNlVWxGj1SCoiOg8 sq/dNeBO7AZiGo8J1GHFgXcdcO8tVPHH258LX8z45/cEyMLD8Sgd40yHdFQaB4guSwpF j+lA== X-Gm-Message-State: ANoB5pn4qDQwFy5l8viiclN9P0Ks7gPB+BRrgbmGa0yGbH5hYPZL67hD /4s/XZnt7n8lI5LQOeZdH7kVgp1KZWrGKg== X-Received: by 2002:a05:622a:4899:b0:3a5:64a0:5eba with SMTP id fc25-20020a05622a489900b003a564a05ebamr827383qtb.96.1668654965522; Wed, 16 Nov 2022 19:16:05 -0800 (PST) Received: from joelboxx.c.googlers.com.com (228.221.150.34.bc.googleusercontent.com. [34.150.221.228]) by smtp.gmail.com with ESMTPSA id k19-20020a05620a415300b006cdd0939ffbsm11398318qko.86.2022.11.16.19.16.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 16 Nov 2022 19:16:05 -0800 (PST) From: "Joel Fernandes (Google)" <joel@joelfernandes.org> To: linux-kernel@vger.kernel.org Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>, Cong Wang <xiyou.wangcong@gmail.com>, David Ahern <dsahern@kernel.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>, Jakub Kicinski <kuba@kernel.org>, Jamal Hadi Salim <jhs@mojatatu.com>, Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>, rcu@vger.kernel.org, rostedt@goodmis.org, paulmck@kernel.org, fweisbec@gmail.com Subject: [PATCH rcu/dev 3/3] net: Use call_rcu_flush() for dst_destroy_rcu Date: Thu, 17 Nov 2022 03:15:50 +0000 Message-Id: <20221117031551.1142289-3-joel@joelfernandes.org> X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog In-Reply-To: <20221117031551.1142289-1-joel@joelfernandes.org> References: <20221117031551.1142289-1-joel@joelfernandes.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749711747723525932?= X-GMAIL-MSGID: =?utf-8?q?1749711747723525932?= |
Series |
[rcu/dev,1/3] net: Use call_rcu_flush() for qdisc_free_cb
|
|
Commit Message
Joel Fernandes
Nov. 17, 2022, 3:15 a.m. UTC
In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
causes a networking test to fail in the teardown phase.
The failure happens during: ip netns del <name>
Using ftrace, I found the callbacks it was queuing which this series fixes. Use
call_rcu_flush() to revert to the old behavior. With that, the test passes.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
net/core/dst.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) <joel@joelfernandes.org> wrote: > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > causes a networking test to fail in the teardown phase. > > The failure happens during: ip netns del <name> And ? What happens then next ? > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > call_rcu_flush() to revert to the old behavior. With that, the test passes. What is this test about ? What barrier was used to make it not flaky ? Was it depending on some undocumented RCU behavior ? Maybe adding a sysctl to force the flush would be better for functional tests ? I would rather change the test(s), than adding call_rcu_flush(), adding merge conflicts to future backports.
Hello Eric, On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > <joel@joelfernandes.org> wrote: > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > causes a networking test to fail in the teardown phase. > > > > The failure happens during: ip netns del <name> > > And ? What happens then next ? The test is doing the 'ip netns del <name>' and then polling for the disappearance of a network interface name for upto 5 seconds. I believe it is using netlink to get a table of interfaces. That polling is timing out. Here is some more details from the test's owner (copy pasting from another bug report): In the cleanup, we remove the netns, and thus will cause the veth pair being removed automatically, so we use a poll to check that if the veth in the root netns still exists to know whether the cleanup is done. Here is a public link to the code that is failing (its in golang): https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 Here is a public link to the line of code in the actual test leading up to the above path (this is the test that is run: network.RoutingFallthrough.ipv4_only_primary) : https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > What is this test about ? What barrier was used to make it not flaky ? I provided the links above, let me know if you have any questions. > Was it depending on some undocumented RCU behavior ? This is a new RCU feature posted here for significant power-savings on battery-powered devices: https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a There is also an LPC presentation about the same, I can dig the link if you are interested. > Maybe adding a sysctl to force the flush would be better for functional tests ? > > I would rather change the test(s), than adding call_rcu_flush(), > adding merge conflicts to future backports. I am not too sure about that, I think a user might expect the network interface to disappear from the networking tables quickly enough without dealing with barriers or kernel iternals. However, I added the authors of the test to this email in the hopes he can provide is point of views as well. The general approach we are taking with this sort of thing is to use call_rcu_flush() which is basically the same as call_rcu() for systems with CALL_RCU_LAZY=n. You can see some examples of that in the patch series link above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only Android and ChromeOS that are using it. I am adding Jie to share any input, he is from the networking team and knows this test well. thanks, - Joel
On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > Hello Eric, > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > <joel@joelfernandes.org> wrote: > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > causes a networking test to fail in the teardown phase. > > > > > > The failure happens during: ip netns del <name> > > > > And ? What happens then next ? > > The test is doing the 'ip netns del <name>' and then polling for the > disappearance of a network interface name for upto 5 seconds. I believe it is > using netlink to get a table of interfaces. That polling is timing out. > > Here is some more details from the test's owner (copy pasting from another > bug report): > In the cleanup, we remove the netns, and thus will cause the veth pair being > removed automatically, so we use a poll to check that if the veth in the root > netns still exists to know whether the cleanup is done. > > Here is a public link to the code that is failing (its in golang): > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > Here is a public link to the line of code in the actual test leading up to the above > path (this is the test that is run: > network.RoutingFallthrough.ipv4_only_primary) : > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > What is this test about ? What barrier was used to make it not flaky ? > > I provided the links above, let me know if you have any questions. > > > Was it depending on some undocumented RCU behavior ? > > This is a new RCU feature posted here for significant power-savings on > battery-powered devices: > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > There is also an LPC presentation about the same, I can dig the link if you > are interested. > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > I would rather change the test(s), than adding call_rcu_flush(), > > adding merge conflicts to future backports. > > I am not too sure about that, I think a user might expect the network > interface to disappear from the networking tables quickly enough without > dealing with barriers or kernel iternals. However, I added the authors of the > test to this email in the hopes he can provide is point of views as well. > > The general approach we are taking with this sort of thing is to use > call_rcu_flush() which is basically the same as call_rcu() for systems with > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > Android and ChromeOS that are using it. I am adding Jie to share any input, > he is from the networking team and knows this test well. > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in For instance, only kfree_rcu() should use it. We can not review hundreds of call_rcu() call sites and decide if adding arbitrary delays cou hurt .
On Thu, Nov 17, 2022 at 9:17 AM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > Hello Eric, > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > <joel@joelfernandes.org> wrote: > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > causes a networking test to fail in the teardown phase. > > > > > > > > The failure happens during: ip netns del <name> > > > > > > And ? What happens then next ? > > > > The test is doing the 'ip netns del <name>' and then polling for the > > disappearance of a network interface name for upto 5 seconds. I believe it is > > using netlink to get a table of interfaces. That polling is timing out. > > > > Here is some more details from the test's owner (copy pasting from another > > bug report): > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > removed automatically, so we use a poll to check that if the veth in the root > > netns still exists to know whether the cleanup is done. > > > > Here is a public link to the code that is failing (its in golang): > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > Here is a public link to the line of code in the actual test leading up to the above > > path (this is the test that is run: > > network.RoutingFallthrough.ipv4_only_primary) : > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > I provided the links above, let me know if you have any questions. > > > > > Was it depending on some undocumented RCU behavior ? > > > > This is a new RCU feature posted here for significant power-savings on > > battery-powered devices: > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > There is also an LPC presentation about the same, I can dig the link if you > > are interested. > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > adding merge conflicts to future backports. > > > > I am not too sure about that, I think a user might expect the network > > interface to disappear from the networking tables quickly enough without > > dealing with barriers or kernel iternals. However, I added the authors of the > > test to this email in the hopes he can provide is point of views as well. > > > > The general approach we are taking with this sort of thing is to use > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > he is from the networking team and knows this test well. > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > For instance, only kfree_rcu() should use it. > > We can not review hundreds of call_rcu() call sites and decide if > adding arbitrary delays cou hurt . At a very minimum, things like rcu_barrier() should make sure that all 'lazy' callbacks are processed in a reasonable amount of time.
On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > Hello Eric, > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > <joel@joelfernandes.org> wrote: > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > causes a networking test to fail in the teardown phase. > > > > > > > > The failure happens during: ip netns del <name> > > > > > > And ? What happens then next ? > > > > The test is doing the 'ip netns del <name>' and then polling for the > > disappearance of a network interface name for upto 5 seconds. I believe it is > > using netlink to get a table of interfaces. That polling is timing out. > > > > Here is some more details from the test's owner (copy pasting from another > > bug report): > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > removed automatically, so we use a poll to check that if the veth in the root > > netns still exists to know whether the cleanup is done. > > > > Here is a public link to the code that is failing (its in golang): > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > Here is a public link to the line of code in the actual test leading up to the above > > path (this is the test that is run: > > network.RoutingFallthrough.ipv4_only_primary) : > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > I provided the links above, let me know if you have any questions. > > > > > Was it depending on some undocumented RCU behavior ? > > > > This is a new RCU feature posted here for significant power-savings on > > battery-powered devices: > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > There is also an LPC presentation about the same, I can dig the link if you > > are interested. > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > adding merge conflicts to future backports. > > > > I am not too sure about that, I think a user might expect the network > > interface to disappear from the networking tables quickly enough without > > dealing with barriers or kernel iternals. However, I added the authors of the > > test to this email in the hopes he can provide is point of views as well. > > > > The general approach we are taking with this sort of thing is to use > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > he is from the networking team and knows this test well. > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in You should read the links I sent you. We did already try opt-in, Thomas Gleixner made a point at LPC that we should not add new APIs for this purpose and confuse kernel developers. > For instance, only kfree_rcu() should use it. No. Most of the call_rcu() usages are for freeing memory, so the consensus is we should apply this as opt out and fix issues along the way. We already did a lot of research/diligence on seeing which users need conversion. > We can not review hundreds of call_rcu() call sites and decide if > adding arbitrary delays cou hurt . That work has already been done as much as possible, please read the links I sent. Thanks.
On Thu, Nov 17, 2022 at 9:38 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > Hello Eric, > > > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > > <joel@joelfernandes.org> wrote: > > > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > > causes a networking test to fail in the teardown phase. > > > > > > > > > > The failure happens during: ip netns del <name> > > > > > > > > And ? What happens then next ? > > > > > > The test is doing the 'ip netns del <name>' and then polling for the > > > disappearance of a network interface name for upto 5 seconds. I believe it is > > > using netlink to get a table of interfaces. That polling is timing out. > > > > > > Here is some more details from the test's owner (copy pasting from another > > > bug report): > > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > > removed automatically, so we use a poll to check that if the veth in the root > > > netns still exists to know whether the cleanup is done. > > > > > > Here is a public link to the code that is failing (its in golang): > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > > > Here is a public link to the line of code in the actual test leading up to the above > > > path (this is the test that is run: > > > network.RoutingFallthrough.ipv4_only_primary) : > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > > > I provided the links above, let me know if you have any questions. > > > > > > > Was it depending on some undocumented RCU behavior ? > > > > > > This is a new RCU feature posted here for significant power-savings on > > > battery-powered devices: > > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > > > There is also an LPC presentation about the same, I can dig the link if you > > > are interested. > > > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > > adding merge conflicts to future backports. > > > > > > I am not too sure about that, I think a user might expect the network > > > interface to disappear from the networking tables quickly enough without > > > dealing with barriers or kernel iternals. However, I added the authors of the > > > test to this email in the hopes he can provide is point of views as well. > > > > > > The general approach we are taking with this sort of thing is to use > > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > > he is from the networking team and knows this test well. > > > > > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > You should read the links I sent you. We did already try opt-in, > Thomas Gleixner made a point at LPC that we should not add new APIs > for this purpose and confuse kernel developers. > > > For instance, only kfree_rcu() should use it. > > No. Most of the call_rcu() usages are for freeing memory, so the > consensus is we should apply this as opt out and fix issues along the > way. We already did a lot of research/diligence on seeing which users > need conversion. > > > We can not review hundreds of call_rcu() call sites and decide if > > adding arbitrary delays cou hurt . > > That work has already been done as much as possible, please read the > links I sent. Oh well. No. I will leave it to other folks dealing with this crazy thing.
On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > Hello Eric, > > > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > > <joel@joelfernandes.org> wrote: > > > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > > causes a networking test to fail in the teardown phase. > > > > > > > > > > The failure happens during: ip netns del <name> > > > > > > > > And ? What happens then next ? > > > > > > The test is doing the 'ip netns del <name>' and then polling for the > > > disappearance of a network interface name for upto 5 seconds. I believe it is > > > using netlink to get a table of interfaces. That polling is timing out. > > > > > > Here is some more details from the test's owner (copy pasting from another > > > bug report): > > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > > removed automatically, so we use a poll to check that if the veth in the root > > > netns still exists to know whether the cleanup is done. > > > > > > Here is a public link to the code that is failing (its in golang): > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > > > Here is a public link to the line of code in the actual test leading up to the above > > > path (this is the test that is run: > > > network.RoutingFallthrough.ipv4_only_primary) : > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > > > I provided the links above, let me know if you have any questions. > > > > > > > Was it depending on some undocumented RCU behavior ? > > > > > > This is a new RCU feature posted here for significant power-savings on > > > battery-powered devices: > > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > > > There is also an LPC presentation about the same, I can dig the link if you > > > are interested. > > > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > > adding merge conflicts to future backports. > > > > > > I am not too sure about that, I think a user might expect the network > > > interface to disappear from the networking tables quickly enough without > > > dealing with barriers or kernel iternals. However, I added the authors of the > > > test to this email in the hopes he can provide is point of views as well. > > > > > > The general approach we are taking with this sort of thing is to use > > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > > he is from the networking team and knows this test well. > > > > > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > You should read the links I sent you. We did already try opt-in, > Thomas Gleixner made a point at LPC that we should not add new APIs > for this purpose and confuse kernel developers. > > > For instance, only kfree_rcu() should use it. > > No. Most of the call_rcu() usages are for freeing memory, so the > consensus is we should apply this as opt out and fix issues along the > way. We already did a lot of research/diligence on seeing which users > need conversion. > > > We can not review hundreds of call_rcu() call sites and decide if > > adding arbitrary delays cou hurt . > > That work has already been done as much as possible, please read the > links I sent. Also just to add, this test is a bit weird / corner case, as in anyone expecting a quick response from call_rcu() is broken by design. However, for these callbacks, it does not matter much which API they use as they are quite infrequent for power savings. Thanks.
On Thu, Nov 17, 2022 at 5:40 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 17, 2022 at 9:38 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > > > Hello Eric, > > > > > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > > > <joel@joelfernandes.org> wrote: > > > > > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > > > causes a networking test to fail in the teardown phase. > > > > > > > > > > > > The failure happens during: ip netns del <name> > > > > > > > > > > And ? What happens then next ? > > > > > > > > The test is doing the 'ip netns del <name>' and then polling for the > > > > disappearance of a network interface name for upto 5 seconds. I believe it is > > > > using netlink to get a table of interfaces. That polling is timing out. > > > > > > > > Here is some more details from the test's owner (copy pasting from another > > > > bug report): > > > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > > > removed automatically, so we use a poll to check that if the veth in the root > > > > netns still exists to know whether the cleanup is done. > > > > > > > > Here is a public link to the code that is failing (its in golang): > > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > > > > > Here is a public link to the line of code in the actual test leading up to the above > > > > path (this is the test that is run: > > > > network.RoutingFallthrough.ipv4_only_primary) : > > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > > > > > I provided the links above, let me know if you have any questions. > > > > > > > > > Was it depending on some undocumented RCU behavior ? > > > > > > > > This is a new RCU feature posted here for significant power-savings on > > > > battery-powered devices: > > > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > > > > > There is also an LPC presentation about the same, I can dig the link if you > > > > are interested. > > > > > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > > > adding merge conflicts to future backports. > > > > > > > > I am not too sure about that, I think a user might expect the network > > > > interface to disappear from the networking tables quickly enough without > > > > dealing with barriers or kernel iternals. However, I added the authors of the > > > > test to this email in the hopes he can provide is point of views as well. > > > > > > > > The general approach we are taking with this sort of thing is to use > > > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > > > he is from the networking team and knows this test well. > > > > > > > > > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > > > You should read the links I sent you. We did already try opt-in, > > Thomas Gleixner made a point at LPC that we should not add new APIs > > for this purpose and confuse kernel developers. > > > > > For instance, only kfree_rcu() should use it. > > > > No. Most of the call_rcu() usages are for freeing memory, so the > > consensus is we should apply this as opt out and fix issues along the > > way. We already did a lot of research/diligence on seeing which users > > need conversion. > > > > > We can not review hundreds of call_rcu() call sites and decide if > > > adding arbitrary delays cou hurt . > > > > That work has already been done as much as possible, please read the > > links I sent. > > Oh well. No. > > I will leave it to other folks dealing with this crazy thing. Yes, I agree. Your comments here have not been useful (or respectful) so I am Ok with that. - Joel
On Thu, Nov 17, 2022 at 9:42 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > Yes, I agree. Your comments here have not been useful (or respectful) > so I am Ok with that. > > - Joel Well, I have discovered that some changes went in networking tree without network maintainers being involved nor CCed. What can I say ? It seems I have no say, right ? commit f32846476afbe1f296c41d036219178b3dfb6a9d Author: Joel Fernandes (Google) <joel@joelfernandes.org> Date: Sun Oct 16 16:23:04 2022 +0000 rxrpc: Use call_rcu_flush() instead of call_rcu() call_rcu() changes to save power may cause slowness. Use the call_rcu_flush() API instead which reverts to the old behavior. We find this via inspection that the RCU callback does a wakeup of a thread. This usually indicates that something is waiting on it. To be safe, let us use call_rcu_flush() here instead. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c index 22089e37e97f0628f780855f9e219e5c33d4afa1..fdcfb509cc4434b0781b76623532aff9c854ce60 100644 --- a/net/rxrpc/conn_object.c +++ b/net/rxrpc/conn_object.c @@ -253,7 +253,7 @@ void rxrpc_kill_connection(struct rxrpc_connection *conn) * must carry a ref on the connection to prevent us getting here whilst * it is queued or running. */ - call_rcu(&conn->rcu, rxrpc_destroy_connection); + call_rcu_flush(&conn->rcu, rxrpc_destroy_connection); } /*
On Thu, Nov 17, 2022 at 5:49 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 17, 2022 at 9:42 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > > Yes, I agree. Your comments here have not been useful (or respectful) > > so I am Ok with that. > > > > - Joel > > Well, I have discovered that some changes went in networking tree > without network maintainers being involved nor CCed. > > What can I say ? > > It seems I have no say, right ? Sorry, I take responsibility for that. FWIW, the rxrpc change is not yet in Linus's tree. Also FWIW, the rxrpc case came up because we detected that it does a scheduler wakeup from the callback. We did both static and dynamic testing to identify callbacks that do wakeups throughout the kernel (kernel patch available on request), as the pattern observed is things doing wakeups typically are for use cases that are not freeing memory but something blocking, similar to synchronize_rcu(). So it was a "trivial/obvious" change to make for rxrpc which I might have assumed did not need much supervision because it just reverts that API to the old behavior -- still probably no excuse. Again, we can talk this out no problem. But I would strongly recommend not calling it "crazy thing", as we did all due diligence for almost a year (talking about it at LPC, working through various code paths and bugs, 4 different patch redesigns on the idea (including the opt-in that you are bringing up), including a late night debugging session to figure this out etc). Just to clarify, I know you review/maintain a lot of the networking code and I really appreciate that (not praising just for the sake). And I care about the kernel too, just like you. thanks, - Joel > commit f32846476afbe1f296c41d036219178b3dfb6a9d > Author: Joel Fernandes (Google) <joel@joelfernandes.org> > Date: Sun Oct 16 16:23:04 2022 +0000 > > rxrpc: Use call_rcu_flush() instead of call_rcu() > > call_rcu() changes to save power may cause slowness. Use the > call_rcu_flush() API instead which reverts to the old behavior. > > We find this via inspection that the RCU callback does a wakeup of a > thread. This usually indicates that something is waiting on it. To be > safe, let us use call_rcu_flush() here instead. > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org> > > diff --git a/net/rxrpc/conn_object.c b/net/rxrpc/conn_object.c > index 22089e37e97f0628f780855f9e219e5c33d4afa1..fdcfb509cc4434b0781b76623532aff9c854ce60 > 100644 > --- a/net/rxrpc/conn_object.c > +++ b/net/rxrpc/conn_object.c > @@ -253,7 +253,7 @@ void rxrpc_kill_connection(struct rxrpc_connection *conn) > * must carry a ref on the connection to prevent us getting here whilst > * it is queued or running. > */ > - call_rcu(&conn->rcu, rxrpc_destroy_connection); > + call_rcu_flush(&conn->rcu, rxrpc_destroy_connection); > } > > /*
On Thu, Nov 17, 2022 at 10:18 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > On Thu, Nov 17, 2022 at 5:49 PM Eric Dumazet <edumazet@google.com> wrote: > > > > On Thu, Nov 17, 2022 at 9:42 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > > > > > > Yes, I agree. Your comments here have not been useful (or respectful) > > > so I am Ok with that. > > > > > > - Joel > > > > Well, I have discovered that some changes went in networking tree > > without network maintainers being involved nor CCed. > > > > What can I say ? > > > > It seems I have no say, right ? > > Sorry, I take responsibility for that. FWIW, the rxrpc change is not > yet in Linus's tree. > > Also FWIW, the rxrpc case came up because we detected that it does a > scheduler wakeup from the callback. We did both static and dynamic > testing to identify callbacks that do wakeups throughout the kernel > (kernel patch available on request), as the pattern observed is things > doing wakeups typically are for use cases that are not freeing memory > but something blocking, similar to synchronize_rcu(). So it was a > "trivial/obvious" change to make for rxrpc which I might have assumed > did not need much supervision because it just reverts that API to the > old behavior -- still probably no excuse. > > Again, we can talk this out no problem. But I would strongly recommend > not calling it "crazy thing", as we did all due diligence for almost a > year (talking about it at LPC, working through various code paths and > bugs, 4 different patch redesigns on the idea (including the opt-in > that you are bringing up), including a late night debugging session to > figure this out etc). Apologies. For me "crazy" does not have the same meaning, apparently. I will try to use more neutral words in the future. > > Just to clarify, I know you review/maintain a lot of the networking > code and I really appreciate that (not praising just for the sake). > And I care about the kernel too, just like you. I had no doubts about that, really.
On Thu, Nov 17, 2022 at 05:40:40PM +0000, Joel Fernandes wrote: > On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > > > > > On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > > > Hello Eric, > > > > > > > > On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > > > > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > > > > <joel@joelfernandes.org> wrote: > > > > > > > > > > > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > > > > > causes a networking test to fail in the teardown phase. > > > > > > > > > > > > The failure happens during: ip netns del <name> > > > > > > > > > > And ? What happens then next ? > > > > > > > > The test is doing the 'ip netns del <name>' and then polling for the > > > > disappearance of a network interface name for upto 5 seconds. I believe it is > > > > using netlink to get a table of interfaces. That polling is timing out. > > > > > > > > Here is some more details from the test's owner (copy pasting from another > > > > bug report): > > > > In the cleanup, we remove the netns, and thus will cause the veth pair being > > > > removed automatically, so we use a poll to check that if the veth in the root > > > > netns still exists to know whether the cleanup is done. > > > > > > > > Here is a public link to the code that is failing (its in golang): > > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > > > > > > > Here is a public link to the line of code in the actual test leading up to the above > > > > path (this is the test that is run: > > > > network.RoutingFallthrough.ipv4_only_primary) : > > > > https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > > > > > > > > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > > > > > call_rcu_flush() to revert to the old behavior. With that, the test passes. > > > > > > > > > > What is this test about ? What barrier was used to make it not flaky ? > > > > > > > > I provided the links above, let me know if you have any questions. > > > > > > > > > Was it depending on some undocumented RCU behavior ? > > > > > > > > This is a new RCU feature posted here for significant power-savings on > > > > battery-powered devices: > > > > https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > > > > > > > There is also an LPC presentation about the same, I can dig the link if you > > > > are interested. > > > > > > > > > Maybe adding a sysctl to force the flush would be better for functional tests ? > > > > > > > > > > I would rather change the test(s), than adding call_rcu_flush(), > > > > > adding merge conflicts to future backports. > > > > > > > > I am not too sure about that, I think a user might expect the network > > > > interface to disappear from the networking tables quickly enough without > > > > dealing with barriers or kernel iternals. However, I added the authors of the > > > > test to this email in the hopes he can provide is point of views as well. > > > > > > > > The general approach we are taking with this sort of thing is to use > > > > call_rcu_flush() which is basically the same as call_rcu() for systems with > > > > CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > > > above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > > > Android and ChromeOS that are using it. I am adding Jie to share any input, > > > > he is from the networking team and knows this test well. > > > > > > > > > > > > > > I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > > > You should read the links I sent you. We did already try opt-in, > > Thomas Gleixner made a point at LPC that we should not add new APIs > > for this purpose and confuse kernel developers. > > > > > For instance, only kfree_rcu() should use it. > > > > No. Most of the call_rcu() usages are for freeing memory, so the > > consensus is we should apply this as opt out and fix issues along the > > way. We already did a lot of research/diligence on seeing which users > > need conversion. > > > > > We can not review hundreds of call_rcu() call sites and decide if > > > adding arbitrary delays cou hurt . > > > > That work has already been done as much as possible, please read the > > links I sent. > > Also just to add, this test is a bit weird / corner case, as in anyone > expecting a quick response from call_rcu() is broken by design. > However, for these callbacks, it does not matter much which API they > use as they are quite infrequent for power savings. The "broken by design" is a bit strong. Some of those call_rcu() invocations have been around for the better part of 20 years, after all. That aside, I do hope that we can arrive at something that will enhance battery lifetime while avoiding unnecessary disruption. But we are unlikely to be able to completely avoid disruption. As this email thread illustrates. ;-) Thanx, Paul
> On Nov 17, 2022, at 2:29 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > On Thu, Nov 17, 2022 at 05:40:40PM +0000, Joel Fernandes wrote: >>> On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@joelfernandes.org> wrote: >>> >>> On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: >>>> >>>> On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: >>>>> >>>>> Hello Eric, >>>>> >>>>> On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: >>>>>> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) >>>>>> <joel@joelfernandes.org> wrote: >>>>>>> >>>>>>> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY >>>>>>> causes a networking test to fail in the teardown phase. >>>>>>> >>>>>>> The failure happens during: ip netns del <name> >>>>>> >>>>>> And ? What happens then next ? >>>>> >>>>> The test is doing the 'ip netns del <name>' and then polling for the >>>>> disappearance of a network interface name for upto 5 seconds. I believe it is >>>>> using netlink to get a table of interfaces. That polling is timing out. >>>>> >>>>> Here is some more details from the test's owner (copy pasting from another >>>>> bug report): >>>>> In the cleanup, we remove the netns, and thus will cause the veth pair being >>>>> removed automatically, so we use a poll to check that if the veth in the root >>>>> netns still exists to know whether the cleanup is done. >>>>> >>>>> Here is a public link to the code that is failing (its in golang): >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 >>>>> >>>>> Here is a public link to the line of code in the actual test leading up to the above >>>>> path (this is the test that is run: >>>>> network.RoutingFallthrough.ipv4_only_primary) : >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 >>>>> >>>>>>> Using ftrace, I found the callbacks it was queuing which this series fixes. Use >>>>>>> call_rcu_flush() to revert to the old behavior. With that, the test passes. >>>>>> >>>>>> What is this test about ? What barrier was used to make it not flaky ? >>>>> >>>>> I provided the links above, let me know if you have any questions. >>>>> >>>>>> Was it depending on some undocumented RCU behavior ? >>>>> >>>>> This is a new RCU feature posted here for significant power-savings on >>>>> battery-powered devices: >>>>> https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a >>>>> >>>>> There is also an LPC presentation about the same, I can dig the link if you >>>>> are interested. >>>>> >>>>>> Maybe adding a sysctl to force the flush would be better for functional tests ? >>>>>> >>>>>> I would rather change the test(s), than adding call_rcu_flush(), >>>>>> adding merge conflicts to future backports. >>>>> >>>>> I am not too sure about that, I think a user might expect the network >>>>> interface to disappear from the networking tables quickly enough without >>>>> dealing with barriers or kernel iternals. However, I added the authors of the >>>>> test to this email in the hopes he can provide is point of views as well. >>>>> >>>>> The general approach we are taking with this sort of thing is to use >>>>> call_rcu_flush() which is basically the same as call_rcu() for systems with >>>>> CALL_RCU_LAZY=n. You can see some examples of that in the patch series link >>>>> above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only >>>>> Android and ChromeOS that are using it. I am adding Jie to share any input, >>>>> he is from the networking team and knows this test well. >>>>> >>>>> >>>> >>>> I do not know what is this RCU_LAZY thing, but IMO this should be opt-in >>> >>> You should read the links I sent you. We did already try opt-in, >>> Thomas Gleixner made a point at LPC that we should not add new APIs >>> for this purpose and confuse kernel developers. >>> >>>> For instance, only kfree_rcu() should use it. >>> >>> No. Most of the call_rcu() usages are for freeing memory, so the >>> consensus is we should apply this as opt out and fix issues along the >>> way. We already did a lot of research/diligence on seeing which users >>> need conversion. >>> >>>> We can not review hundreds of call_rcu() call sites and decide if >>>> adding arbitrary delays cou hurt . >>> >>> That work has already been done as much as possible, please read the >>> links I sent. >> >> Also just to add, this test is a bit weird / corner case, as in anyone >> expecting a quick response from call_rcu() is broken by design. >> However, for these callbacks, it does not matter much which API they >> use as they are quite infrequent for power savings. > > The "broken by design" is a bit strong. Some of those call_rcu() > invocations have been around for the better part of 20 years, after all. > > That aside, I do hope that we can arrive at something that will enhance > battery lifetime while avoiding unnecessary disruption. But we are > unlikely to be able to completely avoid disruption. As this email > thread illustrates. ;-) Another approach, with these 3 patches could be to keep the call_rcu() but add an rcu_barrier() after them. I think people running ip del netns should not have to wait for their RCU cb to take too long to run and remove user visible state. But I would need suggestions from networking experts which CBs of these 3, to do this for. Or for all of them. Alternatively, we can also patch just the test with a new knob that does rcu_barrier. But I dislike that as it does not fix it for all users. Probably the ip utilities will also need a patch then. Thanks, - Joel > > Thanx, Paul
On Thu, Nov 17, 2022 at 1:16 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > On Nov 17, 2022, at 2:29 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Thu, Nov 17, 2022 at 05:40:40PM +0000, Joel Fernandes wrote: > >>> On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@joelfernandes.org> wrote: > >>> > >>> On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > >>>> > >>>> On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > >>>>> > >>>>> Hello Eric, > >>>>> > >>>>> On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > >>>>>> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > >>>>>> <joel@joelfernandes.org> wrote: > >>>>>>> > >>>>>>> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > >>>>>>> causes a networking test to fail in the teardown phase. > >>>>>>> > >>>>>>> The failure happens during: ip netns del <name> > >>>>>> > >>>>>> And ? What happens then next ? > >>>>> > >>>>> The test is doing the 'ip netns del <name>' and then polling for the > >>>>> disappearance of a network interface name for upto 5 seconds. I believe it is > >>>>> using netlink to get a table of interfaces. That polling is timing out. > >>>>> > >>>>> Here is some more details from the test's owner (copy pasting from another > >>>>> bug report): > >>>>> In the cleanup, we remove the netns, and thus will cause the veth pair being > >>>>> removed automatically, so we use a poll to check that if the veth in the root > >>>>> netns still exists to know whether the cleanup is done. > >>>>> > >>>>> Here is a public link to the code that is failing (its in golang): > >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > >>>>> > >>>>> Here is a public link to the line of code in the actual test leading up to the above > >>>>> path (this is the test that is run: > >>>>> network.RoutingFallthrough.ipv4_only_primary) : > >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > >>>>> > >>>>>>> Using ftrace, I found the callbacks it was queuing which this series fixes. Use > >>>>>>> call_rcu_flush() to revert to the old behavior. With that, the test passes. > >>>>>> > >>>>>> What is this test about ? What barrier was used to make it not flaky ? > >>>>> > >>>>> I provided the links above, let me know if you have any questions. > >>>>> > >>>>>> Was it depending on some undocumented RCU behavior ? > >>>>> > >>>>> This is a new RCU feature posted here for significant power-savings on > >>>>> battery-powered devices: > >>>>> https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > >>>>> > >>>>> There is also an LPC presentation about the same, I can dig the link if you > >>>>> are interested. > >>>>> > >>>>>> Maybe adding a sysctl to force the flush would be better for functional tests ? > >>>>>> > >>>>>> I would rather change the test(s), than adding call_rcu_flush(), > >>>>>> adding merge conflicts to future backports. > >>>>> > >>>>> I am not too sure about that, I think a user might expect the network > >>>>> interface to disappear from the networking tables quickly enough without > >>>>> dealing with barriers or kernel iternals. However, I added the authors of the > >>>>> test to this email in the hopes he can provide is point of views as well. > >>>>> > >>>>> The general approach we are taking with this sort of thing is to use > >>>>> call_rcu_flush() which is basically the same as call_rcu() for systems with > >>>>> CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > >>>>> above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > >>>>> Android and ChromeOS that are using it. I am adding Jie to share any input, > >>>>> he is from the networking team and knows this test well. > >>>>> > >>>>> > >>>> > >>>> I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > >>> > >>> You should read the links I sent you. We did already try opt-in, > >>> Thomas Gleixner made a point at LPC that we should not add new APIs > >>> for this purpose and confuse kernel developers. > >>> > >>>> For instance, only kfree_rcu() should use it. > >>> > >>> No. Most of the call_rcu() usages are for freeing memory, so the > >>> consensus is we should apply this as opt out and fix issues along the > >>> way. We already did a lot of research/diligence on seeing which users > >>> need conversion. > >>> > >>>> We can not review hundreds of call_rcu() call sites and decide if > >>>> adding arbitrary delays cou hurt . > >>> > >>> That work has already been done as much as possible, please read the > >>> links I sent. > >> > >> Also just to add, this test is a bit weird / corner case, as in anyone > >> expecting a quick response from call_rcu() is broken by design. > >> However, for these callbacks, it does not matter much which API they > >> use as they are quite infrequent for power savings. > > > > The "broken by design" is a bit strong. Some of those call_rcu() > > invocations have been around for the better part of 20 years, after all. > > > > That aside, I do hope that we can arrive at something that will enhance > > battery lifetime while avoiding unnecessary disruption. But we are > > unlikely to be able to completely avoid disruption. As this email > > thread illustrates. ;-) > > Another approach, with these 3 patches could be to keep the call_rcu() but add an rcu_barrier() after them. I think people running ip del netns should not have to wait for their RCU cb to take too long to run and remove user visible state. But I would need suggestions from networking experts which CBs of these 3, to do this for. Or for all of them. > > Alternatively, we can also patch just the test with a new knob that does rcu_barrier. But I dislike that as it does not fix it for all users. Probably the ip utilities will also need a patch then. > Normally we have an rcu_barrier() in netns dismantle path already at a strategic location ( in cleanup_net() ) Maybe the issue here is that some particular layers need another one. Or we need to release a blocking reference before the call_rcu(). Some call_rcu() usages might not be optimal in this respect. We should not add an rcu_barrier() after a call_rcu(), we prefer factoring these expensive operations.
On Thu, Nov 17, 2022 at 9:29 PM Eric Dumazet <edumazet@google.com> wrote: > > On Thu, Nov 17, 2022 at 1:16 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > > > > > > > > On Nov 17, 2022, at 2:29 PM, Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > On Thu, Nov 17, 2022 at 05:40:40PM +0000, Joel Fernandes wrote: > > >>> On Thu, Nov 17, 2022 at 5:38 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > >>> > > >>> On Thu, Nov 17, 2022 at 5:17 PM Eric Dumazet <edumazet@google.com> wrote: > > >>>> > > >>>> On Thu, Nov 17, 2022 at 7:58 AM Joel Fernandes <joel@joelfernandes.org> wrote: > > >>>>> > > >>>>> Hello Eric, > > >>>>> > > >>>>> On Wed, Nov 16, 2022 at 07:44:41PM -0800, Eric Dumazet wrote: > > >>>>>> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) > > >>>>>> <joel@joelfernandes.org> wrote: > > >>>>>>> > > >>>>>>> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY > > >>>>>>> causes a networking test to fail in the teardown phase. > > >>>>>>> > > >>>>>>> The failure happens during: ip netns del <name> > > >>>>>> > > >>>>>> And ? What happens then next ? > > >>>>> > > >>>>> The test is doing the 'ip netns del <name>' and then polling for the > > >>>>> disappearance of a network interface name for upto 5 seconds. I believe it is > > >>>>> using netlink to get a table of interfaces. That polling is timing out. > > >>>>> > > >>>>> Here is some more details from the test's owner (copy pasting from another > > >>>>> bug report): > > >>>>> In the cleanup, we remove the netns, and thus will cause the veth pair being > > >>>>> removed automatically, so we use a poll to check that if the veth in the root > > >>>>> netns still exists to know whether the cleanup is done. > > >>>>> > > >>>>> Here is a public link to the code that is failing (its in golang): > > >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/network/virtualnet/env/env.go;drc=6c2841d6cc3eadd23e07912ec331943ee33d7de8;l=161 > > >>>>> > > >>>>> Here is a public link to the line of code in the actual test leading up to the above > > >>>>> path (this is the test that is run: > > >>>>> network.RoutingFallthrough.ipv4_only_primary) : > > >>>>> https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/platform/tast-tests/src/chromiumos/tast/local/bundles/cros/network/routing_fallthrough.go;drc=8fbf2c53960bc8917a6a01fda5405cad7c17201e;l=52 > > >>>>> > > >>>>>>> Using ftrace, I found the callbacks it was queuing which this series fixes. Use > > >>>>>>> call_rcu_flush() to revert to the old behavior. With that, the test passes. > > >>>>>> > > >>>>>> What is this test about ? What barrier was used to make it not flaky ? > > >>>>> > > >>>>> I provided the links above, let me know if you have any questions. > > >>>>> > > >>>>>> Was it depending on some undocumented RCU behavior ? > > >>>>> > > >>>>> This is a new RCU feature posted here for significant power-savings on > > >>>>> battery-powered devices: > > >>>>> https://lore.kernel.org/rcu/20221017140726.GG5600@paulmck-ThinkPad-P17-Gen-1/T/#m7a54809b8903b41538850194d67eb34f203c752a > > >>>>> > > >>>>> There is also an LPC presentation about the same, I can dig the link if you > > >>>>> are interested. > > >>>>> > > >>>>>> Maybe adding a sysctl to force the flush would be better for functional tests ? > > >>>>>> > > >>>>>> I would rather change the test(s), than adding call_rcu_flush(), > > >>>>>> adding merge conflicts to future backports. > > >>>>> > > >>>>> I am not too sure about that, I think a user might expect the network > > >>>>> interface to disappear from the networking tables quickly enough without > > >>>>> dealing with barriers or kernel iternals. However, I added the authors of the > > >>>>> test to this email in the hopes he can provide is point of views as well. > > >>>>> > > >>>>> The general approach we are taking with this sort of thing is to use > > >>>>> call_rcu_flush() which is basically the same as call_rcu() for systems with > > >>>>> CALL_RCU_LAZY=n. You can see some examples of that in the patch series link > > >>>>> above. Just to note, CALL_RCU_LAZY depends on CONFIG_RCU_NOCB_CPU so its only > > >>>>> Android and ChromeOS that are using it. I am adding Jie to share any input, > > >>>>> he is from the networking team and knows this test well. > > >>>>> > > >>>>> > > >>>> > > >>>> I do not know what is this RCU_LAZY thing, but IMO this should be opt-in > > >>> > > >>> You should read the links I sent you. We did already try opt-in, > > >>> Thomas Gleixner made a point at LPC that we should not add new APIs > > >>> for this purpose and confuse kernel developers. > > >>> > > >>>> For instance, only kfree_rcu() should use it. > > >>> > > >>> No. Most of the call_rcu() usages are for freeing memory, so the > > >>> consensus is we should apply this as opt out and fix issues along the > > >>> way. We already did a lot of research/diligence on seeing which users > > >>> need conversion. > > >>> > > >>>> We can not review hundreds of call_rcu() call sites and decide if > > >>>> adding arbitrary delays cou hurt . > > >>> > > >>> That work has already been done as much as possible, please read the > > >>> links I sent. > > >> > > >> Also just to add, this test is a bit weird / corner case, as in anyone > > >> expecting a quick response from call_rcu() is broken by design. > > >> However, for these callbacks, it does not matter much which API they > > >> use as they are quite infrequent for power savings. > > > > > > The "broken by design" is a bit strong. Some of those call_rcu() > > > invocations have been around for the better part of 20 years, after all. > > > > > > That aside, I do hope that we can arrive at something that will enhance > > > battery lifetime while avoiding unnecessary disruption. But we are > > > unlikely to be able to completely avoid disruption. As this email > > > thread illustrates. ;-) > > > > Another approach, with these 3 patches could be to keep the call_rcu() but add an rcu_barrier() after them. I think people running ip del netns should not have to wait for their RCU cb to take too long to run and remove user visible state. But I would need suggestions from networking experts which CBs of these 3, to do this for. Or for all of them. > > > > Alternatively, we can also patch just the test with a new knob that does rcu_barrier. But I dislike that as it does not fix it for all users. Probably the ip utilities will also need a patch then. > > > > Normally we have an rcu_barrier() in netns dismantle path already at a > strategic location ( in cleanup_net() ) > > Maybe the issue here is that some particular layers need another one. > Or we need to release a blocking reference before the call_rcu(). > Some call_rcu() usages might not be optimal in this respect. > > We should not add an rcu_barrier() after a call_rcu(), we prefer > factoring these expensive operations. Sounds good! The dst_destroy_rcu() function appears complex to break down (similar to how you suggested in 2/3). The dst->ops->destroy() can decrement refcounts and so forth. We could audit all such dst->ops users, but I wonder if it is safer to use call_rcu_flush() for this patch. Thanks, - Joel
diff --git a/net/core/dst.c b/net/core/dst.c index bc9c9be4e080..15b16322703f 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -174,7 +174,7 @@ void dst_release(struct dst_entry *dst) net_warn_ratelimited("%s: dst:%p refcnt:%d\n", __func__, dst, newrefcnt); if (!newrefcnt) - call_rcu(&dst->rcu_head, dst_destroy_rcu); + call_rcu_flush(&dst->rcu_head, dst_destroy_rcu); } } EXPORT_SYMBOL(dst_release);