Message ID | 20221117031551.1142289-2-joel@joelfernandes.org |
---|---|
State | New |
Headers |
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
To: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>, Cong Wang <xiyou.wangcong@gmail.com>, David Ahern <dsahern@kernel.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>, Jakub Kicinski <kuba@kernel.org>, Jamal Hadi Salim <jhs@mojatatu.com>, Jiri Pirko <jiri@resnulli.us>, netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>, rcu@vger.kernel.org, rostedt@goodmis.org, paulmck@kernel.org, fweisbec@gmail.com
Subject: [PATCH rcu/dev 2/3] net: Use call_rcu_flush() for in_dev_rcu_put
Date: Thu, 17 Nov 2022 03:15:49 +0000
Message-Id: <20221117031551.1142289-2-joel@joelfernandes.org>
X-Mailer: git-send-email 2.38.1.584.g0f3c55d4c2-goog
In-Reply-To: <20221117031551.1142289-1-joel@joelfernandes.org>
References: <20221117031551.1142289-1-joel@joelfernandes.org>
List-ID: <linux-kernel.vger.kernel.org> |
Series |
[rcu/dev,1/3] net: Use call_rcu_flush() for qdisc_free_cb
|
|
Commit Message
Joel Fernandes
Nov. 17, 2022, 3:15 a.m. UTC
In a networking test on ChromeOS, we find that enabling the new CONFIG_RCU_LAZY
causes the test to fail in its teardown phase.
The failure happens during: ip netns del <name>
Using ftrace, I found the callbacks that were being queued; this series converts them. Use
call_rcu_flush() to revert to the old behavior. With that, the test passes.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
net/ipv4/devinet.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
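For readers who want a picture of why a lazily queued callback can hold up teardown, here is a minimal userspace sketch in C. It is not kernel code: model_queue, model_enqueue(), model_tick() and model_put_ref() are hypothetical names, and the single "lazy timer expired" flag is an assumed simplification of however lazy callbacks eventually get flushed. It only illustrates the idea that a reference dropped from a lazy callback stays held until something forces the queue to drain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_CBS 16

typedef void (*model_cb_fn)(void *arg);

struct model_cb {
	model_cb_fn fn;
	void *arg;
};

struct model_queue {
	struct model_cb cbs[MODEL_MAX_CBS];
	size_t len;
	int flush_requested;	/* set when a non-lazy callback is queued */
};

/* Queue a callback; a non-lazy enqueue asks for the whole queue to drain. */
int model_enqueue(struct model_queue *q, model_cb_fn fn, void *arg, int lazy)
{
	if (q->len == MODEL_MAX_CBS)
		return -1;
	q->cbs[q->len].fn = fn;
	q->cbs[q->len].arg = arg;
	q->len++;
	if (!lazy)
		q->flush_requested = 1;
	return 0;
}

/*
 * One scheduling step. Callbacks run only if a flush was requested or the
 * (assumed) lazy timer has expired. Returns how many callbacks ran.
 */
size_t model_tick(struct model_queue *q, int lazy_timer_expired)
{
	size_t i, ran = 0;

	if (!q->flush_requested && !lazy_timer_expired)
		return 0;
	for (i = 0; i < q->len; i++) {
		q->cbs[i].fn(q->cbs[i].arg);
		ran++;
	}
	q->len = 0;
	q->flush_requested = 0;
	return ran;
}

/* Callback that drops one reference, standing in for in_dev_rcu_put(). */
void model_put_ref(void *arg)
{
	int *refcnt = arg;

	assert(*refcnt > 0);
	(*refcnt)--;
}
```

In this model, a reference handed to model_put_ref() through a lazy enqueue stays held across any number of model_tick(q, 0) calls, which is the kind of lingering that a teardown path waiting on the refcount would notice.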
Comments
On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google) <joel@joelfernandes.org> wrote:
>
> In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
> causes a networking test to fail in the teardown phase.
>
> The failure happens during: ip netns del <name>
>
> Using ftrace, I found the callbacks it was queuing which this series fixes. Use
> call_rcu_flush() to revert to the old behavior. With that, the test passes.
>
> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> ---
>  net/ipv4/devinet.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index e8b9a9202fec..98b20f333e00 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -328,7 +328,7 @@ static void inetdev_destroy(struct in_device *in_dev)
>  	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
>  	arp_ifdown(dev);
>
> -	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
> +	call_rcu_flush(&in_dev->rcu_head, in_dev_rcu_put);
>  }

For this one, I suspect the issue is about device refcount lingering ?

I think we should release refcounts earlier (and only delegate the
freeing part after RCU grace period, which can be 'lazy' just fine)

Something like:

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e8b9a9202fecd913137f169f161dfdccc16f7edf..e0258aef4211ec6a72d062963470a32776e6d010 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -234,13 +234,21 @@ static void inet_free_ifa(struct in_ifaddr *ifa)
 	call_rcu(&ifa->rcu_head, inet_rcu_free_ifa);
 }
 
+static void in_dev_free_rcu(struct rcu_head *head)
+{
+	struct in_device *idev = container_of(head, struct in_device, rcu_head);
+
+	kfree(rcu_dereference_protected(idev->mc_hash, 1));
+	kfree(idev);
+}
+
 void in_dev_finish_destroy(struct in_device *idev)
 {
 	struct net_device *dev = idev->dev;
 
 	WARN_ON(idev->ifa_list);
 	WARN_ON(idev->mc_list);
-	kfree(rcu_dereference_protected(idev->mc_hash, 1));
+
 #ifdef NET_REFCNT_DEBUG
 	pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL");
 #endif
@@ -248,7 +256,7 @@ void in_dev_finish_destroy(struct in_device *idev)
 	if (!idev->dead)
 		pr_err("Freeing alive in_device %p\n", idev);
 	else
-		kfree(idev);
+		call_rcu(&idev->rcu_head, in_dev_free_rcu);
 }
 EXPORT_SYMBOL(in_dev_finish_destroy);
 
@@ -298,12 +306,6 @@ static struct in_device *inetdev_init(struct net_device *dev)
 	goto out;
 }
 
-static void in_dev_rcu_put(struct rcu_head *head)
-{
-	struct in_device *idev = container_of(head, struct in_device, rcu_head);
-	in_dev_put(idev);
-}
-
 static void inetdev_destroy(struct in_device *in_dev)
 {
 	struct net_device *dev;
@@ -328,7 +330,7 @@ static void inetdev_destroy(struct in_device *in_dev)
 	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
 	arp_ifdown(dev);
 
-	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
+	in_dev_put(in_dev);
 }
 
 int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
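The ordering change suggested in the diff above (drop the references right away, defer only the memory free) can be sketched in plain userspace C. This is an illustrative model under stated assumptions, not the kernel implementation: model_dev, model_idev, model_idev_new(), model_idev_put() and model_grace_period_end() are hypothetical names, plain ints stand in for the kernel's refcounting, and the "grace period" is reduced to an explicit function call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for net_device and in_device; not kernel types. */
struct model_dev {
	int refcnt;
};

struct model_idev {
	struct model_dev *dev;
	int refcnt;
	int free_queued;	/* 1 once the deferred free has been queued */
};

/* Allocate an idev that holds one reference on its device. */
struct model_idev *model_idev_new(struct model_dev *dev)
{
	struct model_idev *idev = malloc(sizeof(*idev));

	if (!idev)
		return NULL;
	dev->refcnt++;
	idev->dev = dev;
	idev->refcnt = 1;
	idev->free_queued = 0;
	return idev;
}

/*
 * Drop one idev reference. On the last put, the device reference is
 * released immediately and only the memory free is left for later.
 */
void model_idev_put(struct model_idev *idev)
{
	assert(idev->refcnt > 0);
	if (--idev->refcnt == 0) {
		idev->dev->refcnt--;	/* released before any grace period */
		idev->free_queued = 1;	/* the free itself may be lazy */
	}
}

/* Stand-in for the end of a grace period: perform the deferred free. */
int model_grace_period_end(struct model_idev *idev)
{
	if (!idev->free_queued)
		return 0;
	free(idev);
	return 1;
}
```

With this ordering, anything waiting on the device refcount sees it drop as soon as model_idev_put() returns, no matter how long the deferred free is delayed.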
Hi Eric,

On Thu, Nov 17, 2022 at 01:58:18PM -0800, Eric Dumazet wrote:
> On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google)
> <joel@joelfernandes.org> wrote:
> >
> > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
> > causes a networking test to fail in the teardown phase.
> >
> > The failure happens during: ip netns del <name>
> >
> > Using ftrace, I found the callbacks it was queuing which this series fixes. Use
> > call_rcu_flush() to revert to the old behavior. With that, the test passes.
> >
> > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > ---
> >  net/ipv4/devinet.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > index e8b9a9202fec..98b20f333e00 100644
> > --- a/net/ipv4/devinet.c
> > +++ b/net/ipv4/devinet.c
> > @@ -328,7 +328,7 @@ static void inetdev_destroy(struct in_device *in_dev)
> >  	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
> >  	arp_ifdown(dev);
> >
> > -	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
> > +	call_rcu_flush(&in_dev->rcu_head, in_dev_rcu_put);
> >  }
>
> For this one, I suspect the issue is about device refcount lingering ?
>
> I think we should release refcounts earlier (and only delegate the
> freeing part after RCU grace period, which can be 'lazy' just fine)
>
> Something like:

The below diff where you reduce refcount before RCU grace period, also makes the
test pass.

If you are Ok with it, I can roll it into a patch with your Author tag and my
Tested-by. Let me know what you prefer?

Also, looking through the patch, I don't see any issue. One thing is
netdev_put() now happens before a grace period, instead of after. But I don't
think that's an issue.

thanks!

 - Joel

>
> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> index e8b9a9202fecd913137f169f161dfdccc16f7edf..e0258aef4211ec6a72d062963470a32776e6d010
> 100644
> --- a/net/ipv4/devinet.c
> +++ b/net/ipv4/devinet.c
> @@ -234,13 +234,21 @@ static void inet_free_ifa(struct in_ifaddr *ifa)
>  	call_rcu(&ifa->rcu_head, inet_rcu_free_ifa);
>  }
>
> +static void in_dev_free_rcu(struct rcu_head *head)
> +{
> +	struct in_device *idev = container_of(head, struct in_device, rcu_head);
> +
> +	kfree(rcu_dereference_protected(idev->mc_hash, 1));
> +	kfree(idev);
> +}
> +
>  void in_dev_finish_destroy(struct in_device *idev)
>  {
>  	struct net_device *dev = idev->dev;
>
>  	WARN_ON(idev->ifa_list);
>  	WARN_ON(idev->mc_list);
> -	kfree(rcu_dereference_protected(idev->mc_hash, 1));
> +
>  #ifdef NET_REFCNT_DEBUG
>  	pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL");
>  #endif
> @@ -248,7 +256,7 @@ void in_dev_finish_destroy(struct in_device *idev)
>  	if (!idev->dead)
>  		pr_err("Freeing alive in_device %p\n", idev);
>  	else
> -		kfree(idev);
> +		call_rcu(&idev->rcu_head, in_dev_free_rcu);
>  }
>  EXPORT_SYMBOL(in_dev_finish_destroy);
>
> @@ -298,12 +306,6 @@ static struct in_device *inetdev_init(struct
> net_device *dev)
>  	goto out;
>  }
>
> -static void in_dev_rcu_put(struct rcu_head *head)
> -{
> -	struct in_device *idev = container_of(head, struct in_device, rcu_head);
> -	in_dev_put(idev);
> -}
> -
>  static void inetdev_destroy(struct in_device *in_dev)
>  {
>  	struct net_device *dev;
> @@ -328,7 +330,7 @@ static void inetdev_destroy(struct in_device *in_dev)
>  	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
>  	arp_ifdown(dev);
>
> -	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
> +	in_dev_put(in_dev);
>  }
>
>  int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
On Thu, Nov 17, 2022 at 4:52 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi Eric,
>
> On Thu, Nov 17, 2022 at 01:58:18PM -0800, Eric Dumazet wrote:
> > On Wed, Nov 16, 2022 at 7:16 PM Joel Fernandes (Google)
> > <joel@joelfernandes.org> wrote:
> > >
> > > In a networking test on ChromeOS, we find that using the new CONFIG_RCU_LAZY
> > > causes a networking test to fail in the teardown phase.
> > >
> > > The failure happens during: ip netns del <name>
> > >
> > > Using ftrace, I found the callbacks it was queuing which this series fixes. Use
> > > call_rcu_flush() to revert to the old behavior. With that, the test passes.
> > >
> > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > > ---
> > >  net/ipv4/devinet.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
> > > index e8b9a9202fec..98b20f333e00 100644
> > > --- a/net/ipv4/devinet.c
> > > +++ b/net/ipv4/devinet.c
> > > @@ -328,7 +328,7 @@ static void inetdev_destroy(struct in_device *in_dev)
> > >  	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
> > >  	arp_ifdown(dev);
> > >
> > > -	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
> > > +	call_rcu_flush(&in_dev->rcu_head, in_dev_rcu_put);
> > >  }
> >
> > For this one, I suspect the issue is about device refcount lingering ?
> >
> > I think we should release refcounts earlier (and only delegate the
> > freeing part after RCU grace period, which can be 'lazy' just fine)
> >
> > Something like:
>
> The below diff where you reduce refcount before RCU grace period, also makes the
> test pass.
>
> If you are Ok with it, I can roll it into a patch with your Author tag and my
> Tested-by. Let me know what you prefer?
>
> Also, looking through the patch, I don't see any issue. One thing is
> netdev_put() now happens before a grace period, instead of after. But I don't
> think that's an issue.

Normally the early netdev_put() is fine, because these netdev are
already fully RCU protected.

Sure, feel free to take this patch as is, thanks.
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e8b9a9202fec..98b20f333e00 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -328,7 +328,7 @@ static void inetdev_destroy(struct in_device *in_dev)
 	neigh_parms_release(&arp_tbl, in_dev->arp_parms);
 	arp_ifdown(dev);
 
-	call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
+	call_rcu_flush(&in_dev->rcu_head, in_dev_rcu_put);
 }
 
 int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)