From patchwork Fri Oct 27 18:46:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peilin Ye
X-Patchwork-Id: 159134
From: Peilin Ye
To: "David S.
 Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
 Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev,
 Hao Luo, Jiri Olsa, Jesper Dangaard Brouer
Cc: Peilin Ye, netdev@vger.kernel.org, bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, Cong Wang, Jiang Wang, Youlun Zhang,
 Peilin Ye
Subject: [PATCH net] veth: Fix RX stats for bpf_redirect_peer() traffic
Date: Fri, 27 Oct 2023 18:46:57 +0000
Message-Id: <20231027184657.83978-1-yepeilin.cs@gmail.com>
X-Mailer: git-send-email 2.20.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Peilin Ye

Traffic redirected by bpf_redirect_peer() (used by recent CNIs like Cilium)
is not accounted for in the RX stats of veth devices, confusing user space
metrics collectors such as cAdvisor [1], as reported by Youlun.

Currently veth devices use the @lstats per-CPU counters, which only cover
TX traffic.  veth_get_stats64() actually collects RX stats of a veth
device from its peer's TX (@lstats) counters, based on the assumption that
a veth device can _only_ receive packets from its peer, which is no longer
true.

Instead, use @tstats to maintain both per-CPU RX and TX traffic counters
for each veth device, and count bpf_redirect_peer() traffic in
skb_do_redirect().
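
To illustrate the accounting described above, here is a minimal user-space
sketch. It is not kernel code: the model_* types and functions are
hypothetical names made up for this illustration, a single counter set
stands in for the per-CPU @tstats, and XDP and drop counters are left out.
It only shows why a receiver needs its own RX counters once packets can
arrive without the peer transmitting them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's per-CPU struct pcpu_sw_netstats;
 * a single set of counters is enough to show the accounting. */
struct model_tstats {
	uint64_t rx_packets;
	uint64_t rx_bytes;
	uint64_t tx_packets;
	uint64_t tx_bytes;
};

/* Hypothetical miniature of one end of a veth pair. */
struct model_dev {
	struct model_tstats tstats;
	struct model_dev *peer;
};

/* Link two distinct model devices as peers and zero their counters. */
static void model_pair(struct model_dev *a, struct model_dev *b)
{
	assert(a != NULL && b != NULL && a != b);
	a->tstats = (struct model_tstats){0};
	b->tstats = (struct model_tstats){0};
	a->peer = b;
	b->peer = a;
}

/* Regular veth transmit: only the sender's TX counters move, as with
 * dev_sw_netstats_tx_add(dev, 1, length) in veth_xmit(). */
static void model_xmit(struct model_dev *dev, uint64_t len)
{
	dev->tstats.tx_packets += 1;
	dev->tstats.tx_bytes += len;
}

/* bpf_redirect_peer() path: the packet arrives on dev without its peer
 * ever transmitting, so the receiver's own RX counters are bumped, as
 * with dev_sw_netstats_rx_add(dev, skb->len) in skb_do_redirect(). */
static void model_redirect_peer_rx(struct model_dev *dev, uint64_t len)
{
	dev->tstats.rx_packets += 1;
	dev->tstats.rx_bytes += len;
}

/* Reported totals: own counters plus the peer's TX counters folded into
 * RX, loosely following veth_get_stats64() with XDP and drops omitted. */
static struct model_tstats model_get_stats(const struct model_dev *dev)
{
	struct model_tstats tot = dev->tstats;

	if (dev->peer != NULL) {
		tot.rx_packets += dev->peer->tstats.tx_packets;
		tot.rx_bytes += dev->peer->tstats.tx_bytes;
	}
	return tot;
}
```

In this model, if one end transmits a 100-byte packet with model_xmit() and
a 60-byte packet is delivered to the other end via
model_redirect_peer_rx(), model_get_stats() on the receiving end reports 2
RX packets and 160 RX bytes. With peer-TX-only accounting, the redirected
60 bytes would not show up anywhere.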

veth_stats_rx() might need a name change (perhaps to "veth_stats_xdp()")
for less confusion, but let's leave it to a separate patch to keep this
fix minimal.

[1] Specifically, the "container_network_receive_{byte,packet}s_total"
    counters are affected.

Reported-by: Youlun Zhang
Fixes: 9aa1206e8f48 ("bpf: Add redirect_peer helper")
Cc: Jiang Wang
Signed-off-by: Peilin Ye
---
 drivers/net/veth.c | 36 ++++++++++++++----------------------
 net/core/filter.c  |  1 +
 2 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 9980517ed8b0..df7a7c21a46d 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -373,7 +373,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_tx_timestamp(skb);
 	if (likely(veth_forward_skb(rcv, skb, rq, use_napi) == NET_RX_SUCCESS)) {
 		if (!use_napi)
-			dev_lstats_add(dev, length);
+			dev_sw_netstats_tx_add(dev, 1, length);
 		else
 			__veth_xdp_flush(rq);
 	} else {
@@ -387,14 +387,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 	return ret;
 }
 
-static u64 veth_stats_tx(struct net_device *dev, u64 *packets, u64 *bytes)
-{
-	struct veth_priv *priv = netdev_priv(dev);
-
-	dev_lstats_read(dev, packets, bytes);
-	return atomic64_read(&priv->dropped);
-}
-
 static void veth_stats_rx(struct veth_stats *result, struct net_device *dev)
 {
 	struct veth_priv *priv = netdev_priv(dev);
@@ -432,24 +424,24 @@ static void veth_get_stats64(struct net_device *dev,
 	struct veth_priv *priv = netdev_priv(dev);
 	struct net_device *peer;
 	struct veth_stats rx;
-	u64 packets, bytes;
 
-	tot->tx_dropped = veth_stats_tx(dev, &packets, &bytes);
-	tot->tx_bytes = bytes;
-	tot->tx_packets = packets;
+	tot->tx_dropped = atomic64_read(&priv->dropped);
+	dev_fetch_sw_netstats(tot, dev->tstats);
 
 	veth_stats_rx(&rx, dev);
 	tot->tx_dropped += rx.xdp_tx_err;
 	tot->rx_dropped = rx.rx_drops + rx.peer_tq_xdp_xmit_err;
-	tot->rx_bytes = rx.xdp_bytes;
-	tot->rx_packets = rx.xdp_packets;
+	tot->rx_bytes += rx.xdp_bytes;
+	tot->rx_packets += rx.xdp_packets;
 
 	rcu_read_lock();
 	peer = rcu_dereference(priv->peer);
 	if (peer) {
-		veth_stats_tx(peer, &packets, &bytes);
-		tot->rx_bytes += bytes;
-		tot->rx_packets += packets;
+		struct rtnl_link_stats64 tot_peer = {};
+
+		dev_fetch_sw_netstats(&tot_peer, peer->tstats);
+		tot->rx_bytes += tot_peer.tx_bytes;
+		tot->rx_packets += tot_peer.tx_packets;
 
 		veth_stats_rx(&rx, peer);
 		tot->tx_dropped += rx.peer_tq_xdp_xmit_err;
@@ -1508,13 +1500,13 @@ static int veth_dev_init(struct net_device *dev)
 {
 	int err;
 
-	dev->lstats = netdev_alloc_pcpu_stats(struct pcpu_lstats);
-	if (!dev->lstats)
+	dev->tstats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
+	if (!dev->tstats)
 		return -ENOMEM;
 
 	err = veth_alloc_queues(dev);
 	if (err) {
-		free_percpu(dev->lstats);
+		free_percpu(dev->tstats);
 		return err;
 	}
 
@@ -1524,7 +1516,7 @@ static int veth_dev_init(struct net_device *dev)
 static void veth_dev_free(struct net_device *dev)
 {
 	veth_free_queues(dev);
-	free_percpu(dev->lstats);
+	free_percpu(dev->tstats);
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
diff --git a/net/core/filter.c b/net/core/filter.c
index 21d75108c2e9..7aca28b7d0fd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2492,6 +2492,7 @@ int skb_do_redirect(struct sk_buff *skb)
 			     net_eq(net, dev_net(dev))))
 			goto out_drop;
 		skb->dev = dev;
+		dev_sw_netstats_rx_add(dev, skb->len);
 		return -EAGAIN;
 	}
 	return flags & BPF_F_NEIGH ?