Message ID | ZLbYdpWC8zt9EJtq@debian.debian |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp1942380vqt; Tue, 18 Jul 2023 11:50:58 -0700 (PDT) X-Google-Smtp-Source: APBJJlEgHw9P6VQ8tLBFJoPZ7NCKXSjvZAkD3SqU9WJNWbbmK4Tzpbi/ac6OOi1EYHRSY3i+pOYE X-Received: by 2002:a05:6358:2486:b0:135:5ede:f34e with SMTP id m6-20020a056358248600b001355edef34emr296013rwc.5.1689706257637; Tue, 18 Jul 2023 11:50:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689706257; cv=none; d=google.com; s=arc-20160816; b=QAYtd0K45mmpTnIgX4/iRsQMTsuRl5K/emWq3OLpBjry2t6iF+3KQXML4v3BaflEUW XSOMjqmrg+BSKyIJlNBpjFqfEcFh5Q85cKi6SJLCWr04h7RLeBmOR5BVceZWElVvSdUm ySiLmGf4+hzY99r/Cs05i4NhtZszbzyve56aRcLxs+AeNNVP90X4nWTtZBL3rtMd/1e4 6442q2kOS/sIuIgqRqwV2RfZG3MlIBS8DRNZWhhKxBpaOY+Xf9faF+HI+wJoPkZb540K 6ABJ/YwNTgasDOmwILhJI+s6f4WrW47cR4F73hIBgs320yJJJMHZhpyXGujAnrW8rvAf /yhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:message-id :subject:cc:to:from:date:dkim-signature; bh=8AccF1SH0Y/JdOIILtRJfpt1KSbJ1RJLB1Gr9T87MgQ=; fh=V4eeLDpn9pvaqrbalX5vcURk0S5V52SdvyZcP8R31kQ=; b=eldTqyOPndoSQRk8tdtoICCU9RMHB5MI5tfpIrHKIr1llBGaS+VxJxNqV5d0PLjrqJ hqiKM1/2YTctI85X0K5IeXBHn6Xnvt5rMsfhwU0CZAG7bP/NUko/JO8hdPgezK3ytgKw jNTqpy+Oy8s+0GHaf7MzWTZMyfN9cLozyDq5qhw61K46XSKqQ7NEGLGAplD/KmKZFWBZ UE06PMEVDbhT0ieYBy+ObA7w8eRWkVv7tCgjK1NqWleG3VbuGbE3HpXVZOekvzHre0et Iu6jyanOKyI/PTnl95j2EiDJxQu0Zd9Yh/Y4bdAADNoBr2JyFPUMUj+qLUEehKidreTZ qW3g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@cloudflare.com header.s=google header.b=y55DQ0cc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=cloudflare.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 63-20020a630242000000b0055c8819f0easi1843674pgc.726.2023.07.18.11.50.42; Tue, 18 Jul 2023 11:50:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@cloudflare.com header.s=google header.b=y55DQ0cc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=cloudflare.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229900AbjGRSXZ (ORCPT <rfc822;assdfgzxcv4@gmail.com> + 99 others); Tue, 18 Jul 2023 14:23:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55994 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229888AbjGRSXQ (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 18 Jul 2023 14:23:16 -0400 Received: from mail-qk1-x731.google.com (mail-qk1-x731.google.com [IPv6:2607:f8b0:4864:20::731]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F5902133 for <linux-kernel@vger.kernel.org>; Tue, 18 Jul 2023 11:22:51 -0700 (PDT) Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-76754b9eac0so565824085a.0 for <linux-kernel@vger.kernel.org>; Tue, 18 Jul 2023 11:22:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; t=1689704569; x=1692296569; h=content-disposition:mime-version:message-id:subject:cc:to:from:date :from:to:cc:subject:date:message-id:reply-to; bh=8AccF1SH0Y/JdOIILtRJfpt1KSbJ1RJLB1Gr9T87MgQ=; b=y55DQ0ccRWb6COUVaYx73KAnztfT22L6tm0bmuaM50598rvJ5Y6eD7a8Et8Mvnxr4v 2apMRynnMF1C3N9iuaHqnlHa6dMxDpRuPJiaxWWYtaPTgV+e9kUf5Pfjg0bX6xK1Tt91 KlUrfzFAwy2kOwP3ylFCArl6WDG/yTJx2SdJ4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689704569; x=1692296569; h=content-disposition:mime-version:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=8AccF1SH0Y/JdOIILtRJfpt1KSbJ1RJLB1Gr9T87MgQ=; b=ObVXCtO8+F0TM3KqVjnyu/jtKGuEzd1X1Ggo0CeocjCS09do9+12jPVq/gk2XpEGki sKTZlfPYUdZheATznRXEdE8htUrTuaqkIPzUT7IEJVUJtxxx0WaPZTGcUt7VNQ2KFdFz WBeK/j3XjJdUUronOGNVZI1upIDczU0HhLOyqCOhbbFWnBZVWGVKqhxsJ0MNlR9QUbY0 exkYCiLyahq0i8gvdPCaD85Z2n2wxWzhz9gxk36VWSQigAWOd6OIjuJhS5E2DN/KjDEy 7gqAFVn3NlgavNjtZyEBZk82gAVGqWskjlX1IW66pBUAl0rLvZ74NgRPRNq/My69sVhI m46A== X-Gm-Message-State: ABy/qLbGVcdS6d9u3YLK2PMQKiz79H8q3aZVsvmNHYoYDb12Ef3d6TAF hSSo1aIQd7PoiAkOb6oX6Qdytw== X-Received: by 2002:a05:620a:4512:b0:767:1938:93c7 with SMTP id t18-20020a05620a451200b00767193893c7mr682655qkp.43.1689704569008; Tue, 18 Jul 2023 11:22:49 -0700 (PDT) Received: from debian.debian ([140.141.197.139]) by smtp.gmail.com with ESMTPSA id pi30-20020a05620a379e00b00767d572d651sm762073qkn.87.2023.07.18.11.22.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Jul 2023 11:22:48 -0700 (PDT) Date: Tue, 18 Jul 2023 11:22:46 -0700 From: Yan Zhai <yan@cloudflare.com> To: "open list:BPF [NETWORKING] (tc BPF, sock_addr)" <bpf@vger.kernel.org> Cc: kernel-team@cloudflare.com, Martin KaFai Lau <martin.lau@linux.dev>, Daniel Borkmann <daniel@iogearbox.net>, John Fastabend <john.fastabend@gmail.com>, Alexei Starovoitov <ast@kernel.org>, Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>, KP Singh <kpsingh@kernel.org>, Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, "open list:BPF [NETWORKING] (tc BPF, sock_addr)" <netdev@vger.kernel.org>, open list <linux-kernel@vger.kernel.org>, Jordan Griege <jgriege@cloudflare.com> Subject: [PATCH] bpf: lwt: do not return NET_XMIT_xxx values on bpf_redirect Message-ID: <ZLbYdpWC8zt9EJtq@debian.debian> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771785428851614334 X-GMAIL-MSGID: 1771785428851614334 |
Series |
bpf: lwt: do not return NET_XMIT_xxx values on bpf_redirect
|
|
Commit Message
Yan Zhai
July 18, 2023, 6:22 p.m. UTC
skb_do_redirect handles returns error code from both rx and tx path.
The tx path codes are special, e.g. NET_XMIT_CN: they are
non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly
returning such code can cause unexpected behavior. We found at least
one bug that will panic the kernel through KASAN report when we
accidentally redirect packets to a down or carrier-down device at lwt
xmit hook:
https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48
Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the
down device, and it propagates from dev_queue_xmit all way to the lwt
logic. Although skb has been freed by the qdisc, it still continues to
neighbor subsystem and triggers the bug.
This change converts the tx code to proper errors that lwt can consume.
Reported-by: Jordan Griege <jgriege@cloudflare.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
---
net/core/filter.c | 5 +++++
1 file changed, 5 insertions(+)
Comments
On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote: > > skb_do_redirect handles returns error code from both rx and tx path. > The tx path codes are special, e.g. NET_XMIT_CN: they are > non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly > returning such code can cause unexpected behavior. We found at least > one bug that will panic the kernel through KASAN report when we > accidentally redirect packets to a down or carrier-down device at lwt > xmit hook: > > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48 > > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the > down device, and it propagates from dev_queue_xmit all way to the lwt > logic. Although skb has been freed by the qdisc, it still continues to > neighbor subsystem and triggers the bug. > > This change converts the tx code to proper errors that lwt can consume. > > Reported-by: Jordan Griege <jgriege@cloudflare.com> > Signed-off-by: Yan Zhai <yan@cloudflare.com> > --- > net/core/filter.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/net/core/filter.c b/net/core/filter.c > index 06ba0e56e369..c9cc501ecdc0 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) > ret = dev_queue_xmit(skb); > dev_xmit_recursion_dec(); > > + // We should not return NET_XMIT_xxx here since it will conflict with > + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. C++ comments; should be /* */. But, also, maybe they are not really needed? ret = dev_queue_xmit(skb); if (ret) ret = net_xmit_errno(ret); We have a bunch of places with the pattern like this, so probably can do the same here? > + if (unlikely(ret != NET_XMIT_SUCCESS)) > + ret = net_xmit_errno(ret); > + > return ret; > } > > -- > 2.30.2 >
On Tue, Jul 18, 2023 at 3:29 PM Stanislav Fomichev <sdf@google.com> wrote: > > On Tue, Jul 18, 2023 at 11:22 AM Yan Zhai <yan@cloudflare.com> wrote: > > > > skb_do_redirect handles returns error code from both rx and tx path. > > The tx path codes are special, e.g. NET_XMIT_CN: they are > > non-negative, and can conflict with LWTUNNEL_XMIT_xxx values. Directly > > returning such code can cause unexpected behavior. We found at least > > one bug that will panic the kernel through KASAN report when we > > accidentally redirect packets to a down or carrier-down device at lwt > > xmit hook: > > > > https://gist.github.com/zhaiyan920/8fbac245b261fe316a7ef04c9b1eba48 > > > > Above bug is hit because NET_XMIT_CN is returned by noop_qdisc of the > > down device, and it propagates from dev_queue_xmit all way to the lwt > > logic. Although skb has been freed by the qdisc, it still continues to > > neighbor subsystem and triggers the bug. > > > > This change converts the tx code to proper errors that lwt can consume. > > > > Reported-by: Jordan Griege <jgriege@cloudflare.com> > > Signed-off-by: Yan Zhai <yan@cloudflare.com> > > --- > > net/core/filter.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/net/core/filter.c b/net/core/filter.c > > index 06ba0e56e369..c9cc501ecdc0 100644 > > --- a/net/core/filter.c > > +++ b/net/core/filter.c > > @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) > > ret = dev_queue_xmit(skb); > > dev_xmit_recursion_dec(); > > > > + // We should not return NET_XMIT_xxx here since it will conflict with > > + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. > > C++ comments; should be /* */. But, also, maybe they are not really needed? > *facepalm* yes I think we can remove them since the commit message already covers it... > ret = dev_queue_xmit(skb); > if (ret) > ret = net_xmit_errno(ret); > > We have a bunch of places with the pattern like this, so probably can > do the same here? > Personally I like an explicit name better, since not all the return codes use 0 to signal success, e.g. XDP_PASS, TC_ACT_PIPE. But I'd leave that for future improvements now that all other places use 0 on this. thanks Yan > > + if (unlikely(ret != NET_XMIT_SUCCESS)) > > + ret = net_xmit_errno(ret); > > + > > return ret; > > } > > > > -- > > 2.30.2 > >
diff --git a/net/core/filter.c b/net/core/filter.c index 06ba0e56e369..c9cc501ecdc0 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2129,6 +2129,11 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb) ret = dev_queue_xmit(skb); dev_xmit_recursion_dec(); + // We should not return NET_XMIT_xxx here since it will conflict with + // LWTUNNEL_XMIT_xxx values. Convert the return value to errno instead. + if (unlikely(ret != NET_XMIT_SUCCESS)) + ret = net_xmit_errno(ret); + return ret; }