Message ID | 20231221224311.130319-1-brad@faucet.nz |
---|---|
State | New |
Headers |
From: Brad Cowie <brad@faucet.nz>
To: netdev@vger.kernel.org
Cc: pablo@netfilter.org, kadlec@netfilter.org, fw@strlen.de, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org, pshelar@ovn.org, dev@openvswitch.org, Brad Cowie <brad@faucet.nz>
Subject: [PATCH net] netfilter: nf_nat: fix action not being set for all ct states
Date: Fri, 22 Dec 2023 11:43:11 +1300
Message-Id: <20231221224311.130319-1-brad@faucet.nz>
X-Mailer: git-send-email 2.34.1 |
Series |
[net] netfilter: nf_nat: fix action not being set for all ct states
|
|
Commit Message
Brad Cowie
Dec. 21, 2023, 10:43 p.m. UTC
This fixes openvswitch's handling of nat packets in the related state.

In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
not been dropped, will follow the goto, however the placement of the
goto label means that updating the action bit field will be bypassed.

This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
which means the openvswitch match key for the ICMP/ICMPv6 packet is not
updated and the pre-nat value will be retained for the key, which will
result in the wrong openflow rule being matched for that packet.

Move the goto label above where the action bit field is being set so
that it is updated in all cases where the packet is accepted.

Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
Signed-off-by: Brad Cowie <brad@faucet.nz>
---
net/netfilter/nf_nat_ovs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
+ Xin Long <lucien.xin@gmail.com>
  Aaron Conole <aconole@redhat.com>
  coreteam@netfilter.org

On Fri, Dec 22, 2023 at 11:43:11AM +1300, Brad Cowie wrote:
> This fixes openvswitch's handling of nat packets in the related state.
>
> In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
> packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
> not been dropped, will follow the goto, however the placement of the
> goto label means that updating the action bit field will be bypassed.
>
> This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
> which means the openvswitch match key for the ICMP/ICMPv6 packet is not
> updated and the pre-nat value will be retained for the key, which will
> result in the wrong openflow rule being matched for that packet.
>
> Move the goto label above where the action bit field is being set so
> that it is updated in all cases where the packet is accepted.
>
> Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
> Signed-off-by: Brad Cowie <brad@faucet.nz>

Thanks Brad,

I agree with your analysis and that the problem appears to
have been introduced by the cited commit.

I am curious to know what use case triggers this /
why it went unnoticed for a year.

But in any case, this fix looks good to me.

Reviewed-by: Simon Horman <horms@kernel.org>

> ---
>  net/netfilter/nf_nat_ovs.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
> index 551abd2da614..0f9a559f6207 100644
> --- a/net/netfilter/nf_nat_ovs.c
> +++ b/net/netfilter/nf_nat_ovs.c
> @@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
>  	}
>
>  	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
> +out:
>  	if (err == NF_ACCEPT)
>  		*action |= BIT(maniptype);
> -out:
> +
>  	return err;
>  }
>
> --
> 2.34.1
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
On Sun, 24 Dec 2023 at 10:13, Simon Horman <horms@kernel.org> wrote:
> Thanks Brad,
>
> I agree with your analysis and that the problem appears to
> have been introduced by the cited commit.

Thanks for the review Simon.

> I am curious to know what use case triggers this /
> why it went unnoticed for a year.

We encountered this issue while upgrading some routers from
linux 5.15 to 6.2. The dataplane on these routers is provided
by an openvswitch bridge which is controlled via openflow by
faucet. These routers are also performing SNAT on all traffic
to/from the wan interface via openvswitch conntrack openflow
rules.

We noticed that after upgrading the linux kernel, traceroute/mtr
no longer worked when run from clients behind the router.
We eventually discovered the reason for this is that the
ICMP time exceeded messages elicited by traceroute were
matching openflow rules with the incorrect destination ip,
despite there being an openflow rule to undo the nat.
Other packets in the established or new state matched the
expected openflow rules.

A git bisect between 5.15 and 6.2 showed that this change in
behaviour was introduced by commit ebddb1404900. After the
above patch is applied our routers perform nat correctly
again for traceroute/mtr.
On Sat, Dec 23, 2023 at 9:48 PM Brad Cowie <brad@faucet.nz> wrote:
>
> On Sun, 24 Dec 2023 at 10:13, Simon Horman <horms@kernel.org> wrote:
> > Thanks Brad,
> >
> > I agree with your analysis and that the problem appears to
> > have been introduced by the cited commit.
>
> Thanks for the review Simon.
>
> > I am curious to know what use case triggers this /
> > why it went unnoticed for a year.
>
> We encountered this issue while upgrading some routers from
> linux 5.15 to 6.2. The dataplane on these routers is provided
> by an openvswitch bridge which is controlled via openflow by
> faucet. These routers are also performing SNAT on all traffic
> to/from the wan interface via openvswitch conntrack openflow
> rules.
>
> We noticed that after upgrading the linux kernel, traceroute/mtr
> no longer worked when run from clients behind the router.
> We eventually discovered the reason for this is that the
> ICMP time exceeded messages elicited by traceroute were
> matching openflow rules with the incorrect destination ip,
> despite there being an openflow rule to undo the nat.
> Other packets in the established or new state matched the
> expected openflow rules.
>
> A git bisect between 5.15 and 6.2 showed that this change in
> behaviour was introduced by commit ebddb1404900. After the
> above patch is applied our routers perform nat correctly
> again for traceroute/mtr.

Acked-by: Xin Long <lucien.xin@gmail.com>
Simon Horman <horms@kernel.org> writes:

> + Xin Long <lucien.xin@gmail.com>
>   Aaron Conole <aconole@redhat.com>
>   coreteam@netfilter.org
>
> On Fri, Dec 22, 2023 at 11:43:11AM +1300, Brad Cowie wrote:
>> This fixes openvswitch's handling of nat packets in the related state.
>>
>> In nf_ct_nat_execute(), which is called from nf_ct_nat(), ICMP/ICMPv6
>> packets in the IP_CT_RELATED or IP_CT_RELATED_REPLY state, which have
>> not been dropped, will follow the goto, however the placement of the
>> goto label means that updating the action bit field will be bypassed.
>>
>> This causes ovs_nat_update_key() to not be called from ovs_ct_nat()
>> which means the openvswitch match key for the ICMP/ICMPv6 packet is not
>> updated and the pre-nat value will be retained for the key, which will
>> result in the wrong openflow rule being matched for that packet.
>>
>> Move the goto label above where the action bit field is being set so
>> that it is updated in all cases where the packet is accepted.
>>
>> Fixes: ebddb1404900 ("net: move the nat function to nf_nat_ovs for ovs and tc")
>> Signed-off-by: Brad Cowie <brad@faucet.nz>
>
> Thanks Brad,
>
> I agree with your analysis and that the problem appears to
> have been introduced by the cited commit.
>
> I am curious to know what use case triggers this /
> why it went unnoticed for a year.
>
> But in any case, this fix looks good to me.
>
> Reviewed-by: Simon Horman <horms@kernel.org>
>
>> ---

LGTM. I guess we should try to codify the specific flows that were used
to flag this into the ovs selftest - we clearly have a missing case
after NAT lookup. I'll add it to my (ever growing) list.

Meanwhile,

Acked-by: Aaron Conole <aconole@redhat.com>

>>  net/netfilter/nf_nat_ovs.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
>> index 551abd2da614..0f9a559f6207 100644
>> --- a/net/netfilter/nf_nat_ovs.c
>> +++ b/net/netfilter/nf_nat_ovs.c
>> @@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
>>  	}
>>
>>  	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
>> +out:
>>  	if (err == NF_ACCEPT)
>>  		*action |= BIT(maniptype);
>> -out:
>> +
>>  	return err;
>>  }
>>
>> --
>> 2.34.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
Applied to nf.git, thanks everyone for reviewing.
On Wed, 3 Jan 2024 at 04:10, Aaron Conole <aconole@redhat.com> wrote:
> LGTM. I guess we should try to codify the specific flows that were used
> to flag this into the ovs selftest - we clearly have a missing case
> after NAT lookup.

Thanks for the review Aaron, and the sensible suggestion to
add a test to ovs to avoid this problem occurring again in future.

I've simplified our NAT ruleset and turned it into an ovs system test,
which I've submitted as a patch [1] to ovs-dev. The test reproduces
the issue introduced by ebddb1404900 and passes when e6345d2824a3
is applied.

[1]: https://mail.openvswitch.org/pipermail/ovs-dev/2024-January/410476.html
diff --git a/net/netfilter/nf_nat_ovs.c b/net/netfilter/nf_nat_ovs.c
index 551abd2da614..0f9a559f6207 100644
--- a/net/netfilter/nf_nat_ovs.c
+++ b/net/netfilter/nf_nat_ovs.c
@@ -75,9 +75,10 @@ static int nf_ct_nat_execute(struct sk_buff *skb, struct nf_conn *ct,
 	}
 
 	err = nf_nat_packet(ct, ctinfo, hooknum, skb);
+out:
 	if (err == NF_ACCEPT)
 		*action |= BIT(maniptype);
-out:
+
 	return err;
 }
 