[net-next,RFC] net: dsa: qca8k: make learning configurable and keep off if standalone
Message ID | 20230623114005.9680-1-ansuelsmth@gmail.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp5708129vqr; Fri, 23 Jun 2023 04:46:34 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ58JvgIUjo2GK8QMFgSTmwMWgcGwQ9Nnopq41pC6MWponme6r6AypcsEGXzpa5q0wcauKNO X-Received: by 2002:a05:6a20:7f88:b0:122:a808:dbb9 with SMTP id d8-20020a056a207f8800b00122a808dbb9mr10410894pzj.41.1687520794422; Fri, 23 Jun 2023 04:46:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687520794; cv=none; d=google.com; s=arc-20160816; b=uv//8xzn0V+ZFVFrG7IGJPjpQnMFLALwIDzkuuqGBYLMQqadAMN0kIX18eY12fsJsf 3N+Jl3vhXA4iUfDKsRTfwOQft6fYf6H2d3D6YrTC9lJ/1Nf0FISSNI57BSbtKR63xHpi gLVZPyWLdaO2uTkSAC+m0TbhaysyEerptHPYrV6yXsQUjVZVlcexbeA8a+QFdlnspnw3 +sIjdp1wwaNQbEES8Q3lbNhG1DXbcgfAsCBmSF4J6T8Cb2vLtVEoRUWBgkq5liHpku7U JzBid7F7cI3rJnTH1OtIK4+Ctd1rtwWxqyY5CLAR8QSOf8yJdNE9cHIQ/WYia4UnyLWT ScPg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:to:from:dkim-signature; bh=J/u0R3l9zBf01BjxNH8vuW4kixZhC6G1lWgIfS9JJbg=; b=Z6krwngYlkV1nDszfcOwAAhuhERgvKP+WveFUt08lBRUiqJtWgUuFAtgPSgq4VrOoN dAD6Tr9wuqAT1ZKvVJSlaPbtq2byUgozu7uIagijaP0/S0rxDNXd+vn/ubXR25ikdzcQ i/I9BNNWXLaFcspdO/SmUObSWp2WawfiqOUw7D+WO5qdLBUqF0LVgEtSG69mObvdgZLI y4QCWEF4xzDs36PLPhpTiYP2nAuJ3YpzIg5c47fNiCEaae7N5A9VoRRNYaCkiGg2/MET 4q0RXlta4BtuQxkbTJ4kOyo0iVRjKkv42sCtmHrkQmW+9RXW5J1vxVzDZt2nTb031Lol 52hw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=VXsbLpks; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id la7-20020a170902fa0700b001b04c325d66si7929271plb.565.2023.06.23.04.46.21; Fri, 23 Jun 2023 04:46:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=VXsbLpks; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230117AbjFWLkh (ORCPT <rfc822;maxin.john@gmail.com> + 99 others); Fri, 23 Jun 2023 07:40:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36528 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229448AbjFWLkg (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 23 Jun 2023 07:40:36 -0400 Received: from mail-wr1-x430.google.com (mail-wr1-x430.google.com [IPv6:2a00:1450:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AF48E65; Fri, 23 Jun 2023 04:40:31 -0700 (PDT) Received: by mail-wr1-x430.google.com with SMTP id ffacd0b85a97d-311153ec442so529993f8f.1; Fri, 23 Jun 2023 04:40:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687520430; x=1690112430; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=J/u0R3l9zBf01BjxNH8vuW4kixZhC6G1lWgIfS9JJbg=; b=VXsbLpkseTtrpPd4T7qF6FRW4cQlJgmBd2UNomC0GfcTT8wzO9fme3BhauIQmkbEaW Eenl7lfXVplVmXJqtbDHGidGt31tvVUIKX022yNwnIynto43faf6KSczDFyP4jHuMzrp juVGJhNsZ45lGf13d0Sd27gMajOQELyAoq/PrsgzorvdnsCAhrq9oev1gu8xAbvL8gM1 JWRiQy2ARKUMxpqUM47EeS8yQr6qpeI5fYMpR5yGkZtMHqg2QkeQKoYyMm1kpDwSU9B1 PeX4pEpGlL/2y3jaavMwBP/FU66A+sGixOAh1Mu8avjAsPDo0ivtVEyAiMpy5yv4W5gJ IYWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687520430; x=1690112430; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=J/u0R3l9zBf01BjxNH8vuW4kixZhC6G1lWgIfS9JJbg=; b=X7+nrxu2ErZsUtoAXvzWOd0B3BfKA5pAiF6gqtdhVfcDgaHH3UWuuSk+iiEu/kzO5/ qIJT7LcTeohJ51bPQZzuIlB5fHXKPyC+OHKfZUSx0D8ZuuECTB9/wCOL6682bZNwQPXo lhs2wqh0DDO4zc/+/3sD08ETdkZTgYUxanDD54475bcP4mHTp94GYPrPnKP1GksLBTdG d6S2SVl8BlfTklsSeXdstnlYXONAzMds0SLGO7DF8DrfkQwi29GUqhUnBm/eE0VIZlTV TwIIrRgpHjN46bSJxZ0rZaUkzwHg+l8O3WtnZJTxFjugeooZZkr/UsN3lNzDtYg/j/z+ LURA== X-Gm-Message-State: AC+VfDzzB3D7glOWpNx/DXmYooUtec4k+qu+Smo3nV0m6Q4VW7BzK6b8 1ktPNtRHcLGTJIxs+cSYKQE= X-Received: by 2002:a5d:40c5:0:b0:311:133c:2903 with SMTP id b5-20020a5d40c5000000b00311133c2903mr17879799wrq.15.1687520429713; Fri, 23 Jun 2023 04:40:29 -0700 (PDT) Received: from localhost.localdomain (93-34-93-173.ip49.fastwebnet.it. [93.34.93.173]) by smtp.googlemail.com with ESMTPSA id k3-20020a5d4283000000b00312780bedc3sm9297384wrq.56.2023.06.23.04.40.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 23 Jun 2023 04:40:29 -0700 (PDT) From: Christian Marangi <ansuelsmth@gmail.com> To: Andrew Lunn <andrew@lunn.ch>, Florian Fainelli <f.fainelli@gmail.com>, Vladimir Oltean <olteanv@gmail.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Christian Marangi <ansuelsmth@gmail.com>, Atin Bainada <hi@atinb.me>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [net-next PATCH RFC] net: dsa: qca8k: make learning configurable and keep off if standalone Date: Fri, 23 Jun 2023 13:40:05 +0200 Message-Id: <20230623114005.9680-1-ansuelsmth@gmail.com> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769493804673030619?= X-GMAIL-MSGID: =?utf-8?q?1769493804673030619?= |
Series |
[net-next,RFC] net: dsa: qca8k: make learning configurable and keep off if standalone
|
|
Commit Message
Christian Marangi
June 23, 2023, 11:40 a.m. UTC
Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().
Currently this is not the case for qca8k where learning is enabled
unconditionally in qca8k_setup for every user port.
Handle ports configured in standalone mode by making the learning
configurable and not enabling it by default.
Implement .port_pre_bridge_flags and .port_bridge_flags dsa ops to
enable learning for bridge that request it and tweak
.port_stp_state_set to correctly disable learning when port is
configured in standalone mode.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
Posting as RFC as I would love some comments from Vladimir for correct
implementation of this. This was suggested to be on par with offload
bridge API and I used as example [1] (commit that does the same thing
with a microchip switch)
I didn't want to bloat the priv struct with additional info with the
port state and from what I can see it seems using dp->learning is enough
to understand if learning is currently enabled for the port or not but I
would love to have some confirmation about this. (from what I notice
when port is set in standalone mode, flags are cleared so it should be
correct)
I also verified this working by dumping the fdb in a bridge
configuration and in a standalone configuration. Traffic works in both
configuration.
Dump WITH BRIDGE CONFIGURATION:
01:00:5e:00:00:01 dev eth0 self permanent
33:33:00:00:00:02 dev eth0 self permanent
33:33:00:00:00:01 dev eth0 self permanent
33:33:ff:f2:5d:50 dev eth0 self permanent
33:33:ff:00:00:00 dev eth0 self permanent
dc:ef:09:f2:5d:4f dev eth1 self permanent
33:33:00:00:00:01 dev eth1 self permanent
33:33:00:00:00:02 dev eth1 self permanent
01:00:5e:00:00:01 dev eth1 self permanent
33:33:ff:f2:5d:4f dev eth1 self permanent
33:33:ff:00:00:00 dev eth1 self permanent
c0:3e:ba:c1:d7:47 dev lan1 master br-lan
dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent
dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent
c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self
33:33:00:00:00:01 dev wlan0 self permanent
33:33:00:00:00:02 dev wlan0 self permanent
33:33:00:00:00:01 dev wlan1 self permanent
33:33:00:00:00:02 dev wlan1 self permanent
33:33:00:00:00:01 dev br-lan self permanent
33:33:00:00:00:02 dev br-lan self permanent
01:00:5e:00:00:01 dev br-lan self permanent
33:33:ff:00:00:01 dev br-lan self permanent
33:33:ff:f2:5d:4f dev br-lan self permanent
33:33:00:01:00:02 dev br-lan self permanent
33:33:00:01:00:03 dev br-lan self permanent
33:33:ff:00:00:00 dev br-lan self permanent
Dump WITH STANDALONE CONFIGURATION:
01:00:5e:00:00:01 dev eth0 self permanent
33:33:00:00:00:02 dev eth0 self permanent
33:33:00:00:00:01 dev eth0 self permanent
33:33:ff:f2:5d:50 dev eth0 self permanent
33:33:ff:00:00:00 dev eth0 self permanent
33:33:00:00:00:01 dev eth1 self permanent
33:33:00:00:00:02 dev eth1 self permanent
01:00:5e:00:00:01 dev eth1 self permanent
33:33:ff:f2:5d:4f dev eth1 self permanent
33:33:ff:00:00:01 dev eth1 self permanent
33:33:ff:00:00:00 dev eth1 self permanent
33:33:00:01:00:02 dev eth1 self permanent
33:33:00:01:00:03 dev eth1 self permanent
33:33:00:00:00:01 dev wlan0 self permanent
33:33:00:00:00:02 dev wlan0 self permanent
33:33:00:00:00:01 dev wlan1 self permanent
33:33:00:00:00:02 dev wlan1 self permanent
From what I can see there isn't any self entry with the MAC address of
the connected device and this should confirm that learning is actually
disabled.
Hope this is enough to test this feature and I would ask what would be
the next step to reach a point where port_change_master can be
implemented.
(would also love to see what are the criteria to enable offload_fwd_mask
on the targget for rcv and eventually for xmit)
Thanks for any response and sorry for the long comments.
[1] https://github.com/torvalds/linux/commit/15f7cfae912e
drivers/net/dsa/qca/qca8k-8xxx.c | 8 ++----
drivers/net/dsa/qca/qca8k-common.c | 40 ++++++++++++++++++++++++++++++
drivers/net/dsa/qca/qca8k.h | 6 +++++
3 files changed, 48 insertions(+), 6 deletions(-)
Comments
On Fri, Jun 23, 2023 at 01:40:05PM +0200, Christian Marangi wrote: > Address learning should initially be turned off by the driver for port > operation in standalone mode, then the DSA core handles changes to it > via ds->ops->port_bridge_flags(). > > Currently this is not the case for qca8k where learning is enabled > unconditionally in qca8k_setup for every user port. > > Handle ports configured in standalone mode by making the learning > configurable and not enabling it by default. > > Implement .port_pre_bridge_flags and .port_bridge_flags dsa ops to > enable learning for bridge that request it and tweak > .port_stp_state_set to correctly disable learning when port is > configured in standalone mode. > > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> > --- > > Posting as RFC as I would love some comments from Vladimir for correct > implementation of this. This was suggested to be on par with offload > bridge API and I used as example [1] (commit that does the same thing > with a microchip switch) > > I didn't want to bloat the priv struct with additional info with the > port state and from what I can see it seems using dp->learning is enough > to understand if learning is currently enabled for the port or not but I > would love to have some confirmation about this. (from what I notice > when port is set in standalone mode, flags are cleared so it should be > correct) In principle you can use dp->learning, but in this case you are using it incorrectly (more below). > > I also verified this working by dumping the fdb in a bridge > configuration and in a standalone configuration. Traffic works in both > configuration. > > Dump WITH BRIDGE CONFIGURATION: > 01:00:5e:00:00:01 dev eth0 self permanent > 33:33:00:00:00:02 dev eth0 self permanent > 33:33:00:00:00:01 dev eth0 self permanent > 33:33:ff:f2:5d:50 dev eth0 self permanent > 33:33:ff:00:00:00 dev eth0 self permanent > dc:ef:09:f2:5d:4f dev eth1 self permanent > 33:33:00:00:00:01 dev eth1 self permanent > 33:33:00:00:00:02 dev eth1 self permanent > 01:00:5e:00:00:01 dev eth1 self permanent > 33:33:ff:f2:5d:4f dev eth1 self permanent > 33:33:ff:00:00:00 dev eth1 self permanent > c0:3e:ba:c1:d7:47 dev lan1 master br-lan > dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent > dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent > c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self > 33:33:00:00:00:01 dev wlan0 self permanent > 33:33:00:00:00:02 dev wlan0 self permanent > 33:33:00:00:00:01 dev wlan1 self permanent > 33:33:00:00:00:02 dev wlan1 self permanent > 33:33:00:00:00:01 dev br-lan self permanent > 33:33:00:00:00:02 dev br-lan self permanent > 01:00:5e:00:00:01 dev br-lan self permanent > 33:33:ff:00:00:01 dev br-lan self permanent > 33:33:ff:f2:5d:4f dev br-lan self permanent > 33:33:00:01:00:02 dev br-lan self permanent > 33:33:00:01:00:03 dev br-lan self permanent > 33:33:ff:00:00:00 dev br-lan self permanent > > Dump WITH STANDALONE CONFIGURATION: > 01:00:5e:00:00:01 dev eth0 self permanent > 33:33:00:00:00:02 dev eth0 self permanent > 33:33:00:00:00:01 dev eth0 self permanent > 33:33:ff:f2:5d:50 dev eth0 self permanent > 33:33:ff:00:00:00 dev eth0 self permanent > 33:33:00:00:00:01 dev eth1 self permanent > 33:33:00:00:00:02 dev eth1 self permanent > 01:00:5e:00:00:01 dev eth1 self permanent > 33:33:ff:f2:5d:4f dev eth1 self permanent > 33:33:ff:00:00:01 dev eth1 self permanent > 33:33:ff:00:00:00 dev eth1 self permanent > 33:33:00:01:00:02 dev eth1 self permanent > 33:33:00:01:00:03 dev eth1 self permanent > 33:33:00:00:00:01 dev wlan0 self permanent > 33:33:00:00:00:02 dev wlan0 self permanent > 33:33:00:00:00:01 dev wlan1 self permanent > 33:33:00:00:00:02 dev wlan1 self permanent The information from these dumps is pretty much irrelevant. > From what I can see there isn't any self entry with the MAC address of > the connected device and this should confirm that learning is actually > disabled. > > Hope this is enough to test this feature and I would ask what would be > the next step to reach a point where port_change_master can be > implemented. The way to test this patch would be to connect in loopback 2 standalone qca8k ports having the same MAC address, and ping from one to the other. ip netns add ns0 ip link set lan1 netns ns0 && ip -n ns0 link set lan1 up ip -n ns0 addr add 192.168.100.1/24 dev lan1 ip link set lan2 up && ip addr add 192.168.100.2/24 dev lan2 ping 192.168.100.1 Before, it shouldn't have worked, now it should. Once that basic precondition passes, you should be able to start looking at tools/testing/selftests/drivers/net/dsa/ and run those one by one. An interesting one would be local_termination.sh, which monitors the way in which frames reach the CPU. Though be aware that some sub-tests from that suite will fail on misconfigurations that are non-fatal (and don't impact functionality), just sub-optimal (affecting performance). Like sending unknown packets to the CPU when the port is non-promiscuous and software would drop those packets anyway. > (would also love to see what are the criteria to enable offload_fwd_mask > on the targget for rcv and eventually for xmit) For RX, skb->offload_fwd_mark = true (mark not mask) means that the software bridge shouldn't flood packets received on lanX towards other lanY ports that are part of the same hwdom, because the hardware already took care of that. [ the hardware domain is determined by dev_get_port_parent_id() -> devlink_compat_switch_id_get() and populated by dsa_port_devlink_setup() ] Obviously, the requirement is for the hardware to indeed take care of that :) Currently it doesn't flood to the other user ports that are part of the same bridge and have egress flooding enabled for that traffic type. It just floods to the CPU and software decides where to flood. It's a matter of implementing other brport flags, like BR_FLOOD and friends. For TX, skb->offload_fwd_mark means that the driver should be able to send a skb potentially towards multiple TX ports at the same time, as a result of an FDB lookup. This makes the bridge avoid cloning that skb and calling dev_queue_xmit() towards every individual port that it must reach. I would concentrate on RX and leave TX for later. > Thanks for any response and sorry for the long comments. > > > [1] https://github.com/torvalds/linux/commit/15f7cfae912e > > drivers/net/dsa/qca/qca8k-8xxx.c | 8 ++---- > drivers/net/dsa/qca/qca8k-common.c | 40 ++++++++++++++++++++++++++++++ > drivers/net/dsa/qca/qca8k.h | 6 +++++ > 3 files changed, 48 insertions(+), 6 deletions(-) > > diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c > index f08086ac2261..a9af270a03ce 100644 > --- a/drivers/net/dsa/qca/qca8k-8xxx.c > +++ b/drivers/net/dsa/qca/qca8k-8xxx.c > @@ -1963,12 +1963,6 @@ qca8k_setup(struct dsa_switch *ds) > if (ret) > return ret; > > - /* Enable ARP Auto-learning by default */ > - ret = regmap_set_bits(priv->regmap, QCA8K_PORT_LOOKUP_CTRL(i), > - QCA8K_PORT_LOOKUP_LEARN); > - if (ret) > - return ret; > - > /* For port based vlans to work we need to set the > * default egress vid > */ > @@ -2071,6 +2065,8 @@ static const struct dsa_switch_ops qca8k_switch_ops = { > .port_change_mtu = qca8k_port_change_mtu, > .port_max_mtu = qca8k_port_max_mtu, > .port_stp_state_set = qca8k_port_stp_state_set, > + .port_pre_bridge_flags = qca8k_port_pre_bridge_flags, > + .port_bridge_flags = qca8k_port_bridge_flags, > .port_bridge_join = qca8k_port_bridge_join, > .port_bridge_leave = qca8k_port_bridge_leave, > .port_fast_age = qca8k_port_fast_age, > diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c > index 8c2dc0e48ff4..f93defbd8b66 100644 > --- a/drivers/net/dsa/qca/qca8k-common.c > +++ b/drivers/net/dsa/qca/qca8k-common.c > @@ -559,8 +559,24 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, > return 0; > } > > +static int qca8k_port_configure_learning(struct dsa_switch *ds, int port, > + bool learning) > +{ > + struct qca8k_priv *priv = ds->priv; > + > + if (learning) > + return regmap_set_bits(priv->regmap, > + QCA8K_PORT_LOOKUP_CTRL(port), > + QCA8K_PORT_LOOKUP_LEARN); > + else > + return regmap_clear_bits(priv->regmap, > + QCA8K_PORT_LOOKUP_CTRL(port), > + QCA8K_PORT_LOOKUP_LEARN); > +} > + > void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) > { > + struct dsa_port *dp = dsa_to_port(ds, port); > struct qca8k_priv *priv = ds->priv; > u32 stp_state; > > @@ -585,6 +601,30 @@ void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) > > qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > QCA8K_PORT_LOOKUP_STATE_MASK, stp_state); > + > + qca8k_port_configure_learning(ds, port, dp->learning); Learning should be enabled only if we're in an STP state compatible with learning: BR_STATE_LEARNING, BR_STATE_FORWARDING. The dp->learning flag does not follow the STP state, it is just an override for that. So, the condition should be: "(state == BR_STATE_LEARNING || state == BR_STATE_FORWARDING) && dp->learning" > +} > + > +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, > + struct switchdev_brport_flags flags, > + struct netlink_ext_ack *extack) > +{ > + if (flags.mask & ~BR_LEARNING) > + return -EINVAL; > + > + return 0; > +} > + > +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, > + struct switchdev_brport_flags flags, > + struct netlink_ext_ack *extack) > +{ > + int ret; > + > + ret = qca8k_port_configure_learning(ds, port, > + flags.mask & ~BR_LEARNING); flags.mask contains the bits that have changed. flags.val contains the current value of all bits. Passing flags.mask & ~BR_LEARNING (the mask of changed flags except for BR_LEARNING) to qca8k_port_configure_learning() makes absolutely no sense. > + > + return ret; > } > > int qca8k_port_bridge_join(struct dsa_switch *ds, int port, > diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h > index c5cc8a172d65..8f88b7db384d 100644 > --- a/drivers/net/dsa/qca/qca8k.h > +++ b/drivers/net/dsa/qca/qca8k.h > @@ -522,6 +522,12 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e); > > /* Common bridge function */ > void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state); > +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, > + struct switchdev_brport_flags flags, > + struct netlink_ext_ack *extack); > +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, > + struct switchdev_brport_flags flags, > + struct netlink_ext_ack *extack); > int qca8k_port_bridge_join(struct dsa_switch *ds, int port, > struct dsa_bridge bridge, > bool *tx_fwd_offload, > -- > 2.40.1 >
On Sun, Jun 25, 2023 at 02:58:03PM +0300, Vladimir Oltean wrote: > On Fri, Jun 23, 2023 at 01:40:05PM +0200, Christian Marangi wrote: > > Address learning should initially be turned off by the driver for port > > operation in standalone mode, then the DSA core handles changes to it > > via ds->ops->port_bridge_flags(). > > > > Currently this is not the case for qca8k where learning is enabled > > unconditionally in qca8k_setup for every user port. > > > > Handle ports configured in standalone mode by making the learning > > configurable and not enabling it by default. > > > > Implement .port_pre_bridge_flags and .port_bridge_flags dsa ops to > > enable learning for bridge that request it and tweak > > .port_stp_state_set to correctly disable learning when port is > > configured in standalone mode. > > > > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> > > --- > > > > Posting as RFC as I would love some comments from Vladimir for correct > > implementation of this. This was suggested to be on par with offload > > bridge API and I used as example [1] (commit that does the same thing > > with a microchip switch) > > > > I didn't want to bloat the priv struct with additional info with the > > port state and from what I can see it seems using dp->learning is enough > > to understand if learning is currently enabled for the port or not but I > > would love to have some confirmation about this. (from what I notice > > when port is set in standalone mode, flags are cleared so it should be > > correct) > > In principle you can use dp->learning, but in this case you are using it > incorrectly (more below). > Hi, thanks a lot for the response and for catching the stupid mistake. > > > > I also verified this working by dumping the fdb in a bridge > > configuration and in a standalone configuration. Traffic works in both > > configuration. > > > > Dump WITH BRIDGE CONFIGURATION: > > 01:00:5e:00:00:01 dev eth0 self permanent > > 33:33:00:00:00:02 dev eth0 self permanent > > 33:33:00:00:00:01 dev eth0 self permanent > > 33:33:ff:f2:5d:50 dev eth0 self permanent > > 33:33:ff:00:00:00 dev eth0 self permanent > > dc:ef:09:f2:5d:4f dev eth1 self permanent > > 33:33:00:00:00:01 dev eth1 self permanent > > 33:33:00:00:00:02 dev eth1 self permanent > > 01:00:5e:00:00:01 dev eth1 self permanent > > 33:33:ff:f2:5d:4f dev eth1 self permanent > > 33:33:ff:00:00:00 dev eth1 self permanent > > c0:3e:ba:c1:d7:47 dev lan1 master br-lan > > dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent > > dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent > > c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self > > 33:33:00:00:00:01 dev wlan0 self permanent > > 33:33:00:00:00:02 dev wlan0 self permanent > > 33:33:00:00:00:01 dev wlan1 self permanent > > 33:33:00:00:00:02 dev wlan1 self permanent > > 33:33:00:00:00:01 dev br-lan self permanent > > 33:33:00:00:00:02 dev br-lan self permanent > > 01:00:5e:00:00:01 dev br-lan self permanent > > 33:33:ff:00:00:01 dev br-lan self permanent > > 33:33:ff:f2:5d:4f dev br-lan self permanent > > 33:33:00:01:00:02 dev br-lan self permanent > > 33:33:00:01:00:03 dev br-lan self permanent > > 33:33:ff:00:00:00 dev br-lan self permanent > > > > Dump WITH STANDALONE CONFIGURATION: > > 01:00:5e:00:00:01 dev eth0 self permanent > > 33:33:00:00:00:02 dev eth0 self permanent > > 33:33:00:00:00:01 dev eth0 self permanent > > 33:33:ff:f2:5d:50 dev eth0 self permanent > > 33:33:ff:00:00:00 dev eth0 self permanent > > 33:33:00:00:00:01 dev eth1 self permanent > > 33:33:00:00:00:02 dev eth1 self permanent > > 01:00:5e:00:00:01 dev eth1 self permanent > > 33:33:ff:f2:5d:4f dev eth1 self permanent > > 33:33:ff:00:00:01 dev eth1 self permanent > > 33:33:ff:00:00:00 dev eth1 self permanent > > 33:33:00:01:00:02 dev eth1 self permanent > > 33:33:00:01:00:03 dev eth1 self permanent > > 33:33:00:00:00:01 dev wlan0 self permanent > > 33:33:00:00:00:02 dev wlan0 self permanent > > 33:33:00:00:00:01 dev wlan1 self permanent > > 33:33:00:00:00:02 dev wlan1 self permanent > > The information from these dumps is pretty much irrelevant. > > > From what I can see there isn't any self entry with the MAC address of > > the connected device and this should confirm that learning is actually > > disabled. > > > > Hope this is enough to test this feature and I would ask what would be > > the next step to reach a point where port_change_master can be > > implemented. > > The way to test this patch would be to connect in loopback 2 standalone > qca8k ports having the same MAC address, and ping from one to the other. > > ip netns add ns0 > ip link set lan1 netns ns0 && ip -n ns0 link set lan1 up > ip -n ns0 addr add 192.168.100.1/24 dev lan1 > ip link set lan2 up && ip addr add 192.168.100.2/24 dev lan2 > ping 192.168.100.1 > > Before, it shouldn't have worked, now it should. I can confirm this works. > > Once that basic precondition passes, you should be able to start looking > at tools/testing/selftests/drivers/net/dsa/ and run those one by one. > An interesting one would be local_termination.sh, which monitors the way > in which frames reach the CPU. Though be aware that some sub-tests from > that suite will fail on misconfigurations that are non-fatal (and don't > impact functionality), just sub-optimal (affecting performance). Like > sending unknown packets to the CPU when the port is non-promiscuous and > software would drop those packets anyway. > Lots of difficult to run the selftests on a light fw but step at times I'm managing to make use of them (could be helpfull to add some comments in the .config saying that the testing port needs to be declared in the struct) (and maybe some additional checks on the kind of device type are required for the test to actually work (vrf, dummy, macvlan...) Anyway a run of local_termination.sh produce the following output. # selftests: drivers/net/dsa: local_termination.sh # TEST: lan1: Unicast IPv4 to primary MAC address [FAIL] # reception failed # TEST: lan1: Unicast IPv4 to macvlan MAC address [FAIL] # reception failed # TEST: lan1: Unicast IPv4 to unknown MAC address [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [FAIL] # reception failed # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [ OK ] # TEST: lan1: Multicast IPv4 to joined group [ OK ] # TEST: lan1: Multicast IPv4 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: lan1: Multicast IPv4 to unknown group, promisc [FAIL] # reception failed # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: lan1: Multicast IPv6 to joined group [ OK ] # TEST: lan1: Multicast IPv6 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: lan1: Multicast IPv6 to unknown group, promisc [FAIL] # reception failed # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] # TEST: br0: Unicast IPv4 to primary MAC address [FAIL] # reception failed # TEST: br0: Unicast IPv4 to macvlan MAC address [FAIL] # reception failed # TEST: br0: Unicast IPv4 to unknown MAC address [ OK ] # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [FAIL] # reception failed # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] # TEST: br0: Multicast IPv4 to joined group [ OK ] # TEST: br0: Multicast IPv4 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv4 to unknown group, promisc [FAIL] # reception failed # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: br0: Multicast IPv6 to joined group [ OK ] # TEST: br0: Multicast IPv6 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv6 to unknown group, promisc [FAIL] # reception failed # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] Things doesn't look good to me or I am wrong? > > (would also love to see what are the criteria to enable offload_fwd_mask > > on the targget for rcv and eventually for xmit) > > For RX, skb->offload_fwd_mark = true (mark not mask) means that the software > bridge shouldn't flood packets received on lanX towards other lanY ports that > are part of the same hwdom, because the hardware already took care of > that. > > [ the hardware domain is determined by dev_get_port_parent_id() -> > devlink_compat_switch_id_get() and populated by dsa_port_devlink_setup() ] > > Obviously, the requirement is for the hardware to indeed take care of that :) > Currently it doesn't flood to the other user ports that are part of the > same bridge and have egress flooding enabled for that traffic type. It > just floods to the CPU and software decides where to flood. It's a > matter of implementing other brport flags, like BR_FLOOD and friends. > > For TX, skb->offload_fwd_mark means that the driver should be able to > send a skb potentially towards multiple TX ports at the same time, as a > result of an FDB lookup. This makes the bridge avoid cloning that skb > and calling dev_queue_xmit() towards every individual port that it must > reach. I would concentrate on RX and leave TX for later. > Sure. > > Thanks for any response and sorry for the long comments. > > > > > > [1] https://github.com/torvalds/linux/commit/15f7cfae912e > > > > drivers/net/dsa/qca/qca8k-8xxx.c | 8 ++---- > > drivers/net/dsa/qca/qca8k-common.c | 40 ++++++++++++++++++++++++++++++ > > drivers/net/dsa/qca/qca8k.h | 6 +++++ > > 3 files changed, 48 insertions(+), 6 deletions(-) > > > > diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c > > index f08086ac2261..a9af270a03ce 100644 > > --- a/drivers/net/dsa/qca/qca8k-8xxx.c > > +++ b/drivers/net/dsa/qca/qca8k-8xxx.c > > @@ -1963,12 +1963,6 @@ qca8k_setup(struct dsa_switch *ds) > > if (ret) > > return ret; > > > > - /* Enable ARP Auto-learning by default */ > > - ret = regmap_set_bits(priv->regmap, QCA8K_PORT_LOOKUP_CTRL(i), > > - QCA8K_PORT_LOOKUP_LEARN); > > - if (ret) > > - return ret; > > - > > /* For port based vlans to work we need to set the > > * default egress vid > > */ > > @@ -2071,6 +2065,8 @@ static const struct dsa_switch_ops qca8k_switch_ops = { > > .port_change_mtu = qca8k_port_change_mtu, > > .port_max_mtu = qca8k_port_max_mtu, > > .port_stp_state_set = qca8k_port_stp_state_set, > > + .port_pre_bridge_flags = qca8k_port_pre_bridge_flags, > > + .port_bridge_flags = qca8k_port_bridge_flags, > > .port_bridge_join = qca8k_port_bridge_join, > > .port_bridge_leave = qca8k_port_bridge_leave, > > .port_fast_age = qca8k_port_fast_age, > > diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c > > index 8c2dc0e48ff4..f93defbd8b66 100644 > > --- a/drivers/net/dsa/qca/qca8k-common.c > > +++ b/drivers/net/dsa/qca/qca8k-common.c > > @@ -559,8 +559,24 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, > > return 0; > > } > > > > +static int qca8k_port_configure_learning(struct dsa_switch *ds, int port, > > + bool learning) > > +{ > > + struct qca8k_priv *priv = ds->priv; > > + > > + if (learning) > > + return regmap_set_bits(priv->regmap, > > + QCA8K_PORT_LOOKUP_CTRL(port), > > + QCA8K_PORT_LOOKUP_LEARN); > > + else > > + return regmap_clear_bits(priv->regmap, > > + QCA8K_PORT_LOOKUP_CTRL(port), > > + QCA8K_PORT_LOOKUP_LEARN); > > +} > > + > > void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) > > { > > + struct dsa_port *dp = dsa_to_port(ds, port); > > struct qca8k_priv *priv = ds->priv; > > u32 stp_state; > > > > @@ -585,6 +601,30 @@ void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) > > > > qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > > QCA8K_PORT_LOOKUP_STATE_MASK, stp_state); > > + > > + qca8k_port_configure_learning(ds, port, dp->learning); > > Learning should be enabled only if we're in an STP state compatible with > learning: BR_STATE_LEARNING, BR_STATE_FORWARDING. The dp->learning flag > does not follow the STP state, it is just an override for that. > > So, the condition should be: > "(state == BR_STATE_LEARNING || state == BR_STATE_FORWARDING) && dp->learning" > Fixed. > > +} > > + > > +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, > > + struct switchdev_brport_flags flags, > > + struct netlink_ext_ack *extack) > > +{ > > + if (flags.mask & ~BR_LEARNING) > > + return -EINVAL; > > + > > + return 0; > > +} > > + > > +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, > > + struct switchdev_brport_flags flags, > > + struct netlink_ext_ack *extack) > > +{ > > + int ret; > > + > > + ret = qca8k_port_configure_learning(ds, port, > > + flags.mask & ~BR_LEARNING); > > flags.mask contains the bits that have changed. > flags.val contains the current value of all bits. > > Passing flags.mask & ~BR_LEARNING (the mask of changed flags except for > BR_LEARNING) to qca8k_port_configure_learning() makes absolutely no sense. > Very stupid of me. > > + > > + return ret; > > } > > > > int qca8k_port_bridge_join(struct dsa_switch *ds, int port, > > diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h > > index c5cc8a172d65..8f88b7db384d 100644 > > --- a/drivers/net/dsa/qca/qca8k.h > > +++ b/drivers/net/dsa/qca/qca8k.h > > @@ -522,6 +522,12 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e); > > > > /* Common bridge function */ > > void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state); > > +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, > > + struct switchdev_brport_flags flags, > > + struct netlink_ext_ack *extack); > > +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, > > + struct switchdev_brport_flags flags, > > + struct netlink_ext_ack *extack); > > int qca8k_port_bridge_join(struct dsa_switch *ds, int port, > > struct dsa_bridge bridge, > > bool *tx_fwd_offload, > > -- > > 2.40.1 > > >
On Mon, Jun 26, 2023 at 06:41:50PM +0200, Christian Marangi wrote: > > Once that basic precondition passes, you should be able to start looking > > at tools/testing/selftests/drivers/net/dsa/ and run those one by one. > > An interesting one would be local_termination.sh, which monitors the way > > in which frames reach the CPU. Though be aware that some sub-tests from > > that suite will fail on misconfigurations that are non-fatal (and don't > > impact functionality), just sub-optimal (affecting performance). Like > > sending unknown packets to the CPU when the port is non-promiscuous and > > software would drop those packets anyway. > > > > Lots of difficult to run the selftests on a light fw but step at times > I'm managing to make use of them (could be helpfull to add some comments > in the .config saying that the testing port needs to be declared in the > struct) (and maybe some additional checks on the kind of device type are > required for the test to actually work (vrf, dummy, macvlan...) Yeah, that doesn't sound like a bad idea at all. AFAIK, tools/testing/selftests/net/forwarding/lib.sh doesn't check "zcat /proc/config.gz" at all. Maybe it would be nice if it did, and to guard that behavior based on some REQUIRE_* variables that are true by default (but can be set to false by scripts). > Anyway a run of local_termination.sh produce the following output. > # selftests: drivers/net/dsa: local_termination.sh > # TEST: lan1: Unicast IPv4 to primary MAC address [FAIL] > # reception failed Hmm, so ping works but this doesn't? That's strange, because send_uc_ipv4() also pings. Have you run with bash -x to see why it fails? > # TEST: lan1: Unicast IPv4 to macvlan MAC address [FAIL] > # reception failed > # TEST: lan1: Unicast IPv4 to unknown MAC address [ OK ] So the only reason why this test passes is because in this case, the unicast drops are okay? > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [FAIL] > # reception failed > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [ OK ] Similar here. Packet should have been dropped; the test detects a drop => okay. Passes for the wrong reason, most likely, because this driver doesn't react on IFF_PROMISC or IFF_ALLMULTI. > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed "reception succeeded, but should have failed" is the okay kind of failure. "reception failed" is what's bothering. > # TEST: lan1: Multicast IPv4 to unknown group, promisc [FAIL] > # reception failed > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv6 to unknown group, promisc [FAIL] > # reception failed This I cannot explain. For the test right above, "Multicast IPv6 to unknown group", it said that reception succeeded. This is sending the same packet, only the IFF_PROMISC flag of the device changes (this is also propagated to the DSA master). I've no idea why it fails. Again, bash -x will say more. > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > # TEST: br0: Unicast IPv4 to primary MAC address [FAIL] > # reception failed > # TEST: br0: Unicast IPv4 to macvlan MAC address [FAIL] > # reception failed > # TEST: br0: Unicast IPv4 to unknown MAC address [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [FAIL] > # reception failed > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] > # TEST: br0: Multicast IPv4 to joined group [ OK ] > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to unknown group, promisc [FAIL] > # reception failed > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: br0: Multicast IPv6 to joined group [ OK ] > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv6 to unknown group, promisc [FAIL] > # reception failed > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > > Things doesn't look good to me or I am wrong? Nope, doesn't look good at all. Looks like an incomplete setup.
On Mon, Jun 26, 2023 at 08:30:56PM +0300, Vladimir Oltean wrote: > On Mon, Jun 26, 2023 at 06:41:50PM +0200, Christian Marangi wrote: > > > Once that basic precondition passes, you should be able to start looking > > > at tools/testing/selftests/drivers/net/dsa/ and run those one by one. > > > An interesting one would be local_termination.sh, which monitors the way > > > in which frames reach the CPU. Though be aware that some sub-tests from > > > that suite will fail on misconfigurations that are non-fatal (and don't > > > impact functionality), just sub-optimal (affecting performance). Like > > > sending unknown packets to the CPU when the port is non-promiscuous and > > > software would drop those packets anyway. > > > > > > > Lots of difficult to run the selftests on a light fw but step at times > > I'm managing to make use of them (could be helpfull to add some comments > > in the .config saying that the testing port needs to be declared in the > > struct) (and maybe some additional checks on the kind of device type are > > required for the test to actually work (vrf, dummy, macvlan...) > > Yeah, that doesn't sound like a bad idea at all. AFAIK, > tools/testing/selftests/net/forwarding/lib.sh doesn't check > "zcat /proc/config.gz" at all. Maybe it would be nice if it did, and to > guard that behavior based on some REQUIRE_* variables that are true by > default (but can be set to false by scripts). > Some thing can even be checked by simply creating an interface and see if the thing gives error. I feel this is a better approach than checking config and kflags since from what I can see the idea of these scripts is to be system agnostic and sometimes it's handy to ""compile"" or package these scripts on a different system than the target one (this is true for every shell script and target specific stuff that needs to be compiled is not the case) > > Anyway a run of local_termination.sh produce the following output. > > # selftests: drivers/net/dsa: local_termination.sh > > # TEST: lan1: Unicast IPv4 to primary MAC address [FAIL] > > # reception failed > > Hmm, so ping works but this doesn't? That's strange, because send_uc_ipv4() > also pings. Have you run with bash -x to see why it fails? > I just run with bash -x and I also mod the script to not delete the tcpdump. Limiting the script to only this test the dump is just 2 ICMPv6 packet and no output from tcpdump aside from tcpdump: listening on lan1, link-type EN10MB (Ethernet), snapshot length 65535 bytes 3 packets captured 5 packets received by filter 0 packets dropped by kernel I feel like this is important so I think I should focus on understanding why this doesn't work? Any clue? > > # TEST: lan1: Unicast IPv4 to macvlan MAC address [FAIL] > > # reception failed > > # TEST: lan1: Unicast IPv4 to unknown MAC address [ OK ] > > So the only reason why this test passes is because in this case, the > unicast drops are okay? > > > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [FAIL] > > # reception failed > > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [ OK ] > > Similar here. Packet should have been dropped; the test detects a drop => okay. > Passes for the wrong reason, most likely, because this driver doesn't react > on IFF_PROMISC or IFF_ALLMULTI. > > > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > > # TEST: lan1: Multicast IPv4 to unknown group [FAIL] > > # reception succeeded, but should have failed > > "reception succeeded, but should have failed" is the okay kind of failure. > "reception failed" is what's bothering. > > > # TEST: lan1: Multicast IPv4 to unknown group, promisc [FAIL] > > # reception failed > > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > > # TEST: lan1: Multicast IPv6 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: lan1: Multicast IPv6 to unknown group, promisc [FAIL] > > # reception failed > > This I cannot explain. For the test right above, "Multicast IPv6 to unknown group", > it said that reception succeeded. This is sending the same packet, only > the IFF_PROMISC flag of the device changes (this is also propagated to > the DSA master). I've no idea why it fails. Again, bash -x will say more. > > > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > > # TEST: br0: Unicast IPv4 to primary MAC address [FAIL] > > # reception failed > > # TEST: br0: Unicast IPv4 to macvlan MAC address [FAIL] > > # reception failed > > # TEST: br0: Unicast IPv4 to unknown MAC address [ OK ] > > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [FAIL] > > # reception failed > > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [ OK ] > > # TEST: br0: Multicast IPv4 to joined group [ OK ] > > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Multicast IPv4 to unknown group, promisc [FAIL] > > # reception failed > > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > > # TEST: br0: Multicast IPv6 to joined group [ OK ] > > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Multicast IPv6 to unknown group, promisc [FAIL] > > # reception failed > > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > > > > Things doesn't look good to me or I am wrong? > > Nope, doesn't look good at all. Looks like an incomplete setup.
On Mon, Jun 26, 2023 at 07:59:41PM +0200, Christian Marangi wrote: > > Hmm, so ping works but this doesn't? That's strange, because send_uc_ipv4() > > also pings. Have you run with bash -x to see why it fails? > > > > I just run with bash -x and I also mod the script to not delete the > tcpdump. Limiting the script to only this test the dump is just 2 ICMPv6 > packet and no output from tcpdump aside from > > tcpdump: listening on lan1, link-type EN10MB (Ethernet), snapshot length 65535 bytes > 3 packets captured > 5 packets received by filter > 0 packets dropped by kernel > > I feel like this is important so I think I should focus on understanding > why this doesn't work? Any clue? No clue. I'd put a "bash" instruction in send_uc_ipv4(), which would act as a sort of break point for the script (opens an interactive sub-shell), then run it again with bash -x, manually repeat the command that failed, investigate why it failed and hit Ctrl-d when I'm done.
On Tue, Jun 27, 2023 at 12:49:16PM +0300, Vladimir Oltean wrote: > On Mon, Jun 26, 2023 at 07:59:41PM +0200, Christian Marangi wrote: > > > Hmm, so ping works but this doesn't? That's strange, because send_uc_ipv4() > > > also pings. Have you run with bash -x to see why it fails? > > > > > > > I just run with bash -x and I also mod the script to not delete the > > tcpdump. Limiting the script to only this test the dump is just 2 ICMPv6 > > packet and no output from tcpdump aside from > > > > tcpdump: listening on lan1, link-type EN10MB (Ethernet), snapshot length 65535 bytes > > 3 packets captured > > 5 packets received by filter > > 0 packets dropped by kernel > > > > I feel like this is important so I think I should focus on understanding > > why this doesn't work? Any clue? > > No clue. I'd put a "bash" instruction in send_uc_ipv4(), which would act > as a sort of break point for the script (opens an interactive sub-shell), > then run it again with bash -x, manually repeat the command that failed, > investigate why it failed and hit Ctrl-d when I'm done. Hi, I wasted a day to only notice that the problem was in busybox not supporting 0.x value and that is what selfttest script use. Another thing to check. Also the confusing part of this is that we don't check if ping_do return error and the test just fails (while in reality the ping command was never executed) Anyway I'm fixing all kind of bugs and I even found an even hw limitation with the FDB table with mgmt packet... Also I implemented fdb_insolation following the implementation done on felix with reserving VID at the end... About this I wonder if it might be a good idea to expose some generic API from DSA? qca8k require a reserved VID for VLAN unaware port and with fdb_isolation even more VID are reserved. DSA have no idea about this so I wonder if there is a chance of VID clash? I feel like we need something to declare a range of reserved VID so that they gets rejected when applied. (about this I think I should return -EINVAL if fdb/mbd insert are asked with a reserved VID) Anyway by fixing that interval problem (enabling support in busybox as it's disabled by default in a OpenWRT system) This is the new output of the local_termination.sh TAP version 13 1..1 # selftests: drivers/net/dsa: local_termination.sh # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] # reception succeeded, but should have failed # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] # reception succeeded, but should have failed # TEST: lan1: Multicast IPv4 to joined group [ OK ] # TEST: lan1: Multicast IPv4 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: lan1: Multicast IPv6 to joined group [ OK ] # TEST: lan1: Multicast IPv6 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] # reception succeeded, but should have failed # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv4 to joined group [ OK ] # TEST: br0: Multicast IPv4 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: br0: Multicast IPv6 to joined group [ OK ] # TEST: br0: Multicast IPv6 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 Seems good aside from the reception secceeded that I think the kernel just drops right? The switch have a way to control FLOOD per port but maybe the correct feature to use is the VLAN leaky? Where the setting can be set by both FDB/MBD and per port. Sorry if I described some fix and implementation without proposing the patch but I would like some comments on what the tool returned.
On Wed, Jun 28, 2023 at 02:49:53AM +0200, Christian Marangi wrote: > On Tue, Jun 27, 2023 at 12:49:16PM +0300, Vladimir Oltean wrote: > > On Mon, Jun 26, 2023 at 07:59:41PM +0200, Christian Marangi wrote: > > > > Hmm, so ping works but this doesn't? That's strange, because send_uc_ipv4() > > > > also pings. Have you run with bash -x to see why it fails? > > > > > > > > > > I just run with bash -x and I also mod the script to not delete the > > > tcpdump. Limiting the script to only this test the dump is just 2 ICMPv6 > > > packet and no output from tcpdump aside from > > > > > > tcpdump: listening on lan1, link-type EN10MB (Ethernet), snapshot length 65535 bytes > > > 3 packets captured > > > 5 packets received by filter > > > 0 packets dropped by kernel > > > > > > I feel like this is important so I think I should focus on understanding > > > why this doesn't work? Any clue? > > > > No clue. I'd put a "bash" instruction in send_uc_ipv4(), which would act > > as a sort of break point for the script (opens an interactive sub-shell), > > then run it again with bash -x, manually repeat the command that failed, > > investigate why it failed and hit Ctrl-d when I'm done. > > Hi, I wasted a day to only notice that the problem was in busybox not > supporting 0.x value and that is what selfttest script use. Another > thing to check. Also the confusing part of this is that we don't check > if ping_do return error and the test just fails (while in reality the > ping command was never executed) > Also sorry for the double email... I forgot to add that it seems I need to add a sleep 1 right after mc_route_prepare $h1 mc_route_prepare $rcv_if_name in local_termination.sh. Think the switch needs some time to enable all the port. Without the sleep 1 the first test just fails. > Anyway I'm fixing all kind of bugs and I even found an even hw > limitation with the FDB table with mgmt packet... > > Also I implemented fdb_insolation following the implementation done on > felix with reserving VID at the end... > About this I wonder if it might be a good idea to expose some generic > API from DSA? > > qca8k require a reserved VID for VLAN unaware port and with > fdb_isolation even more VID are reserved. DSA have no idea about this so > I wonder if there is a chance of VID clash? I feel like we need > something to declare a range of reserved VID so that they gets rejected > when applied. (about this I think I should return -EINVAL if fdb/mbd > insert are asked with a reserved VID) > > Anyway by fixing that interval problem (enabling support in busybox as > it's disabled by default in a OpenWRT system) > This is the new output of the local_termination.sh > > TAP version 13 > 1..1 > # selftests: drivers/net/dsa: local_termination.sh > # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] > # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to joined group [ OK ] > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: br0: Multicast IPv6 to joined group [ OK ] > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 > > Seems good aside from the reception secceeded that I think the kernel > just drops right? > > The switch have a way to control FLOOD per port but maybe the correct > feature to use is the VLAN leaky? Where the setting can be set by both > FDB/MBD and per port. > > Sorry if I described some fix and implementation without proposing the > patch but I would like some comments on what the tool returned. > > -- > Ansuel
On Wed, Jun 28, 2023 at 02:49:53AM +0200, Christian Marangi wrote: > Hi, I wasted a day to only notice that the problem was in busybox not > supporting 0.x value and that is what selfttest script use. Another > thing to check. Also the confusing part of this is that we don't check > if ping_do return error and the test just fails (while in reality the > ping command was never executed) Interesting. Not sure how you'll check for the specific busybox implementation. Does it help if you add "check_err $?" after ping_do() in send_uc_ipv4(), like ping_test() has? > Anyway I'm fixing all kind of bugs and I even found an even hw > limitation with the FDB table with mgmt packet... > > Also I implemented fdb_insolation following the implementation done on > felix with reserving VID at the end... > About this I wonder if it might be a good idea to expose some generic > API from DSA? > > qca8k require a reserved VID for VLAN unaware port and with > fdb_isolation even more VID are reserved. DSA have no idea about this so > I wonder if there is a chance of VID clash? I feel like we need > something to declare a range of reserved VID so that they gets rejected > when applied. (about this I think I should return -EINVAL if fdb/mbd > insert are asked with a reserved VID) I don't think that it's possible to get a port_fdb_add() or port_mdb_add() call for a VID that wasn't accepted through port_vlan_add() first. At least I don't see how. The VLAN-aware FDB entries come from the bridge through SWITCHDEV_OBJ_ID_PORT_VLAN and from DSA's private .ndo_vlan_rx_add_vid() / .ndo_vlan_rx_kill_vid() implementations for VLAN filters (those that land in dp->user_vlans; refresh your net.git tree if you don't find those). If we reject VLANs at those layers, then: # through the bridge [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 master static [ 7611.225227] bridge: RTM_NEWNEIGH with unconfigured vlan 2000 on lan21 RTNETLINK answers: Invalid argument # bridge bypass, should go through ndo_fdb_add() if DSA had one [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 self static [ 7855.532615] mv88e6085 d0032004.mdio-mii:12 lan21: default FDB implementation only supports local addresses RTNETLINK answers: Invalid argument So, while we could probably add some API in core DSA for a reserved VID range, I'm not sure that it will be that useful. And regarding whether there is a chance of VID class? I guess there is. I have a board where the felix driver (reserved range 4000-4095) is a DSA master for the sja1105 driver (reserved range 3072-4095 for tag_8021q). Since any dsa_tag_8021q_port_setup() call does a vlan_vid_add() call towards the master, then there's a chance that felix could reject the tag_8021q setup of the sja1105 ports for tag_8021q VIDs 4000 and upwards. VID 4000 = 0xfa0 = bitwise 0b1111_1010_0000 Comparing with net/dsa/tag_8021q.c: * | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | * +-----------+-----+-----------------+-----------+-----------------------+ * | RSV | VBID| SWITCH_ID | VBID | PORT | * +-----------+-----+-----------------+-----------+-----------------------+ So it would be problematic for VBID >= 6, whenever SWITCH_ID >= 6. However, practical applications of tag_8021q with the sja1105 boards that I'm aware of do not have 7 switches, so it's a theoretical problem only. Though we would need to be rather careful when calculating and enforcing the reserved ranges in the DSA core, to not cause false positive errors. > > Anyway by fixing that interval problem (enabling support in busybox as > it's disabled by default in a OpenWRT system) > This is the new output of the local_termination.sh > > TAP version 13 > 1..1 > # selftests: drivers/net/dsa: local_termination.sh > # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] > # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to joined group [ OK ] > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: br0: Multicast IPv6 to joined group [ OK ] > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 > > Seems good aside from the reception secceeded that I think the kernel > just drops right? Looks good. If you've implemented FDB isolation, then it means that you get port_fdb_add() and port_mdb_add() calls on the CPU port(s), and you can now also implement .port_set_host_flood() and that should also make the following tests pass: * TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] * TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] * TEST: lan1: Multicast IPv4 to unknown group [FAIL] * TEST: lan1: Multicast IPv6 to unknown group [FAIL] The bridge tests to an unknown address will always fail with the message "reception succeeded, but should have failed", so it's not you, it's the test there. Once you're there, you can re-do these tests with: - all user ports to CPU port 0 - all user ports to CPU port 6 - user ports statically split between CPU ports 0 and 6 - all user ports to LAG composed of CPU ports 0 and 6 > The switch have a way to control FLOOD per port but maybe the correct > feature to use is the VLAN leaky? Where the setting can be set by both > FDB/MBD and per port. I'm not sure why would the leaky VLAN feature be useful ("enable specific frames to be forwarded across VLAN boundary"). I really don't know what you're thinking about. > Sorry if I described some fix and implementation without proposing the > patch but I would like some comments on what the tool returned. > > -- > Ansuel
On Wed, Jun 28, 2023 at 02:53:35AM +0200, Christian Marangi wrote: > Also sorry for the double email... I forgot to add that it seems I need > to add a sleep 1 right after > > mc_route_prepare $h1 > mc_route_prepare $rcv_if_name > > in local_termination.sh. Think the switch needs some time to enable all > the port. Without the sleep 1 the first test just fails. This makes sense. Those multicast MAC addresses are synced to DSA through dsa_slave_set_rx_mode() -> dsa_slave_sync_mc(). Notice that the calling context is atomic, so the implementation sets up a deferred work item for later. So user space is allowed to return from the syscall before the hardware is actually reprogrammed.
On Thu, Jun 29, 2023 at 03:39:54PM +0300, Vladimir Oltean wrote: > On Wed, Jun 28, 2023 at 02:49:53AM +0200, Christian Marangi wrote: > > Hi, I wasted a day to only notice that the problem was in busybox not > > supporting 0.x value and that is what selfttest script use. Another > > thing to check. Also the confusing part of this is that we don't check > > if ping_do return error and the test just fails (while in reality the > > ping command was never executed) > > Interesting. Not sure how you'll check for the specific busybox > implementation. Does it help if you add "check_err $?" after ping_do() > in send_uc_ipv4(), like ping_test() has? > I think it's better to implement some simple tests before the test is done. Similar to tools missing tests... So something like "ping -i 0.1" and reading the output if it does match a specific error. Or at least add some comments that that option is required. But I guess check_err should be added anyway since it'd the current pattern used in lib.sh > > Anyway I'm fixing all kind of bugs and I even found an even hw > > limitation with the FDB table with mgmt packet... > > > > Also I implemented fdb_insolation following the implementation done on > > felix with reserving VID at the end... > > About this I wonder if it might be a good idea to expose some generic > > API from DSA? > > > > qca8k require a reserved VID for VLAN unaware port and with > > fdb_isolation even more VID are reserved. DSA have no idea about this so > > I wonder if there is a chance of VID clash? I feel like we need > > something to declare a range of reserved VID so that they gets rejected > > when applied. (about this I think I should return -EINVAL if fdb/mbd > > insert are asked with a reserved VID) > > I don't think that it's possible to get a port_fdb_add() or port_mdb_add() > call for a VID that wasn't accepted through port_vlan_add() first. > At least I don't see how. The VLAN-aware FDB entries come from the > bridge through SWITCHDEV_OBJ_ID_PORT_VLAN and from DSA's private > .ndo_vlan_rx_add_vid() / .ndo_vlan_rx_kill_vid() implementations for > VLAN filters (those that land in dp->user_vlans; refresh your net.git > tree if you don't find those). > Ok so in theory it should be reserved and a check should be added in port_valn_add to return an error if a reserved VID is used. > If we reject VLANs at those layers, then: > # through the bridge > [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 master static > [ 7611.225227] bridge: RTM_NEWNEIGH with unconfigured vlan 2000 on lan21 > RTNETLINK answers: Invalid argument > # bridge bypass, should go through ndo_fdb_add() if DSA had one > [root@mox:~] # bridge fdb add 00:01:02:03:04:05 vlan 2000 dev lan21 self static > [ 7855.532615] mv88e6085 d0032004.mdio-mii:12 lan21: default FDB implementation only supports local addresses > RTNETLINK answers: Invalid argument > > So, while we could probably add some API in core DSA for a reserved VID > range, I'm not sure that it will be that useful. > > And regarding whether there is a chance of VID class? I guess there is. > I have a board where the felix driver (reserved range 4000-4095) is a > DSA master for the sja1105 driver (reserved range 3072-4095 for tag_8021q). > Since any dsa_tag_8021q_port_setup() call does a vlan_vid_add() call > towards the master, then there's a chance that felix could reject the > tag_8021q setup of the sja1105 ports for tag_8021q VIDs 4000 and upwards. > > VID 4000 = 0xfa0 = bitwise 0b1111_1010_0000 > > Comparing with net/dsa/tag_8021q.c: > > * | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | > * +-----------+-----+-----------------+-----------+-----------------------+ > * | RSV | VBID| SWITCH_ID | VBID | PORT | > * +-----------+-----+-----------------+-----------+-----------------------+ > > So it would be problematic for VBID >= 6, whenever SWITCH_ID >= 6. > However, practical applications of tag_8021q with the sja1105 boards > that I'm aware of do not have 7 switches, so it's a theoretical problem > only. > > Though we would need to be rather careful when calculating and enforcing > the reserved ranges in the DSA core, to not cause false positive errors. > Was just an idea since I see only felix using fdb_isolation and it seems reserving VID is the current way to handle it. > > > > Anyway by fixing that interval problem (enabling support in busybox as > > it's disabled by default in a OpenWRT system) > > This is the new output of the local_termination.sh > > > > TAP version 13 > > 1..1 > > # selftests: drivers/net/dsa: local_termination.sh > > # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] > > # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] > > # TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] > > # reception succeeded, but should have failed > > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] > > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > > # reception succeeded, but should have failed > > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > > # TEST: lan1: Multicast IPv4 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] > > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > > # TEST: lan1: Multicast IPv6 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] > > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > > # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] > > # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] > > # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] > > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Multicast IPv4 to joined group [ OK ] > > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] > > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > > # TEST: br0: Multicast IPv6 to joined group [ OK ] > > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > > # reception succeeded, but should have failed > > # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] > > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > > not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 > > > > Seems good aside from the reception secceeded that I think the kernel > > just drops right? > > Looks good. If you've implemented FDB isolation, then it means that you > get port_fdb_add() and port_mdb_add() calls on the CPU port(s), and you > can now also implement .port_set_host_flood() and that should also make > the following tests pass: > > * TEST: lan1: Unicast IPv4 to unknown MAC address [FAIL] > * TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > * TEST: lan1: Multicast IPv4 to unknown group [FAIL] > * TEST: lan1: Multicast IPv6 to unknown group [FAIL] > > The bridge tests to an unknown address will always fail with the message > "reception succeeded, but should have failed", so it's not you, it's the > test there. > > Once you're there, you can re-do these tests with: > - all user ports to CPU port 0 > - all user ports to CPU port 6 > - user ports statically split between CPU ports 0 and 6 > - all user ports to LAG composed of CPU ports 0 and 6 > I have bad and good news about this... Bad news is that it seems to be qca8k handle flooding in a very interesting and broken way. Due to DSA implementation we expect every packet to be flooded to the CPU port by default and this is problematic with how the switch works. It seems that enabling flood bit for the CPU results in packet always be directed there... The switch tagger have a way to comunicate that the packet is flooded to the cpu. With some code in the tagger I manage to reach this state from local_termination.sh root@OpenWrt:~# /ktests/run_kselftest.sh -t drivers/net/dsa:local_termination.sh TAP version 13 1..1 # selftests: drivers/net/dsa: local_termination.sh # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [ OK ] # TEST: lan1: Multicast IPv4 to joined group [ OK ] # TEST: lan1: Multicast IPv4 to unknown group [ OK ] # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: lan1: Multicast IPv6 to joined group [ OK ] # TEST: lan1: Multicast IPv6 to unknown group [ OK ] # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] # reception succeeded, but should have failed # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv4 to joined group [ OK ] # TEST: br0: Multicast IPv4 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] # TEST: br0: Multicast IPv6 to joined group [ OK ] # TEST: br0: Multicast IPv6 to unknown group [FAIL] # reception succeeded, but should have failed # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 The tagger check if the packet type is a flood kind and drop the packet (return NULL) if the interface is not promisc or allmulti. If the port if bridge type this check is skipped... My concern is... Is it ok to implement host_set_flood this way? I tried many variant and I wasn't able to make flood work by using those bits... With the current implementation, host_set_flood will return always zero and everything will be handled in the tagger by dropping packet but I'm not sure it's ok to do it. > > The switch have a way to control FLOOD per port but maybe the correct > > feature to use is the VLAN leaky? Where the setting can be set by both > > FDB/MBD and per port. > > I'm not sure why would the leaky VLAN feature be useful ("enable > specific frames to be forwarded across VLAN boundary"). I really don't > know what you're thinking about. > Yes sorry, just me wasting way too much time on this and searching for alternative solution... Switch documentation doesn't describe these case very well sadly... > > Sorry if I described some fix and implementation without proposing the > > patch but I would like some comments on what the tool returned. > >
On Sat, Jul 01, 2023 at 08:25:11PM +0200, Christian Marangi wrote: > I have bad and good news about this... Bad news is that it seems to be > qca8k handle flooding in a very interesting and broken way. > Due to DSA implementation we expect every packet to be flooded to the > CPU port by default and this is problematic with how the switch works. To be clear, the network stack understands "flooding" to be something that takes place for packets that have an unknown MAC DA, or broadcast. So no, we don't expect that every packet is flooded, nor that all are flooded to the CPU port. I'm not exactly clear what you mean. > > It seems that enabling flood bit for the CPU results in packet always be > directed there... Are you sure? How have you determined this? Can you rephrase this? Because what I'm understanding is that you're saying that when flooding towards the CPU is enabled, packets go to the CPU regardless of whether they're known to the FDB or not. Which can't be correct, because if it was, then packets sent to a MAC DA statically added as a bridge FDB entry would be seen twice: - once, autonomously forwarded by the switch - twice, due to your claim that "packets are always directed to the CPU", (claim which must also hold with the current driver configuration, because the driver currently always enables flooding to the CPU) But if that's the case, it means that the CPU (via the software bridge) will send a second copy of the packet to the egress port, because skb->offload_fwd_mark is unset. Someone would have noticed if that was the case ?! I have some skepticism towards this claim. Please try to describe what happens under this scenario: br0 / | \ / | \ / | \ swp0 swp1 swp2 | | | A B C # On the switch ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up ip link set swp0 master br0 && ip link set br0 up ip link set swp1 master br0 && ip link set br0 up ip link set swp2 master br0 && ip link set br0 up bridge fdb add dev swp1 ${mac_of_B} vlan 1 master static # On C tcpdump -i eth0 -e -n # On A ping $B How many ICMP requests do you see on B? How many do you see on C? How many on the CPU port (br0)? Correct answers should be: On B: as many as were sent by A On C: none On br0: none on ingress (tcpdump -Q in), none on egress (tcpdump -Q out) If you're seeing the ICMP requests on station C, see whether they don't in fact come from br0 having flooded them in software, which means that the real problem is with the FDB not matching on packets.. I have no idea how you've implemented things, but with reserved VIDs, you also need to remap VID 0 during port_fdb_add() calls to the reserved PVID of that bridge, and you've done that, otherwise packets won't match, right? > The switch tagger have a way to comunicate that the packet is flooded > to the cpu. > > With some code in the tagger I manage to reach this state from > local_termination.sh > > root@OpenWrt:~# /ktests/run_kselftest.sh -t drivers/net/dsa:local_termination.sh > TAP version 13 > 1..1 > # selftests: drivers/net/dsa: local_termination.sh > # TEST: lan1: Unicast IPv4 to primary MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: lan1: Unicast IPv4 to unknown MAC address, allmulti [ OK ] > # TEST: lan1: Multicast IPv4 to joined group [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: lan1: Multicast IPv6 to joined group [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: lan1: Multicast IPv6 to unknown group, allmulti [ OK ] > # TEST: br0: Unicast IPv4 to primary MAC address [ OK ] > # TEST: br0: Unicast IPv4 to macvlan MAC address [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Unicast IPv4 to unknown MAC address, promisc [ OK ] > # TEST: br0: Unicast IPv4 to unknown MAC address, allmulti [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to joined group [ OK ] > # TEST: br0: Multicast IPv4 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv4 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv4 to unknown group, allmulti [ OK ] > # TEST: br0: Multicast IPv6 to joined group [ OK ] > # TEST: br0: Multicast IPv6 to unknown group [FAIL] > # reception succeeded, but should have failed > # TEST: br0: Multicast IPv6 to unknown group, promisc [ OK ] > # TEST: br0: Multicast IPv6 to unknown group, allmulti [ OK ] > not ok 1 selftests: drivers/net/dsa: local_termination.sh # exit=1 > > The tagger check if the packet type is a flood kind and drop the packet > (return NULL) if the interface is not promisc or allmulti. If the port > if bridge type this check is skipped... Why is the check skipped if the port is under a bridge? Because there might be foreign interfaces under that bridge (like wlan0) which might be valid destinations for flooded traffic (through software forwarding)? > My concern is... Is it ok to implement host_set_flood this way? I tried > many variant and I wasn't able to make flood work by using those bits... > > With the current implementation, host_set_flood will return always zero > and everything will be handled in the tagger by dropping packet but I'm > not sure it's ok to do it. Well, we should first clarify exactly what it is that you claim to be a hardware bug.
diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c index f08086ac2261..a9af270a03ce 100644 --- a/drivers/net/dsa/qca/qca8k-8xxx.c +++ b/drivers/net/dsa/qca/qca8k-8xxx.c @@ -1963,12 +1963,6 @@ qca8k_setup(struct dsa_switch *ds) if (ret) return ret; - /* Enable ARP Auto-learning by default */ - ret = regmap_set_bits(priv->regmap, QCA8K_PORT_LOOKUP_CTRL(i), - QCA8K_PORT_LOOKUP_LEARN); - if (ret) - return ret; - /* For port based vlans to work we need to set the * default egress vid */ @@ -2071,6 +2065,8 @@ static const struct dsa_switch_ops qca8k_switch_ops = { .port_change_mtu = qca8k_port_change_mtu, .port_max_mtu = qca8k_port_max_mtu, .port_stp_state_set = qca8k_port_stp_state_set, + .port_pre_bridge_flags = qca8k_port_pre_bridge_flags, + .port_bridge_flags = qca8k_port_bridge_flags, .port_bridge_join = qca8k_port_bridge_join, .port_bridge_leave = qca8k_port_bridge_leave, .port_fast_age = qca8k_port_fast_age, diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c index 8c2dc0e48ff4..f93defbd8b66 100644 --- a/drivers/net/dsa/qca/qca8k-common.c +++ b/drivers/net/dsa/qca/qca8k-common.c @@ -559,8 +559,24 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, return 0; } +static int qca8k_port_configure_learning(struct dsa_switch *ds, int port, + bool learning) +{ + struct qca8k_priv *priv = ds->priv; + + if (learning) + return regmap_set_bits(priv->regmap, + QCA8K_PORT_LOOKUP_CTRL(port), + QCA8K_PORT_LOOKUP_LEARN); + else + return regmap_clear_bits(priv->regmap, + QCA8K_PORT_LOOKUP_CTRL(port), + QCA8K_PORT_LOOKUP_LEARN); +} + void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) { + struct dsa_port *dp = dsa_to_port(ds, port); struct qca8k_priv *priv = ds->priv; u32 stp_state; @@ -585,6 +601,30 @@ void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state) qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), QCA8K_PORT_LOOKUP_STATE_MASK, stp_state); + + qca8k_port_configure_learning(ds, port, dp->learning); +} + +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + if (flags.mask & ~BR_LEARNING) + return -EINVAL; + + return 0; +} + +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack) +{ + int ret; + + ret = qca8k_port_configure_learning(ds, port, + flags.mask & ~BR_LEARNING); + + return ret; } int qca8k_port_bridge_join(struct dsa_switch *ds, int port, diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h index c5cc8a172d65..8f88b7db384d 100644 --- a/drivers/net/dsa/qca/qca8k.h +++ b/drivers/net/dsa/qca/qca8k.h @@ -522,6 +522,12 @@ int qca8k_get_mac_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e); /* Common bridge function */ void qca8k_port_stp_state_set(struct dsa_switch *ds, int port, u8 state); +int qca8k_port_pre_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack); +int qca8k_port_bridge_flags(struct dsa_switch *ds, int port, + struct switchdev_brport_flags flags, + struct netlink_ext_ack *extack); int qca8k_port_bridge_join(struct dsa_switch *ds, int port, struct dsa_bridge bridge, bool *tx_fwd_offload,