Message ID | 20230620063747.19175-1-ansuelsmth@gmail.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3753782vqr; Tue, 20 Jun 2023 08:37:03 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4u2WhjDaLBphGUk7p1ewku5u8lqje2QjriCYf6ZwLLgTkjWa4N/48EFs1IghSq/O0Nqbj4 X-Received: by 2002:a17:902:6bcb:b0:1b1:b50c:e330 with SMTP id m11-20020a1709026bcb00b001b1b50ce330mr7304353plt.2.1687275423423; Tue, 20 Jun 2023 08:37:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687275423; cv=none; d=google.com; s=arc-20160816; b=AnDmPJA8TXlbmVIsQE7LG+B4MUdyfLNVpXCBYPl0wRPwJyy9EYEn532rTOlJxH1dJm hSp9Gs+WTalH75Kt2jlHmZPCvxUtLkFp3f7wqLcAe65Wq31YTNJsQgFP+o3tH85fF/Ah kW0nGBIsCTfhtQ989TAErGWKTHd5DMhOe49NrYa/+qISU0XWW5Ix/uiSuK+tVof5XNde JRzecOy0lxfpCst479HJ6M93zMHriea9lPzJ7O2AGDMVH7b7+oZsmhEnJa1uaT7PAV/3 fVO03JqagHE+jrLaNzc3tYsOe6h248VPkdV+WRnJtjPRJ5U2k8zvaEH0V0Q++w8UhIo5 Pmfw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:to:from:dkim-signature; bh=ae2LgDthYgLW0wGsnuwxcKqUllQXntc3z/XI8MH0us0=; b=00+dzJj9cDY1ywp0Mmn5O3oAWXOQrnB6Pazi4qnMpqjaVVIs9b8QYHa1v28frq2ZkT 39rw/aoxUq4i+mQp+2Y+YksKlaIo7B0VfG8Sqv0sd6H1ne3qfIAV0Ms+YslA4H1/LG4b e3fPR6CmFK9MkUrdxVWNj1lKUTvcWd2dYt3coWpQHBQcdtMTc2zJL2sH/3458cFyeQ0x 9xBDYQBYs31edBOPgeH2uKzzJkaZz6bRw5fTkvj3Sw7uryiQIFc5o2tMGomXgfrJREbd IItkIi4U/JOQQ1dw+p846/9JC9J1n5bdcp3r12jnIYRJp3aHPm8Y/50dTiAjXI1WKVXw RnZQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=W4B4js9s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h16-20020a170902f55000b001afdfd0b363si2270798plf.489.2023.06.20.08.36.48; Tue, 20 Jun 2023 08:37:03 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20221208 header.b=W4B4js9s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233744AbjFTPaw (ORCPT <rfc822;maxin.john@gmail.com> + 99 others); Tue, 20 Jun 2023 11:30:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55172 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233736AbjFTPau (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 20 Jun 2023 11:30:50 -0400 Received: from mail-wr1-x42d.google.com (mail-wr1-x42d.google.com [IPv6:2a00:1450:4864:20::42d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 27D6412C; Tue, 20 Jun 2023 08:30:49 -0700 (PDT) Received: by mail-wr1-x42d.google.com with SMTP id ffacd0b85a97d-30fcde6a73cso4212687f8f.2; Tue, 20 Jun 2023 08:30:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687275047; x=1689867047; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=ae2LgDthYgLW0wGsnuwxcKqUllQXntc3z/XI8MH0us0=; b=W4B4js9sbTcYqkHFBdWrlEW67dJIQ9ckcPv3hPGn/p071Qp/UkvdZ9UFNo9T/jNTEM 5OgLCijugVYQ6kxDF09bgxasUp+fxMlv0ImlJ1Whqwhz7O0frsxVRP6n0LA5rQOrskYr K8ILGvMXdU0FT/bH24OkYOsPgFXp/xF17LMfHrHYGSDUl5gwQ0rO0bzaORXP5iIQd+2X VsTKVHKp3lu3zzTY0QFTpSKrADnZX35ZufllZkaBKXgExlE1DjwqNFCV1z8zolqZWbMv Tu05fUWqidjmjzXBxm+gq0CfcszOg/dYUyD0YzDci0DFk1E3SnVSOAZ9L8i6nkKjXUcA 3lTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687275047; x=1689867047; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ae2LgDthYgLW0wGsnuwxcKqUllQXntc3z/XI8MH0us0=; b=X+p4h8ISsGB0FO+K/xOpZrv71KcM0i41s71MAazO3S4sn/E2EdgfokfzkWtS2gZHzp GgjH/cxlup2Z83q3IZxjf4dXTmSytkGvqFnR3vWuVjscqNUvsb0Je+k4JXPmxbMiQLkj AcJgBCcFeiQp6ZBpdlUVd5GT+rHJ7tWaRa4tUKB7L8RlT9YL0+7tTzCs+tMaVRWEQcBe cmFUEL3Ieq5c6hGmNzgIda38f3CCaEMObpY+bqoOwFATnijZq2lCoIHJ3eksp7Uy4o6n OdvPSHMCme9vFRMnbVah3z7n1H/0Jy+aI7QfMg607tx1rWsn/p/I1EZC/D7WUSC/xw94 XVvw== X-Gm-Message-State: AC+VfDzUnTIq1AwzYix2AYDdmTTrikNuWFzGK7Bmo8MbX8vxIYGnEdng gAjDA5FilZLqNLLWYVeIBJg= X-Received: by 2002:adf:f289:0:b0:30a:c341:920a with SMTP id k9-20020adff289000000b0030ac341920amr8897925wro.28.1687275047270; Tue, 20 Jun 2023 08:30:47 -0700 (PDT) Received: from localhost.localdomain (93-34-93-173.ip49.fastwebnet.it. [93.34.93.173]) by smtp.googlemail.com with ESMTPSA id k10-20020adff5ca000000b0030ae87bd3e3sm2265887wrp.18.2023.06.20.08.30.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Jun 2023 08:30:46 -0700 (PDT) From: Christian Marangi <ansuelsmth@gmail.com> To: Andrew Lunn <andrew@lunn.ch>, Florian Fainelli <f.fainelli@gmail.com>, Vladimir Oltean <olteanv@gmail.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Christian Marangi <ansuelsmth@gmail.com>, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [net-next PATCH] net: dsa: qca8k: add support for port_change_master Date: Tue, 20 Jun 2023 08:37:47 +0200 Message-Id: <20230620063747.19175-1-ansuelsmth@gmail.com> X-Mailer: git-send-email 2.40.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.6 required=5.0 tests=BAYES_00,DATE_IN_PAST_06_12, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769236514525469362?= X-GMAIL-MSGID: =?utf-8?q?1769236514525469362?= |
Series |
[net-next] net: dsa: qca8k: add support for port_change_master
|
|
Commit Message
Christian Marangi
June 20, 2023, 6:37 a.m. UTC
Add support for port_change_master to permit assigning an alternative
CPU port if the switch have both CPU port connected or create a LAG on
both CPU port and assign the LAG as DSA master.
On port change master request, we check if the master is a LAG.
With LAG we compose the cpu_port_mask with the CPU port in the LAG, if
master is a simple dsa_port, we derive the index.
Finally we apply the new cpu_port_mask to the LOOKUP MEMBER to permit
the port to receive packet by the new CPU port setup for the port and
we reenable the target port previously disabled.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
drivers/net/dsa/qca/qca8k-8xxx.c | 54 ++++++++++++++++++++++++++++++++
drivers/net/dsa/qca/qca8k.h | 1 +
2 files changed, 55 insertions(+)
Comments
On Tue, Jun 20, 2023 at 08:37:47AM +0200, Christian Marangi wrote: > Add support for port_change_master to permit assigning an alternative > CPU port if the switch have both CPU port connected or create a LAG on > both CPU port and assign the LAG as DSA master. > > On port change master request, we check if the master is a LAG. > With LAG we compose the cpu_port_mask with the CPU port in the LAG, if > master is a simple dsa_port, we derive the index. > > Finally we apply the new cpu_port_mask to the LOOKUP MEMBER to permit > the port to receive packet by the new CPU port setup for the port and > we reenable the target port previously disabled. > > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> > --- > drivers/net/dsa/qca/qca8k-8xxx.c | 54 ++++++++++++++++++++++++++++++++ > drivers/net/dsa/qca/qca8k.h | 1 + > 2 files changed, 55 insertions(+) > Please ignore I notice some problem and I have to test this further.
On Tue, Jun 20, 2023 at 11:12:27PM +0300, Vladimir Oltean wrote: > Hi Christian, > > On Tue, Jun 20, 2023 at 08:37:47AM +0200, Christian Marangi wrote: > > Add support for port_change_master to permit assigning an alternative > > CPU port if the switch have both CPU port connected or create a LAG on > > both CPU port and assign the LAG as DSA master. > > > > On port change master request, we check if the master is a LAG. > > With LAG we compose the cpu_port_mask with the CPU port in the LAG, if > > master is a simple dsa_port, we derive the index. > > > > Finally we apply the new cpu_port_mask to the LOOKUP MEMBER to permit > > the port to receive packet by the new CPU port setup for the port and > > we reenable the target port previously disabled. > > > > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> > > --- > > drivers/net/dsa/qca/qca8k-8xxx.c | 54 ++++++++++++++++++++++++++++++++ > > drivers/net/dsa/qca/qca8k.h | 1 + > > 2 files changed, 55 insertions(+) > > > > diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c > > index dee7b6579916..435b69c1c552 100644 > > --- a/drivers/net/dsa/qca/qca8k-8xxx.c > > +++ b/drivers/net/dsa/qca/qca8k-8xxx.c > > @@ -1713,6 +1713,59 @@ qca8k_get_tag_protocol(struct dsa_switch *ds, int port, > > return DSA_TAG_PROTO_QCA; > > } > > > > +static int qca8k_port_change_master(struct dsa_switch *ds, int port, > > + struct net_device *master, > > + struct netlink_ext_ack *extack) > > +{ > > + struct qca8k_priv *priv = ds->priv; > > + u32 val, cpu_port_mask = 0; > > + struct dsa_port *dp; > > + int ret; > > + > > + /* With LAG of CPU port, compose the mask for LOOKUP MEMBER */ > > + if (netif_is_lag_master(master)) { > > + struct dsa_lag *lag; > > + int id; > > + > > + id = dsa_lag_id(ds->dst, master); > > + lag = dsa_lag_by_id(ds->dst, id); > > + > > + dsa_lag_foreach_port(dp, ds->dst, lag) > > I think you use ds->dst often enough that you could assign it to its own > local variable. > Will do thanks! > > + if (dsa_port_is_cpu(dp)) > > + cpu_port_mask |= BIT(dp->index); > > + } else { > > + dp = dsa_port_from_netdev(master); > > dsa_port_from_netdev() is implemented by calling: > > static inline struct dsa_port *dsa_slave_to_port(const struct net_device *dev) > { > struct dsa_slave_priv *p = netdev_priv(dev); > > return p->dp; > } > > The "struct net_device *master" does not have a netdev_priv() of the > type "struct dsa_slave_priv *". So, this function does not do what you > want, but instead it messes through the guts of an unrelated private > structure, treating whatever it finds at offset 16 as a pointer, and > dereferincing that as a struct dsa_port *. I'm surprised it didn't > crash, to be frank. > > To find the cpu_dp behind the master, you need to dereference > master->dsa_ptr (for which we don't have a helper). > I was searching for an helper but no luck. Is it safe to access master->dsa_ptr? In theory the caller of port_change_master should already check that the passed master is a dsa port? I see in other context that master->dsa_ptr is checked if not NULL. Should I do the same check here? > > + cpu_port_mask |= BIT(dp->index); > > + } > > + > > + /* Disable port */ > > + qca8k_port_set_status(priv, port, 0); > > + > > + /* Connect it to new cpu port */ > > + ret = qca8k_read(priv, QCA8K_PORT_LOOKUP_CTRL(port), &val); > > + if (ret) > > + return ret; > > + > > + /* Reset connected CPU port in LOOKUP MEMBER */ > > + val &= QCA8K_PORT_LOOKUP_USER_MEMBER; > > val &= GENMASK(5, 1) practically has the effect of unsetting BIT(0) and BIT(6). > I suppose those are the 2 possible CPU ports? If so, then use ~dsa_cpu_ports(ds), > it's more readable at least for me as a fallback maintainer. > Yes they are and yes I love this so I can also drop the stupid define. > > + /* Assign the new CPU port in LOOKUP MEMBER */ > > + val |= cpu_port_mask; > > + > > + ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > > + QCA8K_PORT_LOOKUP_MEMBER, > > + val); > > + if (ret) > > + return ret; > > + > > + /* Fast Age the port to flush FDB table */ > > + qca8k_port_fast_age(ds, port); > > Why do you have to fast age the (user) port? > The 2 CPU port have a different mac address, is it a problem? > > + > > + /* Reenable port */ > > + qca8k_port_set_status(priv, port, 1); > > or disable/enable it, for that matter? > The idea is sto stop any traffic flowing to one CPU to another before doing the change. > > + > > + return 0; > > +} > > + > > static void > > qca8k_master_change(struct dsa_switch *ds, const struct net_device *master, > > bool operational) > > @@ -1996,6 +2049,7 @@ static const struct dsa_switch_ops qca8k_switch_ops = { > > .get_phy_flags = qca8k_get_phy_flags, > > .port_lag_join = qca8k_port_lag_join, > > .port_lag_leave = qca8k_port_lag_leave, > > + .port_change_master = qca8k_port_change_master, > > From my notes in commit eca70102cfb1 ("net: dsa: felix: add support for > changing DSA master"), I recall this: > > When we change the DSA master to a LAG device, DSA guarantees us that > the LAG has at least one lower interface as a physical DSA master. > But DSA masters can come and go as lowers of that LAG, and > ds->ops->port_change_master() will not get called, because the DSA > master is still the same (the LAG). So we need to hook into the > ds->ops->port_lag_{join,leave} calls on the CPU ports and update the > logical port ID of the LAG that user ports are assigned to. > > Otherwise said: > > $ ip link add bond0 type bond mode balance-xor && ip link set bond0 up > $ ip link set eth0 down && ip link set eth0 master bond0 # .port_change_master() gets called > $ ip link set eth1 down && ip link set eth1 master bond0 # .port_change_master() does not get called > $ ip link set eth0 nomaster # .port_change_master() does not get called > > Unless something has changed, I believe that you need to handle these as well, > and update the QCA8K_PORT_LOOKUP_MEMBER field. In the case above, your > CPU port association would remain towards eth0, but the bond's lower interface > is eth1. > Can you better describe this case? In theory from the switch view, with a LAG we just set that an user port can receive packet from both CPU port. Or you are saying that when an additional memeber is added to the LAG, port_change_master is not called and we could face a scenario where: - dsa master is LAG - LAG have the 2 CPU port - user port have LAG as master but QCA8K_PORT_LOOKUP_MEMBER with only one CPU? If I got this right, then I get what you mean with the fact that I should update the lag_join/leave definition and refresh each configuration. > > .master_state_change = qca8k_master_change, > > .connect_tag_protocol = qca8k_connect_tag_protocol, > > }; > > diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h > > index c5cc8a172d65..424f851db881 100644 > > --- a/drivers/net/dsa/qca/qca8k.h > > +++ b/drivers/net/dsa/qca/qca8k.h > > @@ -250,6 +250,7 @@ > > #define QCA8K_GLOBAL_FW_CTRL1_MC_DP_MASK GENMASK(14, 8) > > #define QCA8K_GLOBAL_FW_CTRL1_UC_DP_MASK GENMASK(6, 0) > > #define QCA8K_PORT_LOOKUP_CTRL(_i) (0x660 + (_i) * 0xc) > > +#define QCA8K_PORT_LOOKUP_USER_MEMBER GENMASK(5, 1) > > #define QCA8K_PORT_LOOKUP_MEMBER GENMASK(6, 0) > > #define QCA8K_PORT_LOOKUP_VLAN_MODE_MASK GENMASK(9, 8) > > #define QCA8K_PORT_LOOKUP_VLAN_MODE(x) FIELD_PREP(QCA8K_PORT_LOOKUP_VLAN_MODE_MASK, x) > > -- > > 2.40.1 > > >
On Wed, Jun 21, 2023 at 01:25:27PM +0300, Vladimir Oltean wrote: > On Tue, Jun 20, 2023 at 03:04:28PM +0200, Christian Marangi wrote: > > > > + if (dsa_port_is_cpu(dp)) > > > > + cpu_port_mask |= BIT(dp->index); > > > > + } else { > > > > + dp = dsa_port_from_netdev(master); > > > > > > dsa_port_from_netdev() is implemented by calling: > > > > > > static inline struct dsa_port *dsa_slave_to_port(const struct net_device *dev) > > > { > > > struct dsa_slave_priv *p = netdev_priv(dev); > > > > > > return p->dp; > > > } > > > > > > The "struct net_device *master" does not have a netdev_priv() of the > > > type "struct dsa_slave_priv *". So, this function does not do what you > > > want, but instead it messes through the guts of an unrelated private > > > structure, treating whatever it finds at offset 16 as a pointer, and > > > dereferincing that as a struct dsa_port *. I'm surprised it didn't > > > crash, to be frank. > > > > > > To find the cpu_dp behind the master, you need to dereference > > > master->dsa_ptr (for which we don't have a helper). > > > > > > > I was searching for an helper but no luck. Is it safe to access > > master->dsa_ptr? In theory the caller of port_change_master should > > already check that the passed master is a dsa port? > > *that the passed network interface is a master - netdev_uses_dsa() > > What is attached to the DSA master through dev->dsa_ptr is the CPU port. > > what makes a net_device be a DSA master is dsa_master_setup(), and what > makes it stop being that is dsa_master_teardown(). Both are called under > rtnl_lock(), so as long as you are in a calling context where that lock > is held, you can be sure that the value of netdev_uses_dsa() does not > change for a device - and thus the value of dev->dsa_ptr. > > > I see in other context that master->dsa_ptr is checked if not NULL. > > Should I do the same check here? > > Nope. DSA takes care of passing a fully set up DSA master as the > "master" argument, and the calling convention is that rtnl_lock() is held. > Thanks for both clarification! > > > > + /* Assign the new CPU port in LOOKUP MEMBER */ > > > > + val |= cpu_port_mask; > > > > + > > > > + ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > > > > + QCA8K_PORT_LOOKUP_MEMBER, > > > > + val); > > > > + if (ret) > > > > + return ret; > > > > + > > > > + /* Fast Age the port to flush FDB table */ > > > > + qca8k_port_fast_age(ds, port); > > > > > > Why do you have to fast age the (user) port? > > > > > > > The 2 CPU port have a different mac address, is it a problem? > > But fast ageing the user port (which is what "port" is, here) gets rid > of the FDB entries learned on that port as part of the bridging service, > and which have it as a *destination*. So I'm not sure how that operation > would help. The MAC address of the DSA masters, if learned at all, would > not point towards any user port but towards CPU ports. > > FWIW, dsa_port_change_master() takes care of migrating/replaying a lot of > configuration, including the MAC addresses for local address filtering - > dsa_slave_unsync_ha() and dsa_slave_sync_ha(). > I notice that I require assisted_learning_on_cpu_port to make this actually work. Wth this false, bridge fdb show still had entry with the MAC of the old master. With assisted, they gets correctly updated. > That being said, those 2 functions are dead code for your switch, > because dsa_switch_supports_uc_filtering() and dsa_switch_supports_mc_filtering() > both return false. > > It would be good to hear from you how do you plan the qca8k driver to > send and receive packets. From looking at the code (learning on the CPU > port isn't enabled), I guess that the MAC addresses of the ports are > never programmed in the FDB and thus, they reach the CPU by flooding, > with the usual drawbacks that come with that - packets destined for > local termination will also be flooded to other stations in the bridging > domain. Getting rid of the reliance on flooding will have its own > challenges. You can't enable automatic address learning [ on the CPU > ports ] with multiple active CPU ports, because one FDB entry could ping > pong from one CPU port to the other, leading to packet loss from certain > user ports when the FDB entry points to the CPU port that isn't affine > to the inbound port. So you'd probably need to program some sort of > "multicast" FDB entries that target all CPU ports, and rely on the > PORT_VID_MEMBER field to restrict forwarding to only one of those CPU > ports at a time. > Eh I really think this is not trivial at all and I would love some help. With further testing, to make this actually work I had to operate on the GLOBAL_FW_CTRL1 regs that handle how to treat unknown frames of all kind. They are classified as unknown when the DA is not contained in the ARL table and are split in IGMP, BROAD (broadcast), MULTI (multicast) and UNI (unicast) and just are just the FLOOD option. Saddly in the current configuration to make the secondary CPU port work, I had to set the flooding to both CPU port and I guess this is extremely wrong since I assume linux would receive double the packet for each unknown frame. And this match what you have theorized about the need of a multicast FDB entry I guess? Main problem is that I have some fear the switch doesn't support controlling flooding with ARL or ACL (but I have to check this better) Just for reference this is the current fdb table 01:00:5e:00:00:01 dev eth0 self permanent 33:33:00:00:00:02 dev eth0 self permanent 33:33:00:00:00:01 dev eth0 self permanent 33:33:ff:f2:5d:50 dev eth0 self permanent 33:33:ff:00:00:00 dev eth0 self permanent dc:ef:09:f2:5d:4f dev eth1 self permanent 33:33:00:00:00:01 dev eth1 self permanent 33:33:00:00:00:02 dev eth1 self permanent 01:00:5e:00:00:01 dev eth1 self permanent 33:33:ff:f2:5d:4f dev eth1 self permanent 33:33:ff:00:00:00 dev eth1 self permanent c0:3e:ba:c1:d7:47 dev lan1 master br-lan dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self 33:33:00:00:00:01 dev wlan0 self permanent 33:33:00:00:00:02 dev wlan0 self permanent 33:33:00:00:00:01 dev wlan1 self permanent 33:33:00:00:00:02 dev wlan1 self permanent 33:33:00:00:00:01 dev br-lan self permanent 33:33:00:00:00:02 dev br-lan self permanent 01:00:5e:00:00:01 dev br-lan self permanent 33:33:ff:00:00:01 dev br-lan self permanent 33:33:ff:f2:5d:4f dev br-lan self permanent 33:33:00:01:00:02 dev br-lan self permanent 33:33:00:01:00:03 dev br-lan self permanent 33:33:ff:00:00:00 dev br-lan self permanent > > > > + > > > > + /* Reenable port */ > > > > + qca8k_port_set_status(priv, port, 1); > > > > > > or disable/enable it, for that matter? > > > > > > > The idea is sto stop any traffic flowing to one CPU to another before > > doing the change. > > Both DSA masters are prepared to handle traffic when port_change_master() > is called, so unless there's some limitation in the qca8k driver, there > shouldn't be any in DSA. > This is nice, I will drop all the bloat. > > > From my notes in commit eca70102cfb1 ("net: dsa: felix: add support for > > > changing DSA master"), I recall this: > > > > > > When we change the DSA master to a LAG device, DSA guarantees us that > > > the LAG has at least one lower interface as a physical DSA master. > > > But DSA masters can come and go as lowers of that LAG, and > > > ds->ops->port_change_master() will not get called, because the DSA > > > master is still the same (the LAG). So we need to hook into the > > > ds->ops->port_lag_{join,leave} calls on the CPU ports and update the > > > logical port ID of the LAG that user ports are assigned to. > > > > > > Otherwise said: > > > > > > $ ip link add bond0 type bond mode balance-xor && ip link set bond0 up > > > $ ip link set eth0 down && ip link set eth0 master bond0 # .port_change_master() gets called > > > $ ip link set eth1 down && ip link set eth1 master bond0 # .port_change_master() does not get called > > > $ ip link set eth0 nomaster # .port_change_master() does not get called > > > > > > Unless something has changed, I believe that you need to handle these as well, > > > and update the QCA8K_PORT_LOOKUP_MEMBER field. In the case above, your > > > CPU port association would remain towards eth0, but the bond's lower interface > > > is eth1. > > > > > > > Can you better describe this case? > > > > In theory from the switch view, with a LAG we just set that an user port > > can receive packet from both CPU port. > > > > Or you are saying that when an additional memeber is added to the LAG, > > port_change_master is not called and we could face a scenario where: > > > > - dsa master is LAG > > - LAG have the 2 CPU port > > - user port have LAG as master but QCA8K_PORT_LOOKUP_MEMBER with only > > one CPU? > > > > If I got this right, then I get what you mean with the fact that I > > should update the lag_join/leave definition and refresh each > > configuration. > > In Documentation/networking/dsa/configuration.rst I gave 2 examples of > changing the DSA master to be a LAG. > > In the list of 4 commands I posted in the previous reply, I assumed that > eth0 is the original DSA master, and eth1 is the second (initially inactive) > DSA master. > > When eth0 joins a LAG, DSA notices that and implicitly migrates all user > ports affine to eth0 towards bond0 as the new DSA master. At that time, > .port_change_master() will be called for all user ports under eth0, to > be notified that the new DSA master is bond0. > > Once all user ports have bond0 as a DSA master, .port_change_master() > will no longer be called as long as bond0 remains their DSA master. > But the lower port configuration of bond0 can still change. > > During the command where eth1 also becomes a lower port of bond0, DSA > just calls .port_lag_join() for the CPU port attached to eth1, and you > need to handle that and update the CPU port mask. Same thing when eth0 > leaves bond0. Ok thanks for the clarification, I will implement refresh function in lag join and leave.
Hi Christian, On Tue, Jun 20, 2023 at 08:37:47AM +0200, Christian Marangi wrote: > Add support for port_change_master to permit assigning an alternative > CPU port if the switch have both CPU port connected or create a LAG on > both CPU port and assign the LAG as DSA master. > > On port change master request, we check if the master is a LAG. > With LAG we compose the cpu_port_mask with the CPU port in the LAG, if > master is a simple dsa_port, we derive the index. > > Finally we apply the new cpu_port_mask to the LOOKUP MEMBER to permit > the port to receive packet by the new CPU port setup for the port and > we reenable the target port previously disabled. > > Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> > --- > drivers/net/dsa/qca/qca8k-8xxx.c | 54 ++++++++++++++++++++++++++++++++ > drivers/net/dsa/qca/qca8k.h | 1 + > 2 files changed, 55 insertions(+) > > diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c > index dee7b6579916..435b69c1c552 100644 > --- a/drivers/net/dsa/qca/qca8k-8xxx.c > +++ b/drivers/net/dsa/qca/qca8k-8xxx.c > @@ -1713,6 +1713,59 @@ qca8k_get_tag_protocol(struct dsa_switch *ds, int port, > return DSA_TAG_PROTO_QCA; > } > > +static int qca8k_port_change_master(struct dsa_switch *ds, int port, > + struct net_device *master, > + struct netlink_ext_ack *extack) > +{ > + struct qca8k_priv *priv = ds->priv; > + u32 val, cpu_port_mask = 0; > + struct dsa_port *dp; > + int ret; > + > + /* With LAG of CPU port, compose the mask for LOOKUP MEMBER */ > + if (netif_is_lag_master(master)) { > + struct dsa_lag *lag; > + int id; > + > + id = dsa_lag_id(ds->dst, master); > + lag = dsa_lag_by_id(ds->dst, id); > + > + dsa_lag_foreach_port(dp, ds->dst, lag) I think you use ds->dst often enough that you could assign it to its own local variable. > + if (dsa_port_is_cpu(dp)) > + cpu_port_mask |= BIT(dp->index); > + } else { > + dp = dsa_port_from_netdev(master); dsa_port_from_netdev() is implemented by calling: static inline struct dsa_port *dsa_slave_to_port(const struct net_device *dev) { struct dsa_slave_priv *p = netdev_priv(dev); return p->dp; } The "struct net_device *master" does not have a netdev_priv() of the type "struct dsa_slave_priv *". So, this function does not do what you want, but instead it messes through the guts of an unrelated private structure, treating whatever it finds at offset 16 as a pointer, and dereferincing that as a struct dsa_port *. I'm surprised it didn't crash, to be frank. To find the cpu_dp behind the master, you need to dereference master->dsa_ptr (for which we don't have a helper). > + cpu_port_mask |= BIT(dp->index); > + } > + > + /* Disable port */ > + qca8k_port_set_status(priv, port, 0); > + > + /* Connect it to new cpu port */ > + ret = qca8k_read(priv, QCA8K_PORT_LOOKUP_CTRL(port), &val); > + if (ret) > + return ret; > + > + /* Reset connected CPU port in LOOKUP MEMBER */ > + val &= QCA8K_PORT_LOOKUP_USER_MEMBER; val &= GENMASK(5, 1) practically has the effect of unsetting BIT(0) and BIT(6). I suppose those are the 2 possible CPU ports? If so, then use ~dsa_cpu_ports(ds), it's more readable at least for me as a fallback maintainer. > + /* Assign the new CPU port in LOOKUP MEMBER */ > + val |= cpu_port_mask; > + > + ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > + QCA8K_PORT_LOOKUP_MEMBER, > + val); > + if (ret) > + return ret; > + > + /* Fast Age the port to flush FDB table */ > + qca8k_port_fast_age(ds, port); Why do you have to fast age the (user) port? > + > + /* Reenable port */ > + qca8k_port_set_status(priv, port, 1); or disable/enable it, for that matter? > + > + return 0; > +} > + > static void > qca8k_master_change(struct dsa_switch *ds, const struct net_device *master, > bool operational) > @@ -1996,6 +2049,7 @@ static const struct dsa_switch_ops qca8k_switch_ops = { > .get_phy_flags = qca8k_get_phy_flags, > .port_lag_join = qca8k_port_lag_join, > .port_lag_leave = qca8k_port_lag_leave, > + .port_change_master = qca8k_port_change_master, From my notes in commit eca70102cfb1 ("net: dsa: felix: add support for changing DSA master"), I recall this: When we change the DSA master to a LAG device, DSA guarantees us that the LAG has at least one lower interface as a physical DSA master. But DSA masters can come and go as lowers of that LAG, and ds->ops->port_change_master() will not get called, because the DSA master is still the same (the LAG). So we need to hook into the ds->ops->port_lag_{join,leave} calls on the CPU ports and update the logical port ID of the LAG that user ports are assigned to. Otherwise said: $ ip link add bond0 type bond mode balance-xor && ip link set bond0 up $ ip link set eth0 down && ip link set eth0 master bond0 # .port_change_master() gets called $ ip link set eth1 down && ip link set eth1 master bond0 # .port_change_master() does not get called $ ip link set eth0 nomaster # .port_change_master() does not get called Unless something has changed, I believe that you need to handle these as well, and update the QCA8K_PORT_LOOKUP_MEMBER field. In the case above, your CPU port association would remain towards eth0, but the bond's lower interface is eth1. > .master_state_change = qca8k_master_change, > .connect_tag_protocol = qca8k_connect_tag_protocol, > }; > diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h > index c5cc8a172d65..424f851db881 100644 > --- a/drivers/net/dsa/qca/qca8k.h > +++ b/drivers/net/dsa/qca/qca8k.h > @@ -250,6 +250,7 @@ > #define QCA8K_GLOBAL_FW_CTRL1_MC_DP_MASK GENMASK(14, 8) > #define QCA8K_GLOBAL_FW_CTRL1_UC_DP_MASK GENMASK(6, 0) > #define QCA8K_PORT_LOOKUP_CTRL(_i) (0x660 + (_i) * 0xc) > +#define QCA8K_PORT_LOOKUP_USER_MEMBER GENMASK(5, 1) > #define QCA8K_PORT_LOOKUP_MEMBER GENMASK(6, 0) > #define QCA8K_PORT_LOOKUP_VLAN_MODE_MASK GENMASK(9, 8) > #define QCA8K_PORT_LOOKUP_VLAN_MODE(x) FIELD_PREP(QCA8K_PORT_LOOKUP_VLAN_MODE_MASK, x) > -- > 2.40.1 >
On Wed, Jun 21, 2023 at 03:17:42PM +0300, Vladimir Oltean wrote: > On Tue, Jun 20, 2023 at 08:52:27PM +0200, Christian Marangi wrote: > > On Wed, Jun 21, 2023 at 01:25:27PM +0300, Vladimir Oltean wrote: > > > > > Why do you have to fast age the (user) port? > > > > > > > > > > > > > The 2 CPU port have a different mac address, is it a problem? > > > > > > But fast ageing the user port (which is what "port" is, here) gets rid > > > of the FDB entries learned on that port as part of the bridging service, > > > and which have it as a *destination*. So I'm not sure how that operation > > > would help. The MAC address of the DSA masters, if learned at all, would > > > not point towards any user port but towards CPU ports. > > > > > > FWIW, dsa_port_change_master() takes care of migrating/replaying a lot of > > > configuration, including the MAC addresses for local address filtering - > > > dsa_slave_unsync_ha() and dsa_slave_sync_ha(). > > > > > > > I notice that I require assisted_learning_on_cpu_port to make this > > actually work. > > > > Wth this false, bridge fdb show still had entry with the MAC of the old > > master. With assisted, they gets correctly updated. > > The behavior you're describing does not ring a bell to me. > > What assisted_learning_on_cpu_port does is, when DSA is in a bridge with > a foreign device like wlan0, and wlan0 learns (in software, through the > bridge driver) a MAC address from a connected station, DSA calls > port_fdb_add() for that MAC address on the CPU ports associated with all > user ports from that bridge. > > So it has nothing to do with the MAC address of the DSA master (100% sure). > > Nonetheless, if you implement port_change_master(), you need to make > sure that port_fdb_add() on the CPU port does something sensible in the > presence of multiple simultaneously active CPU ports. Reading some > commit messages from the development of this feature on the felix driver > may give you some ideas. > I will search them. > > > It would be good to hear from you how do you plan the qca8k driver to > > > send and receive packets. From looking at the code (learning on the CPU > > > port isn't enabled), I guess that the MAC addresses of the ports are > > > never programmed in the FDB and thus, they reach the CPU by flooding, > > > with the usual drawbacks that come with that - packets destined for > > > local termination will also be flooded to other stations in the bridging > > > domain. Getting rid of the reliance on flooding will have its own > > > challenges. You can't enable automatic address learning [ on the CPU > > > ports ] with multiple active CPU ports, because one FDB entry could ping > > > pong from one CPU port to the other, leading to packet loss from certain > > > user ports when the FDB entry points to the CPU port that isn't affine > > > to the inbound port. So you'd probably need to program some sort of > > > "multicast" FDB entries that target all CPU ports, and rely on the > > > PORT_VID_MEMBER field to restrict forwarding to only one of those CPU > > > ports at a time. > > > > > > > Eh I really think this is not trivial at all and I would love some help. > > > > With further testing, to make this actually work I had to operate on the > > GLOBAL_FW_CTRL1 regs that handle how to treat unknown frames of all > > kind. > > They are classified as unknown when the DA is not contained in the ARL > > table and are split in IGMP, BROAD (broadcast), MULTI (multicast) and > > UNI (unicast) and just are just the FLOOD option. > > The qca8k driver lags a bit (more) behind when it comes to implementing > the bridge offload API efficiently (and correctly). > I'm pushing to finally use it on every ipq806x device we have on OpenWRT hoping to get more traction in the use... Also ipq40xx should receive support sometimes this year... > Namely, it floods everything to the single (first) CPU port and does not > implement any autonomous flooding to other user ports in the same > bridging domain. So it relies on software flooding - tag_qca.c does not > set skb->offload_fwd_mark = true for any kind of packet. Here, I'm > assuming that there isn't any inherent limitation that prevents > autonomous flooding from working, and this would offload some tasks from > the CPU. I would love to do that, I need to check if I can find some example on other driver and see if the thing can be handled with the ACL table or the ARL. The thing is very flexible but maybe too much and hard to understand for the task we need to accomplish. > > In terms of correctness, it enables address learning on all user ports > by default, which is incorrect because user ports should only learn if > they are under a bridge *and* that bridge has learning enabled on that > port. Standalone ports should have address learning disabled, otherwise > learning will lead to packet loss when connecting two standalone user > ports to the same LAN. Commit 15f7cfae912e ("net: dsa: microchip: make > learning configurable and keep it off while standalone") comes to mind > as an example. I should have a commit ready for this but I wonder if there is more to fix looking at the fdb dump. > > I would consider implementing .port_pre_bridge_flags() and .port_bridge_flags() > before jumping straight ahead to such advanced topics like .port_change_master(). > > In general, see what else there is in Documentation/networking/dsa/dsa.rst, > because the API documentation has been overhauled relatively recently. > > > Saddly in the current configuration to make the secondary CPU port work, > > I had to set the flooding to both CPU port and I guess this is extremely > > wrong since I assume linux would receive double the packet for each > > unknown frame. > > Not sure if it's "extremely wrong", maybe it isn't. A civilized hardware > implementation would probably restrict the flooding destinations with the > QCA8K_PORT_LOOKUP_MEMBER mask of the inbound port. So as long as a single > CPU port is present in that, then even when the flooding destination masks > contain multiple bits set, each packet goes to a single destination. > > With LAG it should be similar, except that the destination mask is further > restricted by a port mask indexed by a hash calculated from packet headers. > > It just needs serious testing. Arınç ÜNAL did similar work with mausezahn > and tcpdump for testing flooding, trapping etc on the mt7530 driver, > while adding support for multiple CPU ports. > A testing way is to check if packet received to the CPU port are actually dpulicated on flooding right? > > And this match what you have theorized about the need of a multicast FDB > > entry I guess? Main problem is that I have some fear the switch doesn't > > support controlling flooding with ARL or ACL (but I have to check this > > better) > > Not sure that I understand what you're saying here. > > > Just for reference this is the current fdb table > > > > 01:00:5e:00:00:01 dev eth0 self permanent > > 33:33:00:00:00:02 dev eth0 self permanent > > 33:33:00:00:00:01 dev eth0 self permanent > > 33:33:ff:f2:5d:50 dev eth0 self permanent > > 33:33:ff:00:00:00 dev eth0 self permanent > > dc:ef:09:f2:5d:4f dev eth1 self permanent > > 33:33:00:00:00:01 dev eth1 self permanent > > 33:33:00:00:00:02 dev eth1 self permanent > > 01:00:5e:00:00:01 dev eth1 self permanent > > 33:33:ff:f2:5d:4f dev eth1 self permanent > > 33:33:ff:00:00:00 dev eth1 self permanent > > c0:3e:ba:c1:d7:47 dev lan1 master br-lan > > The entries with "master" are present in the software bridge database. > > > dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent > > The entries with "permanent" (synonym is "local", BR_FDB_LOCAL in the code) > are FDB entries that are intended for local termination. They are > present on a bridge port, but they really mean that packets should go to > software (the CPU) for local termination, not towards that bridge port. > > > dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent > > c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self > > The entries with "self" are present in the hardware bridge database. > > > 33:33:00:00:00:01 dev wlan0 self permanent > > 33:33:00:00:00:02 dev wlan0 self permanent > > 33:33:00:00:00:01 dev wlan1 self permanent > > 33:33:00:00:00:02 dev wlan1 self permanent > > 33:33:00:00:00:01 dev br-lan self permanent > > 33:33:00:00:00:02 dev br-lan self permanent > > 01:00:5e:00:00:01 dev br-lan self permanent > > 33:33:ff:00:00:01 dev br-lan self permanent > > 33:33:ff:f2:5d:4f dev br-lan self permanent > > 33:33:00:01:00:02 dev br-lan self permanent > > 33:33:00:01:00:03 dev br-lan self permanent > > 33:33:ff:00:00:00 dev br-lan self permanent > > So I would guess that dc:ef:09:f2:5d:4f is the MAC address of br-lan, > inherited from the MAC address of the first bridge port (DSA switch > port). In turn, if there is no MAC address for ports in the device tree, > DSA inherits the MAC address from the master. So that's a plausible > avenue for the DSA master's MAC address to reach the bridge FDB. > > The bridge notifies local addresses on the switchdev chain with the > fdb_info->is_local bit set, and DSA calls port_fdb_add() on the CPU > ports for them. So that's how they might reach the hardware FDB. > But again, port_fdb_add() programs static FDB entries, and fast ageing > doesn't remove those. Did you notice anything wrong in this dump? Entry that should have been deleted but they aren't?
On Tue, Jun 20, 2023 at 03:04:28PM +0200, Christian Marangi wrote: > > > + if (dsa_port_is_cpu(dp)) > > > + cpu_port_mask |= BIT(dp->index); > > > + } else { > > > + dp = dsa_port_from_netdev(master); > > > > dsa_port_from_netdev() is implemented by calling: > > > > static inline struct dsa_port *dsa_slave_to_port(const struct net_device *dev) > > { > > struct dsa_slave_priv *p = netdev_priv(dev); > > > > return p->dp; > > } > > > > The "struct net_device *master" does not have a netdev_priv() of the > > type "struct dsa_slave_priv *". So, this function does not do what you > > want, but instead it messes through the guts of an unrelated private > > structure, treating whatever it finds at offset 16 as a pointer, and > > dereferincing that as a struct dsa_port *. I'm surprised it didn't > > crash, to be frank. > > > > To find the cpu_dp behind the master, you need to dereference > > master->dsa_ptr (for which we don't have a helper). > > > > I was searching for an helper but no luck. Is it safe to access > master->dsa_ptr? In theory the caller of port_change_master should > already check that the passed master is a dsa port? *that the passed network interface is a master - netdev_uses_dsa() What is attached to the DSA master through dev->dsa_ptr is the CPU port. what makes a net_device be a DSA master is dsa_master_setup(), and what makes it stop being that is dsa_master_teardown(). Both are called under rtnl_lock(), so as long as you are in a calling context where that lock is held, you can be sure that the value of netdev_uses_dsa() does not change for a device - and thus the value of dev->dsa_ptr. > I see in other context that master->dsa_ptr is checked if not NULL. > Should I do the same check here? Nope. DSA takes care of passing a fully set up DSA master as the "master" argument, and the calling convention is that rtnl_lock() is held. > > > + /* Assign the new CPU port in LOOKUP MEMBER */ > > > + val |= cpu_port_mask; > > > + > > > + ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), > > > + QCA8K_PORT_LOOKUP_MEMBER, > > > + val); > > > + if (ret) > > > + return ret; > > > + > > > + /* Fast Age the port to flush FDB table */ > > > + qca8k_port_fast_age(ds, port); > > > > Why do you have to fast age the (user) port? > > > > The 2 CPU port have a different mac address, is it a problem? But fast ageing the user port (which is what "port" is, here) gets rid of the FDB entries learned on that port as part of the bridging service, and which have it as a *destination*. So I'm not sure how that operation would help. The MAC address of the DSA masters, if learned at all, would not point towards any user port but towards CPU ports. FWIW, dsa_port_change_master() takes care of migrating/replaying a lot of configuration, including the MAC addresses for local address filtering - dsa_slave_unsync_ha() and dsa_slave_sync_ha(). That being said, those 2 functions are dead code for your switch, because dsa_switch_supports_uc_filtering() and dsa_switch_supports_mc_filtering() both return false. It would be good to hear from you how do you plan the qca8k driver to send and receive packets. From looking at the code (learning on the CPU port isn't enabled), I guess that the MAC addresses of the ports are never programmed in the FDB and thus, they reach the CPU by flooding, with the usual drawbacks that come with that - packets destined for local termination will also be flooded to other stations in the bridging domain. Getting rid of the reliance on flooding will have its own challenges. You can't enable automatic address learning [ on the CPU ports ] with multiple active CPU ports, because one FDB entry could ping pong from one CPU port to the other, leading to packet loss from certain user ports when the FDB entry points to the CPU port that isn't affine to the inbound port. So you'd probably need to program some sort of "multicast" FDB entries that target all CPU ports, and rely on the PORT_VID_MEMBER field to restrict forwarding to only one of those CPU ports at a time. > > > + > > > + /* Reenable port */ > > > + qca8k_port_set_status(priv, port, 1); > > > > or disable/enable it, for that matter? > > > > The idea is sto stop any traffic flowing to one CPU to another before > doing the change. Both DSA masters are prepared to handle traffic when port_change_master() is called, so unless there's some limitation in the qca8k driver, there shouldn't be any in DSA. > > From my notes in commit eca70102cfb1 ("net: dsa: felix: add support for > > changing DSA master"), I recall this: > > > > When we change the DSA master to a LAG device, DSA guarantees us that > > the LAG has at least one lower interface as a physical DSA master. > > But DSA masters can come and go as lowers of that LAG, and > > ds->ops->port_change_master() will not get called, because the DSA > > master is still the same (the LAG). So we need to hook into the > > ds->ops->port_lag_{join,leave} calls on the CPU ports and update the > > logical port ID of the LAG that user ports are assigned to. > > > > Otherwise said: > > > > $ ip link add bond0 type bond mode balance-xor && ip link set bond0 up > > $ ip link set eth0 down && ip link set eth0 master bond0 # .port_change_master() gets called > > $ ip link set eth1 down && ip link set eth1 master bond0 # .port_change_master() does not get called > > $ ip link set eth0 nomaster # .port_change_master() does not get called > > > > Unless something has changed, I believe that you need to handle these as well, > > and update the QCA8K_PORT_LOOKUP_MEMBER field. In the case above, your > > CPU port association would remain towards eth0, but the bond's lower interface > > is eth1. > > > > Can you better describe this case? > > In theory from the switch view, with a LAG we just set that an user port > can receive packet from both CPU port. > > Or you are saying that when an additional memeber is added to the LAG, > port_change_master is not called and we could face a scenario where: > > - dsa master is LAG > - LAG have the 2 CPU port > - user port have LAG as master but QCA8K_PORT_LOOKUP_MEMBER with only > one CPU? > > If I got this right, then I get what you mean with the fact that I > should update the lag_join/leave definition and refresh each > configuration. In Documentation/networking/dsa/configuration.rst I gave 2 examples of changing the DSA master to be a LAG. In the list of 4 commands I posted in the previous reply, I assumed that eth0 is the original DSA master, and eth1 is the second (initially inactive) DSA master. When eth0 joins a LAG, DSA notices that and implicitly migrates all user ports affine to eth0 towards bond0 as the new DSA master. At that time, .port_change_master() will be called for all user ports under eth0, to be notified that the new DSA master is bond0. Once all user ports have bond0 as a DSA master, .port_change_master() will no longer be called as long as bond0 remains their DSA master. But the lower port configuration of bond0 can still change. During the command where eth1 also becomes a lower port of bond0, DSA just calls .port_lag_join() for the CPU port attached to eth1, and you need to handle that and update the CPU port mask. Same thing when eth0 leaves bond0.
On Tue, Jun 20, 2023 at 08:52:27PM +0200, Christian Marangi wrote: > On Wed, Jun 21, 2023 at 01:25:27PM +0300, Vladimir Oltean wrote: > > > > Why do you have to fast age the (user) port? > > > > > > > > > > The 2 CPU port have a different mac address, is it a problem? > > > > But fast ageing the user port (which is what "port" is, here) gets rid > > of the FDB entries learned on that port as part of the bridging service, > > and which have it as a *destination*. So I'm not sure how that operation > > would help. The MAC address of the DSA masters, if learned at all, would > > not point towards any user port but towards CPU ports. > > > > FWIW, dsa_port_change_master() takes care of migrating/replaying a lot of > > configuration, including the MAC addresses for local address filtering - > > dsa_slave_unsync_ha() and dsa_slave_sync_ha(). > > > > I notice that I require assisted_learning_on_cpu_port to make this > actually work. > > Wth this false, bridge fdb show still had entry with the MAC of the old > master. With assisted, they gets correctly updated. The behavior you're describing does not ring a bell to me. What assisted_learning_on_cpu_port does is, when DSA is in a bridge with a foreign device like wlan0, and wlan0 learns (in software, through the bridge driver) a MAC address from a connected station, DSA calls port_fdb_add() for that MAC address on the CPU ports associated with all user ports from that bridge. So it has nothing to do with the MAC address of the DSA master (100% sure). Nonetheless, if you implement port_change_master(), you need to make sure that port_fdb_add() on the CPU port does something sensible in the presence of multiple simultaneously active CPU ports. Reading some commit messages from the development of this feature on the felix driver may give you some ideas. > > It would be good to hear from you how do you plan the qca8k driver to > > send and receive packets. From looking at the code (learning on the CPU > > port isn't enabled), I guess that the MAC addresses of the ports are > > never programmed in the FDB and thus, they reach the CPU by flooding, > > with the usual drawbacks that come with that - packets destined for > > local termination will also be flooded to other stations in the bridging > > domain. Getting rid of the reliance on flooding will have its own > > challenges. You can't enable automatic address learning [ on the CPU > > ports ] with multiple active CPU ports, because one FDB entry could ping > > pong from one CPU port to the other, leading to packet loss from certain > > user ports when the FDB entry points to the CPU port that isn't affine > > to the inbound port. So you'd probably need to program some sort of > > "multicast" FDB entries that target all CPU ports, and rely on the > > PORT_VID_MEMBER field to restrict forwarding to only one of those CPU > > ports at a time. > > > > Eh I really think this is not trivial at all and I would love some help. > > With further testing, to make this actually work I had to operate on the > GLOBAL_FW_CTRL1 regs that handle how to treat unknown frames of all > kind. > They are classified as unknown when the DA is not contained in the ARL > table and are split in IGMP, BROAD (broadcast), MULTI (multicast) and > UNI (unicast) and just are just the FLOOD option. The qca8k driver lags a bit (more) behind when it comes to implementing the bridge offload API efficiently (and correctly). Namely, it floods everything to the single (first) CPU port and does not implement any autonomous flooding to other user ports in the same bridging domain. So it relies on software flooding - tag_qca.c does not set skb->offload_fwd_mark = true for any kind of packet. Here, I'm assuming that there isn't any inherent limitation that prevents autonomous flooding from working, and this would offload some tasks from the CPU. In terms of correctness, it enables address learning on all user ports by default, which is incorrect because user ports should only learn if they are under a bridge *and* that bridge has learning enabled on that port. Standalone ports should have address learning disabled, otherwise learning will lead to packet loss when connecting two standalone user ports to the same LAN. Commit 15f7cfae912e ("net: dsa: microchip: make learning configurable and keep it off while standalone") comes to mind as an example. I would consider implementing .port_pre_bridge_flags() and .port_bridge_flags() before jumping straight ahead to such advanced topics like .port_change_master(). In general, see what else there is in Documentation/networking/dsa/dsa.rst, because the API documentation has been overhauled relatively recently. > Saddly in the current configuration to make the secondary CPU port work, > I had to set the flooding to both CPU port and I guess this is extremely > wrong since I assume linux would receive double the packet for each > unknown frame. Not sure if it's "extremely wrong", maybe it isn't. A civilized hardware implementation would probably restrict the flooding destinations with the QCA8K_PORT_LOOKUP_MEMBER mask of the inbound port. So as long as a single CPU port is present in that, then even when the flooding destination masks contain multiple bits set, each packet goes to a single destination. With LAG it should be similar, except that the destination mask is further restricted by a port mask indexed by a hash calculated from packet headers. It just needs serious testing. Arınç ÜNAL did similar work with mausezahn and tcpdump for testing flooding, trapping etc on the mt7530 driver, while adding support for multiple CPU ports. > And this match what you have theorized about the need of a multicast FDB > entry I guess? Main problem is that I have some fear the switch doesn't > support controlling flooding with ARL or ACL (but I have to check this > better) Not sure that I understand what you're saying here. > Just for reference this is the current fdb table > > 01:00:5e:00:00:01 dev eth0 self permanent > 33:33:00:00:00:02 dev eth0 self permanent > 33:33:00:00:00:01 dev eth0 self permanent > 33:33:ff:f2:5d:50 dev eth0 self permanent > 33:33:ff:00:00:00 dev eth0 self permanent > dc:ef:09:f2:5d:4f dev eth1 self permanent > 33:33:00:00:00:01 dev eth1 self permanent > 33:33:00:00:00:02 dev eth1 self permanent > 01:00:5e:00:00:01 dev eth1 self permanent > 33:33:ff:f2:5d:4f dev eth1 self permanent > 33:33:ff:00:00:00 dev eth1 self permanent > c0:3e:ba:c1:d7:47 dev lan1 master br-lan The entries with "master" are present in the software bridge database. > dc:ef:09:f2:5d:4f dev lan1 vlan 1 master br-lan permanent The entries with "permanent" (synonym is "local", BR_FDB_LOCAL in the code) are FDB entries that are intended for local termination. They are present on a bridge port, but they really mean that packets should go to software (the CPU) for local termination, not towards that bridge port. > dc:ef:09:f2:5d:4f dev lan1 master br-lan permanent > c0:3e:ba:c1:d7:47 dev lan1 vlan 1 self The entries with "self" are present in the hardware bridge database. > 33:33:00:00:00:01 dev wlan0 self permanent > 33:33:00:00:00:02 dev wlan0 self permanent > 33:33:00:00:00:01 dev wlan1 self permanent > 33:33:00:00:00:02 dev wlan1 self permanent > 33:33:00:00:00:01 dev br-lan self permanent > 33:33:00:00:00:02 dev br-lan self permanent > 01:00:5e:00:00:01 dev br-lan self permanent > 33:33:ff:00:00:01 dev br-lan self permanent > 33:33:ff:f2:5d:4f dev br-lan self permanent > 33:33:00:01:00:02 dev br-lan self permanent > 33:33:00:01:00:03 dev br-lan self permanent > 33:33:ff:00:00:00 dev br-lan self permanent So I would guess that dc:ef:09:f2:5d:4f is the MAC address of br-lan, inherited from the MAC address of the first bridge port (DSA switch port). In turn, if there is no MAC address for ports in the device tree, DSA inherits the MAC address from the master. So that's a plausible avenue for the DSA master's MAC address to reach the bridge FDB. The bridge notifies local addresses on the switchdev chain with the fdb_info->is_local bit set, and DSA calls port_fdb_add() on the CPU ports for them. So that's how they might reach the hardware FDB. But again, port_fdb_add() programs static FDB entries, and fast ageing doesn't remove those.
diff --git a/drivers/net/dsa/qca/qca8k-8xxx.c b/drivers/net/dsa/qca/qca8k-8xxx.c index dee7b6579916..435b69c1c552 100644 --- a/drivers/net/dsa/qca/qca8k-8xxx.c +++ b/drivers/net/dsa/qca/qca8k-8xxx.c @@ -1713,6 +1713,59 @@ qca8k_get_tag_protocol(struct dsa_switch *ds, int port, return DSA_TAG_PROTO_QCA; } +static int qca8k_port_change_master(struct dsa_switch *ds, int port, + struct net_device *master, + struct netlink_ext_ack *extack) +{ + struct qca8k_priv *priv = ds->priv; + u32 val, cpu_port_mask = 0; + struct dsa_port *dp; + int ret; + + /* With LAG of CPU port, compose the mask for LOOKUP MEMBER */ + if (netif_is_lag_master(master)) { + struct dsa_lag *lag; + int id; + + id = dsa_lag_id(ds->dst, master); + lag = dsa_lag_by_id(ds->dst, id); + + dsa_lag_foreach_port(dp, ds->dst, lag) + if (dsa_port_is_cpu(dp)) + cpu_port_mask |= BIT(dp->index); + } else { + dp = dsa_port_from_netdev(master); + cpu_port_mask |= BIT(dp->index); + } + + /* Disable port */ + qca8k_port_set_status(priv, port, 0); + + /* Connect it to new cpu port */ + ret = qca8k_read(priv, QCA8K_PORT_LOOKUP_CTRL(port), &val); + if (ret) + return ret; + + /* Reset connected CPU port in LOOKUP MEMBER */ + val &= QCA8K_PORT_LOOKUP_USER_MEMBER; + /* Assign the new CPU port in LOOKUP MEMBER */ + val |= cpu_port_mask; + + ret = qca8k_rmw(priv, QCA8K_PORT_LOOKUP_CTRL(port), + QCA8K_PORT_LOOKUP_MEMBER, + val); + if (ret) + return ret; + + /* Fast Age the port to flush FDB table */ + qca8k_port_fast_age(ds, port); + + /* Reenable port */ + qca8k_port_set_status(priv, port, 1); + + return 0; +} + static void qca8k_master_change(struct dsa_switch *ds, const struct net_device *master, bool operational) @@ -1996,6 +2049,7 @@ static const struct dsa_switch_ops qca8k_switch_ops = { .get_phy_flags = qca8k_get_phy_flags, .port_lag_join = qca8k_port_lag_join, .port_lag_leave = qca8k_port_lag_leave, + .port_change_master = qca8k_port_change_master, .master_state_change = qca8k_master_change, .connect_tag_protocol = qca8k_connect_tag_protocol, }; diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h index c5cc8a172d65..424f851db881 100644 --- a/drivers/net/dsa/qca/qca8k.h +++ b/drivers/net/dsa/qca/qca8k.h @@ -250,6 +250,7 @@ #define QCA8K_GLOBAL_FW_CTRL1_MC_DP_MASK GENMASK(14, 8) #define QCA8K_GLOBAL_FW_CTRL1_UC_DP_MASK GENMASK(6, 0) #define QCA8K_PORT_LOOKUP_CTRL(_i) (0x660 + (_i) * 0xc) +#define QCA8K_PORT_LOOKUP_USER_MEMBER GENMASK(5, 1) #define QCA8K_PORT_LOOKUP_MEMBER GENMASK(6, 0) #define QCA8K_PORT_LOOKUP_VLAN_MODE_MASK GENMASK(9, 8) #define QCA8K_PORT_LOOKUP_VLAN_MODE(x) FIELD_PREP(QCA8K_PORT_LOOKUP_VLAN_MODE_MASK, x)