[next] bonding: Extending LACP MUX State Machine to include a Collecting State.

Message ID 20231221023650.3208767-1-aahila@google.com
State New
Headers
Series [next] bonding: Extending LACP MUX State Machine to include a Collecting State. |

Commit Message

Aahil Awatramani Dec. 21, 2023, 2:36 a.m. UTC
  Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
the LACP MUX state machine for separated handling of an initial
Collecting state before the Collecting and Distributing state. This
enables a port to be in a state where it can receive incoming packets
while not still distributing. This is useful for reducing packet loss when
a port begins distributing before its partner is able to collect.
Additionally this also brings the 802.3ad bonding driver's implementation
closer to the LACP specification which already predefined this behaviour.

With this change, 802.3ad mode will use new
bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
instead of the earlier one (bond_set_slave_{active|inactive}_flags).
Additionally, it adds new functions such as
bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
precisely manage the port's collecting and distributing states.
Previously, there was no dedicated method to disable TX while keeping RX
enabled, which this patch addresses.

Note that the regular flow process in the kernel's bonding driver remains
unaffected by this patch. The extension requires explicit opt-in by the
user (in order to ensure no disruptions for existing setups) via netlink
or sysfs support using the new bonding parameter lacp_extended_mux. The
default value for lacp_extended_mux is set to 0 so as to preserve existing
behaviour.

Signed-off-by: Aahil Awatramani <aahila@google.com>
---
 drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
 drivers/net/bonding/bond_main.c    |  22 ++--
 drivers/net/bonding/bond_netlink.c |  16 +++
 drivers/net/bonding/bond_options.c |  26 ++++-
 drivers/net/bonding/bond_sysfs.c   |  12 +++
 include/net/bond_3ad.h             |   2 +
 include/net/bond_options.h         |   1 +
 include/net/bonding.h              |  33 ++++++
 include/uapi/linux/if_link.h       |   1 +
 tools/include/uapi/linux/if_link.h |   1 +
 10 files changed, 254 insertions(+), 15 deletions(-)
  

Comments

Simon Horman Dec. 21, 2023, 5:35 p.m. UTC | #1
On Thu, Dec 21, 2023 at 02:36:50AM +0000, Aahil Awatramani wrote:
> Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> the LACP MUX state machine for separated handling of an initial
> Collecting state before the Collecting and Distributing state. This
> enables a port to be in a state where it can receive incoming packets
> while not still distributing. This is useful for reducing packet loss when
> a port begins distributing before its partner is able to collect.
> Additionally this also brings the 802.3ad bonding driver's implementation
> closer to the LACP specification which already predefined this behaviour.
> 
> With this change, 802.3ad mode will use new
> bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
> instead of the earlier one (bond_set_slave_{active|inactive}_flags).
> Additionally, it adds new functions such as
> bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
> precisely manage the port's collecting and distributing states.
> Previously, there was no dedicated method to disable TX while keeping RX
> enabled, which this patch addresses.
> 
> Note that the regular flow process in the kernel's bonding driver remains
> unaffected by this patch. The extension requires explicit opt-in by the
> user (in order to ensure no disruptions for existing setups) via netlink
> or sysfs support using the new bonding parameter lacp_extended_mux. The
> default value for lacp_extended_mux is set to 0 so as to preserve existing
> behaviour.
> 
> Signed-off-by: Aahil Awatramani <aahila@google.com>

...

> @@ -1906,6 +2005,46 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
>  	}
>  }
>  
> +/**
> + * ad_enable_collecting - enable a port's receive
> + * @port: the port we're looking at
> + * @update_slave_arr: Does slave array need update?

The line above documenting @update_slave_arr
should be removed as the parameter is not in
the function definition below.

> + *
> + * Enable @port if it's in an active aggregator
> + */
> +static void ad_enable_collecting(struct port *port)
> +{
> +	if (port->aggregator->is_active) {
> +		struct slave *slave = port->slave;
> +
> +		slave_dbg(slave->bond->dev, slave->dev,
> +			  "Enabling collecting on port %d (LAG %d)\n",
> +			  port->actor_port_number,
> +			  port->aggregator->aggregator_identifier);
> +		__enable_collecting_port(port);
> +	}
> +}

The above is a pretty minor problem, but the bots are likely
to complain about it, so please fix it in v2 after waiting
for feedback from others on this patch.

When posting v2 please target it at the net-next tree.

	Subject: [PATCH net-next v2] ...
  
Jay Vosburgh Dec. 21, 2023, 6:48 p.m. UTC | #2
Aahil Awatramani <aahila@google.com> wrote:

>Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
>the LACP MUX state machine for separated handling of an initial
>Collecting state before the Collecting and Distributing state. This
>enables a port to be in a state where it can receive incoming packets
>while not still distributing. This is useful for reducing packet loss when
>a port begins distributing before its partner is able to collect.
>Additionally this also brings the 802.3ad bonding driver's implementation
>closer to the LACP specification which already predefined this behaviour.

	To be clear, the current implementation (that combines
COLLECTING and DISTRIBUTING into a single state) is compliant with the
standard, which defines the current logic as "coupled control," per IEEE
802.1AX-2008, 5.4.15 or 802.1AX-2020, 6.4.13.

	I haven't read the patch in detail yet, but my overall question
is: why do we need this?  This adds significant complexity to the state
machine logic.  What real problem is this solving, i.e., what examples
do you have of systems where a port is "in a state where it can receive
incoming packets while not still distributing"?

	For the nomenclature, I would prefer to use the naming from the
standard.  Thus, instead of "lacp_extended_mux" my preference would be
"coupled_control", which would be enabled by default.  This extends to
the naming of variables or constants within the code as well.

	Lastly, in order to be accepted, this needs to include an update
to the bonding documentation.

	-J

>With this change, 802.3ad mode will use new
>bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
>instead of the earlier one (bond_set_slave_{active|inactive}_flags).
>Additionally, it adds new functions such as
>bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
>precisely manage the port's collecting and distributing states.
>Previously, there was no dedicated method to disable TX while keeping RX
>enabled, which this patch addresses.
>
>Note that the regular flow process in the kernel's bonding driver remains
>unaffected by this patch. The extension requires explicit opt-in by the
>user (in order to ensure no disruptions for existing setups) via netlink
>or sysfs support using the new bonding parameter lacp_extended_mux. The
>default value for lacp_extended_mux is set to 0 so as to preserve existing
>behaviour.
>
>Signed-off-by: Aahil Awatramani <aahila@google.com>
>---
> drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
> drivers/net/bonding/bond_main.c    |  22 ++--
> drivers/net/bonding/bond_netlink.c |  16 +++
> drivers/net/bonding/bond_options.c |  26 ++++-
> drivers/net/bonding/bond_sysfs.c   |  12 +++
> include/net/bond_3ad.h             |   2 +
> include/net/bond_options.h         |   1 +
> include/net/bonding.h              |  33 ++++++
> include/uapi/linux/if_link.h       |   1 +
> tools/include/uapi/linux/if_link.h |   1 +
> 10 files changed, 254 insertions(+), 15 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>index c99ffe6c683a..38a7aa6e4edd 100644
>--- a/drivers/net/bonding/bond_3ad.c
>+++ b/drivers/net/bonding/bond_3ad.c
>@@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
> static void ad_clear_agg(struct aggregator *aggregator);
> static void ad_initialize_agg(struct aggregator *aggregator);
> static void ad_initialize_port(struct port *port, int lacp_fast);
>+static void ad_enable_collecting(struct port *port);
>+static void ad_disable_distributing(struct port *port,
>+				    bool *update_slave_arr);
> static void ad_enable_collecting_distributing(struct port *port,
> 					      bool *update_slave_arr);
> static void ad_disable_collecting_distributing(struct port *port,
>@@ -171,32 +174,64 @@ static inline int __agg_has_partner(struct aggregator *agg)
> 	return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
> }
> 
>+/**
>+ * __disable_distributing_port - disable the port's slave for distributing.
>+ * Port will still be able to collect.
>+ * @port: the port we're looking at
>+ *
>+ * This will disable only distributing on the port's slave.
>+ */
>+static inline void __disable_distributing_port(struct port *port)
>+{
>+	bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
>+}
>+
>+/**
>+ * __enable_collecting_port - enable the port's slave for collecting,
>+ * if it's up
>+ * @port: the port we're looking at
>+ *
>+ * This will enable only collecting on the port's slave.
>+ */
>+static inline void __enable_collecting_port(struct port *port)
>+{
>+	struct slave *slave = port->slave;
>+
>+	if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
>+		bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
>+}
>+
> /**
>  * __disable_port - disable the port's slave
>  * @port: the port we're looking at
>+ *
>+ * This will disable both collecting and distributing on the port's slave.
>  */
> static inline void __disable_port(struct port *port)
> {
>-	bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
>+	bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> }
> 
> /**
>  * __enable_port - enable the port's slave, if it's up
>  * @port: the port we're looking at
>+ *
>+ * This will enable both collecting and distributing on the port's slave.
>  */
> static inline void __enable_port(struct port *port)
> {
> 	struct slave *slave = port->slave;
> 
> 	if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
>-		bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
>+		bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> }
> 
> /**
>- * __port_is_enabled - check if the port's slave is in active state
>+ * __port_is_collecting_distributing - check if the port's slave is in the
>+ * combined collecting/distributing state
>  * @port: the port we're looking at
>  */
>-static inline int __port_is_enabled(struct port *port)
>+static inline int __port_is_collecting_distributing(struct port *port)
> {
> 	return bond_is_active_slave(port->slave);
> }
>@@ -942,6 +977,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
>  */
> static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> {
>+	struct bonding *bond = __get_bond_by_port(port);
> 	mux_states_t last_state;
> 
> 	/* keep current State Machine state to compare later if it was
>@@ -999,9 +1035,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> 			if ((port->sm_vars & AD_PORT_SELECTED) &&
> 			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> 			    !__check_agg_selection_timer(port)) {
>-				if (port->aggregator->is_active)
>-					port->sm_mux_state =
>-					    AD_MUX_COLLECTING_DISTRIBUTING;
>+				if (port->aggregator->is_active) {
>+					int state = AD_MUX_COLLECTING_DISTRIBUTING;
>+
>+					if (bond->params.lacp_extended_mux)
>+						state = AD_MUX_COLLECTING;
>+					port->sm_mux_state = state;
>+				}
> 			} else if (!(port->sm_vars & AD_PORT_SELECTED) ||
> 				   (port->sm_vars & AD_PORT_STANDBY)) {
> 				/* if UNSELECTED or STANDBY */
>@@ -1031,7 +1071,52 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> 				 */
> 				if (port->aggregator &&
> 				    port->aggregator->is_active &&
>-				    !__port_is_enabled(port)) {
>+				    !__port_is_collecting_distributing(port)) {
>+					__enable_port(port);
>+					*update_slave_arr = true;
>+				}
>+			}
>+			break;
>+		case AD_MUX_COLLECTING:
>+			if (!(port->sm_vars & AD_PORT_SELECTED) ||
>+			    (port->sm_vars & AD_PORT_STANDBY) ||
>+			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
>+			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
>+				port->sm_mux_state = AD_MUX_ATTACHED;
>+			} else if ((port->sm_vars & AD_PORT_SELECTED) &&
>+			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
>+			    (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
>+				port->sm_mux_state = AD_MUX_DISTRIBUTING;
>+			} else {
>+				/* If port state hasn't changed, make sure that a collecting
>+				 * port is enabled for an active aggregator.
>+				 */
>+				if (port->aggregator &&
>+				    port->aggregator->is_active) {
>+					struct slave *slave = port->slave;
>+
>+					if (bond_is_slave_rx_disabled(slave) != 0) {
>+						ad_enable_collecting(port);
>+						*update_slave_arr = true;
>+					}
>+				}
>+			}
>+			break;
>+		case AD_MUX_DISTRIBUTING:
>+			if (!(port->sm_vars & AD_PORT_SELECTED) ||
>+			    (port->sm_vars & AD_PORT_STANDBY) ||
>+			    !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
>+			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
>+			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
>+				port->sm_mux_state = AD_MUX_COLLECTING;
>+			} else {
>+				/* if port state hasn't changed make
>+				 * sure that a collecting distributing
>+				 * port in an active aggregator is enabled
>+				 */
>+				if (port->aggregator &&
>+				    port->aggregator->is_active &&
>+				    !__port_is_collecting_distributing(port)) {
> 					__enable_port(port);
> 					*update_slave_arr = true;
> 				}
>@@ -1082,6 +1167,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> 							  update_slave_arr);
> 			port->ntt = true;
> 			break;
>+		case AD_MUX_COLLECTING:
>+			port->actor_oper_port_state |= LACP_STATE_COLLECTING;
>+			port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
>+			port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
>+			ad_enable_collecting(port);
>+			ad_disable_distributing(port, update_slave_arr);
>+			port->ntt = true;
>+			break;
>+		case AD_MUX_DISTRIBUTING:
>+			port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
>+			port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
>+			ad_enable_collecting_distributing(port,
>+							  update_slave_arr);
>+			break;
> 		default:
> 			break;
> 		}
>@@ -1906,6 +2005,46 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
> 	}
> }
> 
>+/**
>+ * ad_enable_collecting - enable a port's receive
>+ * @port: the port we're looking at
>+ * @update_slave_arr: Does slave array need update?
>+ *
>+ * Enable @port if it's in an active aggregator
>+ */
>+static void ad_enable_collecting(struct port *port)
>+{
>+	if (port->aggregator->is_active) {
>+		struct slave *slave = port->slave;
>+
>+		slave_dbg(slave->bond->dev, slave->dev,
>+			  "Enabling collecting on port %d (LAG %d)\n",
>+			  port->actor_port_number,
>+			  port->aggregator->aggregator_identifier);
>+		__enable_collecting_port(port);
>+	}
>+}
>+
>+/**
>+ * ad_disable_distributing - disable a port's transmit
>+ * @port: the port we're looking at
>+ * @update_slave_arr: Does slave array need update?
>+ */
>+static void ad_disable_distributing(struct port *port, bool *update_slave_arr)
>+{
>+	if (port->aggregator &&
>+	    !MAC_ADDRESS_EQUAL(&port->aggregator->partner_system,
>+			       &(null_mac_addr))) {
>+		slave_dbg(port->slave->bond->dev, port->slave->dev,
>+			  "Disabling distributing on port %d (LAG %d)\n",
>+			  port->actor_port_number,
>+			  port->aggregator->aggregator_identifier);
>+		__disable_distributing_port(port);
>+		/* Slave array needs an update */
>+		*update_slave_arr = true;
>+	}
>+}
>+
> /**
>  * ad_enable_collecting_distributing - enable a port's transmit/receive
>  * @port: the port we're looking at
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 8e6cc0e133b7..6b8f001a51a5 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -2119,7 +2119,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> 		 * will activate the slaves in the selected
> 		 * aggregator
> 		 */
>-		bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
>+		bond_set_slave_txrx_disabled_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
> 		/* if this is the first slave */
> 		if (!prev_slave) {
> 			SLAVE_AD_INFO(new_slave)->id = 1;
>@@ -2381,7 +2381,10 @@ static int __bond_release_one(struct net_device *bond_dev,
> 		return -EINVAL;
> 	}
> 
>-	bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
>+	if (BOND_MODE(bond) == BOND_MODE_8023AD)
>+		bond_set_slave_txrx_disabled_flags(slave, BOND_SLAVE_NOTIFY_NOW);
>+	else
>+		bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> 
> 	bond_sysfs_slave_del(slave);
> 
>@@ -2763,11 +2766,14 @@ static void bond_miimon_commit(struct bonding *bond)
> 			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> 						  BOND_SLAVE_NOTIFY_NOW);
> 
>-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
>-			    BOND_MODE(bond) == BOND_MODE_8023AD)
>+			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> 				bond_set_slave_inactive_flags(slave,
> 							      BOND_SLAVE_NOTIFY_NOW);
> 
>+			if (BOND_MODE(bond) == BOND_MODE_8023AD)
>+				bond_set_slave_txrx_disabled_flags(slave,
>+								   BOND_SLAVE_NOTIFY_NOW);
>+
> 			slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
> 
> 			bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
>@@ -4276,8 +4282,12 @@ static int bond_open(struct net_device *bond_dev)
> 		bond_for_each_slave(bond, slave, iter) {
> 			if (bond_uses_primary(bond) &&
> 			    slave != rcu_access_pointer(bond->curr_active_slave)) {
>-				bond_set_slave_inactive_flags(slave,
>-							      BOND_SLAVE_NOTIFY_NOW);
>+				if (BOND_MODE(bond) == BOND_MODE_8023AD)
>+					bond_set_slave_txrx_disabled_flags(slave,
>+									   BOND_SLAVE_NOTIFY_NOW);
>+				else
>+					bond_set_slave_inactive_flags(slave,
>+								      BOND_SLAVE_NOTIFY_NOW);
> 			} else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
> 				bond_set_slave_active_flags(slave,
> 							    BOND_SLAVE_NOTIFY_NOW);
>diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
>index cfa74cf8bb1a..1e671f504fc1 100644
>--- a/drivers/net/bonding/bond_netlink.c
>+++ b/drivers/net/bonding/bond_netlink.c
>@@ -122,6 +122,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
> 	[IFLA_BOND_PEER_NOTIF_DELAY]    = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
> 	[IFLA_BOND_MISSED_MAX]		= { .type = NLA_U8 },
> 	[IFLA_BOND_NS_IP6_TARGET]	= { .type = NLA_NESTED },
>+	[IFLA_BOND_LACP_EXTENDED_MUX]   = { .type = NLA_U8 },
> };
> 
> static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = {
>@@ -549,6 +550,16 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
> 			return err;
> 	}
> 
>+	if (data[IFLA_BOND_LACP_EXTENDED_MUX]) {
>+		int lacp_extended_mux = nla_get_u8(data[IFLA_BOND_LACP_EXTENDED_MUX]);
>+
>+		bond_opt_initval(&newval, lacp_extended_mux);
>+		err = __bond_opt_set(bond, BOND_OPT_LACP_EXTENDED_MUX, &newval,
>+				     data[IFLA_BOND_LACP_EXTENDED_MUX], extack);
>+		if (err)
>+			return err;
>+	}
>+
> 	return 0;
> }
> 
>@@ -615,6 +626,7 @@ static size_t bond_get_size(const struct net_device *bond_dev)
> 						/* IFLA_BOND_NS_IP6_TARGET */
> 		nla_total_size(sizeof(struct nlattr)) +
> 		nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS +
>+		nla_total_size(sizeof(u8)) +	/* IFLA_BOND_LACP_EXTENDED_MUX */
> 		0;
> }
> 
>@@ -774,6 +786,10 @@ static int bond_fill_info(struct sk_buff *skb,
> 		       bond->params.missed_max))
> 		goto nla_put_failure;
> 
>+	if (nla_put_u8(skb, IFLA_BOND_LACP_EXTENDED_MUX,
>+		       bond->params.lacp_extended_mux))
>+		goto nla_put_failure;
>+
> 	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
> 		struct ad_info info;
> 
>diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
>index f3f27f0bd2a6..c9997e42d045 100644
>--- a/drivers/net/bonding/bond_options.c
>+++ b/drivers/net/bonding/bond_options.c
>@@ -84,7 +84,8 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> 					    const struct bond_opt_value *newval);
> static int bond_option_missed_max_set(struct bonding *bond,
> 				      const struct bond_opt_value *newval);
>-
>+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
>+					     const struct bond_opt_value *newval);
> 
> static const struct bond_opt_value bond_mode_tbl[] = {
> 	{ "balance-rr",    BOND_MODE_ROUNDROBIN,   BOND_VALFLAG_DEFAULT},
>@@ -232,6 +233,12 @@ static const struct bond_opt_value bond_missed_max_tbl[] = {
> 	{ NULL,		-1,	0},
> };
> 
>+static const struct bond_opt_value bond_lacp_extended_mux_tbl[] = {
>+	{ "off", 0,  BOND_VALFLAG_DEFAULT},
>+	{ "on",  1,  0},
>+	{ NULL,  -1, 0},
>+};
>+
> static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> 	[BOND_OPT_MODE] = {
> 		.id = BOND_OPT_MODE,
>@@ -496,6 +503,14 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> 		.desc = "Delay between each peer notification on failover event, in milliseconds",
> 		.values = bond_peer_notif_delay_tbl,
> 		.set = bond_option_peer_notif_delay_set
>+	},
>+	[BOND_OPT_LACP_EXTENDED_MUX] = {
>+		.id = BOND_OPT_LACP_EXTENDED_MUX,
>+		.name = "lacp_extended_mux",
>+		.desc = "Opt into using extended MUX for LACP states",
>+		.unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)),
>+		.values = bond_lacp_extended_mux_tbl,
>+		.set = bond_option_lacp_extended_mux_set,
> 	}
> };
> 
>@@ -1692,3 +1707,12 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> 	bond->params.ad_user_port_key = newval->value;
> 	return 0;
> }
>+
>+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
>+					     const struct bond_opt_value *newval)
>+{
>+	netdev_info(bond->dev, "Setting lacp_extended_mux to %s (%llu)\n",
>+		    newval->string, newval->value);
>+	bond->params.lacp_extended_mux = newval->value;
>+	return 0;
>+}
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index 2805135a7205..62e264010998 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -753,6 +753,17 @@ static ssize_t bonding_show_ad_user_port_key(struct device *d,
> static DEVICE_ATTR(ad_user_port_key, 0644,
> 		   bonding_show_ad_user_port_key, bonding_sysfs_store_option);
> 
>+static ssize_t bonding_show_lacp_extended_mux(struct device *d,
>+					      struct device_attribute *attr,
>+					      char *buf)
>+{
>+	struct bonding *bond = to_bond(d);
>+
>+	return sprintf(buf, "%d\n", bond->params.lacp_extended_mux);
>+}
>+static DEVICE_ATTR(lacp_extended_mux, 0644,
>+		   bonding_show_lacp_extended_mux, bonding_sysfs_store_option);
>+
> static struct attribute *per_bond_attrs[] = {
> 	&dev_attr_slaves.attr,
> 	&dev_attr_mode.attr,
>@@ -792,6 +803,7 @@ static struct attribute *per_bond_attrs[] = {
> 	&dev_attr_ad_actor_system.attr,
> 	&dev_attr_ad_user_port_key.attr,
> 	&dev_attr_arp_missed_max.attr,
>+	&dev_attr_lacp_extended_mux.attr,
> 	NULL,
> };
> 
>diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
>index c5e57c6bd873..9ce5ac2bfbad 100644
>--- a/include/net/bond_3ad.h
>+++ b/include/net/bond_3ad.h
>@@ -54,6 +54,8 @@ typedef enum {
> 	AD_MUX_DETACHED,	/* mux machine */
> 	AD_MUX_WAITING,		/* mux machine */
> 	AD_MUX_ATTACHED,	/* mux machine */
>+	AD_MUX_COLLECTING,	/* mux machine */
>+	AD_MUX_DISTRIBUTING,	/* mux machine */
> 	AD_MUX_COLLECTING_DISTRIBUTING	/* mux machine */
> } mux_states_t;
> 
>diff --git a/include/net/bond_options.h b/include/net/bond_options.h
>index 69292ecc0325..8d1e9cb28684 100644
>--- a/include/net/bond_options.h
>+++ b/include/net/bond_options.h
>@@ -76,6 +76,7 @@ enum {
> 	BOND_OPT_MISSED_MAX,
> 	BOND_OPT_NS_TARGETS,
> 	BOND_OPT_PRIO,
>+	BOND_OPT_LACP_EXTENDED_MUX,
> 	BOND_OPT_LAST
> };
> 
>diff --git a/include/net/bonding.h b/include/net/bonding.h
>index 5b8b1b644a2d..b31880d53d76 100644
>--- a/include/net/bonding.h
>+++ b/include/net/bonding.h
>@@ -148,6 +148,7 @@ struct bond_params {
> #if IS_ENABLED(CONFIG_IPV6)
> 	struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
> #endif
>+	int lacp_extended_mux;
> 
> 	/* 2 bytes of padding : see ether_addr_equal_64bits() */
> 	u8 ad_actor_system[ETH_ALEN + 2];
>@@ -167,6 +168,7 @@ struct slave {
> 	u8     backup:1,   /* indicates backup slave. Value corresponds with
> 			      BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
> 	       inactive:1, /* indicates inactive slave */
>+	       rx_disabled:1, /* indicates whether slave's Rx is disabled */
> 	       should_notify:1, /* indicates whether the state changed */
> 	       should_notify_link:1; /* indicates whether the link changed */
> 	u8     duplex;
>@@ -570,6 +572,19 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
> 		slave->inactive = 1;
> }
> 
>+static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
>+						 bool notify)
>+{
>+	bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
>+	slave->rx_disabled = 1;
>+}
>+
>+static inline void bond_set_slave_tx_disabled_flags(struct slave *slave,
>+						 bool notify)
>+{
>+	bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
>+}
>+
> static inline void bond_set_slave_active_flags(struct slave *slave,
> 					       bool notify)
> {
>@@ -577,11 +592,29 @@ static inline void bond_set_slave_active_flags(struct slave *slave,
> 	slave->inactive = 0;
> }
> 
>+static inline void bond_set_slave_txrx_enabled_flags(struct slave *slave,
>+					       bool notify)
>+{
>+	bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify);
>+	slave->rx_disabled = 0;
>+}
>+
>+static inline void bond_set_slave_rx_enabled_flags(struct slave *slave,
>+					       bool notify)
>+{
>+	slave->rx_disabled = 0;
>+}
>+
> static inline bool bond_is_slave_inactive(struct slave *slave)
> {
> 	return slave->inactive;
> }
> 
>+static inline bool bond_is_slave_rx_disabled(struct slave *slave)
>+{
>+	return slave->rx_disabled;
>+}
>+
> static inline void bond_propose_link_state(struct slave *slave, int state)
> {
> 	slave->link_new_state = state;
>diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>index 29ff80da2775..e8fb30da9110 100644
>--- a/include/uapi/linux/if_link.h
>+++ b/include/uapi/linux/if_link.h
>@@ -976,6 +976,7 @@ enum {
> 	IFLA_BOND_AD_LACP_ACTIVE,
> 	IFLA_BOND_MISSED_MAX,
> 	IFLA_BOND_NS_IP6_TARGET,
>+	IFLA_BOND_LACP_EXTENDED_MUX,
> 	__IFLA_BOND_MAX,
> };
> 
>diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
>index a0aa05a28cf2..f641f55dbbc4 100644
>--- a/tools/include/uapi/linux/if_link.h
>+++ b/tools/include/uapi/linux/if_link.h
>@@ -974,6 +974,7 @@ enum {
> 	IFLA_BOND_AD_LACP_ACTIVE,
> 	IFLA_BOND_MISSED_MAX,
> 	IFLA_BOND_NS_IP6_TARGET,
>+	IFLA_BOND_LACP_EXTENDED_MUX,
> 	__IFLA_BOND_MAX,
> };
> 
>-- 
>2.43.0.472.g3155946c3a-goog
>
>

---
	-Jay Vosburgh, jay.vosburgh@canonical.com
  
Mahesh Bandewar (महेश बंडेवार) Dec. 21, 2023, 9:39 p.m. UTC | #3
On Thu, Dec 21, 2023 at 10:48 AM Jay Vosburgh
<jay.vosburgh@canonical.com> wrote:
>
> Aahil Awatramani <aahila@google.com> wrote:
>
> >Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> >the LACP MUX state machine for separated handling of an initial
> >Collecting state before the Collecting and Distributing state. This
> >enables a port to be in a state where it can receive incoming packets
> >while not still distributing. This is useful for reducing packet loss when
> >a port begins distributing before its partner is able to collect.
> >Additionally this also brings the 802.3ad bonding driver's implementation
> >closer to the LACP specification which already predefined this behaviour.
>
>         To be clear, the current implementation (that combines
> COLLECTING and DISTRIBUTING into a single state) is compliant with the
> standard, which defines the current logic as "coupled control," per IEEE
> 802.1AX-2008, 5.4.15 or 802.1AX-2020, 6.4.13.
>
My understanding is that the bonding implementation in Linux is
compliant per 802.3ad-2000 specifications and that would mean a subset
of 802.1ax-2008 or beyond e.g. CSCD (conversation-sensitive collection
distribution) etc. which are part of the later specs that are not
implemented in the current driver. However, you are correct to say
that it does implement coupled-control while this patch is extending
to support collection and distribution as separate states without
removing the "coupled control" state.

>         I haven't read the patch in detail yet, but my overall question
> is: why do we need this?  This adds significant complexity to the state
> machine logic.  What real problem is this solving, i.e., what examples
> do you have of systems where a port is "in a state where it can receive
> incoming packets while not still distributing"?
>
>         For the nomenclature, I would prefer to use the naming from the
> standard.  Thus, instead of "lacp_extended_mux" my preference would be
> "coupled_control", which would be enabled by default.  This extends to
> the naming of variables or constants within the code as well.
>
>         Lastly, in order to be accepted, this needs to include an update
> to the bonding documentation.
>
>         -J
>
> >With this change, 802.3ad mode will use new
> >bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
> >instead of the earlier one (bond_set_slave_{active|inactive}_flags).
> >Additionally, it adds new functions such as
> >bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
> >precisely manage the port's collecting and distributing states.
> >Previously, there was no dedicated method to disable TX while keeping RX
> >enabled, which this patch addresses.
> >
> >Note that the regular flow process in the kernel's bonding driver remains
> >unaffected by this patch. The extension requires explicit opt-in by the
> >user (in order to ensure no disruptions for existing setups) via netlink
> >or sysfs support using the new bonding parameter lacp_extended_mux. The
> >default value for lacp_extended_mux is set to 0 so as to preserve existing
> >behaviour.
> >
> >Signed-off-by: Aahil Awatramani <aahila@google.com>
> >---
> > drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
> > drivers/net/bonding/bond_main.c    |  22 ++--
> > drivers/net/bonding/bond_netlink.c |  16 +++
> > drivers/net/bonding/bond_options.c |  26 ++++-
> > drivers/net/bonding/bond_sysfs.c   |  12 +++
> > include/net/bond_3ad.h             |   2 +
> > include/net/bond_options.h         |   1 +
> > include/net/bonding.h              |  33 ++++++
> > include/uapi/linux/if_link.h       |   1 +
> > tools/include/uapi/linux/if_link.h |   1 +
> > 10 files changed, 254 insertions(+), 15 deletions(-)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index c99ffe6c683a..38a7aa6e4edd 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
> > static void ad_clear_agg(struct aggregator *aggregator);
> > static void ad_initialize_agg(struct aggregator *aggregator);
> > static void ad_initialize_port(struct port *port, int lacp_fast);
> >+static void ad_enable_collecting(struct port *port);
> >+static void ad_disable_distributing(struct port *port,
> >+                                  bool *update_slave_arr);
> > static void ad_enable_collecting_distributing(struct port *port,
> >                                             bool *update_slave_arr);
> > static void ad_disable_collecting_distributing(struct port *port,
> >@@ -171,32 +174,64 @@ static inline int __agg_has_partner(struct aggregator *agg)
> >       return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
> > }
> >
> >+/**
> >+ * __disable_distributing_port - disable the port's slave for distributing.
> >+ * Port will still be able to collect.
> >+ * @port: the port we're looking at
> >+ *
> >+ * This will disable only distributing on the port's slave.
> >+ */
> >+static inline void __disable_distributing_port(struct port *port)
> >+{
> >+      bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> >+}
> >+
> >+/**
> >+ * __enable_collecting_port - enable the port's slave for collecting,
> >+ * if it's up
> >+ * @port: the port we're looking at
> >+ *
> >+ * This will enable only collecting on the port's slave.
> >+ */
> >+static inline void __enable_collecting_port(struct port *port)
> >+{
> >+      struct slave *slave = port->slave;
> >+
> >+      if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
> >+              bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> >+}
> >+
> > /**
> >  * __disable_port - disable the port's slave
> >  * @port: the port we're looking at
> >+ *
> >+ * This will disable both collecting and distributing on the port's slave.
> >  */
> > static inline void __disable_port(struct port *port)
> > {
> >-      bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> >+      bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> > }
> >
> > /**
> >  * __enable_port - enable the port's slave, if it's up
> >  * @port: the port we're looking at
> >+ *
> >+ * This will enable both collecting and distributing on the port's slave.
> >  */
> > static inline void __enable_port(struct port *port)
> > {
> >       struct slave *slave = port->slave;
> >
> >       if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
> >-              bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> >+              bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> > }
> >
> > /**
> >- * __port_is_enabled - check if the port's slave is in active state
> >+ * __port_is_collecting_distributing - check if the port's slave is in the
> >+ * combined collecting/distributing state
> >  * @port: the port we're looking at
> >  */
> >-static inline int __port_is_enabled(struct port *port)
> >+static inline int __port_is_collecting_distributing(struct port *port)
> > {
> >       return bond_is_active_slave(port->slave);
> > }
> >@@ -942,6 +977,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
> >  */
> > static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> > {
> >+      struct bonding *bond = __get_bond_by_port(port);
> >       mux_states_t last_state;
> >
> >       /* keep current State Machine state to compare later if it was
> >@@ -999,9 +1035,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                       if ((port->sm_vars & AD_PORT_SELECTED) &&
> >                           (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> >                           !__check_agg_selection_timer(port)) {
> >-                              if (port->aggregator->is_active)
> >-                                      port->sm_mux_state =
> >-                                          AD_MUX_COLLECTING_DISTRIBUTING;
> >+                              if (port->aggregator->is_active) {
> >+                                      int state = AD_MUX_COLLECTING_DISTRIBUTING;
> >+
> >+                                      if (bond->params.lacp_extended_mux)
> >+                                              state = AD_MUX_COLLECTING;
> >+                                      port->sm_mux_state = state;
> >+                              }
> >                       } else if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >                                  (port->sm_vars & AD_PORT_STANDBY)) {
> >                               /* if UNSELECTED or STANDBY */
> >@@ -1031,7 +1071,52 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                */
> >                               if (port->aggregator &&
> >                                   port->aggregator->is_active &&
> >-                                  !__port_is_enabled(port)) {
> >+                                  !__port_is_collecting_distributing(port)) {
> >+                                      __enable_port(port);
> >+                                      *update_slave_arr = true;
> >+                              }
> >+                      }
> >+                      break;
> >+              case AD_MUX_COLLECTING:
> >+                      if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >+                          (port->sm_vars & AD_PORT_STANDBY) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> >+                          !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
> >+                              port->sm_mux_state = AD_MUX_ATTACHED;
> >+                      } else if ((port->sm_vars & AD_PORT_SELECTED) &&
> >+                          (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> >+                          (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
> >+                              port->sm_mux_state = AD_MUX_DISTRIBUTING;
> >+                      } else {
> >+                              /* If port state hasn't changed, make sure that a collecting
> >+                               * port is enabled for an active aggregator.
> >+                               */
> >+                              if (port->aggregator &&
> >+                                  port->aggregator->is_active) {
> >+                                      struct slave *slave = port->slave;
> >+
> >+                                      if (bond_is_slave_rx_disabled(slave) != 0) {
> >+                                              ad_enable_collecting(port);
> >+                                              *update_slave_arr = true;
> >+                                      }
> >+                              }
> >+                      }
> >+                      break;
> >+              case AD_MUX_DISTRIBUTING:
> >+                      if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >+                          (port->sm_vars & AD_PORT_STANDBY) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> >+                          !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
> >+                              port->sm_mux_state = AD_MUX_COLLECTING;
> >+                      } else {
> >+                              /* if port state hasn't changed make
> >+                               * sure that a collecting distributing
> >+                               * port in an active aggregator is enabled
> >+                               */
> >+                              if (port->aggregator &&
> >+                                  port->aggregator->is_active &&
> >+                                  !__port_is_collecting_distributing(port)) {
> >                                       __enable_port(port);
> >                                       *update_slave_arr = true;
> >                               }
> >@@ -1082,6 +1167,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                                         update_slave_arr);
> >                       port->ntt = true;
> >                       break;
> >+              case AD_MUX_COLLECTING:
> >+                      port->actor_oper_port_state |= LACP_STATE_COLLECTING;
> >+                      port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
> >+                      port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
> >+                      ad_enable_collecting(port);
> >+                      ad_disable_distributing(port, update_slave_arr);
> >+                      port->ntt = true;
> >+                      break;
> >+              case AD_MUX_DISTRIBUTING:
> >+                      port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
> >+                      port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
> >+                      ad_enable_collecting_distributing(port,
> >+                                                        update_slave_arr);
> >+                      break;
> >               default:
> >                       break;
> >               }
> >@@ -1906,6 +2005,46 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
> >       }
> > }
> >
> >+/**
> >+ * ad_enable_collecting - enable a port's receive
> >+ * @port: the port we're looking at
> >+ * @update_slave_arr: Does slave array need update?
> >+ *
> >+ * Enable @port if it's in an active aggregator
> >+ */
> >+static void ad_enable_collecting(struct port *port)
> >+{
> >+      if (port->aggregator->is_active) {
> >+              struct slave *slave = port->slave;
> >+
> >+              slave_dbg(slave->bond->dev, slave->dev,
> >+                        "Enabling collecting on port %d (LAG %d)\n",
> >+                        port->actor_port_number,
> >+                        port->aggregator->aggregator_identifier);
> >+              __enable_collecting_port(port);
> >+      }
> >+}
> >+
> >+/**
> >+ * ad_disable_distributing - disable a port's transmit
> >+ * @port: the port we're looking at
> >+ * @update_slave_arr: Does slave array need update?
> >+ */
> >+static void ad_disable_distributing(struct port *port, bool *update_slave_arr)
> >+{
> >+      if (port->aggregator &&
> >+          !MAC_ADDRESS_EQUAL(&port->aggregator->partner_system,
> >+                             &(null_mac_addr))) {
> >+              slave_dbg(port->slave->bond->dev, port->slave->dev,
> >+                        "Disabling distributing on port %d (LAG %d)\n",
> >+                        port->actor_port_number,
> >+                        port->aggregator->aggregator_identifier);
> >+              __disable_distributing_port(port);
> >+              /* Slave array needs an update */
> >+              *update_slave_arr = true;
> >+      }
> >+}
> >+
> > /**
> >  * ad_enable_collecting_distributing - enable a port's transmit/receive
> >  * @port: the port we're looking at
> >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >index 8e6cc0e133b7..6b8f001a51a5 100644
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
> >@@ -2119,7 +2119,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> >                * will activate the slaves in the selected
> >                * aggregator
> >                */
> >-              bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
> >+              bond_set_slave_txrx_disabled_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
> >               /* if this is the first slave */
> >               if (!prev_slave) {
> >                       SLAVE_AD_INFO(new_slave)->id = 1;
> >@@ -2381,7 +2381,10 @@ static int __bond_release_one(struct net_device *bond_dev,
> >               return -EINVAL;
> >       }
> >
> >-      bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >+      if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+              bond_set_slave_txrx_disabled_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >+      else
> >+              bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >
> >       bond_sysfs_slave_del(slave);
> >
> >@@ -2763,11 +2766,14 @@ static void bond_miimon_commit(struct bonding *bond)
> >                       bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> >                                                 BOND_SLAVE_NOTIFY_NOW);
> >
> >-                      if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> >-                          BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                      if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> >                               bond_set_slave_inactive_flags(slave,
> >                                                             BOND_SLAVE_NOTIFY_NOW);
> >
> >+                      if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                              bond_set_slave_txrx_disabled_flags(slave,
> >+                                                                 BOND_SLAVE_NOTIFY_NOW);
> >+
> >                       slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
> >
> >                       bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
> >@@ -4276,8 +4282,12 @@ static int bond_open(struct net_device *bond_dev)
> >               bond_for_each_slave(bond, slave, iter) {
> >                       if (bond_uses_primary(bond) &&
> >                           slave != rcu_access_pointer(bond->curr_active_slave)) {
> >-                              bond_set_slave_inactive_flags(slave,
> >-                                                            BOND_SLAVE_NOTIFY_NOW);
> >+                              if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                                      bond_set_slave_txrx_disabled_flags(slave,
> >+                                                                         BOND_SLAVE_NOTIFY_NOW);
> >+                              else
> >+                                      bond_set_slave_inactive_flags(slave,
> >+                                                                    BOND_SLAVE_NOTIFY_NOW);
> >                       } else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
> >                               bond_set_slave_active_flags(slave,
> >                                                           BOND_SLAVE_NOTIFY_NOW);
> >diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
> >index cfa74cf8bb1a..1e671f504fc1 100644
> >--- a/drivers/net/bonding/bond_netlink.c
> >+++ b/drivers/net/bonding/bond_netlink.c
> >@@ -122,6 +122,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
> >       [IFLA_BOND_PEER_NOTIF_DELAY]    = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
> >       [IFLA_BOND_MISSED_MAX]          = { .type = NLA_U8 },
> >       [IFLA_BOND_NS_IP6_TARGET]       = { .type = NLA_NESTED },
> >+      [IFLA_BOND_LACP_EXTENDED_MUX]   = { .type = NLA_U8 },
> > };
> >
> > static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = {
> >@@ -549,6 +550,16 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
> >                       return err;
> >       }
> >
> >+      if (data[IFLA_BOND_LACP_EXTENDED_MUX]) {
> >+              int lacp_extended_mux = nla_get_u8(data[IFLA_BOND_LACP_EXTENDED_MUX]);
> >+
> >+              bond_opt_initval(&newval, lacp_extended_mux);
> >+              err = __bond_opt_set(bond, BOND_OPT_LACP_EXTENDED_MUX, &newval,
> >+                                   data[IFLA_BOND_LACP_EXTENDED_MUX], extack);
> >+              if (err)
> >+                      return err;
> >+      }
> >+
> >       return 0;
> > }
> >
> >@@ -615,6 +626,7 @@ static size_t bond_get_size(const struct net_device *bond_dev)
> >                                               /* IFLA_BOND_NS_IP6_TARGET */
> >               nla_total_size(sizeof(struct nlattr)) +
> >               nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS +
> >+              nla_total_size(sizeof(u8)) +    /* IFLA_BOND_LACP_EXTENDED_MUX */
> >               0;
> > }
> >
> >@@ -774,6 +786,10 @@ static int bond_fill_info(struct sk_buff *skb,
> >                      bond->params.missed_max))
> >               goto nla_put_failure;
> >
> >+      if (nla_put_u8(skb, IFLA_BOND_LACP_EXTENDED_MUX,
> >+                     bond->params.lacp_extended_mux))
> >+              goto nla_put_failure;
> >+
> >       if (BOND_MODE(bond) == BOND_MODE_8023AD) {
> >               struct ad_info info;
> >
> >diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> >index f3f27f0bd2a6..c9997e42d045 100644
> >--- a/drivers/net/bonding/bond_options.c
> >+++ b/drivers/net/bonding/bond_options.c
> >@@ -84,7 +84,8 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> >                                           const struct bond_opt_value *newval);
> > static int bond_option_missed_max_set(struct bonding *bond,
> >                                     const struct bond_opt_value *newval);
> >-
> >+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
> >+                                           const struct bond_opt_value *newval);
> >
> > static const struct bond_opt_value bond_mode_tbl[] = {
> >       { "balance-rr",    BOND_MODE_ROUNDROBIN,   BOND_VALFLAG_DEFAULT},
> >@@ -232,6 +233,12 @@ static const struct bond_opt_value bond_missed_max_tbl[] = {
> >       { NULL,         -1,     0},
> > };
> >
> >+static const struct bond_opt_value bond_lacp_extended_mux_tbl[] = {
> >+      { "off", 0,  BOND_VALFLAG_DEFAULT},
> >+      { "on",  1,  0},
> >+      { NULL,  -1, 0},
> >+};
> >+
> > static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> >       [BOND_OPT_MODE] = {
> >               .id = BOND_OPT_MODE,
> >@@ -496,6 +503,14 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> >               .desc = "Delay between each peer notification on failover event, in milliseconds",
> >               .values = bond_peer_notif_delay_tbl,
> >               .set = bond_option_peer_notif_delay_set
> >+      },
> >+      [BOND_OPT_LACP_EXTENDED_MUX] = {
> >+              .id = BOND_OPT_LACP_EXTENDED_MUX,
> >+              .name = "lacp_extended_mux",
> >+              .desc = "Opt into using extended MUX for LACP states",
> >+              .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)),
> >+              .values = bond_lacp_extended_mux_tbl,
> >+              .set = bond_option_lacp_extended_mux_set,
> >       }
> > };
> >
> >@@ -1692,3 +1707,12 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> >       bond->params.ad_user_port_key = newval->value;
> >       return 0;
> > }
> >+
> >+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
> >+                                           const struct bond_opt_value *newval)
> >+{
> >+      netdev_info(bond->dev, "Setting lacp_extended_mux to %s (%llu)\n",
> >+                  newval->string, newval->value);
> >+      bond->params.lacp_extended_mux = newval->value;
> >+      return 0;
> >+}
> >diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> >index 2805135a7205..62e264010998 100644
> >--- a/drivers/net/bonding/bond_sysfs.c
> >+++ b/drivers/net/bonding/bond_sysfs.c
> >@@ -753,6 +753,17 @@ static ssize_t bonding_show_ad_user_port_key(struct device *d,
> > static DEVICE_ATTR(ad_user_port_key, 0644,
> >                  bonding_show_ad_user_port_key, bonding_sysfs_store_option);
> >
> >+static ssize_t bonding_show_lacp_extended_mux(struct device *d,
> >+                                            struct device_attribute *attr,
> >+                                            char *buf)
> >+{
> >+      struct bonding *bond = to_bond(d);
> >+
> >+      return sprintf(buf, "%d\n", bond->params.lacp_extended_mux);
> >+}
> >+static DEVICE_ATTR(lacp_extended_mux, 0644,
> >+                 bonding_show_lacp_extended_mux, bonding_sysfs_store_option);
> >+
> > static struct attribute *per_bond_attrs[] = {
> >       &dev_attr_slaves.attr,
> >       &dev_attr_mode.attr,
> >@@ -792,6 +803,7 @@ static struct attribute *per_bond_attrs[] = {
> >       &dev_attr_ad_actor_system.attr,
> >       &dev_attr_ad_user_port_key.attr,
> >       &dev_attr_arp_missed_max.attr,
> >+      &dev_attr_lacp_extended_mux.attr,
> >       NULL,
> > };
> >
> >diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
> >index c5e57c6bd873..9ce5ac2bfbad 100644
> >--- a/include/net/bond_3ad.h
> >+++ b/include/net/bond_3ad.h
> >@@ -54,6 +54,8 @@ typedef enum {
> >       AD_MUX_DETACHED,        /* mux machine */
> >       AD_MUX_WAITING,         /* mux machine */
> >       AD_MUX_ATTACHED,        /* mux machine */
> >+      AD_MUX_COLLECTING,      /* mux machine */
> >+      AD_MUX_DISTRIBUTING,    /* mux machine */
> >       AD_MUX_COLLECTING_DISTRIBUTING  /* mux machine */
> > } mux_states_t;
> >
> >diff --git a/include/net/bond_options.h b/include/net/bond_options.h
> >index 69292ecc0325..8d1e9cb28684 100644
> >--- a/include/net/bond_options.h
> >+++ b/include/net/bond_options.h
> >@@ -76,6 +76,7 @@ enum {
> >       BOND_OPT_MISSED_MAX,
> >       BOND_OPT_NS_TARGETS,
> >       BOND_OPT_PRIO,
> >+      BOND_OPT_LACP_EXTENDED_MUX,
> >       BOND_OPT_LAST
> > };
> >
> >diff --git a/include/net/bonding.h b/include/net/bonding.h
> >index 5b8b1b644a2d..b31880d53d76 100644
> >--- a/include/net/bonding.h
> >+++ b/include/net/bonding.h
> >@@ -148,6 +148,7 @@ struct bond_params {
> > #if IS_ENABLED(CONFIG_IPV6)
> >       struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
> > #endif
> >+      int lacp_extended_mux;
> >
> >       /* 2 bytes of padding : see ether_addr_equal_64bits() */
> >       u8 ad_actor_system[ETH_ALEN + 2];
> >@@ -167,6 +168,7 @@ struct slave {
> >       u8     backup:1,   /* indicates backup slave. Value corresponds with
> >                             BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
> >              inactive:1, /* indicates inactive slave */
> >+             rx_disabled:1, /* indicates whether slave's Rx is disabled */
> >              should_notify:1, /* indicates whether the state changed */
> >              should_notify_link:1; /* indicates whether the link changed */
> >       u8     duplex;
> >@@ -570,6 +572,19 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
> >               slave->inactive = 1;
> > }
> >
> >+static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
> >+                                               bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> >+      slave->rx_disabled = 1;
> >+}
> >+
> >+static inline void bond_set_slave_tx_disabled_flags(struct slave *slave,
> >+                                               bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> >+}
> >+
> > static inline void bond_set_slave_active_flags(struct slave *slave,
> >                                              bool notify)
> > {
> >@@ -577,11 +592,29 @@ static inline void bond_set_slave_active_flags(struct slave *slave,
> >       slave->inactive = 0;
> > }
> >
> >+static inline void bond_set_slave_txrx_enabled_flags(struct slave *slave,
> >+                                             bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify);
> >+      slave->rx_disabled = 0;
> >+}
> >+
> >+static inline void bond_set_slave_rx_enabled_flags(struct slave *slave,
> >+                                             bool notify)
> >+{
> >+      slave->rx_disabled = 0;
> >+}
> >+
> > static inline bool bond_is_slave_inactive(struct slave *slave)
> > {
> >       return slave->inactive;
> > }
> >
> >+static inline bool bond_is_slave_rx_disabled(struct slave *slave)
> >+{
> >+      return slave->rx_disabled;
> >+}
> >+
> > static inline void bond_propose_link_state(struct slave *slave, int state)
> > {
> >       slave->link_new_state = state;
> >diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> >index 29ff80da2775..e8fb30da9110 100644
> >--- a/include/uapi/linux/if_link.h
> >+++ b/include/uapi/linux/if_link.h
> >@@ -976,6 +976,7 @@ enum {
> >       IFLA_BOND_AD_LACP_ACTIVE,
> >       IFLA_BOND_MISSED_MAX,
> >       IFLA_BOND_NS_IP6_TARGET,
> >+      IFLA_BOND_LACP_EXTENDED_MUX,
> >       __IFLA_BOND_MAX,
> > };
> >
> >diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
> >index a0aa05a28cf2..f641f55dbbc4 100644
> >--- a/tools/include/uapi/linux/if_link.h
> >+++ b/tools/include/uapi/linux/if_link.h
> >@@ -974,6 +974,7 @@ enum {
> >       IFLA_BOND_AD_LACP_ACTIVE,
> >       IFLA_BOND_MISSED_MAX,
> >       IFLA_BOND_NS_IP6_TARGET,
> >+      IFLA_BOND_LACP_EXTENDED_MUX,
> >       __IFLA_BOND_MAX,
> > };
> >
> >--
> >2.43.0.472.g3155946c3a-goog
> >
> >
>
> ---
>         -Jay Vosburgh, jay.vosburgh@canonical.com
  
Hangbin Liu Dec. 22, 2023, 3:23 a.m. UTC | #4
Hi Aahil,

On Thu, Dec 21, 2023 at 02:36:50AM +0000, Aahil Awatramani wrote:
> Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> the LACP MUX state machine for separated handling of an initial
> Collecting state before the Collecting and Distributing state. This
> enables a port to be in a state where it can receive incoming packets
> while not still distributing. This is useful for reducing packet loss when
> a port begins distributing before its partner is able to collect.
> Additionally this also brings the 802.3ad bonding driver's implementation
> closer to the LACP specification which already predefined this behaviour.
> 
> Note that the regular flow process in the kernel's bonding driver remains
> unaffected by this patch. The extension requires explicit opt-in by the
> user (in order to ensure no disruptions for existing setups) via netlink
> or sysfs support using the new bonding parameter lacp_extended_mux. The

Sysfs is deprecated. We should only use netlink API.

> default value for lacp_extended_mux is set to 0 so as to preserve existing
> behaviour.

As Jay said, please update the document for new parameter. It also would be
good to add a selftest in tools/testing/selftests/drivers/net/bonding to make
sure the Mux machine changes correctly. You can use scapy to send the partner
state. This could be used for both bonding/teaming testing, which is valuable.

> 
> Signed-off-by: Aahil Awatramani <aahila@google.com>
> ---
>  drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
>  drivers/net/bonding/bond_main.c    |  22 ++--
>  drivers/net/bonding/bond_netlink.c |  16 +++
>  drivers/net/bonding/bond_options.c |  26 ++++-
>  drivers/net/bonding/bond_sysfs.c   |  12 +++
>  include/net/bond_3ad.h             |   2 +
>  include/net/bond_options.h         |   1 +
>  include/net/bonding.h              |  33 ++++++
>  include/uapi/linux/if_link.h       |   1 +
>  tools/include/uapi/linux/if_link.h |   1 +
>  10 files changed, 254 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index c99ffe6c683a..38a7aa6e4edd 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
>  static void ad_clear_agg(struct aggregator *aggregator);
>  static void ad_initialize_agg(struct aggregator *aggregator);
>  static void ad_initialize_port(struct port *port, int lacp_fast);
> +static void ad_enable_collecting(struct port *port);
> +static void ad_disable_distributing(struct port *port,
> +				    bool *update_slave_arr);
>  static void ad_enable_collecting_distributing(struct port *port,
>  					      bool *update_slave_arr);
>  static void ad_disable_collecting_distributing(struct port *port,
> @@ -171,32 +174,64 @@ static inline int __agg_has_partner(struct aggregator *agg)
>  	return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
>  }
>  
> +/**
> + * __disable_distributing_port - disable the port's slave for distributing.
> + * Port will still be able to collect.
> + * @port: the port we're looking at
> + *
> + * This will disable only distributing on the port's slave.
> + */
> +static inline void __disable_distributing_port(struct port *port)
> +{
> +	bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> +}
> +
> +/**
> + * __enable_collecting_port - enable the port's slave for collecting,
> + * if it's up
> + * @port: the port we're looking at
> + *
> + * This will enable only collecting on the port's slave.
> + */
> +static inline void __enable_collecting_port(struct port *port)
> +{
> +	struct slave *slave = port->slave;
> +
> +	if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
> +		bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> +}
> +
>  /**
>   * __disable_port - disable the port's slave
>   * @port: the port we're looking at
> + *
> + * This will disable both collecting and distributing on the port's slave.
>   */
>  static inline void __disable_port(struct port *port)
>  {
> -	bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> +	bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
>  }

Can we replace bond_set_slave_inactive_flags() directly?
bond_set_slave_inactive_flags() also checks the all_slaves_active parameter.
Please see more comments under bond_set_slave_txrx_disabled_flags() function.

>  
>  /**
>   * __enable_port - enable the port's slave, if it's up
>   * @port: the port we're looking at
> + *
> + * This will enable both collecting and distributing on the port's slave.
>   */
>  static inline void __enable_port(struct port *port)
>  {
>  	struct slave *slave = port->slave;
>  
>  	if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
> -		bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> +		bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
>  }
>  
>  /**
> - * __port_is_enabled - check if the port's slave is in active state
> + * __port_is_collecting_distributing - check if the port's slave is in the
> + * combined collecting/distributing state
>   * @port: the port we're looking at
>   */
> -static inline int __port_is_enabled(struct port *port)
> +static inline int __port_is_collecting_distributing(struct port *port)
>  {
>  	return bond_is_active_slave(port->slave);
>  }
> @@ -942,6 +977,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
>   */
>  static void ad_mux_machine(struct port *port, bool *update_slave_arr)
>  {
> +	struct bonding *bond = __get_bond_by_port(port);
>  	mux_states_t last_state;
>  
>  	/* keep current State Machine state to compare later if it was
> @@ -999,9 +1035,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
>  			if ((port->sm_vars & AD_PORT_SELECTED) &&
>  			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
>  			    !__check_agg_selection_timer(port)) {
> -				if (port->aggregator->is_active)
> -					port->sm_mux_state =
> -					    AD_MUX_COLLECTING_DISTRIBUTING;
> +				if (port->aggregator->is_active) {
> +					int state = AD_MUX_COLLECTING_DISTRIBUTING;
> +
> +					if (bond->params.lacp_extended_mux)
> +						state = AD_MUX_COLLECTING;
> +					port->sm_mux_state = state;
> +				}
>  			} else if (!(port->sm_vars & AD_PORT_SELECTED) ||
>  				   (port->sm_vars & AD_PORT_STANDBY)) {
>  				/* if UNSELECTED or STANDBY */
> @@ -1031,7 +1071,52 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
>  				 */
>  				if (port->aggregator &&
>  				    port->aggregator->is_active &&
> -				    !__port_is_enabled(port)) {
> +				    !__port_is_collecting_distributing(port)) {
> +					__enable_port(port);
> +					*update_slave_arr = true;
> +				}
> +			}
> +			break;
> +		case AD_MUX_COLLECTING:
> +			if (!(port->sm_vars & AD_PORT_SELECTED) ||
> +			    (port->sm_vars & AD_PORT_STANDBY) ||
> +			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> +			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {

Both AD_MUX_COLLECTING_DISTRIBUTING and AD_MUX_COLLECTING check these, maybe
we can add a function like port_should_mux_attached() to do the checks.

> +				port->sm_mux_state = AD_MUX_ATTACHED;
> +			} else if ((port->sm_vars & AD_PORT_SELECTED) &&
> +			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> +			    (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
> +				port->sm_mux_state = AD_MUX_DISTRIBUTING;
> +			} else {
> +				/* If port state hasn't changed, make sure that a collecting
> +				 * port is enabled for an active aggregator.
> +				 */
> +				if (port->aggregator &&
> +				    port->aggregator->is_active) {
> +					struct slave *slave = port->slave;
> +
> +					if (bond_is_slave_rx_disabled(slave) != 0) {
> +						ad_enable_collecting(port);
> +						*update_slave_arr = true;
> +					}
> +				}
> +			}
> +			break;
> +		case AD_MUX_DISTRIBUTING:
> +			if (!(port->sm_vars & AD_PORT_SELECTED) ||
> +			    (port->sm_vars & AD_PORT_STANDBY) ||
> +			    !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
> +			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> +			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {

Is it possible that a port in DISTRIBUTING state while no LACP_STATE_SYNCHRONIZATION flag?
It should has this flag since COLLECTING.

> +				port->sm_mux_state = AD_MUX_COLLECTING;
> +			} else {
> +				/* if port state hasn't changed make
> +				 * sure that a collecting distributing
> +				 * port in an active aggregator is enabled
> +				 */
> +				if (port->aggregator &&
> +				    port->aggregator->is_active &&
> +				    !__port_is_collecting_distributing(port)) {
>  					__enable_port(port);
>  					*update_slave_arr = true;
>  				}
> @@ -1082,6 +1167,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
>  							  update_slave_arr);

...

> @@ -2763,11 +2766,14 @@ static void bond_miimon_commit(struct bonding *bond)
>  			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
>  						  BOND_SLAVE_NOTIFY_NOW);
>  
> -			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> -			    BOND_MODE(bond) == BOND_MODE_8023AD)
> +			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
>  				bond_set_slave_inactive_flags(slave,
>  							      BOND_SLAVE_NOTIFY_NOW);
>  
> +			if (BOND_MODE(bond) == BOND_MODE_8023AD)
> +				bond_set_slave_txrx_disabled_flags(slave,
> +								   BOND_SLAVE_NOTIFY_NOW);
> +
>  			slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
>  
>  			bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
> @@ -4276,8 +4282,12 @@ static int bond_open(struct net_device *bond_dev)
>  		bond_for_each_slave(bond, slave, iter) {
>  			if (bond_uses_primary(bond) &&
>  			    slave != rcu_access_pointer(bond->curr_active_slave)) {
> -				bond_set_slave_inactive_flags(slave,
> -							      BOND_SLAVE_NOTIFY_NOW);
> +				if (BOND_MODE(bond) == BOND_MODE_8023AD)

The bond_uses_primary() only returns true for ab/tlb/alb mode, there won't be
8023ad mode.

> +					bond_set_slave_txrx_disabled_flags(slave,
> +									   BOND_SLAVE_NOTIFY_NOW);
> +				else
> +					bond_set_slave_inactive_flags(slave,
> +								      BOND_SLAVE_NOTIFY_NOW);
>  			} else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
>  				bond_set_slave_active_flags(slave,
>  							    BOND_SLAVE_NOTIFY_NOW);

...

>  
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index 5b8b1b644a2d..b31880d53d76 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -148,6 +148,7 @@ struct bond_params {
>  #if IS_ENABLED(CONFIG_IPV6)
>  	struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
>  #endif
> +	int lacp_extended_mux;
>  
>  	/* 2 bytes of padding : see ether_addr_equal_64bits() */
>  	u8 ad_actor_system[ETH_ALEN + 2];
> @@ -167,6 +168,7 @@ struct slave {
>  	u8     backup:1,   /* indicates backup slave. Value corresponds with
>  			      BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
>  	       inactive:1, /* indicates inactive slave */
> +	       rx_disabled:1, /* indicates whether slave's Rx is disabled */
>  	       should_notify:1, /* indicates whether the state changed */
>  	       should_notify_link:1; /* indicates whether the link changed */
>  	u8     duplex;
> @@ -570,6 +572,19 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
>  		slave->inactive = 1;
>  }
>  
> +static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
> +						 bool notify)
> +{
> +	bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> +	slave->rx_disabled = 1;
> +}

The bond_set_slave_inactive_flags() also has all_slaves_active() checks.
I don't think you can replace all the bond_set_slave_inactive_flags() to this one directly.
How about update the bond_set_slave_inactive_flags() with a 8023ad check, e.g.

diff --git a/include/net/bonding.h b/include/net/bonding.h
index 5b8b1b644a2d..ab70c46119a0 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -568,6 +568,8 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
                bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
        if (!slave->bond->params.all_slaves_active)
                slave->inactive = 1;
+       if (BOND_MODE(slave->bond) == BOND_MODE_8023AD)
+               slave->rx_disabled = 1;
 }

Thanks
Hangbin
  
David Dillow Dec. 22, 2023, 5:38 a.m. UTC | #5
> I haven't read the patch in detail yet, but my overall question
> is: why do we need this?  This adds significant complexity to the
> state machine logic.  What real problem is this solving, i.e., what
> examples do you have of systems where a port is "in a state where
> it can receive incoming packets while not still distributing"?

Any time we add a new link to an aggregator, or the bond selects a new
aggegrator based on the selection policy, there is currently a race
where we start distributing traffic before our partner (usually a
switch) is ready to start collecting it, leading to dropped packets if
we're running traffic over the bond. We reliably hit this window,
making what should be a non-issue into a customer-visible packet-loss
event. Implementing the full state machine closes the window and makes
these maintenance events lossless.
  
kernel test robot Dec. 25, 2023, 8:09 a.m. UTC | #6
Hi Aahil,

kernel test robot noticed the following build warnings:

[auto build test WARNING on next-20231222]

url:    https://github.com/intel-lab-lkp/linux/commits/Aahil-Awatramani/bonding-Extending-LACP-MUX-State-Machine-to-include-a-Collecting-State/20231222-174732
base:   next-20231222
patch link:    https://lore.kernel.org/r/20231221023650.3208767-1-aahila%40google.com
patch subject: [PATCH next] bonding: Extending LACP MUX State Machine to include a Collecting State.
config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20231225/202312251524.jjgo5nFR-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231225/202312251524.jjgo5nFR-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312251524.jjgo5nFR-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/net/bonding/bond_3ad.c:2016: warning: Excess function parameter 'update_slave_arr' description in 'ad_enable_collecting'


vim +2016 drivers/net/bonding/bond_3ad.c

  2007	
  2008	/**
  2009	 * ad_enable_collecting - enable a port's receive
  2010	 * @port: the port we're looking at
  2011	 * @update_slave_arr: Does slave array need update?
  2012	 *
  2013	 * Enable @port if it's in an active aggregator
  2014	 */
  2015	static void ad_enable_collecting(struct port *port)
> 2016	{
  2017		if (port->aggregator->is_active) {
  2018			struct slave *slave = port->slave;
  2019	
  2020			slave_dbg(slave->bond->dev, slave->dev,
  2021				  "Enabling collecting on port %d (LAG %d)\n",
  2022				  port->actor_port_number,
  2023				  port->aggregator->aggregator_identifier);
  2024			__enable_collecting_port(port);
  2025		}
  2026	}
  2027
  
Aahil Awatramani Jan. 4, 2024, 11:26 p.m. UTC | #7
Thanks Hangbin, I have made the changes that you have suggested in v2
of the patch.

> Is it possible that a port in DISTRIBUTING state while no LACP_STATE_SYNCHRONIZATION flag?
> It should has this flag since COLLECTING.

This is correct, I kept the LACP_STATE_SYNCHRONIZATION flag check to
be consistent with
what the previous Collecting and Distributing states were checking as
well as what the IEEE
802.1AX-2020 state diagram checks for.

> The bond_set_slave_inactive_flags() also has all_slaves_active() checks.
> I don't think you can replace all the bond_set_slave_inactive_flags() to this one directly.
> How about update the bond_set_slave_inactive_flags() with a 8023ad check, e.g.

I have followed up with your suggestion here as well as also done it
similarly for
bond_set_slave_active_flags.

On Thu, Dec 21, 2023 at 7:23 PM Hangbin Liu <liuhangbin@gmail.com> wrote:
>
> Hi Aahil,
>
> On Thu, Dec 21, 2023 at 02:36:50AM +0000, Aahil Awatramani wrote:
> > Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> > the LACP MUX state machine for separated handling of an initial
> > Collecting state before the Collecting and Distributing state. This
> > enables a port to be in a state where it can receive incoming packets
> > while not still distributing. This is useful for reducing packet loss when
> > a port begins distributing before its partner is able to collect.
> > Additionally this also brings the 802.3ad bonding driver's implementation
> > closer to the LACP specification which already predefined this behaviour.
> >
> > Note that the regular flow process in the kernel's bonding driver remains
> > unaffected by this patch. The extension requires explicit opt-in by the
> > user (in order to ensure no disruptions for existing setups) via netlink
> > or sysfs support using the new bonding parameter lacp_extended_mux. The
>
> Sysfs is deprecated. We should only use netlink API.
>
> > default value for lacp_extended_mux is set to 0 so as to preserve existing
> > behaviour.
>
> As Jay said, please update the document for new parameter. It also would be
> good to add a selftest in tools/testing/selftests/drivers/net/bonding to make
> sure the Mux machine changes correctly. You can use scapy to send the partner
> state. This could be used for both bonding/teaming testing, which is valuable.
>
> >
> > Signed-off-by: Aahil Awatramani <aahila@google.com>
> > ---
> >  drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
> >  drivers/net/bonding/bond_main.c    |  22 ++--
> >  drivers/net/bonding/bond_netlink.c |  16 +++
> >  drivers/net/bonding/bond_options.c |  26 ++++-
> >  drivers/net/bonding/bond_sysfs.c   |  12 +++
> >  include/net/bond_3ad.h             |   2 +
> >  include/net/bond_options.h         |   1 +
> >  include/net/bonding.h              |  33 ++++++
> >  include/uapi/linux/if_link.h       |   1 +
> >  tools/include/uapi/linux/if_link.h |   1 +
> >  10 files changed, 254 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> > index c99ffe6c683a..38a7aa6e4edd 100644
> > --- a/drivers/net/bonding/bond_3ad.c
> > +++ b/drivers/net/bonding/bond_3ad.c
> > @@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
> >  static void ad_clear_agg(struct aggregator *aggregator);
> >  static void ad_initialize_agg(struct aggregator *aggregator);
> >  static void ad_initialize_port(struct port *port, int lacp_fast);
> > +static void ad_enable_collecting(struct port *port);
> > +static void ad_disable_distributing(struct port *port,
> > +                                 bool *update_slave_arr);
> >  static void ad_enable_collecting_distributing(struct port *port,
> >                                             bool *update_slave_arr);
> >  static void ad_disable_collecting_distributing(struct port *port,
> > @@ -171,32 +174,64 @@ static inline int __agg_has_partner(struct aggregator *agg)
> >       return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
> >  }
> >
> > +/**
> > + * __disable_distributing_port - disable the port's slave for distributing.
> > + * Port will still be able to collect.
> > + * @port: the port we're looking at
> > + *
> > + * This will disable only distributing on the port's slave.
> > + */
> > +static inline void __disable_distributing_port(struct port *port)
> > +{
> > +     bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> > +}
> > +
> > +/**
> > + * __enable_collecting_port - enable the port's slave for collecting,
> > + * if it's up
> > + * @port: the port we're looking at
> > + *
> > + * This will enable only collecting on the port's slave.
> > + */
> > +static inline void __enable_collecting_port(struct port *port)
> > +{
> > +     struct slave *slave = port->slave;
> > +
> > +     if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
> > +             bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> > +}
> > +
> >  /**
> >   * __disable_port - disable the port's slave
> >   * @port: the port we're looking at
> > + *
> > + * This will disable both collecting and distributing on the port's slave.
> >   */
> >  static inline void __disable_port(struct port *port)
> >  {
> > -     bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> > +     bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> >  }
>
> Can we replace bond_set_slave_inactive_flags() directly?
> bond_set_slave_inactive_flags() also checks the all_slaves_active parameter.
> Please see more comments under bond_set_slave_txrx_disabled_flags() function.
>
> >
> >  /**
> >   * __enable_port - enable the port's slave, if it's up
> >   * @port: the port we're looking at
> > + *
> > + * This will enable both collecting and distributing on the port's slave.
> >   */
> >  static inline void __enable_port(struct port *port)
> >  {
> >       struct slave *slave = port->slave;
> >
> >       if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
> > -             bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> > +             bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> >  }
> >
> >  /**
> > - * __port_is_enabled - check if the port's slave is in active state
> > + * __port_is_collecting_distributing - check if the port's slave is in the
> > + * combined collecting/distributing state
> >   * @port: the port we're looking at
> >   */
> > -static inline int __port_is_enabled(struct port *port)
> > +static inline int __port_is_collecting_distributing(struct port *port)
> >  {
> >       return bond_is_active_slave(port->slave);
> >  }
> > @@ -942,6 +977,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
> >   */
> >  static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >  {
> > +     struct bonding *bond = __get_bond_by_port(port);
> >       mux_states_t last_state;
> >
> >       /* keep current State Machine state to compare later if it was
> > @@ -999,9 +1035,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                       if ((port->sm_vars & AD_PORT_SELECTED) &&
> >                           (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> >                           !__check_agg_selection_timer(port)) {
> > -                             if (port->aggregator->is_active)
> > -                                     port->sm_mux_state =
> > -                                         AD_MUX_COLLECTING_DISTRIBUTING;
> > +                             if (port->aggregator->is_active) {
> > +                                     int state = AD_MUX_COLLECTING_DISTRIBUTING;
> > +
> > +                                     if (bond->params.lacp_extended_mux)
> > +                                             state = AD_MUX_COLLECTING;
> > +                                     port->sm_mux_state = state;
> > +                             }
> >                       } else if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >                                  (port->sm_vars & AD_PORT_STANDBY)) {
> >                               /* if UNSELECTED or STANDBY */
> > @@ -1031,7 +1071,52 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                */
> >                               if (port->aggregator &&
> >                                   port->aggregator->is_active &&
> > -                                 !__port_is_enabled(port)) {
> > +                                 !__port_is_collecting_distributing(port)) {
> > +                                     __enable_port(port);
> > +                                     *update_slave_arr = true;
> > +                             }
> > +                     }
> > +                     break;
> > +             case AD_MUX_COLLECTING:
> > +                     if (!(port->sm_vars & AD_PORT_SELECTED) ||
> > +                         (port->sm_vars & AD_PORT_STANDBY) ||
> > +                         !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> > +                         !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
>
> Both AD_MUX_COLLECTING_DISTRIBUTING and AD_MUX_COLLECTING check these, maybe
> we can add a function like port_should_mux_attached() to do the checks.
>
> > +                             port->sm_mux_state = AD_MUX_ATTACHED;
> > +                     } else if ((port->sm_vars & AD_PORT_SELECTED) &&
> > +                         (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> > +                         (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
> > +                             port->sm_mux_state = AD_MUX_DISTRIBUTING;
> > +                     } else {
> > +                             /* If port state hasn't changed, make sure that a collecting
> > +                              * port is enabled for an active aggregator.
> > +                              */
> > +                             if (port->aggregator &&
> > +                                 port->aggregator->is_active) {
> > +                                     struct slave *slave = port->slave;
> > +
> > +                                     if (bond_is_slave_rx_disabled(slave) != 0) {
> > +                                             ad_enable_collecting(port);
> > +                                             *update_slave_arr = true;
> > +                                     }
> > +                             }
> > +                     }
> > +                     break;
> > +             case AD_MUX_DISTRIBUTING:
> > +                     if (!(port->sm_vars & AD_PORT_SELECTED) ||
> > +                         (port->sm_vars & AD_PORT_STANDBY) ||
> > +                         !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
> > +                         !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> > +                         !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
>
> Is it possible that a port in DISTRIBUTING state while no LACP_STATE_SYNCHRONIZATION flag?
> It should has this flag since COLLECTING.
>
> > +                             port->sm_mux_state = AD_MUX_COLLECTING;
> > +                     } else {
> > +                             /* if port state hasn't changed make
> > +                              * sure that a collecting distributing
> > +                              * port in an active aggregator is enabled
> > +                              */
> > +                             if (port->aggregator &&
> > +                                 port->aggregator->is_active &&
> > +                                 !__port_is_collecting_distributing(port)) {
> >                                       __enable_port(port);
> >                                       *update_slave_arr = true;
> >                               }
> > @@ -1082,6 +1167,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                                         update_slave_arr);
>
> ...
>
> > @@ -2763,11 +2766,14 @@ static void bond_miimon_commit(struct bonding *bond)
> >                       bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> >                                                 BOND_SLAVE_NOTIFY_NOW);
> >
> > -                     if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> > -                         BOND_MODE(bond) == BOND_MODE_8023AD)
> > +                     if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> >                               bond_set_slave_inactive_flags(slave,
> >                                                             BOND_SLAVE_NOTIFY_NOW);
> >
> > +                     if (BOND_MODE(bond) == BOND_MODE_8023AD)
> > +                             bond_set_slave_txrx_disabled_flags(slave,
> > +                                                                BOND_SLAVE_NOTIFY_NOW);
> > +
> >                       slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
> >
> >                       bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
> > @@ -4276,8 +4282,12 @@ static int bond_open(struct net_device *bond_dev)
> >               bond_for_each_slave(bond, slave, iter) {
> >                       if (bond_uses_primary(bond) &&
> >                           slave != rcu_access_pointer(bond->curr_active_slave)) {
> > -                             bond_set_slave_inactive_flags(slave,
> > -                                                           BOND_SLAVE_NOTIFY_NOW);
> > +                             if (BOND_MODE(bond) == BOND_MODE_8023AD)
>
> The bond_uses_primary() only returns true for ab/tlb/alb mode, there won't be
> 8023ad mode.
>
> > +                                     bond_set_slave_txrx_disabled_flags(slave,
> > +                                                                        BOND_SLAVE_NOTIFY_NOW);
> > +                             else
> > +                                     bond_set_slave_inactive_flags(slave,
> > +                                                                   BOND_SLAVE_NOTIFY_NOW);
> >                       } else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
> >                               bond_set_slave_active_flags(slave,
> >                                                           BOND_SLAVE_NOTIFY_NOW);
>
> ...
>
> >
> > diff --git a/include/net/bonding.h b/include/net/bonding.h
> > index 5b8b1b644a2d..b31880d53d76 100644
> > --- a/include/net/bonding.h
> > +++ b/include/net/bonding.h
> > @@ -148,6 +148,7 @@ struct bond_params {
> >  #if IS_ENABLED(CONFIG_IPV6)
> >       struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
> >  #endif
> > +     int lacp_extended_mux;
> >
> >       /* 2 bytes of padding : see ether_addr_equal_64bits() */
> >       u8 ad_actor_system[ETH_ALEN + 2];
> > @@ -167,6 +168,7 @@ struct slave {
> >       u8     backup:1,   /* indicates backup slave. Value corresponds with
> >                             BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
> >              inactive:1, /* indicates inactive slave */
> > +            rx_disabled:1, /* indicates whether slave's Rx is disabled */
> >              should_notify:1, /* indicates whether the state changed */
> >              should_notify_link:1; /* indicates whether the link changed */
> >       u8     duplex;
> > @@ -570,6 +572,19 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
> >               slave->inactive = 1;
> >  }
> >
> > +static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
> > +                                              bool notify)
> > +{
> > +     bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> > +     slave->rx_disabled = 1;
> > +}
>
> The bond_set_slave_inactive_flags() also has all_slaves_active() checks.
> I don't think you can replace all the bond_set_slave_inactive_flags() to this one directly.
> How about update the bond_set_slave_inactive_flags() with a 8023ad check, e.g.
>
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index 5b8b1b644a2d..ab70c46119a0 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -568,6 +568,8 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
>                 bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
>         if (!slave->bond->params.all_slaves_active)
>                 slave->inactive = 1;
> +       if (BOND_MODE(slave->bond) == BOND_MODE_8023AD)
> +               slave->rx_disabled = 1;
>  }
>
> Thanks
> Hangbin
  
Aahil Awatramani Jan. 4, 2024, 11:26 p.m. UTC | #8
Thanks for the feedback, I have made the changes you have suggested
moving forward.

On Thu, Dec 21, 2023 at 9:35 AM Simon Horman <horms@kernel.org> wrote:
>
> On Thu, Dec 21, 2023 at 02:36:50AM +0000, Aahil Awatramani wrote:
> > Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> > the LACP MUX state machine for separated handling of an initial
> > Collecting state before the Collecting and Distributing state. This
> > enables a port to be in a state where it can receive incoming packets
> > while not still distributing. This is useful for reducing packet loss when
> > a port begins distributing before its partner is able to collect.
> > Additionally this also brings the 802.3ad bonding driver's implementation
> > closer to the LACP specification which already predefined this behaviour.
> >
> > With this change, 802.3ad mode will use new
> > bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
> > instead of the earlier one (bond_set_slave_{active|inactive}_flags).
> > Additionally, it adds new functions such as
> > bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
> > precisely manage the port's collecting and distributing states.
> > Previously, there was no dedicated method to disable TX while keeping RX
> > enabled, which this patch addresses.
> >
> > Note that the regular flow process in the kernel's bonding driver remains
> > unaffected by this patch. The extension requires explicit opt-in by the
> > user (in order to ensure no disruptions for existing setups) via netlink
> > or sysfs support using the new bonding parameter lacp_extended_mux. The
> > default value for lacp_extended_mux is set to 0 so as to preserve existing
> > behaviour.
> >
> > Signed-off-by: Aahil Awatramani <aahila@google.com>
>
> ...
>
> > @@ -1906,6 +2005,46 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
> >       }
> >  }
> >
> > +/**
> > + * ad_enable_collecting - enable a port's receive
> > + * @port: the port we're looking at
> > + * @update_slave_arr: Does slave array need update?
>
> The line above documenting @update_slave_arr
> should be removed as the parameter is not in
> the function definition below.
>
> > + *
> > + * Enable @port if it's in an active aggregator
> > + */
> > +static void ad_enable_collecting(struct port *port)
> > +{
> > +     if (port->aggregator->is_active) {
> > +             struct slave *slave = port->slave;
> > +
> > +             slave_dbg(slave->bond->dev, slave->dev,
> > +                       "Enabling collecting on port %d (LAG %d)\n",
> > +                       port->actor_port_number,
> > +                       port->aggregator->aggregator_identifier);
> > +             __enable_collecting_port(port);
> > +     }
> > +}
>
> The above is a pretty minor problem, but the bots are likely
> to complain about it, so please fix it in v2 after waiting
> for feedback from others on this patch.
>
> When posting v2 please target it at the net-next tree.
>
>         Subject: [PATCH net-next v2] ...
>
> --
> pw-bot: changes-requested
  
Aahil Awatramani Jan. 4, 2024, 11:35 p.m. UTC | #9
Thanks for the feedback Jay, I have updated documentation in v2 of the patch
and have also changed the naming to "coupled_control".

> Any time we add a new link to an aggregator, or the bond selects a new>
> aggegrator based on the selection policy, there is currently a race
> where we start distributing traffic before our partner (usually a
> switch) is ready to start collecting it, leading to dropped packets if
> we're running traffic over the bond. We reliably hit this window,
> making what should be a non-issue into a customer-visible packet-loss
> event. Implementing the full state machine closes the window and makes
> these maintenance events lossless.

For use case what David says is correct, implementation of this full
state machine
helps to achieve losslessness during such maintenance events.

On Thu, Dec 21, 2023 at 10:48 AM Jay Vosburgh
<jay.vosburgh@canonical.com> wrote:
>
> Aahil Awatramani <aahila@google.com> wrote:
>
> >Introduces two new states, AD_MUX_COLLECTING and AD_MUX_DISTRIBUTING in
> >the LACP MUX state machine for separated handling of an initial
> >Collecting state before the Collecting and Distributing state. This
> >enables a port to be in a state where it can receive incoming packets
> >while not still distributing. This is useful for reducing packet loss when
> >a port begins distributing before its partner is able to collect.
> >Additionally this also brings the 802.3ad bonding driver's implementation
> >closer to the LACP specification which already predefined this behaviour.
>
>         To be clear, the current implementation (that combines
> COLLECTING and DISTRIBUTING into a single state) is compliant with the
> standard, which defines the current logic as "coupled control," per IEEE
> 802.1AX-2008, 5.4.15 or 802.1AX-2020, 6.4.13.
>
>         I haven't read the patch in detail yet, but my overall question
> is: why do we need this?  This adds significant complexity to the state
> machine logic.  What real problem is this solving, i.e., what examples
> do you have of systems where a port is "in a state where it can receive
> incoming packets while not still distributing"?
>
>         For the nomenclature, I would prefer to use the naming from the
> standard.  Thus, instead of "lacp_extended_mux" my preference would be
> "coupled_control", which would be enabled by default.  This extends to
> the naming of variables or constants within the code as well.
>
>         Lastly, in order to be accepted, this needs to include an update
> to the bonding documentation.
>
>         -J
>
> >With this change, 802.3ad mode will use new
> >bond_set_slave_txrx_{enabled|disabled}_flags() set of functions only
> >instead of the earlier one (bond_set_slave_{active|inactive}_flags).
> >Additionally, it adds new functions such as
> >bond_set_slave_tx_disabled_flags and bond_set_slave_rx_enabled_flags to
> >precisely manage the port's collecting and distributing states.
> >Previously, there was no dedicated method to disable TX while keeping RX
> >enabled, which this patch addresses.
> >
> >Note that the regular flow process in the kernel's bonding driver remains
> >unaffected by this patch. The extension requires explicit opt-in by the
> >user (in order to ensure no disruptions for existing setups) via netlink
> >or sysfs support using the new bonding parameter lacp_extended_mux. The
> >default value for lacp_extended_mux is set to 0 so as to preserve existing
> >behaviour.
> >
> >Signed-off-by: Aahil Awatramani <aahila@google.com>
> >---
> > drivers/net/bonding/bond_3ad.c     | 155 +++++++++++++++++++++++++++--
> > drivers/net/bonding/bond_main.c    |  22 ++--
> > drivers/net/bonding/bond_netlink.c |  16 +++
> > drivers/net/bonding/bond_options.c |  26 ++++-
> > drivers/net/bonding/bond_sysfs.c   |  12 +++
> > include/net/bond_3ad.h             |   2 +
> > include/net/bond_options.h         |   1 +
> > include/net/bonding.h              |  33 ++++++
> > include/uapi/linux/if_link.h       |   1 +
> > tools/include/uapi/linux/if_link.h |   1 +
> > 10 files changed, 254 insertions(+), 15 deletions(-)
> >
> >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> >index c99ffe6c683a..38a7aa6e4edd 100644
> >--- a/drivers/net/bonding/bond_3ad.c
> >+++ b/drivers/net/bonding/bond_3ad.c
> >@@ -106,6 +106,9 @@ static void ad_agg_selection_logic(struct aggregator *aggregator,
> > static void ad_clear_agg(struct aggregator *aggregator);
> > static void ad_initialize_agg(struct aggregator *aggregator);
> > static void ad_initialize_port(struct port *port, int lacp_fast);
> >+static void ad_enable_collecting(struct port *port);
> >+static void ad_disable_distributing(struct port *port,
> >+                                  bool *update_slave_arr);
> > static void ad_enable_collecting_distributing(struct port *port,
> >                                             bool *update_slave_arr);
> > static void ad_disable_collecting_distributing(struct port *port,
> >@@ -171,32 +174,64 @@ static inline int __agg_has_partner(struct aggregator *agg)
> >       return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
> > }
> >
> >+/**
> >+ * __disable_distributing_port - disable the port's slave for distributing.
> >+ * Port will still be able to collect.
> >+ * @port: the port we're looking at
> >+ *
> >+ * This will disable only distributing on the port's slave.
> >+ */
> >+static inline void __disable_distributing_port(struct port *port)
> >+{
> >+      bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> >+}
> >+
> >+/**
> >+ * __enable_collecting_port - enable the port's slave for collecting,
> >+ * if it's up
> >+ * @port: the port we're looking at
> >+ *
> >+ * This will enable only collecting on the port's slave.
> >+ */
> >+static inline void __enable_collecting_port(struct port *port)
> >+{
> >+      struct slave *slave = port->slave;
> >+
> >+      if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
> >+              bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> >+}
> >+
> > /**
> >  * __disable_port - disable the port's slave
> >  * @port: the port we're looking at
> >+ *
> >+ * This will disable both collecting and distributing on the port's slave.
> >  */
> > static inline void __disable_port(struct port *port)
> > {
> >-      bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> >+      bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
> > }
> >
> > /**
> >  * __enable_port - enable the port's slave, if it's up
> >  * @port: the port we're looking at
> >+ *
> >+ * This will enable both collecting and distributing on the port's slave.
> >  */
> > static inline void __enable_port(struct port *port)
> > {
> >       struct slave *slave = port->slave;
> >
> >       if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
> >-              bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> >+              bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
> > }
> >
> > /**
> >- * __port_is_enabled - check if the port's slave is in active state
> >+ * __port_is_collecting_distributing - check if the port's slave is in the
> >+ * combined collecting/distributing state
> >  * @port: the port we're looking at
> >  */
> >-static inline int __port_is_enabled(struct port *port)
> >+static inline int __port_is_collecting_distributing(struct port *port)
> > {
> >       return bond_is_active_slave(port->slave);
> > }
> >@@ -942,6 +977,7 @@ static int ad_marker_send(struct port *port, struct bond_marker *marker)
> >  */
> > static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> > {
> >+      struct bonding *bond = __get_bond_by_port(port);
> >       mux_states_t last_state;
> >
> >       /* keep current State Machine state to compare later if it was
> >@@ -999,9 +1035,13 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                       if ((port->sm_vars & AD_PORT_SELECTED) &&
> >                           (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> >                           !__check_agg_selection_timer(port)) {
> >-                              if (port->aggregator->is_active)
> >-                                      port->sm_mux_state =
> >-                                          AD_MUX_COLLECTING_DISTRIBUTING;
> >+                              if (port->aggregator->is_active) {
> >+                                      int state = AD_MUX_COLLECTING_DISTRIBUTING;
> >+
> >+                                      if (bond->params.lacp_extended_mux)
> >+                                              state = AD_MUX_COLLECTING;
> >+                                      port->sm_mux_state = state;
> >+                              }
> >                       } else if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >                                  (port->sm_vars & AD_PORT_STANDBY)) {
> >                               /* if UNSELECTED or STANDBY */
> >@@ -1031,7 +1071,52 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                */
> >                               if (port->aggregator &&
> >                                   port->aggregator->is_active &&
> >-                                  !__port_is_enabled(port)) {
> >+                                  !__port_is_collecting_distributing(port)) {
> >+                                      __enable_port(port);
> >+                                      *update_slave_arr = true;
> >+                              }
> >+                      }
> >+                      break;
> >+              case AD_MUX_COLLECTING:
> >+                      if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >+                          (port->sm_vars & AD_PORT_STANDBY) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> >+                          !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
> >+                              port->sm_mux_state = AD_MUX_ATTACHED;
> >+                      } else if ((port->sm_vars & AD_PORT_SELECTED) &&
> >+                          (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
> >+                          (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
> >+                              port->sm_mux_state = AD_MUX_DISTRIBUTING;
> >+                      } else {
> >+                              /* If port state hasn't changed, make sure that a collecting
> >+                               * port is enabled for an active aggregator.
> >+                               */
> >+                              if (port->aggregator &&
> >+                                  port->aggregator->is_active) {
> >+                                      struct slave *slave = port->slave;
> >+
> >+                                      if (bond_is_slave_rx_disabled(slave) != 0) {
> >+                                              ad_enable_collecting(port);
> >+                                              *update_slave_arr = true;
> >+                                      }
> >+                              }
> >+                      }
> >+                      break;
> >+              case AD_MUX_DISTRIBUTING:
> >+                      if (!(port->sm_vars & AD_PORT_SELECTED) ||
> >+                          (port->sm_vars & AD_PORT_STANDBY) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
> >+                          !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
> >+                          !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
> >+                              port->sm_mux_state = AD_MUX_COLLECTING;
> >+                      } else {
> >+                              /* if port state hasn't changed make
> >+                               * sure that a collecting distributing
> >+                               * port in an active aggregator is enabled
> >+                               */
> >+                              if (port->aggregator &&
> >+                                  port->aggregator->is_active &&
> >+                                  !__port_is_collecting_distributing(port)) {
> >                                       __enable_port(port);
> >                                       *update_slave_arr = true;
> >                               }
> >@@ -1082,6 +1167,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
> >                                                         update_slave_arr);
> >                       port->ntt = true;
> >                       break;
> >+              case AD_MUX_COLLECTING:
> >+                      port->actor_oper_port_state |= LACP_STATE_COLLECTING;
> >+                      port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
> >+                      port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
> >+                      ad_enable_collecting(port);
> >+                      ad_disable_distributing(port, update_slave_arr);
> >+                      port->ntt = true;
> >+                      break;
> >+              case AD_MUX_DISTRIBUTING:
> >+                      port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
> >+                      port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
> >+                      ad_enable_collecting_distributing(port,
> >+                                                        update_slave_arr);
> >+                      break;
> >               default:
> >                       break;
> >               }
> >@@ -1906,6 +2005,46 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
> >       }
> > }
> >
> >+/**
> >+ * ad_enable_collecting - enable a port's receive
> >+ * @port: the port we're looking at
> >+ * @update_slave_arr: Does slave array need update?
> >+ *
> >+ * Enable @port if it's in an active aggregator
> >+ */
> >+static void ad_enable_collecting(struct port *port)
> >+{
> >+      if (port->aggregator->is_active) {
> >+              struct slave *slave = port->slave;
> >+
> >+              slave_dbg(slave->bond->dev, slave->dev,
> >+                        "Enabling collecting on port %d (LAG %d)\n",
> >+                        port->actor_port_number,
> >+                        port->aggregator->aggregator_identifier);
> >+              __enable_collecting_port(port);
> >+      }
> >+}
> >+
> >+/**
> >+ * ad_disable_distributing - disable a port's transmit
> >+ * @port: the port we're looking at
> >+ * @update_slave_arr: Does slave array need update?
> >+ */
> >+static void ad_disable_distributing(struct port *port, bool *update_slave_arr)
> >+{
> >+      if (port->aggregator &&
> >+          !MAC_ADDRESS_EQUAL(&port->aggregator->partner_system,
> >+                             &(null_mac_addr))) {
> >+              slave_dbg(port->slave->bond->dev, port->slave->dev,
> >+                        "Disabling distributing on port %d (LAG %d)\n",
> >+                        port->actor_port_number,
> >+                        port->aggregator->aggregator_identifier);
> >+              __disable_distributing_port(port);
> >+              /* Slave array needs an update */
> >+              *update_slave_arr = true;
> >+      }
> >+}
> >+
> > /**
> >  * ad_enable_collecting_distributing - enable a port's transmit/receive
> >  * @port: the port we're looking at
> >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> >index 8e6cc0e133b7..6b8f001a51a5 100644
> >--- a/drivers/net/bonding/bond_main.c
> >+++ b/drivers/net/bonding/bond_main.c
> >@@ -2119,7 +2119,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
> >                * will activate the slaves in the selected
> >                * aggregator
> >                */
> >-              bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
> >+              bond_set_slave_txrx_disabled_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
> >               /* if this is the first slave */
> >               if (!prev_slave) {
> >                       SLAVE_AD_INFO(new_slave)->id = 1;
> >@@ -2381,7 +2381,10 @@ static int __bond_release_one(struct net_device *bond_dev,
> >               return -EINVAL;
> >       }
> >
> >-      bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >+      if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+              bond_set_slave_txrx_disabled_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >+      else
> >+              bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
> >
> >       bond_sysfs_slave_del(slave);
> >
> >@@ -2763,11 +2766,14 @@ static void bond_miimon_commit(struct bonding *bond)
> >                       bond_set_slave_link_state(slave, BOND_LINK_DOWN,
> >                                                 BOND_SLAVE_NOTIFY_NOW);
> >
> >-                      if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
> >-                          BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                      if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
> >                               bond_set_slave_inactive_flags(slave,
> >                                                             BOND_SLAVE_NOTIFY_NOW);
> >
> >+                      if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                              bond_set_slave_txrx_disabled_flags(slave,
> >+                                                                 BOND_SLAVE_NOTIFY_NOW);
> >+
> >                       slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
> >
> >                       bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
> >@@ -4276,8 +4282,12 @@ static int bond_open(struct net_device *bond_dev)
> >               bond_for_each_slave(bond, slave, iter) {
> >                       if (bond_uses_primary(bond) &&
> >                           slave != rcu_access_pointer(bond->curr_active_slave)) {
> >-                              bond_set_slave_inactive_flags(slave,
> >-                                                            BOND_SLAVE_NOTIFY_NOW);
> >+                              if (BOND_MODE(bond) == BOND_MODE_8023AD)
> >+                                      bond_set_slave_txrx_disabled_flags(slave,
> >+                                                                         BOND_SLAVE_NOTIFY_NOW);
> >+                              else
> >+                                      bond_set_slave_inactive_flags(slave,
> >+                                                                    BOND_SLAVE_NOTIFY_NOW);
> >                       } else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
> >                               bond_set_slave_active_flags(slave,
> >                                                           BOND_SLAVE_NOTIFY_NOW);
> >diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
> >index cfa74cf8bb1a..1e671f504fc1 100644
> >--- a/drivers/net/bonding/bond_netlink.c
> >+++ b/drivers/net/bonding/bond_netlink.c
> >@@ -122,6 +122,7 @@ static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
> >       [IFLA_BOND_PEER_NOTIF_DELAY]    = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
> >       [IFLA_BOND_MISSED_MAX]          = { .type = NLA_U8 },
> >       [IFLA_BOND_NS_IP6_TARGET]       = { .type = NLA_NESTED },
> >+      [IFLA_BOND_LACP_EXTENDED_MUX]   = { .type = NLA_U8 },
> > };
> >
> > static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = {
> >@@ -549,6 +550,16 @@ static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
> >                       return err;
> >       }
> >
> >+      if (data[IFLA_BOND_LACP_EXTENDED_MUX]) {
> >+              int lacp_extended_mux = nla_get_u8(data[IFLA_BOND_LACP_EXTENDED_MUX]);
> >+
> >+              bond_opt_initval(&newval, lacp_extended_mux);
> >+              err = __bond_opt_set(bond, BOND_OPT_LACP_EXTENDED_MUX, &newval,
> >+                                   data[IFLA_BOND_LACP_EXTENDED_MUX], extack);
> >+              if (err)
> >+                      return err;
> >+      }
> >+
> >       return 0;
> > }
> >
> >@@ -615,6 +626,7 @@ static size_t bond_get_size(const struct net_device *bond_dev)
> >                                               /* IFLA_BOND_NS_IP6_TARGET */
> >               nla_total_size(sizeof(struct nlattr)) +
> >               nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS +
> >+              nla_total_size(sizeof(u8)) +    /* IFLA_BOND_LACP_EXTENDED_MUX */
> >               0;
> > }
> >
> >@@ -774,6 +786,10 @@ static int bond_fill_info(struct sk_buff *skb,
> >                      bond->params.missed_max))
> >               goto nla_put_failure;
> >
> >+      if (nla_put_u8(skb, IFLA_BOND_LACP_EXTENDED_MUX,
> >+                     bond->params.lacp_extended_mux))
> >+              goto nla_put_failure;
> >+
> >       if (BOND_MODE(bond) == BOND_MODE_8023AD) {
> >               struct ad_info info;
> >
> >diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
> >index f3f27f0bd2a6..c9997e42d045 100644
> >--- a/drivers/net/bonding/bond_options.c
> >+++ b/drivers/net/bonding/bond_options.c
> >@@ -84,7 +84,8 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> >                                           const struct bond_opt_value *newval);
> > static int bond_option_missed_max_set(struct bonding *bond,
> >                                     const struct bond_opt_value *newval);
> >-
> >+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
> >+                                           const struct bond_opt_value *newval);
> >
> > static const struct bond_opt_value bond_mode_tbl[] = {
> >       { "balance-rr",    BOND_MODE_ROUNDROBIN,   BOND_VALFLAG_DEFAULT},
> >@@ -232,6 +233,12 @@ static const struct bond_opt_value bond_missed_max_tbl[] = {
> >       { NULL,         -1,     0},
> > };
> >
> >+static const struct bond_opt_value bond_lacp_extended_mux_tbl[] = {
> >+      { "off", 0,  BOND_VALFLAG_DEFAULT},
> >+      { "on",  1,  0},
> >+      { NULL,  -1, 0},
> >+};
> >+
> > static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> >       [BOND_OPT_MODE] = {
> >               .id = BOND_OPT_MODE,
> >@@ -496,6 +503,14 @@ static const struct bond_option bond_opts[BOND_OPT_LAST] = {
> >               .desc = "Delay between each peer notification on failover event, in milliseconds",
> >               .values = bond_peer_notif_delay_tbl,
> >               .set = bond_option_peer_notif_delay_set
> >+      },
> >+      [BOND_OPT_LACP_EXTENDED_MUX] = {
> >+              .id = BOND_OPT_LACP_EXTENDED_MUX,
> >+              .name = "lacp_extended_mux",
> >+              .desc = "Opt into using extended MUX for LACP states",
> >+              .unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)),
> >+              .values = bond_lacp_extended_mux_tbl,
> >+              .set = bond_option_lacp_extended_mux_set,
> >       }
> > };
> >
> >@@ -1692,3 +1707,12 @@ static int bond_option_ad_user_port_key_set(struct bonding *bond,
> >       bond->params.ad_user_port_key = newval->value;
> >       return 0;
> > }
> >+
> >+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
> >+                                           const struct bond_opt_value *newval)
> >+{
> >+      netdev_info(bond->dev, "Setting lacp_extended_mux to %s (%llu)\n",
> >+                  newval->string, newval->value);
> >+      bond->params.lacp_extended_mux = newval->value;
> >+      return 0;
> >+}
> >diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
> >index 2805135a7205..62e264010998 100644
> >--- a/drivers/net/bonding/bond_sysfs.c
> >+++ b/drivers/net/bonding/bond_sysfs.c
> >@@ -753,6 +753,17 @@ static ssize_t bonding_show_ad_user_port_key(struct device *d,
> > static DEVICE_ATTR(ad_user_port_key, 0644,
> >                  bonding_show_ad_user_port_key, bonding_sysfs_store_option);
> >
> >+static ssize_t bonding_show_lacp_extended_mux(struct device *d,
> >+                                            struct device_attribute *attr,
> >+                                            char *buf)
> >+{
> >+      struct bonding *bond = to_bond(d);
> >+
> >+      return sprintf(buf, "%d\n", bond->params.lacp_extended_mux);
> >+}
> >+static DEVICE_ATTR(lacp_extended_mux, 0644,
> >+                 bonding_show_lacp_extended_mux, bonding_sysfs_store_option);
> >+
> > static struct attribute *per_bond_attrs[] = {
> >       &dev_attr_slaves.attr,
> >       &dev_attr_mode.attr,
> >@@ -792,6 +803,7 @@ static struct attribute *per_bond_attrs[] = {
> >       &dev_attr_ad_actor_system.attr,
> >       &dev_attr_ad_user_port_key.attr,
> >       &dev_attr_arp_missed_max.attr,
> >+      &dev_attr_lacp_extended_mux.attr,
> >       NULL,
> > };
> >
> >diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
> >index c5e57c6bd873..9ce5ac2bfbad 100644
> >--- a/include/net/bond_3ad.h
> >+++ b/include/net/bond_3ad.h
> >@@ -54,6 +54,8 @@ typedef enum {
> >       AD_MUX_DETACHED,        /* mux machine */
> >       AD_MUX_WAITING,         /* mux machine */
> >       AD_MUX_ATTACHED,        /* mux machine */
> >+      AD_MUX_COLLECTING,      /* mux machine */
> >+      AD_MUX_DISTRIBUTING,    /* mux machine */
> >       AD_MUX_COLLECTING_DISTRIBUTING  /* mux machine */
> > } mux_states_t;
> >
> >diff --git a/include/net/bond_options.h b/include/net/bond_options.h
> >index 69292ecc0325..8d1e9cb28684 100644
> >--- a/include/net/bond_options.h
> >+++ b/include/net/bond_options.h
> >@@ -76,6 +76,7 @@ enum {
> >       BOND_OPT_MISSED_MAX,
> >       BOND_OPT_NS_TARGETS,
> >       BOND_OPT_PRIO,
> >+      BOND_OPT_LACP_EXTENDED_MUX,
> >       BOND_OPT_LAST
> > };
> >
> >diff --git a/include/net/bonding.h b/include/net/bonding.h
> >index 5b8b1b644a2d..b31880d53d76 100644
> >--- a/include/net/bonding.h
> >+++ b/include/net/bonding.h
> >@@ -148,6 +148,7 @@ struct bond_params {
> > #if IS_ENABLED(CONFIG_IPV6)
> >       struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
> > #endif
> >+      int lacp_extended_mux;
> >
> >       /* 2 bytes of padding : see ether_addr_equal_64bits() */
> >       u8 ad_actor_system[ETH_ALEN + 2];
> >@@ -167,6 +168,7 @@ struct slave {
> >       u8     backup:1,   /* indicates backup slave. Value corresponds with
> >                             BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
> >              inactive:1, /* indicates inactive slave */
> >+             rx_disabled:1, /* indicates whether slave's Rx is disabled */
> >              should_notify:1, /* indicates whether the state changed */
> >              should_notify_link:1; /* indicates whether the link changed */
> >       u8     duplex;
> >@@ -570,6 +572,19 @@ static inline void bond_set_slave_inactive_flags(struct slave *slave,
> >               slave->inactive = 1;
> > }
> >
> >+static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
> >+                                               bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> >+      slave->rx_disabled = 1;
> >+}
> >+
> >+static inline void bond_set_slave_tx_disabled_flags(struct slave *slave,
> >+                                               bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
> >+}
> >+
> > static inline void bond_set_slave_active_flags(struct slave *slave,
> >                                              bool notify)
> > {
> >@@ -577,11 +592,29 @@ static inline void bond_set_slave_active_flags(struct slave *slave,
> >       slave->inactive = 0;
> > }
> >
> >+static inline void bond_set_slave_txrx_enabled_flags(struct slave *slave,
> >+                                             bool notify)
> >+{
> >+      bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify);
> >+      slave->rx_disabled = 0;
> >+}
> >+
> >+static inline void bond_set_slave_rx_enabled_flags(struct slave *slave,
> >+                                             bool notify)
> >+{
> >+      slave->rx_disabled = 0;
> >+}
> >+
> > static inline bool bond_is_slave_inactive(struct slave *slave)
> > {
> >       return slave->inactive;
> > }
> >
> >+static inline bool bond_is_slave_rx_disabled(struct slave *slave)
> >+{
> >+      return slave->rx_disabled;
> >+}
> >+
> > static inline void bond_propose_link_state(struct slave *slave, int state)
> > {
> >       slave->link_new_state = state;
> >diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
> >index 29ff80da2775..e8fb30da9110 100644
> >--- a/include/uapi/linux/if_link.h
> >+++ b/include/uapi/linux/if_link.h
> >@@ -976,6 +976,7 @@ enum {
> >       IFLA_BOND_AD_LACP_ACTIVE,
> >       IFLA_BOND_MISSED_MAX,
> >       IFLA_BOND_NS_IP6_TARGET,
> >+      IFLA_BOND_LACP_EXTENDED_MUX,
> >       __IFLA_BOND_MAX,
> > };
> >
> >diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
> >index a0aa05a28cf2..f641f55dbbc4 100644
> >--- a/tools/include/uapi/linux/if_link.h
> >+++ b/tools/include/uapi/linux/if_link.h
> >@@ -974,6 +974,7 @@ enum {
> >       IFLA_BOND_AD_LACP_ACTIVE,
> >       IFLA_BOND_MISSED_MAX,
> >       IFLA_BOND_NS_IP6_TARGET,
> >+      IFLA_BOND_LACP_EXTENDED_MUX,
> >       __IFLA_BOND_MAX,
> > };
> >
> >--
> >2.43.0.472.g3155946c3a-goog
> >
> >
>
> ---
>         -Jay Vosburgh, jay.vosburgh@canonical.com
  

Patch

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c99ffe6c683a..38a7aa6e4edd 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -106,6 +106,9 @@  static void ad_agg_selection_logic(struct aggregator *aggregator,
 static void ad_clear_agg(struct aggregator *aggregator);
 static void ad_initialize_agg(struct aggregator *aggregator);
 static void ad_initialize_port(struct port *port, int lacp_fast);
+static void ad_enable_collecting(struct port *port);
+static void ad_disable_distributing(struct port *port,
+				    bool *update_slave_arr);
 static void ad_enable_collecting_distributing(struct port *port,
 					      bool *update_slave_arr);
 static void ad_disable_collecting_distributing(struct port *port,
@@ -171,32 +174,64 @@  static inline int __agg_has_partner(struct aggregator *agg)
 	return !is_zero_ether_addr(agg->partner_system.mac_addr_value);
 }
 
+/**
+ * __disable_distributing_port - disable the port's slave for distributing.
+ * Port will still be able to collect.
+ * @port: the port we're looking at
+ *
+ * This will disable only distributing on the port's slave.
+ */
+static inline void __disable_distributing_port(struct port *port)
+{
+	bond_set_slave_tx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
+}
+
+/**
+ * __enable_collecting_port - enable the port's slave for collecting,
+ * if it's up
+ * @port: the port we're looking at
+ *
+ * This will enable only collecting on the port's slave.
+ */
+static inline void __enable_collecting_port(struct port *port)
+{
+	struct slave *slave = port->slave;
+
+	if (slave->link == BOND_LINK_UP && bond_slave_is_up(slave))
+		bond_set_slave_rx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
+}
+
 /**
  * __disable_port - disable the port's slave
  * @port: the port we're looking at
+ *
+ * This will disable both collecting and distributing on the port's slave.
  */
 static inline void __disable_port(struct port *port)
 {
-	bond_set_slave_inactive_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
+	bond_set_slave_txrx_disabled_flags(port->slave, BOND_SLAVE_NOTIFY_LATER);
 }
 
 /**
  * __enable_port - enable the port's slave, if it's up
  * @port: the port we're looking at
+ *
+ * This will enable both collecting and distributing on the port's slave.
  */
 static inline void __enable_port(struct port *port)
 {
 	struct slave *slave = port->slave;
 
 	if ((slave->link == BOND_LINK_UP) && bond_slave_is_up(slave))
-		bond_set_slave_active_flags(slave, BOND_SLAVE_NOTIFY_LATER);
+		bond_set_slave_txrx_enabled_flags(slave, BOND_SLAVE_NOTIFY_LATER);
 }
 
 /**
- * __port_is_enabled - check if the port's slave is in active state
+ * __port_is_collecting_distributing - check if the port's slave is in the
+ * combined collecting/distributing state
  * @port: the port we're looking at
  */
-static inline int __port_is_enabled(struct port *port)
+static inline int __port_is_collecting_distributing(struct port *port)
 {
 	return bond_is_active_slave(port->slave);
 }
@@ -942,6 +977,7 @@  static int ad_marker_send(struct port *port, struct bond_marker *marker)
  */
 static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 {
+	struct bonding *bond = __get_bond_by_port(port);
 	mux_states_t last_state;
 
 	/* keep current State Machine state to compare later if it was
@@ -999,9 +1035,13 @@  static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 			if ((port->sm_vars & AD_PORT_SELECTED) &&
 			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
 			    !__check_agg_selection_timer(port)) {
-				if (port->aggregator->is_active)
-					port->sm_mux_state =
-					    AD_MUX_COLLECTING_DISTRIBUTING;
+				if (port->aggregator->is_active) {
+					int state = AD_MUX_COLLECTING_DISTRIBUTING;
+
+					if (bond->params.lacp_extended_mux)
+						state = AD_MUX_COLLECTING;
+					port->sm_mux_state = state;
+				}
 			} else if (!(port->sm_vars & AD_PORT_SELECTED) ||
 				   (port->sm_vars & AD_PORT_STANDBY)) {
 				/* if UNSELECTED or STANDBY */
@@ -1031,7 +1071,52 @@  static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 				 */
 				if (port->aggregator &&
 				    port->aggregator->is_active &&
-				    !__port_is_enabled(port)) {
+				    !__port_is_collecting_distributing(port)) {
+					__enable_port(port);
+					*update_slave_arr = true;
+				}
+			}
+			break;
+		case AD_MUX_COLLECTING:
+			if (!(port->sm_vars & AD_PORT_SELECTED) ||
+			    (port->sm_vars & AD_PORT_STANDBY) ||
+			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
+			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
+				port->sm_mux_state = AD_MUX_ATTACHED;
+			} else if ((port->sm_vars & AD_PORT_SELECTED) &&
+			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
+			    (port->partner_oper.port_state & LACP_STATE_COLLECTING)) {
+				port->sm_mux_state = AD_MUX_DISTRIBUTING;
+			} else {
+				/* If port state hasn't changed, make sure that a collecting
+				 * port is enabled for an active aggregator.
+				 */
+				if (port->aggregator &&
+				    port->aggregator->is_active) {
+					struct slave *slave = port->slave;
+
+					if (bond_is_slave_rx_disabled(slave) != 0) {
+						ad_enable_collecting(port);
+						*update_slave_arr = true;
+					}
+				}
+			}
+			break;
+		case AD_MUX_DISTRIBUTING:
+			if (!(port->sm_vars & AD_PORT_SELECTED) ||
+			    (port->sm_vars & AD_PORT_STANDBY) ||
+			    !(port->partner_oper.port_state & LACP_STATE_COLLECTING) ||
+			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
+			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
+				port->sm_mux_state = AD_MUX_COLLECTING;
+			} else {
+				/* if port state hasn't changed make
+				 * sure that a collecting distributing
+				 * port in an active aggregator is enabled
+				 */
+				if (port->aggregator &&
+				    port->aggregator->is_active &&
+				    !__port_is_collecting_distributing(port)) {
 					__enable_port(port);
 					*update_slave_arr = true;
 				}
@@ -1082,6 +1167,20 @@  static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 							  update_slave_arr);
 			port->ntt = true;
 			break;
+		case AD_MUX_COLLECTING:
+			port->actor_oper_port_state |= LACP_STATE_COLLECTING;
+			port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
+			port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
+			ad_enable_collecting(port);
+			ad_disable_distributing(port, update_slave_arr);
+			port->ntt = true;
+			break;
+		case AD_MUX_DISTRIBUTING:
+			port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
+			port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
+			ad_enable_collecting_distributing(port,
+							  update_slave_arr);
+			break;
 		default:
 			break;
 		}
@@ -1906,6 +2005,46 @@  static void ad_initialize_port(struct port *port, int lacp_fast)
 	}
 }
 
+/**
+ * ad_enable_collecting - enable a port's receive
+ * @port: the port we're looking at
+ * @update_slave_arr: Does slave array need update?
+ *
+ * Enable @port if it's in an active aggregator
+ */
+static void ad_enable_collecting(struct port *port)
+{
+	if (port->aggregator->is_active) {
+		struct slave *slave = port->slave;
+
+		slave_dbg(slave->bond->dev, slave->dev,
+			  "Enabling collecting on port %d (LAG %d)\n",
+			  port->actor_port_number,
+			  port->aggregator->aggregator_identifier);
+		__enable_collecting_port(port);
+	}
+}
+
+/**
+ * ad_disable_distributing - disable a port's transmit
+ * @port: the port we're looking at
+ * @update_slave_arr: Does slave array need update?
+ */
+static void ad_disable_distributing(struct port *port, bool *update_slave_arr)
+{
+	if (port->aggregator &&
+	    !MAC_ADDRESS_EQUAL(&port->aggregator->partner_system,
+			       &(null_mac_addr))) {
+		slave_dbg(port->slave->bond->dev, port->slave->dev,
+			  "Disabling distributing on port %d (LAG %d)\n",
+			  port->actor_port_number,
+			  port->aggregator->aggregator_identifier);
+		__disable_distributing_port(port);
+		/* Slave array needs an update */
+		*update_slave_arr = true;
+	}
+}
+
 /**
  * ad_enable_collecting_distributing - enable a port's transmit/receive
  * @port: the port we're looking at
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 8e6cc0e133b7..6b8f001a51a5 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2119,7 +2119,7 @@  int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 		 * will activate the slaves in the selected
 		 * aggregator
 		 */
-		bond_set_slave_inactive_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
+		bond_set_slave_txrx_disabled_flags(new_slave, BOND_SLAVE_NOTIFY_NOW);
 		/* if this is the first slave */
 		if (!prev_slave) {
 			SLAVE_AD_INFO(new_slave)->id = 1;
@@ -2381,7 +2381,10 @@  static int __bond_release_one(struct net_device *bond_dev,
 		return -EINVAL;
 	}
 
-	bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
+	if (BOND_MODE(bond) == BOND_MODE_8023AD)
+		bond_set_slave_txrx_disabled_flags(slave, BOND_SLAVE_NOTIFY_NOW);
+	else
+		bond_set_slave_inactive_flags(slave, BOND_SLAVE_NOTIFY_NOW);
 
 	bond_sysfs_slave_del(slave);
 
@@ -2763,11 +2766,14 @@  static void bond_miimon_commit(struct bonding *bond)
 			bond_set_slave_link_state(slave, BOND_LINK_DOWN,
 						  BOND_SLAVE_NOTIFY_NOW);
 
-			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP ||
-			    BOND_MODE(bond) == BOND_MODE_8023AD)
+			if (BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP)
 				bond_set_slave_inactive_flags(slave,
 							      BOND_SLAVE_NOTIFY_NOW);
 
+			if (BOND_MODE(bond) == BOND_MODE_8023AD)
+				bond_set_slave_txrx_disabled_flags(slave,
+								   BOND_SLAVE_NOTIFY_NOW);
+
 			slave_info(bond->dev, slave->dev, "link status definitely down, disabling slave\n");
 
 			bond_miimon_link_change(bond, slave, BOND_LINK_DOWN);
@@ -4276,8 +4282,12 @@  static int bond_open(struct net_device *bond_dev)
 		bond_for_each_slave(bond, slave, iter) {
 			if (bond_uses_primary(bond) &&
 			    slave != rcu_access_pointer(bond->curr_active_slave)) {
-				bond_set_slave_inactive_flags(slave,
-							      BOND_SLAVE_NOTIFY_NOW);
+				if (BOND_MODE(bond) == BOND_MODE_8023AD)
+					bond_set_slave_txrx_disabled_flags(slave,
+									   BOND_SLAVE_NOTIFY_NOW);
+				else
+					bond_set_slave_inactive_flags(slave,
+								      BOND_SLAVE_NOTIFY_NOW);
 			} else if (BOND_MODE(bond) != BOND_MODE_8023AD) {
 				bond_set_slave_active_flags(slave,
 							    BOND_SLAVE_NOTIFY_NOW);
diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index cfa74cf8bb1a..1e671f504fc1 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -122,6 +122,7 @@  static const struct nla_policy bond_policy[IFLA_BOND_MAX + 1] = {
 	[IFLA_BOND_PEER_NOTIF_DELAY]    = NLA_POLICY_FULL_RANGE(NLA_U32, &delay_range),
 	[IFLA_BOND_MISSED_MAX]		= { .type = NLA_U8 },
 	[IFLA_BOND_NS_IP6_TARGET]	= { .type = NLA_NESTED },
+	[IFLA_BOND_LACP_EXTENDED_MUX]   = { .type = NLA_U8 },
 };
 
 static const struct nla_policy bond_slave_policy[IFLA_BOND_SLAVE_MAX + 1] = {
@@ -549,6 +550,16 @@  static int bond_changelink(struct net_device *bond_dev, struct nlattr *tb[],
 			return err;
 	}
 
+	if (data[IFLA_BOND_LACP_EXTENDED_MUX]) {
+		int lacp_extended_mux = nla_get_u8(data[IFLA_BOND_LACP_EXTENDED_MUX]);
+
+		bond_opt_initval(&newval, lacp_extended_mux);
+		err = __bond_opt_set(bond, BOND_OPT_LACP_EXTENDED_MUX, &newval,
+				     data[IFLA_BOND_LACP_EXTENDED_MUX], extack);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
@@ -615,6 +626,7 @@  static size_t bond_get_size(const struct net_device *bond_dev)
 						/* IFLA_BOND_NS_IP6_TARGET */
 		nla_total_size(sizeof(struct nlattr)) +
 		nla_total_size(sizeof(struct in6_addr)) * BOND_MAX_NS_TARGETS +
+		nla_total_size(sizeof(u8)) +	/* IFLA_BOND_LACP_EXTENDED_MUX */
 		0;
 }
 
@@ -774,6 +786,10 @@  static int bond_fill_info(struct sk_buff *skb,
 		       bond->params.missed_max))
 		goto nla_put_failure;
 
+	if (nla_put_u8(skb, IFLA_BOND_LACP_EXTENDED_MUX,
+		       bond->params.lacp_extended_mux))
+		goto nla_put_failure;
+
 	if (BOND_MODE(bond) == BOND_MODE_8023AD) {
 		struct ad_info info;
 
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index f3f27f0bd2a6..c9997e42d045 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -84,7 +84,8 @@  static int bond_option_ad_user_port_key_set(struct bonding *bond,
 					    const struct bond_opt_value *newval);
 static int bond_option_missed_max_set(struct bonding *bond,
 				      const struct bond_opt_value *newval);
-
+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
+					     const struct bond_opt_value *newval);
 
 static const struct bond_opt_value bond_mode_tbl[] = {
 	{ "balance-rr",    BOND_MODE_ROUNDROBIN,   BOND_VALFLAG_DEFAULT},
@@ -232,6 +233,12 @@  static const struct bond_opt_value bond_missed_max_tbl[] = {
 	{ NULL,		-1,	0},
 };
 
+static const struct bond_opt_value bond_lacp_extended_mux_tbl[] = {
+	{ "off", 0,  BOND_VALFLAG_DEFAULT},
+	{ "on",  1,  0},
+	{ NULL,  -1, 0},
+};
+
 static const struct bond_option bond_opts[BOND_OPT_LAST] = {
 	[BOND_OPT_MODE] = {
 		.id = BOND_OPT_MODE,
@@ -496,6 +503,14 @@  static const struct bond_option bond_opts[BOND_OPT_LAST] = {
 		.desc = "Delay between each peer notification on failover event, in milliseconds",
 		.values = bond_peer_notif_delay_tbl,
 		.set = bond_option_peer_notif_delay_set
+	},
+	[BOND_OPT_LACP_EXTENDED_MUX] = {
+		.id = BOND_OPT_LACP_EXTENDED_MUX,
+		.name = "lacp_extended_mux",
+		.desc = "Opt into using extended MUX for LACP states",
+		.unsuppmodes = BOND_MODE_ALL_EX(BIT(BOND_MODE_8023AD)),
+		.values = bond_lacp_extended_mux_tbl,
+		.set = bond_option_lacp_extended_mux_set,
 	}
 };
 
@@ -1692,3 +1707,12 @@  static int bond_option_ad_user_port_key_set(struct bonding *bond,
 	bond->params.ad_user_port_key = newval->value;
 	return 0;
 }
+
+static int bond_option_lacp_extended_mux_set(struct bonding *bond,
+					     const struct bond_opt_value *newval)
+{
+	netdev_info(bond->dev, "Setting lacp_extended_mux to %s (%llu)\n",
+		    newval->string, newval->value);
+	bond->params.lacp_extended_mux = newval->value;
+	return 0;
+}
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 2805135a7205..62e264010998 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -753,6 +753,17 @@  static ssize_t bonding_show_ad_user_port_key(struct device *d,
 static DEVICE_ATTR(ad_user_port_key, 0644,
 		   bonding_show_ad_user_port_key, bonding_sysfs_store_option);
 
+static ssize_t bonding_show_lacp_extended_mux(struct device *d,
+					      struct device_attribute *attr,
+					      char *buf)
+{
+	struct bonding *bond = to_bond(d);
+
+	return sprintf(buf, "%d\n", bond->params.lacp_extended_mux);
+}
+static DEVICE_ATTR(lacp_extended_mux, 0644,
+		   bonding_show_lacp_extended_mux, bonding_sysfs_store_option);
+
 static struct attribute *per_bond_attrs[] = {
 	&dev_attr_slaves.attr,
 	&dev_attr_mode.attr,
@@ -792,6 +803,7 @@  static struct attribute *per_bond_attrs[] = {
 	&dev_attr_ad_actor_system.attr,
 	&dev_attr_ad_user_port_key.attr,
 	&dev_attr_arp_missed_max.attr,
+	&dev_attr_lacp_extended_mux.attr,
 	NULL,
 };
 
diff --git a/include/net/bond_3ad.h b/include/net/bond_3ad.h
index c5e57c6bd873..9ce5ac2bfbad 100644
--- a/include/net/bond_3ad.h
+++ b/include/net/bond_3ad.h
@@ -54,6 +54,8 @@  typedef enum {
 	AD_MUX_DETACHED,	/* mux machine */
 	AD_MUX_WAITING,		/* mux machine */
 	AD_MUX_ATTACHED,	/* mux machine */
+	AD_MUX_COLLECTING,	/* mux machine */
+	AD_MUX_DISTRIBUTING,	/* mux machine */
 	AD_MUX_COLLECTING_DISTRIBUTING	/* mux machine */
 } mux_states_t;
 
diff --git a/include/net/bond_options.h b/include/net/bond_options.h
index 69292ecc0325..8d1e9cb28684 100644
--- a/include/net/bond_options.h
+++ b/include/net/bond_options.h
@@ -76,6 +76,7 @@  enum {
 	BOND_OPT_MISSED_MAX,
 	BOND_OPT_NS_TARGETS,
 	BOND_OPT_PRIO,
+	BOND_OPT_LACP_EXTENDED_MUX,
 	BOND_OPT_LAST
 };
 
diff --git a/include/net/bonding.h b/include/net/bonding.h
index 5b8b1b644a2d..b31880d53d76 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -148,6 +148,7 @@  struct bond_params {
 #if IS_ENABLED(CONFIG_IPV6)
 	struct in6_addr ns_targets[BOND_MAX_NS_TARGETS];
 #endif
+	int lacp_extended_mux;
 
 	/* 2 bytes of padding : see ether_addr_equal_64bits() */
 	u8 ad_actor_system[ETH_ALEN + 2];
@@ -167,6 +168,7 @@  struct slave {
 	u8     backup:1,   /* indicates backup slave. Value corresponds with
 			      BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
 	       inactive:1, /* indicates inactive slave */
+	       rx_disabled:1, /* indicates whether slave's Rx is disabled */
 	       should_notify:1, /* indicates whether the state changed */
 	       should_notify_link:1; /* indicates whether the link changed */
 	u8     duplex;
@@ -570,6 +572,19 @@  static inline void bond_set_slave_inactive_flags(struct slave *slave,
 		slave->inactive = 1;
 }
 
+static inline void bond_set_slave_txrx_disabled_flags(struct slave *slave,
+						 bool notify)
+{
+	bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
+	slave->rx_disabled = 1;
+}
+
+static inline void bond_set_slave_tx_disabled_flags(struct slave *slave,
+						 bool notify)
+{
+	bond_set_slave_state(slave, BOND_STATE_BACKUP, notify);
+}
+
 static inline void bond_set_slave_active_flags(struct slave *slave,
 					       bool notify)
 {
@@ -577,11 +592,29 @@  static inline void bond_set_slave_active_flags(struct slave *slave,
 	slave->inactive = 0;
 }
 
+static inline void bond_set_slave_txrx_enabled_flags(struct slave *slave,
+					       bool notify)
+{
+	bond_set_slave_state(slave, BOND_STATE_ACTIVE, notify);
+	slave->rx_disabled = 0;
+}
+
+static inline void bond_set_slave_rx_enabled_flags(struct slave *slave,
+					       bool notify)
+{
+	slave->rx_disabled = 0;
+}
+
 static inline bool bond_is_slave_inactive(struct slave *slave)
 {
 	return slave->inactive;
 }
 
+static inline bool bond_is_slave_rx_disabled(struct slave *slave)
+{
+	return slave->rx_disabled;
+}
+
 static inline void bond_propose_link_state(struct slave *slave, int state)
 {
 	slave->link_new_state = state;
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 29ff80da2775..e8fb30da9110 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -976,6 +976,7 @@  enum {
 	IFLA_BOND_AD_LACP_ACTIVE,
 	IFLA_BOND_MISSED_MAX,
 	IFLA_BOND_NS_IP6_TARGET,
+	IFLA_BOND_LACP_EXTENDED_MUX,
 	__IFLA_BOND_MAX,
 };
 
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index a0aa05a28cf2..f641f55dbbc4 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -974,6 +974,7 @@  enum {
 	IFLA_BOND_AD_LACP_ACTIVE,
 	IFLA_BOND_MISSED_MAX,
 	IFLA_BOND_NS_IP6_TARGET,
+	IFLA_BOND_LACP_EXTENDED_MUX,
 	__IFLA_BOND_MAX,
 };