[net,v4,1/3] ixgbe: allow to increase MTU to 3K with XDP enabled

Message ID 20230208024333.10465-1-kerneljasonxing@gmail.com
State New
Series [net,v4,1/3] ixgbe: allow to increase MTU to 3K with XDP enabled

Commit Message

Jason Xing Feb. 8, 2023, 2:43 a.m. UTC
  From: Jason Xing <kernelxing@tencent.com>

Recently I found that the MTU cannot be raised directly from 1500 to a
larger value while XDP is enabled on servers equipped with an IXGBE
card. This happened on thousands of servers in a production
environment. After applying this patch, the MTU can be set as high as
3K with XDP enabled.

This patch follows the MTU-change behavior of i40e/ice.

References:
[1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
[2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")

Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
v4:
1) use ':' instead of '-' for kdoc

v3:
1) modify the title and body message.

v2:
1) change the commit message.
2) modify the logic when changing MTU size suggested by Maciej and Alexander.
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++-------
 1 file changed, 16 insertions(+), 9 deletions(-)
  

Comments

Alexander Duyck Feb. 8, 2023, 3:37 p.m. UTC | #1
On Wed, 2023-02-08 at 10:43 +0800, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
> 
> Recently I encountered one case where I cannot increase the MTU size
> directly from 1500 to a much bigger value with XDP enabled if the
> server is equipped with IXGBE card, which happened on thousands of
> servers in production environment. After applying the current patch,
> we can set the maximum MTU size to 3K.
> 
> This patch follows the behavior of changing MTU as i40e/ice does.
> 
> References:
> [1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
> [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
> 
> Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
> Signed-off-by: Jason Xing <kernelxing@tencent.com>

This is based on the broken premise that with XDP we are using a 4K
page. The ixgbe driver isn't using page pool and is therefore running
under different limitations. The ixgbe driver is only using 2K slices of
the 4K page. In addition, that is reduced to 1.5K to allow for headroom
and the shared info in the buffer.

Currently the only way a 3K buffer would work is if FCoE is enabled and
in that case the driver is using order 1 pages and still using the
split buffer approach.

Changing the MTU to more than 1.5K will allow multi-buffer frames which
would break things when you try to use XDP_REDIRECT or XDP_TX on frames
over 1.5K in size. For things like XDP_PASS, XDP_DROP, and XDP_ABORT it
should still work as long as you don't attempt to reach beyond the 1.5K
boundary.

Until this driver supports XDP multi-buffer I don't think you can
increase the MTU past 1.5K. If you want a larger MTU, you should look
at enabling XDP multi-buffer and then just drop the XDP limitations
entirely.

> ---
> v4:
> 1) use ':' instead of '-' for kdoc
> 
> v3:
> 1) modify the title and body message.
> 
> v2:
> 1) change the commit message.
> 2) modify the logic when changing MTU size suggested by Maciej and Alexander.
> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++-------
>  1 file changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index ab8370c413f3..25ca329f7d3c 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -6777,6 +6777,18 @@ static void ixgbe_free_all_rx_resources(struct ixgbe_adapter *adapter)
>  			ixgbe_free_rx_resources(adapter->rx_ring[i]);
>  }
>  
> +/**
> + * ixgbe_max_xdp_frame_size - returns the maximum allowed frame size for XDP
> + * @adapter: device handle, pointer to adapter
> + */
> +static int ixgbe_max_xdp_frame_size(struct ixgbe_adapter *adapter)
> +{
> +	if (PAGE_SIZE >= 8192 || adapter->flags2 & IXGBE_FLAG2_RX_LEGACY)
> +		return IXGBE_RXBUFFER_2K;
> +	else
> +		return IXGBE_RXBUFFER_3K;
> +}
> +

There is no difference in the buffer allocation approach for LEGACY vs
non-legacy. The difference is if we are building the frame around the
buffer using build_skb or we are adding it as a frag and then copying
out the header.

>  /**
>   * ixgbe_change_mtu - Change the Maximum Transfer Unit
>   * @netdev: network interface device structure
> @@ -6788,18 +6800,13 @@ static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
>  {
>  	struct ixgbe_adapter *adapter = netdev_priv(netdev);
>  
> -	if (adapter->xdp_prog) {
> +	if (ixgbe_enabled_xdp_adapter(adapter)) {
>  		int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN +
>  				     VLAN_HLEN;
> -		int i;
> -
> -		for (i = 0; i < adapter->num_rx_queues; i++) {
> -			struct ixgbe_ring *ring = adapter->rx_ring[i];
>  
> -			if (new_frame_size > ixgbe_rx_bufsz(ring)) {
> -				e_warn(probe, "Requested MTU size is not supported with XDP\n");
> -				return -EINVAL;
> -			}
> +		if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) {
> +			e_warn(probe, "Requested MTU size is not supported with XDP\n");
> +			return -EINVAL;
>  		}
>  	}
>
  
Maciej Fijalkowski Feb. 8, 2023, 4:27 p.m. UTC | #2
On Wed, Feb 08, 2023 at 07:37:57AM -0800, Alexander H Duyck wrote:
> On Wed, 2023-02-08 at 10:43 +0800, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> > 
> > Recently I encountered one case where I cannot increase the MTU size
> > directly from 1500 to a much bigger value with XDP enabled if the
> > server is equipped with IXGBE card, which happened on thousands of
> > servers in production environment. After applying the current patch,
> > we can set the maximum MTU size to 3K.
> > 
> > This patch follows the behavior of changing MTU as i40e/ice does.
> > 
> > References:
> > [1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
> > [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
> > 
> > Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> 
> This is based on the broken premise that w/ XDP we are using a 4K page.
> The ixgbe driver isn't using page pool and is therefore running on
> different limitations. The ixgbe driver is only using 2K slices of the
> 4K page. In addition that is reduced to 1.5K to allow for headroom and
> the shared info in the buffer.
> 
> Currently the only way a 3K buffer would work is if FCoE is enabled and
> in that case the driver is using order 1 pages and still using the
> split buffer approach.

Hey Alex, interesting, we based this on the following logic from
ixgbe_set_rx_buffer_len() I guess:

#if (PAGE_SIZE < 8192)
		if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED)
			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);

		if (IXGBE_2K_TOO_SMALL_WITH_PADDING ||
		    (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
#endif

so we assumed that ixgbe is no different from i40e/ice in these terms, but
we ignored the whole overhead of LRO/RSC that ixgbe carries.
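
The selection logic quoted above can be modeled in userspace roughly as
follows. This is only a sketch of the quoted snippet, not the driver code:
struct rx_cfg and uses_3k_buffer() are hypothetical names, the pad_too_small
field stands in for IXGBE_2K_TOO_SMALL_WITH_PADDING, and any surrounding
conditions in ixgbe_set_rx_buffer_len() that the snippet does not show are
ignored.

```c
#include <assert.h>
#include <stdbool.h>

#define ETH_FRAME_LEN 1514	/* max Ethernet frame without FCS */
#define ETH_FCS_LEN      4	/* frame check sequence */

/* Hypothetical stand-in for the inputs the quoted snippet looks at. */
struct rx_cfg {
	unsigned long page_size;	/* models PAGE_SIZE */
	bool rsc_enabled;		/* models IXGBE_FLAG2_RSC_ENABLED */
	bool pad_too_small;		/* models IXGBE_2K_TOO_SMALL_WITH_PADDING */
	int max_frame;			/* MTU + ETH_HLEN + ETH_FCS_LEN */
};

/* Returns true when the quoted snippet would set __IXGBE_RX_3K_BUFFER. */
static bool uses_3k_buffer(const struct rx_cfg *cfg)
{
	assert(cfg);

	/* The whole block is compiled out for PAGE_SIZE >= 8192. */
	if (cfg->page_size >= 8192)
		return false;

	if (cfg->rsc_enabled)
		return true;

	return cfg->pad_too_small ||
	       cfg->max_frame > ETH_FRAME_LEN + ETH_FCS_LEN;
}
```

With a 4K page, RSC off and no padding problem, a max_frame of 1518 (MTU
1500) stays on 2K buffers in this model, while anything larger flips the
ring to 3K buffers.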

I am not actively working with ixgbe but I know that you were the main dev
of it, so without a premature dive into the datasheet and codebase, are you
really sure that 3K MTU for XDP is a no-go?

> 
> Changing the MTU to more than 1.5K will allow multi-buffer frames which
> would break things when you try to use XDP_REDIRECT or XDP_TX on frames
> over 1.5K in size. For things like XDP_PASS, XDP_DROP, and XDP_ABORT it
> should still work as long as you don't attempt to reach beyond the 1.5K
> boundary.
> 
> Until this driver supports XDP multi-buffer I don't think you can
> increase the MTU past 1.5K. If you are wanting a larger MTU you should
> look at enabling XDP multi-buffer and then just drop the XDP
> limitations entirely.
> 
> > ---
> > v4:
> > 1) use ':' instead of '-' for kdoc
> > 
> > v3:
> > 1) modify the title and body message.
> > 
> > v2:
> > 1) change the commit message.
> > 2) modify the logic when changing MTU size suggested by Maciej and Alexander.
> > ---
> >  drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++-------
> >  1 file changed, 16 insertions(+), 9 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > index ab8370c413f3..25ca329f7d3c 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > @@ -6777,6 +6777,18 @@ static void ixgbe_free_all_rx_resources(struct ixgbe_adapter *adapter)
> >  			ixgbe_free_rx_resources(adapter->rx_ring[i]);
> >  }
> >  
> > +/**
> > + * ixgbe_max_xdp_frame_size - returns the maximum allowed frame size for XDP
> > + * @adapter: device handle, pointer to adapter
> > + */
> > +static int ixgbe_max_xdp_frame_size(struct ixgbe_adapter *adapter)
> > +{
> > +	if (PAGE_SIZE >= 8192 || adapter->flags2 & IXGBE_FLAG2_RX_LEGACY)
> > +		return IXGBE_RXBUFFER_2K;
> > +	else
> > +		return IXGBE_RXBUFFER_3K;
> > +}
> > +
> 
> There is no difference in the buffer allocation approach for LEGACY vs
> non-legacy. The difference is if we are building the frame around the
> buffer using build_skb or we are adding it as a frag and then copying
> out the header.
> 
> >  /**
> >   * ixgbe_change_mtu - Change the Maximum Transfer Unit
> >   * @netdev: network interface device structure
> > @@ -6788,18 +6800,13 @@ static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
> >  {
> >  	struct ixgbe_adapter *adapter = netdev_priv(netdev);
> >  
> > -	if (adapter->xdp_prog) {
> > +	if (ixgbe_enabled_xdp_adapter(adapter)) {
> >  		int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN +
> >  				     VLAN_HLEN;
> > -		int i;
> > -
> > -		for (i = 0; i < adapter->num_rx_queues; i++) {
> > -			struct ixgbe_ring *ring = adapter->rx_ring[i];
> >  
> > -			if (new_frame_size > ixgbe_rx_bufsz(ring)) {
> > -				e_warn(probe, "Requested MTU size is not supported with XDP\n");
> > -				return -EINVAL;
> > -			}
> > +		if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) {
> > +			e_warn(probe, "Requested MTU size is not supported with XDP\n");
> > +			return -EINVAL;
> >  		}
> >  	}
> >  
>
  
Alexander Duyck Feb. 8, 2023, 6:57 p.m. UTC | #3
On Wed, 2023-02-08 at 17:27 +0100, Maciej Fijalkowski wrote:
> On Wed, Feb 08, 2023 at 07:37:57AM -0800, Alexander H Duyck wrote:
> > On Wed, 2023-02-08 at 10:43 +0800, Jason Xing wrote:
> > > From: Jason Xing <kernelxing@tencent.com>
> > > 
> > > Recently I encountered one case where I cannot increase the MTU size
> > > directly from 1500 to a much bigger value with XDP enabled if the
> > > server is equipped with IXGBE card, which happened on thousands of
> > > servers in production environment. After applying the current patch,
> > > we can set the maximum MTU size to 3K.
> > > 
> > > This patch follows the behavior of changing MTU as i40e/ice does.
> > > 
> > > References:
> > > [1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
> > > [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
> > > 
> > > Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
> > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > 
> > This is based on the broken premise that w/ XDP we are using a 4K page.
> > The ixgbe driver isn't using page pool and is therefore running on
> > different limitations. The ixgbe driver is only using 2K slices of the
> > 4K page. In addition that is reduced to 1.5K to allow for headroom and
> > the shared info in the buffer.
> > 
> > Currently the only way a 3K buffer would work is if FCoE is enabled and
> > in that case the driver is using order 1 pages and still using the
> > split buffer approach.
> 
> Hey Alex, interesting, we based this on the following logic from
> ixgbe_set_rx_buffer_len() I guess:
> 
> #if (PAGE_SIZE < 8192)
> 		if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED)
> 			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> 
> 		if (IXGBE_2K_TOO_SMALL_WITH_PADDING ||
> 		    (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
> 			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> #endif
> 
> so we assumed that ixgbe is no different than i40e/ice in these terms, but
> we ignored whole overhead of LRO/RSC that ixgbe carries.

If XDP is already enabled, LRO/RSC cannot be enabled. I think RSC is
already disabled when XDP is enabled.

> I am not actively working with ixgbe but I know that you were the main dev
> of it, so without premature dive into the datasheet and codebase, are you
> really sure that 3k mtu for XDP is a no go?

I think I mixed up fm10k and ixgbe, either that or I was thinking of
the legacy setup. They all kind of blur together as I had worked on
pretty much all the Intel drivers up to i40e the last time I was
updating them for all the Rx path stuff. :)

So if I am reading things right, the issue is that if XDP is enabled you
cannot set a 3K MTU, but if you set the 3K MTU first then you can
enable XDP after the fact, right?
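
The ordering dependence described here can be pictured with a small
userspace model. It is a simplified sketch under stated assumptions, not the
driver code: struct ring_model, model_rx_bufsz(), old_check_allows() and
new_check_allows() are hypothetical names, the build_skb case of
ixgbe_rx_bufsz() is ignored, and the new limit assumes PAGE_SIZE < 8192 with
non-legacy Rx.

```c
#include <assert.h>
#include <stdbool.h>

#define ETH_HLEN          14
#define ETH_FCS_LEN        4
#define VLAN_HLEN          4
#define IXGBE_RXBUFFER_2K 2048
#define IXGBE_RXBUFFER_3K 3072

/* Hypothetical ring model: only tracks whether 3K buffers are in use. */
struct ring_model {
	bool large_buffer;
};

/* Simplified stand-in for ixgbe_rx_bufsz(); ignores the build_skb case. */
static int model_rx_bufsz(const struct ring_model *ring)
{
	assert(ring);
	return ring->large_buffer ? IXGBE_RXBUFFER_3K : IXGBE_RXBUFFER_2K;
}

/* Same frame-size sum as in ixgbe_change_mtu(). */
static int frame_size(int mtu)
{
	return mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN;
}

/* Pre-patch check: limit is whatever buffer size the ring uses right now. */
static bool old_check_allows(const struct ring_model *ring, int new_mtu)
{
	return frame_size(new_mtu) <= model_rx_bufsz(ring);
}

/* Post-patch check: limit is a fixed ceiling, independent of ring state. */
static bool new_check_allows(int new_mtu)
{
	return frame_size(new_mtu) <= IXGBE_RXBUFFER_3K;
}
```

In this model a ring configured at MTU 1500 sits on 2K buffers, so the old
check refuses MTU 3000 once XDP is attached, whereas a ring that was already
switched to 3K buffers passes the same check. The new check gives the same
answer in both orders.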

Looking it over again after re-reading the code this looks good to me.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
  
Maciej Fijalkowski Feb. 8, 2023, 7:02 p.m. UTC | #4
On Wed, Feb 08, 2023 at 10:57:22AM -0800, Alexander H Duyck wrote:
> On Wed, 2023-02-08 at 17:27 +0100, Maciej Fijalkowski wrote:
> > On Wed, Feb 08, 2023 at 07:37:57AM -0800, Alexander H Duyck wrote:
> > > On Wed, 2023-02-08 at 10:43 +0800, Jason Xing wrote:
> > > > From: Jason Xing <kernelxing@tencent.com>
> > > > 
> > > > Recently I encountered one case where I cannot increase the MTU size
> > > > directly from 1500 to a much bigger value with XDP enabled if the
> > > > server is equipped with IXGBE card, which happened on thousands of
> > > > servers in production environment. After applying the current patch,
> > > > we can set the maximum MTU size to 3K.
> > > > 
> > > > This patch follows the behavior of changing MTU as i40e/ice does.
> > > > 
> > > > References:
> > > > [1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
> > > > [2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
> > > > 
> > > > Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
> > > > Signed-off-by: Jason Xing <kernelxing@tencent.com>
> > > 
> > > This is based on the broken premise that w/ XDP we are using a 4K page.
> > > The ixgbe driver isn't using page pool and is therefore running on
> > > different limitations. The ixgbe driver is only using 2K slices of the
> > > 4K page. In addition that is reduced to 1.5K to allow for headroom and
> > > the shared info in the buffer.
> > > 
> > > Currently the only way a 3K buffer would work is if FCoE is enabled and
> > > in that case the driver is using order 1 pages and still using the
> > > split buffer approach.
> > 
> > Hey Alex, interesting, we based this on the following logic from
> > ixgbe_set_rx_buffer_len() I guess:
> > 
> > #if (PAGE_SIZE < 8192)
> > 		if (adapter->flags2 & IXGBE_FLAG2_RSC_ENABLED)
> > 			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> > 
> > 		if (IXGBE_2K_TOO_SMALL_WITH_PADDING ||
> > 		    (max_frame > (ETH_FRAME_LEN + ETH_FCS_LEN)))
> > 			set_bit(__IXGBE_RX_3K_BUFFER, &rx_ring->state);
> > #endif
> > 
> > so we assumed that ixgbe is no different than i40e/ice in these terms, but
> > we ignored whole overhead of LRO/RSC that ixgbe carries.
> 
> If XDP is already enabled the LRO/RSC cannot be enabled. I think that
> is already disabled if we have XDP enabled.
> 
> > I am not actively working with ixgbe but I know that you were the main dev
> > of it, so without premature dive into the datasheet and codebase, are you
> > really sure that 3k mtu for XDP is a no go?
> 
> I think I mixed up fm10k and ixgbe, either that or I was thinking of
> the legacy setup. They all kind of blur together as I had worked on
> pretty much all the Intel drivers up to i40e the last time I was
> updating them for all the Rx path stuff. :)
> 
> So if I am reading things right the issue is that if XDP is enabled you
> cannot set a 3K MTU, but if you set the 3K MTU first then you can
> enable XDP after the fact right?

Yes, and vice versa: when XDP is on, you should be able to work with
3K MTUs.

> 
> Looking it over again after re-reading the code this looks good to me.
> 
> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>

Awesome :)

> 
> 
> 
> 
>
  
Rout, ChandanX Feb. 14, 2023, 2:23 a.m. UTC | #5
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Jason Xing
>Sent: 08 February 2023 08:14
>To: Brandeburg, Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; davem@davemloft.net;
>edumazet@google.com; kuba@kernel.org; pabeni@redhat.com;
>richardcochran@gmail.com; ast@kernel.org; daniel@iogearbox.net;
>hawk@kernel.org; john.fastabend@gmail.com; Lobakin, Alexandr
><alexandr.lobakin@intel.com>; Fijalkowski, Maciej
><maciej.fijalkowski@intel.com>
>Cc: kerneljasonxing@gmail.com; netdev@vger.kernel.org; linux-
>kernel@vger.kernel.org; intel-wired-lan@lists.osuosl.org;
>bpf@vger.kernel.org; Jason Xing <kernelxing@tencent.com>
>Subject: [Intel-wired-lan] [PATCH net v4 1/3] ixgbe: allow to increase MTU to
>3K with XDP enabled
>
>From: Jason Xing <kernelxing@tencent.com>
>
>Recently I encountered one case where I cannot increase the MTU size
>directly from 1500 to a much bigger value with XDP enabled if the server is
>equipped with IXGBE card, which happened on thousands of servers in
>production environment. After applying the current patch, we can set the
>maximum MTU size to 3K.
>
>This patch follows the behavior of changing MTU as i40e/ice does.
>
>References:
>[1] commit 23b44513c3e6 ("ice: allow 3k MTU for XDP")
>[2] commit 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
>
>Fixes: fabf1bce103a ("ixgbe: Prevent unsupported configurations with XDP")
>Signed-off-by: Jason Xing <kernelxing@tencent.com>
>---
>v4:
>1) use ':' instead of '-' for kdoc
>
>v3:
>1) modify the title and body message.
>
>v2:
>1) change the commit message.
>2) modify the logic when changing MTU size suggested by Maciej and
>Alexander.
>---
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 25 ++++++++++++-------
> 1 file changed, 16 insertions(+), 9 deletions(-)
>

Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
  

Patch

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index ab8370c413f3..25ca329f7d3c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6777,6 +6777,18 @@  static void ixgbe_free_all_rx_resources(struct ixgbe_adapter *adapter)
 			ixgbe_free_rx_resources(adapter->rx_ring[i]);
 }
 
+/**
+ * ixgbe_max_xdp_frame_size - returns the maximum allowed frame size for XDP
+ * @adapter: device handle, pointer to adapter
+ */
+static int ixgbe_max_xdp_frame_size(struct ixgbe_adapter *adapter)
+{
+	if (PAGE_SIZE >= 8192 || adapter->flags2 & IXGBE_FLAG2_RX_LEGACY)
+		return IXGBE_RXBUFFER_2K;
+	else
+		return IXGBE_RXBUFFER_3K;
+}
+
 /**
  * ixgbe_change_mtu - Change the Maximum Transfer Unit
  * @netdev: network interface device structure
@@ -6788,18 +6800,13 @@  static int ixgbe_change_mtu(struct net_device *netdev, int new_mtu)
 {
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
 
-	if (adapter->xdp_prog) {
+	if (ixgbe_enabled_xdp_adapter(adapter)) {
 		int new_frame_size = new_mtu + ETH_HLEN + ETH_FCS_LEN +
 				     VLAN_HLEN;
-		int i;
-
-		for (i = 0; i < adapter->num_rx_queues; i++) {
-			struct ixgbe_ring *ring = adapter->rx_ring[i];
 
-			if (new_frame_size > ixgbe_rx_bufsz(ring)) {
-				e_warn(probe, "Requested MTU size is not supported with XDP\n");
-				return -EINVAL;
-			}
+		if (new_frame_size > ixgbe_max_xdp_frame_size(adapter)) {
+			e_warn(probe, "Requested MTU size is not supported with XDP\n");
+			return -EINVAL;
 		}
 	}
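
As a worked example of what the new limit means in MTU terms, the
arithmetic from ixgbe_change_mtu() can be inverted in a few lines of
userspace C. This is a sketch only: max_mtu_for_limit() is a hypothetical
helper, and the constants are assumed to match the kernel's
ETH_HLEN/ETH_FCS_LEN/VLAN_HLEN and the ixgbe IXGBE_RXBUFFER_2K/3K
definitions.

```c
#include <assert.h>

#define ETH_HLEN          14	/* Ethernet header */
#define ETH_FCS_LEN        4	/* frame check sequence */
#define VLAN_HLEN          4	/* one 802.1Q tag */
#define IXGBE_RXBUFFER_2K 2048
#define IXGBE_RXBUFFER_3K 3072

/* Per-frame overhead added to the MTU in ixgbe_change_mtu(). */
#define XDP_FRAME_OVERHEAD (ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN)

/*
 * Hypothetical helper: largest MTU whose frame size
 * (MTU + XDP_FRAME_OVERHEAD) still fits within the given limit.
 */
static int max_mtu_for_limit(int limit)
{
	assert(limit > XDP_FRAME_OVERHEAD);
	return limit - XDP_FRAME_OVERHEAD;
}
```

Under these assumptions, max_mtu_for_limit(IXGBE_RXBUFFER_3K) gives 3050 and
max_mtu_for_limit(IXGBE_RXBUFFER_2K) gives 2026, so "3K" in the subject line
is a frame-size ceiling and the usable MTU is slightly lower.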