[02/15] can: m_can: Wakeup net queue once tx was issued

Message ID 20221116205308.2996556-3-msp@baylibre.com
State New
Headers
Series can: m_can: Optimizations for tcan and peripheral chips |

Commit Message

Markus Schneider-Pargmann Nov. 16, 2022, 8:52 p.m. UTC
  Currently the driver waits to wakeup the queue until the interrupt for
the transmit event is received and acknowledged. If we want to use the
hardware FIFO, this is too late.

Instead release the queue as soon as the transmit was transferred into
the hardware FIFO. We are then ready for the next transmit to be
transferred.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
---
 drivers/net/can/m_can/m_can.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
  

Comments

Marc Kleine-Budde Nov. 30, 2022, 5:21 p.m. UTC | #1
On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> Currently the driver waits to wakeup the queue until the interrupt for
> the transmit event is received and acknowledged. If we want to use the
> hardware FIFO, this is too late.
> 
> Instead release the queue as soon as the transmit was transferred into
> the hardware FIFO. We are then ready for the next transmit to be
> transferred.

If you want to really speed up the TX path, remove the worker and use
the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().

Extra bonus if you implement xmit_more() and transfer more than 1 skb
per SPI transfer.

Marc
  
Markus Schneider-Pargmann Dec. 1, 2022, 8:43 a.m. UTC | #2
Hi Marc,

On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > Currently the driver waits to wakeup the queue until the interrupt for
> > the transmit event is received and acknowledged. If we want to use the
> > hardware FIFO, this is too late.
> > 
> > Instead release the queue as soon as the transmit was transferred into
> > the hardware FIFO. We are then ready for the next transmit to be
> > transferred.
> 
> If you want to really speed up the TX path, remove the worker and use
> the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().

Good idea. I will check how regmap's async_write works and if it is
suitable to do the job. I don't want to drop the regmap usage for this
right now.

> 
> Extra bonus if you implement xmit_more() and transfer more than 1 skb
> per SPI transfer.

That's on my todo list, but I am not sure I will get to it soonish.

Thank you Marc!.

Best,
Markus
  
Marc Kleine-Budde Dec. 1, 2022, 9:16 a.m. UTC | #3
On 01.12.2022 09:43:02, Markus Schneider-Pargmann wrote:
> Hi Marc,
> 
> On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > Currently the driver waits to wakeup the queue until the interrupt for
> > > the transmit event is received and acknowledged. If we want to use the
> > > hardware FIFO, this is too late.
> > > 
> > > Instead release the queue as soon as the transmit was transferred into
> > > the hardware FIFO. We are then ready for the next transmit to be
> > > transferred.
> > 
> > If you want to really speed up the TX path, remove the worker and use
> > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> 
> Good idea. I will check how regmap's async_write works and if it is
> suitable to do the job. I don't want to drop the regmap usage for this
> right now.

IIRC regmap async write still uses mutexes, but sleeping is not allowed
in the xmit handler. The mcp251xfd driver does the endianness conversion
(and the optional CRC) manually for the TX path.

Sending directly from the xmit handler basically eliminates the queuing
between the network stack and the worker. Getting rid of the worker
makes life easier and it's faster anyways.

> > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > per SPI transfer.
> 
> That's on my todo list, but I am not sure I will get to it soonish.

I haven't implemented this for the mcp251xfd, yet, but I have some
proof-of-concept code somewhere. However, the mcp251xfd driver already
implemented byte queue limits: 0084e298acfe ("can: mcp251xfd: add BQL
support").

regards,
Marc
  
Markus Schneider-Pargmann Dec. 1, 2022, 4:49 p.m. UTC | #4
Hi Marc,

On Thu, Dec 01, 2022 at 10:16:05AM +0100, Marc Kleine-Budde wrote:
> On 01.12.2022 09:43:02, Markus Schneider-Pargmann wrote:
> > Hi Marc,
> > 
> > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > the transmit event is received and acknowledged. If we want to use the
> > > > hardware FIFO, this is too late.
> > > > 
> > > > Instead release the queue as soon as the transmit was transferred into
> > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > transferred.
> > > 
> > > If you want to really speed up the TX path, remove the worker and use
> > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > 
> > Good idea. I will check how regmap's async_write works and if it is
> > suitable to do the job. I don't want to drop the regmap usage for this
> > right now.
> 
> IIRC regmap async write still uses mutexes, but sleeping is not allowed
> in the xmit handler. The mcp251xfd driver does the endianness conversion
> (and the optional CRC) manually for the TX path.

I just saw, you can force regmap to use spinlocks as well. But it uses
the same operation for sync operations as well.

> 
> Sending directly from the xmit handler basically eliminates the queuing
> between the network stack and the worker. Getting rid of the worker
> makes life easier and it's faster anyways.

The current implementation of the driver doesn't really queue anything
between the network stack and the worker. It is a queue of size 1 ;).

To be honest I would rather focus on the other things than on getting
rid of the worker completely as this can be done in a separate patch
later as well. Yes I agree it would be nice to get rid of the worker but
it is also probably not a major bottleneck for the performance and in
its current state it works. If I have time left at the end I will be
more than happy to do that. But for the moment I would just keep the
worker as it is. Is that OK for you?

Thanks,
Markus

> 
> > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > per SPI transfer.
> > 
> > That's on my todo list, but I am not sure I will get to it soonish.
> 
> I haven't implemented this for the mcp251xfd, yet, but I have some
> proof-of-concept code somewhere. However, the mcp251xfd driver already
> implemented byte queue limits: 0084e298acfe ("can: mcp251xfd: add BQL
> support").
> 
> regards,
> Marc
> 
> -- 
> Pengutronix e.K.                 | Marc Kleine-Budde           |
> Embedded Linux                   | https://www.pengutronix.de  |
> Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
> Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |
  
Marc Kleine-Budde Dec. 2, 2022, 9:16 a.m. UTC | #5
On 01.12.2022 17:49:02, Markus Schneider-Pargmann wrote:
> Hi Marc,
> 
> On Thu, Dec 01, 2022 at 10:16:05AM +0100, Marc Kleine-Budde wrote:
> > On 01.12.2022 09:43:02, Markus Schneider-Pargmann wrote:
> > > Hi Marc,
> > > 
> > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > hardware FIFO, this is too late.
> > > > > 
> > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > transferred.
> > > > 
> > > > If you want to really speed up the TX path, remove the worker and use
> > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > 
> > > Good idea. I will check how regmap's async_write works and if it is
> > > suitable to do the job. I don't want to drop the regmap usage for this
> > > right now.
> > 
> > IIRC regmap async write still uses mutexes, but sleeping is not allowed
> > in the xmit handler. The mcp251xfd driver does the endianness conversion
> > (and the optional CRC) manually for the TX path.
> 
> I just saw, you can force regmap to use spinlocks as well. But it uses
> the same operation for sync operations as well.

But you cannot use sync SPI api under a spinlock.

> > Sending directly from the xmit handler basically eliminates the queuing
> > between the network stack and the worker. Getting rid of the worker
> > makes life easier and it's faster anyways.
> 
> The current implementation of the driver doesn't really queue anything
> between the network stack and the worker. It is a queue of size 1 ;).

Ok

> To be honest I would rather focus on the other things than on getting
> rid of the worker completely as this can be done in a separate patch
> later as well. Yes I agree it would be nice to get rid of the worker but
> it is also probably not a major bottleneck for the performance and in
> its current state it works. If I have time left at the end I will be
> more than happy to do that. But for the moment I would just keep the
> worker as it is. Is that OK for you?

Sure.

Marc
  
Marc Kleine-Budde Dec. 2, 2022, 1:53 p.m. UTC | #6
On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> Currently the driver waits to wakeup the queue until the interrupt for
> the transmit event is received and acknowledged. If we want to use the
> hardware FIFO, this is too late.
> 
> Instead release the queue as soon as the transmit was transferred into
> the hardware FIFO. We are then ready for the next transmit to be
> transferred.
> 
> Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
> ---
>  drivers/net/can/m_can/m_can.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> index 2c01e3f7b23f..4adf03111782 100644
> --- a/drivers/net/can/m_can/m_can.c
> +++ b/drivers/net/can/m_can/m_can.c
> @@ -1097,10 +1097,9 @@ static irqreturn_t m_can_isr(int irq, void *dev_id)
>  			/* New TX FIFO Element arrived */
>  			if (m_can_echo_tx_event(dev) != 0)
>  				goto out_fail;
> -
> -			if (netif_queue_stopped(dev) &&
> -			    !m_can_tx_fifo_full(cdev))
> +			if (!cdev->tx_skb && netif_queue_stopped(dev))
>  				netif_wake_queue(dev);

Please don't start the queue if the FIFO is still full. Is this a
gamble, that it will take long enough until the work queue runs that the
FIFO is not full anymore?

> +

Nitpick: Please don't introduce an extra newline here.

>  		}
>  	}
>  
> @@ -1705,6 +1704,8 @@ static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
>  		if (m_can_tx_fifo_full(cdev) ||
>  		    m_can_next_echo_skb_occupied(dev, putidx))
>  			netif_stop_queue(dev);
> +		else if (cdev->is_peripheral && !cdev->tx_skb && netif_queue_stopped(dev))
> +			netif_wake_queue(dev);

Same question as above, what happens if the FIFO is full? e.g. in case
of a slow bus or the first CAN frame in the FIFO has a low prio...

>  	}
>  
>  	return NETDEV_TX_OK;
> -- 
> 2.38.1
> 
>

Marc
  
Markus Schneider-Pargmann Dec. 14, 2022, 9:14 a.m. UTC | #7
Hi Marc,

On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > Currently the driver waits to wakeup the queue until the interrupt for
> > the transmit event is received and acknowledged. If we want to use the
> > hardware FIFO, this is too late.
> > 
> > Instead release the queue as soon as the transmit was transferred into
> > the hardware FIFO. We are then ready for the next transmit to be
> > transferred.
> 
> If you want to really speed up the TX path, remove the worker and use
> the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> 
> Extra bonus if you implement xmit_more() and transfer more than 1 skb
> per SPI transfer.

Just a quick question here, I mplemented a xmit_more() call and I am
testing it right now, but it always returns false even under high
pressure. The device has a txqueuelen set to 1000. Do I need to turn
some other knob for this to work?

Thanks,
Markus
  
Marc Kleine-Budde Dec. 14, 2022, 9:18 a.m. UTC | #8
On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> Hi Marc,
> 
> On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > Currently the driver waits to wakeup the queue until the interrupt for
> > > the transmit event is received and acknowledged. If we want to use the
> > > hardware FIFO, this is too late.
> > > 
> > > Instead release the queue as soon as the transmit was transferred into
> > > the hardware FIFO. We are then ready for the next transmit to be
> > > transferred.
> > 
> > If you want to really speed up the TX path, remove the worker and use
> > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > 
> > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > per SPI transfer.
> 
> Just a quick question here, I mplemented a xmit_more() call and I am
> testing it right now, but it always returns false even under high
> pressure. The device has a txqueuelen set to 1000. Do I need to turn
> some other knob for this to work?

AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").

The etas_es58x driver implements xmit_more(), I added the Author Vincent
on Cc.

Marc
  
Marc Kleine-Budde Dec. 14, 2022, 9:22 a.m. UTC | #9
On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > Hi Marc,
> > 
> > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > the transmit event is received and acknowledged. If we want to use the
> > > > hardware FIFO, this is too late.
> > > > 
> > > > Instead release the queue as soon as the transmit was transferred into
> > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > transferred.
> > > 
> > > If you want to really speed up the TX path, remove the worker and use
> > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > 
> > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > per SPI transfer.
> > 
> > Just a quick question here, I mplemented a xmit_more() call and I am
> > testing it right now, but it always returns false even under high
> > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > some other knob for this to work?
> 
> AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> 
> The etas_es58x driver implements xmit_more(), I added the Author Vincent
> on Cc.

Have a look at netdev_queue_set_dql_min_limit() in the etas driver.

Marc
  
Vincent Mailhol Dec. 14, 2022, 10:15 a.m. UTC | #10
On Wed. 14 Dec. 2022 at 18:28, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> > On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > > Hi Marc,
> > >
> > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > hardware FIFO, this is too late.
> > > > >
> > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > transferred.
> > > >
> > > > If you want to really speed up the TX path, remove the worker and use
> > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > >
> > > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > > per SPI transfer.
> > >
> > > Just a quick question here, I mplemented a xmit_more() call and I am
> > > testing it right now, but it always returns false even under high
> > > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > > some other knob for this to work?

I was the first to use BQL in a CAN driver. It also took me time to
first figure out the existence of xmit_more() and even more to
understand how to make it so that it would return true.

> > AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> >
> > The etas_es58x driver implements xmit_more(), I added the Author Vincent
> > on Cc.
>
> Have a look at netdev_queue_set_dql_min_limit() in the etas driver.

The functions you need are the netdev_send_queue() and the
netdev_complete_queue():

  https://elixir.bootlin.com/linux/latest/source/include/linux/netdevice.h#L3424

For CAN, you probably want to have a look to can_skb_get_frame_len().

  https://elixir.bootlin.com/linux/latest/source/include/linux/can/length.h#L166

The netdev_queue_set_dql_min_limit() gives hints by setting a minimum
value for BQL. It is optional (and as of today I am the only user of
it).


Yours sincerely,
Vincent Mailhol
  
Markus Schneider-Pargmann Dec. 14, 2022, 10:18 a.m. UTC | #11
On Wed, Dec 14, 2022 at 10:22:01AM +0100, Marc Kleine-Budde wrote:
> On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> > On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > > Hi Marc,
> > > 
> > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > hardware FIFO, this is too late.
> > > > > 
> > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > transferred.
> > > > 
> > > > If you want to really speed up the TX path, remove the worker and use
> > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > > 
> > > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > > per SPI transfer.
> > > 
> > > Just a quick question here, I mplemented a xmit_more() call and I am
> > > testing it right now, but it always returns false even under high
> > > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > > some other knob for this to work?
> > 
> > AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> > 
> > The etas_es58x driver implements xmit_more(), I added the Author Vincent
> > on Cc.
> 
> Have a look at netdev_queue_set_dql_min_limit() in the etas driver.

Thank you, I need to implement BQL, I wasn't aware of that. I may need
to delay this a bit just to get some feedback on the current state of my
patches, so I know what I need to work on first.

But it seems to be just accounting... Anyways, thank you :)

Best,
Markus
  
Markus Schneider-Pargmann Dec. 14, 2022, 10:35 a.m. UTC | #12
Hi Vincent,

On Wed, Dec 14, 2022 at 07:15:25PM +0900, Vincent MAILHOL wrote:
> On Wed. 14 Dec. 2022 at 18:28, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> > On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> > > On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > > > Hi Marc,
> > > >
> > > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > > hardware FIFO, this is too late.
> > > > > >
> > > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > > transferred.
> > > > >
> > > > > If you want to really speed up the TX path, remove the worker and use
> > > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > > >
> > > > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > > > per SPI transfer.
> > > >
> > > > Just a quick question here, I mplemented a xmit_more() call and I am
> > > > testing it right now, but it always returns false even under high
> > > > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > > > some other knob for this to work?
> 
> I was the first to use BQL in a CAN driver. It also took me time to
> first figure out the existence of xmit_more() and even more to
> understand how to make it so that it would return true.
> 
> > > AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> > >
> > > The etas_es58x driver implements xmit_more(), I added the Author Vincent
> > > on Cc.
> >
> > Have a look at netdev_queue_set_dql_min_limit() in the etas driver.
> 
> The functions you need are the netdev_send_queue() and the
> netdev_complete_queue():
> 
>   https://elixir.bootlin.com/linux/latest/source/include/linux/netdevice.h#L3424
> 
> For CAN, you probably want to have a look to can_skb_get_frame_len().
> 
>   https://elixir.bootlin.com/linux/latest/source/include/linux/can/length.h#L166
> 
> The netdev_queue_set_dql_min_limit() gives hints by setting a minimum
> value for BQL. It is optional (and as of today I am the only user of
> it).

Thank you for this summary, great that you already invested the time to
make it work with a CAN driver. I will give it a try in the m_can
driver.

Best,
Markus
  
Markus Schneider-Pargmann Dec. 15, 2022, 9:31 a.m. UTC | #13
Hi,

On Wed, Dec 14, 2022 at 11:35:43AM +0100, Markus Schneider-Pargmann wrote:
> Hi Vincent,
> 
> On Wed, Dec 14, 2022 at 07:15:25PM +0900, Vincent MAILHOL wrote:
> > On Wed. 14 Dec. 2022 at 18:28, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> > > On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> > > > On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > > > > Hi Marc,
> > > > >
> > > > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > > > hardware FIFO, this is too late.
> > > > > > >
> > > > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > > > transferred.
> > > > > >
> > > > > > If you want to really speed up the TX path, remove the worker and use
> > > > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > > > >
> > > > > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > > > > per SPI transfer.
> > > > >
> > > > > Just a quick question here, I mplemented a xmit_more() call and I am
> > > > > testing it right now, but it always returns false even under high
> > > > > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > > > > some other knob for this to work?
> > 
> > I was the first to use BQL in a CAN driver. It also took me time to
> > first figure out the existence of xmit_more() and even more to
> > understand how to make it so that it would return true.
> > 
> > > > AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> > > >
> > > > The etas_es58x driver implements xmit_more(), I added the Author Vincent
> > > > on Cc.
> > >
> > > Have a look at netdev_queue_set_dql_min_limit() in the etas driver.
> > 
> > The functions you need are the netdev_send_queue() and the
> > netdev_complete_queue():
> > 
> >   https://elixir.bootlin.com/linux/latest/source/include/linux/netdevice.h#L3424
> > 
> > For CAN, you probably want to have a look to can_skb_get_frame_len().
> > 
> >   https://elixir.bootlin.com/linux/latest/source/include/linux/can/length.h#L166
> > 
> > The netdev_queue_set_dql_min_limit() gives hints by setting a minimum
> > value for BQL. It is optional (and as of today I am the only user of
> > it).
> 
> Thank you for this summary, great that you already invested the time to
> make it work with a CAN driver. I will give it a try in the m_can
> driver.

Thanks again, it looks like it is working after adding netdev_sent_queue
and netdev_complete_queue.

Best,
Markus
  
Vincent Mailhol Dec. 16, 2022, 4:40 a.m. UTC | #14
On Thu. 15 Dec. 2022 at 18:44, Markus Schneider-Pargmann
<msp@baylibre.com> wrote:
> Hi,
>
> On Wed, Dec 14, 2022 at 11:35:43AM +0100, Markus Schneider-Pargmann wrote:
> > Hi Vincent,
> >
> > On Wed, Dec 14, 2022 at 07:15:25PM +0900, Vincent MAILHOL wrote:
> > > On Wed. 14 Dec. 2022 at 18:28, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> > > > On 14.12.2022 10:18:20, Marc Kleine-Budde wrote:
> > > > > On 14.12.2022 10:14:06, Markus Schneider-Pargmann wrote:
> > > > > > Hi Marc,
> > > > > >
> > > > > > On Wed, Nov 30, 2022 at 06:21:00PM +0100, Marc Kleine-Budde wrote:
> > > > > > > On 16.11.2022 21:52:55, Markus Schneider-Pargmann wrote:
> > > > > > > > Currently the driver waits to wakeup the queue until the interrupt for
> > > > > > > > the transmit event is received and acknowledged. If we want to use the
> > > > > > > > hardware FIFO, this is too late.
> > > > > > > >
> > > > > > > > Instead release the queue as soon as the transmit was transferred into
> > > > > > > > the hardware FIFO. We are then ready for the next transmit to be
> > > > > > > > transferred.
> > > > > > >
> > > > > > > If you want to really speed up the TX path, remove the worker and use
> > > > > > > the spi_async() API from the xmit callback, see mcp251xfd_start_xmit().
> > > > > > >
> > > > > > > Extra bonus if you implement xmit_more() and transfer more than 1 skb
> > > > > > > per SPI transfer.
> > > > > >
> > > > > > Just a quick question here, I mplemented a xmit_more() call and I am
> > > > > > testing it right now, but it always returns false even under high
> > > > > > pressure. The device has a txqueuelen set to 1000. Do I need to turn
> > > > > > some other knob for this to work?
> > >
> > > I was the first to use BQL in a CAN driver. It also took me time to
> > > first figure out the existence of xmit_more() and even more to
> > > understand how to make it so that it would return true.
> > >
> > > > > AFAIK you need BQL support: see 0084e298acfe ("can: mcp251xfd: add BQL support").
> > > > >
> > > > > The etas_es58x driver implements xmit_more(), I added the Author Vincent
> > > > > on Cc.
> > > >
> > > > Have a look at netdev_queue_set_dql_min_limit() in the etas driver.
> > >
> > > The functions you need are the netdev_send_queue() and the
> > > netdev_complete_queue():
> > >
> > >   https://elixir.bootlin.com/linux/latest/source/include/linux/netdevice.h#L3424
> > >
> > > For CAN, you probably want to have a look to can_skb_get_frame_len().
> > >
> > >   https://elixir.bootlin.com/linux/latest/source/include/linux/can/length.h#L166
> > >
> > > The netdev_queue_set_dql_min_limit() gives hints by setting a minimum
> > > value for BQL. It is optional (and as of today I am the only user of
> > > it).
> >
> > Thank you for this summary, great that you already invested the time to
> > make it work with a CAN driver. I will give it a try in the m_can
> > driver.
>
> Thanks again, it looks like it is working after adding netdev_sent_queue
> and netdev_complete_queue.

Happy to hear that :)

A quick advice just in case: make sure to test the different error
branches (failed TX, can_restart() after bus off…).


Yours sincerely,
Vincent Mailhol
  

Patch

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 2c01e3f7b23f..4adf03111782 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -1097,10 +1097,9 @@  static irqreturn_t m_can_isr(int irq, void *dev_id)
 			/* New TX FIFO Element arrived */
 			if (m_can_echo_tx_event(dev) != 0)
 				goto out_fail;
-
-			if (netif_queue_stopped(dev) &&
-			    !m_can_tx_fifo_full(cdev))
+			if (!cdev->tx_skb && netif_queue_stopped(dev))
 				netif_wake_queue(dev);
+
 		}
 	}
 
@@ -1705,6 +1704,8 @@  static netdev_tx_t m_can_tx_handler(struct m_can_classdev *cdev)
 		if (m_can_tx_fifo_full(cdev) ||
 		    m_can_next_echo_skb_occupied(dev, putidx))
 			netif_stop_queue(dev);
+		else if (cdev->is_peripheral && !cdev->tx_skb && netif_queue_stopped(dev))
+			netif_wake_queue(dev);
 	}
 
 	return NETDEV_TX_OK;