[04/15] can: m_can: Use transmit event FIFO watermark level interrupt

Message ID 20221116205308.2996556-5-msp@baylibre.com
State New
Series can: m_can: Optimizations for tcan and peripheral chips

Commit Message

Markus Schneider-Pargmann Nov. 16, 2022, 8:52 p.m. UTC
Currently the only mode of operation is an interrupt for every transmit
event. This is inefficient for peripheral chips. Use the transmit event
FIFO watermark interrupt instead if the FIFO size is more than 2. Use
FIFOsize - 1 for the watermark so the interrupt is triggered early
enough to not stop transmitting.

Note that if the number of transmits is less than the watermark level,
the transmit events will not be processed until there is any other
interrupt. This will only affect statistic counters. Also there is an
interrupt every time the timestamp wraps around.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
---
 drivers/net/can/m_can/m_can.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)
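
The watermark rule from the commit message could be sketched roughly as
below. This is not the code from the patch: txe_watermark() is a
hypothetical helper, and returning 0 to mean "keep the per-event
interrupt" is a convention of this sketch only.

```c
/*
 * Hypothetical helper, not taken from the patch: pick the transmit
 * event FIFO watermark as described in the commit message.
 *
 * Returns 0 when the FIFO is too small for a watermark to help; in
 * this sketch 0 means "stay with one interrupt per transmit event".
 */
static unsigned int txe_watermark(unsigned int fifo_size)
{
	/* A FIFO of 1 or 2 entries keeps the per-event interrupt. */
	if (fifo_size <= 2)
		return 0;

	/* Fire one entry before the FIFO is full so TX does not stall. */
	return fifo_size - 1;
}
```

With a 32-entry FIFO this gives a watermark of 31, so the interrupt
arrives while one slot is still free.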
  

Comments

Marc Kleine-Budde Nov. 30, 2022, 5:17 p.m. UTC | #1
On 16.11.2022 21:52:57, Markus Schneider-Pargmann wrote:
> Currently the only mode of operation is an interrupt for every transmit
> event. This is inefficient for peripheral chips. Use the transmit FIFO
> event watermark interrupt instead if the FIFO size is more than 2. Use
> FIFOsize - 1 for the watermark so the interrupt is triggered early
> enough to not stop transmitting.
> 
> Note that if the number of transmits is less than the watermark level,
> the transmit events will not be processed until there is any other
> interrupt. This will only affect statistic counters. Also there is an
> interrupt every time the timestamp wraps around.
> 
> Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>

Please make this configurable with the ethtool TX IRQ coalescing
parameter. Please setup an hwtimer to enable the regular interrupt after
some configurable time to avoid starving of the TX complete events.

I've implemented this for the mcp251xfd driver, see:

656fc12ddaf8 ("can: mcp251xfd: add TX IRQ coalescing ethtool support")
169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing support")
846990e0ed82 ("can: mcp251xfd: add RX IRQ coalescing ethtool support")
60a848c50d2d ("can: mcp251xfd: add RX IRQ coalescing support")
9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")

Marc
  
Markus Schneider-Pargmann Dec. 1, 2022, 8:25 a.m. UTC | #2
Hi Marc,

Thanks for reviewing.

On Wed, Nov 30, 2022 at 06:17:15PM +0100, Marc Kleine-Budde wrote:
> On 16.11.2022 21:52:57, Markus Schneider-Pargmann wrote:
> > Currently the only mode of operation is an interrupt for every transmit
> > event. This is inefficient for peripheral chips. Use the transmit FIFO
> > event watermark interrupt instead if the FIFO size is more than 2. Use
> > FIFOsize - 1 for the watermark so the interrupt is triggered early
> > enough to not stop transmitting.
> > 
> > Note that if the number of transmits is less than the watermark level,
> > the transmit events will not be processed until there is any other
> > interrupt. This will only affect statistic counters. Also there is an
> > interrupt every time the timestamp wraps around.
> > 
> > Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
> 
> Please make this configurable with the ethtool TX IRQ coalescing
> parameter. Please setup an hwtimer to enable the regular interrupt after
> some configurable time to avoid starving of the TX complete events.

I guess hwtimer==hrtimer?

I thought about setting up a timer but decided against it as the TX
completion events are only used to update statistics of the interface,
as far as I can tell. I can implement a timer as well.

For the upcoming receive side patch I already added a hrtimer. I may try
to use the same timer for both directions as it is going to do the exact
same thing in both cases (call the interrupt routine). Of course that
depends on the details of the coalescing support. Any objections on
that?

> I've implemented this for the mcp251xfd driver, see:
> 
> 656fc12ddaf8 ("can: mcp251xfd: add TX IRQ coalescing ethtool support")
> 169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing support")
> 846990e0ed82 ("can: mcp251xfd: add RX IRQ coalescing ethtool support")
> 60a848c50d2d ("can: mcp251xfd: add RX IRQ coalescing support")
> 9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")

Thanks for the pointers. I will have a look and try to implement it
similarly.

Best,
Markus
  
Marc Kleine-Budde Dec. 1, 2022, 9:05 a.m. UTC | #3
On 01.12.2022 09:25:21, Markus Schneider-Pargmann wrote:
> Hi Marc,
> 
> Thanks for reviewing.
> 
> On Wed, Nov 30, 2022 at 06:17:15PM +0100, Marc Kleine-Budde wrote:
> > On 16.11.2022 21:52:57, Markus Schneider-Pargmann wrote:
> > > Currently the only mode of operation is an interrupt for every transmit
> > > event. This is inefficient for peripheral chips. Use the transmit FIFO
> > > event watermark interrupt instead if the FIFO size is more than 2. Use
> > > FIFOsize - 1 for the watermark so the interrupt is triggered early
> > > enough to not stop transmitting.
> > > 
> > > Note that if the number of transmits is less than the watermark level,
> > > the transmit events will not be processed until there is any other
> > > interrupt. This will only affect statistic counters. Also there is an
> > > interrupt every time the timestamp wraps around.
> > > 
> > > Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
> > 
> > Please make this configurable with the ethtool TX IRQ coalescing
> > parameter. Please setup an hwtimer to enable the regular interrupt after
> > some configurable time to avoid starving of the TX complete events.
> 
> I guess hwtimer==hrtimer?

Sorry, yes!

> I thought about setting up a timer but decided against it as the TX
> completion events are only used to update statistics of the interface,
> as far as I can tell. I can implement a timer as well.

It's not only statistics, the sending socket can opt in to receive the
sent CAN frame on successful transmission. Other sockets will (by
default) receive successfully sent CAN frames. The idea is that the
other sockets see the same CAN bus, no matter if they are on a different
system receiving the CAN frame via the bus or on the same system
receiving the CAN frame as soon as it has been sent to the bus.
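
For illustration, the opt-in on the sending socket is done with the
CAN_RAW_RECV_OWN_MSGS socket option of the CAN_RAW protocol. A minimal
sketch, where the wrapper name can_raw_recv_own_msgs() is made up for
this example and fd is assumed to be a CAN_RAW socket:

```c
#include <sys/socket.h>
#include <linux/can.h>
#include <linux/can/raw.h>

/*
 * Hypothetical wrapper: let a CAN_RAW socket receive its own frames
 * once they have been sent (enable != 0) or stop doing so (enable == 0).
 *
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 */
static int can_raw_recv_own_msgs(int fd, int enable)
{
	return setsockopt(fd, SOL_CAN_RAW, CAN_RAW_RECV_OWN_MSGS,
			  &enable, sizeof(enable));
}
```

Both this echo and the delivery to other local sockets depend on the
driver reporting TX completion, which is why delayed TX complete events
are visible to user space.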

> For the upcoming receive side patch I already added a hrtimer. I may try
> to use the same timer for both directions as it is going to do the exact
> same thing in both cases (call the interrupt routine). Of course that
> depends on the details of the coalescing support. Any objections on
> that?

For the mcp251xfd I implemented the RX and TX coalescing independent of
each other and made it configurable via ethtool's IRQ coalescing
options.

The hardware doesn't support any timeouts, it only provides FIFO not
empty, FIFO half full and FIFO full IRQs, and the on-chip RAM for
mailboxes is rather limited. I think the mcan core has the same
limitations.

The configuration for the mcp251xfd looks like this:

- First decide for classical CAN or CAN-FD mode
- configure RX and TX ring size
  9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
  For TX only a single FIFO is used.
  For RX up to 3 FIFOs (up to a depth of 32 each).
  FIFO depth is limited to power of 2.
  On the mcan cores this is currently done with a DT property.
  Runtime configurable ring size is optional but gives more flexibility
  for our use-cases due to limited RAM size.
- configure RX and TX coalescing via ethtool
  Set a timeout and the max CAN frames to coalesce.
  The max frames are limited to half or full FIFO.

How does coalescing work?

If coalescing is activated, the FIFO not empty IRQ is disabled while
the RX'ed frames are read (the half or full IRQ stays enabled). After
handling the RX'ed frames a hrtimer is started. In the hrtimer's
function the FIFO not empty IRQ is enabled again.

I decided not to call the IRQ handler from the hrtimer to avoid
concurrency, but enable the FIFO not empty IRQ.
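
The sequence described above can be modelled in plain C roughly like
this. It is a user-space sketch, not the mcp251xfd code: the struct,
its fields and both function names are invented for illustration, and
the hrtimer is reduced to a flag.

```c
#include <stdbool.h>

/* Hypothetical model of the RX coalescing state. */
struct rx_coalesce_state {
	bool coalescing;	/* coalescing configured via ethtool */
	bool irq_not_empty;	/* "FIFO not empty" IRQ enabled */
	bool irq_watermark;	/* "FIFO half full/full" IRQ enabled */
	bool timer_armed;	/* stand-in for a started hrtimer */
};

/* Runs when the RX IRQ is handled. */
static void rx_handler(struct rx_coalesce_state *s)
{
	/* The per-frame IRQ goes off; the watermark IRQ is not touched. */
	if (s->coalescing)
		s->irq_not_empty = false;

	/* ... read and handle the RX'ed frames here ... */

	/* Start the timeout only after the frames have been handled. */
	if (s->coalescing)
		s->timer_armed = true;
}

/* Runs when the timer expires: no frame handling, only re-enable. */
static void rx_timer_expired(struct rx_coalesce_state *s)
{
	s->timer_armed = false;
	s->irq_not_empty = true;
}
```

Because the timer callback only flips the IRQ enable, all frame
handling stays in the IRQ handler and cannot run concurrently with it.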

> > I've implemented this for the mcp251xfd driver, see:
> > 
> > 656fc12ddaf8 ("can: mcp251xfd: add TX IRQ coalescing ethtool support")
> > 169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing support")
> > 846990e0ed82 ("can: mcp251xfd: add RX IRQ coalescing ethtool support")
> > 60a848c50d2d ("can: mcp251xfd: add RX IRQ coalescing support")
> > 9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> 
> Thanks for the pointers. I will have a look and try to implement it
> similarly.

If you want to implement runtime configurable ring size, I created a
function to help in the calculation of the ring sizes:

a1439a5add62 ("can: mcp251xfd: ram: add helper function for runtime ring size calculation")

The code is part of the mcp251xfd driver, but is prepared to become a
generic helper function. The HW parameters are described with struct
can_ram_config and you use can_ram_get_layout() to get a valid RAM
layout based on CAN/CAN-FD ring size and coalescing parameters.

regards,
Marc
  
Markus Schneider-Pargmann Dec. 1, 2022, 10:12 a.m. UTC | #4
Hi Marc,

On Thu, Dec 01, 2022 at 10:05:08AM +0100, Marc Kleine-Budde wrote:
> On 01.12.2022 09:25:21, Markus Schneider-Pargmann wrote:
> > Hi Marc,
> > 
> > Thanks for reviewing.
> > 
> > On Wed, Nov 30, 2022 at 06:17:15PM +0100, Marc Kleine-Budde wrote:
> > > On 16.11.2022 21:52:57, Markus Schneider-Pargmann wrote:
> > > > Currently the only mode of operation is an interrupt for every transmit
> > > > event. This is inefficient for peripheral chips. Use the transmit FIFO
> > > > event watermark interrupt instead if the FIFO size is more than 2. Use
> > > > FIFOsize - 1 for the watermark so the interrupt is triggered early
> > > > enough to not stop transmitting.
> > > > 
> > > > Note that if the number of transmits is less than the watermark level,
> > > > the transmit events will not be processed until there is any other
> > > > interrupt. This will only affect statistic counters. Also there is an
> > > > interrupt every time the timestamp wraps around.
> > > > 
> > > > Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
> > > 
> > > Please make this configurable with the ethtool TX IRQ coalescing
> > > parameter. Please setup an hwtimer to enable the regular interrupt after
> > > some configurable time to avoid starving of the TX complete events.
> > 
> > I guess hwtimer==hrtimer?
> 
> Sorry, yes!
> 
> > I thought about setting up a timer but decided against it as the TX
> > completion events are only used to update statistics of the interface,
> > as far as I can tell. I can implement a timer as well.
> 
> It's not only statistics, the sending socket can opt in to receive the
> sent CAN frame on successful transmission. Other sockets will (by
> default) receive successful sent CAN frames. The idea is that the other
> sockets see the same CAN bus, doesn't matter if they are on a different
> system receiving the CAN frame via the bus or on the same system
> receiving the CAN frame as soon it has been sent to the bus.

Thanks for explaining. I wasn't aware of that. I agree on the timer
then.

> 
> > For the upcoming receive side patch I already added a hrtimer. I may try
> > to use the same timer for both directions as it is going to do the exact
> > same thing in both cases (call the interrupt routine). Of course that
> > depends on the details of the coalescing support. Any objections on
> > that?
> 
> For the mcp251xfd I implemented the RX and TX coalescing independent of
> each other and made it configurable via ethtool's IRQ coalescing
> options.
> 
> The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> limited. I think the mcan core has the same limitations.

Yes and no, the mcan core provides watermark levels so it has more
options, but there is no hardware timer either (at least I didn't see
anything usable).

> 
> The configuration for the mcp251xfd looks like this:
> 
> - First decide for classical CAN or CAN-FD mode
> - configure RX and TX ring size
>   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
>   For TX only a single FIFO is used.
>   For RX up to 3 FIFOs (up to a depth of 32 each).
>   FIFO depth is limited to power of 2.
>   On the mcan cores this is currently done with a DT property.
>   Runtime configurable ring size is optional but gives more flexibility
>   for our use-cases due to limited RAM size.
> - configure RX and TX coalescing via ethtools
>   Set a timeout and the max CAN frames to coalesce.
>   The max frames are limited to half or full FIFO.

mcan can offer more options for the max frames limit fortunately.

> 
> How does coalescing work?
> 
> If coalescing is activated during reading of the RX'ed frames the FIFO
> not empty IRQ is disabled (the half or full IRQ stays enabled). After
> handling the RX'ed frames a hrtimer is started. In the hrtimer's
> functions the FIFO not empty IRQ is enabled again.

My rx path patches are working similarly though not 100% the same. I
will adopt everything and add it to the next version of this series.

> 
> I decided not to call the IRQ handler from the hrtimer to avoid
> concurrency, but enable the FIFO not empty IRQ.

mcan uses a threaded irq and I found this nice helper function I am
currently using for the receive path.
	irq_wake_thread()

It is not widely used so I hope this is fine. But this hopefully avoids
the concurrency issue. Also I don't need to artificially create an IRQ
as you do.

> 
> > > I've implemented this for the mcp251xfd driver, see:
> > > 
> > > 656fc12ddaf8 ("can: mcp251xfd: add TX IRQ coalescing ethtool support")
> > > 169d00a25658 ("can: mcp251xfd: add TX IRQ coalescing support")
> > > 846990e0ed82 ("can: mcp251xfd: add RX IRQ coalescing ethtool support")
> > > 60a848c50d2d ("can: mcp251xfd: add RX IRQ coalescing support")
> > > 9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> > 
> > Thanks for the pointers. I will have a look and try to implement it
> > similarly.
> 
> If you want to implement runtime configurable ring size, I created a
> function to help in the calculation of the ring sizes:
> 
> a1439a5add62 ("can: mcp251xfd: ram: add helper function for runtime ring size calculation")
> 
> The code is part of the mcp251xfd driver, but is prepared to become a
> generic helper function. The HW parameters are described with struct
> can_ram_config and use you can_ram_get_layout() to get a valid RAM
> layout based on CAN/CAN-FD ring size and coalescing parameters.

Thank you. I think configurable ring sizes are currently out of scope
for me as I only have limited time for this.

Thank you Marc!

Best,
Markus
  
Marc Kleine-Budde Dec. 1, 2022, 11 a.m. UTC | #5
On 01.12.2022 11:12:20, Markus Schneider-Pargmann wrote:
> > > For the upcoming receive side patch I already added a hrtimer. I may try
> > > to use the same timer for both directions as it is going to do the exact
> > > same thing in both cases (call the interrupt routine). Of course that
> > > depends on the details of the coalescing support. Any objections on
> > > that?
> > 
> > For the mcp251xfd I implemented the RX and TX coalescing independent of
> > each other and made it configurable via ethtool's IRQ coalescing
> > options.
> > 
> > The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> > half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> > limited. I think the mcan core has the same limitations.
> 
> Yes and no, the mcan core provides watermark levels so it has more
> options, but there is no hardware timer as well (at least I didn't see
> anything usable).

Are there any limitations to the water mark level?

> > The configuration for the mcp251xfd looks like this:
> > 
> > - First decide for classical CAN or CAN-FD mode
> > - configure RX and TX ring size
> >   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> >   For TX only a single FIFO is used.
> >   For RX up to 3 FIFOs (up to a depth of 32 each).
> >   FIFO depth is limited to power of 2.
> >   On the mcan cores this is currently done with a DT property.
> >   Runtime configurable ring size is optional but gives more flexibility
> >   for our use-cases due to limited RAM size.
> > - configure RX and TX coalescing via ethtools
> >   Set a timeout and the max CAN frames to coalesce.
> >   The max frames are limited to half or full FIFO.
> 
> mcan can offer more options for the max frames limit fortunately.
> 
> > 
> > How does coalescing work?
> > 
> > If coalescing is activated during reading of the RX'ed frames the FIFO
> > not empty IRQ is disabled (the half or full IRQ stays enabled). After
> > handling the RX'ed frames a hrtimer is started. In the hrtimer's
> > functions the FIFO not empty IRQ is enabled again.
> 
> My rx path patches are working similarly though not 100% the same. I
> will adopt everything and add it to the next version of this series.
> 
> > 
> > I decided not to call the IRQ handler from the hrtimer to avoid
> > concurrency, but enable the FIFO not empty IRQ.
> 
> mcan uses a threaded irq and I found this nice helper function I am
> currently using for the receive path.
> 	irq_wake_thread()
> 
> It is not widely used so I hope this is fine. But this hopefully avoids
> the concurrency issue. Also I don't need to artificially create an IRQ
> as you do.

I think it's Ok to use the function. Which IRQs are enabled after you
leave the RX handler? The mcp251xfd driver enables only a high watermark
IRQ and sets up the hrtimer. Then we have 3 scenarios:
- high watermark IRQ triggers -> IRQ is handled,
- FIFO level between 0 and high water mark -> no IRQ triggered, but
  hrtimer will run, irq_wake_thread() is called, IRQ is handled
- FIFO level 0 -> no IRQ triggered, hrtimer will run. What do you do in
  the IRQ handler? Check if FIFO is empty and enable the FIFO not empty
  IRQ?

The mcp251xfd unconditionally enables the FIFO not empty IRQ in the
hrtimer. This avoids reading of the FIFO fill level.
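
To make the three scenarios concrete, here is a small sketch that maps
a FIFO fill level to the path that ends up servicing it. The enum and
rx_classify() are made up for this mail, and the watermark is assumed
to be at least 1.

```c
/* Hypothetical names for the three scenarios listed above. */
enum rx_service_path {
	RX_BY_WATERMARK_IRQ,	/* scenario 1: high watermark IRQ fires */
	RX_BY_TIMER,		/* scenario 2: hrtimer wakes the handler */
	RX_IDLE_TIMER_ONLY,	/* scenario 3: hrtimer runs, FIFO empty */
};

/* level: frames currently in the FIFO; watermark: assumed >= 1. */
static enum rx_service_path rx_classify(unsigned int level,
					unsigned int watermark)
{
	if (level >= watermark)
		return RX_BY_WATERMARK_IRQ;
	if (level > 0)
		return RX_BY_TIMER;
	return RX_IDLE_TIMER_ONLY;
}
```

Only the last case needs a decision about which IRQ to enable next,
which is what the question above is about.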

[...]

> > If you want to implement runtime configurable ring size, I created a
> > function to help in the calculation of the ring sizes:
> > 
> > a1439a5add62 ("can: mcp251xfd: ram: add helper function for runtime ring size calculation")
> > 
> > The code is part of the mcp251xfd driver, but is prepared to become a
> > generic helper function. The HW parameters are described with struct
> > can_ram_config and use you can_ram_get_layout() to get a valid RAM
> > layout based on CAN/CAN-FD ring size and coalescing parameters.
> 
> Thank you. I think configurable ring sizes are currently out of scope
> for me as I only have limited time for this.

Ok.

regards,
Marc
  
Markus Schneider-Pargmann Dec. 1, 2022, 4:59 p.m. UTC | #6
On Thu, Dec 01, 2022 at 12:00:33PM +0100, Marc Kleine-Budde wrote:
> On 01.12.2022 11:12:20, Markus Schneider-Pargmann wrote:
> > > > For the upcoming receive side patch I already added a hrtimer. I may try
> > > > to use the same timer for both directions as it is going to do the exact
> > > > same thing in both cases (call the interrupt routine). Of course that
> > > > depends on the details of the coalescing support. Any objections on
> > > > that?
> > > 
> > > For the mcp251xfd I implemented the RX and TX coalescing independent of
> > > each other and made it configurable via ethtool's IRQ coalescing
> > > options.
> > > 
> > > The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> > > half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> > > limited. I think the mcan core has the same limitations.
> > 
> > Yes and no, the mcan core provides watermark levels so it has more
> > options, but there is no hardware timer as well (at least I didn't see
> > anything usable).
> 
> Are there any limitations to the water mark level?

Anything specific? I can't really see any limitation. You can set the
watermark between 1 and 32. I guess we could also always use it instead
of the new-element interrupt, but I haven't tried that yet. That may
simplify the code.

> 
> > > The configuration for the mcp251xfd looks like this:
> > > 
> > > - First decide for classical CAN or CAN-FD mode
> > > - configure RX and TX ring size
> > >   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> > >   For TX only a single FIFO is used.
> > >   For RX up to 3 FIFOs (up to a depth of 32 each).
> > >   FIFO depth is limited to power of 2.
> > >   On the mcan cores this is currently done with a DT property.
> > >   Runtime configurable ring size is optional but gives more flexibility
> > >   for our use-cases due to limited RAM size.
> > > - configure RX and TX coalescing via ethtools
> > >   Set a timeout and the max CAN frames to coalesce.
> > >   The max frames are limited to half or full FIFO.
> > 
> > mcan can offer more options for the max frames limit fortunately.
> > 
> > > 
> > > How does coalescing work?
> > > 
> > > If coalescing is activated during reading of the RX'ed frames the FIFO
> > > not empty IRQ is disabled (the half or full IRQ stays enabled). After
> > > handling the RX'ed frames a hrtimer is started. In the hrtimer's
> > > functions the FIFO not empty IRQ is enabled again.
> > 
> > My rx path patches are working similarly though not 100% the same. I
> > will adopt everything and add it to the next version of this series.
> > 
> > > 
> > > I decided not to call the IRQ handler from the hrtimer to avoid
> > > concurrency, but enable the FIFO not empty IRQ.
> > 
> > mcan uses a threaded irq and I found this nice helper function I am
> > currently using for the receive path.
> > 	irq_wake_thread()
> > 
> > It is not widely used so I hope this is fine. But this hopefully avoids
> > the concurrency issue. Also I don't need to artificially create an IRQ
> > as you do.
> 
> I think it's Ok to use the function. Which IRQs are enabled after you
> leave the RX handler? The mcp251xfd driver enables only a high watermark
> IRQ and sets up the hrtimer. Then we have 3 scenarios:
> - high watermark IRQ triggers -> IRQ is handled,
> - FIFO level between 0 and high water mark -> no IRQ triggered, but
>   hrtimer will run, irq_wake_thread() is called, IRQ is handled
> - FIFO level 0 -> no IRQ triggered, hrtimer will run. What do you do in
>   the IRQ handler? Check if FIFO is empty and enable the FIFO not empty
>   IRQ?

I am currently doing the normal IRQ handler run. It checks the
"Interrupt Register" at the beginning. This register does not show the
interrupts that fired, it shows the status. So even though the watermark
interrupt didn't trigger when called by a timer, RF0N 'new message'
status bit is still set if there is something new in the FIFO. Of course
it is the same for the transmit status bits.
So there is no need to read the FIFO fill levels directly, just the
general status register.
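
As a rough sketch of that check (not the driver code): the handler
looks at the status bits and decides what to do, independent of who
woke it up. The bit positions below are my assumption of the M_CAN
interrupt register layout and should be checked against the manual;
pending_work() and the WORK_* flags are invented for this example.

```c
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define IR_RF0N	(1u << 0)	/* Rx FIFO 0 new message */
#define IR_TEFN	(1u << 12)	/* Tx event FIFO new entry */

/* Hypothetical work flags returned to the caller. */
enum { WORK_NONE = 0, WORK_RX = 1, WORK_TX_EVENTS = 2 };

/*
 * Decide what to process from the status value alone. It does not
 * matter whether a watermark interrupt or the timer triggered the run.
 */
static unsigned int pending_work(uint32_t ir)
{
	unsigned int work = WORK_NONE;

	if (ir & IR_RF0N)
		work |= WORK_RX;
	if (ir & IR_TEFN)
		work |= WORK_TX_EVENTS;

	return work;
}
```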

Best,
Markus
  
Marc Kleine-Budde Dec. 2, 2022, 9:23 a.m. UTC | #7
On 01.12.2022 17:59:51, Markus Schneider-Pargmann wrote:
> On Thu, Dec 01, 2022 at 12:00:33PM +0100, Marc Kleine-Budde wrote:
> > On 01.12.2022 11:12:20, Markus Schneider-Pargmann wrote:
> > > > > For the upcoming receive side patch I already added a hrtimer. I may try
> > > > > to use the same timer for both directions as it is going to do the exact
> > > > > same thing in both cases (call the interrupt routine). Of course that
> > > > > depends on the details of the coalescing support. Any objections on
> > > > > that?
> > > > 
> > > > For the mcp251xfd I implemented the RX and TX coalescing independent of
> > > > each other and made it configurable via ethtool's IRQ coalescing
> > > > options.
> > > > 
> > > > The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> > > > half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> > > > limited. I think the mcan core has the same limitations.
> > > 
> > > Yes and no, the mcan core provides watermark levels so it has more
> > > options, but there is no hardware timer as well (at least I didn't see
> > > anything usable).
> > 
> > Are there any limitations to the water mark level?
> 
> Anything specific? I can't really see any limitation. You can set the
> watermark between 1 and 32. I guess we could also always use it instead
> of the new-element interrupt, but I haven't tried that yet. That may
> simplify the code.

Makes sense.

> > > > The configuration for the mcp251xfd looks like this:
> > > > 
> > > > - First decide for classical CAN or CAN-FD mode
> > > > - configure RX and TX ring size
> > > >   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> > > >   For TX only a single FIFO is used.
> > > >   For RX up to 3 FIFOs (up to a depth of 32 each).
> > > >   FIFO depth is limited to power of 2.
> > > >   On the mcan cores this is currently done with a DT property.
> > > >   Runtime configurable ring size is optional but gives more flexibility
> > > >   for our use-cases due to limited RAM size.
> > > > - configure RX and TX coalescing via ethtools
> > > >   Set a timeout and the max CAN frames to coalesce.
> > > >   The max frames are limited to half or full FIFO.
> > > 
> > > mcan can offer more options for the max frames limit fortunately.
> > > 
> > > > 
> > > > How does coalescing work?
> > > > 
> > > > If coalescing is activated during reading of the RX'ed frames the FIFO
> > > > not empty IRQ is disabled (the half or full IRQ stays enabled). After
> > > > handling the RX'ed frames a hrtimer is started. In the hrtimer's
> > > > functions the FIFO not empty IRQ is enabled again.
> > > 
> > > My rx path patches are working similarly though not 100% the same. I
> > > will adopt everything and add it to the next version of this series.
> > > 
> > > > 
> > > > I decided not to call the IRQ handler from the hrtimer to avoid
> > > > concurrency, but enable the FIFO not empty IRQ.
> > > 
> > > mcan uses a threaded irq and I found this nice helper function I am
> > > currently using for the receive path.
> > > 	irq_wake_thread()
> > > 
> > > It is not widely used so I hope this is fine. But this hopefully avoids
> > > the concurrency issue. Also I don't need to artificially create an IRQ
> > > as you do.
> > 
> > I think it's Ok to use the function. Which IRQs are enabled after you
> > leave the RX handler? The mcp251xfd driver enables only a high watermark
> > IRQ and sets up the hrtimer. Then we have 3 scenarios:
> > - high watermark IRQ triggers -> IRQ is handled,
> > - FIFO level between 0 and high water mark -> no IRQ triggered, but
> >   hrtimer will run, irq_wake_thread() is called, IRQ is handled
> > - FIFO level 0 -> no IRQ triggered, hrtimer will run. What do you do in
> >   the IRQ handler? Check if FIFO is empty and enable the FIFO not empty
> >   IRQ?
> 
> I am currently doing the normal IRQ handler run. It checks the
> "Interrupt Register" at the beginning. This register does not show the
> interrupts that fired, it shows the status. So even though the watermark
> interrupt didn't trigger when called by a timer, RF0N 'new message'
> status bit is still set if there is something new in the FIFO.

That covers scenario 2 from above.

> Of course it is the same for the transmit status bits.

ACK - The TX complete event handling is a 95% copy/paste of the RX
handling.

> So there is no need to read the FIFO fill levels directly, just the
> general status register.

What do you do if the hrtimer fires and there's no CAN frame waiting in
the FIFO?

Marc
  
Markus Schneider-Pargmann Dec. 2, 2022, 9:43 a.m. UTC | #8
On Fri, Dec 02, 2022 at 10:23:06AM +0100, Marc Kleine-Budde wrote:
...
> > > > > The configuration for the mcp251xfd looks like this:
> > > > > 
> > > > > - First decide for classical CAN or CAN-FD mode
> > > > > - configure RX and TX ring size
> > > > >   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> > > > >   For TX only a single FIFO is used.
> > > > >   For RX up to 3 FIFOs (up to a depth of 32 each).
> > > > >   FIFO depth is limited to power of 2.
> > > > >   On the mcan cores this is currently done with a DT property.
> > > > >   Runtime configurable ring size is optional but gives more flexibility
> > > > >   for our use-cases due to limited RAM size.
> > > > > - configure RX and TX coalescing via ethtools
> > > > >   Set a timeout and the max CAN frames to coalesce.
> > > > >   The max frames are limited to half or full FIFO.
> > > > 
> > > > mcan can offer more options for the max frames limit fortunately.
> > > > 
> > > > > 
> > > > > How does coalescing work?
> > > > > 
> > > > > If coalescing is activated during reading of the RX'ed frames the FIFO
> > > > > not empty IRQ is disabled (the half or full IRQ stays enabled). After
> > > > > handling the RX'ed frames a hrtimer is started. In the hrtimer's
> > > > > functions the FIFO not empty IRQ is enabled again.
> > > > 
> > > > My rx path patches are working similarly though not 100% the same. I
> > > > will adopt everything and add it to the next version of this series.
> > > > 
> > > > > 
> > > > > I decided not to call the IRQ handler from the hrtimer to avoid
> > > > > concurrency, but enable the FIFO not empty IRQ.
> > > > 
> > > > mcan uses a threaded irq and I found this nice helper function I am
> > > > currently using for the receive path.
> > > > 	irq_wake_thread()
> > > > 
> > > > It is not widely used so I hope this is fine. But this hopefully avoids
> > > > the concurrency issue. Also I don't need to artificially create an IRQ
> > > > as you do.
> > > 
> > > I think it's Ok to use the function. Which IRQs are enabled after you
> > > leave the RX handler? The mcp251xfd driver enables only a high watermark
> > > IRQ and sets up the hrtimer. Then we have 3 scenarios:
> > > - high watermark IRQ triggers -> IRQ is handled,
> > > - FIFO level between 0 and high water mark -> no IRQ triggered, but
> > >   hrtimer will run, irq_wake_thread() is called, IRQ is handled
> > > - FIFO level 0 -> no IRQ triggered, hrtimer will run. What do you do in
> > >   the IRQ handler? Check if FIFO is empty and enable the FIFO not empty
> > >   IRQ?
> > 
> > I am currently doing the normal IRQ handler run. It checks the
> > "Interrupt Register" at the beginning. This register does not show the
> > interrupts that fired, it shows the status. So even though the watermark
> > interrupt didn't trigger when called by a timer, RF0N 'new message'
> > status bit is still set if there is something new in the FIFO.
> 
> That covers scenario 2 from above.
> 
> > Of course it is the same for the transmit status bits.
> 
> ACK - The TX complete event handling is a 95% copy/paste of the RX
> handling.
> 
> > So there is no need to read the FIFO fill levels directly, just the
> > general status register.
> 
> What do you do if the hrtimer fires and there's no CAN frame waiting in
> the FIFO?

I just enable the 'new item' interrupt again and keep the hrtimer
disabled.
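
Roughly like the sketch below. The names are made up, and what happens
to the timer when frames were pending is an assumption of this sketch
(it re-arms), not something settled in this thread.

```c
#include <stdbool.h>

/* Hypothetical model of the coalescing state. */
struct coalesce_state {
	bool irq_new_item;	/* 'new item' interrupt enabled */
	bool timer_armed;	/* stand-in for a running hrtimer */
};

/* Called from the handler run that the expired timer triggered. */
static void timer_expired(struct coalesce_state *s,
			  unsigned int frames_pending)
{
	s->timer_armed = false;	/* the one-shot timer has fired */

	if (frames_pending == 0) {
		/* Idle: fall back to per-item interrupts, no timer. */
		s->irq_new_item = true;
		return;
	}

	/* ... handle the pending frames here ... */

	/* Assumption: traffic is ongoing, so keep coalescing. */
	s->timer_armed = true;
}
```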

Best,
Markus
  
Marc Kleine-Budde Dec. 2, 2022, 2:03 p.m. UTC | #9
On 02.12.2022 10:43:46, Markus Schneider-Pargmann wrote:
> On Fri, Dec 02, 2022 at 10:23:06AM +0100, Marc Kleine-Budde wrote:
> ...
> > > > > > The configuration for the mcp251xfd looks like this:
> > > > > > 
> > > > > > - First decide for classical CAN or CAN-FD mode
> > > > > > - configure RX and TX ring size
> > > > > >   9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters")
> > > > > >   For TX only a single FIFO is used.
> > > > > >   For RX up to 3 FIFOs (up to a depth of 32 each).
> > > > > >   FIFO depth is limited to power of 2.
> > > > > >   On the mcan cores this is currently done with a DT property.
> > > > > >   Runtime configurable ring size is optional but gives more flexibility
> > > > > >   for our use-cases due to limited RAM size.
> > > > > > - configure RX and TX coalescing via ethtools
> > > > > >   Set a timeout and the max CAN frames to coalesce.
> > > > > >   The max frames are limited to half or full FIFO.
> > > > > 
> > > > > mcan can offer more options for the max frames limit fortunately.
> > > > > 
> > > > > > 
> > > > > > How does coalescing work?
> > > > > > 
> > > > > > If coalescing is activated during reading of the RX'ed frames the FIFO
> > > > > > not empty IRQ is disabled (the half or full IRQ stays enabled). After
> > > > > > handling the RX'ed frames a hrtimer is started. In the hrtimer's
> > > > > > functions the FIFO not empty IRQ is enabled again.
> > > > > 
> > > > > My rx path patches are working similarly though not 100% the same. I
> > > > > will adopt everything and add it to the next version of this series.
> > > > > 
> > > > > > 
> > > > > > I decided not to call the IRQ handler from the hrtimer to avoid
> > > > > > concurrency, but enable the FIFO not empty IRQ.
> > > > > 
> > > > > mcan uses a threaded irq and I found this nice helper function I am
> > > > > currently using for the receive path.
> > > > > 	irq_wake_thread()
> > > > > 
> > > > > It is not widely used so I hope this is fine. But this hopefully avoids
> > > > > the concurrency issue. Also I don't need to artificially create an IRQ
> > > > > as you do.
> > > > 
> > > > I think it's Ok to use the function. Which IRQs are enabled after you
> > > > leave the RX handler? The mcp251xfd driver enables only a high watermark
> > > > IRQ and sets up the hrtimer. Then we have 3 scenarios:
> > > > - high watermark IRQ triggers -> IRQ is handled,
> > > > - FIFO level between 0 and high water mark -> no IRQ triggered, but
> > > >   hrtimer will run, irq_wake_thread() is called, IRQ is handled
> > > > - FIFO level 0 -> no IRQ triggered, hrtimer will run. What do you do in
> > > >   the IRQ handler? Check if FIFO is empty and enable the FIFO not empty
> > > >   IRQ?
> > > 
> > > I am currently doing the normal IRQ handler run. It checks the
> > > "Interrupt Register" at the beginning. This register does not show the
> > > interrupts that fired, it shows the status. So even though the watermark
> > > interrupt didn't trigger when called by a timer, RF0N 'new message'
> > > status bit is still set if there is something new in the FIFO.
> > 
> > That covers scenario 2 from above.
> > 
> > > Of course it is the same for the transmit status bits.
> > 
> > ACK - The TX complete event handling is a 95% copy/paste of the RX
> > handling.
> > 
> > > So there is no need to read the FIFO fill levels directly, just the
> > > general status register.
> > 
> > What do you do if the hrtimer fires and there's no CAN frame waiting in
> > the FIFO?
> 
> I just enable the 'new item' interrupt again and keep the hrtimer
> disabled.

Sounds good!

regards,
Marc
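
For readers following the thread, the three scenarios discussed above can
be modelled in plain C. This is only a rough userspace sketch, not driver
code: struct coalesce_state, drain_fifo() and handle_irq() are hypothetical
names, the FIFO is reduced to a fill counter, and the hrtimer is reduced to
a flag. It assumes the handler is the same whether it is entered from the
watermark interrupt or from the timer via irq_wake_thread().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the coalescing state; not the driver's structures. */
struct coalesce_state {
	bool new_item_irq_enabled;	/* per-element ("new item") interrupt */
	bool timer_armed;		/* stands in for the hrtimer */
	unsigned int handled;		/* elements processed so far */
};

/* Drain everything currently in the FIFO and report how much was found. */
static unsigned int drain_fifo(struct coalesce_state *s,
			       unsigned int *fifo_level)
{
	unsigned int n = *fifo_level;

	s->handled += n;
	*fifo_level = 0;
	return n;
}

/* One handler run, entered from the watermark IRQ or from the timer. */
static void handle_irq(struct coalesce_state *s, unsigned int *fifo_level)
{
	assert(s && fifo_level);

	if (drain_fifo(s, fifo_level) > 0) {
		/* Scenarios 1 and 2: work was found, keep coalescing. */
		s->new_item_irq_enabled = false;
		s->timer_armed = true;
	} else {
		/* Scenario 3: FIFO empty, go back to the per-element
		 * interrupt and leave the timer disabled.
		 */
		s->new_item_irq_enabled = true;
		s->timer_armed = false;
	}
}
```

In this model an idle bus costs no timer wakeups: the timer is only armed
again after a handler run that actually found something in the FIFO.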
  
Markus Schneider-Pargmann Dec. 13, 2022, 5:19 p.m. UTC | #10
Hi Marc,

On Thu, Dec 01, 2022 at 05:59:53PM +0100, Markus Schneider-Pargmann wrote:
> On Thu, Dec 01, 2022 at 12:00:33PM +0100, Marc Kleine-Budde wrote:
> > On 01.12.2022 11:12:20, Markus Schneider-Pargmann wrote:
> > > > > For the upcoming receive side patch I already added a hrtimer. I may try
> > > > > to use the same timer for both directions as it is going to do the exact
> > > > > same thing in both cases (call the interrupt routine). Of course that
> > > > > depends on the details of the coalescing support. Any objections on
> > > > > that?
> > > > 
> > > > For the mcp251xfd I implemented the RX and TX coalescing independent of
> > > > each other and made it configurable via ethtool's IRQ coalescing
> > > > options.
> > > > 
> > > > The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> > > > half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> > > > limited. I think the mcan core has the same limitations.
> > > 
> > > Yes and no, the mcan core provides watermark levels so it has more
> > > options, but there is no hardware timer as well (at least I didn't see
> > > anything usable).
> > 
> > Are there any limitations to the water mark level?
> 
> Anything specific? I can't really see any limitation. You can set the
> watermark between 1 and 32. I guess we could also always use it instead
> of the new-element interrupt, but I haven't tried that yet. That may
> simplify the code.

Just a quick comment here: after trying this, I decided against it.
- I can't modify the watermark levels once the chip is active.
- Using interrupt (un)masking I can change the behavior for tx and rx
  with a single register write, instead of two writes to the two FIFO
  configuration registers.

You will see this in the second part of the series then.

Best,
Markus
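
To illustrate the second point, here is a small sketch of how one
interrupt-enable value can switch both directions at once. It is not taken
from the series: ie_for_coalescing() is a hypothetical helper, and the bit
positions below are assumptions for illustration that should be checked
against the M_CAN documentation. The watermark levels are assumed to be
fixed at configuration time, so only the enable mask changes at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define IR_RF0N	(1u << 0)	/* Rx FIFO 0 new message */
#define IR_RF0W	(1u << 1)	/* Rx FIFO 0 watermark reached */
#define IR_TEFN	(1u << 12)	/* Tx event FIFO new entry */
#define IR_TEFW	(1u << 13)	/* Tx event FIFO watermark reached */

/* Hypothetical helper: derive the next interrupt-enable value.
 * Watermark interrupts stay enabled; the per-element interrupt of a
 * direction is masked while coalescing is active for that direction.
 */
static uint32_t ie_for_coalescing(uint32_t ie, bool rx_coalesce,
				  bool tx_coalesce)
{
	ie |= IR_RF0N | IR_RF0W | IR_TEFN | IR_TEFW;

	if (rx_coalesce)
		ie &= ~IR_RF0N;
	if (tx_coalesce)
		ie &= ~IR_TEFN;

	return ie;
}
```

The returned value would then go out in a single write to the interrupt
enable register, covering both rx and tx.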
  
Marc Kleine-Budde Dec. 13, 2022, 7:18 p.m. UTC | #11
On 13.12.2022 18:19:46, Markus Schneider-Pargmann wrote:
> Hi Marc,
> 
> On Thu, Dec 01, 2022 at 05:59:53PM +0100, Markus Schneider-Pargmann wrote:
> > On Thu, Dec 01, 2022 at 12:00:33PM +0100, Marc Kleine-Budde wrote:
> > > On 01.12.2022 11:12:20, Markus Schneider-Pargmann wrote:
> > > > > > For the upcoming receive side patch I already added a hrtimer. I may try
> > > > > > to use the same timer for both directions as it is going to do the exact
> > > > > > same thing in both cases (call the interrupt routine). Of course that
> > > > > > depends on the details of the coalescing support. Any objections on
> > > > > > that?
> > > > > 
> > > > > For the mcp251xfd I implemented the RX and TX coalescing independent of
> > > > > each other and made it configurable via ethtool's IRQ coalescing
> > > > > options.
> > > > > 
> > > > > The hardware doesn't support any timeouts and only FIFO not empty, FIFO
> > > > > half full and FIFO full IRQs and the on chip RAM for mailboxes is rather
> > > > > limited. I think the mcan core has the same limitations.
> > > > 
> > > > Yes and no, the mcan core provides watermark levels so it has more
> > > > options, but there is no hardware timer as well (at least I didn't see
> > > > anything usable).
> > > 
> > > Are there any limitations to the water mark level?
> > 
> > Anything specific? I can't really see any limitation. You can set the
> > watermark between 1 and 32. I guess we could also always use it instead
> > of the new-element interrupt, but I haven't tried that yet. That may
> > simplify the code.
> 
> Just a quick comment here: after trying this, I decided against it.
> - I can't modify the watermark levels once the chip is active.
> - Using interrupt (un)masking I can change the behavior for tx and rx
>   with a single register write, instead of two writes to the two FIFO
>   configuration registers.

Makes sense.

> You will see this in the second part of the series then.

Marc
  

Patch

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index f5bba848bd56..4a6972c8bacd 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -254,6 +254,7 @@  enum m_can_reg {
 #define TXESC_TBDS_64B		0x7
 
 /* Tx Event FIFO Configuration (TXEFC) */
+#define TXEFC_EFWM_MASK		GENMASK(29, 24)
 #define TXEFC_EFS_MASK		GENMASK(21, 16)
 
 /* Tx Event FIFO Status (TXEFS) */
@@ -1094,8 +1095,8 @@  static irqreturn_t m_can_isr(int irq, void *dev_id)
 			netif_wake_queue(dev);
 		}
 	} else  {
-		if (ir & IR_TEFN) {
-			/* New TX FIFO Element arrived */
+		if (ir & (IR_TEFN | IR_TEFW)) {
+			/* New TX FIFO Element arrived or watermark reached */
 			if (m_can_echo_tx_event(dev) != 0)
 				goto out_fail;
 			if (!cdev->tx_skb && netif_queue_stopped(dev))
@@ -1242,6 +1243,7 @@  static void m_can_chip_config(struct net_device *dev)
 {
 	struct m_can_classdev *cdev = netdev_priv(dev);
 	u32 cccr, test;
+	u32 interrupts = IR_ALL_INT;
 
 	m_can_config_endisable(cdev, true);
 
@@ -1276,11 +1278,20 @@  static void m_can_chip_config(struct net_device *dev)
 			    FIELD_PREP(TXEFC_EFS_MASK, 1) |
 			    cdev->mcfg[MRAM_TXE].off);
 	} else {
+		u32 txe_watermark;
+
+		txe_watermark = cdev->mcfg[MRAM_TXE].num - 1;
 		/* Full TX Event FIFO is used */
 		m_can_write(cdev, M_CAN_TXEFC,
+			    FIELD_PREP(TXEFC_EFWM_MASK,
+				       txe_watermark) |
 			    FIELD_PREP(TXEFC_EFS_MASK,
 				       cdev->mcfg[MRAM_TXE].num) |
 			    cdev->mcfg[MRAM_TXE].off);
+
+		/* Watermark interrupt mode */
+		if (txe_watermark)
+			interrupts &= ~IR_TEFN;
 	}
 
 	/* rx fifo configuration, blocking mode, fifo size 1 */
@@ -1338,15 +1349,13 @@  static void m_can_chip_config(struct net_device *dev)
 
 	/* Enable interrupts */
 	m_can_write(cdev, M_CAN_IR, IR_ALL_INT);
-	if (!(cdev->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING))
+	if (!(cdev->can.ctrlmode & CAN_CTRLMODE_BERR_REPORTING)) {
 		if (cdev->version == 30)
-			m_can_write(cdev, M_CAN_IE, IR_ALL_INT &
-				    ~(IR_ERR_LEC_30X));
+			interrupts &= ~(IR_ERR_LEC_30X);
 		else
-			m_can_write(cdev, M_CAN_IE, IR_ALL_INT &
-				    ~(IR_ERR_LEC_31X));
-	else
-		m_can_write(cdev, M_CAN_IE, IR_ALL_INT);
+			interrupts &= ~(IR_ERR_LEC_31X);
+	}
+	m_can_write(cdev, M_CAN_IE, interrupts);
 
 	/* route all interrupts to INT0 */
 	m_can_write(cdev, M_CAN_ILS, ILS_ALL_INT0);
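
As a reading aid for the TXEFC hunk above, the field packing can be tried
out in userspace. This is a sketch under assumptions, not driver code: the
masks are written with plain shifts instead of GENMASK()/FIELD_PREP(),
txefc_value() and uses_watermark_irq() are hypothetical names, and the
offset is passed through unchanged as in the patch. The field positions
(watermark in bits 29:24, size in bits 21:16) come from the defines in the
diff.

```c
#include <assert.h>
#include <stdint.h>

#define TXEFC_EFWM_SHIFT	24
#define TXEFC_EFWM_MASK		(0x3fu << TXEFC_EFWM_SHIFT)	/* bits 29:24 */
#define TXEFC_EFS_SHIFT		16
#define TXEFC_EFS_MASK		(0x3fu << TXEFC_EFS_SHIFT)	/* bits 21:16 */

/* Hypothetical helper: pack size, watermark (size - 1) and offset. */
static uint32_t txefc_value(uint32_t fifo_size, uint32_t offset)
{
	uint32_t watermark;

	assert(fifo_size >= 1 && fifo_size <= 32);
	watermark = fifo_size - 1;

	return ((watermark << TXEFC_EFWM_SHIFT) & TXEFC_EFWM_MASK) |
	       ((fifo_size << TXEFC_EFS_SHIFT) & TXEFC_EFS_MASK) |
	       offset;
}

/* Mirrors the "if (txe_watermark)" check: a FIFO of size 1 yields a
 * watermark of 0, so the per-element interrupt stays in use.
 */
static int uses_watermark_irq(uint32_t fifo_size)
{
	return (fifo_size - 1) != 0;
}
```

For example, a FIFO of 32 elements at offset 0x100 gives a watermark of 31
and a register value of 0x1f200100.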