[RFC] net/sched: use real_num_tx_queues in dev_watchdog()

Message ID 20230315183408.2723-1-praveen.kannoju@oracle.com
State New
Headers
Series [RFC] net/sched: use real_num_tx_queues in dev_watchdog() |

Commit Message

Praveen Kannoju March 15, 2023, 6:34 p.m. UTC
  Currently dev_watchdog() loops through num_tx_queues[Number of TX queues
allocated at alloc_netdev_mq() time] instead of real_num_tx_queues
[Number of TX queues currently active in device] to detect transmit
queue time out. Make this efficient by using real_num_tx_queues.

Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
---
PS: Please let me know if I am missing something obvious here.
 net/sched/sch_generic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Praveen Kannoju March 21, 2023, 10:05 a.m. UTC | #1
Ping. 

> -----Original Message-----
> From: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> Sent: 16 March 2023 12:04 AM
> To: jhs@mojatatu.com; xiyou.wangcong@gmail.com; jiri@resnulli.us; davem@davemloft.net; edumazet@google.com;
> kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>; Rama Nichanamatlu
> <rama.nichanamatlu@oracle.com>; Manjunath Patil <manjunath.b.patil@oracle.com>; Praveen Kannoju
> <praveen.kannoju@oracle.com>
> Subject: [PATCH RFC] net/sched: use real_num_tx_queues in dev_watchdog()
> 
> Currently dev_watchdog() loops through num_tx_queues[Number of TX queues allocated at alloc_netdev_mq() time] instead of
> real_num_tx_queues [Number of TX queues currently active in device] to detect transmit queue time out. Make this efficient by
> using real_num_tx_queues.
> 
> Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> ---
> PS: Please let me know if I am missing something obvious here.
>  net/sched/sch_generic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index a9aadc4e6858..e7d41a25f0e8 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -506,7 +506,7 @@ static void dev_watchdog(struct timer_list *t)
>  			unsigned int i;
>  			unsigned long trans_start;
> 
> -			for (i = 0; i < dev->num_tx_queues; i++) {
> +			for (i = 0; i < dev->real_num_tx_queues; i++) {
>  				struct netdev_queue *txq;
> 
>  				txq = netdev_get_tx_queue(dev, i);
> --
> 2.31.1
  
Eric Dumazet March 21, 2023, 11:08 a.m. UTC | #2
On Tue, Mar 21, 2023 at 3:05 AM Praveen Kannoju
<praveen.kannoju@oracle.com> wrote:
>
> Ping.
>

I do not think dev_watchdog() needs to be efficient ?

In any case, reading dev->real_num_tx_queues from a timer handler
could be racy vs RTNL.

While reading dev->num_tx_queues is not racy.

I think you should describe what problem you are trying to solve.

> > -----Original Message-----
> > From: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> > Sent: 16 March 2023 12:04 AM
> > To: jhs@mojatatu.com; xiyou.wangcong@gmail.com; jiri@resnulli.us; davem@davemloft.net; edumazet@google.com;
> > kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
> > Cc: Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>; Rama Nichanamatlu
> > <rama.nichanamatlu@oracle.com>; Manjunath Patil <manjunath.b.patil@oracle.com>; Praveen Kannoju
> > <praveen.kannoju@oracle.com>
> > Subject: [PATCH RFC] net/sched: use real_num_tx_queues in dev_watchdog()
> >
> > Currently dev_watchdog() loops through num_tx_queues[Number of TX queues allocated at alloc_netdev_mq() time] instead of
> > real_num_tx_queues [Number of TX queues currently active in device] to detect transmit queue time out. Make this efficient by
> > using real_num_tx_queues.
> >
> > Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> > ---
> > PS: Please let me know if I am missing something obvious here.
> >  net/sched/sch_generic.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index a9aadc4e6858..e7d41a25f0e8 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -506,7 +506,7 @@ static void dev_watchdog(struct timer_list *t)
> >                       unsigned int i;
> >                       unsigned long trans_start;
> >
> > -                     for (i = 0; i < dev->num_tx_queues; i++) {
> > +                     for (i = 0; i < dev->real_num_tx_queues; i++) {
> >                               struct netdev_queue *txq;
> >
> >                               txq = netdev_get_tx_queue(dev, i);
> > --
> > 2.31.1
>
  
Praveen Kannoju March 21, 2023, 11:59 a.m. UTC | #3
Thank you for the reply, Eric.

During net device allocation through the routine "alloc_netdev_mqs()", both the variables  "num_tx_queues" and "real_num_tx_queues" are being initialized with same value. After which, drivers chose the number of active queues which are being supported by them and update the variable " real_num_tx_queues " using the routine "netif_set_real_num_tx_queues()". 

The intention of sending this patch was, Is it not efficient to monitor only the queues which are actively being used by the device instead of ones which are not at all participating in data communication. 

For example we have seen that for a bnxt interface the values of "num_tx_queues" and "real_num_tx_queues" are 74 and 8 respectively, and for IGB interface they were 8 and 4.  So around 50% to 90% of un-necessary queues are being monitored by the device watchdog at every interval.

-
Praveen

> -----Original Message-----
> From: Eric Dumazet <edumazet@google.com>
> Sent: 21 March 2023 04:38 PM
> To: Praveen Kannoju <praveen.kannoju@oracle.com>
> Cc: jhs@mojatatu.com; xiyou.wangcong@gmail.com; jiri@resnulli.us; davem@davemloft.net; kuba@kernel.org; pabeni@redhat.com;
> netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Rajesh Sivaramasubramaniom <rajesh.sivaramasubramaniom@oracle.com>;
> Rama Nichanamatlu <rama.nichanamatlu@oracle.com>; Manjunath Patil <manjunath.b.patil@oracle.com>
> Subject: Re: [PATCH RFC] net/sched: use real_num_tx_queues in dev_watchdog()
> 
> On Tue, Mar 21, 2023 at 3:05 AM Praveen Kannoju <praveen.kannoju@oracle.com> wrote:
> >
> > Ping.
> >
> 
> I do not think dev_watchdog() needs to be efficient ?
> 
> In any case, reading dev->real_num_tx_queues from a timer handler could be racy vs RTNL.
> 
> While reading dev->num_tx_queues is not racy.
> 
> I think you should describe what problem you are trying to solve.
> 
> > > -----Original Message-----
> > > From: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> > > Sent: 16 March 2023 12:04 AM
> > > To: jhs@mojatatu.com; xiyou.wangcong@gmail.com; jiri@resnulli.us;
> > > davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> > > pabeni@redhat.com; netdev@vger.kernel.org;
> > > linux-kernel@vger.kernel.org
> > > Cc: Rajesh Sivaramasubramaniom
> > > <rajesh.sivaramasubramaniom@oracle.com>; Rama Nichanamatlu
> > > <rama.nichanamatlu@oracle.com>; Manjunath Patil
> > > <manjunath.b.patil@oracle.com>; Praveen Kannoju
> > > <praveen.kannoju@oracle.com>
> > > Subject: [PATCH RFC] net/sched: use real_num_tx_queues in
> > > dev_watchdog()
> > >
> > > Currently dev_watchdog() loops through num_tx_queues[Number of TX
> > > queues allocated at alloc_netdev_mq() time] instead of
> > > real_num_tx_queues [Number of TX queues currently active in device] to detect transmit queue time out. Make this efficient by
> using real_num_tx_queues.
> > >
> > > Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
> > > ---
> > > PS: Please let me know if I am missing something obvious here.
> > >  net/sched/sch_generic.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index
> > > a9aadc4e6858..e7d41a25f0e8 100644
> > > --- a/net/sched/sch_generic.c
> > > +++ b/net/sched/sch_generic.c
> > > @@ -506,7 +506,7 @@ static void dev_watchdog(struct timer_list *t)
> > >                       unsigned int i;
> > >                       unsigned long trans_start;
> > >
> > > -                     for (i = 0; i < dev->num_tx_queues; i++) {
> > > +                     for (i = 0; i < dev->real_num_tx_queues; i++)
> > > + {
> > >                               struct netdev_queue *txq;
> > >
> > >                               txq = netdev_get_tx_queue(dev, i);
> > > --
> > > 2.31.1
> >
  

Patch

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index a9aadc4e6858..e7d41a25f0e8 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -506,7 +506,7 @@  static void dev_watchdog(struct timer_list *t)
 			unsigned int i;
 			unsigned long trans_start;
 
-			for (i = 0; i < dev->num_tx_queues; i++) {
+			for (i = 0; i < dev->real_num_tx_queues; i++) {
 				struct netdev_queue *txq;
 
 				txq = netdev_get_tx_queue(dev, i);