From patchwork Fri Apr 21 08:26:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yajun Deng X-Patchwork-Id: 86174 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp912698vqo; Fri, 21 Apr 2023 01:36:56 -0700 (PDT) X-Google-Smtp-Source: AKy350Y0pd7g45AKkIbl5lrrLvdp2jO+EQdPHlaqZ4CjgH7efje69mkAnBDiCYJ256TzDswM/+2I X-Received: by 2002:a17:90a:38e2:b0:24b:7611:3683 with SMTP id x89-20020a17090a38e200b0024b76113683mr367138pjb.38.1682066216474; Fri, 21 Apr 2023 01:36:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682066216; cv=none; d=google.com; s=arc-20160816; b=REhn8G6zVjDWCpzrtOumNTW13csY69xkZR6JK5uQWlY4zEDaYtG89AHt3V2K+QflkZ /cOHNA+hZtg0URIyLAbtnAIOWcm43Y7udcg8WnbmbCyFLAECHrxmeFGqKWcozGIShPiS TH6sqWheQl0wi31dUzktaJLStCIvhAeWtezbOHSXJWSbPRBYzRhdcitVdKwRWRFmK7CT ah1Nb12uVyUaKTKIt+utdz2hFXfs85S2kZlHhEq1mIgdTNsXV4fwoiXZVSsJB1rpeYkF BDGl+QFRmVaK74imnFbJuQSgAI1KMdlHkN0DS6PTKVvxa9XelBAbgb13+quSneZU3qAL ah8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=SHe/jzCqewm2DDzl3v2vI3vnMZWQOaTgzDjtGP1hSHE=; b=sl+mTz80kYwxtBWcecidO4dO1ae3eLINRkNIx+BgRKA7FTXpG3fvw2PH+5EC0BzKXn lhHatEmNTsziAKX9yr4YGmu4epfhbDB1PDwz0cObC0eyfNiCpv5fmYRAvrXdMknt6Ael n8pgZ3dACDWtEMaT6Z3oK79NxONXhfrumDSg2O/UpNZt8QeU3PWThdRWlzWjynAO5BYZ 52M8AZNHEb9QhR5XFMmfuDEBzodcKXerCSoQSr7hZMsU0T1M+a+QCYz5ALqL49Nkhc2T DAll6fbUez2uJ3Ku/XI4gpulJ42Im8l2CeRwOrD1LIBdSTZAK3RmfnvSXSKb5AScs8Ip T3NQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=KC+8xr4D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e9-20020a17090301c900b001a2513b8e14si4195802plh.84.2023.04.21.01.36.41; Fri, 21 Apr 2023 01:36:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=KC+8xr4D; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230114AbjDUI02 (ORCPT + 99 others); Fri, 21 Apr 2023 04:26:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52582 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229913AbjDUI01 (ORCPT ); Fri, 21 Apr 2023 04:26:27 -0400 Received: from out-42.mta1.migadu.com (out-42.mta1.migadu.com [95.215.58.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 030361FCD for ; Fri, 21 Apr 2023 01:26:25 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1682065582; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=SHe/jzCqewm2DDzl3v2vI3vnMZWQOaTgzDjtGP1hSHE=; b=KC+8xr4DqFayZ9uHezyBxBEJRGhGyHjWU0OIUIuzxlBD+1em15tVTQ39Moa4Ow837/YDR7 f5DdKiZiL4S8ASSZTLXT9RZjW5duqYpTvKtqPJ7N//SZiQe9/iiWwY50u0W4AJgccc4sKZ gKAeURTkRp03G56RuSy9CVsqq2MYvDY= From: Yajun Deng To: jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Yajun Deng Subject: [PATCH] net: sched: Print msecs when transmit queue time out Date: Fri, 21 Apr 2023 16:26:06 +0800 Message-Id: <20230421082606.551411-1-yajun.deng@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763774265219763035?= X-GMAIL-MSGID: =?utf-8?q?1763774265219763035?= The kernel will print several warnings in a short period of time when it stalls. Like this: First warning: [ 7100.097547] ------------[ cut here ]------------ [ 7100.097550] NETDEV WATCHDOG: eno2 (xxx): transmit queue 8 timed out [ 7100.097571] WARNING: CPU: 8 PID: 0 at net/sched/sch_generic.c:467 dev_watchdog+0x260/0x270 ... Second warning: [ 7147.756952] rcu: INFO: rcu_preempt self-detected stall on CPU [ 7147.756958] rcu: 24-....: (59999 ticks this GP) idle=546/1/0x400000000000000 softirq=367 3137/3673146 fqs=13844 [ 7147.756960] (t=60001 jiffies g=4322709 q=133381) [ 7147.756962] NMI backtrace for cpu 24 ... We calculate that the transmit queue start stall should occur before 7095s according to watchdog_timeo, the rcu start stall at 7087s. These two times are close together, it is difficult to confirm which happened first. To let users know the exact time the stall started, print msecs when the transmit queue time out. Signed-off-by: Yajun Deng --- net/sched/sch_generic.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index a9aadc4e6858..37e41f972f69 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -502,7 +502,7 @@ static void dev_watchdog(struct timer_list *t) if (netif_device_present(dev) && netif_running(dev) && netif_carrier_ok(dev)) { - int some_queue_timedout = 0; + unsigned int timedout_ms = 0; unsigned int i; unsigned long trans_start; @@ -514,16 +514,16 @@ static void dev_watchdog(struct timer_list *t) if (netif_xmit_stopped(txq) && time_after(jiffies, (trans_start + dev->watchdog_timeo))) { - some_queue_timedout = 1; + timedout_ms = jiffies_to_msecs(jiffies - trans_start); atomic_long_inc(&txq->trans_timeout); break; } } - if (unlikely(some_queue_timedout)) { + if (unlikely(timedout_ms)) { trace_net_dev_xmit_timeout(dev, i); - WARN_ONCE(1, KERN_INFO "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out\n", - dev->name, netdev_drivername(dev), i); + WARN_ONCE(1, "NETDEV WATCHDOG: %s (%s): transmit queue %u timed out %u ms\n", + dev->name, netdev_drivername(dev), i, timedout_ms); netif_freeze_queues(dev); dev->netdev_ops->ndo_tx_timeout(dev, i); netif_unfreeze_queues(dev);