Message ID | 1687771137-26911-1-git-send-email-schakrabarti@linux.microsoft.com |
---|---|
State | New |
Headers |
From: souradeep chakrabarti <schakrabarti@linux.microsoft.com>
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, longli@microsoft.com, sharmaajay@microsoft.com, leon@kernel.org, cai.huoqing@linux.dev, ssengar@linux.microsoft.com, vkuznets@redhat.com, tglx@linutronix.de, linux-hyperv@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org
Cc: stable@vger.kernel.org, schakrabarti@microsoft.com, Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Subject: [PATCH 1/2 V3 net] net: mana: Fix MANA VF unload when host is unresponsive
Date: Mon, 26 Jun 2023 02:18:57 -0700
Message-Id: <1687771137-26911-1-git-send-email-schakrabarti@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1687771098-26775-1-git-send-email-schakrabarti@linux.microsoft.com>
References: <1687771098-26775-1-git-send-email-schakrabarti@linux.microsoft.com>
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | [1/2,V3,net] net: mana: Fix MANA VF unload when host is unresponsive |
Commit Message
Souradeep Chakrabarti
June 26, 2023, 9:18 a.m. UTC
From: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>

This patch addresses the VF unload issue, where mana_dealloc_queues()
gets stuck in infinite while loop, because of host unresponsiveness.
It adds a timeout in the while loop, to fix it.

Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
Microsoft Azure Network Adapter)
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
---
V2 -> V3:
* Splitted the patch in two parts.
* Removed the unnecessary braces from mana_dealloc_queues().
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
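The commit message describes bounding a polling loop with a timeout. As a rough userspace illustration of that pattern (not the driver code), the sketch below models one TX queue and a tick counter. All names here (`tick_before`, `model_txq`, `wait_for_drain`) are hypothetical stand-ins: `tick_before` imitates the wraparound-safe comparison that the kernel's `time_before()` provides, and incrementing `*now` stands in for the sleep between polls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's time_before(a, b):
 * true when tick a is earlier than tick b, even if the counter wrapped.
 * Relies on the usual two's-complement conversion to int32_t. */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical model of one TX queue. */
struct model_txq {
	int pending;            /* sends not yet completed */
	int completes_per_tick; /* 0 models an unresponsive host */
};

/* Poll until the queue drains or the deadline is reached.
 * Returns the number of sends still pending when the wait ends. */
static int wait_for_drain(struct model_txq *q, uint32_t *now, uint32_t deadline)
{
	while (q->pending > 0 && tick_before(*now, deadline)) {
		q->pending -= q->completes_per_tick;
		if (q->pending < 0)
			q->pending = 0;
		(*now)++; /* stands in for usleep_range(1000, 2000) */
	}
	return q->pending;
}
```

With `completes_per_tick` set to 0 the loop still ends once `*now` reaches `deadline`, which is the property the patch is after; without the deadline check the same input would spin forever.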
Comments
On 6/26/2023 2:48 PM, souradeep chakrabarti wrote:
> From: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
>
> This patch addresses the VF unload issue, where mana_dealloc_queues()
> gets stuck in infinite while loop, because of host unresponsiveness.
> It adds a timeout in the while loop, to fix it.
>
> Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
> Microsoft Azure Network Adapter)
> Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> ---
> V2 -> V3:
> * Splitted the patch in two parts.
> * Removed the unnecessary braces from mana_dealloc_queues().
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index d907727c7b7a..cb5c43c3c47e 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2329,7 +2329,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  {
>  	struct mana_port_context *apc = netdev_priv(ndev);
>  	struct gdma_dev *gd = apc->ac->gdma_dev;
> +	unsigned long timeout;
>  	struct mana_txq *txq;
> +	struct sk_buff *skb;
> +	struct mana_cq *cq;
>  	int i, err;
>
>  	if (apc->port_is_up)
> @@ -2348,13 +2351,25 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  	 *
>  	 * Drain all the in-flight TX packets
>  	 */
> +
> +	timeout = jiffies + 120 * HZ;
>  	for (i = 0; i < apc->num_queues; i++) {
>  		txq = &apc->tx_qp[i].txq;
> -
> -		while (atomic_read(&txq->pending_sends) > 0)
> +		while (atomic_read(&txq->pending_sends) > 0 &&
> +		       time_before(jiffies, timeout))
>  			usleep_range(1000, 2000);
>  	}
>
> +	for (i = 0; i < apc->num_queues; i++) {
> +		txq = &apc->tx_qp[i].txq;
> +		cq = &apc->tx_qp[i].tx_cq;
> +		while (atomic_read(&txq->pending_sends)) {
> +			skb = skb_dequeue(&txq->pending_skbs);
> +			mana_unmap_skb(skb, apc);
> +			napi_consume_skb(skb, cq->budget);
> +			atomic_sub(1, &txq->pending_sends);
> +		}
> +	}

Can we combine these 2 loops into 1 something like this ?

	for (i = 0; i < apc->num_queues; i++) {
		txq = &apc->tx_qp[i].txq;
		cq = &apc->tx_qp[i].tx_cq;
		while (atomic_read(&txq->pending_sends)) {
			if (time_before(jiffies, timeout)) {
				usleep_range(1000, 2000);
			} else {
				skb = skb_dequeue(&txq->pending_skbs);
				mana_unmap_skb(skb, apc);
				napi_consume_skb(skb, cq->budget);
				atomic_sub(1, &txq->pending_sends);
			}
		}
	}

>  	/* We're 100% sure the queues can no longer be woken up, because
>  	 * we're sure now mana_poll_tx_cq() can't be running.
>  	 */
From: souradeep chakrabarti <schakrabarti@linux.microsoft.com> Sent: Monday, June 26, 2023 2:19 AM
>
> This patch addresses the VF unload issue, where mana_dealloc_queues()
> gets stuck in infinite while loop, because of host unresponsiveness.
> It adds a timeout in the while loop, to fix it.

For a patch series, the cover letter (patch 0 of the series) does not
get included in the commit log anywhere. The cover letter can provide
overall motivation and describe how the patches fit together, but the
commit message for each patch should be as self-contained as possible.
The commit message here refers to "the VF unload issue", and there's
no context for understanding what that issue is, though you do provide
some description in the text following "where". Could you provide a
commit message that is a bit more self-contained? Same comment applies
to commit message for the 2nd patch of this series.

Also, avoid text like "this patch". See the "Describe your changes"
section in Documentation/process/submitting-patches.rst where the use
of imperative mood is mentioned.

If you like, I can provide some offline help on writing a good
commit message.

Michael

>
> Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
> Microsoft Azure Network Adapter)
> Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> ---
> V2 -> V3:
> * Splitted the patch in two parts.
> * Removed the unnecessary braces from mana_dealloc_queues().
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index d907727c7b7a..cb5c43c3c47e 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2329,7 +2329,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  {
>  	struct mana_port_context *apc = netdev_priv(ndev);
>  	struct gdma_dev *gd = apc->ac->gdma_dev;
> +	unsigned long timeout;
>  	struct mana_txq *txq;
> +	struct sk_buff *skb;
> +	struct mana_cq *cq;
>  	int i, err;
>
>  	if (apc->port_is_up)
> @@ -2348,13 +2351,25 @@ static int mana_dealloc_queues(struct net_device *ndev)
>  	 *
>  	 * Drain all the in-flight TX packets
>  	 */
> +
> +	timeout = jiffies + 120 * HZ;
>  	for (i = 0; i < apc->num_queues; i++) {
>  		txq = &apc->tx_qp[i].txq;
> -
> -		while (atomic_read(&txq->pending_sends) > 0)
> +		while (atomic_read(&txq->pending_sends) > 0 &&
> +		       time_before(jiffies, timeout))
>  			usleep_range(1000, 2000);
>  	}
>
> +	for (i = 0; i < apc->num_queues; i++) {
> +		txq = &apc->tx_qp[i].txq;
> +		cq = &apc->tx_qp[i].tx_cq;
> +		while (atomic_read(&txq->pending_sends)) {
> +			skb = skb_dequeue(&txq->pending_skbs);
> +			mana_unmap_skb(skb, apc);
> +			napi_consume_skb(skb, cq->budget);
> +			atomic_sub(1, &txq->pending_sends);
> +		}
> +	}
>  	/* We're 100% sure the queues can no longer be woken up, because
>  	 * we're sure now mana_poll_tx_cq() can't be running.
>  	 */
> --
> 2.34.1
On Mon, Jun 26, 2023 at 07:50:44PM +0530, Praveen Kumar wrote:
> On 6/26/2023 2:48 PM, souradeep chakrabarti wrote:
> > From: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> >
> > This patch addresses the VF unload issue, where mana_dealloc_queues()
> > gets stuck in infinite while loop, because of host unresponsiveness.
> > It adds a timeout in the while loop, to fix it.
> >
> > Fixes: ca9c54d2d6a5ab2430c4eda364c77125d62e5e0f (net: mana: Add a driver for
> > Microsoft Azure Network Adapter)
> > Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
> > ---
> > V2 -> V3:
> > * Splitted the patch in two parts.
> > * Removed the unnecessary braces from mana_dealloc_queues().
> > ---
> >  drivers/net/ethernet/microsoft/mana/mana_en.c | 19 +++++++++++++++++--
> >  1 file changed, 17 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index d907727c7b7a..cb5c43c3c47e 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > @@ -2329,7 +2329,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
> >  {
> >  	struct mana_port_context *apc = netdev_priv(ndev);
> >  	struct gdma_dev *gd = apc->ac->gdma_dev;
> > +	unsigned long timeout;
> >  	struct mana_txq *txq;
> > +	struct sk_buff *skb;
> > +	struct mana_cq *cq;
> >  	int i, err;
> >
> >  	if (apc->port_is_up)
> > @@ -2348,13 +2351,25 @@ static int mana_dealloc_queues(struct net_device *ndev)
> >  	 *
> >  	 * Drain all the in-flight TX packets
> >  	 */
> > +
> > +	timeout = jiffies + 120 * HZ;
> >  	for (i = 0; i < apc->num_queues; i++) {
> >  		txq = &apc->tx_qp[i].txq;
> > -
> > -		while (atomic_read(&txq->pending_sends) > 0)
> > +		while (atomic_read(&txq->pending_sends) > 0 &&
> > +		       time_before(jiffies, timeout))
> >  			usleep_range(1000, 2000);
> >  	}
> >
> > +	for (i = 0; i < apc->num_queues; i++) {
> > +		txq = &apc->tx_qp[i].txq;
> > +		cq = &apc->tx_qp[i].tx_cq;
> > +		while (atomic_read(&txq->pending_sends)) {
> > +			skb = skb_dequeue(&txq->pending_skbs);
> > +			mana_unmap_skb(skb, apc);
> > +			napi_consume_skb(skb, cq->budget);
> > +			atomic_sub(1, &txq->pending_sends);
> > +		}
> > +	}
>
> Can we combine these 2 loops into 1 something like this ?
>
> 	for (i = 0; i < apc->num_queues; i++) {
> 		txq = &apc->tx_qp[i].txq;
> 		cq = &apc->tx_qp[i].tx_cq;
> 		while (atomic_read(&txq->pending_sends)) {
> 			if (time_before(jiffies, timeout)) {
> 				usleep_range(1000, 2000);
> 			} else {
> 				skb = skb_dequeue(&txq->pending_skbs);
> 				mana_unmap_skb(skb, apc);
> 				napi_consume_skb(skb, cq->budget);
> 				atomic_sub(1, &txq->pending_sends);
> 			}
> 		}
> 	}

We should free up the skbs only after timeout has happened or after all
the queues are looped.

> >  	/* We're 100% sure the queues can no longer be woken up, because
> >  	 * we're sure now mana_poll_tx_cq() can't be running.
> >  	 */
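To make the ordering in the reply above concrete, here is a small userspace model of the two-phase structure the patch uses: first wait on every queue against one shared deadline, and only then force-release whatever is left. This is an illustrative sketch, not driver code. The names `sim_txq`, `sim_before` and `drain_two_phase` are hypothetical, and the simulation is deliberately simplified: only the queue currently being waited on makes progress, whereas on real hardware all queues could complete sends concurrently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one TX queue. */
struct sim_txq {
	int pending;            /* sends not yet completed */
	int completes_per_tick; /* 0 models an unresponsive host */
};

/* Wraparound-safe "a is earlier than b", in the spirit of time_before(). */
static bool sim_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Phase 1: wait on each queue in turn, all sharing one deadline.
 * Phase 2: runs only after every queue has been visited, and
 * force-releases the remainder.
 * Returns the number of sends that had to be force-released. */
static int drain_two_phase(struct sim_txq *qs, size_t n,
			   uint32_t *now, uint32_t deadline)
{
	int forced = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		while (qs[i].pending > 0 && sim_before(*now, deadline)) {
			int done = qs[i].completes_per_tick;

			qs[i].pending -= done < qs[i].pending ? done : qs[i].pending;
			(*now)++; /* stands in for the sleep between polls */
		}
	}

	for (i = 0; i < n; i++) {
		forced += qs[i].pending; /* stands in for unmap + free */
		qs[i].pending = 0;
	}
	return forced;
}
```

Because the deadline is shared, a queue visited after the deadline has passed gets no waiting time of its own in this model; its remaining sends are handled entirely by the second phase.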
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index d907727c7b7a..cb5c43c3c47e 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2329,7 +2329,10 @@ static int mana_dealloc_queues(struct net_device *ndev)
 {
 	struct mana_port_context *apc = netdev_priv(ndev);
 	struct gdma_dev *gd = apc->ac->gdma_dev;
+	unsigned long timeout;
 	struct mana_txq *txq;
+	struct sk_buff *skb;
+	struct mana_cq *cq;
 	int i, err;
 
 	if (apc->port_is_up)
@@ -2348,13 +2351,25 @@ static int mana_dealloc_queues(struct net_device *ndev)
 	 *
 	 * Drain all the in-flight TX packets
 	 */
+
+	timeout = jiffies + 120 * HZ;
 	for (i = 0; i < apc->num_queues; i++) {
 		txq = &apc->tx_qp[i].txq;
-
-		while (atomic_read(&txq->pending_sends) > 0)
+		while (atomic_read(&txq->pending_sends) > 0 &&
+		       time_before(jiffies, timeout))
 			usleep_range(1000, 2000);
 	}
 
+	for (i = 0; i < apc->num_queues; i++) {
+		txq = &apc->tx_qp[i].txq;
+		cq = &apc->tx_qp[i].tx_cq;
+		while (atomic_read(&txq->pending_sends)) {
+			skb = skb_dequeue(&txq->pending_skbs);
+			mana_unmap_skb(skb, apc);
+			napi_consume_skb(skb, cq->budget);
+			atomic_sub(1, &txq->pending_sends);
+		}
+	}
 	/* We're 100% sure the queues can no longer be woken up, because
 	 * we're sure now mana_poll_tx_cq() can't be running.
 	 */
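One detail of the cleanup loop in the diff may be worth illustrating: it is driven by the `pending_sends` counter, while `skb_dequeue()` returns NULL once its list is empty. Whether the counter and the list can ever disagree in this driver is not established anywhere in this thread, so the following is only a hedged userspace sketch of a list-driven alternative, not a proposed fix. All names (`model_buf`, `model_queue`, `model_dequeue`, `force_drain`) are hypothetical and none of them are kernel APIs.

```c
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
struct model_buf {
	struct model_buf *next;
};

struct model_queue {
	struct model_buf *head; /* singly linked list of queued buffers */
	int pending;            /* counter kept separately from the list */
};

/* Pop the first buffer, or return NULL when the list is empty
 * (the same contract skb_dequeue() has for an empty list). */
static struct model_buf *model_dequeue(struct model_queue *q)
{
	struct model_buf *b = q->head;

	if (b)
		q->head = b->next;
	return b;
}

/* Drain by walking the list rather than trusting the counter, then
 * reset the counter. Returns how many buffers were released. */
static int force_drain(struct model_queue *q)
{
	struct model_buf *b;
	int released = 0;

	while ((b = model_dequeue(q)) != NULL) {
		b->next = NULL; /* stands in for unmap + free */
		released++;
	}
	q->pending = 0;
	return released;
}
```

In this model a queue whose counter says 5 but whose list holds 2 buffers is drained without ever handing a NULL pointer to the release step, and the counter ends at zero either way.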