Message ID | 20231016020314.1269636-1-haowenchao2@huawei.com |
---|---|
Headers |
From: Wenchao Hao <haowenchao2@huawei.com> To: "James E . J . Bottomley" <jejb@linux.ibm.com>, "Martin K . Petersen" <martin.petersen@oracle.com>, <linux-scsi@vger.kernel.org> CC: <linux-kernel@vger.kernel.org>, <louhongxiang@huawei.com>, Wenchao Hao <haowenchao2@huawei.com> Subject: [PATCH v3 0/4] SCSI: Fix issues between removing device and error handle Date: Mon, 16 Oct 2023 10:03:10 +0800 Message-ID: <20231016020314.1269636-1-haowenchao2@huawei.com> |
Series |
SCSI: Fix issues between removing device and error handle
|
|
Message
Wenchao Hao
Oct. 16, 2023, 2:03 a.m. UTC
I am testing the SCSI error handler with my previous scsi_debug error
injection patches, and found some issues when device removal and error
handling happen at the same time.

These issues are triggered because devices that are being removed are
skipped when calling shost_for_each_device().

Three issues were found:
1. The statistics printed at the beginning of scsi_error_handler are wrong
2. Device reset is not triggered
3. IO requeued to the request_queue hangs after error handling

V3:
- Update patch description
- Update comments of functions added

V2:
- Fix IO hang by running all devices' queues after the error handler
- Do not modify shost_for_each_device() directly; instead add a new
  helper that iterates devices without skipping those being removed

Wenchao Hao (4):
  scsi: core: Add new helper to iterate all devices of host
  scsi: scsi_error: Fix wrong statistic when print error info
  scsi: scsi_error: Fix device reset is not triggered
  scsi: scsi_core: Fix IO hang when device removing

 drivers/scsi/scsi.c        | 46 ++++++++++++++++++++++++++------------
 drivers/scsi/scsi_error.c  |  4 ++--
 drivers/scsi/scsi_lib.c    |  2 +-
 include/scsi/scsi_device.h | 25 ++++++++++++++++++---
 4 files changed, 57 insertions(+), 20 deletions(-)
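To illustrate why skipping devices during iteration can give a wrong statistic, here is a minimal user-space sketch. It is not kernel code and not taken from the patches: `struct model_device`, `MODEL_REMOVING`, and both counting functions are hypothetical names made up for this example. It only assumes, as the cover letter states, that one iteration style skips devices that are being removed while the proposed helper visits every device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel structures. */
enum model_state { MODEL_RUNNING, MODEL_REMOVING };

struct model_device {
    int id;
    enum model_state state;
    int failed_cmds; /* commands waiting for the error handler */
};

/*
 * Models an iteration that skips devices being removed, which is the
 * behaviour the cover letter attributes to shost_for_each_device().
 */
int count_failed_skip_removing(const struct model_device *devs, size_t n)
{
    int total = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (devs[i].state == MODEL_REMOVING)
            continue; /* failed commands on this device go uncounted */
        total += devs[i].failed_cmds;
    }
    return total;
}

/* Models an iteration that visits every device regardless of state. */
int count_failed_all(const struct model_device *devs, size_t n)
{
    int total = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        total += devs[i].failed_cmds;
    return total;
}
```

With three modelled devices holding 2, 3, and 0 failed commands, where the second device is in the removing state, the skipping count reports 2 while the full count reports 5. The same kind of gap would plausibly explain the other two symptoms in the list (a device that is never visited is never reset and never has its queue run), though that is an inference from the cover letter rather than something this sketch demonstrates.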
Comments
On Mon, Oct 16, 2023 at 10:04 AM Wenchao Hao <haowenchao2@huawei.com> wrote:
>
> I am testing SCSI error handle with my previous scsi_debug error
> injection patches, and found some issues when removing device and
> error handler happened together.
>
> These issues are triggered because devices in removing would be skipped
> when calling shost_for_each_device().

Friendly ping...

>
> Three issues are found:
> 1. statistic info printed at beginning of scsi_error_handler is wrong
> 2. device reset is not triggered
> 3. IO requeued to request_queue would be hang after error handle
>
> V3:
> - Update patch description
> - Update comments of functions added
>
> V2:
> - Fix IO hang by run all devices' queue after error handler
> - Do not modify shost_for_each_device() directly but add a new
>   helper to iterate devices but do not skip devices in removing
>
> Wenchao Hao (4):
>   scsi: core: Add new helper to iterate all devices of host
>   scsi: scsi_error: Fix wrong statistic when print error info
>   scsi: scsi_error: Fix device reset is not triggered
>   scsi: scsi_core: Fix IO hang when device removing
>
>  drivers/scsi/scsi.c        | 46 ++++++++++++++++++++++++++------------
>  drivers/scsi/scsi_error.c  |  4 ++--
>  drivers/scsi/scsi_lib.c    |  2 +-
>  include/scsi/scsi_device.h | 25 ++++++++++++++++++---
>  4 files changed, 57 insertions(+), 20 deletions(-)
>
> --
> 2.32.0
>
On 10/17/23 10:00, Wenchao Hao wrote:
> On Mon, Oct 16, 2023 at 10:04 AM Wenchao Hao <haowenchao2@huawei.com> wrote:
>>
>> I am testing SCSI error handle with my previous scsi_debug error
>> injection patches, and found some issues when removing device and
>> error handler happened together.
>>
>> These issues are triggered because devices in removing would be skipped
>> when calling shost_for_each_device().
>
> Friendly ping...

The patch series was posted on October 15, 7 PM PDT and the ping has
been posted on October 17, 10 AM PDT. That's less than two days after
the patch series was posted. Isn't that way too soon to post a "ping"?

Thanks,

Bart.
On 10/17/23 18:37, Wenchao Hao wrote:
> The previous version was posted on 2023/9/28 but not reviewed, so I
> ping soon after repost.

Since a repost counts as a ping, I think posting a ping soon after
reposting is considered aggressive.

Bart.
On Wed, Oct 18, 2023 at 9:51 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 10/17/23 18:37, Wenchao Hao wrote:
> > The previous version was posted on 2023/9/28 but not reviewed, so I
> > ping soon after repost.
>
> Since a repost counts as a ping, I think posting a ping soon after
> reposting is considered aggressive.
>

I didn't mean it that way. How long is it appropriate to wait before
posting a ping?

Thanks.

> Bart.
On 10/18/23 07:40, Wenchao Hao wrote:
> On Wed, Oct 18, 2023 at 9:51 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On 10/17/23 18:37, Wenchao Hao wrote:
>>> The previous version was posted on 2023/9/28 but not reviewed, so I
>>> ping soon after repost.
>>
>> Since a repost counts as a ping, I think posting a ping soon after
>> reposting is considered aggressive.
>
> I didn't mean that, then how long is appropriate to post a ping?

It depends on how busy the reviewers are. I wait at least one week
before sending out a ping or reposting a patch series.

Bart.