Message ID | 20231016020314.1269636-5-haowenchao2@huawei.com |
---|---|
State | New |
Headers |
From: Wenchao Hao <haowenchao2@huawei.com> To: "James E . J . Bottomley" <jejb@linux.ibm.com>, "Martin K . Petersen" <martin.petersen@oracle.com>, <linux-scsi@vger.kernel.org> CC: <linux-kernel@vger.kernel.org>, <louhongxiang@huawei.com>, Wenchao Hao <haowenchao2@huawei.com> Subject: [PATCH v3 4/4] scsi: scsi_core: Fix IO hang when device removing Date: Mon, 16 Oct 2023 10:03:14 +0800 Message-ID: <20231016020314.1269636-5-haowenchao2@huawei.com> In-Reply-To: <20231016020314.1269636-1-haowenchao2@huawei.com> References: <20231016020314.1269636-1-haowenchao2@huawei.com> |
Series | SCSI: Fix issues between removing device and error handle |
Commit Message
Wenchao Hao
Oct. 16, 2023, 2:03 a.m. UTC
shost_for_each_device() skips devices that are in the process of being
removed, so scsi_run_host_queues() does not call scsi_run_queue() for
these devices after the host's IO has been blocked.

An IO hang results if true is returned when the state is SDEV_CANCEL,
with the following order:

T1:                                          T2: scsi_error_handler
__scsi_remove_device()
  scsi_device_set_state(sdev, SDEV_CANCEL)
  ...
  sd_remove()
    del_gendisk()
      blk_mq_freeze_queue_wait()
                                             scsi_eh_flush_done_q()
                                               scsi_queue_insert(scmd,...)

scsi_queue_insert() has not kicked the device's queue since commit
8b566edbdbfb ("scsi: core: Only kick the requeue list if necessary").

After scsi_unjam_host(), the SCSI error handler calls
scsi_run_host_queues() to run the queues of the host's devices, but it
does not run the queues of devices that are being removed, because
shost_for_each_device() skips them.

So the requests added to these queues are never handled, and the
device removal process hangs as well.

Fix this issue by using shost_for_each_device_include_deleted() in
scsi_run_host_queues() so that the queues of devices being removed are
run too.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
---
drivers/scsi/scsi_lib.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
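
The hang described in the commit message can be illustrated with a small userspace model in C. This is only a sketch and not kernel code: the `model_*` names, the `struct model_sdev` type and the request counters are hypothetical stand-ins, and the only behavior taken from the description above is that the default iterator skips devices in SDEV_CANCEL or SDEV_DEL while the "include deleted" variant visits them.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for kernel types; not the real definitions. */
enum model_state { MODEL_SDEV_RUNNING, MODEL_SDEV_CANCEL, MODEL_SDEV_DEL };

struct model_sdev {
    enum model_state state;
    int requeued;   /* requests parked on the requeue list by EH */
    int dispatched; /* requests that were started again */
};

/* Model of scsi_run_queue(): start everything that was requeued. */
static void model_run_queue(struct model_sdev *sdev)
{
    sdev->dispatched += sdev->requeued;
    sdev->requeued = 0;
}

/*
 * Model of scsi_run_host_queues(). With include_deleted == 0 the loop
 * skips devices that are being removed, as the commit message says
 * shost_for_each_device() does. With include_deleted != 0 every device
 * is visited.
 */
static void model_run_host_queues(struct model_sdev *devs, size_t n,
                                  int include_deleted)
{
    for (size_t i = 0; i < n; i++) {
        int removing = devs[i].state == MODEL_SDEV_CANCEL ||
                       devs[i].state == MODEL_SDEV_DEL;

        if (removing && !include_deleted)
            continue;
        model_run_queue(&devs[i]);
    }
}

/* Count requests that are still parked and would never complete. */
static int model_stuck_requests(const struct model_sdev *devs, size_t n)
{
    int stuck = 0;

    for (size_t i = 0; i < n; i++)
        stuck += devs[i].requeued;
    return stuck;
}
```

In this model a request requeued on a device in SDEV_CANCEL stays parked after a run with `include_deleted == 0`, which corresponds to blk_mq_freeze_queue_wait() never returning in T1.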
Comments
On 10/15/23 9:03 PM, Wenchao Hao wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 195ca80667d0..40f407ffd26f 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -466,7 +466,7 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
>  {
>          struct scsi_device *sdev;
>
> -        shost_for_each_device(sdev, shost)
> +        shost_for_each_device_include_deleted(sdev, shost)
>                  scsi_run_queue(sdev->request_queue);

What happens if there were no commands for the device that was
destroyed and we race with this code and device deletion?

So thread1 has set the device state to SDEV_DEL and has finished
blk_mq_destroy_queue because there were no commands running.

The EH thread above is then calling:

scsi_run_queue -> blk_mq_kick_requeue_list

and that queues the requeue work.

blk_mq_destroy_queue had done blk_mq_cancel_work_sync but
blk_mq_kick_requeue_list just added it back on the kblockd_workqueue.

When __scsi_iterate_devices does scsi_device_put it would call
scsi_device_dev_release and call blk_put_queue which frees the
request_queue while its requeue work might still be queued on
kblockd_workqueue.
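
The ordering problem raised in this comment can be sketched as a tiny state model in C. It is a deliberately simplified illustration, not the block layer: `struct model_queue`, the single reference count and all `model_*` functions are assumptions made for the example, and the "hazard" is only detected, never actually triggered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a request_queue: one refcount, one work item. */
struct model_queue {
    int refcount;
    bool requeue_work_pending;
    bool freed;
};

/* Model of blk_mq_destroy_queue(): cancels pending requeue work. */
static void model_destroy_queue(struct model_queue *q)
{
    q->requeue_work_pending = false;
}

/* Model of blk_mq_kick_requeue_list(): queues the requeue work. */
static void model_kick_requeue_list(struct model_queue *q)
{
    q->requeue_work_pending = true;
}

/*
 * Model of dropping a reference. The last put frees the queue. Returns
 * true if the queue was freed while requeue work was still pending,
 * which is the situation described in the comment above.
 */
static bool model_put_queue(struct model_queue *q)
{
    assert(q->refcount > 0);
    if (--q->refcount > 0)
        return false;
    q->freed = true;
    return q->requeue_work_pending;
}

/*
 * Replay removal racing with the EH thread. The EH iterator holds one
 * reference and the removal path holds the other.
 */
static bool model_remove_vs_eh(bool eh_kicks_after_destroy)
{
    struct model_queue q = { .refcount = 2 };
    bool hazard;

    if (!eh_kicks_after_destroy)
        model_kick_requeue_list(&q); /* EH ran before removal */

    /* Removal thread: destroy the queue, then drop its reference. */
    model_destroy_queue(&q);
    hazard = model_put_queue(&q);

    if (eh_kicks_after_destroy)
        model_kick_requeue_list(&q); /* work is queued again */

    /* EH iterator drops the final reference. */
    return model_put_queue(&q) || hazard;
}
```

Only the ordering where the kick happens after the destroy step reports a hazard in this model; when the kick comes first, the cancel in the destroy step removes the pending work before the final put.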
On 11/14/23 3:23 PM, Mike Christie wrote:
> When __scsi_iterate_devices does scsi_device_put it would call
> scsi_device_dev_release and call blk_put_queue which frees the
> request_queue while its requeue work might still be queued on
> kblockd_workqueue.
>

Oh yeah, for your other lun/target reset patches were you trying to do
something where you have a list for each scsi_device or a list of
scsi_devices that needed error handler work? If so, maybe break that
part out and use it here first. You can then just loop over the list of
devices that needed work and start those above.
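
One possible reading of the list suggestion above, sketched as a userspace model in C. Nothing here exists in the SCSI core: `struct model_host`, the `eh_devs` list and the `model_*` helpers are hypothetical names chosen for the illustration, and locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device with an intrusive link for the host's EH list. */
struct model_sdev {
    int queue_runs;             /* how often the queue was started */
    int on_eh_list;             /* guards against double insertion */
    struct model_sdev *eh_next;
};

/* Hypothetical host that tracks only devices needing EH work. */
struct model_host {
    struct model_sdev *eh_devs;
};

/* Record that a command on this device entered error handling. */
static void model_eh_add_device(struct model_host *host,
                                struct model_sdev *sdev)
{
    if (sdev->on_eh_list)
        return;
    sdev->on_eh_list = 1;
    sdev->eh_next = host->eh_devs;
    host->eh_devs = sdev;
}

/*
 * After recovery, start only the devices on the list and empty it.
 * Returns the number of queues that were started.
 */
static int model_run_eh_device_queues(struct model_host *host)
{
    int started = 0;

    while (host->eh_devs) {
        struct model_sdev *sdev = host->eh_devs;

        host->eh_devs = sdev->eh_next;
        sdev->eh_next = NULL;
        sdev->on_eh_list = 0;
        sdev->queue_runs++;
        started++;
    }
    return started;
}
```

Devices that never had a failed command are not touched at all in this scheme, so an idle device that is concurrently being deleted would not have its requeue list kicked.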
On 11/15/23 5:47 AM, Mike Christie wrote:
> On 11/14/23 3:23 PM, Mike Christie wrote:
>> What happens if there were no commands for the device that
>> was destroyed and we race with this code and device deletion?
>>
>> So thread1 has set the device state to SDEV_DEL and has finished
>> blk_mq_destroy_queue because there were no commands running.
>>
>> The EH thread above is then calling:
>>
>> scsi_run_queue -> blk_mq_kick_requeue_list
>>
>> and that queues the requeue work.
>>
>> blk_mq_destroy_queue had done blk_mq_cancel_work_sync but
>> blk_mq_kick_requeue_list just added it back on the kblockd_workqueue.
>>
>> When __scsi_iterate_devices does scsi_device_put it would call
>> scsi_device_dev_release and call blk_put_queue which frees the
>> request_queue while its requeue work might still be queued on
>> kblockd_workqueue.
>>

Hi Mike, thank you for the review. Sorry, I did not take the above flow
into consideration; it is a bug that should be fixed in the next
version.

> Oh yeah, for your other lun/target reset patches were you trying to
> do something where you have a list for each scsi_device or a list of
> scsi_devices that needed error handler work? If so, maybe break that
> part out and use it here first.
>

The lun/target reset changes are not general for all drivers in my
design, so they would not work here.

> You can then just loop over the list of devices that needed work and
> start those above.

What about introducing a new flag "recovery" for each scsi_device to
mark whether a command on it has failed? The new flag is set in
scsi_eh_scmd_add() and cleared after error handling has finished. Since
the flag is always cleared after scsi_error_handler() is woken up, and
scsi_eh_scmd_add() is not called any more once scsi_error_handler() has
been woken up, we do not need a lock between setting and clearing this
flag.

This change can help me fix the issue you described above too.

Here is a brief outline of the changes:

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index c67cdcdc3ba8..36af294c2cef 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -310,6 +310,8 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
         if (shost->eh_deadline != -1 && !shost->last_reset)
                 shost->last_reset = jiffies;
 
+        scmd->device->recovery = 1;
+
         scsi_eh_reset(scmd);
         list_add_tail(&scmd->eh_entry, &shost->eh_cmd_q);
         spin_unlock_irqrestore(shost->host_lock, flags);
@@ -2149,7 +2151,7 @@ static void scsi_restart_operations(struct Scsi_Host *shost)
          * now that error recovery is done, we will need to ensure that these
          * requests are started.
          */
-        scsi_run_host_queues(shost);
+        scsi_run_host_recovery_queues(shost);
 
         /*
          * if eh is active and host_eh_scheduled is pending we need to re-run
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index cf3864f72093..0bf4423b6b9a 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -470,6 +470,17 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
                 scsi_run_queue(sdev->request_queue);
 }
 
+void scsi_run_host_recovery_queues(struct Scsi_Host *shost)
+{
+        struct scsi_device *sdev;
+
+        shost_for_each_device_include_deleted(sdev, shost)
+                if (sdev->recovery) {
+                        scsi_run_queue(sdev->request_queue);
+                        sdev->recovery = 0;
+                }
+}
+
 static void scsi_uninit_cmd(struct scsi_cmnd *cmd)
 {
         if (!blk_rq_is_passthrough(scsi_cmd_to_rq(cmd))) {
diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
index 3f0dfb97db6b..3aba8ddd0101 100644
--- a/drivers/scsi/scsi_priv.h
+++ b/drivers/scsi/scsi_priv.h
@@ -107,6 +107,7 @@ extern void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd);
 extern void scsi_queue_insert(struct scsi_cmnd *cmd, int reason);
 extern void scsi_io_completion(struct scsi_cmnd *, unsigned int);
 extern void scsi_run_host_queues(struct Scsi_Host *shost);
+extern void scsi_run_host_recovery_queues(struct Scsi_Host *shost);
 extern void scsi_requeue_run_queue(struct work_struct *work);
 extern void scsi_start_queue(struct scsi_device *sdev);
 extern int scsi_mq_setup_tags(struct Scsi_Host *shost);
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index 10480eb582b2..b730ceab9996 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -239,6 +239,7 @@ struct scsi_device {
         unsigned cdl_supported:1; /* Command duration limits supported */
         unsigned cdl_enable:1; /* Enable/disable Command duration limits */
+        unsigned recovery; /* Mark it error command happened */
         unsigned int queue_stopped; /* request queue is quiesced */
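
The set-then-clear behavior of the proposed "recovery" flag can be checked with a small userspace model in C. This is only a sketch of the idea in the diff above, not the kernel implementation: `struct model_sdev` and the `model_*` functions are hypothetical, the device array stands in for the host's device list, and the race with device deletion is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device carrying only what the flag scheme needs. */
struct model_sdev {
    unsigned recovery; /* set when a command on this device enters EH */
    int queue_runs;    /* how often the queue was started */
};

/* Model of the scsi_eh_scmd_add() change: mark the owning device. */
static void model_eh_scmd_add(struct model_sdev *sdev)
{
    sdev->recovery = 1;
}

/*
 * Model of the proposed scsi_run_host_recovery_queues(): run only the
 * queues of marked devices and clear the mark afterwards. Returns the
 * number of queues that were run.
 */
static int model_run_host_recovery_queues(struct model_sdev *devs, size_t n)
{
    int ran = 0;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].recovery)
            continue;
        devs[i].queue_runs++;
        devs[i].recovery = 0;
        ran++;
    }
    return ran;
}
```

A second call without any new failed command runs no queue at all, which is the property that keeps idle devices, including ones being deleted, out of the post-recovery restart.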
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 195ca80667d0..40f407ffd26f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -466,7 +466,7 @@ void scsi_run_host_queues(struct Scsi_Host *shost)
 {
         struct scsi_device *sdev;
 
-        shost_for_each_device(sdev, shost)
+        shost_for_each_device_include_deleted(sdev, shost)
                 scsi_run_queue(sdev->request_queue);
 }