Message ID | 20230928073543.3496394-1-haowenchao2@huawei.com |
---|---|
Headers |
From: Wenchao Hao <haowenchao2@huawei.com> To: "James E . J . Bottomley" <jejb@linux.ibm.com>, "Martin K . Petersen" <martin.petersen@oracle.com>, <linux-scsi@vger.kernel.org> CC: <linux-kernel@vger.kernel.org>, <louhongxiang@huawei.com>, Wenchao Hao <haowenchao2@huawei.com> Subject: [PATCH v2 0/4] SCSI: Fix issues between removing device and error handle Date: Thu, 28 Sep 2023 15:35:39 +0800 Message-ID: <20230928073543.3496394-1-haowenchao2@huawei.com> X-Mailer: git-send-email 2.32.0 List-ID: <linux-kernel.vger.kernel.org> |
Series | SCSI: Fix issues between removing device and error handle |
Message
Wenchao Hao
Sept. 28, 2023, 7:35 a.m. UTC
I am testing SCSI error handling with my earlier scsi_debug error
injection patches, and found some issues when device removal and the
error handler run at the same time.

These issues are triggered because devices that are being removed are
skipped by shost_for_each_device().

Three issues were found:
1. The statistics printed at the beginning of scsi_error_handler are wrong
2. Device reset is not triggered
3. I/O requeued to the request_queue hangs after error handling

V2:
- Fix the I/O hang by running all devices' queues after the error handler
- Do not modify shost_for_each_device() directly; instead add a new
  helper that iterates over devices without skipping those being removed

Wenchao Hao (4):
  scsi: core: Add new helper to iterate all devices of host
  scsi: scsi_error: Fix wrong statistic when print error info
  scsi: scsi_error: Fix device reset is not triggered
  scsi: scsi_core: Fix IO hang when device removing

 drivers/scsi/scsi.c        | 43 +++++++++++++++++++++++++-------------
 drivers/scsi/scsi_error.c  |  4 ++--
 drivers/scsi/scsi_lib.c    |  2 +-
 include/scsi/scsi_device.h | 25 +++++++++++++++++++---
 4 files changed, 53 insertions(+), 21 deletions(-)
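To illustrate why an iterator that skips devices under removal produces issues 1 and 3, here is a minimal userspace sketch. It is a model only, not the kernel code and not the helper added by this series: `model_dev`, `model_host`, `model_for_each_dev`, `count_failed` and `restart_queues` are all hypothetical names, and the two-state device model is a deliberate simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified device states (the real kernel has more). */
enum model_state { MODEL_RUNNING, MODEL_REMOVING };

struct model_dev {
	enum model_state state;
	int failed_cmds;   /* commands waiting for the error handler */
	int queue_stalled; /* 1 if requeued I/O is waiting for a queue run */
};

struct model_host {
	struct model_dev *devs;
	size_t ndevs;
};

/* Return the first index >= i to visit, or h->ndevs when done.
 * With skip_removing set, devices under removal are passed over. */
static size_t next_dev(const struct model_host *h, size_t i, int skip_removing)
{
	while (i < h->ndevs && skip_removing &&
	       h->devs[i].state == MODEL_REMOVING)
		i++;
	return i;
}

/* Hypothetical iterator macro; skip selects the skipping behaviour. */
#define model_for_each_dev(h, i, skip)                         \
	for ((i) = next_dev((h), 0, (skip)); (i) < (h)->ndevs; \
	     (i) = next_dev((h), (i) + 1, (skip)))

/* Issue 1: the failed-command total is too low if devices are skipped. */
int count_failed(const struct model_host *h, int skip_removing)
{
	size_t i;
	int total = 0;

	assert(h != NULL);
	model_for_each_dev(h, i, skip_removing)
		total += h->devs[i].failed_cmds;
	return total;
}

/* Issue 3: queues of skipped devices are never restarted.
 * Returns how many queues are still stalled afterwards. */
int restart_queues(struct model_host *h, int skip_removing)
{
	size_t i;
	int still_stalled = 0;

	assert(h != NULL);
	model_for_each_dev(h, i, skip_removing)
		h->devs[i].queue_stalled = 0;
	for (i = 0; i < h->ndevs; i++)
		still_stalled += h->devs[i].queue_stalled;
	return still_stalled;
}
```

In this model, a host with one running device holding 2 failed commands and one removing device holding 3 reports a total of 2 with the skipping iterator and 5 with the non-skipping one. Likewise, the removing device's stalled queue is only restarted when the iteration does not skip it.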
Comments
On 2023/9/28 15:35, Wenchao Hao wrote:
> I am testing SCSI error handle with my previous scsi_debug error
> injection patches, and found some issues when removing device and
> error handler happened together.
>
> These issues are triggered because devices in removing would be skipped
> when calling shost_for_each_device().
>

ping...

> [...]
On 2023/9/28 15:35, Wenchao Hao wrote:
> [...]
> Three issues are found:
> 1. statistic info printed at beginning of scsi_error_handler is wrong
> 2. device reset is not triggered
> 3. IO requeued to request_queue would be hang after error handle
>

These patches fix bugs that are easy to reproduce when device removal
and error handling happen together, so a friendly ping again...

> [...]
On 2023/9/28 15:35, Wenchao Hao wrote:
> [...]
> Three issues are found:
> 1. statistic info printed at beginning of scsi_error_handler is wrong
> 2. device reset is not triggered
> 3. IO requeued to request_queue would be hang after error handle
>

Hi Martin, would you help review these patches?

> [...]