[RFC] ata: libata-eh: Retry the cmnd when normal complete occurrd after scsi timeout
Message ID | 1672220216-46938-1-git-send-email-wubo40@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp1792340wrt; Wed, 28 Dec 2022 01:08:32 -0800 (PST) X-Google-Smtp-Source: AMrXdXs0iuIhRQmfgvJNS/GBf+1wbIZDW9hb1q0hOsYOqsLyDXczNiQvPApvgrzX0qBGZbzWeNnl X-Received: by 2002:a05:6402:2296:b0:46a:96b3:22bf with SMTP id cw22-20020a056402229600b0046a96b322bfmr27225213edb.17.1672218512664; Wed, 28 Dec 2022 01:08:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1672218512; cv=none; d=google.com; s=arc-20160816; b=NOSFjaiJRX1fWsDF1uNdR11eQxN7vgMDt5eOgaPv1lP0GCnwN/SJteV/OE0uC6tcUj G/4jQoLaYlVQ2bAr9WT+9KsvMKpkoiuKqJeIq7vAJIF7PJZV7a9ZcXUUhAoNhlZi8I/v FBckE7kwmCDByrGlRSqW/s44k5J3R5UO3S4ZXhUAwvnjAlEv0zKB+3eugsE4pAYAtyly Zk1zXEL+OPiO1oa7pAPmnue/pyzWvsH+xYZndhOAeIjq0t68A41TFdh1f1jkkw82bV5j eqwtQekshUJZvPx8VBopO3RTdpfQ5+VFetVQVKJ91Ot5dALDqO1xeSEvxyK1bk781g2X G0uA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:subject:cc:to:from; bh=6OXbryU3GaRMN/PPFWT0jIsnWFfceVLqW3cvU5I7iIk=; b=Kag4eaSCtZ6V/M0vd5CnS4hccVA+sGbf7btOuK+m9HHEfad1QDMsuiOHL3h0rpl/ds wwdBFZZZyQ7NGdS+Eb5cCilQiMusdzfpKPpSKbZI6ycqTypzYEy3vHOQjI3mOW2Ybd1+ Pbhq12/VO8UYISrV40Iu50lVTV4n3V/ZTYSbsflfh8F+eGwVbZeuW9bBup6il56zWgkY nd2W2L2PSGZh0pDaHrFJiU+j6XDhjGHdqWfN/cuxCuXp6aNJEzp6vKqDSAzHstFunVFw XJbtpv5SxV7kce++3afylpLw0Dpds9vyZcVpswgV6gjavXir8CnHGG91yHyCmFy843xs lkmQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gn12-20020a1709070d0c00b007ae26c753edsi14395133ejc.52.2022.12.28.01.08.09; Wed, 28 Dec 2022 01:08:32 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229587AbiL1I4t (ORCPT <rfc822;eddaouddi.ayoub@gmail.com> + 99 others); Wed, 28 Dec 2022 03:56:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230080AbiL1I4s (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 28 Dec 2022 03:56:48 -0500 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB3DDBCB8; Wed, 28 Dec 2022 00:56:47 -0800 (PST) Received: from dggpeml500019.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4NhljF1P8vzmWr7; Wed, 28 Dec 2022 16:55:29 +0800 (CST) Received: from huawei.com (10.175.124.27) by dggpeml500019.china.huawei.com (7.185.36.137) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Wed, 28 Dec 2022 16:56:46 +0800 From: Wu Bo <wubo40@huawei.com> To: Damien Le Moal <damien.lemoal@opensource.wdc.com>, <linux-ide@vger.kernel.org>, <linux-kernel@vger.kernel.org> CC: <qiuchangqi.qiu@huawei.com>, <wubo40@huawei.com> Subject: [RFC PATCH] ata: libata-eh: Retry the cmnd when normal complete occurrd after scsi timeout Date: Wed, 28 Dec 2022 17:36:56 +0800 Message-ID: <1672220216-46938-1-git-send-email-wubo40@huawei.com> X-Mailer: git-send-email 1.8.3.1 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.175.124.27] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500019.china.huawei.com (7.185.36.137) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1753448199335546835?= X-GMAIL-MSGID: =?utf-8?q?1753448199335546835?= |
Series |
[RFC] ata: libata-eh: Retry the cmnd when normal complete occurrd after scsi timeout
|
|
Commit Message
Wu Bo
Dec. 28, 2022, 9:36 a.m. UTC
From: wubo <wubo40@huawei.com> Hi, Now SCSI middle layer EH and normal IO handler can only choose one of them, after the SCSI command is completed normally after scsi timeout period, Should this scenario be given a chance to retry? Signed-off-by: wubo <wubo40@huawei.com> --- drivers/ata/libata-eh.c | 8 +------- 1 file changed, 1 insertion(+), 7 deletions(-)
Comments
On Wed, Dec 28, 2022 at 05:36:56PM +0800, Wu Bo wrote: > From: wubo <wubo40@huawei.com> > > Hi, > > Now SCSI middle layer EH and normal IO handler can only choose one of them, > after the SCSI command is completed normally after scsi timeout period, > Should this scenario be given a chance to retry? > > Signed-off-by: wubo <wubo40@huawei.com> > --- > drivers/ata/libata-eh.c | 8 +------- > 1 file changed, 1 insertion(+), 7 deletions(-) > > diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c > index 34303ce..8d1856f 100644 > --- a/drivers/ata/libata-eh.c > +++ b/drivers/ata/libata-eh.c > @@ -617,14 +617,8 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, > qc->flags |= ATA_QCFLAG_FAILED; > nr_timedout++; > } > - } else { > - /* Normal completion occurred after > - * SCSI timeout but before this point. > - * Successfully complete it. > - */ > - scmd->retries = scmd->allowed; > + } else > scsi_eh_finish_cmd(scmd, &ap->eh_done_q); > - } > } > Hello Wu, The function comment looks like this: /* For new EH, all qcs are finished in one of three ways - * normal completion, error completion, and SCSI timeout. * Both completions can race against SCSI timeout. When normal * completion wins, the qc never reaches EH. When error * completion wins, the qc has ATA_QCFLAG_FAILED set. * * When SCSI timeout wins, things are a bit more complex. * Normal or error completion can occur after the timeout but * before this point. In such cases, both types of * completions are honored. A scmd is determined to have * timed out iff its associated qc is active and not failed. */ And the code looks like this: if (i < ATA_MAX_QUEUE) { /* the scmd has an associated qc */ if (!(qc->flags & ATA_QCFLAG_FAILED)) { /* which hasn't failed yet, timeout */ qc->err_mask |= AC_ERR_TIMEOUT; qc->flags |= ATA_QCFLAG_FAILED; nr_timedout++; } } else { /* Normal completion occurred after * SCSI timeout but before this point. * Successfully complete it. */ scmd->retries = scmd->allowed; scsi_eh_finish_cmd(scmd, &ap->eh_done_q); } If SCSI timeout wins, but there came an error completion after, then we will go into the if (i < ATA_MAX_QUEUE) case, but we will not enter the if !(qc->flags & ATA_QCFLAG_FAILED), as ATA_QCFLAG_FAILED will already be set by the irq handler. This QC will be completed by the ata_scsi_port_error_handler(), which gets to run just after this function has returned: https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/ata/libata-eh.c#L546 The else clause you are modifying however is for the case where SCSI timeout wins, but there came a normal completion occurred after the SCSI timeout. In more detail, what happens first is that scsi_timeout() gets called, and if scsi timeout wins, it sets SCMD_STATE_COMPLETE: https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_error.c#L355 and then schedules EH for that command using scsi_eh_scmd_add(). What happens next in this specific case is that the IRQ handler is called, takes the ap->lock (which is also taken is this function so that it can run at the same time as the IRQ handler), then the IRQ handler calls __ata_qc_complete() for the QC, however, when scsi_done() is finally called in ata_qc_done() (from IRQ context), it will not be a no-op, since SCMD_STATE_COMPLETE is already set: https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_lib.c#L1623 Since scsi_done() never finished the scsi_cmd, we need to finish it here, in the else clause, by calling scsi_eh_finish_cmd(). When the EH queue is flushed, it will check if scsi_cmd_retry_allowed() and if it is, the command will be retried, otherwise it will call scsi_finish() on the command: https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_error.c#L2150 Considering that we want to only finish the scmd here, libata sets scmd->retries = scmd->allowed; such that the check: return ++cmd->retries <= cmd->allowed; in scsi_cmd_retry_allowed() will evaulate to false. So TL;DR: It is absolutely essential to set scmd->retries = scmd->allowed; in this else clause, as that is the only reason why this command will be finished instead of retried. Since this else clause is for a command that timed out, but got completed successfully via the IRQ handler after timing out (so the QC got freed), we only need to finish the scmd. Retrying the scmd is wrong in this case. Kind regards, Niklas
On 2022/12/29 1:59, Niklas Cassel wrote: > On Wed, Dec 28, 2022 at 05:36:56PM +0800, Wu Bo wrote: >> From: wubo <wubo40@huawei.com> >> >> Hi, >> >> Now SCSI middle layer EH and normal IO handler can only choose one of them, >> after the SCSI command is completed normally after scsi timeout period, >> Should this scenario be given a chance to retry? >> >> Signed-off-by: wubo <wubo40@huawei.com> >> --- >> drivers/ata/libata-eh.c | 8 +------- >> 1 file changed, 1 insertion(+), 7 deletions(-) >> >> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c >> index 34303ce..8d1856f 100644 >> --- a/drivers/ata/libata-eh.c >> +++ b/drivers/ata/libata-eh.c >> @@ -617,14 +617,8 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, >> qc->flags |= ATA_QCFLAG_FAILED; >> nr_timedout++; >> } >> - } else { >> - /* Normal completion occurred after >> - * SCSI timeout but before this point. >> - * Successfully complete it. >> - */ >> - scmd->retries = scmd->allowed; >> + } else >> scsi_eh_finish_cmd(scmd, &ap->eh_done_q); >> - } >> } >> > > Hello Wu, > > > The function comment looks like this: > /* For new EH, all qcs are finished in one of three ways - > * normal completion, error completion, and SCSI timeout. > * Both completions can race against SCSI timeout. When normal > * completion wins, the qc never reaches EH. When error > * completion wins, the qc has ATA_QCFLAG_FAILED set. > * > * When SCSI timeout wins, things are a bit more complex. > * Normal or error completion can occur after the timeout but > * before this point. In such cases, both types of > * completions are honored. A scmd is determined to have > * timed out iff its associated qc is active and not failed. > */ > > And the code looks like this: > > if (i < ATA_MAX_QUEUE) { > /* the scmd has an associated qc */ > if (!(qc->flags & ATA_QCFLAG_FAILED)) { > /* which hasn't failed yet, timeout */ > qc->err_mask |= AC_ERR_TIMEOUT; > qc->flags |= ATA_QCFLAG_FAILED; > nr_timedout++; > } > } else { > /* Normal completion occurred after > * SCSI timeout but before this point. > * Successfully complete it. > */ > scmd->retries = scmd->allowed; > scsi_eh_finish_cmd(scmd, &ap->eh_done_q); > } > > > > If SCSI timeout wins, but there came an error completion after, > then we will go into the if (i < ATA_MAX_QUEUE) case, but we will > not enter the if !(qc->flags & ATA_QCFLAG_FAILED), as ATA_QCFLAG_FAILED > will already be set by the irq handler. This QC will be completed > by the ata_scsi_port_error_handler(), which gets to run just after > this function has returned: > https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/ata/libata-eh.c#L546 > > > > The else clause you are modifying however is for the case where SCSI timeout > wins, but there came a normal completion occurred after the SCSI timeout. > > In more detail, what happens first is that scsi_timeout() gets called, > and if scsi timeout wins, it sets SCMD_STATE_COMPLETE: > https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_error.c#L355 > and then schedules EH for that command using scsi_eh_scmd_add(). > > What happens next in this specific case is that the IRQ handler is called, > takes the ap->lock (which is also taken is this function so that it can run > at the same time as the IRQ handler), then the IRQ handler calls > __ata_qc_complete() for the QC, however, when scsi_done() is finally called > in ata_qc_done() (from IRQ context), it will not be a no-op, since > SCMD_STATE_COMPLETE is already set: > https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_lib.c#L1623 > > Since scsi_done() never finished the scsi_cmd, we need to finish it here, > in the else clause, by calling scsi_eh_finish_cmd(). > > When the EH queue is flushed, it will check if scsi_cmd_retry_allowed() > and if it is, the command will be retried, otherwise it will call scsi_finish() > on the command: > https://github.com/torvalds/linux/blob/v6.2-rc1/drivers/scsi/scsi_error.c#L2150 > > Considering that we want to only finish the scmd here, libata sets > scmd->retries = scmd->allowed; such that the check: > return ++cmd->retries <= cmd->allowed; > in scsi_cmd_retry_allowed() will evaulate to false. > > > So TL;DR: > It is absolutely essential to set scmd->retries = scmd->allowed; > in this else clause, as that is the only reason why this command will > be finished instead of retried. > Since this else clause is for a command that timed out, but got completed > successfully via the IRQ handler after timing out (so the QC got freed), > we only need to finish the scmd. Retrying the scmd is wrong in this case. > > > Kind regards, > Niklas. Hi Niklas, Thanks for your detailed reply. The case where SCSI timeout wins, but there came a normal completion occurred after the SCSI timeout. in this scenario, The EH process cannot get the command status from the IRQ handler process. If the command is not retried, it can only be returned to the upper layer with timeout state. Log as fllows: sd 1:0:0:0: [sda] tag#30 scsi_eh_1: flush retry cmd, scmd->retries:3, scmd->allowed:2 sd 1:0:0:0: [sda] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT sd 1:0:0:0: [sda] tag#30 CDB: Read(10) 28 00 07 5e e6 7b 00 00 01 00 print_req_error: I/O error, dev sda, sector 123659899 In fact, The I/O can be completed normally, but it takes a little longer. In this scenario, do we give the opportunity as much as possible so that the IO can return successfully?
On Thu, Dec 29, 2022 at 11:17:31AM +0800, Wu Bo wrote: > > Hi Niklas, > > Thanks for your detailed reply. > > The case where SCSI timeout wins, > but there came a normal completion occurred after the SCSI timeout. > in this scenario, The EH process cannot get the command status > from the IRQ handler process. > > If the command is not retried, it can only be returned to the upper layer > with timeout state. So the possible scenarios are: 1) command completed normally via IRQ handler 2) command completed with error via the IRQ hander 3) SCSI timeout, which is by default 30 seconds. No IRQ came for this specific QC for 30 seconds. You should always get a completion for a command even if it is error. If the HBA did not send an IRQ for this specific QC for 30 seconds, it is obviously misbehaving. libata will reset the HBA, and outstanding QCs will be retried. 4) SCSI timeout, so no IRQ came for this specific QC for 30 seconds, so EH gets scheduled for this QC, however it takes some milliseconds before the EH thread starts running, and then it needs to call the EH ->eh_strategy_handler() (ata_scsi_error()) which takes the ap->lock. Between the time EH gets scheduled and the ap->lock gets acquired, there comes a completion via the IRQ handler that completes the command normally. 5) Same as 4) but the IRQ handler completes the command with error. Note that 4) and 5) are very rare cases. 3) is way more common than 4) and 5). > > Log as fllows: > sd 1:0:0:0: [sda] tag#30 scsi_eh_1: flush retry cmd, scmd->retries:3, > scmd->allowed:2 > sd 1:0:0:0: [sda] tag#30 FAILED Result: hostbyte=DID_OK > driverbyte=DRIVER_TIMEOUT > sd 1:0:0:0: [sda] tag#30 CDB: Read(10) 28 00 07 5e e6 7b 00 00 01 00 > print_req_error: I/O error, dev sda, sector 123659899 How do you know that this is for scenario 4) ? You have no prints of the IRQ handler, and the QCs it completed. To me, it actually looks like scenario 3), especially since you have "flush retry cmd". In scenario 3), timed out commands will be retried after the HBA is reset. Do you see that the HBA gets reset in your log? If it was scenario 4) or 5) I would have expected a "flush finish cmd" print instead of a "flush retry cmd" print. Kind regards, Niklas
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index 34303ce..8d1856f 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -617,14 +617,8 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap, qc->flags |= ATA_QCFLAG_FAILED; nr_timedout++; } - } else { - /* Normal completion occurred after - * SCSI timeout but before this point. - * Successfully complete it. - */ - scmd->retries = scmd->allowed; + } else scsi_eh_finish_cmd(scmd, &ap->eh_done_q); - } } /* If we have timed out qcs. They belong to EH from