Message ID | 20221220125349.45091-1-yangxingui@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp2951190wrn; Tue, 20 Dec 2022 05:03:54 -0800 (PST) X-Google-Smtp-Source: AA0mqf4xvx3HvNYbqxvwe3a7RX+jsYQczBQc0YFF8IRB0XhKDITSaJibeuBMtEr3MWR5M15nJ3YQ X-Received: by 2002:a05:6a21:78aa:b0:af:7774:4de4 with SMTP id bf42-20020a056a2178aa00b000af77744de4mr38155762pzc.22.1671541433888; Tue, 20 Dec 2022 05:03:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671541433; cv=none; d=google.com; s=arc-20160816; b=tT1A9xBGLUcVOOwFtU4JTN7PtKNMEGTLaI9m1Y5enVep8v73D2nTKedd6aUWealYyj Haa+sT9WWrfCxwRwRzMRs6fyh5Nw6U9Y3XSE3gRt4k1bRNMM6BWBUR7tgxWkQqmNvVtN 5sEn3wbihWkNxFb/Iu+fTZctpwuVYMsKmlWJcMB+PectT+FyHxpNvIc4evHk70GsLDDP crfpL3EgEH1a6JDx0dWNKue0ICx8KDMxN7vSY5bCkP4hM3eIvgNZrzxouPWs2BINjOkM MwWov/hbM0qHhQUWSGK8kJmCNuTEKfgO9bB5lKvPqyYgN9F4Nd0UrDnwYsYpgWyJM5b3 ycsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:subject:cc:to:from; bh=w3mTkAx4XqteeLjriGSryAuendUiyLbrCWT3xKmkUNQ=; b=VdKpc/fmL8kQdODLYSWOUduJmqaCZI25x3yanGyORiEA9OCUgDwnOsTCMTIUYP60kF W4lsjGgLec0tn4QI5wBeAAtrcJ+Cfxn5Qd0Dz5lhtWtWa+aiG1ITZh0An+czDzgRmmrp 2kB8P5DepAc0rjzxsL2vpt5l0wtHxYyijjU9vc/2XiPpFtQdIx7hCmQ0fAnQx5NgP5zc pBT1juqGszL3g39rjgaCrRrjCmzOqE7Y6QAeRtuv1TfKnOqGQthm8+Kke6FtP8S365RC Rt7VyVH/otBvMG5BT9jznqYW3sdKrjx61Rsx2rdisKyqge8uhb+TK96qJwsrtuZbPTMV ROEg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j63-20020a638b42000000b00477f864cf87si13938148pge.24.2022.12.20.05.03.40; Tue, 20 Dec 2022 05:03:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233739AbiLTNAi (ORCPT <rfc822;abdi.embedded@gmail.com> + 99 others); Tue, 20 Dec 2022 08:00:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230189AbiLTNA2 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 20 Dec 2022 08:00:28 -0500 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36C35DEE7; Tue, 20 Dec 2022 05:00:27 -0800 (PST) Received: from dggpemm500012.china.huawei.com (unknown [172.30.72.54]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4NbxVC0T72zmWkd; Tue, 20 Dec 2022 20:59:15 +0800 (CST) Received: from localhost.localdomain (10.67.165.24) by dggpemm500012.china.huawei.com (7.185.36.89) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Tue, 20 Dec 2022 21:00:21 +0800 From: Xingui Yang <yangxingui@huawei.com> To: <jejb@linux.ibm.com>, <martin.petersen@oracle.com>, <john.g.garry@oracle.com> CC: <linux-scsi@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linuxarm@huawei.com>, <yangxingui@huawei.com>, <prime.zeng@hisilicon.com>, <kangfenglong@huawei.com> Subject: [PATCH] scsi: libsas: Grab the host lock in sas_ata_device_link_abort() Date: Tue, 20 Dec 2022 12:53:49 +0000 Message-ID: <20221220125349.45091-1-yangxingui@huawei.com> X-Mailer: git-send-email 2.17.1 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.67.165.24] X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500012.china.huawei.com (7.185.36.89) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752738230986275164?= X-GMAIL-MSGID: =?utf-8?q?1752738230986275164?= |
Series |
scsi: libsas: Grab the host lock in sas_ata_device_link_abort()
|
|
Commit Message
yangxingui
Dec. 20, 2022, 12:53 p.m. UTC
Grab the host lock in sas_ata_device_link_abort() before calling
ata_link_abort(), as the comment in ata_link_abort() mentions.
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
---
drivers/scsi/libsas/sas_ata.c | 3 +++
1 file changed, 3 insertions(+)
Comments
On 20/12/2022 12:53, Xingui Yang wrote: > Grab the host lock in sas_ata_device_link_abort() before calling This is should be the ata port lock, right? I know that the ata comments say differently. > ata_link_abort(), as the comment in ata_link_abort() mentions. > Can you please add a fixes tag? > Signed-off-by: Xingui Yang <yangxingui@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> > --- > drivers/scsi/libsas/sas_ata.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c > index f7439bf9cdc6..4f2017b21e6d 100644 > --- a/drivers/scsi/libsas/sas_ata.c > +++ b/drivers/scsi/libsas/sas_ata.c > @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) > { > struct ata_port *ap = device->sata_dev.ap; > struct ata_link *link = &ap->link; > + unsigned long flags; > > + spin_lock_irqsave(ap->lock, flags); > device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ > device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ > > @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) > if (force_reset) > link->eh_info.action |= ATA_EH_RESET; > ata_link_abort(link); > + spin_unlock_irqrestore(ap->lock, flags); > } > EXPORT_SYMBOL_GPL(sas_ata_device_link_abort); >
On 2022/12/20 23:59, John Garry wrote: > On 20/12/2022 12:53, Xingui Yang wrote: >> Grab the host lock in sas_ata_device_link_abort() before calling > > This is should be the ata port lock, right? I know that the ata comments > say differently. > >> ata_link_abort(), as the comment in ata_link_abort() mentions. >> > > Can you please add a fixes tag? > >> Signed-off-by: Xingui Yang <yangxingui@huawei.com> > > Reviewed-by: John Garry <john.g.garry@oracle.com> > >> --- >> drivers/scsi/libsas/sas_ata.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c >> index f7439bf9cdc6..4f2017b21e6d 100644 >> --- a/drivers/scsi/libsas/sas_ata.c >> +++ b/drivers/scsi/libsas/sas_ata.c >> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >> { >> struct ata_port *ap = device->sata_dev.ap; >> struct ata_link *link = &ap->link; >> + unsigned long flags; >> >> + spin_lock_irqsave(ap->lock, flags); >> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >> >> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >> if (force_reset) >> link->eh_info.action |= ATA_EH_RESET; >> ata_link_abort(link); Really need to add lockdep annotations in libata to avoid/catch such bugs... Will work on that. >> + spin_unlock_irqrestore(ap->lock, flags); >> } >> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort); >> >
On 2022/12/20 22:59, John Garry wrote: > On 20/12/2022 12:53, Xingui Yang wrote: >> Grab the host lock in sas_ata_device_link_abort() before calling > > This is should be the ata port lock, right? I know that the ata comments > say differently. ok, I will update the commit message and use ata port lock instead. > >> ata_link_abort(), as the comment in ata_link_abort() mentions. >> > > Can you please add a fixes tag? ok, I will update and add a fix tag. Thanks, Xingui > >> Signed-off-by: Xingui Yang <yangxingui@huawei.com> > > Reviewed-by: John Garry <john.g.garry@oracle.com> > >> --- >> drivers/scsi/libsas/sas_ata.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/scsi/libsas/sas_ata.c >> b/drivers/scsi/libsas/sas_ata.c >> index f7439bf9cdc6..4f2017b21e6d 100644 >> --- a/drivers/scsi/libsas/sas_ata.c >> +++ b/drivers/scsi/libsas/sas_ata.c >> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct >> domain_device *device, bool force_reset) >> { >> struct ata_port *ap = device->sata_dev.ap; >> struct ata_link *link = &ap->link; >> + unsigned long flags; >> + spin_lock_irqsave(ap->lock, flags); >> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct >> domain_device *device, bool force_reset) >> if (force_reset) >> link->eh_info.action |= ATA_EH_RESET; >> ata_link_abort(link); >> + spin_unlock_irqrestore(ap->lock, flags); >> } >> EXPORT_SYMBOL_GPL(sas_ata_device_link_abort); > > .
On 2022/12/21 8:36, Damien Le Moal wrote: > On 2022/12/20 23:59, John Garry wrote: >> On 20/12/2022 12:53, Xingui Yang wrote: >>> Grab the host lock in sas_ata_device_link_abort() before calling >> >> This is should be the ata port lock, right? I know that the ata comments >> say differently. >> >>> ata_link_abort(), as the comment in ata_link_abort() mentions. >>> >> >> Can you please add a fixes tag? >> >>> Signed-off-by: Xingui Yang <yangxingui@huawei.com> >> >> Reviewed-by: John Garry <john.g.garry@oracle.com> >> >>> --- >>> drivers/scsi/libsas/sas_ata.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c >>> index f7439bf9cdc6..4f2017b21e6d 100644 >>> --- a/drivers/scsi/libsas/sas_ata.c >>> +++ b/drivers/scsi/libsas/sas_ata.c >>> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>> { >>> struct ata_port *ap = device->sata_dev.ap; >>> struct ata_link *link = &ap->link; >>> + unsigned long flags; >>> >>> + spin_lock_irqsave(ap->lock, flags); >>> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >>> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >>> >>> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>> if (force_reset) >>> link->eh_info.action |= ATA_EH_RESET; >>> ata_link_abort(link); > > Really need to add lockdep annotations in libata to avoid/catch such bugs... > Will work on that. Actually in libata there are many places calling ata_link_abort() not holding port lock. And some places are holding the real host lock(ata_host->lock) while calling ata_link_abort(). So if you add the lockdep annotations, there may be too many warnings. If these are real issues, we should fix them first. Thanks, Jason
On 2022/12/21 11:42, Jason Yan wrote: > On 2022/12/21 8:36, Damien Le Moal wrote: >> On 2022/12/20 23:59, John Garry wrote: >>> On 20/12/2022 12:53, Xingui Yang wrote: >>>> Grab the host lock in sas_ata_device_link_abort() before calling >>> >>> This is should be the ata port lock, right? I know that the ata comments >>> say differently. >>> >>>> ata_link_abort(), as the comment in ata_link_abort() mentions. >>>> >>> >>> Can you please add a fixes tag? >>> >>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com> >>> >>> Reviewed-by: John Garry <john.g.garry@oracle.com> >>> >>>> --- >>>> drivers/scsi/libsas/sas_ata.c | 3 +++ >>>> 1 file changed, 3 insertions(+) >>>> >>>> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c >>>> index f7439bf9cdc6..4f2017b21e6d 100644 >>>> --- a/drivers/scsi/libsas/sas_ata.c >>>> +++ b/drivers/scsi/libsas/sas_ata.c >>>> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>>> { >>>> struct ata_port *ap = device->sata_dev.ap; >>>> struct ata_link *link = &ap->link; >>>> + unsigned long flags; >>>> >>>> + spin_lock_irqsave(ap->lock, flags); >>>> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >>>> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >>>> >>>> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>>> if (force_reset) >>>> link->eh_info.action |= ATA_EH_RESET; >>>> ata_link_abort(link); >> >> Really need to add lockdep annotations in libata to avoid/catch such bugs... >> Will work on that. > > Actually in libata there are many places calling ata_link_abort() not > holding port lock. And some places are holding the real host > lock(ata_host->lock) while calling ata_link_abort(). So if you add the > lockdep annotations, there may be too many warnings. If these are real > issues, we should fix them first. libata-EH does most of its work without the port lock held because by the time we get EH started, we are guaranteed to be idle with no commands in flight. That is why the calls you mention look like "bugs" but are not. lockdep annotation will be a little tricky, but not too hard to do either. > > Thanks, > Jason
On 2022/12/21 11:59, Damien Le Moal wrote: > On 2022/12/21 11:42, Jason Yan wrote: >> On 2022/12/21 8:36, Damien Le Moal wrote: >>> On 2022/12/20 23:59, John Garry wrote: >>>> On 20/12/2022 12:53, Xingui Yang wrote: >>>>> Grab the host lock in sas_ata_device_link_abort() before calling >>>> >>>> This is should be the ata port lock, right? I know that the ata comments >>>> say differently. >>>> >>>>> ata_link_abort(), as the comment in ata_link_abort() mentions. >>>>> >>>> >>>> Can you please add a fixes tag? >>>> >>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com> >>>> >>>> Reviewed-by: John Garry <john.g.garry@oracle.com> >>>> >>>>> --- >>>>> drivers/scsi/libsas/sas_ata.c | 3 +++ >>>>> 1 file changed, 3 insertions(+) >>>>> >>>>> diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c >>>>> index f7439bf9cdc6..4f2017b21e6d 100644 >>>>> --- a/drivers/scsi/libsas/sas_ata.c >>>>> +++ b/drivers/scsi/libsas/sas_ata.c >>>>> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>>>> { >>>>> struct ata_port *ap = device->sata_dev.ap; >>>>> struct ata_link *link = &ap->link; >>>>> + unsigned long flags; >>>>> >>>>> + spin_lock_irqsave(ap->lock, flags); >>>>> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >>>>> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >>>>> >>>>> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) >>>>> if (force_reset) >>>>> link->eh_info.action |= ATA_EH_RESET; >>>>> ata_link_abort(link); >>> >>> Really need to add lockdep annotations in libata to avoid/catch such bugs... >>> Will work on that. >> >> Actually in libata there are many places calling ata_link_abort() not >> holding port lock. And some places are holding the real host >> lock(ata_host->lock) while calling ata_link_abort(). So if you add the >> lockdep annotations, there may be too many warnings. If these are real >> issues, we should fix them first. > > libata-EH does most of its work without the port lock held because by the time > we get EH started, we are guaranteed to be idle with no commands in flight. That > is why the calls you mention look like "bugs" but are not. What about the interrupt handler such as ahci_error_intr()? I didn't see the callers hold the port lock too. Do they need the port lock?
On 12/21/22 15:35, Jason Yan wrote: > On 2022/12/21 11:59, Damien Le Moal wrote: >> On 2022/12/21 11:42, Jason Yan wrote: >>> On 2022/12/21 8:36, Damien Le Moal wrote: >>>> On 2022/12/20 23:59, John Garry wrote: >>>>> On 20/12/2022 12:53, Xingui Yang wrote: >>>>>> Grab the host lock in sas_ata_device_link_abort() before calling >>>>> >>>>> This is should be the ata port lock, right? I know that the ata >>>>> comments >>>>> say differently. >>>>> >>>>>> ata_link_abort(), as the comment in ata_link_abort() mentions. >>>>>> >>>>> >>>>> Can you please add a fixes tag? >>>>> >>>>>> Signed-off-by: Xingui Yang <yangxingui@huawei.com> >>>>> >>>>> Reviewed-by: John Garry <john.g.garry@oracle.com> >>>>> >>>>>> --- >>>>>> drivers/scsi/libsas/sas_ata.c | 3 +++ >>>>>> 1 file changed, 3 insertions(+) >>>>>> >>>>>> diff --git a/drivers/scsi/libsas/sas_ata.c >>>>>> b/drivers/scsi/libsas/sas_ata.c >>>>>> index f7439bf9cdc6..4f2017b21e6d 100644 >>>>>> --- a/drivers/scsi/libsas/sas_ata.c >>>>>> +++ b/drivers/scsi/libsas/sas_ata.c >>>>>> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct >>>>>> domain_device *device, bool force_reset) >>>>>> { >>>>>> struct ata_port *ap = device->sata_dev.ap; >>>>>> struct ata_link *link = &ap->link; >>>>>> + unsigned long flags; >>>>>> + spin_lock_irqsave(ap->lock, flags); >>>>>> device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ >>>>>> device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ >>>>>> @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct >>>>>> domain_device *device, bool force_reset) >>>>>> if (force_reset) >>>>>> link->eh_info.action |= ATA_EH_RESET; >>>>>> ata_link_abort(link); >>>> >>>> Really need to add lockdep annotations in libata to avoid/catch such >>>> bugs... >>>> Will work on that. >>> >>> Actually in libata there are many places calling ata_link_abort() not >>> holding port lock. And some places are holding the real host >>> lock(ata_host->lock) while calling ata_link_abort(). So if you add the >>> lockdep annotations, there may be too many warnings. If these are real >>> issues, we should fix them first. >> >> libata-EH does most of its work without the port lock held because by >> the time >> we get EH started, we are guaranteed to be idle with no commands in >> flight. That >> is why the calls you mention look like "bugs" but are not. > > What about the interrupt handler such as ahci_error_intr()? I didn't see > the callers hold the port lock too. Do they need the port lock? It looks like it is missing for ahci_thunderx_irq_handler() but that one takes the host lock. Same for xgene_ahci_irq_intr(), again no port lock but host lock taken. And again for ahci_single_level_irq_intr() for the non MSI case. For modern MSI adapters, the port lock is taken in For other cases, ahci_multi_irqs_intr_hard) takes the port lock. So it looks like ahci_port_intr() needs to take the lock and some cleanups overall (the host lock should not be necessary in the command path. But nobody seems to have issues with the "bad" cases... Probably because they are not mainstream adapters. Definitely some work needed here.
On Wed, Dec 21, 2022 at 05:31:59PM +0900, Damien Le Moal wrote: > > > > What about the interrupt handler such as ahci_error_intr()? I didn't see > > the callers hold the port lock too. Do they need the port lock? > > It looks like it is missing for ahci_thunderx_irq_handler() but that one > takes the host lock. Same for xgene_ahci_irq_intr(), again no port lock > but host lock taken. And again for ahci_single_level_irq_intr() for the > non MSI case. For modern MSI adapters, the port lock is taken in > > For other cases, ahci_multi_irqs_intr_hard) takes the port lock. > > So it looks like ahci_port_intr() needs to take the lock and some > cleanups overall (the host lock should not be necessary in the command > path. But nobody seems to have issues with the "bad" cases... Probably > because they are not mainstream adapters. > > Definitely some work needed here. ahci_multi_irqs_intr_hard() takes the ap->lock before calling ahci_handle_port_interrupt(), which calls ahci_port_intr(), so I don't think there is any work needed for multi IRQ AHCI. For ahci_single_level_irq_intr() the host lock is taken before calling ahci_handle_port_intr(), so I don't see why we need any extra work for single IRQ AHCI. Remember, while the default is that: ap->lock = &host->lock; see: https://github.com/torvalds/linux/blob/v6.1/drivers/ata/libata-core.c#L5305 In case of MULTI MSI, the ap->lock is using its own lock: https://github.com/torvalds/linux/blob/v6.1/drivers/ata/libahci.c#L2460 So what is it that needs to be fixed for AHCI? I haven't looked at ahci_thunderx_irq_handler() and xgene_ahci_irq_intr() so I can't speak for these. Kind regards, Niklas
On 2022/12/21 18:44, Niklas Cassel wrote: > On Wed, Dec 21, 2022 at 05:31:59PM +0900, Damien Le Moal wrote: >>> >>> What about the interrupt handler such as ahci_error_intr()? I didn't see >>> the callers hold the port lock too. Do they need the port lock? >> >> It looks like it is missing for ahci_thunderx_irq_handler() but that one >> takes the host lock. Same for xgene_ahci_irq_intr(), again no port lock >> but host lock taken. And again for ahci_single_level_irq_intr() for the >> non MSI case. For modern MSI adapters, the port lock is taken in >> >> For other cases, ahci_multi_irqs_intr_hard) takes the port lock. >> >> So it looks like ahci_port_intr() needs to take the lock and some >> cleanups overall (the host lock should not be necessary in the command >> path. But nobody seems to have issues with the "bad" cases... Probably >> because they are not mainstream adapters. >> >> Definitely some work needed here. > > ahci_multi_irqs_intr_hard() takes the ap->lock before calling > ahci_handle_port_interrupt(), which calls ahci_port_intr(), > so I don't think there is any work needed for multi IRQ AHCI. Yes. I tried to say that above. > > For ahci_single_level_irq_intr() the host lock is taken before > calling ahci_handle_port_intr(), so I don't see why we need any > extra work for single IRQ AHCI. > > > Remember, while the default is that: > ap->lock = &host->lock; Ah ! Yes ! good point ! So there are no issues anywhere. This is only confusing because of the comments I think. > see: > https://github.com/torvalds/linux/blob/v6.1/drivers/ata/libata-core.c#L5305 > > In case of MULTI MSI, the ap->lock is using its own lock: > https://github.com/torvalds/linux/blob/v6.1/drivers/ata/libahci.c#L2460 > > > So what is it that needs to be fixed for AHCI? > > I haven't looked at ahci_thunderx_irq_handler() and xgene_ahci_irq_intr() > so I can't speak for these. These are not multi-irq and use the &host->lock directly, which is the same as taking the ap->lock. We could clean that up for consistency and always use ap->lock. But not a big deal. No bugs, just a little confusing with a simple reading of the code. > > > Kind regards, > Niklas
diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c index f7439bf9cdc6..4f2017b21e6d 100644 --- a/drivers/scsi/libsas/sas_ata.c +++ b/drivers/scsi/libsas/sas_ata.c @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) { struct ata_port *ap = device->sata_dev.ap; struct ata_link *link = &ap->link; + unsigned long flags; + spin_lock_irqsave(ap->lock, flags); device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */ device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */ @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct domain_device *device, bool force_reset) if (force_reset) link->eh_info.action |= ATA_EH_RESET; ata_link_abort(link); + spin_unlock_irqrestore(ap->lock, flags); } EXPORT_SYMBOL_GPL(sas_ata_device_link_abort);