Message ID | 20240222090251.2849702-2-haifeng.zhao@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:aa16:b0:108:e6aa:91d0 with SMTP id by22csp186982dyb; Thu, 22 Feb 2024 03:30:17 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXYaa+PCq3IqbBRmnLj31RFdsnymsENYNMr3qZi9xQ0aKl22OK7o3YtmZF+uYw7Wlk9Zygl01WfQF8ZjEVwykMv1IodKw== X-Google-Smtp-Source: AGHT+IH38QaFDCYJOr0vbPa63rRTGriCiVMHIs51eM48/PBdCEdw07aTOm843EkbQHxxxP28Van9 X-Received: by 2002:a92:de48:0:b0:365:4b91:7cf9 with SMTP id e8-20020a92de48000000b003654b917cf9mr6364350ilr.26.1708601417267; Thu, 22 Feb 2024 03:30:17 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708601417; cv=pass; d=google.com; s=arc-20160816; b=cy9Yhy88FDe45sn0RhhHhJ0GJ2UvaxhjEFKzMbI6JjCQbr61Eh32iHc/ilSnYktAfQ nMLoZdkdmFWmiYjWAYce6U0Cl9FOBUxu9/3xY+epQVmB1AI8jUXiScvOB/zmPdoNZ3Fq MQUs97QPYeyUGRJC5fqHZrxtJQZ+eTUfbswJlTBe20ZAhavuxUir+jkiQBLjbZjTvdFJ ifAsZ7Uq/4MBuYGCUqBzYQge1aZfFfJACU/l7hQkKJCw9lN2GzFIjUoYIYZeAAp36wT8 xsZkKXFmwBN1wnxdnkeklq7OPBXFpnhv6r5ruBjzQiH5jzHHUNT2BcE63Y6iN7Zwtvbg M1zA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=D+b2YYnMcglSGwbebBuWw2/9+8HM1SsUieOBFSaXBS4=; fh=1X0ut6BtlQnZ1MoQItaqwKIGCqk6qNGAbHMp6ZD3eIQ=; b=CHzpfjD4hxeQq3j3jr4/HBg0/qDLH5ZP0WcreSpZpdif7Ewrbe6hrghNPgEKKSSX6B zPdRavToJzGmSwb3U0EMPoRcHjEZq85A66XIDs3CET5OYtxVAmbM3kLeEBaTHrkaXfbP YNHQ9VD1cVTFczrrtP0oeznYjeLib4gSorVINVS4b41t2Kx1juIDcmCvqQhlR4jHVI4t 7D+5XzEkg04l0WmR86fkvRB94+7K9csdvGkObLdZHJkKciQ7USw5hDf8H/YBpG1t6gu1 kFrHRRfKInS2Ntq0i7r5SdIMFrBI4Kzd8tVjoiuiSjJc296OtIbfioTTtWBoU9SBwxGi fsbA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jPtVE1rM; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id l25-20020a63ba59000000b005d64d951c89si10356624pgu.143.2024.02.22.03.30.17 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Feb 2024 03:30:17 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=jPtVE1rM; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-76171-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id E27A42855EC for <ouuuleilei@gmail.com>; Thu, 22 Feb 2024 09:03:47 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 9EAE4381B4; Thu, 22 Feb 2024 09:03:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="jPtVE1rM" Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A54F52261F; Thu, 22 Feb 2024 09:03:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708592600; cv=none; b=bgAmgNy09GXei7DLfkBYHE93Z6G07sW5la1NoJgyaA68slq9vZ+PIepaFWMGvwIHw3hHmbeCnQKsDRlVSNe2jwXXIzqtRIIPMZfdBC7XzUcq69f7Hl4aUfPE+jUumb63GbddupmokzfP0U8IgsiDKMbieOjYoCuXkwoP+5Dt2Ys= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708592600; c=relaxed/simple; bh=qGvJdjdFOeQfZsaVurZBwtpkSGEkZOdq9FP4c2dHoM4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=To5J69wEInxLDajnhYrGD4sWKv0/CMO1Xf7bMJLn+GgJRhdOeZnHJh4D//7Zx8gRmSD0kEMILgCXwH0R8mLa92mrzZF2YEBiGE5rXHZhZC6sOUULvFMC2eQK1EgrVcLUWezV679hpQsV7vlnEDucN4Iv0v/RyS/fL+GUxHqbdjY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=jPtVE1rM; arc=none smtp.client-ip=198.175.65.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708592599; x=1740128599; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qGvJdjdFOeQfZsaVurZBwtpkSGEkZOdq9FP4c2dHoM4=; b=jPtVE1rMxKHh2Ugo3xYKE74tM0fGb/N47Igw5vE6QKjmAqbwzWy5ah8F vCeoD31d4w92b8bJ+dxNoFj+C5+2XJb29DFfm/8von0aL2p7ouWok9UH9 GOo+ap0J8tGGdQEH/Rh70MbD45rwnco6BAoifXnPRKqjqkvu3taT0fX0q nreS5Ho0hsULeVVTEB03xvRF4xGzbRimA7tW47X2gvZYSZnWfIEdrRDwS 1KXkHYuHmFDqz0+0HbeGi5gjOAoHtcaJPFxpoUu+Yl2VGGm6KrmwN+oVq SV91V0GeP+r6PK9IrIO5XRTq4tjhLNU4MqokRc5mYS2E4yO9eTE7dFNpm A==; X-IronPort-AV: E=McAfee;i="6600,9927,10991"; a="25264451" X-IronPort-AV: E=Sophos;i="6.06,177,1705392000"; d="scan'208";a="25264451" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by orvoesa101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Feb 2024 01:03:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.06,177,1705392000"; d="scan'208";a="10122632" Received: from unknown (HELO ply01-vm-store.amr.corp.intel.com) ([10.238.153.201]) by orviesa003.jf.intel.com with ESMTP; 22 Feb 2024 01:03:14 -0800 From: Ethan Zhao <haifeng.zhao@linux.intel.com> To: baolu.lu@linux.intel.com, bhelgaas@google.com, robin.murphy@arm.com, jgg@ziepe.ca Cc: kevin.tian@intel.com, dwmw2@infradead.org, will@kernel.org, lukas@wunner.de, yi.l.liu@intel.com, dan.carpenter@linaro.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, Ethan Zhao <haifeng.zhao@linux.intel.com>, Haorong Ye <yehaorong@bytedance.com> Subject: [PATCH v13 1/3] PCI: make pci_dev_is_disconnected() helper public for other drivers Date: Thu, 22 Feb 2024 04:02:49 -0500 Message-Id: <20240222090251.2849702-2-haifeng.zhao@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20240222090251.2849702-1-haifeng.zhao@linux.intel.com> References: <20240222090251.2849702-1-haifeng.zhao@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791598439860825140 X-GMAIL-MSGID: 1791598439860825140 |
Series |
fix vt-d hard lockup when hotplug ATS capable device
|
|
Commit Message
Ethan Zhao
Feb. 22, 2024, 9:02 a.m. UTC
Make pci_dev_is_disconnected() public so that it can be called from Intel VT-d driver to quickly fix/workaround the surprise removal unplug hang issue for those ATS capable devices on PCIe switch downstream hotplug capable ports. Beside pci_device_is_present() function, this one has no config space space access, so is light enough to optimize the normal pure surprise removal and safe removal flow. Tested-by: Haorong Ye <yehaorong@bytedance.com> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com> --- drivers/pci/pci.h | 5 ----- include/linux/pci.h | 5 +++++ 2 files changed, 5 insertions(+), 5 deletions(-)
Comments
On 2024/2/22 17:02, Ethan Zhao wrote: > Make pci_dev_is_disconnected() public so that it can be called from > Intel VT-d driver to quickly fix/workaround the surprise removal > unplug hang issue for those ATS capable devices on PCIe switch downstream > hotplug capable ports. > > Beside pci_device_is_present() function, this one has no config space > space access, so is light enough to optimize the normal pure surprise > removal and safe removal flow. > > Tested-by: Haorong Ye<yehaorong@bytedance.com> > Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com> > --- > drivers/pci/pci.h | 5 ----- > include/linux/pci.h | 5 +++++ > 2 files changed, 5 insertions(+), 5 deletions(-) Hi PCI subsystem maintainers, The iommu drivers (including, but not limited to, the Intel VT-d driver) require a helper to check the physical presence of a PCI device in two scenarios: - During the iommu_release_device() path: This ensures the device is physically present before sending device TLB invalidation to device. - During the device driver lifecycle when a device TLB invalidation timeout event is generated by the IOMMU hardware: This helps handle situations where the device might have been hot-removed. While there may be some adjustments needed in patch 3/3, I'd like to confirm with you whether it's feasible to expose this helper for general use within the iommu subsystem. If you agree with this change, I can route this patch to Linus through the iommu tree with an "acked-by" or "reviewed-by" tag from you. Best regards, baolu
I'm not a PCI maintainer, but first two patches seem good to me. For
the third patch, my complaints were really minor. Let's just add a
Fixes tag for sure, the rest is okay.
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
regards,
dan carpenter
On 2/23/2024 2:47 PM, Dan Carpenter wrote: > I'm not a PCI maintainer, but first two patches seem good to me. For > the third patch, my complaints were really minor. Let's just add a > Fixes tag for sure, the rest is okay. > > Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Thanks Dan. > > regards, > dan carpenter >
On Thu, Feb 22, 2024 at 08:54:54PM +0800, Baolu Lu wrote: > On 2024/2/22 17:02, Ethan Zhao wrote: > > Make pci_dev_is_disconnected() public so that it can be called from > > Intel VT-d driver to quickly fix/workaround the surprise removal > > unplug hang issue for those ATS capable devices on PCIe switch downstream > > hotplug capable ports. > > > > Beside pci_device_is_present() function, this one has no config space > > space access, so is light enough to optimize the normal pure surprise > > removal and safe removal flow. > > > > Tested-by: Haorong Ye<yehaorong@bytedance.com> > > Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com> > > --- > > drivers/pci/pci.h | 5 ----- > > include/linux/pci.h | 5 +++++ > > 2 files changed, 5 insertions(+), 5 deletions(-) > > Hi PCI subsystem maintainers, > > The iommu drivers (including, but not limited to, the Intel VT-d driver) > require a helper to check the physical presence of a PCI device in two > scenarios: > > - During the iommu_release_device() path: This ensures the device is > physically present before sending device TLB invalidation to device. This wording is fundamentally wrong. Testing pci_dev_is_disconnected() can never ensure the device will still be present by the time a TLB invalidation is sent. The device may be removed after the pci_dev_is_disconnected() test and before a TLB invalidate is sent. This is why I hesitate to expose pci_dev_is_disconnected() (and pci_device_is_present(), which we already export) outside drivers/pci/. They both lead to terrible mistakes like relying on the false assumption that the result will remain valid after the functions return, without any recognition that we MUST be able to deal with the cases where that assumption is broken. This series claims to avoid "continuous hard lockup warnings and system hangs". It may reduce the likelihood, but I don't think it can completely avoid them. I don't see any acknowledgement of that in the commit logs of this series. E.g., it doesn't say "we can recover from ATS Invalidate Completion Timeouts, but the timeouts are on the order of minutes, so we want to avoid them when possible." And given the "system hangs" language, I am not convinced that we actually *can* recover from those timeouts. Using pci_dev_is_disconnected() may make those timeouts less frequent and give the illusion that the problem is "solved", but it just means the problem is still there and harder to reproduce. > - During the device driver lifecycle when a device TLB invalidation > timeout event is generated by the IOMMU hardware: This helps handle > situations where the device might have been hot-removed. > > While there may be some adjustments needed in patch 3/3, I'd like to > confirm with you whether it's feasible to expose this helper for general > use within the iommu subsystem. > > If you agree with this change, I can route this patch to Linus through > the iommu tree with an "acked-by" or "reviewed-by" tag from you. > > Best regards, > baolu
On 2/27/2024 7:05 AM, Bjorn Helgaas wrote: > On Thu, Feb 22, 2024 at 08:54:54PM +0800, Baolu Lu wrote: >> On 2024/2/22 17:02, Ethan Zhao wrote: >>> Make pci_dev_is_disconnected() public so that it can be called from >>> Intel VT-d driver to quickly fix/workaround the surprise removal >>> unplug hang issue for those ATS capable devices on PCIe switch downstream >>> hotplug capable ports. >>> >>> Beside pci_device_is_present() function, this one has no config space >>> space access, so is light enough to optimize the normal pure surprise >>> removal and safe removal flow. >>> >>> Tested-by: Haorong Ye<yehaorong@bytedance.com> >>> Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com> >>> --- >>> drivers/pci/pci.h | 5 ----- >>> include/linux/pci.h | 5 +++++ >>> 2 files changed, 5 insertions(+), 5 deletions(-) >> Hi PCI subsystem maintainers, >> >> The iommu drivers (including, but not limited to, the Intel VT-d driver) >> require a helper to check the physical presence of a PCI device in two >> scenarios: >> >> - During the iommu_release_device() path: This ensures the device is >> physically present before sending device TLB invalidation to device. > This wording is fundamentally wrong. Testing > pci_dev_is_disconnected() can never ensure the device will still be > present by the time a TLB invalidation is sent. The logic of testing pci_dev_is_disconnected() in patch [2/3] works in the opposite: 1. if pci_dev_is_disconnected() return true, means the device is in the process of surprise removal handling, adapter already been removed from the slot. 2. for removed device, no need to send ATS invalidation request to it. removed device lost power, its devTLB wouldn't be valid anymore. 3. if pci_dev_is_disconnected() return false, the device is *likely* to be removed at any momoment after this function called. such case will be treated in the iommu ITE fault handling, not to retry the timeout request if device isn't present (patch [3/3]). > > The device may be removed after the pci_dev_is_disconnected() test and > before a TLB invalidate is sent. even in the process while TLB is invalidating. > > This is why I hesitate to expose pci_dev_is_disconnected() (and > pci_device_is_present(), which we already export) outside > drivers/pci/. They both lead to terrible mistakes like relying on the > false assumption that the result will remain valid after the functions > return, without any recognition that we MUST be able to deal with the > cases where that assumption is broken. Yup, your concern is worthy ,but isn't happening within this patchset. > > This series claims to avoid "continuous hard lockup warnings and > system hangs". It may reduce the likelihood, but I don't think it can > completely avoid them. It doesn't try to close the race window, actually we are doing post-fault handling in patch [3/3]. > > I don't see any acknowledgement of that in the commit logs of this > series. E.g., it doesn't say "we can recover from ATS Invalidate > Completion Timeouts, but the timeouts are on the order of minutes, so > we want to avoid them when possible." > And given the "system hangs" language, I am not convinced that we > actually *can* recover from those timeouts. It is testing in customer's environment. separatly, patch[3/3] vs patch [2/3]. > > Using pci_dev_is_disconnected() may make those timeouts less frequent will test patch[3/3] alone, if couldn't recover from the 100% ITE fault case, we will look for other method. Thanks, Ethan > and give the illusion that the problem is "solved", but it just means > the problem is still there and harder to reproduce. > >> - During the device driver lifecycle when a device TLB invalidation >> timeout event is generated by the IOMMU hardware: This helps handle >> situations where the device might have been hot-removed. >> >> While there may be some adjustments needed in patch 3/3, I'd like to >> confirm with you whether it's feasible to expose this helper for general >> use within the iommu subsystem. >> >> If you agree with this change, I can route this patch to Linus through >> the iommu tree with an "acked-by" or "reviewed-by" tag from you. >> >> Best regards, >> baolu
On Thu, Feb 22, 2024 at 04:02:49AM -0500, Ethan Zhao wrote: > Make pci_dev_is_disconnected() public so that it can be called from > Intel VT-d driver to quickly fix/workaround the surprise removal > unplug hang issue for those ATS capable devices on PCIe switch downstream > hotplug capable ports. > > Beside pci_device_is_present() function, this one has no config space > space access, so is light enough to optimize the normal pure surprise > removal and safe removal flow. > > Tested-by: Haorong Ye <yehaorong@bytedance.com> > Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> > --- > drivers/pci/pci.h | 5 ----- > include/linux/pci.h | 5 +++++ > 2 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h > index e9750b1b19ba..bfc56f7bee1c 100644 > --- a/drivers/pci/pci.h > +++ b/drivers/pci/pci.h > @@ -368,11 +368,6 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused) > return 0; > } > > -static inline bool pci_dev_is_disconnected(const struct pci_dev *dev) > -{ > - return dev->error_state == pci_channel_io_perm_failure; > -} > - > /* pci_dev priv_flags */ > #define PCI_DEV_ADDED 0 > #define PCI_DPC_RECOVERED 1 > diff --git a/include/linux/pci.h b/include/linux/pci.h > index 7ab0d13672da..213109d3c601 100644 > --- a/include/linux/pci.h > +++ b/include/linux/pci.h > @@ -2517,6 +2517,11 @@ static inline struct pci_dev *pcie_find_root_port(struct pci_dev *dev) > return NULL; > } > > +static inline bool pci_dev_is_disconnected(const struct pci_dev *dev) > +{ > + return dev->error_state == pci_channel_io_perm_failure; > +} > + > void pci_request_acs(void); > bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); > bool pci_acs_path_enabled(struct pci_dev *start, > -- > 2.31.1 >
On Thu, Feb 29, 2024 at 09:58:43AM +0800, Ethan Zhao wrote: > On 2/27/2024 7:05 AM, Bjorn Helgaas wrote: > > On Thu, Feb 22, 2024 at 08:54:54PM +0800, Baolu Lu wrote: > > > On 2024/2/22 17:02, Ethan Zhao wrote: > > > > Make pci_dev_is_disconnected() public so that it can be called from > > > > Intel VT-d driver to quickly fix/workaround the surprise removal > > > > unplug hang issue for those ATS capable devices on PCIe switch downstream > > > > hotplug capable ports. > > > > > > > > Beside pci_device_is_present() function, this one has no config space > > > > space access, so is light enough to optimize the normal pure surprise > > > > removal and safe removal flow. > > > > > > > > Tested-by: Haorong Ye<yehaorong@bytedance.com> > > > > Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com> > > > > --- > > > > drivers/pci/pci.h | 5 ----- > > > > include/linux/pci.h | 5 +++++ > > > > 2 files changed, 5 insertions(+), 5 deletions(-) > > > Hi PCI subsystem maintainers, > > > > > > The iommu drivers (including, but not limited to, the Intel VT-d driver) > > > require a helper to check the physical presence of a PCI device in two > > > scenarios: > > > > > > - During the iommu_release_device() path: This ensures the device is > > > physically present before sending device TLB invalidation to device. > > This wording is fundamentally wrong. Testing > > pci_dev_is_disconnected() can never ensure the device will still be > > present by the time a TLB invalidation is sent. > > The logic of testing pci_dev_is_disconnected() in patch [2/3] works > in the opposite: > > 1. if pci_dev_is_disconnected() return true, means the device is in > the process of surprise removal handling, adapter already been > removed from the slot. > > 2. for removed device, no need to send ATS invalidation request to it. > removed device lost power, its devTLB wouldn't be valid anymore. > > 3. if pci_dev_is_disconnected() return false, the device is *likely* > to be removed at any momoment after this function called. > such case will be treated in the iommu ITE fault handling, not to > retry the timeout request if device isn't present (patch [3/3]). > > > The device may be removed after the pci_dev_is_disconnected() test and > > before a TLB invalidate is sent. > > even in the process while TLB is invalidating. > > > This is why I hesitate to expose pci_dev_is_disconnected() (and > > pci_device_is_present(), which we already export) outside > > drivers/pci/. They both lead to terrible mistakes like relying on the > > false assumption that the result will remain valid after the functions > > return, without any recognition that we MUST be able to deal with the > > cases where that assumption is broken. > > Yup, your concern is worthy ,but isn't happening within this patchset. OK, I acked the patch. I guess my complaint is really with pci_device_is_present() because that's even harder to use correctly. pci_device_is_present(): slow (may do config access to device) true => device *was* present in the recent past, may not be now false => device is not accessible pci_dev_is_disconnected(): fast (doesn't touch device) true => device is not accessible false => basically means nothing I guess they're both hard ;) Bjorn
On 3/1/2024 6:33 AM, Bjorn Helgaas wrote: > On Thu, Feb 29, 2024 at 09:58:43AM +0800, Ethan Zhao wrote: >> On 2/27/2024 7:05 AM, Bjorn Helgaas wrote: >>> On Thu, Feb 22, 2024 at 08:54:54PM +0800, Baolu Lu wrote: >>>> On 2024/2/22 17:02, Ethan Zhao wrote: >>>>> Make pci_dev_is_disconnected() public so that it can be called from >>>>> Intel VT-d driver to quickly fix/workaround the surprise removal >>>>> unplug hang issue for those ATS capable devices on PCIe switch downstream >>>>> hotplug capable ports. >>>>> >>>>> Beside pci_device_is_present() function, this one has no config space >>>>> space access, so is light enough to optimize the normal pure surprise >>>>> removal and safe removal flow. >>>>> >>>>> Tested-by: Haorong Ye<yehaorong@bytedance.com> >>>>> Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com> >>>>> --- >>>>> drivers/pci/pci.h | 5 ----- >>>>> include/linux/pci.h | 5 +++++ >>>>> 2 files changed, 5 insertions(+), 5 deletions(-) >>>> Hi PCI subsystem maintainers, >>>> >>>> The iommu drivers (including, but not limited to, the Intel VT-d driver) >>>> require a helper to check the physical presence of a PCI device in two >>>> scenarios: >>>> >>>> - During the iommu_release_device() path: This ensures the device is >>>> physically present before sending device TLB invalidation to device. >>> This wording is fundamentally wrong. Testing >>> pci_dev_is_disconnected() can never ensure the device will still be >>> present by the time a TLB invalidation is sent. >> The logic of testing pci_dev_is_disconnected() in patch [2/3] works >> in the opposite: >> >> 1. if pci_dev_is_disconnected() return true, means the device is in >> the process of surprise removal handling, adapter already been >> removed from the slot. >> >> 2. for removed device, no need to send ATS invalidation request to it. >> removed device lost power, its devTLB wouldn't be valid anymore. >> >> 3. if pci_dev_is_disconnected() return false, the device is *likely* >> to be removed at any momoment after this function called. >> such case will be treated in the iommu ITE fault handling, not to >> retry the timeout request if device isn't present (patch [3/3]). >> >>> The device may be removed after the pci_dev_is_disconnected() test and >>> before a TLB invalidate is sent. >> even in the process while TLB is invalidating. >> >>> This is why I hesitate to expose pci_dev_is_disconnected() (and >>> pci_device_is_present(), which we already export) outside >>> drivers/pci/. They both lead to terrible mistakes like relying on the >>> false assumption that the result will remain valid after the functions >>> return, without any recognition that we MUST be able to deal with the >>> cases where that assumption is broken. >> Yup, your concern is worthy ,but isn't happening within this patchset. > OK, I acked the patch. Great ! > > I guess my complaint is really with pci_device_is_present() because > that's even harder to use correctly. > > pci_device_is_present(): > slow (may do config access to device) > true => device *was* present in the recent past, may not be now > false => device is not accessible so the 'false' result is reliable for post-calling code, then give up more attempt of the same request. the usage in patch[3/3] > > pci_dev_is_disconnected(): > fast (doesn't touch device) > true => device is not accessible also we are relying on the 'true' result returned, used in patch[2/3]. > false => basically means nothing > > I guess they're both hard ;) seems I didn't mess them up. :) Thanks, Ethan > > Bjorn
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h index e9750b1b19ba..bfc56f7bee1c 100644 --- a/drivers/pci/pci.h +++ b/drivers/pci/pci.h @@ -368,11 +368,6 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused) return 0; } -static inline bool pci_dev_is_disconnected(const struct pci_dev *dev) -{ - return dev->error_state == pci_channel_io_perm_failure; -} - /* pci_dev priv_flags */ #define PCI_DEV_ADDED 0 #define PCI_DPC_RECOVERED 1 diff --git a/include/linux/pci.h b/include/linux/pci.h index 7ab0d13672da..213109d3c601 100644 --- a/include/linux/pci.h +++ b/include/linux/pci.h @@ -2517,6 +2517,11 @@ static inline struct pci_dev *pcie_find_root_port(struct pci_dev *dev) return NULL; } +static inline bool pci_dev_is_disconnected(const struct pci_dev *dev) +{ + return dev->error_state == pci_channel_io_perm_failure; +} + void pci_request_acs(void); bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags); bool pci_acs_path_enabled(struct pci_dev *start,