Message ID | 20231228170504.720794-1-haifeng.zhao@linux.intel.com |
---|---|
State | New |
Headers |
From: Ethan Zhao <haifeng.zhao@linux.intel.com>
To: kevin.tian@intel.com, bhelgaas@google.com, baolu.lu@linux.intel.com, dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com, lukas@wunner.de
Cc: linux-pci@vger.kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v10 3/5] PCI: make pci_dev_is_disconnected() helper public for other drivers
Date: Thu, 28 Dec 2023 12:05:02 -0500
Message-Id: <20231228170504.720794-1-haifeng.zhao@linux.intel.com>
X-Mailer: git-send-email 2.31.1
X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
fix vt-d hard lockup when hotplug ATS capable device
|
|
Commit Message
Ethan Zhao
Dec. 28, 2023, 5:05 p.m. UTC
Make pci_dev_is_disconnected() public so that it can be called from the
Intel VT-d driver to quickly fix/work around the surprise-removal hang
for ATS-capable devices on hotplug-capable downstream ports of a PCIe
switch.

Unlike pci_device_is_present(), this helper performs no config space
access, so it is light enough to optimize the normal surprise-removal
and safe-removal flows.

Tested-by: Haorong Ye <yehaorong@bytedance.com>
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
---
 drivers/pci/pci.h   | 5 -----
 include/linux/pci.h | 5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)
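As a rough illustration of the commit message, the userspace C sketch below models how a caller might use the helper. The enum and the struct pci_dev stand-in carry only what the example needs, and should_flush_dev_tlb() is a hypothetical name chosen for this sketch, not a function from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel definitions; only what is used here. */
typedef enum {
	pci_channel_io_normal = 1,
	pci_channel_io_frozen = 2,
	pci_channel_io_perm_failure = 3,
} pci_channel_state_t;

struct pci_dev {
	pci_channel_state_t error_state;
};

/* Same body as the helper this patch moves to include/linux/pci.h. */
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
	return dev->error_state == pci_channel_io_perm_failure;
}

/*
 * Hypothetical caller (an assumption, not taken from the patch): decide
 * whether a device-TLB invalidation is worth issuing. The check is a plain
 * memory read of dev->error_state and touches no config space.
 */
static bool should_flush_dev_tlb(const struct pci_dev *dev)
{
	assert(dev);
	return !pci_dev_is_disconnected(dev);
}
```

A device marked pci_channel_io_perm_failure is reported as disconnected, so the hypothetical caller skips the flush; any other state lets it proceed.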
Comments
On 12/29/2023 1:05 AM, Ethan Zhao wrote:
> When the ATS Invalidation request timeout happens, the qi_submit_sync()
> will restart and loop for the invalidation request forever till it is
> done, it will block another Invalidation thread such as the fq_timer
> to issue invalidation request, cause the system lockup as following
>
> [exception RIP: native_queued_spin_lock_slowpath+92]
>
> RIP: ffffffffa9d1025c RSP: ffffb202f268cdc8 RFLAGS: 00000002
>
> RAX: 0000000000000101 RBX: ffffffffab36c2a0 RCX: 0000000000000000
>
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffab36c2a0
>
> RBP: ffffffffab36c2a0 R8: 0000000000000001 R9: 0000000000000000
>
> R10: 0000000000000010 R11: 0000000000000018 R12: 0000000000000000
>
> R13: 0000000000000004 R14: ffff9e10d71b1c88 R15: ffff9e10d71b1980
>
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

#12 [ffffb202f268cdc8] native_queued_spin_lock_slowpath at ffffffffa9d1025c
#13 [ffffb202f268cdc8] do_raw_spin_lock at ffffffffa9d121f1
#14 [ffffb202f268cdd8] _raw_spin_lock_irqsave at ffffffffaa51795b
#15 [ffffb202f268cdf8] iommu_flush_dev_iotlb at ffffffffaa20df48
#16 [ffffb202f268ce28] iommu_flush_iova at ffffffffaa20e182
#17 [ffffb202f268ce60] iova_domain_flush at ffffffffaa220e27
#18 [ffffb202f268ce70] fq_flush_timeout at ffffffffaa221c9d
#19 [ffffb202f268cea8] call_timer_fn at ffffffffa9d46661
#20 [ffffb202f268cf08] run_timer_softirq at ffffffffa9d47933
#21 [ffffb202f268cf98] __softirqentry_text_start at ffffffffaa8000e0
#22 [ffffb202f268cff0] asm_call_sysvec_on_stack at ffffffffaa60114f

This part got lost, perhaps because I appended "----" here.

Thanks,
Ethan

>
> (the left part of exception see the hotplug case of ATS capable device)
>
> If one endpoint device just no response to the ATS Invalidation request,
> but is not gone, it will bring down the whole system, to avoid such
> case, don't try the timeout ATS Invalidation request forever.
>
> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
> ---
>  drivers/iommu/intel/dmar.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 0a8d628a42ee..9edb4b44afca 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1453,7 +1453,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>  	reclaim_free_desc(qi);
>  	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
>
> -	if (rc == -EAGAIN)
> +	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
>  		goto restart;
>
>  	if (iotlb_start_ktime)
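The effect of the one-line change quoted in this comment can be sketched in userspace C, under stated assumptions: struct fake_hw, fake_submit() and submit_sync_model() are hypothetical stand-ins for the invalidation queue and qi_submit_sync(), and the type constants are placeholders for the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder descriptor type codes (assumed values, for illustration). */
enum qi_type {
	QI_IOTLB_TYPE   = 0x2,	/* IOTLB invalidation */
	QI_DIOTLB_TYPE  = 0x3,	/* device-TLB (ATS) invalidation */
	QI_DEIOTLB_TYPE = 0x8,	/* extended device-TLB invalidation */
};

/* Retry condition after the change: never retry device-TLB requests. */
static bool should_restart(int rc, enum qi_type type)
{
	return rc == -EAGAIN &&
	       type != QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE;
}

/* Hypothetical hardware model: times out a fixed number of times. */
struct fake_hw {
	int timeouts_left;
};

static int fake_submit(struct fake_hw *hw)
{
	if (hw->timeouts_left > 0) {
		hw->timeouts_left--;
		return -EAGAIN;
	}
	return 0;
}

/* Model of the submit loop: resubmit while should_restart() says so. */
static int submit_sync_model(struct fake_hw *hw, enum qi_type type,
			     unsigned int *attempts)
{
	int rc;

	assert(hw && attempts);
	*attempts = 0;
	do {
		rc = fake_submit(hw);
		(*attempts)++;
	} while (should_restart(rc, type));

	return rc;
}
```

With a device that never answers, the old condition (rc == -EAGAIN alone) would keep this loop spinning; with the new condition a device-TLB request gives up after the first timeout and returns -EAGAIN to the caller.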
On 12/29/23 1:05 AM, Ethan Zhao wrote:
> Make pci_dev_is_disconnected() public so that it can be called from
> Intel VT-d driver to quickly fix/workaround the surprise removal
> unplug hang issue for those ATS capable devices on PCIe switch downstream
> hotplug capable ports.
>
> Beside pci_device_is_present() function, this one has no config space
> space access, so is light enough to optimize the normal pure surprise
> removal and safe removal flow.
>
> Tested-by: Haorong Ye<yehaorong@bytedance.com>
> Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com>
> ---
>  drivers/pci/pci.h | 5 -----
>  include/linux/pci.h | 5 +++++
>  2 files changed, 5 insertions(+), 5 deletions(-)

Should this be moved before PATCH 2/5? Otherwise, PATCH 2/5 won't compile.

Best regards,
baolu
On 1/10/2024 1:25 PM, Baolu Lu wrote:
> On 12/29/23 1:05 AM, Ethan Zhao wrote:
>> Make pci_dev_is_disconnected() public so that it can be called from
>> Intel VT-d driver to quickly fix/workaround the surprise removal
>> unplug hang issue for those ATS capable devices on PCIe switch
>> downstream
>> hotplug capable ports.
>>
>> Beside pci_device_is_present() function, this one has no config space
>> space access, so is light enough to optimize the normal pure surprise
>> removal and safe removal flow.
>>
>> Tested-by: Haorong Ye<yehaorong@bytedance.com>
>> Signed-off-by: Ethan Zhao<haifeng.zhao@linux.intel.com>
>> ---
>>  drivers/pci/pci.h | 5 -----
>>  include/linux/pci.h | 5 +++++
>>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> This should be moved before PATCH 2/5? Otherwise, PATCH 2/5 couldn't be

It seems the order got mixed up when send-email was aborted by a network
connection problem and the series was sent again. [3/5] and [4/5] ended up
out of order, though the subject numbering is right. Anyway, I will resend
in the next version.

Thanks,
Ethan

> compiled.
>
> Best regards,
> baolu
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5ecbcf041179..75fa2084492f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -366,11 +366,6 @@ static inline int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
 	return 0;
 }
 
-static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
-{
-	return dev->error_state == pci_channel_io_perm_failure;
-}
-
 /* pci_dev priv_flags */
 #define PCI_DEV_ADDED 0
 #define PCI_DPC_RECOVERED 1
diff --git a/include/linux/pci.h b/include/linux/pci.h
index dea043bc1e38..4779eec8b267 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -2506,6 +2506,11 @@ static inline struct pci_dev *pcie_find_root_port(struct pci_dev *dev)
 	return NULL;
 }
 
+static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
+{
+	return dev->error_state == pci_channel_io_perm_failure;
+}
+
 void pci_request_acs(void);
 bool pci_acs_enabled(struct pci_dev *pdev, u16 acs_flags);
 bool pci_acs_path_enabled(struct pci_dev *start,