Message ID | 20231228001646.587653-6-haifeng.zhao@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:6f82:b0:100:9c79:88ff with SMTP id tb2csp1732589dyb; Wed, 27 Dec 2023 16:18:17 -0800 (PST) X-Google-Smtp-Source: AGHT+IGTpkWPN5nvvb+0tk8EUaNF+GnMJ5Smh7OCHHxjsVl+FuCzRGWzFdJvkjLdZXnyifEO64M1 X-Received: by 2002:a05:620a:229:b0:77d:8946:11c3 with SMTP id u9-20020a05620a022900b0077d894611c3mr9727787qkm.34.1703722696987; Wed, 27 Dec 2023 16:18:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1703722696; cv=none; d=google.com; s=arc-20160816; b=uLShvqgKH+hhy5b2Fj/BE7RJGMYA885Ti9D+lrhJ11GxQjZyat2VuIGcKL5a0dc1KD kxV12ZbYX2fRxHBcxixpRPbEdDWDkALRTdzeGYazQZ5UueI5ZbQDuBiyLldgx0xvetox LB0vllkYJAADQB0ZQCP6fTaYHy06ZeXTZADFgudOlGHqLMy4ZEK1ENvr6Vbx9Ug5flo/ +roYlIxmS5/1Ii32mwf5iYfEsXaPvQsqzy/bBiE2g12IkoG2nAJmQFN5yB8LzmBc4std eIf3sPSqBaeCmLBLjhjSPr1LscuE7Yizuf2DlD0Aem12GvNJugmfkpIGqRkZ0PJv8akX Y8+g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=Hhy3PCWrIjpEthT0ZHydK8llmvz3gBBkZEy9Oq9L0Bk=; fh=cQTxynQ7rGExgmbyXq8tNmd/VmE1Fzgt9QW51hABj2c=; b=V0pMyA5REel+/64pAA1onKjDsEnbIgGf6Hw4sYxezBGY4Ny7I5hjo2Bs69rhEge1zl CyPJvQ2A3ZtenG1rJdGjpz1dBj3zPEHNxLSKxyzVUQvzZ8fynPqvGBccQGYWIYj+AAjg mYRFp2crBOMlnVM3Y5uzihQRJyaEeVuGbSEvPoVUDXPPxz0jaRB2ouwQY9EmtkPjxjdU P+95Zwc/TXHVjh/hdKMyvF9dHDBiPLg6tOJtPt21U9CKq7pA7kSYa6TFXSTvlMLCzoeO L29akKbp1mIS7sqqIqzo+6oNW7PJFZ37yiNlqWy4M8LXEDX1cm8saLJ4yT8tetu0pJyc toCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=NcDQUH4J; spf=pass (google.com: domain of linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE 
sp=NONE dis=NONE) header.from=intel.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id u9-20020a05620a430900b0077f91185dcasi16555189qko.98.2023.12.27.16.18.16 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 27 Dec 2023 16:18:16 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=NcDQUH4J; spf=pass (google.com: domain of linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-12318-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id B720D1C20F0C for <ouuuleilei@gmail.com>; Thu, 28 Dec 2023 00:18:16 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 480B58466; Thu, 28 Dec 2023 00:17:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="NcDQUH4J" X-Original-To: linux-kernel@vger.kernel.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E1326ABC; Thu, 28 Dec 2023 00:17:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: 
smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1703722627; x=1735258627; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zVZqQhADGmi9C9GQU0Rpvfm53YSnRw8E732emv+/c8g=; b=NcDQUH4JDI1HBp3eglGmJZl89vtfPsh8WVHzAUUetgSTQ/CCagVoO386 kP6KQYoFPmfaB3giiSzmp87rMLpIueBejui7nR7OHqiikCCd56QSrddhU Rgepi500nsF5pZWHFbEfsLCBWkCjsGbi/jfUilniZjW9iAPEyXgPY3mfg jSzTlURgPBrOAiirFygFJ1UnAFWGrECrdGQQBzFtrChPaO5heCewJ/SG+ AuXxCtDkC/fWHOMsWChe2op6RUR2IKsUzePrWwfrOjaxrt4hGJnSYykK6 11dntf1TTw8Q3VrWZlrQzcT6QE7aP+4iOe6o9rQ8OvkniX5UMtlYOonY8 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10936"; a="3800513" X-IronPort-AV: E=Sophos;i="6.04,310,1695711600"; d="scan'208";a="3800513" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orvoesa104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Dec 2023 16:17:06 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10936"; a="848812712" X-IronPort-AV: E=Sophos;i="6.04,310,1695711600"; d="scan'208";a="848812712" Received: from ply01-vm-store.bj.intel.com ([10.238.153.201]) by fmsmga004.fm.intel.com with ESMTP; 27 Dec 2023 16:17:02 -0800 From: Ethan Zhao <haifeng.zhao@linux.intel.com> To: bhelgaas@google.com, baolu.lu@linux.intel.com, dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com, lukas@wunner.de Cc: linux-pci@vger.kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: [RFC PATCH v9 5/5] iommu/vt-d: don't loop for timeout ATS Invalidation request forever Date: Wed, 27 Dec 2023 19:16:46 -0500 Message-Id: <20231228001646.587653-6-haifeng.zhao@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20231228001646.587653-1-haifeng.zhao@linux.intel.com> References: <20231228001646.587653-1-haifeng.zhao@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> 
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1786482730508485937 X-GMAIL-MSGID: 1786482730508485937 |
Series |
fix vt-d hard lockup when hotplug ATS capable device
|
|
Commit Message
Ethan Zhao
Dec. 28, 2023, 12:16 a.m. UTC
When an ATS Invalidation request times out, qi_submit_sync() restarts
and keeps resubmitting the request until it completes. While it loops,
it blocks other invalidation threads, such as the fq_timer, from issuing
their own invalidation requests, which locks up the system as follows:
[exception RIP: native_queued_spin_lock_slowpath+92]
RIP: ffffffffa9d1025c RSP: ffffb202f268cdc8 RFLAGS: 00000002
RAX: 0000000000000101 RBX: ffffffffab36c2a0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffab36c2a0
RBP: ffffffffab36c2a0 R8: 0000000000000001 R9: 0000000000000000
R10: 0000000000000010 R11: 0000000000000018 R12: 0000000000000000
R13: 0000000000000004 R14: ffff9e10d71b1c88 R15: ffff9e10d71b1980
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#12 [ffffb202f268cdc8] native_queued_spin_lock_slowpath at ffffffffa9d1025c
#13 [ffffb202f268cdc8] do_raw_spin_lock at ffffffffa9d121f1
#14 [ffffb202f268cdd8] _raw_spin_lock_irqsave at ffffffffaa51795b
#15 [ffffb202f268cdf8] iommu_flush_dev_iotlb at ffffffffaa20df48
#16 [ffffb202f268ce28] iommu_flush_iova at ffffffffaa20e182
#17 [ffffb202f268ce60] iova_domain_flush at ffffffffaa220e27
#18 [ffffb202f268ce70] fq_flush_timeout at ffffffffaa221c9d
#19 [ffffb202f268cea8] call_timer_fn at ffffffffa9d46661
#20 [ffffb202f268cf08] run_timer_softirq at ffffffffa9d47933
#21 [ffffb202f268cf98] __softirqentry_text_start at ffffffffaa8000e0
#22 [ffffb202f268cff0] asm_call_sysvec_on_stack at ffffffffaa60114f
--- ---
(for the rest of the exception trace, see the hotplug case of the ATS-capable device)
If an endpoint device simply does not respond to the ATS Invalidation
request but is not gone, it brings down the whole system. To avoid this,
don't retry a timed-out ATS Invalidation request forever.
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
---
drivers/iommu/intel/dmar.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
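The behaviour described in the commit message can be sketched in user space. This is only a rough model under stated assumptions, not the driver code: fake_dev, fake_submit_once(), model_submit_sync() and the max_restarts cap are hypothetical names invented for the illustration, the QI_* values are assumed to mirror the driver header, and -ELOOP is used purely as a demo marker for "would have spun forever".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Descriptor type codes; values assumed to mirror the driver header. */
#define QI_IOTLB_TYPE   0x2	/* IOTLB invalidation */
#define QI_DIOTLB_TYPE  0x3	/* device-TLB invalidation */
#define QI_DEIOTLB_TYPE 0x8	/* PASID-based device-TLB invalidation */

/*
 * Hypothetical stand-in for an endpoint. It completes a request once more
 * than `responds_after` submissions were made, or never if that is negative.
 */
struct fake_dev {
	int responds_after;
	int submissions;
};

/* One submission attempt: 0 on completion, -EAGAIN on a simulated timeout. */
static int fake_submit_once(struct fake_dev *dev)
{
	dev->submissions++;
	if (dev->responds_after >= 0 && dev->submissions > dev->responds_after)
		return 0;
	return -EAGAIN;
}

/*
 * Model of the submit loop. `patched` selects the condition from this patch.
 * `max_restarts` exists only so the demo terminates; the loop being modelled
 * has no such bound. Returns 0 on completion, -EAGAIN when the patched
 * condition gives up, or -ELOOP when the demo cap is reached.
 */
static int model_submit_sync(struct fake_dev *dev, int type, bool patched,
			     int max_restarts)
{
	int restarts = 0;
	int rc;

	assert(max_restarts > 0);
restart:
	rc = fake_submit_once(dev);
	if (rc == -EAGAIN &&
	    (!patched || (type != QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE))) {
		if (++restarts >= max_restarts)
			return -ELOOP;	/* demo only: the real loop keeps going */
		goto restart;
	}
	return rc;
}
```

With a device that never answers (responds_after = -1) and type QI_DIOTLB_TYPE, the unpatched model only stops because of the demo cap, while the patched model hands -EAGAIN back to the caller after a single submission. A slow but live device still completes under the unpatched loop, which is the behaviour the patch gives up for device-TLB descriptors.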
Comments
> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
> Sent: Thursday, December 28, 2023 8:17 AM
>
> [...]
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 76903a8bf963..206ab0b7294f 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -1457,7 +1457,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>  	reclaim_free_desc(qi);
>  	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
>
> -	if (rc == -EAGAIN)
> +	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
>  		goto restart;
>

This change is moot.

-EAGAIN is set only when hardware detects an ATS invalidation completion
timeout in qi_check_fault(), so the above essentially just kills the
restart logic.

I'd wait for the maintainer of this driver to comment. This part doesn't
look good, but there might be some historical reason, so care must be
taken.
On 12/28/2023 4:38 PM, Tian, Kevin wrote:
>> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> Sent: Thursday, December 28, 2023 8:17 AM
>>
>> [...]
>>
>> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
>> index 76903a8bf963..206ab0b7294f 100644
>> --- a/drivers/iommu/intel/dmar.c
>> +++ b/drivers/iommu/intel/dmar.c
>> @@ -1457,7 +1457,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>>  	reclaim_free_desc(qi);
>>  	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
>>
>> -	if (rc == -EAGAIN)
>> +	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
>>  		goto restart;
>>
> This change is moot.
>
> -EAGAIN is set only when hardware detects an ATS invalidation completion
> timeout in qi_check_fault(), so the above essentially just kills the
> restart logic.

This change is intended to break the restart logic when a device-TLB
invalidation timeout happens. We don't know how long the ITE took if the
device simply does not respond.

> I'd wait for the maintainer of this driver to comment. This part doesn't
> look good, but there might be some historical reason, so care must be
> taken.

I would like to know why a hole was left here that can hang the driver
forever.

Thanks,
Ethan
> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
> Sent: Thursday, December 28, 2023 9:10 PM
>
> On 12/28/2023 4:38 PM, Tian, Kevin wrote:
> >> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
> >> Sent: Thursday, December 28, 2023 8:17 AM
> >>
> >> -	if (rc == -EAGAIN)
> >> +	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
> >>  		goto restart;
> >>
> > This change is moot.
> >
> > -EAGAIN is set only when hardware detects an ATS invalidation completion
> > timeout in qi_check_fault(), so the above essentially just kills the
> > restart logic.
>
> This change is intended to break the restart logic when a device-TLB
> invalidation timeout happens. We don't know how long the ITE took if the
> device simply does not respond.

If in the end the agreement is to remove the restart logic, then do it.

It's not good to introduce a change which essentially kills the restart
logic but still keeps the related code.
On 12/29/2023 4:17 PM, Tian, Kevin wrote:
>> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
>> Sent: Thursday, December 28, 2023 9:10 PM
>>
>> On 12/28/2023 4:38 PM, Tian, Kevin wrote:
>>>> From: Ethan Zhao <haifeng.zhao@linux.intel.com>
>>>> Sent: Thursday, December 28, 2023 8:17 AM
>>>>
>>>> -	if (rc == -EAGAIN)
>>>> +	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
>>>>  		goto restart;
>>>>
>>> This change is moot.
>>>
>>> -EAGAIN is set only when hardware detects an ATS invalidation completion
>>> timeout in qi_check_fault(), so the above essentially just kills the
>>> restart logic.
>> This change is intended to break the restart logic when a device-TLB
>> invalidation timeout happens. We don't know how long the ITE took if the
>> device simply does not respond.
> If in the end the agreement is to remove the restart logic, then do it.
>
> It's not good to introduce a change which essentially kills the restart
> logic but still keeps the related code.

Here, the device-TLB invalidation depends on the device's response, and
no one can be sure how third-party adapters will behave. But
invalidations issued to the IOMMU itself should be more likely to
survive?

Anyway, I would like to see more comments.

Thanks,
Ethan
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 76903a8bf963..206ab0b7294f 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1457,7 +1457,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 	reclaim_free_desc(qi);
 	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
 
-	if (rc == -EAGAIN)
+	if (rc == -EAGAIN && type !=QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
 		goto restart;
 
 	if (iotlb_start_ktime)
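
The condition introduced by the hunk can be isolated as a small predicate, which makes it easier to see which (rc, type) combinations would still restart. This is a sketch for discussion only: qi_is_dev_tlb_type(), qi_should_restart() and qi_should_restart_selfcheck() are hypothetical helpers that do not exist in the driver, and the QI_* values are assumed to mirror the driver header.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Descriptor type codes; values assumed to mirror the driver header. */
#define QI_IOTLB_TYPE   0x2	/* IOTLB invalidation */
#define QI_DIOTLB_TYPE  0x3	/* device-TLB invalidation */
#define QI_DEIOTLB_TYPE 0x8	/* PASID-based device-TLB invalidation */

/* True for descriptor types whose completion depends on the endpoint. */
static bool qi_is_dev_tlb_type(int type)
{
	return type == QI_DIOTLB_TYPE || type == QI_DEIOTLB_TYPE;
}

/* Same truth table as the patched `if` in the hunk, written as a helper. */
static bool qi_should_restart(int rc, int type)
{
	return rc == -EAGAIN && !qi_is_dev_tlb_type(type);
}

/* Small self-check that doubles as a usage example. */
static void qi_should_restart_selfcheck(void)
{
	/* Device-TLB timeouts are no longer retried. */
	assert(!qi_should_restart(-EAGAIN, QI_DIOTLB_TYPE));
	assert(!qi_should_restart(-EAGAIN, QI_DEIOTLB_TYPE));
	/* Other descriptor types keep the restart path. */
	assert(qi_should_restart(-EAGAIN, QI_IOTLB_TYPE));
	/* Success or any other error never restarts. */
	assert(!qi_should_restart(0, QI_IOTLB_TYPE));
	assert(!qi_should_restart(-EIO, QI_IOTLB_TYPE));
}
```

Written this way, the review question becomes whether the remaining "true" case (-EAGAIN with a non-device-TLB type) is reachable in practice. If it is not, dropping the restart label altogether, as suggested in the thread, would be the cleaner change.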