Message ID | 20240223152124.20042-5-johan+linaro@kernel.org |
---|---|
State | New |
Headers | From: Johan Hovold <johan+linaro@kernel.org> To: Bjorn Helgaas <bhelgaas@google.com>, Bjorn Andersson <andersson@kernel.org> Cc: Konrad Dybcio <konrad.dybcio@linaro.org>, Lorenzo Pieralisi <lpieralisi@kernel.org>, Krzysztof Wilczyński <kw@linux.com>, Rob Herring <robh@kernel.org>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>, Conor Dooley <conor+dt@kernel.org>, Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>, linux-arm-msm@vger.kernel.org, linux-pci@vger.kernel.org, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, Johan Hovold <johan+linaro@kernel.org>, stable@vger.kernel.org Subject: [PATCH v2 04/12] PCI: qcom: Add support for disabling ASPM L0s in devicetree Date: Fri, 23 Feb 2024 16:21:16 +0100 Message-ID: <20240223152124.20042-5-johan+linaro@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240223152124.20042-1-johan+linaro@kernel.org> References: <20240223152124.20042-1-johan+linaro@kernel.org> |
Series | arm64: dts: qcom: sc8280xp: PCIe fixes and GICv3 ITS enable |
Commit Message
Johan Hovold
Feb. 23, 2024, 3:21 p.m. UTC
Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
1.9.0 ops") started enabling ASPM unconditionally when the hardware
claims to support it. This triggers Correctable Errors for some PCIe
devices on machines like the Lenovo ThinkPad X13s, which could indicate
an incomplete driver ASPM implementation or that the hardware does in
fact not support L0s.
Add support for disabling ASPM L0s in the devicetree when it is not
supported on a particular machine and controller.
Note that only the 1.9.0 ops enable ASPM currently.
Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable@vger.kernel.org # 6.7
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
Comments
On Fri, Feb 23, 2024 at 04:21:16PM +0100, Johan Hovold wrote:
> Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
> 1.9.0 ops") started enabling ASPM unconditionally when the hardware
> claims to support it. This triggers Correctable Errors for some PCIe
> devices on machines like the Lenovo ThinkPad X13s, which could indicate
> an incomplete driver ASPM implementation or that the hardware does in
> fact not support L0s.

Are there any more details about this?  Do the errors occur around
suspend/resume, a power state transition, or some other event?  Might
other DWC-based devices be susceptible?  Is there a specific driver
you suspect might be incomplete?

Do you want the DT approach because the problem is believed to be
platform-specific?  Otherwise, maybe we should consider reverting
9f4f3dfad8cf until the problem is understood?

Could this be done via a quirk like quirk_disable_aspm_l0s()?  That
currently uses pci_disable_link_state(), which I don't think is
completely safe because it leaves the possibility that drivers or
users could re-enable L0s, e.g., via sysfs.

This patch is nice because IIUC it directly changes PCI_EXP_LNKCAP,
which avoids that issue, but quirk_disable_aspm_l0s() could
conceivably be reimplemented to cache PCI_EXP_LNKCAP in struct pci_dev
so quirks could override it, as we do with struct pci_dev.devcap.

> Add support for disabling ASPM L0s in the devicetree when it is not
> supported on a particular machine and controller.
>
> Note that only the 1.9.0 ops enable ASPM currently.
>
> Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
> Cc: stable@vger.kernel.org # 6.7
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>  drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 09d485df34b9..0fb5dc06d2ef 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
>  	return 0;
>  }
>
> +static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
> +{
> +	u16 offset;
> +	u32 val;
> +
> +	if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
> +		return;
> +
> +	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> +
> +	dw_pcie_dbi_ro_wr_en(pci);
> +
> +	val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
> +	val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
> +	writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
> +
> +	dw_pcie_dbi_ro_wr_dis(pci);
> +}
> +
>  static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
>  {
>  	u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
> @@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
>
>  static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
>  {
> +	qcom_pcie_clear_aspm_l0s(pcie->pci);
>  	qcom_pcie_clear_hpc(pcie->pci);
>
>  	return 0;
> --
> 2.43.0
>
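The caching idea raised at the end of the reply above (keep a copy of PCI_EXP_LNKCAP in the device structure so that a quirk can override it, the way devcap is handled) could look roughly like the userspace sketch below. This is only an illustration under stated assumptions: `struct fake_pci_dev`, its `lnkcap` field, and all `fake_*` helpers are hypothetical names invented here and are not kernel APIs; only the two `PCI_EXP_LNKCAP_ASPM_*` bit values mirror include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCAP_ASPM_L0S	0x00000400	/* ASPM L0s Support */
#define PCI_EXP_LNKCAP_ASPM_L1	0x00000800	/* ASPM L1 Support */

/* Hypothetical stand-in for struct pci_dev; "lnkcap" is NOT a real field. */
struct fake_pci_dev {
	uint32_t hw_lnkcap;	/* what the (read-only) register reports */
	uint32_t lnkcap;	/* cached copy that quirks are allowed to edit */
};

/* At enumeration time the register is read once and cached. */
void fake_cache_lnkcap(struct fake_pci_dev *dev)
{
	dev->lnkcap = dev->hw_lnkcap;
}

/* A quirk overrides the cached value instead of toggling link state. */
void fake_quirk_disable_aspm_l0s(struct fake_pci_dev *dev)
{
	dev->lnkcap &= ~(uint32_t)PCI_EXP_LNKCAP_ASPM_L0S;
}

/* ASPM policy code would consult the cache, never the raw register. */
bool fake_aspm_l0s_supported(const struct fake_pci_dev *dev)
{
	return (dev->lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) != 0;
}

bool fake_aspm_l1_supported(const struct fake_pci_dev *dev)
{
	return (dev->lnkcap & PCI_EXP_LNKCAP_ASPM_L1) != 0;
}
```

Because the policy helpers only look at the cached value, a later sysfs or driver request could not re-enable L0s in this model, which is the property the reply is after. The hardware register itself is left untouched here, unlike in the patch.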
On Fri, Feb 23, 2024 at 04:10:00PM -0600, Bjorn Helgaas wrote:
> On Fri, Feb 23, 2024 at 04:21:16PM +0100, Johan Hovold wrote:
> > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
> > 1.9.0 ops") started enabling ASPM unconditionally when the hardware
> > claims to support it. This triggers Correctable Errors for some PCIe
> > devices on machines like the Lenovo ThinkPad X13s, which could indicate
> > an incomplete driver ASPM implementation or that the hardware does in
> > fact not support L0s.
>
> Are there any more details about this?  Do the errors occur around
> suspend/resume, a power state transition, or some other event?  Might
> other DWC-based devices be susceptible?  Is there a specific driver
> you suspect might be incomplete?

I see these errors when the devices in question are active as well as
idle (not during suspend/resume). For example, when running iperf3 or
fio to test the wifi and nvme, but I also see this occasionally for a
wifi device which is (supposedly) not active (e.g. a handful of errors
overnight).

I skimmed Qualcomm's driver and noted that there are some registers
related to ASPM which that driver updates, while the mainline driver
leaves them at their default settings, but I essentially only mentioned
that the ASPM implementation may be incomplete as a theoretical
possibility. The somewhat erratic ASPM behaviour for one of the modems
also suggests that some further tweak/quirk may be needed, and I was
hoping to catch Mani's interest by reporting it.

But based on what I've since heard from Qualcomm, it seems like these
correctable errors may be a known issue with the hardware (e.g. seen
also with Windows), which is also why we decided to disable it for all
controllers on these two platforms where I've seen this in v2.

> Do you want the DT approach because the problem is believed to be
> platform-specific?  Otherwise, maybe we should consider reverting
> 9f4f3dfad8cf until the problem is understood?

Enabling ASPM gave a very significant improvement in battery life on the
Lenovo ThinkPad X13s, from 10.5 h to 15 h, so reverting is not really an
option there.

And with L0s disabled, the AER error reports about correctable errors
(that prevent enabling the GIC ITS and possibly degrade performance
somewhat) are gone.

I don't know for sure if there are further Qualcomm platforms that are
affected by this so I also don't want to use too big a hammer. The
devicetree property allows us to disable L0s only after confirming that
it's needed, and we can always extend this to broader classes of devices
when/if we learn more.

> Could this be done via a quirk like quirk_disable_aspm_l0s()?  That
> currently uses pci_disable_link_state(), which I don't think is
> completely safe because it leaves the possibility that drivers or
> users could re-enable L0s, e.g., via sysfs.

That was my first approach, thinking that it was the endpoint devices
which did not really support L0s. But initially it seemed like the wifi
controller on the CRD was not affected by this, while the same
controller on the X13s was. That made me conclude that this is not just
a property of the device but (also) of the controller and/or machine.

I then noticed that we already had some controller drivers implementing
'aspm-no-l0s' and decided to go with that.

> This patch is nice because IIUC it directly changes PCI_EXP_LNKCAP,
> which avoids that issue, but quirk_disable_aspm_l0s() could
> conceivably be reimplemented to cache PCI_EXP_LNKCAP in struct pci_dev
> so quirks could override it, as we do with struct pci_dev.devcap.

Johan
On Tue, Feb 27, 2024 at 04:29:15PM +0100, Johan Hovold wrote:
> On Fri, Feb 23, 2024 at 04:10:00PM -0600, Bjorn Helgaas wrote:
> > On Fri, Feb 23, 2024 at 04:21:16PM +0100, Johan Hovold wrote:
> > > Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting
> > > 1.9.0 ops") started enabling ASPM unconditionally when the hardware
> > > claims to support it. This triggers Correctable Errors for some PCIe
> > > devices on machines like the Lenovo ThinkPad X13s, which could indicate
> > > an incomplete driver ASPM implementation or that the hardware does in
> > > fact not support L0s.
> >
> > Are there any more details about this?  Do the errors occur around
> > suspend/resume, a power state transition, or some other event?  Might
> > other DWC-based devices be susceptible?  Is there a specific driver
> > you suspect might be incomplete?
>
> I see these errors when the devices in question are active as well as
> idle (not during suspend/resume). For example, when running iperf3 or
> fio to test the wifi and nvme, but I also see this occasionally for a
> wifi device which is (supposedly) not active (e.g. a handful of errors
> overnight).
>
> I skimmed Qualcomm's driver and noted that there are some registers
> related to ASPM which that driver updates, while the mainline driver
> leaves them at their default settings, but I essentially only mentioned
> that the ASPM implementation may be incomplete as a theoretical
> possibility. The somewhat erratic ASPM behaviour for one of the modems
> also suggests that some further tweak/quirk may be needed, and I was
> hoping to catch Mani's interest by reporting it.
>
> But based on what I've since heard from Qualcomm, it seems like these
> correctable errors may be a known issue with the hardware (e.g. seen
> also with Windows), which is also why we decided to disable it for all
> controllers on these two platforms where I've seen this in v2.
>
> > Do you want the DT approach because the problem is believed to be
> > platform-specific?  Otherwise, maybe we should consider reverting
> > 9f4f3dfad8cf until the problem is understood?
>
> Enabling ASPM gave a very significant improvement in battery life on the
> Lenovo ThinkPad X13s, from 10.5 h to 15 h, so reverting is not really an
> option there.

Ah, I missed that you're only disabling L0s, but leaving L1 enabled,
thanks!  And given that the v1.9.0 ops that enable ASPM are used on a
bunch of platforms, and L0s seems to work fine on most of them, we
wouldn't want to disable L0s for everybody, so this seems like the
right solution.

Bjorn
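The point that only L0s is dropped while L1 stays advertised can be checked on a raw Link Capabilities value. The sketch below is a standalone userspace illustration, not driver code: `aspm_support_str()` and `clear_aspm_l0s()` are hypothetical helper names, while the mask values are the ones in include/uapi/linux/pci_regs.h (the ASPM Support field occupies bits 11:10).

```c
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKCAP_ASPMS	0x00000c00	/* ASPM Support field */
#define PCI_EXP_LNKCAP_ASPM_L0S	0x00000400	/* ASPM L0s Support */
#define PCI_EXP_LNKCAP_ASPM_L1	0x00000800	/* ASPM L1 Support */

/* Hypothetical helper: describe which ASPM states LNKCAP advertises. */
const char *aspm_support_str(uint32_t lnkcap)
{
	switch (lnkcap & PCI_EXP_LNKCAP_ASPMS) {
	case PCI_EXP_LNKCAP_ASPM_L0S | PCI_EXP_LNKCAP_ASPM_L1:
		return "L0s L1";
	case PCI_EXP_LNKCAP_ASPM_L1:
		return "L1";
	case PCI_EXP_LNKCAP_ASPM_L0S:
		return "L0s";
	default:
		return "none";
	}
}

/* Same mask operation as the patch, applied to a plain value. */
uint32_t clear_aspm_l0s(uint32_t lnkcap)
{
	return lnkcap & ~(uint32_t)PCI_EXP_LNKCAP_ASPM_L0S;
}
```

For a value with both bits set, such as 0x00000c12, `clear_aspm_l0s()` yields 0x00000812, which `aspm_support_str()` reports as "L1": the L1 bit and all unrelated fields are preserved.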
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 09d485df34b9..0fb5dc06d2ef 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
 	return 0;
 }
 
+static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
+{
+	u16 offset;
+	u32 val;
+
+	if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
+		return;
+
+	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+	dw_pcie_dbi_ro_wr_en(pci);
+
+	val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+	val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
+	writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
+
+	dw_pcie_dbi_ro_wr_dis(pci);
+}
+
 static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
 {
 	u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
@@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
 
 static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
 {
+	qcom_pcie_clear_aspm_l0s(pcie->pci);
 	qcom_pcie_clear_hpc(pcie->pci);
 
 	return 0;
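For readers unfamiliar with the capability lookup the new helper relies on, the following userspace sketch mimics the same sequence on a byte array standing in for the controller's config space: walk the capability list to the PCI Express capability, then read-modify-write Link Capabilities. It is a simplified analogue under stated assumptions, not the DWC implementation: `find_capability()`, `clear_aspm_l0s()`, `cfg_readl()`, `cfg_writel()` and `CFG_SIZE` are hypothetical names, the devicetree check and the DBI read-only write enable/disable steps are omitted, and the offsets and IDs are the ones from include/uapi/linux/pci_regs.h.

```c
#include <stdint.h>

/* Offsets and IDs as defined in include/uapi/linux/pci_regs.h. */
#define PCI_CAPABILITY_LIST	0x34	/* offset of first capability pointer */
#define PCI_CAP_LIST_ID		0	/* capability ID byte */
#define PCI_CAP_LIST_NEXT	1	/* next capability pointer byte */
#define PCI_CAP_ID_EXP		0x10	/* PCI Express capability */
#define PCI_EXP_LNKCAP		0x0c	/* Link Capabilities register */
#define PCI_EXP_LNKCAP_ASPM_L0S	0x00000400

#define CFG_SIZE		256	/* hypothetical: legacy config space only */

/* Little-endian 32-bit accessors on the fake config space. */
uint32_t cfg_readl(const uint8_t *cfg, unsigned int off)
{
	return (uint32_t)cfg[off] | (uint32_t)cfg[off + 1] << 8 |
	       (uint32_t)cfg[off + 2] << 16 | (uint32_t)cfg[off + 3] << 24;
}

void cfg_writel(uint8_t *cfg, unsigned int off, uint32_t val)
{
	cfg[off] = val & 0xff;
	cfg[off + 1] = (val >> 8) & 0xff;
	cfg[off + 2] = (val >> 16) & 0xff;
	cfg[off + 3] = (val >> 24) & 0xff;
}

/* Walk the capability linked list; returns the offset or 0 if not found. */
uint8_t find_capability(const uint8_t *cfg, uint8_t cap_id)
{
	uint8_t pos = cfg[PCI_CAPABILITY_LIST] & 0xfc;
	int ttl = 48;	/* guard against malformed, looping lists */

	while (pos >= 0x40 && ttl-- > 0) {
		if (cfg[pos + PCI_CAP_LIST_ID] == cap_id)
			return pos;
		pos = cfg[pos + PCI_CAP_LIST_NEXT] & 0xfc;
	}
	return 0;
}

/* Clear the L0s bit in LNKCAP; returns 0 on success, -1 on failure. */
int clear_aspm_l0s(uint8_t *cfg)
{
	unsigned int offset = find_capability(cfg, PCI_CAP_ID_EXP);
	uint32_t val;

	if (!offset || offset + PCI_EXP_LNKCAP + 4 > CFG_SIZE)
		return -1;

	val = cfg_readl(cfg, offset + PCI_EXP_LNKCAP);
	val &= ~(uint32_t)PCI_EXP_LNKCAP_ASPM_L0S;
	cfg_writel(cfg, offset + PCI_EXP_LNKCAP, val);
	return 0;
}
```

Unlike this sketch, the patch does not check the return value of dw_pcie_find_capability(); the explicit "not found" path here is a choice made for the illustration, not a statement about what the driver should do.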