Message ID | 20230120031522.2304439-5-david.e.box@linux.intel.com |
---|---|
State | New |
Headers |
From: "David E. Box" <david.e.box@linux.intel.com>
To: david.e.box@linux.intel.com, nirmal.patel@linux.intel.com, jonathan.derrick@linux.dev, lorenzo.pieralisi@arm.com, hch@infradead.org, kw@linux.com, robh@kernel.org, bhelgaas@google.com, michael.a.bottini@intel.com, rafael@kernel.org, me@adhityamohan.in
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH V10 4/4] PCI: vmd: Add quirk to configure PCIe ASPM and LTR
Date: Thu, 19 Jan 2023 19:15:22 -0800
Message-Id: <20230120031522.2304439-5-david.e.box@linux.intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230120031522.2304439-1-david.e.box@linux.intel.com>
References: <20230120031522.2304439-1-david.e.box@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
Enable PCIe ASPM and LTR on select hardware
|
|
Commit Message
David E. Box
Jan. 20, 2023, 3:15 a.m. UTC
PCIe ports reserved for VMD use are not visible to BIOS and therefore not
configured to enable PCIe ASPM or LTR values (which BIOS will configure if
they are not set). Lack of this programming results in high power
consumption on laptops as reported in bugzilla. For affected products use
pci_enable_link_state to set the allowed link states for devices on the
root ports. Also set the LTR value to the maximum value needed for the SoC.

This is a workaround for products from Rocket Lake through Alder Lake.
Raptor Lake, the latest product at this time, has already implemented LTR
configuring in BIOS. Future products will move ASPM configuration back to
BIOS as well. As this solution is intended for laptops, support is not
added for hotplug or for devices downstream of a switch on the root port.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=212355
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215063
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213717
Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev>
Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com>
---
V10 - No change
V9 - Added BIOS quirk flag to VMD_FEATS_CLIENT flag, suggested by Sathya.
V8 - Removed struct vmd_device_data patch. Instead use #define for the LTR
     value which is the same across all products needing the quirk.
V7 - No change
V6 - Set ASPM first before setting LTR. This is needed because some
     devices may only have LTR set by BIOS and not ASPM.
   - Skip setting the LTR if the current LTR is non-zero.
V5 - Provide the LTR value as driver data.
   - Use DWORD for the config space write to avoid PCI WORD access bug.
   - Set ASPM links first, enabling all link states, before setting a
     default LTR if the capability is present.
   - Add kernel message that VMD is setting the device LTR.
V4 - Refactor vmd_enable_apsm() to exit early, making the lines shorter
     and more readable. Suggested by Christoph.
V3 - No changes
V2 - Use return status to print pci_info message if ASPM cannot be enabled.
   - Add missing static declaration, caught by lkp@intel.com

 drivers/pci/controller/vmd.c | 55 +++++++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)
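The LTR constant used by the patch, VMD_BIOS_PM_QUIRK_LTR (0x1003, annotated as 3145728 ns), can be checked by hand. The following is a minimal userspace sketch, not kernel code, assuming the standard PCIe LTR register layout (bits 9:0 latency value, bits 12:10 latency scale, each scale step multiplying by 32). The macro and helper names (LTR_VALUE_MASK, ltr_to_ns, ltr_pack_dword) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of one 16-bit LTR latency field (snoop or non-snoop). */
#define LTR_VALUE_MASK  0x03ffu /* bits 9:0: latency value */
#define LTR_SCALE_MASK  0x1c00u /* bits 12:10: latency scale */
#define LTR_SCALE_SHIFT 10

/* Decode a 16-bit LTR field into nanoseconds. */
static uint64_t ltr_to_ns(uint16_t ltr)
{
	uint64_t value = ltr & LTR_VALUE_MASK;
	unsigned int scale = (ltr & LTR_SCALE_MASK) >> LTR_SCALE_SHIFT;

	/* Scale encodings above 5 are not permitted for LTR. */
	assert(scale <= 5);

	/* Each scale step multiplies the value by 32, i.e. 2^5. */
	return value << (5 * scale);
}

/*
 * Pack the same latency into both halves of one DWORD: lower word is
 * the max snoop latency, upper word the max non-snoop latency, which
 * matches how the patch describes its single config-space write.
 */
static uint32_t ltr_pack_dword(uint16_t ltr)
{
	return ((uint32_t)ltr << 16) | ltr;
}
```

With this layout, 0x1003 decodes as value 3 and scale 4, so 3 * 2^20 = 3145728 ns, which agrees with the comment next to the #define in the patch.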
Comments
On Thu, Jan 19, 2023 at 07:15:22PM -0800, David E. Box wrote:
> +/*
> + * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
> + */
> +static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> +{
> +	unsigned long features = *(unsigned long *)userdata;
> +	u16 ltr = VMD_BIOS_PM_QUIRK_LTR;
> +	u32 ltr_reg;
> +	int pos;
> +
> +	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
> +		return 0;
> +
> +	pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);

Hi,

This is tripping lockdep on one of our CI ADL machines.

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12814/bat-adlp-6/boot0.txt

<4>[ 13.815380] ============================================
<4>[ 13.815382] WARNING: possible recursive locking detected
<4>[ 13.815384] 6.3.0-rc1-CI_DRM_12814-g4753bbc2a817+ #1 Not tainted
<4>[ 13.815386] --------------------------------------------
<4>[ 13.815387] swapper/0/1 is trying to acquire lock:
<4>[ 13.815389] ffffffff827ab0b0 (pci_bus_sem){++++}-{3:3}, at: pci_enable_link_state+0x69/0x1d0
<4>[ 13.815396]
but task is already holding lock:
<4>[ 13.815398] ffffffff827ab0b0 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x24/0x90
<4>[ 13.815403]
other info that might help us debug this:
<4>[ 13.815404] Possible unsafe locking scenario:

<4>[ 13.815406]        CPU0
<4>[ 13.815407]        ----
<4>[ 13.815408]   lock(pci_bus_sem);
<4>[ 13.815410]   lock(pci_bus_sem);
<4>[ 13.815411]
 *** DEADLOCK ***

<4>[ 13.815413] May be due to missing lock nesting notation

<4>[ 13.815414] 2 locks held by swapper/0/1:
<4>[ 13.815416] #0: ffff8881029511b8 (&dev->mutex){....}-{3:3}, at: __driver_attach+0xab/0x180
<4>[ 13.815422] #1: ffffffff827ab0b0 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x24/0x90
<4>[ 13.815426]
stack backtrace:
<4>[ 13.815428] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc1-CI_DRM_12814-g4753bbc2a817+ #1
<4>[ 13.815431] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR4 RVP, BIOS ADLPFWI1.R00.3135.A00.2203251419 03/25/2022
<4>[ 13.815434] Call Trace:
<4>[ 13.815436] <TASK>
<4>[ 13.815437] dump_stack_lvl+0x64/0xb0
<4>[ 13.815443] __lock_acquire+0x9b5/0x2550
<4>[ 13.815461] lock_acquire+0xd7/0x330
<4>[ 13.815463] ? pci_enable_link_state+0x69/0x1d0
<4>[ 13.815466] down_read+0x3d/0x180
<4>[ 13.815480] ? pci_enable_link_state+0x69/0x1d0
<4>[ 13.815482] pci_enable_link_state+0x69/0x1d0
<4>[ 13.815485] ? __pfx_vmd_pm_enable_quirk+0x10/0x10
<4>[ 13.815488] vmd_pm_enable_quirk+0x49/0xb0
<4>[ 13.815490] pci_walk_bus+0x6d/0x90
<4>[ 13.815492] vmd_probe+0x75f/0x9d0
<4>[ 13.815495] pci_device_probe+0x95/0x120
<4>[ 13.815498] really_probe+0x164/0x3c0
<4>[ 13.815500] ? __pfx___driver_attach+0x10/0x10
<4>[ 13.815503] __driver_probe_device+0x73/0x170
<4>[ 13.815506] driver_probe_device+0x19/0xa0
<4>[ 13.815508] __driver_attach+0xb6/0x180
<4>[ 13.815511] ? __pfx___driver_attach+0x10/0x10
<4>[ 13.815513] bus_for_each_dev+0x77/0xd0
<4>[ 13.815516] bus_add_driver+0x114/0x210
<4>[ 13.815518] driver_register+0x5b/0x110
<4>[ 13.815520] ? __pfx_vmd_drv_init+0x10/0x10
<4>[ 13.815523] do_one_initcall+0x57/0x330
<4>[ 13.815527] kernel_init_freeable+0x181/0x3a0
<4>[ 13.815529] ? __pfx_kernel_init+0x10/0x10
<4>[ 13.815532] kernel_init+0x15/0x120
<4>[ 13.815534] ret_from_fork+0x29/0x50
<4>[ 13.815537] </TASK>
Hi,

On Tue, 2023-03-21 at 00:56 +0200, Ville Syrjälä wrote:
> On Thu, Jan 19, 2023 at 07:15:22PM -0800, David E. Box wrote:
> > +/*
> > + * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
> > + */
> > +static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > +{
> > +	unsigned long features = *(unsigned long *)userdata;
> > +	u16 ltr = VMD_BIOS_PM_QUIRK_LTR;
> > +	u32 ltr_reg;
> > +	int pos;
> > +
> > +	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
> > +		return 0;
> > +
> > +	pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);

We call pci_enable_link_state() from a callback that's run during
pci_walk_bus(), which I see already acquires the semaphore. We've had this
patch for well over a year and I haven't seen this issue before. Is there a
particular config needed to reproduce it?

As for a solution, I think we can copy what __pci_disable_link_state() does
and add a bool argument so that we only do down/up on the semaphore when it
is set to true. Since we know we will be holding the lock during the bus
walk, we can set it to false.

David

>
> Hi,
>
> This is tripping lockdep on one of our CI ADL machines.
>
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12814/bat-adlp-6/boot0.txt
>
> [...]
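The locking problem and the proposed bool-argument approach can be illustrated outside the kernel. What follows is a toy userspace model, not the fix that was eventually sent: every name in it (toy_down_read, toy_enable_link_state, toy_walk_bus, quirk_cb) is hypothetical, a plain counter stands in for pci_bus_sem, and "devices" are just integers holding link-state bits. It only shows why taking the semaphore again inside a bus-walk callback is flagged, and how a flag telling the helper not to take it avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for pci_bus_sem: tracks nesting depth and recursion. */
static int sem_depth;
static int recursive_acquires;

static void toy_down_read(void)
{
	if (sem_depth > 0)
		recursive_acquires++;	/* the pattern lockdep complains about */
	sem_depth++;
}

static void toy_up_read(void)
{
	assert(sem_depth > 0);
	sem_depth--;
}

/*
 * Hypothetical helper modeled on the proposal: only take and release
 * the semaphore when 'sem' is true. Callers that already hold it
 * (such as a bus-walk callback) pass false.
 */
static int toy_enable_link_state(int *dev_state, int state, bool sem)
{
	if (sem)
		toy_down_read();
	*dev_state |= state;
	if (sem)
		toy_up_read();
	return 0;
}

typedef int (*walk_cb)(int *dev_state, void *userdata);

/* Walk all devices while holding the semaphore, as pci_walk_bus() does. */
static void toy_walk_bus(int *devs, int n, walk_cb cb, void *userdata)
{
	toy_down_read();
	for (int i = 0; i < n; i++)
		if (cb(&devs[i], userdata))
			break;
	toy_up_read();
}

/* Callback run under the walk; userdata selects the locking behavior. */
static int quirk_cb(int *dev_state, void *userdata)
{
	bool sem = *(bool *)userdata;

	return toy_enable_link_state(dev_state, 0x7, sem);
}
```

Walking two devices with the flag set to true records two recursive acquisitions, one per callback invocation; with the flag set to false the same walk records none and both devices still end up with the link-state bits set.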
On Mon, Mar 20, 2023 at 07:24:16PM -0700, David E. Box wrote:
> Hi,
>
> On Tue, 2023-03-21 at 00:56 +0200, Ville Syrjälä wrote:
> > On Thu, Jan 19, 2023 at 07:15:22PM -0800, David E. Box wrote:
> > > +/*
> > > + * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
> > > + */
> > > +static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > > +{
> > > +	unsigned long features = *(unsigned long *)userdata;
> > > +	u16 ltr = VMD_BIOS_PM_QUIRK_LTR;
> > > +	u32 ltr_reg;
> > > +	int pos;
> > > +
> > > +	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
> > > +		return 0;
> > > +
> > > +	pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
>
> We call pci_enable_link_state from a callback that's run during pci_walk_bus()
> which I see already acquires the semaphore. We've had this patch for well over a
> year and I haven't seen this issue before. Is there a particular config needed
> to reproduce it?

Not sure what would affect it, beyond the normal PROVE_LOCKING=y.

This is the .config our CI uses:
https://gitlab.freedesktop.org/gfx-ci/i915-infra/-/blob/master/kconfig/debug

>
> As far as a solution I think we can copy what __pci_disable_link_state() does
> and add a bool argument so that we only do down/up on the semaphore when set to
> true. Since we know we will be holding the lock during the bus walk we can set
> it to false.
>
> David
>
> >
> > Hi,
> >
> > This is tripping lockdep on one of our CI ADL machines.
> >
> > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12814/bat-adlp-6/boot0.txt
> >
> > [...]
On Tue, 2023-03-21 at 14:38 +0200, Ville Syrjälä wrote:
> On Mon, Mar 20, 2023 at 07:24:16PM -0700, David E. Box wrote:
> > Hi,
> >
> > On Tue, 2023-03-21 at 00:56 +0200, Ville Syrjälä wrote:
> > > On Thu, Jan 19, 2023 at 07:15:22PM -0800, David E. Box wrote:
> > > > +/*
> > > > + * Enable ASPM and LTR settings on devices that aren't configured by
> > > > BIOS.
> > > > + */
> > > > +static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > > > +{
> > > > +	unsigned long features = *(unsigned long *)userdata;
> > > > +	u16 ltr = VMD_BIOS_PM_QUIRK_LTR;
> > > > +	u32 ltr_reg;
> > > > +	int pos;
> > > > +
> > > > +	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
> > > > +		return 0;
> > > > +
> > > > +	pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
> >
> > We call pci_enable_link_state from a callback that's run during
> > pci_walk_bus() which I see already acquires the semaphore. We've had this
> > patch for well over a year and I haven't seen this issue before. Is there
> > a particular config needed to reproduce it?
>
> Not sure what would affect it, beyond the normal PROVE_LOCKING=y.

Thanks. Did not have this set. I reproduced the issue and have sent a fix.

David

>
> This is the .config our CI uses:
> https://gitlab.freedesktop.org/gfx-ci/i915-infra/-/blob/master/kconfig/debug
>
> >
> > As far as a solution I think we can copy what __pci_disable_link_state()
> > does and add a bool argument so that we only do down/up on the semaphore
> > when set to true. Since we know we will be holding the lock during the bus
> > walk we can set it to false.
> >
> > David
> >
> > >
> > > Hi,
> > >
> > > This is tripping lockdep on one of our CI ADL machines.
> > >
> > > https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_12814/bat-adlp-6/boot0.txt
> > >
> > > [...]
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 47fa3e5f2dc5..990630ec57c6 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -66,11 +66,22 @@ enum vmd_features {
 	 * interrupt handling.
 	 */
 	VMD_FEAT_CAN_BYPASS_MSI_REMAP		= (1 << 4),
+
+	/*
+	 * Enable ASPM on the PCIE root ports and set the default LTR of the
+	 * storage devices on platforms where these values are not configured by
+	 * BIOS. This is needed for laptops, which require these settings for
+	 * proper power management of the SoC.
+	 */
+	VMD_FEAT_BIOS_PM_QUIRK		= (1 << 5),
 };
 
+#define VMD_BIOS_PM_QUIRK_LTR	0x1003	/* 3145728 ns */
+
 #define VMD_FEATS_CLIENT	(VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |	\
 				 VMD_FEAT_HAS_BUS_RESTRICTIONS |	\
-				 VMD_FEAT_OFFSET_FIRST_VECTOR)
+				 VMD_FEAT_OFFSET_FIRST_VECTOR |		\
+				 VMD_FEAT_BIOS_PM_QUIRK)
 
 static DEFINE_IDA(vmd_instance_ida);
 
@@ -713,6 +724,46 @@ static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
 	vmd_bridge->native_dpc = root_bridge->native_dpc;
 }
 
+/*
+ * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
+ */
+static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
+{
+	unsigned long features = *(unsigned long *)userdata;
+	u16 ltr = VMD_BIOS_PM_QUIRK_LTR;
+	u32 ltr_reg;
+	int pos;
+
+	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
+		return 0;
+
+	pci_enable_link_state(pdev, PCIE_LINK_STATE_ALL);
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
+	if (!pos)
+		return 0;
+
+	/*
+	 * Skip if the max snoop LTR is non-zero, indicating BIOS has set it
+	 * so the LTR quirk is not needed.
+	 */
+	pci_read_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, &ltr_reg);
+	if (!!(ltr_reg & (PCI_LTR_VALUE_MASK | PCI_LTR_SCALE_MASK)))
+		return 0;
+
+	/*
+	 * Set the default values to the maximum required by the platform to
+	 * allow the deepest power management savings. Write as a DWORD where
+	 * the lower word is the max snoop latency and the upper word is the
+	 * max non-snoop latency.
+	 */
+	ltr_reg = (ltr << 16) | ltr;
+	pci_write_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, ltr_reg);
+	pci_info(pdev, "VMD: Default LTR value set by driver\n");
+
+	return 0;
+}
+
 static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 {
 	struct pci_sysdata *sd = &vmd->sysdata;
@@ -885,6 +936,8 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 
 	pci_assign_unassigned_bus_resources(vmd->bus);
 
+	pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);
+
 	/*
 	 * VMD root buses are virtual and don't return true on pci_is_pcie()
 	 * and will fail pcie_bus_configure_settings() early. It can instead be