[v4] PCI/ACPI: Validate devices with power resources support D3

Message ID 20221026215237.18556-1-mario.limonciello@amd.com
State New

Commit Message

Mario Limonciello Oct. 26, 2022, 9:52 p.m. UTC
Firmware typically advertises that ACPI devices that represent PCIe
devices can support D3 by a combination of the value returned by
_S0W as well as the HotPlugSupportInD3 _DSD [1].

`acpi_pci_bridge_d3` looks for this combination but also contains
an assumption that if an ACPI device contains power resources the PCIe
device it's associated with can support D3.  This was introduced
from commit c6e331312ebf ("PCI/ACPI: Whitelist hotplug ports for
D3 if power managed by ACPI").

Some firmware configurations for "AMD Pink Sardine" do not support
wake from D3 in _S0W for the ACPI device representing the PCIe root
port used for tunneling. The PCIe device will still be opted into
runtime PM in the kernel [2] because of the logic within
`acpi_pci_bridge_d3`. This currently happens because the ACPI
device contains power resources.

When the thunderbolt driver is loaded two device links are created:
* USB4 router <-> PCIe root port for tunneling
* USB4 router <-> XHCI PCIe device

These device links are created because the ACPI devices declare the
`usb4-host-interface` _DSD [3]. For both links the USB4 router is the
supplier and these other devices are the consumers.
The resulting topology looks like this:

|
├─ 00:03.1
|       | "PCIe root port used for tunneling"
|       | ACPI Path: \_SB_.PCI0.GP11
|       | ACPI Power Resources: Yes
|       | ACPI _S0W return value: 0
|       | Device Links: supplier:pci:0000:c4:00.5
|       └─ PCIe Power state: D0
└─ 00:08.3
        | ACPI Path: \_SB_.PCI0.GP19
        ├─ PCIe Power state: D0
        ├─ c4:00.3
        |       | "XHCI PCIe device used for tunneling"
        |       | ACPI Path: \_SB_.PCI0.GP19.XHC3
        |       | ACPI Power Resources: Yes
        |       | ACPI _S0W return value: 4
        |       | Device Links: supplier:pci:0000:c4:00.5
        |       └─ PCIe Power state: D3cold
        └─ c4:00.5
                | "USB4 Router"
                | ACPI Path: \_SB_.PCI0.GP19.NHI0
                | ACPI Power Resources: Yes
                | ACPI _S0W return value: 4
                | Device Links: consumer:pci:0000:00:03.1 consumer:pci:0000:c4:00.3
                └─ PCIe Power state: D3cold

Currently runtime PM is allowed for all of these devices.  This means that
when all consumers are idle long enough, they will enter their deepest allowed
sleep state. Once all consumers are in their deepest allowed sleep state the
suppliers will enter the deepest sleep state as well.

* The PCIe root port for tunneling doesn't support waking from D3hot or
  D3cold so it stays in D0.
* The XHCI PCIe device supports wakeup from D3cold so it goes to D3cold.
* Both consumers are in their deepest state and the USB4 router supports
  wakeup from D3cold, so it goes into this state.

The expectation is the USB4 router should have also remained in D0 since
the PCIe root port for tunneling remained in D0 and a device link exists
between the two devices.
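
As a rough illustration of the ordering described above (a toy model, not
kernel code), the gate on the supplier can be thought of as depending on the
consumers' runtime PM status rather than on the PCI power state they ended up
in.  All names below (`dev_model`, `supplier_may_suspend`, the `MODEL_*`
states) are hypothetical and exist only for this sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PCI power states in the diagram. */
enum model_pci_state { MODEL_D0, MODEL_D3HOT, MODEL_D3COLD };

/* Toy view of one consumer: runtime PM status plus resulting power state. */
struct dev_model {
	bool runtime_suspended;      /* power/runtime_status == "suspended" */
	enum model_pci_state state;  /* power_state as reported by sysfs */
};

/*
 * In this model a supplier may runtime suspend once every consumer is
 * runtime suspended.  The consumers' power state is deliberately not
 * consulted, which is how a consumer "suspended" in D0 still lets the
 * supplier go to D3cold.
 */
static bool supplier_may_suspend(const struct dev_model *consumers, size_t n)
{
	assert(consumers || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!consumers[i].runtime_suspended)
			return false;
	}
	return true;
}
```

With the root port modelled as runtime suspended but in D0 and the XHCI
device in D3cold, the function returns true; once the root port is no longer
allowed to runtime suspend, it returns false.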

Instead of making the assertion that the device can support D3 (and thus
runtime PM) solely from the presence of ACPI power resources, move the check
to later on in the function, which will have validated that the ACPI device
supports wake from D3hot or D3cold.

This fix prevents the USB4 router from being put into D3 when the firmware
says that the ACPI device representing the PCIe root port for tunneling can't
handle it, while still allowing ACPI devices that don't have the
HotPlugSupportInD3 _DSD to enter D3 if they have power resources and
can wake from D3.
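
The change in ordering can be sketched with a simplified model.  This is not
the code in drivers/pci/pci-acpi.c: the struct and its flags are hypothetical
and collapse the real checks (root port lookup, ACPI companion, _S0W
evaluation, _DSD property) into booleans, purely to show the old and new
decision order side by side:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, flattened view of what the function looks at. */
struct bridge_model {
	bool is_hotplug_bridge;
	bool has_power_resources;    /* stands in for acpi_pci_power_manageable() */
	bool can_wake_from_d3;       /* stands in for the wake / _S0W checks */
	bool hotplug_support_in_d3;  /* HotPlugSupportInD3 _DSD present and == 1 */
};

/* Old order: power resources short-circuit before wake is validated. */
static bool bridge_d3_old(const struct bridge_model *b)
{
	assert(b);
	if (!b->is_hotplug_bridge)
		return false;
	if (b->has_power_resources)
		return true;
	if (!b->can_wake_from_d3)
		return false;
	return b->hotplug_support_in_d3;
}

/* New order: power resources only count once wake has been validated. */
static bool bridge_d3_new(const struct bridge_model *b)
{
	assert(b);
	if (!b->is_hotplug_bridge)
		return false;
	if (!b->can_wake_from_d3)
		return false;
	if (b->hotplug_support_in_d3)
		return true;
	return b->has_power_resources;
}
```

For the root port used for tunneling (power resources present, _S0W of 0) the
old order returns true and the new order returns false.  A bridge with power
resources that can wake from D3 but lacks the _DSD still returns true in both.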

Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3 [1]
Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/pci/pcie/portdrv_pci.c#L126 [2]
Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/thunderbolt/acpi.c#L29 [3]
Fixes: dff6139015dc6 ("PCI/ACPI: Allow D3 only if Root Port can signal and wake from D3")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
v3->v4:
 * Pick up tags
 * Add more details to the commit message
v2->v3:
 * Reword commit message
v1->v2:
 * Just return value of acpi_pci_power_manageable (Rafael)
 * Remove extra word in commit message
---
 drivers/pci/pci-acpi.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
  

Comments

Lukas Wunner Oct. 27, 2022, 5:24 a.m. UTC | #1
On Wed, Oct 26, 2022 at 04:52:37PM -0500, Mario Limonciello wrote:
> Firmware typically advertises that ACPI devices that represent PCIe
> devices can support D3 by a combination of the value returned by
> _S0W as well as the HotPlugSupportInD3 _DSD [1].
> 
> `acpi_pci_bridge_d3` looks for this combination but also contains
> an assumption that if an ACPI device contains power resources the PCIe
> device it's associated with can support D3.  This was introduced
> from commit c6e331312ebf ("PCI/ACPI: Whitelist hotplug ports for
> D3 if power managed by ACPI").
> 
> Some firmware configurations for "AMD Pink Sardine" do not support
> wake from D3 in _S0W for the ACPI device representing the PCIe root
> port used for tunneling. The PCIe device will still be opted into
> runtime PM in the kernel [2] because of the logic within
> `acpi_pci_bridge_d3`. This currently happens because the ACPI
> device contains power resources.

So put briefly, in acpi_pci_bridge_d3() we fail to take wake capabilities
into account and blindly assume that a bridge can be runtime suspended
to D3 if it is power-manageable by ACPI.

By moving the acpi_pci_power_manageable() below the wake capabilities
checks, we avoid runtime suspending a bridge that is not wakeup capable.

The more verbose explanation in the commit message is useful to
understand how the issue was exposed, but it somewhat obscures
the issue itself.


> When the thunderbolt driver is loaded two device links are created:
> * USB4 router <-> PCIe root port for tunneling
> * USB4 router <-> XHCI PCIe device

Those double arrows are a little misleading, a device link is
unidirectional, so it's really <-- and not <->.


> Currently runtime PM is allowed for all of these devices.  This means that
> when all consumers are idle long enough, they will enter their deepest allowed
> sleep state. Once all consumers are in their deepest allowed sleep state the
> suppliers will enter the deepest sleep state as well.
> 
> * The PCIe root port for tunneling doesn't support waking from D3hot or
>   D3cold so it stays in D0.

Huh?  I thought it's runtime suspended to D3hot even though it should stay
runtime resumed in D0 because it's not wakeup capable in D3hot?


> * The XHCI PCIe device supports wakeup from D3cold so it goes to D3cold.
> * Both consumers are in their deepest state and the USB4 router supports
>   wakeup from D3cold, so it goes into this state.
> 
> The expectation is the USB4 router should have also remained in D0 since
> the PCIe root port for tunneling remained in D0 and a device link exists
> between the two devices.

This paragraph sounds like the problem is the router runtime suspended.
IIUC the router could only runtime suspend because its consumer, the
Root Port, runtime suspended.  By preventing the Root Port from runtime
suspending, you're implicitly preventing its supplier (the router)
from suspending.


> Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3 [1]
> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/pci/pcie/portdrv_pci.c#L126 [2]
> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/thunderbolt/acpi.c#L29 [3]

I think git.kernel.org links are preferred to 3rd party hosting services.

Thanks,

Lukas
  
Mario Limonciello Oct. 27, 2022, 7:56 p.m. UTC | #2
On 10/27/2022 00:24, Lukas Wunner wrote:
> On Wed, Oct 26, 2022 at 04:52:37PM -0500, Mario Limonciello wrote:
>> Firmware typically advertises that ACPI devices that represent PCIe
>> devices can support D3 by a combination of the value returned by
>> _S0W as well as the HotPlugSupportInD3 _DSD [1].
>>
>> `acpi_pci_bridge_d3` looks for this combination but also contains
>> an assumption that if an ACPI device contains power resources the PCIe
>> device it's associated with can support D3.  This was introduced
>> from commit c6e331312ebf ("PCI/ACPI: Whitelist hotplug ports for
>> D3 if power managed by ACPI").
>>
>> Some firmware configurations for "AMD Pink Sardine" do not support
>> wake from D3 in _S0W for the ACPI device representing the PCIe root
>> port used for tunneling. The PCIe device will still be opted into
>> runtime PM in the kernel [2] because of the logic within
>> `acpi_pci_bridge_d3`. This currently happens because the ACPI
>> device contains power resources.
> 
> So put briefly, in acpi_pci_bridge_d3() we fail to take wake capabilities
> into account and blindly assume that a bridge can be runtime suspended
> to D3 if it is power-manageable by ACPI.
> 
> By moving the acpi_pci_power_manageable() below the wake capabilities
> checks, we avoid runtime suspending a bridge that is not wakeup capable.
> 

Yes, spot on.

> The more verbose explanation in the commit message is useful to
> understand how the issue was exposed, but it somewhat obscures
> the issue itself.

Within this lengthy commit message I attempted to follow the model of:

"Status quo"
"Background"
"Problem Statement"
"Impact"
"Solution"

> 
> 
>> When the thunderbolt driver is loaded two device links are created:
>> * USB4 router <-> PCIe root port for tunneling
>> * USB4 router <-> XHCI PCIe device
> 
> Those double arrows are a little misleading, a device link is
> unidirectional, so it's really <-- and not <->.

Yes, that's correct.  Thanks.

> 
> 
>> Currently runtime PM is allowed for all of these devices.  This means that
>> when all consumers are idle long enough, they will enter their deepest allowed
>> sleep state. Once all consumers are in their deepest allowed sleep state the
>> suppliers will enter the deepest sleep state as well.
>>
>> * The PCIe root port for tunneling doesn't support waking from D3hot or
>>    D3cold so it stays in D0.
> 
> Huh?  I thought it's runtime suspended to D3hot even though it should stay
> runtime resumed in D0 because it's not wakeup capable in D3hot?

This is why I included the power state information in my topology diagram.

It's runtime suspended, but as it can't wake from D3hot it is 
"suspended" to D0.

$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/control
auto
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/runtime_enabled
enabled
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/runtime_status
suspended
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power_state
D0

> 
> 
>> * The XHCI PCIe device supports wakeup from D3cold so it goes to D3cold.
>> * Both consumers are in their deepest state and the USB4 router supports
>>    wakeup from D3cold, so it goes into this state.
>>
>> The expectation is the USB4 router should have also remained in D0 since
>> the PCIe root port for tunneling remained in D0 and a device link exists
>> between the two devices.
>
> This paragraph sounds like the problem is the router runtime suspended.
> IIUC the router could only runtime suspend because its consumer, the
> Root Port, runtime suspended.  By preventing the Root Port from runtime
> suspending, you're implicitly preventing its supplier (the router)
> from suspending.

Yes, but I think it's a matter of perspective.  Both of these PCIe 
devices are exposing interfaces to different parts of the same SoC.
This issue was identified because this sequence of events in the kernel 
leads to unexpected power sequencing within the USB4 IP.

From the perspective of the silicon designer the USB4 router shouldn't 
have "been able" to go into D3 until the PCIe root port for tunneling 
went into D3.  When the firmware prohibited the PCIe root port for 
tunneling to go into D3 this should implicitly prohibit the USB4 router 
as well.

I'll attempt to adjust my wording accordingly.

> 
> 
>> Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3 [1]
>> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/pci/pcie/portdrv_pci.c#L126 [2]
>> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/thunderbolt/acpi.c#L29 [3]
> 
> I think git.kernel.org links are preferred to 3rd party hosting services.

I wasn't aware of any such policy.  Within the last release it seemed to 
me Github was perfectly acceptable to use for links.

$ git log v6.0..v6.1-rc1 | grep "Link: https://github" | wc -l
107
$ git log v6.0..v6.1-rc1 | grep "Link: https://git.kernel.org" | wc -l
2

> 
> Thanks,
> 
> Lukas
  
Bjorn Helgaas Oct. 27, 2022, 8:44 p.m. UTC | #3
On Thu, Oct 27, 2022 at 02:56:19PM -0500, Limonciello, Mario wrote:
> On 10/27/2022 00:24, Lukas Wunner wrote:
> > ...

> > I think git.kernel.org links are preferred to 3rd party hosting services.
> 
> I wasn't aware of any such policy.  Within the last release it seemed to me
> Github was perfectly acceptable to use for links.
> 
> $ git log v6.0..v6.1-rc1 | grep "Link: https://github" | wc -l
> 107
> $ git log v6.0..v6.1-rc1 | grep "Link: https://git.kernel.org" | wc -l
> 2

I'm not aware of a formal policy, but I do prefer kernel.org links
because github is a 3rd party company that may not persist, may add
ads, etc.  I know github may *also* add value like fancier markup,
cross referencing, CI services, etc., but for commit logs, the
longevity of kernel.org is pretty persuasive to me.

There's a similar situation with mailing lists where many of the old
links to archives like marc.info, spinics.net, lkml.org, etc., are now
dead or not as useful for building tools (b4, for instance).

So no big deal, but I would probably silently convert them when
applying.  The current formats I use are:

  commits: https://git.kernel.org/linus/dff6139015dc
  files:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/pci-acpi.c?id=v6.0#n976
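
For anyone scripting the conversion, the commit format above could be produced
by something like the following.  This is only a sketch: the helper name is
made up, and the 12-character abbreviation is inferred from the example link
rather than from any stated rule:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write a git.kernel.org commit link for `sha` into
 * `buf`, abbreviating the hash to at most 12 characters.  Returns true if
 * the full link fit into the buffer.
 */
static bool kernel_org_commit_link(const char *sha, char *buf, size_t len)
{
	assert(sha && buf);

	if (strlen(sha) == 0)
		return false;

	int n = snprintf(buf, len, "https://git.kernel.org/linus/%.12s", sha);

	return n > 0 && (size_t)n < len;
}
```

Passing the hash from the Fixes: tag, "dff6139015dc6", yields the commit link
shown above.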

Bjorn
  

Patch

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index a46fec776ad77..8c6aec50dd471 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -984,10 +984,6 @@  bool acpi_pci_bridge_d3(struct pci_dev *dev)
 	if (acpi_pci_disabled || !dev->is_hotplug_bridge)
 		return false;
 
-	/* Assume D3 support if the bridge is power-manageable by ACPI. */
-	if (acpi_pci_power_manageable(dev))
-		return true;
-
 	rpdev = pcie_find_root_port(dev);
 	if (!rpdev)
 		return false;
@@ -1023,7 +1019,8 @@  bool acpi_pci_bridge_d3(struct pci_dev *dev)
 	    obj->integer.value == 1)
 		return true;
 
-	return false;
+	/* Assume D3 support if the bridge is power-manageable by ACPI. */
+	return acpi_pci_power_manageable(dev);
 }
 
 int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)