[v2,1/3] PCI/DPC: Request DPC only if also requesting AER

Message ID 20240222221521.32159-2-helgaas@kernel.org
State New
Headers
Series PCI/DPC: Clean up DPC vs AER/EDR ownership and Kconfig |

Commit Message

Bjorn Helgaas Feb. 22, 2024, 10:15 p.m. UTC
  From: Bjorn Helgaas <bhelgaas@google.com>

When booting with "pci=noaer", we don't request control of AER, but we
previously *did* request control of DPC, as in the dmesg log attached at
the bugzilla below:

  Command line: ... pci=noaer
  acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
  acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]

That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
says:

  If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
  must also set bit 7 of the Support field (indicating support for Error
  Disconnect Recover notifications) and bits 3 and 4 of the Control field
  (requesting control of PCI Express Advanced Error Reporting and the PCI
  Express Capability Structure).

Request DPC control only if we have also requested AER control.

Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>	# v5.7+
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: Matthew W Carlis <mattc@purestorage.com>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
---
 drivers/acpi/pci_root.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)
  

Comments

Kuppuswamy Sathyanarayanan Feb. 25, 2024, 7:46 p.m. UTC | #1
On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> When booting with "pci=noaer", we don't request control of AER, but we
> previously *did* request control of DPC, as in the dmesg log attached at
> the bugzilla below:
>
>   Command line: ... pci=noaer
>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
>
> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> says:
>
>   If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
>   must also set bit 7 of the Support field (indicating support for Error
>   Disconnect Recover notifications) and bits 3 and 4 of the Control field
>   (requesting control of PCI Express Advanced Error Reporting and the PCI
>   Express Capability Structure).

IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies Between
_OSC Control Bits".

Because handling of Downstream Port Containment has a dependency on Advanced Error
Reporting, the operating system is required to request control over Advanced Error Reporting (bit 3
of the Control field) while requesting control over Downstream Port Containment Configuration
(bit 7 of the Control field). If the operating system attempts to claim control of Downstream Port
Containment Configuration without also claiming control over Advanced Error Reporting, firmware
is required to refuse control of the feature being illegally claimed and mask the corresponding bit.
Firmware is required to maintain ownership of Advanced Error Reporting if it retains ownership of
Downstream Port Containment Configuration.
If the operating system sets bit 7 of the Control field, it must set bit 7 of the Support field, indicating
support for the Error Disconnect Recover event.

>
> Request DPC control only if we have also requested AER control.
>
> Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> Cc: <stable@vger.kernel.org>	# v5.7+
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Cc: Matthew W Carlis <mattc@purestorage.com>
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Lukas Wunner <lukas@wunner.de>
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> ---
Code wise it looks fine to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>  drivers/acpi/pci_root.c | 20 +++++++++++---------
>  1 file changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 58b89b8d950e..efc292b6214e 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -518,17 +518,19 @@ static u32 calculate_control(void)
>  	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
>  		control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
>  
> -	if (pci_aer_available())
> +	if (pci_aer_available()) {
>  		control |= OSC_PCI_EXPRESS_AER_CONTROL;
>  
> -	/*
> -	 * Per the Downstream Port Containment Related Enhancements ECN to
> -	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> -	 * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
> -	 * and EDR.
> -	 */
> -	if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
> -		control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> +		/*
> +		 * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
> +		 * OS can request DPC control only if it has advertised
> +		 * OSC_PCI_EDR_SUPPORT and requested both
> +		 * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
> +		 * OSC_PCI_EXPRESS_AER_CONTROL.
> +		 */
> +		if (IS_ENABLED(CONFIG_PCIE_DPC))
> +			control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> +	}
>  
>  	return control;
>  }
  
Bjorn Helgaas Feb. 26, 2024, 3:18 p.m. UTC | #2
On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
> 
> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <bhelgaas@google.com>
> >
> > When booting with "pci=noaer", we don't request control of AER, but we
> > previously *did* request control of DPC, as in the dmesg log attached at
> > the bugzilla below:
> >
> >   Command line: ... pci=noaer
> >   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
> >   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
> >
> > That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
> > says:
> >
> >   If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
> >   must also set bit 7 of the Support field (indicating support for Error
> >   Disconnect Recover notifications) and bits 3 and 4 of the Control field
> >   (requesting control of PCI Express Advanced Error Reporting and the PCI
> >   Express Capability Structure).
> 
> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
> Between _OSC Control Bits".
> 
> Because handling of Downstream Port Containment has a dependency on
> Advanced Error Reporting, the operating system is required to
> request control over Advanced Error Reporting (bit 3 of the Control
> field) while requesting control over Downstream Port Containment
> Configuration (bit 7 of the Control field). If the operating system
> attempts to claim control of Downstream Port Containment
> Configuration without also claiming control over Advanced Error
> Reporting, firmware is required to refuse control of the feature
> being illegally claimed and mask the corresponding bit.  Firmware is
> required to maintain ownership of Advanced Error Reporting if it
> retains ownership of Downstream Port Containment Configuration.  If
> the operating system sets bit 7 of the Control field, it must set
> bit 7 of the Support field, indicating support for the Error
> Disconnect Recover event.

So I guess you're suggesting that there are two defects here?

  1) Linux requested DPC control without requesting AER control.

  2) Platform granted DPC control when it shouldn't have.

I do agree with that, but obviously we can only fix 1) in Linux.

> > Request DPC control only if we have also requested AER control.
> >
> > Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
> > Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > Cc: <stable@vger.kernel.org>	# v5.7+
> > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Cc: Matthew W Carlis <mattc@purestorage.com>
> > Cc: Keith Busch <kbusch@kernel.org>
> > Cc: Lukas Wunner <lukas@wunner.de>
> > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > ---
> Code wise it looks fine to me.
> 
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> >  drivers/acpi/pci_root.c | 20 +++++++++++---------
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > index 58b89b8d950e..efc292b6214e 100644
> > --- a/drivers/acpi/pci_root.c
> > +++ b/drivers/acpi/pci_root.c
> > @@ -518,17 +518,19 @@ static u32 calculate_control(void)
> >  	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
> >  		control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
> >  
> > -	if (pci_aer_available())
> > +	if (pci_aer_available()) {
> >  		control |= OSC_PCI_EXPRESS_AER_CONTROL;
> >  
> > -	/*
> > -	 * Per the Downstream Port Containment Related Enhancements ECN to
> > -	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
> > -	 * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
> > -	 * and EDR.
> > -	 */
> > -	if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
> > -		control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> > +		/*
> > +		 * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
> > +		 * OS can request DPC control only if it has advertised
> > +		 * OSC_PCI_EDR_SUPPORT and requested both
> > +		 * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
> > +		 * OSC_PCI_EXPRESS_AER_CONTROL.
> > +		 */
> > +		if (IS_ENABLED(CONFIG_PCIE_DPC))
> > +			control |= OSC_PCI_EXPRESS_DPC_CONTROL;
> > +	}
> >  
> >  	return control;
> >  }
> 
> -- 
> Sathyanarayanan Kuppuswamy
> Linux Kernel Developer
>
  
Kuppuswamy Sathyanarayanan Feb. 26, 2024, 3:46 p.m. UTC | #3
On 2/26/24 7:18 AM, Bjorn Helgaas wrote:
> On Sun, Feb 25, 2024 at 11:46:07AM -0800, Kuppuswamy Sathyanarayanan wrote:
>> On 2/22/24 2:15 PM, Bjorn Helgaas wrote:
>>> From: Bjorn Helgaas <bhelgaas@google.com>
>>>
>>> When booting with "pci=noaer", we don't request control of AER, but we
>>> previously *did* request control of DPC, as in the dmesg log attached at
>>> the bugzilla below:
>>>
>>>   Command line: ... pci=noaer
>>>   acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI EDR HPX-Type3]
>>>   acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug SHPCHotplug PME PCIeCapability LTR DPC]
>>>
>>> That's illegal per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, which
>>> says:
>>>
>>>   If the operating system sets this bit [OSC_PCI_EXPRESS_DPC_CONTROL], it
>>>   must also set bit 7 of the Support field (indicating support for Error
>>>   Disconnect Recover notifications) and bits 3 and 4 of the Control field
>>>   (requesting control of PCI Express Advanced Error Reporting and the PCI
>>>   Express Capability Structure).
>> IIUC, this dependency is discussed in sec 4.5.2.4. "Dependencies
>> Between _OSC Control Bits".
>>
>> Because handling of Downstream Port Containment has a dependency on
>> Advanced Error Reporting, the operating system is required to
>> request control over Advanced Error Reporting (bit 3 of the Control
>> field) while requesting control over Downstream Port Containment
>> Configuration (bit 7 of the Control field). If the operating system
>> attempts to claim control of Downstream Port Containment
>> Configuration without also claiming control over Advanced Error
>> Reporting, firmware is required to refuse control of the feature
>> being illegally claimed and mask the corresponding bit.  Firmware is
>> required to maintain ownership of Advanced Error Reporting if it
>> retains ownership of Downstream Port Containment Configuration.  If
>> the operating system sets bit 7 of the Control field, it must set
>> bit 7 of the Support field, indicating support for the Error
>> Disconnect Recover event.
> So I guess you're suggesting that there are two defects here?
>
>   1) Linux requested DPC control without requesting AER control.
>
>   2) Platform granted DPC control when it shouldn't have.
>
> I do agree with that, but obviously we can only fix 1) in Linux.

Sorry, maybe my comment was not clear. I was just suggesting
to change the  spec reference from r3.3, sec 4.5.1, table 4-5
to  r3.3, sec 4.5.2.4 "Dependencies Between _OSC Control Bits". I agree that we cannot do much if firmware misbehaves here.

>>> Request DPC control only if we have also requested AER control.
>>>
>>> Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218491#c12
>>> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>> Cc: <stable@vger.kernel.org>	# v5.7+
>>> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>> Cc: Matthew W Carlis <mattc@purestorage.com>
>>> Cc: Keith Busch <kbusch@kernel.org>
>>> Cc: Lukas Wunner <lukas@wunner.de>
>>> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
>>> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
>>> ---
>> Code wise it looks fine to me.
>>
>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>>  drivers/acpi/pci_root.c | 20 +++++++++++---------
>>>  1 file changed, 11 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
>>> index 58b89b8d950e..efc292b6214e 100644
>>> --- a/drivers/acpi/pci_root.c
>>> +++ b/drivers/acpi/pci_root.c
>>> @@ -518,17 +518,19 @@ static u32 calculate_control(void)
>>>  	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
>>>  		control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
>>>  
>>> -	if (pci_aer_available())
>>> +	if (pci_aer_available()) {
>>>  		control |= OSC_PCI_EXPRESS_AER_CONTROL;
>>>  
>>> -	/*
>>> -	 * Per the Downstream Port Containment Related Enhancements ECN to
>>> -	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
>>> -	 * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
>>> -	 * and EDR.
>>> -	 */
>>> -	if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
>>> -		control |= OSC_PCI_EXPRESS_DPC_CONTROL;
>>> +		/*
>>> +		 * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
>>> +		 * OS can request DPC control only if it has advertised
>>> +		 * OSC_PCI_EDR_SUPPORT and requested both
>>> +		 * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
>>> +		 * OSC_PCI_EXPRESS_AER_CONTROL.
>>> +		 */
>>> +		if (IS_ENABLED(CONFIG_PCIE_DPC))
>>> +			control |= OSC_PCI_EXPRESS_DPC_CONTROL;
>>> +	}
>>>  
>>>  	return control;
>>>  }
>> -- 
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer
>>
  

Patch

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 58b89b8d950e..efc292b6214e 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -518,17 +518,19 @@  static u32 calculate_control(void)
 	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_SHPC))
 		control |= OSC_PCI_SHPC_NATIVE_HP_CONTROL;
 
-	if (pci_aer_available())
+	if (pci_aer_available()) {
 		control |= OSC_PCI_EXPRESS_AER_CONTROL;
 
-	/*
-	 * Per the Downstream Port Containment Related Enhancements ECN to
-	 * the PCI Firmware Spec, r3.2, sec 4.5.1, table 4-5,
-	 * OSC_PCI_EXPRESS_DPC_CONTROL indicates the OS supports both DPC
-	 * and EDR.
-	 */
-	if (IS_ENABLED(CONFIG_PCIE_DPC) && IS_ENABLED(CONFIG_PCIE_EDR))
-		control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+		/*
+		 * Per PCI Firmware Spec, r3.3, sec 4.5.1, table 4-5, the
+		 * OS can request DPC control only if it has advertised
+		 * OSC_PCI_EDR_SUPPORT and requested both
+		 * OSC_PCI_EXPRESS_CAPABILITY_CONTROL and
+		 * OSC_PCI_EXPRESS_AER_CONTROL.
+		 */
+		if (IS_ENABLED(CONFIG_PCIE_DPC))
+			control |= OSC_PCI_EXPRESS_DPC_CONTROL;
+	}
 
 	return control;
 }