[v2,2/8] iommu/vt-d: Improve iommu_enable_pci_caps()
Commit Message
The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
This improves iommu_enable_pci_caps() to avoid unnecessary kernel
traces through checking and enabling. This also adds kernel messages
if any feature enabling results in failure. It is worth noting that
PRI depends on ATS. This adds a check as well.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
drivers/iommu/intel/iommu.c | 86 ++++++++++++++++++++++++++-----------
1 file changed, 61 insertions(+), 25 deletions(-)
Comments
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Tuesday, November 8, 2022 3:34 PM
>
> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
> This improves iommu_enable_pci_caps() to avoid unnecessary kernel
> traces through checking and enabling. This also adds kernel messages
> if any feature enabling results in failure. It is worth noting that
> PRI depends on ATS. This adds a check as well.
Cannot we have a helper to check whether this device has been attached
to any domain? If no in the blocking path then disable PCI caps. If no
in the attaching path then enable PCI caps.
I just didn't get the point of leaving them enabled while the device can
not do any DMA at all.
On 2022/11/11 11:45, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Tuesday, November 8, 2022 3:34 PM
>>
>> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
>> This improves iommu_enable_pci_caps() to avoid unnecessary kernel
>> traces through checking and enabling. This also adds kernel messages
>> if any feature enabling results in failure. It is worth noting that
>> PRI depends on ATS. This adds a check as well.
>
> Cannot we have a helper to check whether this device has been attached
> to any domain? If no in the blocking path then disable PCI caps. If no
> in the attaching path then enable PCI caps.
>
> I just didn't get the point of leaving them enabled while the device can
> not do any DMA at all.
Ideally, the kernel owns the default policy (default on or off). The
upper layers are able to control it over IOMMUFD uAPI or kerneld kAPI.
I can't see the benefits of associating these features with the
existence of any domain.
The VT-d spec seems to use the same idea. The control of PASID/ATS are
placed in the device context fields, while the setting of domains are
placed in the PASID entry fields.
Best regards,
baolu
> From: Baolu Lu <baolu.lu@linux.intel.com>
> Sent: Friday, November 11, 2022 2:59 PM
>
> On 2022/11/11 11:45, Tian, Kevin wrote:
> >> From: Lu Baolu <baolu.lu@linux.intel.com>
> >> Sent: Tuesday, November 8, 2022 3:34 PM
> >>
> >> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
> >> This improves iommu_enable_pci_caps() to avoid unnecessary kernel
> >> traces through checking and enabling. This also adds kernel messages
> >> if any feature enabling results in failure. It is worth noting that
> >> PRI depends on ATS. This adds a check as well.
> >
> > Cannot we have a helper to check whether this device has been attached
> > to any domain? If no in the blocking path then disable PCI caps. If no
> > in the attaching path then enable PCI caps.
> >
> > I just didn't get the point of leaving them enabled while the device can
> > not do any DMA at all.
>
> Ideally, the kernel owns the default policy (default on or off). The
> upper layers are able to control it over IOMMUFD uAPI or kerneld kAPI.
> I can't see the benefits of associating these features with the
> existence of any domain.
we don't have such uAPI or kAPI today.
the current behavior before your change is default off and then toggled
along with attach/detach domain. as only one domain is allowed per
RID it implies the capabilities are toggled along with DMA allow/block.
now you change it to a messy model:
- default off when the device is probed
- turn on at the 1st domain attach and never turn off until release
- but iommu_enable_pci_caps() is still called at every domain attach
with band-aid to allow re-entrant
this isn't like a good cleanup...
On 2022/11/11 16:16, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@linux.intel.com>
>> Sent: Friday, November 11, 2022 2:59 PM
>>
>> On 2022/11/11 11:45, Tian, Kevin wrote:
>>>> From: Lu Baolu <baolu.lu@linux.intel.com>
>>>> Sent: Tuesday, November 8, 2022 3:34 PM
>>>>
>>>> The PCI subsystem triggers WARN() if a feature is repeatedly enabled.
>>>> This improves iommu_enable_pci_caps() to avoid unnecessary kernel
>>>> traces through checking and enabling. This also adds kernel messages
>>>> if any feature enabling results in failure. It is worth noting that
>>>> PRI depends on ATS. This adds a check as well.
>>>
>>> Cannot we have a helper to check whether this device has been attached
>>> to any domain? If no in the blocking path then disable PCI caps. If no
>>> in the attaching path then enable PCI caps.
>>>
>>> I just didn't get the point of leaving them enabled while the device can
>>> not do any DMA at all.
>>
>> Ideally, the kernel owns the default policy (default on or off). The
>> upper layers are able to control it over IOMMUFD uAPI or kerneld kAPI.
>> I can't see the benefits of associating these features with the
>> existence of any domain.
>
> we don't have such uAPI or kAPI today.
>
> the current behavior before your change is default off and then toggled
> along with attach/detach domain. as only one domain is allowed per
> RID it implies the capabilities are toggled along with DMA allow/block.
>
> now you change it to a messy model:
>
> - default off when the device is probed
> - turn on at the 1st domain attach and never turn off until release
> - but iommu_enable_pci_caps() is still called at every domain attach
> with band-aid to allow re-entrant
>
> this isn't like a good cleanup...
Fair enough. We should not bury this behavior change in a cleanup
series. Okay! I will keep the previous behavior.
Best regards,
baolu
@@ -1401,44 +1401,80 @@ static void domain_update_iotlb(struct dmar_domain *domain)
static void iommu_enable_pci_caps(struct device_domain_info *info)
{
struct pci_dev *pdev;
+ int ret;
if (!info || !dev_is_pci(info->dev))
return;
pdev = to_pci_dev(info->dev);
- /* For IOMMU that supports device IOTLB throttling (DIT), we assign
- * PFSID to the invalidation desc of a VF such that IOMMU HW can gauge
- * queue depth at PF level. If DIT is not set, PFSID will be treated as
- * reserved, which should be set to 0.
+ /*
+ * The PCIe spec, in its wisdom, declares that the behaviour of
+ * the device if you enable PASID support after ATS support is
+ * undefined. So always enable PASID support on devices which
+ * have it, even if we can't yet know if we're ever going to
+ * use it.
*/
- if (!ecap_dit(info->iommu->ecap))
- info->pfsid = 0;
- else {
- struct pci_dev *pf_pdev;
-
- /* pdev will be returned if device is not a vf */
- pf_pdev = pci_physfn(pdev);
- info->pfsid = pci_dev_id(pf_pdev);
+ if (info->pasid_supported && !info->pasid_enabled) {
+ ret = pci_enable_pasid(pdev, info->pasid_supported & ~1);
+ if (ret)
+ pci_info(pdev, "Failed to enable PASID: %d\n", ret);
+ else
+ info->pasid_enabled = 1;
}
- /* The PCIe spec, in its wisdom, declares that the behaviour of
- the device if you enable PASID support after ATS support is
- undefined. So always enable PASID support on devices which
- have it, even if we can't yet know if we're ever going to
- use it. */
- if (info->pasid_supported && !pci_enable_pasid(pdev, info->pasid_supported & ~1))
- info->pasid_enabled = 1;
+ if (info->ats_supported && !info->ats_enabled) {
+ if (!pci_ats_page_aligned(pdev)) {
+ pci_info(pdev, "Untranslated Addresses not aligned\n");
+ return;
+ }
- if (info->pri_supported &&
- (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) : 1) &&
- !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
- info->pri_enabled = 1;
+ ret = pci_enable_ats(pdev, VTD_PAGE_SHIFT);
+ if (ret) {
+ pci_info(pdev, "Failed to enable ATS: %d\n", ret);
+ return;
+ }
- if (info->ats_supported && pci_ats_page_aligned(pdev) &&
- !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
domain_update_iotlb(info->domain);
info->ats_qdep = pci_ats_queue_depth(pdev);
+
+ /*
+ * For IOMMU that supports device IOTLB throttling (DIT),
+ * we assign PFSID to the invalidation desc of a VF such
+ * that IOMMU HW can gauge queue depth at PF level. If DIT
+ * is not set, PFSID will be treated as reserved, which
+ * should be set to 0.
+ */
+ if (ecap_dit(info->iommu->ecap)) {
+ struct pci_dev *pf_pdev;
+
+ /* pdev will be returned if device is not a vf */
+ pf_pdev = pci_physfn(pdev);
+ info->pfsid = pci_dev_id(pf_pdev);
+ } else {
+ info->pfsid = 0;
+ }
+ }
+
+ if (info->pri_supported && !info->pri_enabled && info->ats_enabled) {
+ if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev)) {
+ pci_info(pdev, "PRG Response PASID Required\n");
+ return;
+ }
+
+ ret = pci_reset_pri(pdev);
+ if (ret) {
+ pci_info(pdev, "Failed to reset PRI: %d\n", ret);
+ return;
+ }
+
+ ret = pci_enable_pri(pdev, PRQ_DEPTH);
+ if (ret) {
+ pci_info(pdev, "Failed to enable PRI: %d\n", ret);
+ return;
+ }
+
+ info->pri_enabled = 1;
}
}