[v3,06/17] iommufd: Only enforce IOMMU_RESV_SW_MSI when attaching user-managed HWPT
Commit Message
From: Nicolin Chen <nicolinc@nvidia.com>
Reserved IOVA regions except for the IOMMU_RESV_DIRECT_RELAXABLE type
should be excluded from the device's DMA address space. For single stage
translation configuration, such reserved regions are excluded in the
attaching HWPT that is managed by kernel. While for nested translation
configuration, there are two stage HWPTs, the reserved regions should be
excluded in stage-1 HWPT which is managed by userspace. The current code
always excludes the reserved regions in the stage-2 HWPT which is kernel
managed. This is incorrect. Instead, the reserved regions should be
reported to userspace and excluded in stage-1 HWPT.
Besides above, the IOMMU_RESV_SW_MSI type region needs to be excluded in
the stage-2 HWPT even in the nested translation configuration on ARM. So
the IOMMU_RESV_SW_MSI type region should always be excluded in the kernel
managed HWPT.
This adds a boolean argument to enforce IOMMU_RESV_SW_MSI only, if attaching
HWPT is a user-managed one, i.e. hwpt->parent must be valid, resulting
"!!hwpt->parent" to be true, hence only add the IOMMU_RESV_SW_MSI region to
the stage-2 HWPT reserved iovas.
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
drivers/iommu/iommufd/device.c | 6 ++++--
drivers/iommu/iommufd/io_pagetable.c | 5 ++++-
drivers/iommu/iommufd/iommufd_private.h | 3 ++-
3 files changed, 10 insertions(+), 4 deletions(-)
Comments
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, July 24, 2023 7:04 PM
> }
>
> rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev-
> >dev,
> - &idev->igroup-
> >sw_msi_start);
> + &idev->igroup-
> >sw_msi_start,
> + !!hwpt->parent);
> if (rc)
> goto err_unlock;
I prefer to not setting parent ioas to hwpt in iommufd_hw_pagetable_alloc().
then here ioas can be retrieved from hwpt->parent and then it'd be pretty
clear that in nested case the sw_msi reservation happens in the parent
instead of pretending the stage-1 hwpt has an ioas too.
On Fri, Jul 28, 2023 at 10:02:36AM +0000, Tian, Kevin wrote:
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Monday, July 24, 2023 7:04 PM
> > }
> >
> > rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev-
> > >dev,
> > - &idev->igroup-
> > >sw_msi_start);
> > + &idev->igroup-
> > >sw_msi_start,
> > + !!hwpt->parent);
> > if (rc)
> > goto err_unlock;
>
> I prefer to not setting parent ioas to hwpt in iommufd_hw_pagetable_alloc().
Yes, the prior patch didn't add it to the iopt, so it shouldn't have
an ioas set. The NESTED domains don't have an IOAS almost by
definition.
> then here ioas can be retrieved from hwpt->parent and then it'd be pretty
> clear that in nested case the sw_msi reservation happens in the parent
> instead of pretending the stage-1 hwpt has an ioas too.
Yeah, I'm confused by this patch as well. Since there should be no
IOAS for the NESTED why are we messing with resv_regions?
Jason
@@ -347,7 +347,8 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt,
}
rc = iopt_table_enforce_dev_resv_regions(&hwpt->ioas->iopt, idev->dev,
- &idev->igroup->sw_msi_start);
+ &idev->igroup->sw_msi_start,
+ !!hwpt->parent);
if (rc)
goto err_unlock;
@@ -445,7 +446,8 @@ iommufd_device_do_replace(struct iommufd_device *idev,
if (hwpt->ioas != old_hwpt->ioas) {
list_for_each_entry(cur, &igroup->device_list, group_item) {
rc = iopt_table_enforce_dev_resv_regions(
- &hwpt->ioas->iopt, cur->dev, NULL);
+ &hwpt->ioas->iopt, cur->dev, NULL,
+ !!hwpt->parent);
if (rc)
goto err_unresv;
}
@@ -1172,7 +1172,8 @@ void iopt_remove_access(struct io_pagetable *iopt,
/* Narrow the valid_iova_itree to include reserved ranges from a device. */
int iopt_table_enforce_dev_resv_regions(struct io_pagetable *iopt,
struct device *dev,
- phys_addr_t *sw_msi_start)
+ phys_addr_t *sw_msi_start,
+ bool sw_msi_only)
{
struct iommu_resv_region *resv;
LIST_HEAD(resv_regions);
@@ -1198,6 +1199,8 @@ int iopt_table_enforce_dev_resv_regions(struct io_pagetable *iopt,
num_sw_msi++;
}
+ if (sw_msi_only && resv->type != IOMMU_RESV_SW_MSI)
+ continue;
rc = iopt_reserve_iova(iopt, resv->start,
resv->length - 1 + resv->start, dev);
if (rc)
@@ -79,7 +79,8 @@ void iopt_table_remove_domain(struct io_pagetable *iopt,
struct iommu_domain *domain);
int iopt_table_enforce_dev_resv_regions(struct io_pagetable *iopt,
struct device *dev,
- phys_addr_t *sw_msi_start);
+ phys_addr_t *sw_msi_start,
+ bool sw_msi_only);
int iopt_set_allow_iova(struct io_pagetable *iopt,
struct rb_root_cached *allowed_iova);
int iopt_reserve_iova(struct io_pagetable *iopt, unsigned long start,