[v2,1/2] iommu: Prevent RESV_DIRECT devices from blocking domains
Commit Message
The IOMMU_RESV_DIRECT flag indicates that a memory region must be mapped
1:1 at all times. This means that the region must always be accessible to
the device, even if the device is attached to a blocking domain. This is
equivalent to saying that the IOMMU_RESV_DIRECT flag prevents devices from
being attached to blocking domains.
This also implies that devices that implement RESV_DIRECT regions will be
prevented from being assigned to user space since taking the DMA ownership
immediately switches to a blocking domain.
The rule preventing devices with IOMMU_RESV_DIRECT regions from being
assigned to user space has existed in the Intel IOMMU driver for a long
time. This rule is now being lifted into the core as a general rule,
since other architectures such as AMD and ARM also have RMRR-like reserved
regions. This was discussed on the community mailing list; see the link
below for more details.
Other places using unmanaged domains for kernel DMA must follow
iommu_get_resv_regions() and set up IOMMU_RESV_DIRECT; the core code does
not restrict them.
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/linux-iommu/BN9PR11MB5276E84229B5BD952D78E9598C639@BN9PR11MB5276.namprd11.prod.outlook.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
---
include/linux/iommu.h | 2 ++
drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++----------
2 files changed, 29 insertions(+), 10 deletions(-)
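The attach-time rule described in the commit message can be sketched in user-space C as follows. This is only an illustrative model, not the kernel code: every model_* type, enumerator, and function name is a hypothetical stand-in, and the real check lives in __iommu_device_set_domain() as shown in the diff further down. The second comparison mirrors the patch's test against group->blocking_domain, on the assumption that a group's blocking domain is not always of the blocked type.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_domain_type {
	MODEL_DOMAIN_BLOCKED,
	MODEL_DOMAIN_DMA,
	MODEL_DOMAIN_UNMANAGED,
};

struct model_domain {
	enum model_domain_type type;
};

struct model_group {
	/* Domain used when DMA ownership is claimed (assumed to exist). */
	struct model_domain *blocking_domain;
};

struct model_dev {
	/* Set once an IOMMU_RESV_DIRECT region has been seen for the device. */
	bool requires_direct;
};

/*
 * Model of the new check: a device that needs a 1:1 mapping must never be
 * attached to a domain that blocks all DMA, whether that domain is typed
 * as blocked or is simply the group's designated blocking domain.
 */
static int model_device_set_domain(const struct model_group *group,
				   const struct model_dev *dev,
				   const struct model_domain *new_domain)
{
	if (dev->requires_direct &&
	    (new_domain->type == MODEL_DOMAIN_BLOCKED ||
	     new_domain == group->blocking_domain))
		return -EINVAL;

	return 0;
}
```

Because taking DMA ownership switches the group to its blocking domain, a device that fails this check also cannot be handed to vfio or iommufd, which is the user-space consequence the commit message describes.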
Comments
> From: Lu Baolu <baolu.lu@linux.intel.com>
> Sent: Thursday, July 13, 2023 12:33 PM
>
> @@ -409,6 +409,7 @@ struct iommu_fault_param {
> * @priv: IOMMU Driver private data
> * @max_pasids: number of PASIDs this device can consume
> * @attach_deferred: the dma domain attachment is deferred
> + * @requires_direct: The driver requested IOMMU_RESV_DIRECT
it's not accurate to say "driver requested" as it's a device attribute.
s/requires_direct/require_direct/
what about "has_resv_direct"?
> @@ -959,14 +959,12 @@ static int
> iommu_create_device_direct_mappings(struct iommu_domain *domain,
> unsigned long pg_size;
> int ret = 0;
>
> - if (!iommu_is_dma_domain(domain))
> - return 0;
> -
> - BUG_ON(!domain->pgsize_bitmap);
> -
> - pg_size = 1UL << __ffs(domain->pgsize_bitmap);
> + pg_size = domain->pgsize_bitmap ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
> INIT_LIST_HEAD(&mappings);
>
> + if (WARN_ON_ONCE(iommu_is_dma_domain(domain) && !pg_size))
> + return -EINVAL;
> +
> iommu_get_resv_regions(dev, &mappings);
>
> /* We need to consider overlapping regions for different devices */
> @@ -974,13 +972,17 @@ static int
> iommu_create_device_direct_mappings(struct iommu_domain *domain,
> dma_addr_t start, end, addr;
> size_t map_size = 0;
>
> + if (entry->type == IOMMU_RESV_DIRECT)
> + dev->iommu->requires_direct = 1;
> +
> + if ((entry->type != IOMMU_RESV_DIRECT &&
> + entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
> + !iommu_is_dma_domain(domain))
> + continue;
> +
> start = ALIGN(entry->start, pg_size);
> end = ALIGN(entry->start + entry->length, pg_size);
>
> - if (entry->type != IOMMU_RESV_DIRECT &&
> - entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
> - continue;
> -
> for (addr = start; addr <= end; addr += pg_size) {
> phys_addr_t phys_addr;
>
piggybacking a device attribute detection in a function which tries to
populate domain mappings is a bit confusing.
Would it work better to introduce a new function to detect this attribute
and have it called directly in the probe path?
> @@ -2121,6 +2123,21 @@ static int __iommu_device_set_domain(struct
> iommu_group *group,
> {
> int ret;
>
> + /*
> + * If the driver has requested IOMMU_RESV_DIRECT then we cannot
ditto. It's not requested by the driver.
> allow
> + * the blocking domain to be attached as it does not contain the
> + * required 1:1 mapping. This test effectively exclusive the device
s/exclusive/excludes/
On Fri, Jul 21, 2023 at 03:07:47AM +0000, Tian, Kevin wrote:
> > @@ -974,13 +972,17 @@ static int
> > iommu_create_device_direct_mappings(struct iommu_domain *domain,
> > dma_addr_t start, end, addr;
> > size_t map_size = 0;
> >
> > + if (entry->type == IOMMU_RESV_DIRECT)
> > + dev->iommu->requires_direct = 1;
> > +
> > + if ((entry->type != IOMMU_RESV_DIRECT &&
> > + entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
> > + !iommu_is_dma_domain(domain))
> > + continue;
>
> piggybacking a device attribute detection in a function which tries to
> populate domain mappings is a bit confusing.
It is, but to do otherwise we'd want to have the caller obtain the
reserved regions list and iterate it twice. Not sure it is worth the
trouble right now.
Jason
On Thu, Jul 13, 2023 at 12:32:47PM +0800, Lu Baolu wrote:
> The IOMMU_RESV_DIRECT flag indicates that a memory region must be mapped
> 1:1 at all times. This means that the region must always be accessible to
> the device, even if the device is attached to a blocking domain. This is
> equivalent to saying that the IOMMU_RESV_DIRECT flag prevents devices from
> being attached to blocking domains.
>
> This also implies that devices that implement RESV_DIRECT regions will be
> prevented from being assigned to user space since taking the DMA ownership
> immediately switches to a blocking domain.
>
> The rule preventing devices with IOMMU_RESV_DIRECT regions from being
> assigned to user space has existed in the Intel IOMMU driver for a long
> time. This rule is now being lifted into the core as a general rule,
> since other architectures such as AMD and ARM also have RMRR-like reserved
> regions. This was discussed on the community mailing list; see the link
> below for more details.
>
> Other places using unmanaged domains for kernel DMA must follow
> iommu_get_resv_regions() and set up IOMMU_RESV_DIRECT; the core code does
> not restrict them.
>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Kevin Tian <kevin.tian@intel.com>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> Link: https://lore.kernel.org/linux-iommu/BN9PR11MB5276E84229B5BD952D78E9598C639@BN9PR11MB5276.namprd11.prod.outlook.com
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> ---
> include/linux/iommu.h | 2 ++
> drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++----------
> 2 files changed, 29 insertions(+), 10 deletions(-)
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Jason
On 2023/7/21 11:07, Tian, Kevin wrote:
>> From: Lu Baolu <baolu.lu@linux.intel.com>
>> Sent: Thursday, July 13, 2023 12:33 PM
>>
>> @@ -409,6 +409,7 @@ struct iommu_fault_param {
>> * @priv: IOMMU Driver private data
>> * @max_pasids: number of PASIDs this device can consume
>> * @attach_deferred: the dma domain attachment is deferred
>> + * @requires_direct: The driver requested IOMMU_RESV_DIRECT
>
> it's not accurate to say "driver requested" as it's a device attribute.
>
> s/requires_direct/require_direct/
>
> what about "has_resv_direct"?
How about
* @require_direct: device requires IOMMU_RESV_DIRECT reserved regions
?
>
>> @@ -959,14 +959,12 @@ static int
>> iommu_create_device_direct_mappings(struct iommu_domain *domain,
>> unsigned long pg_size;
>> int ret = 0;
>>
>> - if (!iommu_is_dma_domain(domain))
>> - return 0;
>> -
>> - BUG_ON(!domain->pgsize_bitmap);
>> -
>> - pg_size = 1UL << __ffs(domain->pgsize_bitmap);
>> + pg_size = domain->pgsize_bitmap ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
>> INIT_LIST_HEAD(&mappings);
>>
>> + if (WARN_ON_ONCE(iommu_is_dma_domain(domain) && !pg_size))
>> + return -EINVAL;
>> +
>> iommu_get_resv_regions(dev, &mappings);
>>
>> /* We need to consider overlapping regions for different devices */
>> @@ -974,13 +972,17 @@ static int
>> iommu_create_device_direct_mappings(struct iommu_domain *domain,
>> dma_addr_t start, end, addr;
>> size_t map_size = 0;
>>
>> + if (entry->type == IOMMU_RESV_DIRECT)
>> + dev->iommu->requires_direct = 1;
>> +
>> + if ((entry->type != IOMMU_RESV_DIRECT &&
>> + entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
>> + !iommu_is_dma_domain(domain))
>> + continue;
>> +
>> start = ALIGN(entry->start, pg_size);
>> end = ALIGN(entry->start + entry->length, pg_size);
>>
>> - if (entry->type != IOMMU_RESV_DIRECT &&
>> - entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
>> - continue;
>> -
>> for (addr = start; addr <= end; addr += pg_size) {
>> phys_addr_t phys_addr;
>>
>
> piggybacking a device attribute detection in a function which tries to
> populate domain mappings is a bit confusing.
>
> Would it work better to introduce a new function to detect this attribute
> and have it called directly in the probe path?
Jason answered this.
>
>> @@ -2121,6 +2123,21 @@ static int __iommu_device_set_domain(struct
>> iommu_group *group,
>> {
>> int ret;
>>
>> + /*
>> + * If the driver has requested IOMMU_RESV_DIRECT then we cannot
>
> ditto. It's not requested by the driver.
>
>> allow
>> + * the blocking domain to be attached as it does not contain the
>> + * required 1:1 mapping. This test effectively exclusive the device
>
> s/exclusive/excludes/
>
Updated. Thanks!
Best regards,
baolu
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, July 21, 2023 11:10 PM
>
> On Fri, Jul 21, 2023 at 03:07:47AM +0000, Tian, Kevin wrote:
> > > @@ -974,13 +972,17 @@ static int
> > > iommu_create_device_direct_mappings(struct iommu_domain *domain,
> > > dma_addr_t start, end, addr;
> > > size_t map_size = 0;
> > >
> > > + if (entry->type == IOMMU_RESV_DIRECT)
> > > + dev->iommu->requires_direct = 1;
> > > +
> > > + if ((entry->type != IOMMU_RESV_DIRECT &&
> > > + entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
> > > + !iommu_is_dma_domain(domain))
> > > + continue;
> >
> > piggybacking a device attribute detection in a function which tries to
> > populate domain mappings is a bit confusing.
>
> It is, but to do otherwise we'd want to have the caller obtain the
> reserved regions list and iterate it twice. Not sure it is worth the
> trouble right now.
>
Not a strong opinion, but it's a slow path and readability is
preferable to me. 😊
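For illustration, the separate detection step discussed in this sub-thread might look roughly like the user-space sketch below. It is not part of the posted patch: the model_* types and the helper name are hypothetical, and the region list is modelled as a plain array rather than the list that iommu_get_resv_regions() fills in. As in the patch, only IOMMU_RESV_DIRECT sets the attribute; the relaxable variant does not.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the reserved region types. */
enum model_resv_type {
	MODEL_RESV_DIRECT,
	MODEL_RESV_DIRECT_RELAXABLE,
	MODEL_RESV_RESERVED,
	MODEL_RESV_MSI,
};

struct model_resv_region {
	enum model_resv_type type;
	unsigned long long start;
	unsigned long long length;
};

/*
 * Hypothetical helper: report whether any reserved region of the device is
 * strictly direct-mapped. A probe path could call this once and cache the
 * result, at the cost of walking the region list a second time when the
 * direct mappings are created later.
 */
static bool model_dev_has_resv_direct(const struct model_resv_region *regions,
				      size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (regions[i].type == MODEL_RESV_DIRECT)
			return true;
	}

	return false;
}
```

The trade-off is the one raised above: the helper keeps attribute detection out of the mapping function, but the caller has to obtain and iterate the reserved regions twice.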
@@ -409,6 +409,7 @@ struct iommu_fault_param {
* @priv: IOMMU Driver private data
* @max_pasids: number of PASIDs this device can consume
* @attach_deferred: the dma domain attachment is deferred
+ * @requires_direct: The driver requested IOMMU_RESV_DIRECT
*
* TODO: migrate other per device data pointers under iommu_dev_data, e.g.
* struct iommu_group *iommu_group;
@@ -422,6 +423,7 @@ struct dev_iommu {
void *priv;
u32 max_pasids;
u32 attach_deferred:1;
+ u32 requires_direct:1;
};
int iommu_device_register(struct iommu_device *iommu,
@@ -959,14 +959,12 @@ static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
unsigned long pg_size;
int ret = 0;
- if (!iommu_is_dma_domain(domain))
- return 0;
-
- BUG_ON(!domain->pgsize_bitmap);
-
- pg_size = 1UL << __ffs(domain->pgsize_bitmap);
+ pg_size = domain->pgsize_bitmap ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
INIT_LIST_HEAD(&mappings);
+ if (WARN_ON_ONCE(iommu_is_dma_domain(domain) && !pg_size))
+ return -EINVAL;
+
iommu_get_resv_regions(dev, &mappings);
/* We need to consider overlapping regions for different devices */
@@ -974,13 +972,17 @@ static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
dma_addr_t start, end, addr;
size_t map_size = 0;
+ if (entry->type == IOMMU_RESV_DIRECT)
+ dev->iommu->requires_direct = 1;
+
+ if ((entry->type != IOMMU_RESV_DIRECT &&
+ entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
+ !iommu_is_dma_domain(domain))
+ continue;
+
start = ALIGN(entry->start, pg_size);
end = ALIGN(entry->start + entry->length, pg_size);
- if (entry->type != IOMMU_RESV_DIRECT &&
- entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
- continue;
-
for (addr = start; addr <= end; addr += pg_size) {
phys_addr_t phys_addr;
@@ -2121,6 +2123,21 @@ static int __iommu_device_set_domain(struct iommu_group *group,
{
int ret;
+ /*
+ * If the driver has requested IOMMU_RESV_DIRECT then we cannot allow
+ * the blocking domain to be attached as it does not contain the
+ * required 1:1 mapping. This test effectively exclusive the device from
+ * being used with iommu_group_claim_dma_owner() which will block vfio
+ * and iommufd as well.
+ */
+ if (dev->iommu->requires_direct &&
+ (new_domain->type == IOMMU_DOMAIN_BLOCKED ||
+ new_domain == group->blocking_domain)) {
+ dev_warn(dev,
+ "Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.\n");
+ return -EINVAL;
+ }
+
if (dev->iommu->attach_deferred) {
if (new_domain == group->default_domain)
return 0;