[v2,02/11] iommu: Add nested domain support

Message ID 20230511143844.22693-3-yi.l.liu@intel.com
State New
Series iommufd: Add nesting infrastructure

Commit Message

Yi Liu May 11, 2023, 2:38 p.m. UTC
  From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new domain type for a user I/O page table, which is nested
on top of another user space address space represented by an UNMANAGED
domain. The mappings of a nested domain are managed by user space software,
so map/unmap callbacks are unnecessary. However, updates to the PTEs in the
nested domain page table must be propagated to the caches on both the
IOMMU (IOTLB) and the devices (DevTLB).

The nested domain is allocated by the domain_alloc_user op, and attached
to the device through the existing iommu_attach_device/group() interfaces.

A new domain op, cache_invalidate_user, is added so that user space can
flush the hardware caches for a nested domain through iommufd. No wrapper
is provided for it, as it is only supposed to be used by iommufd.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 11 +++++++++++
 1 file changed, 11 insertions(+)
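
For readers following the type-bit scheme, here is a minimal user-space sketch of how a nested type could be told apart from a kernel-managed one. It is not kernel code: every MOCK_ name and both helpers are hypothetical, and the value of the paging bit (bit 0) and its use for UNMANAGED are assumptions taken from outside this patch. Only bits 4 and 5 come from the hunk in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirrors of the type bits; not the kernel's own definitions. */
#define MOCK_DOMAIN_PAGING	(1U << 0)	/* assumption: kernel-managed map/unmap */
#define MOCK_DOMAIN_SVA		(1U << 4)	/* context line in this patch */
#define MOCK_DOMAIN_NESTED	(1U << 5)	/* bit added by this patch */

#define MOCK_TYPE_UNMANAGED	(MOCK_DOMAIN_PAGING)	/* assumption */
#define MOCK_TYPE_NESTED	(MOCK_DOMAIN_NESTED)

/* A nested type carries no paging bit, so no map/unmap callbacks apply. */
static_assert((MOCK_TYPE_NESTED & MOCK_DOMAIN_PAGING) == 0,
	      "nested type must not carry the paging bit");

/* True if the page table is owned by user space (nested domain). */
static bool mock_type_is_nested(unsigned int type)
{
	return (type & MOCK_DOMAIN_NESTED) != 0;
}

/* True if the kernel manages the mappings through map/unmap callbacks. */
static bool mock_type_has_map_ops(unsigned int type)
{
	return (type & MOCK_DOMAIN_PAGING) != 0;
}
```

With these helpers, mock_type_is_nested(MOCK_TYPE_NESTED) holds while mock_type_has_map_ops(MOCK_TYPE_NESTED) does not, which matches the commit message's point that a nested domain needs cache invalidation but no map/unmap.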
  

Comments

Tian, Kevin May 19, 2023, 8:51 a.m. UTC | #1
> From: Yi Liu <yi.l.liu@intel.com>
> Sent: Thursday, May 11, 2023 10:39 PM
> 
> @@ -66,6 +66,9 @@ struct iommu_domain_geometry {
> 
>  #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
> 
> +#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed IOVA nested on
> +					      a stage-2 translation        */

s/IOVA/address space/

> @@ -346,6 +350,10 @@ struct iommu_ops {
>   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
> + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> + * @cache_invalidate_user_data_len: Defined length of input user data for the
> + *                                  cache_invalidate_user op, being sizeof the
> + *                                  structure in include/uapi/linux/iommufd.h

same as comment to last patch, can this be merged with @hw_info?
  
Nicolin Chen May 19, 2023, 6:49 p.m. UTC | #2
On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
 
> > @@ -346,6 +350,10 @@ struct iommu_ops {
> >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> >   *            queue
> > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > + *                                  cache_invalidate_user op, being sizeof the
> > + *                                  structure in include/uapi/linux/iommufd.h
> 
> same as comment to last patch, can this be merged with @hw_info?

I think it's better to keep them separate, since this is added
in struct iommu_domain_ops, given it is domain/hwpt specific,
while the hw_info is in struct iommu_ops?

Thanks
Nic
  
Tian, Kevin May 24, 2023, 5:03 a.m. UTC | #3
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, May 20, 2023 2:49 AM
> 
> On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
> 
> > > @@ -346,6 +350,10 @@ struct iommu_ops {
> > >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > >   *            queue
> > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > + *                                  cache_invalidate_user op, being sizeof the
> > > + *                                  structure in include/uapi/linux/iommufd.h
> >
> > same as comment to last patch, can this be merged with @hw_info?
> 
> I think it's better to keep them separate, since this is added
> in struct iommu_domain_ops, given it is domain/hwpt specific,
> while the hw_info is in struct iommu_ops?
> 

Just curious whether there are real examples in which the data
length might differ depending on the hwpt type...
  
Nicolin Chen May 24, 2023, 5:28 a.m. UTC | #4
On Wed, May 24, 2023 at 05:03:37AM +0000, Tian, Kevin wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Saturday, May 20, 2023 2:49 AM
> >
> > On Fri, May 19, 2023 at 08:51:21AM +0000, Tian, Kevin wrote:
> >
> > > > @@ -346,6 +350,10 @@ struct iommu_ops {
> > > >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > > >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > > >   *            queue
> > > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > > + *                                  cache_invalidate_user op, being sizeof the
> > > > + *                                  structure in include/uapi/linux/iommufd.h
> > >
> > > same as comment to last patch, can this be merged with @hw_info?
> >
> > I think it's better to keep them separate, since this is added
> > in struct iommu_domain_ops, given it is domain/hwpt specific,
> > while the hw_info is in struct iommu_ops?
> >
> 
> Just curious whether there are real examples in which the data
> length might differ depending on the hwpt type...

Likely "no" off the top of my head. Yet it's a bit odd to see the
data length for cache_invalidate_user being added along with
the hw_info, since they belong to two different structs, and
even two different series?

Thanks
Nic
  

Patch

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7f2046fa53a3..c2d0fa3e2e18 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -66,6 +66,9 @@  struct iommu_domain_geometry {
 
 #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
 
+#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed IOVA nested on
+					      a stage-2 translation        */
+
 /*
  * This are the possible domain-types
  *
@@ -91,6 +94,7 @@  struct iommu_domain_geometry {
 				 __IOMMU_DOMAIN_DMA_API |	\
 				 __IOMMU_DOMAIN_DMA_FQ)
 #define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SVA)
+#define IOMMU_DOMAIN_NESTED	(__IOMMU_DOMAIN_NESTED)
 
 struct iommu_domain {
 	unsigned type;
@@ -346,6 +350,10 @@  struct iommu_ops {
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
+ * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
+ * @cache_invalidate_user_data_len: Defined length of input user data for the
+ *                                  cache_invalidate_user op, being sizeof the
+ *                                  structure in include/uapi/linux/iommufd.h
  * @iova_to_phys: translate iova to physical address
  * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
  *                           including no-snoop TLPs on PCIe or other platform
@@ -375,6 +383,9 @@  struct iommu_domain_ops {
 			       size_t size);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
+	int (*cache_invalidate_user)(struct iommu_domain *domain,
+				     void *user_data);
+	const size_t cache_invalidate_user_data_len;
 
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
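
To make the role of cache_invalidate_user_data_len concrete, the following user-space sketch shows one way a caller such as iommufd might pair the length with the op: copy exactly that many bytes of user data, then hand the copy to the driver. It is an illustration under stated assumptions, not the actual iommufd code. All mock_ names, the request layout, and the error codes chosen are hypothetical, and memcpy() stands in for copy_from_user().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct mock_domain;

/* Hypothetical stand-in for the two members added to the domain ops. */
struct mock_domain_ops {
	int (*cache_invalidate_user)(struct mock_domain *domain, void *user_data);
	const size_t cache_invalidate_user_data_len;
};

struct mock_domain {
	const struct mock_domain_ops *ops;
	unsigned int flushes;	/* counts invalidations, for illustration only */
};

/* Hypothetical driver-defined request; a real one would live in the uAPI. */
struct mock_inv_req {
	unsigned long long addr;
	unsigned long long npages;
};

/* Hypothetical driver callback: validate the request, then "flush". */
static int mock_cache_invalidate_user(struct mock_domain *domain, void *user_data)
{
	struct mock_inv_req req;

	memcpy(&req, user_data, sizeof(req));
	if (req.npages == 0)
		return -EINVAL;
	domain->flushes++;
	return 0;
}

static const struct mock_domain_ops mock_nested_ops = {
	.cache_invalidate_user = mock_cache_invalidate_user,
	.cache_invalidate_user_data_len = sizeof(struct mock_inv_req),
};

/* Hypothetical caller: copies the driver-defined length, then calls the op. */
static int mock_invalidate(struct mock_domain *domain, const void *ubuf, size_t ulen)
{
	unsigned char kbuf[64];
	size_t len;

	assert(domain && domain->ops);
	if (!domain->ops->cache_invalidate_user)
		return -EOPNOTSUPP;
	len = domain->ops->cache_invalidate_user_data_len;
	if (len == 0 || len > sizeof(kbuf) || ulen < len)
		return -EINVAL;
	memcpy(kbuf, ubuf, len);	/* stands in for copy_from_user() */
	return domain->ops->cache_invalidate_user(domain, kbuf);
}
```

Because the length sits next to the op in the same ops structure, the caller never needs to know the driver's request layout; that is also why the length is per domain ops here rather than per device ops, the point discussed in the thread above.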