[v3,02/17] iommu: Add nested domain support

Message ID 20230724110406.107212-3-yi.l.liu@intel.com
State New
Series iommufd: Add nesting infrastructure

Commit Message

Yi Liu July 24, 2023, 11:03 a.m. UTC
  From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new domain type for a user I/O page table, which is nested
on top of another user space address space represented by an UNMANAGED
domain. The mappings of a nested domain are managed by user space
software, so there is no need for map/unmap callbacks. However, updates
to the PTEs in the nested domain page table must still be propagated to
the caches on both the IOMMU (IOTLB) and the devices (DevTLB).

The nested domain is allocated by the domain_alloc_user op, and attached
to the device through the existing iommu_attach_device/group() interfaces.

A new domain op, cache_invalidate_user, is added to let userspace flush
the hardware caches for a nested domain through iommufd. No wrapper API
is added, as the op is only supposed to be used by iommufd.

Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 11 +++++++++++
 1 file changed, 11 insertions(+)
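
For orientation, a minimal sketch of the flow the commit message describes, assuming a simplified domain_alloc_user signature; the real signature is defined earlier in this series, and the helper below is hypothetical:

#include <linux/err.h>
#include <linux/iommu.h>

/*
 * Illustrative only: allocate a nested domain on top of a stage-2
 * parent via the domain_alloc_user op, then attach it through the
 * existing interface. The op's real signature may differ.
 */
static struct iommu_domain *
sketch_alloc_and_attach_nested(struct device *dev,
			       struct iommu_domain *s2_parent,
			       const void *user_data)
{
	const struct iommu_ops *ops = dev_iommu_ops(dev);
	struct iommu_domain *nested;
	int rc;

	/* Assumed signature: (device, stage-2 parent, vendor user data) */
	nested = ops->domain_alloc_user(dev, s2_parent, user_data);
	if (IS_ERR(nested))
		return nested;

	rc = iommu_attach_device(nested, dev);	/* existing interface */
	if (rc) {
		iommu_domain_free(nested);
		return ERR_PTR(rc);
	}
	return nested;
}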
  

Comments

Tian, Kevin July 28, 2023, 9:38 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Monday, July 24, 2023 7:04 PM
> 
> From: Lu Baolu <baolu.lu@linux.intel.com>
> 
> Introduce a new domain type for a user I/O page table, which is nested
> on top of another user space address space represented by an UNMANAGED
> domain. The mappings of a nested domain are managed by user space
> software, so there is no need for map/unmap callbacks. However, updates
> to the PTEs in the nested domain page table must still be propagated to
> the caches on both the IOMMU (IOTLB) and the devices (DevTLB).
> 
> The nested domain is allocated by the domain_alloc_user op, and attached
> to the device through the existing iommu_attach_device/group() interfaces.
> 
> A new domain op, cache_invalidate_user, is added to let userspace flush
> the hardware caches for a nested domain through iommufd. No wrapper API
> is added, as the op is only supposed to be used by iommufd.
> 
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
  
Jason Gunthorpe July 28, 2023, 4:59 p.m. UTC | #2
On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:

> @@ -350,6 +354,10 @@ struct iommu_ops {
>   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
>   *            queue
> + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> + * @cache_invalidate_user_data_len: Defined length of input user data for the
> + *                                  cache_invalidate_user op, being sizeof the
> + *                                  structure in include/uapi/linux/iommufd.h
>   * @iova_to_phys: translate iova to physical address
>   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
>   *                           including no-snoop TLPs on PCIe or other platform
> @@ -379,6 +387,9 @@ struct iommu_domain_ops {
>  			       size_t size);
>  	void (*iotlb_sync)(struct iommu_domain *domain,
>  			   struct iommu_iotlb_gather *iotlb_gather);
> +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> +				     void *user_data);

If we are doing const unions, then this void * should also be a const
union.

Jason
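
Concretely, the suggestion would change the op along these lines (a sketch of the direction under discussion, not the merged form; ucmd_buffer is the iommufd-internal union debated below):

	int (*cache_invalidate_user)(struct iommu_domain *domain,
				     const union ucmd_buffer *user_data);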
  
Nicolin Chen Aug. 3, 2023, 2:36 a.m. UTC | #3
On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
> 
> > @@ -350,6 +354,10 @@ struct iommu_ops {
> >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> >   *            queue
> > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > + *                                  cache_invalidate_user op, being sizeof the
> > + *                                  structure in include/uapi/linux/iommufd.h
> >   * @iova_to_phys: translate iova to physical address
> >   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> >   *                           including no-snoop TLPs on PCIe or other platform
> > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> >  			       size_t size);
> >  	void (*iotlb_sync)(struct iommu_domain *domain,
> >  			   struct iommu_iotlb_gather *iotlb_gather);
> > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > +				     void *user_data);
> 
> If we are doing const unions, then this void * should also be a const
> union.

Unlike iommu_domain_user_data, which is a union on its own, all
invalidate user data structures are added to union ucmd_buffer. It
feels a bit weird to cross-reference "union ucmd_buffer" and to expose
the name "ucmd_buffer" in this cache_invalidate_user.

Any suggestion?

Thanks
Nic
  
Yi Liu Aug. 3, 2023, 2:53 a.m. UTC | #4
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Thursday, August 3, 2023 10:37 AM
> 
> On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> > On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
> >
> > > @@ -350,6 +354,10 @@ struct iommu_ops {
> > >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > >   *            queue
> > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > + *                                  cache_invalidate_user op, being sizeof the
> > > + *                                  structure in include/uapi/linux/iommufd.h
> > >   * @iova_to_phys: translate iova to physical address
> > >   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> > >   *                           including no-snoop TLPs on PCIe or other platform
> > > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> > >  			       size_t size);
> > >  	void (*iotlb_sync)(struct iommu_domain *domain,
> > >  			   struct iommu_iotlb_gather *iotlb_gather);
> > > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > > +				     void *user_data);
> >
> > If we are doing const unions, then this void * should also be a const
> > union.
> 
> Unlike iommu_domain_user_data, which is a union on its own, all
> invalidate user data structures are added to union ucmd_buffer. It
> feels a bit weird to cross-reference "union ucmd_buffer" and to expose
> the name "ucmd_buffer" in this cache_invalidate_user.
> 
> Any suggestion?

I think we can have a union like iommu_user_cache_invalidate; every new
data structure should be put in this union, and this union is put in
ucmd_buffer.

Regards,
Yi Liu
  
Nicolin Chen Aug. 3, 2023, 3:04 a.m. UTC | #5
On Thu, Aug 03, 2023 at 02:53:34AM +0000, Liu, Yi L wrote:
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Thursday, August 3, 2023 10:37 AM
> >
> > On Fri, Jul 28, 2023 at 01:59:28PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Jul 24, 2023 at 04:03:51AM -0700, Yi Liu wrote:
> > >
> > > > @@ -350,6 +354,10 @@ struct iommu_ops {
> > > >   * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
> > > >   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
> > > >   *            queue
> > > > + * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
> > > > + * @cache_invalidate_user_data_len: Defined length of input user data for the
> > > > + *                                  cache_invalidate_user op, being sizeof the
> > > > + *                                  structure in include/uapi/linux/iommufd.h
> > > >   * @iova_to_phys: translate iova to physical address
> > > >   * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
> > > >   *                           including no-snoop TLPs on PCIe or other platform
> > > > @@ -379,6 +387,9 @@ struct iommu_domain_ops {
> > > >  			       size_t size);
> > > >  	void (*iotlb_sync)(struct iommu_domain *domain,
> > > >  			   struct iommu_iotlb_gather *iotlb_gather);
> > > > +	int (*cache_invalidate_user)(struct iommu_domain *domain,
> > > > +				     void *user_data);
> > >
> > > If we are doing const unions, then this void * should also be a const
> > > union.
> >
> > Unlike iommu_domain_user_data, which is a union on its own, all
> > invalidate user data structures are added to union ucmd_buffer. It
> > feels a bit weird to cross-reference "union ucmd_buffer" and to expose
> > the name "ucmd_buffer" in this cache_invalidate_user.
> >
> > Any suggestion?
> 
> I think we can have a union like iommu_user_cache_invalidate; every new
> data structure should be put in this union, and this union is put in
> ucmd_buffer.

Ah, that should do the job.

Thanks!
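
For illustration, the layout Yi suggests might look like the sketch below; the member names are hypothetical, and the VT-d structure stands in for whatever per-vendor invalidate structures the series ends up defining:

/* Hypothetical sketch of the suggested nesting; names are illustrative. */
union iommu_user_cache_invalidate {
	struct iommu_hwpt_vtd_s1_invalidate vtd;  /* assumed vendor struct */
	/* one member per vendor-specific invalidate structure */
};

union ucmd_buffer {
	/* existing members for other iommufd commands elided */
	union iommu_user_cache_invalidate cache_inv;
};

/*
 * The op could then take the inner union directly:
 * int (*cache_invalidate_user)(struct iommu_domain *domain,
 *		const union iommu_user_cache_invalidate *user_data);
 */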
  

Patch

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ecbec2627b63..b8f09330b64e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -66,6 +66,9 @@  struct iommu_domain_geometry {
 
 #define __IOMMU_DOMAIN_SVA	(1U << 4)  /* Shared process address space */
 
+#define __IOMMU_DOMAIN_NESTED	(1U << 5)  /* User-managed address space nested
+					      on a stage-2 translation        */
+
 #define IOMMU_DOMAIN_ALLOC_FLAGS ~__IOMMU_DOMAIN_DMA_FQ
 /*
  * This are the possible domain-types
@@ -92,6 +95,7 @@  struct iommu_domain_geometry {
 				 __IOMMU_DOMAIN_DMA_API |	\
 				 __IOMMU_DOMAIN_DMA_FQ)
 #define IOMMU_DOMAIN_SVA	(__IOMMU_DOMAIN_SVA)
+#define IOMMU_DOMAIN_NESTED	(__IOMMU_DOMAIN_NESTED)
 
 struct iommu_domain {
 	unsigned type;
@@ -350,6 +354,10 @@  struct iommu_ops {
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *            queue
+ * @cache_invalidate_user: Flush hardware TLBs caching user space IO mappings
+ * @cache_invalidate_user_data_len: Defined length of input user data for the
+ *                                  cache_invalidate_user op, being sizeof the
+ *                                  structure in include/uapi/linux/iommufd.h
  * @iova_to_phys: translate iova to physical address
  * @enforce_cache_coherency: Prevent any kind of DMA from bypassing IOMMU_CACHE,
  *                           including no-snoop TLPs on PCIe or other platform
@@ -379,6 +387,9 @@  struct iommu_domain_ops {
 			       size_t size);
 	void (*iotlb_sync)(struct iommu_domain *domain,
 			   struct iommu_iotlb_gather *iotlb_gather);
+	int (*cache_invalidate_user)(struct iommu_domain *domain,
+				     void *user_data);
+	const size_t cache_invalidate_user_data_len;
 
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain,
 				    dma_addr_t iova);
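
Finally, a hedged driver-side sketch of how the two new fields compose; the structure and callback below are hypothetical, since the real consumers only appear in later patches of this series:

#include <linux/iommu.h>

/*
 * Hypothetical invalidate request, standing in for a structure that
 * would live in include/uapi/linux/iommufd.h per the kerneldoc above.
 */
struct example_hwpt_invalidate {
	__u64	addr;
	__u64	npages;
	__u32	flags;
	__u32	__reserved;	/* must be 0 */
};

static int example_cache_invalidate_user(struct iommu_domain *domain,
					 void *user_data)
{
	struct example_hwpt_invalidate *inv = user_data;

	if (inv->__reserved)
		return -EOPNOTSUPP;

	/* Flush IOTLB and DevTLB entries covering inv->addr/npages. */
	return 0;	/* hardware-specific flush elided */
}

static const struct iommu_domain_ops example_nested_ops = {
	/* no .map/.unmap: nested mappings are managed by user space */
	.cache_invalidate_user		= example_cache_invalidate_user,
	.cache_invalidate_user_data_len	=
		sizeof(struct example_hwpt_invalidate),
};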