[v2,1/6] iommu: Add new iommu op to create domains owned by userspace

Message ID 20230928071528.26258-2-yi.l.liu@intel.com
State New
Headers
Series iommufd support allocating nested parent domain |

Commit Message

Yi Liu Sept. 28, 2023, 7:15 a.m. UTC
  Introduce a new iommu_domain op to create domains owned by userspace,
e.g. through IOMMUFD. These domains have a few different properties
compares to kernel owned domains:

 - They may be UNMANAGED domains, but created with special parameters.
   For instance aperture size changes/number of levels, different
   IOPTE formats, or other things necessary to make a vIOMMU work

 - We have to track all the memory allocations with GFP_KERNEL_ACCOUNT
   to make the cgroup sandbox stronger

 - Device-specialty domains, such as NESTED domains can be created by
   IOMMUFD.

The new op clearly says the domain is being created by IOMMUFD, that
the domain is intended for userspace use, and it provides a way to pass
user flags or a driver specific uAPI structure to customize the created
domain to exactly what the vIOMMU userspace driver requires.

iommu drivers that cannot support VFIO/IOMMUFD should not support this
op. This includes any driver that cannot provide a fully functional
UNMANAGED domain.

This new op for now is only supposed to be used by IOMMUFD, hence no
wrapper for it. IOMMUFD would call the callback directly. As for domain
free, IOMMUFD would use iommu_domain_free().

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h        | 11 ++++++++++-
 include/uapi/linux/iommufd.h | 12 +++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)
  

Comments

Tian, Kevin Oct. 9, 2023, 1:09 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, September 28, 2023 3:15 PM
> 
> Introduce a new iommu_domain op to create domains owned by userspace,
> e.g. through IOMMUFD. These domains have a few different properties
> compares to kernel owned domains:
> 
>  - They may be UNMANAGED domains, but created with special parameters.
>    For instance aperture size changes/number of levels, different
>    IOPTE formats, or other things necessary to make a vIOMMU work
> 
>  - We have to track all the memory allocations with GFP_KERNEL_ACCOUNT
>    to make the cgroup sandbox stronger
> 
>  - Device-specialty domains, such as NESTED domains can be created by
>    IOMMUFD.
> 
> The new op clearly says the domain is being created by IOMMUFD, that
> the domain is intended for userspace use, and it provides a way to pass
> user flags or a driver specific uAPI structure to customize the created
> domain to exactly what the vIOMMU userspace driver requires.
> 
> iommu drivers that cannot support VFIO/IOMMUFD should not support this
> op. This includes any driver that cannot provide a fully functional
> UNMANAGED domain.
> 
> This new op for now is only supposed to be used by IOMMUFD, hence no
> wrapper for it. IOMMUFD would call the callback directly. As for domain
> free, IOMMUFD would use iommu_domain_free().
> 
> Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
> Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
  
Nicolin Chen Oct. 15, 2023, 7:14 a.m. UTC | #2
On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:

> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> index b4ba0c0cbab6..4a7c5c8fdbb4 100644
> --- a/include/uapi/linux/iommufd.h
> +++ b/include/uapi/linux/iommufd.h
> @@ -347,10 +347,20 @@ struct iommu_vfio_ioas {
>  };
>  #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
> 
> +/**
> + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
> + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
> + *                                as the parent domain in the nesting
> + *                                configuration.

I just noticed a nit here: we should probably align with other
parts of this file by using "HWPT" v.s. "domain"? I.e.

+ * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
+ *                                as the parent HWPT in the nesting
+ *                                configuration.

Thanks
Nic
  
Jason Gunthorpe Oct. 16, 2023, 12:04 p.m. UTC | #3
On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
> On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
> 
> > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> > index b4ba0c0cbab6..4a7c5c8fdbb4 100644
> > --- a/include/uapi/linux/iommufd.h
> > +++ b/include/uapi/linux/iommufd.h
> > @@ -347,10 +347,20 @@ struct iommu_vfio_ioas {
> >  };
> >  #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
> > 
> > +/**
> > + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
> > + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
> > + *                                as the parent domain in the nesting
> > + *                                configuration.
> 
> I just noticed a nit here: we should probably align with other
> parts of this file by using "HWPT" v.s. "domain"? I.e.
> 
> + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
> + *                                as the parent HWPT in the nesting
> + *                                configuration.

Yes

Jason
  
Nicolin Chen Oct. 16, 2023, 5:46 p.m. UTC | #4
On Mon, Oct 16, 2023 at 09:04:54AM -0300, Jason Gunthorpe wrote:
> On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
> > On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
> > 
> > > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> > > index b4ba0c0cbab6..4a7c5c8fdbb4 100644
> > > --- a/include/uapi/linux/iommufd.h
> > > +++ b/include/uapi/linux/iommufd.h
> > > @@ -347,10 +347,20 @@ struct iommu_vfio_ioas {
> > >  };
> > >  #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
> > > 
> > > +/**
> > > + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
> > > + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
> > > + *                                as the parent domain in the nesting
> > > + *                                configuration.
> > 
> > I just noticed a nit here: we should probably align with other
> > parts of this file by using "HWPT" v.s. "domain"? I.e.
> > 
> > + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
> > + *                                as the parent HWPT in the nesting
> > + *                                configuration.
> 
> Yes

Should we resend? Or would it be possible for you to update it
in your for-next tree?
  
Jason Gunthorpe Oct. 17, 2023, 4:05 p.m. UTC | #5
On Mon, Oct 16, 2023 at 10:46:10AM -0700, Nicolin Chen wrote:
> On Mon, Oct 16, 2023 at 09:04:54AM -0300, Jason Gunthorpe wrote:
> > On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
> > > On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
> > > 
> > > > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> > > > index b4ba0c0cbab6..4a7c5c8fdbb4 100644
> > > > --- a/include/uapi/linux/iommufd.h
> > > > +++ b/include/uapi/linux/iommufd.h
> > > > @@ -347,10 +347,20 @@ struct iommu_vfio_ioas {
> > > >  };
> > > >  #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
> > > > 
> > > > +/**
> > > > + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
> > > > + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
> > > > + *                                as the parent domain in the nesting
> > > > + *                                configuration.
> > > 
> > > I just noticed a nit here: we should probably align with other
> > > parts of this file by using "HWPT" v.s. "domain"? I.e.
> > > 
> > > + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
> > > + *                                as the parent HWPT in the nesting
> > > + *                                configuration.
> > 
> > Yes
> 
> Should we resend? Or would it be possible for you to update it
> in your for-next tree?

At this point send a Fixes: patch

Jason
  

Patch

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c50a769d569a..3861d66b65c1 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -234,7 +234,15 @@  struct iommu_iotlb_gather {
  *           op is allocated in the iommu driver and freed by the caller after
  *           use. The information type is one of enum iommu_hw_info_type defined
  *           in include/uapi/linux/iommufd.h.
- * @domain_alloc: allocate iommu domain
+ * @domain_alloc: allocate and return an iommu domain if success. Otherwise
+ *                NULL is returned. The domain is not fully initialized until
+ *                the caller iommu_domain_alloc() returns.
+ * @domain_alloc_user: Allocate an iommu domain corresponding to the input
+ *                     parameters as defined in include/uapi/linux/iommufd.h.
+ *                     Unlike @domain_alloc, it is called only by IOMMUFD and
+ *                     must fully initialize the new domain before return.
+ *                     Upon success, a domain is returned. Upon failure,
+ *                     ERR_PTR must be returned.
  * @probe_device: Add device to iommu driver handling
  * @release_device: Remove device from iommu driver handling
  * @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -267,6 +275,7 @@  struct iommu_ops {
 
 	/* Domain allocation and freeing by the iommu driver */
 	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
+	struct iommu_domain *(*domain_alloc_user)(struct device *dev, u32 flags);
 
 	struct iommu_device *(*probe_device)(struct device *dev);
 	void (*release_device)(struct device *dev);
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index b4ba0c0cbab6..4a7c5c8fdbb4 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -347,10 +347,20 @@  struct iommu_vfio_ioas {
 };
 #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
 
+/**
+ * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
+ * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
+ *                                as the parent domain in the nesting
+ *                                configuration.
+ */
+enum iommufd_hwpt_alloc_flags {
+	IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0,
+};
+
 /**
  * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC)
  * @size: sizeof(struct iommu_hwpt_alloc)
- * @flags: Must be 0
+ * @flags: Combination of enum iommufd_hwpt_alloc_flags
  * @dev_id: The device to allocate this HWPT for
  * @pt_id: The IOAS to connect this HWPT to
  * @out_hwpt_id: The ID of the new HWPT