[v2,01/11] iommu: Add new iommu op to create domains owned by userspace

Message ID 20230511143844.22693-2-yi.l.liu@intel.com
State New
Series iommufd: Add nesting infrastructure

Commit Message

Yi Liu May 11, 2023, 2:38 p.m. UTC
  From: Lu Baolu <baolu.lu@linux.intel.com>

Introduce a new iommu_domain op to create domains owned by userspace,
e.g. through iommufd. These domains have a few different properties
compared to kernel-owned domains:

 - They may be UNMANAGED domains, but created with special parameters.
   For instance, changes to the aperture size or number of levels,
   different IOPTE formats, or other things necessary to make a vIOMMU work

 - We have to track all the memory allocations with GFP_KERNEL_ACCOUNT
   to make the cgroup sandbox stronger

 - Device-specialty domains, such as NESTED domains, can be created by
   iommufd.

The new op clearly says the domain is being created by IOMMUFD, that
the domain is intended for userspace use, and it provides a way to pass
a driver specific uAPI structure to customize the created domain to
exactly what the vIOMMU userspace driver requires.

iommu drivers that cannot support VFIO/IOMMUFD should not support this
op. This includes any driver that cannot provide a fully functional
UNMANAGED domain.

This op chooses to make the special parameters opaque to the core. This
suits the current usage model where accessing any of the IOMMU device
special parameters requires a userspace driver that matches the
kernel driver. If a need for common parameters, implemented similarly
by several drivers, arises then there is room in the design to grow a
generic parameter set as well.

For now, this new op is only meant to be used by iommufd, hence there
is no wrapper for it; iommufd calls the callback directly. To free the
domain, iommufd uses iommu_domain_free().

Also, add an op to return the length of the supported user data
structures, which must be defined in include/uapi/linux/iommufd.h. This
helps the iommufd core sanitize the input data before forwarding it
to an iommu driver.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 include/linux/iommu.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
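
The sanitizing step described in the commit message can be sketched in
plain userspace C. This is only an illustration under stated assumptions,
not the iommufd implementation: every mock_* name, the hwpt type value
and the nested data layout are hypothetical, and the real core copies
from user memory rather than from a kernel pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical hwpt type and driver data layout, for illustration only. */
enum { MOCK_HWPT_TYPE_NESTED = 1 };

struct mock_hwpt_nested_data {
	uint64_t flags;
	uint64_t pgtbl_addr;
};

/* Stand-in for union iommu_domain_user_data with one assumed member. */
union mock_domain_user_data {
	struct mock_hwpt_nested_data nested;
};

/* Stand-in for the relevant part of struct iommu_ops. */
struct mock_iommu_ops {
	int (*domain_alloc_user_data_len)(uint32_t hwpt_type);
};

/* Driver side: return sizeof the matching struct, or -EOPNOTSUPP. */
static int mock_data_len(uint32_t hwpt_type)
{
	switch (hwpt_type) {
	case MOCK_HWPT_TYPE_NESTED:
		return (int)sizeof(struct mock_hwpt_nested_data);
	default:
		return -EOPNOTSUPP;
	}
}

/*
 * Core side: ask the driver for the expected length, reject unsupported
 * types, then copy at most that many bytes into a zeroed buffer so the
 * driver never sees uninitialized or oversized input.
 */
static int mock_core_copy_user_data(const struct mock_iommu_ops *ops,
				    uint32_t hwpt_type,
				    const void *src, size_t src_len,
				    union mock_domain_user_data *dst)
{
	size_t copy_len;
	int klen;

	assert(ops != NULL && dst != NULL);
	if (!ops->domain_alloc_user_data_len)
		return -EOPNOTSUPP;

	klen = ops->domain_alloc_user_data_len(hwpt_type);
	if (klen < 0)
		return klen;
	if (src == NULL || src_len == 0 || (size_t)klen > sizeof(*dst))
		return -EINVAL;

	copy_len = src_len < (size_t)klen ? src_len : (size_t)klen;
	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, copy_len);
	return 0;
}
```

Truncating an oversized input is a simplification made here for brevity;
a real implementation would more likely reject trailing non-zero bytes.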
  

Comments

Tian, Kevin May 19, 2023, 8:47 a.m. UTC | #1
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, May 11, 2023 10:39 PM
> @@ -229,6 +238,15 @@ struct iommu_iotlb_gather {
>   *           after use. Return the data buffer if success, or ERR_PTR on
>   *           failure.
>   * @domain_alloc: allocate iommu domain
> + * @domain_alloc_user: allocate user iommu domain
> + * @domain_alloc_user_data_len: return the required length of the user data
> + *                              to allocate a specific type user iommu domain.
> + *                              @hwpt_type is defined as enum iommu_hwpt_type
> + *                              in include/uapi/linux/iommufd.h. The returned
> + *                              length is the corresponding sizeof driver data
> + *                              structures in include/uapi/linux/iommufd.h.
> + *                              -EOPNOTSUPP would be returned if the input
> + *                              @hwpt_type is not supported by the driver.

Can this be merged with the earlier @hw_info callback? That will already
report a list of supported hwpt types. Is there a problem with further
describing the data length for each type in that interface?
  
Nicolin Chen May 19, 2023, 6:45 p.m. UTC | #2
On Fri, May 19, 2023 at 08:47:45AM +0000, Tian, Kevin wrote:
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Thursday, May 11, 2023 10:39 PM
> > @@ -229,6 +238,15 @@ struct iommu_iotlb_gather {
> >   *           after use. Return the data buffer if success, or ERR_PTR on
> >   *           failure.
> >   * @domain_alloc: allocate iommu domain
> > + * @domain_alloc_user: allocate user iommu domain
> > + * @domain_alloc_user_data_len: return the required length of the user data
> > + *                              to allocate a specific type user iommu domain.
> > + *                              @hwpt_type is defined as enum iommu_hwpt_type
> > + *                              in include/uapi/linux/iommufd.h. The returned
> > + *                              length is the corresponding sizeof driver data
> > + *                              structures in include/uapi/linux/iommufd.h.
> > + *                              -EOPNOTSUPP would be returned if the input
> > + *                              @hwpt_type is not supported by the driver.
> 
> Can this be merged with earlier @hw_info callback? That will already
> report a list of supported hwpt types. is there a problem to further
> describe the data length for each type in that interface?

Yi and I had a last minute talk before he sent this version
actually... This version of hw_info no longer reports a list
of supported hwpt types. We previously did that in a bitmap,
but we found that a bitmap will not be sufficient eventually
if there are more than 64 hwpt_types.

And this domain_alloc_user_data_len might not be necessary,
because in this version the IOMMUFD core doesn't really care
about the actual data_len since it copies the data into the
ucmd_buffer, i.e. we would probably only need a bool op like
"hwpt_type_is_supported".

Thanks
Nic
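
The bitmap limit mentioned above can be illustrated with a small userspace
sketch. It is a hypothetical comparison only: the helper names and the type
values 1 and 70 are assumptions, and "hwpt_type_is_supported" was merely
floated in this thread, not defined by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* A single u64 bitmap can only carry one flag per type for types 0..63. */
static bool mock_bitmap_can_encode(uint32_t hwpt_type)
{
	return hwpt_type < 64;
}

/* Test a type against the bitmap; types >= 64 can never be reported. */
static bool mock_bitmap_test(uint64_t bitmap, uint32_t hwpt_type)
{
	return hwpt_type < 64 && (bitmap & (UINT64_C(1) << hwpt_type)) != 0;
}

/*
 * A per-type bool callback has no such ceiling. The supported values
 * here (1 and 70) are arbitrary examples.
 */
static bool mock_hwpt_type_is_supported(uint32_t hwpt_type)
{
	return hwpt_type == 1 || hwpt_type == 70;
}
```

With a callback the driver answers one type at a time, so the number of
types is bounded only by the width of hwpt_type itself.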
  
Tian, Kevin May 24, 2023, 5:02 a.m. UTC | #3
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, May 20, 2023 2:45 AM
> 
> On Fri, May 19, 2023 at 08:47:45AM +0000, Tian, Kevin wrote:
> > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > Sent: Thursday, May 11, 2023 10:39 PM
> > > @@ -229,6 +238,15 @@ struct iommu_iotlb_gather {
> > >   *           after use. Return the data buffer if success, or ERR_PTR on
> > >   *           failure.
> > >   * @domain_alloc: allocate iommu domain
> > > + * @domain_alloc_user: allocate user iommu domain
> > > > + * @domain_alloc_user_data_len: return the required length of the user data
> > > + *                              to allocate a specific type user iommu domain.
> > > + *                              @hwpt_type is defined as enum iommu_hwpt_type
> > > + *                              in include/uapi/linux/iommufd.h. The returned
> > > + *                              length is the corresponding sizeof driver data
> > > + *                              structures in include/uapi/linux/iommufd.h.
> > > + *                              -EOPNOTSUPP would be returned if the input
> > > + *                              @hwpt_type is not supported by the driver.
> >
> > Can this be merged with earlier @hw_info callback? That will already
> > report a list of supported hwpt types. is there a problem to further
> > describe the data length for each type in that interface?
> 
> Yi and I had a last minute talk before he sent this version
> actually... This version of hw_info no longer reports a list
> of supported hwpt types. We previously did that in a bitmap,
> but we found that a bitmap will not be sufficient eventually
> if there are more than 64 hwpt_types.
> 
> And this domain_alloc_user_data_len might not be necessary,
> because in this version the IOMMUFD core doesn't really care
> about the actual data_len since it copies the data into the
> ucmd_buffer, i.e. we would probably only need a bool op like
> "hwpt_type_is_supported".
> 

Or just pass to the @domain_alloc_user ops which should fail
if the type is not supported?
  
Nicolin Chen May 24, 2023, 5:23 a.m. UTC | #4
On Wed, May 24, 2023 at 05:02:19AM +0000, Tian, Kevin wrote:
 
> > From: Nicolin Chen <nicolinc@nvidia.com>
> > Sent: Saturday, May 20, 2023 2:45 AM
> >
> > On Fri, May 19, 2023 at 08:47:45AM +0000, Tian, Kevin wrote:
> > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > Sent: Thursday, May 11, 2023 10:39 PM
> > > > @@ -229,6 +238,15 @@ struct iommu_iotlb_gather {
> > > >   *           after use. Return the data buffer if success, or ERR_PTR on
> > > >   *           failure.
> > > >   * @domain_alloc: allocate iommu domain
> > > > + * @domain_alloc_user: allocate user iommu domain
> > > > > + * @domain_alloc_user_data_len: return the required length of the user data
> > > > + *                              to allocate a specific type user iommu domain.
> > > > + *                              @hwpt_type is defined as enum iommu_hwpt_type
> > > > + *                              in include/uapi/linux/iommufd.h. The returned
> > > > + *                              length is the corresponding sizeof driver data
> > > > + *                              structures in include/uapi/linux/iommufd.h.
> > > > + *                              -EOPNOTSUPP would be returned if the input
> > > > + *                              @hwpt_type is not supported by the driver.
> > >
> > > Can this be merged with earlier @hw_info callback? That will already
> > > report a list of supported hwpt types. is there a problem to further
> > > describe the data length for each type in that interface?
> >
> > Yi and I had a last minute talk before he sent this version
> > actually... This version of hw_info no longer reports a list
> > of supported hwpt types. We previously did that in a bitmap,
> > but we found that a bitmap will not be sufficient eventually
> > if there are more than 64 hwpt_types.
> >
> > And this domain_alloc_user_data_len might not be necessary,
> > because in this version the IOMMUFD core doesn't really care
> > about the actual data_len since it copies the data into the
> > ucmd_buffer, i.e. we would probably only need a bool op like
> > "hwpt_type_is_supported".
> >
> 
> Or just pass to the @domain_alloc_user ops which should fail
> if the type is not supported?

The domain_alloc_user returns NULL, which then would be turned
into an ENOMEM error code. It might be confusing from the user
space perspective. Having an op at least allows the user space
to realize that something is wrong with the input structure?

Thanks
Nic
  
Tian, Kevin May 24, 2023, 7:48 a.m. UTC | #5
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Wednesday, May 24, 2023 1:24 PM
> 
> On Wed, May 24, 2023 at 05:02:19AM +0000, Tian, Kevin wrote:
> 
> > > From: Nicolin Chen <nicolinc@nvidia.com>
> > > Sent: Saturday, May 20, 2023 2:45 AM
> > >
> > > On Fri, May 19, 2023 at 08:47:45AM +0000, Tian, Kevin wrote:
> > > > > From: Liu, Yi L <yi.l.liu@intel.com>
> > > > > Sent: Thursday, May 11, 2023 10:39 PM
> > > > > @@ -229,6 +238,15 @@ struct iommu_iotlb_gather {
> > > > >   *           after use. Return the data buffer if success, or ERR_PTR on
> > > > >   *           failure.
> > > > >   * @domain_alloc: allocate iommu domain
> > > > > + * @domain_alloc_user: allocate user iommu domain
> > > > > > + * @domain_alloc_user_data_len: return the required length of the user data
> > > > > + *                              to allocate a specific type user iommu domain.
> > > > > + *                              @hwpt_type is defined as enum
> iommu_hwpt_type
> > > > > + *                              in include/uapi/linux/iommufd.h. The returned
> > > > > + *                              length is the corresponding sizeof driver data
> > > > > + *                              structures in include/uapi/linux/iommufd.h.
> > > > > + *                              -EOPNOTSUPP would be returned if the input
> > > > > + *                              @hwpt_type is not supported by the driver.
> > > >
> > > > Can this be merged with earlier @hw_info callback? That will already
> > > > report a list of supported hwpt types. is there a problem to further
> > > > describe the data length for each type in that interface?
> > >
> > > Yi and I had a last minute talk before he sent this version
> > > actually... This version of hw_info no longer reports a list
> > > of supported hwpt types. We previously did that in a bitmap,
> > > but we found that a bitmap will not be sufficient eventually
> > > if there are more than 64 hwpt_types.
> > >
> > > And this domain_alloc_user_data_len might not be necessary,
> > > because in this version the IOMMUFD core doesn't really care
> > > about the actual data_len since it copies the data into the
> > > ucmd_buffer, i.e. we would probably only need a bool op like
> > > "hwpt_type_is_supported".
> > >
> >
> > Or just pass to the @domain_alloc_user ops which should fail
> > if the type is not supported?
> 
> The domain_alloc_user returns NULL, which then would be turned
> into an ENOMEM error code. It might be confusing from the user
> space perspective. Having an op at least allows the user space
> to realize that something is wrong with the input structure?
> 

This is a new callback. Is there any reason why it cannot be defined to
allow returning ERR_PTR?
  
Nicolin Chen May 25, 2023, 1:41 a.m. UTC | #6
On Wed, May 24, 2023 at 07:48:46AM +0000, Tian, Kevin wrote:

> > > > > >   *           after use. Return the data buffer if success, or ERR_PTR on
> > > > > >   *           failure.
> > > > > >   * @domain_alloc: allocate iommu domain
> > > > > > + * @domain_alloc_user: allocate user iommu domain
> > > > > > > + * @domain_alloc_user_data_len: return the required length of the user data
> > > > > > + *                              to allocate a specific type user iommu domain.
> > > > > > + *                              @hwpt_type is defined as enum
> > iommu_hwpt_type
> > > > > > + *                              in include/uapi/linux/iommufd.h. The returned
> > > > > > + *                              length is the corresponding sizeof driver data
> > > > > > + *                              structures in include/uapi/linux/iommufd.h.
> > > > > > + *                              -EOPNOTSUPP would be returned if the input
> > > > > > + *                              @hwpt_type is not supported by the driver.
> > > > >
> > > > > Can this be merged with earlier @hw_info callback? That will already
> > > > > report a list of supported hwpt types. is there a problem to further
> > > > > describe the data length for each type in that interface?
> > > >
> > > > Yi and I had a last minute talk before he sent this version
> > > > actually... This version of hw_info no longer reports a list
> > > > of supported hwpt types. We previously did that in a bitmap,
> > > > but we found that a bitmap will not be sufficient eventually
> > > > if there are more than 64 hwpt_types.
> > > >
> > > > And this domain_alloc_user_data_len might not be necessary,
> > > > because in this version the IOMMUFD core doesn't really care
> > > > about the actual data_len since it copies the data into the
> > > > ucmd_buffer, i.e. we would probably only need a bool op like
> > > > "hwpt_type_is_supported".
> > > >
> > >
> > > Or just pass to the @domain_alloc_user ops which should fail
> > > if the type is not supported?
> >
> > The domain_alloc_user returns NULL, which then would be turned
> > into an ENOMEM error code. It might be confusing from the user
> > space perspective. Having an op at least allows the user space
> > to realize that something is wrong with the input structure?
> >
> 
> this is a new callback. any reason why it cannot be defined to
> allow returning ERR_PTR?

Upon a quick check, I think we could. Though it'd be slightly
mismatched with the domain_alloc op, it should be fine since
iommufd is likely to be the only caller.

So, I think we can just take the approach letting user space
try a hwpt_type and see if the ioctl would fail with -EINVAL.

Thanks
Nic
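
The difference between the two return conventions discussed above can be
sketched in userspace C. The mock_* helpers below only imitate the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() idea; the domain struct, the supported type
value 1 and the choice of -EINVAL are assumptions for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value, in the style of ERR_PTR(). */
static inline void *mock_err_ptr(intptr_t err)
{
	return (void *)err;
}

static inline intptr_t mock_ptr_err(const void *p)
{
	return (intptr_t)p;
}

/* The top MOCK_MAX_ERRNO addresses are treated as error codes. */
static inline int mock_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)(-MOCK_MAX_ERRNO);
}

struct mock_domain {
	uint32_t hwpt_type;
};

static struct mock_domain mock_nested_domain = { .hwpt_type = 1 };

/* NULL-on-failure style, as with domain_alloc: the reason is lost. */
static struct mock_domain *mock_alloc_null_style(uint32_t hwpt_type)
{
	return hwpt_type == 1 ? &mock_nested_domain : NULL;
}

/* ERR_PTR style: the driver can say why the allocation failed. */
static struct mock_domain *mock_alloc_errptr_style(uint32_t hwpt_type)
{
	return hwpt_type == 1 ? &mock_nested_domain : mock_err_ptr(-EINVAL);
}

/* The caller can only guess -ENOMEM when it gets NULL back. */
static int mock_caller_null_style(uint32_t hwpt_type)
{
	return mock_alloc_null_style(hwpt_type) ? 0 : -ENOMEM;
}

/* The caller forwards the driver's own error code to user space. */
static int mock_caller_errptr_style(uint32_t hwpt_type)
{
	struct mock_domain *domain = mock_alloc_errptr_style(hwpt_type);

	return mock_is_err(domain) ? (int)mock_ptr_err(domain) : 0;
}
```

For an unsupported type, the NULL-style caller reports -ENOMEM while the
ERR_PTR-style caller reports -EINVAL, which is the distinction user space
would rely on when probing a hwpt_type.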
  
Jason Gunthorpe June 6, 2023, 2:08 p.m. UTC | #7
On Wed, May 24, 2023 at 06:41:41PM -0700, Nicolin Chen wrote:

> Upon a quick check, I think we could. Though it'd be slightly
> mismatched with the domain_alloc op, it should be fine since
> iommufd is likely to be the only caller.

Ideally the main op would return ERR_PTR too

Jason
  
Nicolin Chen June 6, 2023, 7:43 p.m. UTC | #8
On Tue, Jun 06, 2023 at 11:08:44AM -0300, Jason Gunthorpe wrote:
> On Wed, May 24, 2023 at 06:41:41PM -0700, Nicolin Chen wrote:
> 
> > Upon a quick check, I think we could. Though it'd be slightly
> > mismatched with the domain_alloc op, it should be fine since
> > iommufd is likely to be the only caller.
> 
> Ideally the main op would return ERR_PTR too

Yea. It just seems to be a bit painful to change it for that.

Worth a big series?

Thanks
Nic
  
Jason Gunthorpe June 7, 2023, 12:14 a.m. UTC | #9
On Tue, Jun 06, 2023 at 12:43:44PM -0700, Nicolin Chen wrote:
> On Tue, Jun 06, 2023 at 11:08:44AM -0300, Jason Gunthorpe wrote:
> > On Wed, May 24, 2023 at 06:41:41PM -0700, Nicolin Chen wrote:
> > 
> > > Upon a quick check, I think we could. Though it'd be slightly
> > > mismatched with the domain_alloc op, it should be fine since
> > > iommufd is likely to be the only caller.
> > 
> > Ideally the main op would return ERR_PTR too
> 
> Yea. It just seems to be a bit painful to change it for that.
> 
> Worth a big series?

Probably not..

Jason
  

Patch

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index a748d60206e7..7f2046fa53a3 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -220,6 +220,15 @@  struct iommu_iotlb_gather {
 	bool			queued;
 };
 
+/*
+ * The user data to allocate a specific type user iommu domain
+ *
+ * This includes the corresponding driver data structures in
+ * include/uapi/linux/iommufd.h.
+ */
+union iommu_domain_user_data {
+};
+
 /**
  * struct iommu_ops - iommu ops and capabilities
  * @capable: check capability
@@ -229,6 +238,15 @@  struct iommu_iotlb_gather {
  *           after use. Return the data buffer if success, or ERR_PTR on
  *           failure.
  * @domain_alloc: allocate iommu domain
+ * @domain_alloc_user: allocate user iommu domain
+ * @domain_alloc_user_data_len: return the required length of the user data
+ *                              to allocate a specific type user iommu domain.
+ *                              @hwpt_type is defined as enum iommu_hwpt_type
+ *                              in include/uapi/linux/iommufd.h. The returned
+ *                              length is the corresponding sizeof driver data
+ *                              structures in include/uapi/linux/iommufd.h.
+ *                              -EOPNOTSUPP would be returned if the input
+ *                              @hwpt_type is not supported by the driver.
  * @probe_device: Add device to iommu driver handling
  * @release_device: Remove device from iommu driver handling
  * @probe_finalize: Do final setup work after the device is added to an IOMMU
@@ -269,6 +287,10 @@  struct iommu_ops {
 
 	/* Domain allocation and freeing by the iommu driver */
 	struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type);
+	struct iommu_domain *(*domain_alloc_user)(struct device *dev,
+						  struct iommu_domain *parent,
+						  const union iommu_domain_user_data *user_data);
+	int (*domain_alloc_user_data_len)(u32 hwpt_type);
 
 	struct iommu_device *(*probe_device)(struct device *dev);
 	void (*release_device)(struct device *dev);