[v5,3/7] iommu: Validate that devices match domains

Message ID 4e8bda33aac4021b444e40389648deccf61c1f37.1697047261.git.robin.murphy@arm.com
State New
Series iommu: Retire bus ops

Commit Message

Robin Murphy Oct. 11, 2023, 6:14 p.m. UTC
Before we can allow drivers to coexist, we need to make sure that one
driver's domain ops can't misinterpret another driver's dev_iommu_priv
data. To that end, add a token to the domain so we can remember how it
was allocated - for now this may as well be the device ops, since they
still correlate 1:1 with drivers. We can trust ourselves for internal
default domain attachment, so add checks to cover all the public attach
interfaces.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>

---

v4: Cover iommu_attach_device_pasid() as well, and improve robustness
    against theoretical attempts to attach a noiommu group.
---
 drivers/iommu/iommu.c | 10 ++++++++++
 include/linux/iommu.h |  1 +
 2 files changed, 11 insertions(+)
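
The check described in the commit message can be modelled outside the kernel. What follows is a minimal userspace sketch, not the kernel code: the `model_` types and functions are hypothetical stand-ins for `struct iommu_ops`, `struct iommu_domain`, `struct device` and a public attach interface, and a NULL `ops` pointer stands in for a device on which `dev_has_iommu()` would be false.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct iommu_ops: one instance per driver. */
struct model_ops {
	const char *name;
};

/* Hypothetical stand-in for struct iommu_domain. */
struct model_domain {
	const struct model_ops *owner;	/* token recorded at allocation time */
};

/* Hypothetical stand-in for struct device; ops == NULL models "no IOMMU". */
struct model_dev {
	const struct model_ops *ops;
};

/* Models the allocation path: remember which driver's ops made the domain. */
static inline void model_domain_init(struct model_domain *domain,
				     const struct model_ops *ops)
{
	domain->owner = ops;
}

/* Models a public attach interface: reject a device/domain driver mismatch. */
static inline int model_attach(const struct model_domain *domain,
			       const struct model_dev *dev)
{
	if (!dev->ops || dev->ops != domain->owner)
		return -EINVAL;
	return 0;
}
```

In this model, a domain initialised with driver A's ops attaches to a device bound to driver A, while a device bound to driver B, or one with no ops at all, gets -EINVAL.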
  

Comments

Jerry Snitselaar Oct. 18, 2023, 11:14 p.m. UTC | #1
On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:
> Before we can allow drivers to coexist, we need to make sure that one
> driver's domain ops can't misinterpret another driver's dev_iommu_priv
> data. To that end, add a token to the domain so we can remember how it
> was allocated - for now this may as well be the device ops, since they
> still correlate 1:1 with drivers. We can trust ourselves for internal
> default domain attachment, so add checks to cover all the public attach
> interfaces.
> 
> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> 
> ---
> 
> v4: Cover iommu_attach_device_pasid() as well, and improve robustness
>     against theoretical attempts to attach a noiommu group.
> ---

Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
  
Jason Gunthorpe Oct. 24, 2023, 6:52 p.m. UTC | #2
On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:

> @@ -2279,10 +2280,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>  static int __iommu_attach_group(struct iommu_domain *domain,
>  				struct iommu_group *group)
>  {
> +	struct device *dev;
> +
>  	if (group->domain && group->domain != group->default_domain &&
>  	    group->domain != group->blocking_domain)
>  		return -EBUSY;
>  
> +	dev = iommu_group_first_dev(group);
> +	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
> +		return -EINVAL;

I was thinking about this later, how does this work for the global
static domains? domain->owner will not be set?

	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
		return ops->identity_domain;
	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
		return ops->blocked_domain;

Seems like it will break everything?

I suggest we just put a simple void * tag in the const domain->ops at
compile time to indicate the owning driver.

Jason
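
One way to picture the suggestion above is the following userspace sketch. It is an illustration under assumptions, not a proposed kernel change: the `tag_` types, the `owner_tag` field and the `drv_*` objects are all hypothetical. The point it shows is that a tag stored in the const domain ops is available even for a global static domain that never passes through an allocation function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical: per-driver domain ops carrying a compile-time owner tag. */
struct tag_domain_ops {
	const void *owner_tag;
};

/* Hypothetical stand-in for struct iommu_domain. */
struct tag_domain {
	const struct tag_domain_ops *ops;
};

/* Hypothetical device; driver_tag == NULL models a device with no IOMMU. */
struct tag_dev {
	const void *driver_tag;
};

/* Each driver uses the address of a private object as its unique tag. */
static const char drv_a_id = 'a';
static const char drv_b_id = 'b';

static const struct tag_domain_ops drv_a_domain_ops = { .owner_tag = &drv_a_id };
static const struct tag_domain_ops drv_b_domain_ops = { .owner_tag = &drv_b_id };

/* A global static identity domain: tagged without any allocation step. */
static const struct tag_domain drv_a_identity = { .ops = &drv_a_domain_ops };

/* Compare the device's driver tag with the tag reachable via domain->ops. */
static inline int tag_attach(const struct tag_domain *domain,
			     const struct tag_dev *dev)
{
	if (!dev->driver_tag || dev->driver_tag != domain->ops->owner_tag)
		return -EINVAL;
	return 0;
}
```

Here `drv_a_identity` passes the check for a driver-A device and fails it for a driver-B device, although nothing ever wrote an owner field into the domain itself.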
  
Robin Murphy Oct. 25, 2023, 12:39 p.m. UTC | #3
On 24/10/2023 7:52 pm, Jason Gunthorpe wrote:
> On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:
> 
>> @@ -2279,10 +2280,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>>   static int __iommu_attach_group(struct iommu_domain *domain,
>>   				struct iommu_group *group)
>>   {
>> +	struct device *dev;
>> +
>>   	if (group->domain && group->domain != group->default_domain &&
>>   	    group->domain != group->blocking_domain)
>>   		return -EBUSY;
>>   
>> +	dev = iommu_group_first_dev(group);
>> +	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
>> +		return -EINVAL;
> 
> I was thinking about this later, how does this work for the global
> static domains? domain->owner will not be set?
> 
> 	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
> 		return ops->identity_domain;
> 	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
> 		return ops->blocked_domain;
> 
> Seems like it will break everything?

I don't believe it makes any significant difference - as the commit 
message points out, this validation is only applied at the public 
interface boundaries of iommu_attach_group(), iommu_attach_device(), and 
iommu_attach_device_pasid(), which are only expected to be operating on 
explicitly-allocated unmanaged domains. For internal default domain 
attachment, the domain is initially derived from the device/group itself 
so we know it's appropriate by construction.

I guess this *would* now prevent some external caller reaching in and 
trying to attach something to some other group's identity default 
domain, but frankly it feels like making that fail would be no bad thing 
anyway.

Thanks,
Robin.
  
Jason Gunthorpe Oct. 25, 2023, 12:55 p.m. UTC | #4
On Wed, Oct 25, 2023 at 01:39:56PM +0100, Robin Murphy wrote:
> On 24/10/2023 7:52 pm, Jason Gunthorpe wrote:
> > On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:
> > 
> > > @@ -2279,10 +2280,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
> > >   static int __iommu_attach_group(struct iommu_domain *domain,
> > >   				struct iommu_group *group)
> > >   {
> > > +	struct device *dev;
> > > +
> > >   	if (group->domain && group->domain != group->default_domain &&
> > >   	    group->domain != group->blocking_domain)
> > >   		return -EBUSY;
> > > +	dev = iommu_group_first_dev(group);
> > > +	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
> > > +		return -EINVAL;
> > 
> > I was thinking about this later, how does this work for the global
> > static domains? domain->owner will not be set?
> > 
> > 	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
> > 		return ops->identity_domain;
> > 	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
> > 		return ops->blocked_domain;
> > 
> > Seems like it will break everything?
> 
> I don't believe it makes any significant difference - as the commit message
> points out, this validation is only applied at the public interface
> boundaries of iommu_attach_group(), iommu_attach_device(), 

Oh, making it only work for one domain type seems kind of hacky..

If that is the intention maybe the owner set should be moved into
iommu_domain_alloc() with a little comment noting that it is limited
to work in only a few cases?

I certainly didn't understand from the commit message to mean it was
only actually working for one domain type and this also blocks using
other types with the public interface.

> and iommu_attach_device_pasid(), which are only expected to be
> operating on explicitly-allocated unmanaged domains.

We have nesting now in the iommufd branch, and SVA will come soon for
these APIs.

Regardless this will clash with the iommufd branch for this reason so
I guess it needs to wait till rc1.

Thanks,
Jason
  
Robin Murphy Oct. 25, 2023, 4:05 p.m. UTC | #5
On 25/10/2023 1:55 pm, Jason Gunthorpe wrote:
> On Wed, Oct 25, 2023 at 01:39:56PM +0100, Robin Murphy wrote:
>> On 24/10/2023 7:52 pm, Jason Gunthorpe wrote:
>>> On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:
>>>
>>>> @@ -2279,10 +2280,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
>>>>    static int __iommu_attach_group(struct iommu_domain *domain,
>>>>    				struct iommu_group *group)
>>>>    {
>>>> +	struct device *dev;
>>>> +
>>>>    	if (group->domain && group->domain != group->default_domain &&
>>>>    	    group->domain != group->blocking_domain)
>>>>    		return -EBUSY;
>>>> +	dev = iommu_group_first_dev(group);
>>>> +	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
>>>> +		return -EINVAL;
>>>
>>> I was thinking about this later, how does this work for the global
>>> static domains? domain->owner will not be set?
>>>
>>> 	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
>>> 		return ops->identity_domain;
>>> 	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
>>> 		return ops->blocked_domain;
>>>
>>> Seems like it will break everything?
>>
>> I don't believe it makes any significant difference - as the commit message
>> points out, this validation is only applied at the public interface
>> boundaries of iommu_attach_group(), iommu_attach_device(),
> 
> Oh, making it only work for one domain type seems kind of hacky..
> 
> If that is the intention maybe the owner set should be moved into
> iommu_domain_alloc() with a little comment noting that it is limited
> to work in only a few cases?
> 
> I certainly didn't understand from the commit message to mean it was
> only actually working for one domain type and this also blocks using
> other types with the public interface.

It's not about one particular domain type, it's about the scope of what 
we consider valid usage. External API users should almost always be 
attaching to their own domain which they have allocated, however we also 
tolerate co-attaching additional groups to the same DMA domain in rare 
cases where it's reasonable. The fact is that those users cannot 
allocate blocking or identity domains, and I can't see that they would 
ever have any legitimate business trying to do anything with them 
anyway. So although yes, we technically lose some functionality once 
this intersects with the static domain optimisation, it's only 
questionable functionality which was never explicitly intended anyway.

I mean, what would be the valid purpose of trying to attach group A to 
group B's identity domain, even if they *were* backed by the same 
driver? At best it's pointless if group A also has its own identity 
domain already, otherwise at worst it's a deliberate attempt to 
circumvent a default domain policy imposed by the IOMMU core.

>> and iommu_attach_device_pasid(), which are only expected to be
>> operating on explicitly-allocated unmanaged domains.
> 
> We have nesting now in the iommufd branch, and SVA will come soon for
> these APIs.
> 
> Regardless this will clash with the iommufd branch for this reason so
> I guess it needs to wait till rc1.

Sigh, back on the shelf it goes then...

Thanks,
Robin.
  
Jason Gunthorpe Oct. 25, 2023, 4:15 p.m. UTC | #6
On Wed, Oct 25, 2023 at 05:05:08PM +0100, Robin Murphy wrote:
> On 25/10/2023 1:55 pm, Jason Gunthorpe wrote:
> > On Wed, Oct 25, 2023 at 01:39:56PM +0100, Robin Murphy wrote:
> > > On 24/10/2023 7:52 pm, Jason Gunthorpe wrote:
> > > > On Wed, Oct 11, 2023 at 07:14:50PM +0100, Robin Murphy wrote:
> > > > 
> > > > > @@ -2279,10 +2280,16 @@ struct iommu_domain *iommu_get_dma_domain(struct device *dev)
> > > > >    static int __iommu_attach_group(struct iommu_domain *domain,
> > > > >    				struct iommu_group *group)
> > > > >    {
> > > > > +	struct device *dev;
> > > > > +
> > > > >    	if (group->domain && group->domain != group->default_domain &&
> > > > >    	    group->domain != group->blocking_domain)
> > > > >    		return -EBUSY;
> > > > > +	dev = iommu_group_first_dev(group);
> > > > > +	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
> > > > > +		return -EINVAL;
> > > > 
> > > > I was thinking about this later, how does this work for the global
> > > > static domains? domain->owner will not be set?
> > > > 
> > > > 	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
> > > > 		return ops->identity_domain;
> > > > 	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
> > > > 		return ops->blocked_domain;
> > > > 
> > > > Seems like it will break everything?
> > > 
> > > I don't believe it makes any significant difference - as the commit message
> > > points out, this validation is only applied at the public interface
> > > boundaries of iommu_attach_group(), iommu_attach_device(),
> > 
> > > Oh, making it only work for one domain type seems kind of hacky..
> > 
> > If that is the intention maybe the owner set should be moved into
> > iommu_domain_alloc() with a little comment noting that it is limited
> > to work in only a few cases?
> > 
> > I certainly didn't understand from the commit message to mean it was
> > only actually working for one domain type and this also blocks using
> > other types with the public interface.
> 
> It's not about one particular domain type, it's about the scope of what we
> consider valid usage. External API users should almost always be attaching
> to their own domain which they have allocated, however we also tolerate
> co-attaching additional groups to the same DMA domain in rare cases where
> it's reasonable. The fact is that those users cannot allocate blocking or
> identity domains, and I can't see that they would ever have any legitimate
> business trying to do anything with them anyway. So although yes, we
> technically lose some functionality once this intersects with the static
> domain optimisation, it's only questionable functionality which was never
> explicitly intended anyway.

I have no problem with that argument, I'm saying this is a subtle
emergent property. Let's document it, let's be more explicit. The owner
checks would do well to go along with specific domain type checks as
well to robustly enforce what you just explained.

Thanks,
Jason
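
Pairing the owner check with an explicit domain type check, as suggested above, might look like the following userspace sketch. Everything here is hypothetical: the `chk_` names and the simplified enum are not the kernel's `IOMMU_DOMAIN_*` definitions, and allowing only one type is an assumption made for brevity. A real check would have to account for the other cases raised in this thread (co-attaching to a DMA domain, nesting, SVA).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified domain types; not the kernel's values. */
enum chk_domain_type {
	CHK_DOMAIN_BLOCKED,
	CHK_DOMAIN_IDENTITY,
	CHK_DOMAIN_UNMANAGED,
};

/* Hypothetical stand-in for struct iommu_ops. */
struct chk_ops {
	const char *name;
};

/* Hypothetical stand-in for struct iommu_domain. */
struct chk_domain {
	enum chk_domain_type type;
	const struct chk_ops *owner;
};

/* Hypothetical device; ops == NULL models a device with no IOMMU. */
struct chk_dev {
	const struct chk_ops *ops;
};

static inline int chk_public_attach(const struct chk_domain *domain,
				    const struct chk_dev *dev)
{
	/* Explicit policy: only types an external caller could allocate. */
	if (domain->type != CHK_DOMAIN_UNMANAGED)
		return -EINVAL;

	/* Owner check: the domain must come from the device's driver. */
	if (!dev->ops || dev->ops != domain->owner)
		return -EINVAL;

	return 0;
}
```

With the type test written out, rejecting an identity or blocking domain at the public interface is a stated rule rather than a side effect of `owner` being unset on static domains.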
  

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7bb92e8b7a49..578292d3b152 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2114,6 +2114,7 @@  static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
 		return NULL;
 
 	domain->type = type;
+	domain->owner = ops;
 	/*
 	 * If not already set, assume all sizes by default; the driver
 	 * may override this later
@@ -2279,10 +2280,16 @@  struct iommu_domain *iommu_get_dma_domain(struct device *dev)
 static int __iommu_attach_group(struct iommu_domain *domain,
 				struct iommu_group *group)
 {
+	struct device *dev;
+
 	if (group->domain && group->domain != group->default_domain &&
 	    group->domain != group->blocking_domain)
 		return -EBUSY;
 
+	dev = iommu_group_first_dev(group);
+	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
+		return -EINVAL;
+
 	return __iommu_group_set_domain(group, domain);
 }
 
@@ -3480,6 +3487,9 @@  int iommu_attach_device_pasid(struct iommu_domain *domain,
 	if (!group)
 		return -ENODEV;
 
+	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner)
+		return -EINVAL;
+
 	mutex_lock(&group->mutex);
 	curr = xa_cmpxchg(&group->pasid_array, pasid, NULL, domain, GFP_KERNEL);
 	if (curr) {
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2d2802fb2c74..5c9560813d05 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -99,6 +99,7 @@  struct iommu_domain_geometry {
 struct iommu_domain {
 	unsigned type;
 	const struct iommu_domain_ops *ops;
+	const struct iommu_ops *owner; /* Whose domain_alloc we came from */
 	unsigned long pgsize_bitmap;	/* Bitmap of page sizes in use */
 	struct iommu_domain_geometry geometry;
 	struct iommu_dma_cookie *iova_cookie;