[v4,4/7] iommu: Switch __iommu_domain_alloc() to device ops

Message ID 458dd0ed839541693a49da33239b33cf4c48b8ec.1696253096.git.robin.murphy@arm.com
State New
Series: iommu: Retire bus ops

Commit Message

Robin Murphy Oct. 2, 2023, 1:49 p.m. UTC
In all the places we allocate default domains, we have (or can easily
get hold of) a device from which to resolve the right IOMMU ops; only
the public iommu_domain_alloc() interface actually depends on bus ops.
Reworking the public API is a big enough mission in its own right, but
in the meantime we can still decouple it from bus ops internally to move
forward.

Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>

---

v3: Make sure blocking domains are covered as well
v4: Reinstate correct bus_for_each_dev() handling from v2
---
 drivers/iommu/iommu.c | 48 ++++++++++++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 17 deletions(-)
  

Comments

Jason Gunthorpe Oct. 2, 2023, 2:16 p.m. UTC | #1
On Mon, Oct 02, 2023 at 02:49:12PM +0100, Robin Murphy wrote:
> @@ -2120,20 +2120,30 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
>  	return domain;
>  }
>  
> -static struct iommu_domain *
> -__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
> +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
>  {

Why? The point of this design is that drivers are not allowed to
allocate different things for devices in the same group. So we always
force the driver to see only the first device in the group even if we
have a more specific device available in the call chain. 
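
That is, the helper this patch removes goes out of its way to re-derive
the device from the group (copied from the diff below):

static struct iommu_domain *
__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
{
	struct device *dev = iommu_group_first_dev(group);

	return __iommu_domain_alloc(dev_iommu_ops(dev), dev, type);
}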

This patch has undone this design and passed in more specific devs :(

The new code here:

>  struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
>  {
> -	if (bus == NULL || bus->iommu_ops == NULL)
> +	struct device *dev = NULL;
> +
> +	/* We always check the whole bus, so the return value isn't useful */
> +	bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev);
> +	if (!dev)
>  		return NULL;
> -	return __iommu_domain_alloc(bus->iommu_ops, NULL,
> -				    IOMMU_DOMAIN_UNMANAGED);
> +
> +	return __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
>  }
>  EXPORT_SYMBOL_GPL(iommu_domain_alloc);

Should just obtain any group for the bus and pass that to
__iommu_group_domain_alloc().

Also, how does the locking work here? Definitely can't pass dev
outside the bus_for_each_dev() like this.

If this needs to sweep over arbitrary devices that are not the caller's
probed device, it needs to hold at least the device_lock to prevent
racing with release.

So I'd structure this to find the matching device, lock the
device_lock, get the group refcount, unlock the device_lock then
get the group_mutex, check for non-empty and then call
__iommu_group_domain_alloc()

(there is a missing lockdep annotation in
__iommu_group_domain_alloc(), the group mutex is needed)
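
A minimal sketch of that ordering (names invented here, untested):

static int __iommu_alloc_find_group(struct device *dev, void *data)
{
	struct iommu_group **groupp = data;
	struct iommu_group *group;

	/* device_lock keeps dev from racing with release */
	device_lock(dev);
	group = iommu_group_get(dev);	/* takes a group refcount */
	device_unlock(dev);
	if (!group)
		return 0;

	*groupp = group;
	return 1;	/* stop the bus iteration */
}

struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
{
	struct iommu_group *group = NULL;
	struct iommu_domain *dom = NULL;

	bus_for_each_dev(bus, NULL, &group, __iommu_alloc_find_group);
	if (!group)
		return NULL;

	mutex_lock(&group->mutex);
	/* The group may have lost its last device in the meantime */
	if (!list_empty(&group->devices))
		dom = __iommu_group_domain_alloc(group, IOMMU_DOMAIN_UNMANAGED);
	mutex_unlock(&group->mutex);
	iommu_group_put(group);
	return dom;
}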

Jason
  
Robin Murphy Oct. 2, 2023, 7:02 p.m. UTC | #2
On 02/10/2023 3:16 pm, Jason Gunthorpe wrote:
> On Mon, Oct 02, 2023 at 02:49:12PM +0100, Robin Murphy wrote:
>> @@ -2120,20 +2120,30 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
>>   	return domain;
>>   }
>>   
>> -static struct iommu_domain *
>> -__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
>> +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
>>   {
> 
> Why?

Because out of 3 callsites, two were in a context which now needed to
make the iommu_group_first_dev() call itself anyway, so recalculating it
twice more with the mutex still held would clearly be silly; rather than
bother refactoring the __iommu_group_domain_alloc() helper, I squashed
it into its one remaining user. I don't know a good way to tell
git-format-patch to show a particular range of lines as a complete
removal of one function plus the addition of an entirely unrelated new
one which just happens to be in the same place.

> The point of this design is that drivers are not allowed to
> allocate different things for devices in the same group. So we always
> force the driver to see only the first device in the group even if we
> have a more specific device available in the call chain.

OK, but I can't read your mind. All I have visibility of is some
code which factored out a helper function for a sequence common to
several users, as is typical; neither the commit message of 8359cf39acba
nor the cover letter from that series provides any obvious explanation of
this "design".

> This patch has undone this design and passed in more specific devs :(

Um... Good? I mean in 3/4 cases it's literally the exact same code just
factored out again, while the one I've added picks some arbitrary device
in a different way. But the first device in the group list is still just
that - some arbitrary device - it's by no means guaranteed to be the
*same* device each time the group is re-locked, so pretending it's some
kind of useful invariant is misleading and wrong. Frankly now that you
*have* explained the design intent here, it's only made me more set on
removing it for being unclear, overcomplicated, and yet fundamentally
useless.

Yes, we absolutely expect that if dev_iommu_ops(dev)->device_group for
two devices returns the same group, the driver should bloody well give
the same result for either dev_iommu_ops(dev)->domain_alloc, but there's
no practical way to enforce that at this level of code. If a driver ever
did have inconsistent behaviour across devices within a group, trying to
always use the "one true device" would only help *hide* that and make it
harder to debug, not at all prevent it.

> The new code here:
> 
>>   struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
>>   {
>> -	if (bus == NULL || bus->iommu_ops == NULL)
>> +	struct device *dev = NULL;
>> +
>> +	/* We always check the whole bus, so the return value isn't useful */
>> +	bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev);
>> +	if (!dev)
>>   		return NULL;
>> -	return __iommu_domain_alloc(bus->iommu_ops, NULL,
>> -				    IOMMU_DOMAIN_UNMANAGED);
>> +
>> +	return __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
>>   }
>>   EXPORT_SYMBOL_GPL(iommu_domain_alloc);
> 
> Should just obtain any group for the bus and pass that to
> __iommu_group_domain_alloc().
> 
> Also, how does the locking work here? Definitely can't pass dev
> outside the bus_for_each_dev() like this.

It works like in most of the rest of the API right now... this is still
the process of cleaning up the old ugly stuff in order to be able to
evolve the API foundations to the point where we *can* have a reasonable
locking scheme. It's a multi-step process, and I'd really like to get
the internal bus-ops-to-instance-ops transition completed in order to
then clean up the "of_iommu_configure() at driver probe time"
catastrophe, get iommu_domain_alloc() callers to pass a valid device
themselves, teach the ARM DMA ops about default domains (for which it's
a great help that you've now got the driver-side groundwork done,
thanks!), and then maybe we can finally have nice things.

> If this needs to sweep over arbitrary devices that are not the caller's
> probed device, it needs to hold at least the device_lock to prevent
> racing with release.
> 
> So I'd structure this to find the matching device, lock the
> device_lock, get the group refcount, unlock the device_lock then
> get the group_mutex, check for non-empty and then call
> __iommu_group_domain_alloc()

...and as your reward you'd get much the same deadlocks as with the last
attempt to bring device_lock into it, thanks to
arm_iommu_create_mapping(), and drivers calling iommu_domain_alloc()
from their own probe routines :(

FYI the diff below is what I've hacked out so far, but I'll test it with
a fresh head tomorrow (just from pasting it in here I've already noticed
at least one bug...)

> (there is a missing lockdep annotation in
> __iommu_group_domain_alloc(), the group mutex is needed)

(fear not, iommu_group_first_dev() brings that to those places!)
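
For reference, that helper is essentially:

static struct device *iommu_group_first_dev(struct iommu_group *group)
{
	lockdep_assert_held(&group->mutex);
	return list_first_entry(&group->devices, struct group_device, list)->dev;
}

so every remaining caller picks up the lockdep annotation for free.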

Thanks,
Robin.

----->8-----
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3d29434d57c6..78366e988e31 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2128,27 +2128,41 @@ static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
  static int __iommu_domain_alloc_dev(struct device *dev, void *data)
  {
  	struct device **alloc_dev = data;
+	struct iommu_group *group = iommu_group_get(dev);
  
-	if (!dev_has_iommu(dev))
+	if (!group)
  		return 0;
  
-	WARN_ONCE(*alloc_dev && dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
-		  "Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. You may still need to disable one or more to get the expected result here, sorry!\n");
+	mutex_lock(&group->mutex);
+	/* Theoretically we could have raced against release */
+	if (list_empty(&group->devices)) {
+		mutex_unlock(&group->mutex);
+		iommu_group_put(group);
+		return 0;
+	}
  
-	*alloc_dev = dev;
+	if (!*alloc_dev)
+		*alloc_dev = dev;
+
+	WARN_ONCE(dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
+		  "Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. You may still need to disable one or more to get the expected result here, sorry!\n");
  	return 0;
  }
  
  struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
  {
  	struct device *dev = NULL;
+	struct iommu_domain *dom;
  
  	/* We always check the whole bus, so the return value isn't useful */
  	bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev);
  	if (!dev)
  		return NULL;
  
-	return __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
+	dom = __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
+	mutex_unlock(&dev->iommu_group->mutex);
+	iommu_group_put(dev->iommu_group);
+	return dom;
  }
  EXPORT_SYMBOL_GPL(iommu_domain_alloc);
  
Jason Gunthorpe Oct. 2, 2023, 7:36 p.m. UTC | #3
On Mon, Oct 02, 2023 at 08:02:23PM +0100, Robin Murphy wrote:
> On 02/10/2023 3:16 pm, Jason Gunthorpe wrote:
> > On Mon, Oct 02, 2023 at 02:49:12PM +0100, Robin Murphy wrote:
> > > @@ -2120,20 +2120,30 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
> > >   	return domain;
> > >   }
> > > -static struct iommu_domain *
> > > -__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
> > > +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
> > >   {
> > 
> > Why?
> 
> Because out of 3 callsites, two were in a context which now needed to
> make the iommu_group_first_dev() call itself anyway,

I don't see it. Why not just do this?

static int __iommu_domain_alloc_dev(struct device *dev, void *data)
{
	/* FIXME: This is not correctly locked */
	struct iommu_group *group = iommu_group_get(dev);
	struct iommu_group **alloc_group = data;

	if (!group)
		return 0;

	mutex_lock(&group->mutex);
	/* Theoretically we could have raced against release */
	if (list_empty(&group->devices)) {
		mutex_unlock(&group->mutex);
		iommu_group_put(group);
		return 0;
	}

	*alloc_group = group;
	return 1;
}

struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
{
	struct iommu_group *alloc_group = NULL;
	struct iommu_domain *dom;

	/* Only one driver is supported with this interface */
	if (WARN_ON(iommu_num_drivers > 1))
		return NULL;

	bus_for_each_dev(bus, NULL, &alloc_group, __iommu_domain_alloc_dev);
	if (!alloc_group)
		return NULL;
	dom = __iommu_group_domain_alloc(alloc_group, IOMMU_DOMAIN_UNMANAGED);
	mutex_unlock(&alloc_group->mutex);
	iommu_group_put(alloc_group);
	return dom;
}
EXPORT_SYMBOL_GPL(iommu_domain_alloc);

(and ++/-- iommu_num_drivers in iommu_device_register)

One patch, it's pretty easy???

> Um... Good? I mean in 3/4 cases it's literally the exact same code just
> factored out again, while the one I've added picks some arbitrary device
> in a different way.

Sure, but the whole point was to make it obvious that there was no
direct linkage from the various dev parameters we have in places and
what dev will be passed by the driver. Everything passes through
__iommu_group_domain_alloc() and at the end of the day that should be
the only allocation API.

Jason
  
Robin Murphy Oct. 4, 2023, 5:23 p.m. UTC | #4
On 02/10/2023 8:36 pm, Jason Gunthorpe wrote:
> On Mon, Oct 02, 2023 at 08:02:23PM +0100, Robin Murphy wrote:
>> On 02/10/2023 3:16 pm, Jason Gunthorpe wrote:
>>> On Mon, Oct 02, 2023 at 02:49:12PM +0100, Robin Murphy wrote:
>>>> @@ -2120,20 +2120,30 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
>>>>    	return domain;
>>>>    }
>>>> -static struct iommu_domain *
>>>> -__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
>>>> +static int __iommu_domain_alloc_dev(struct device *dev, void *data)
>>>>    {
>>>
>>> Why?
>>
>> Because out of 3 callsites, two were in a context which now needed to
>> make the iommu_group_first_dev() call itself anyway,
> 
> I don't see it. Why not just do this?
> 
> static int __iommu_domain_alloc_dev(struct device *dev, void *data)
> {
> 	/* FIXME: This is not correctly locked */
> 	struct iommu_group *group = iommu_group_get(dev);
> 	struct iommu_group **alloc_group = data;
> 
> 	if (!group)
> 		return 0;
> 
> 	mutex_lock(&group->mutex);
> 	/* Theoretically we could have raced against release */
> 	if (list_empty(&group->devices)) {
> 		mutex_unlock(&group->mutex);
> 		iommu_group_put(group);
> 		return 0;
> 	}
> 
> 	*alloc_group = group;
> 	return 1;
> }
> 
> struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
> {
> 	struct iommu_group *alloc_group = NULL;
> 	struct iommu_domain *dom;
> 
> 	/* Only one driver is supported with this interface */
> 	if (WARN_ON(iommu_num_drivers > 1))
> 		return NULL;
> 
> 	bus_for_each_dev(bus, NULL, &alloc_group, __iommu_domain_alloc_dev);
> 	if (!alloc_group)
> 		return NULL;
> 	dom = __iommu_group_domain_alloc(alloc_group, IOMMU_DOMAIN_UNMANAGED);
> 	mutex_unlock(&alloc_group->mutex);
> 	iommu_group_put(alloc_group);
> 	return dom;
> }
> EXPORT_SYMBOL_GPL(iommu_domain_alloc);
> 
> (and ++/-- iommu_num_drivers in iommu_device_register)
> 
> One patch, it's pretty easy???

Sure, that would then leave two callsites for 
__iommu_group_domain_alloc(), which might in turn justify leaving it 
factored out, but the endgame is to get a device to resolve 
dev_iommu_ops(), so if we're starting with a suitable device in hand 
then using its group to immediately re-derive a device again when we 
*could* trivially pass the device straight through is silly and 
overcomplicated.

However it turns out that we can't easily touch the group here at all 
without then needing dependent changes in VFIO, so rather than invite 
any more scope creep at this point I'm going to respin this as a purely 
sideways step that sticks to resolving ops from a bus, and save any 
further device/instance niceties for step 2 when I have to touch all the 
external callers anyway.

>> Um... Good? I mean in 3/4 cases it's literally the exact same code just
>> factored out again, while the one I've added picks some arbitrary device
>> in a different way.
> 
> Sure, but the whole point was to make it obvious that there was no
> direct linkage from the various dev parameters we have in places and
> what dev will be passed by the driver.

But that's the argument that makes no sense - it happens to be the case 
in the current code that all default domain allocation sites are already 
buried in several layers worth of group-based machinery, but that in 
itself holds no significance to the allocation process. I maintain that 
it is simply misleading to pretend that (taking a little artistic 
license with types) a round trip through 
dev->iommu_group->devices->next->dev->iommu holds any significance over 
just dev->iommu. There's hardly anything "obvious" about it either - 
it's buried in core code that driver authors have little reason to even 
look at.

If I'm implementing a driver, I'm going to see the signature of the op I 
need to implement, which tells me to allocate a domain and gives me a 
device to allocate it for, and for reference I'm going to look at how 
other drivers implement that op. I would not start by picking through 
the core code for the callers of the caller of that op, to see some 
weird code that looks redundant when in practice I observe it resolving 
a device back to itself, and think "ah yes, now I see the undocumented 
significance of an idea that only existed in Jason's head". If you don't 
want an IOMMU driver to be able to think the particular device is 
important, don't pass a bloody device! If the only intended purpose is 
for the driver to resolve dev->iommu instance data, then have the core 
do that and just pass the dev_iommu or iommu_dev directly; ambiguity solved.
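
For instance (hypothetical shape, not an op that exists today):

	struct iommu_domain *(*domain_alloc_paging)(struct iommu_device *iommu);

where the core resolves dev->iommu->iommu_dev itself before calling in,
and the driver never sees a struct device at all.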

If one is expected to look at subtleties 2 or 3 levels down the 
callchain of an API in order to understand how to implement that API 
correctly, that is *a badly designed API*, and piling on more 
"correctness theatre" code to attempt to highlight the bad design does 
not address the real problem there.

> Everything passes through
> __iommu_group_domain_alloc() and at the end of the day that should be
> the only allocation API.

But *why* should it? What's inherently more special about a group vs. a 
device or an IOMMU instance (especially with those being the things 
drivers actually deal with in relation to domains)?

If you've noticed, between here and patch #3 I already end up 
effectively enforcing that devices in the same group must have identical 
device ops - that's arguably an even stronger requirement than we need, 
but it fits the current drivers so we may as well not relax it until 
proven necessary. So AFAICS the problem you're desperate to address is a 
theoretical one of a driver author going out of their way to implement a 
single .domain_alloc_paging implementation badly, in a way which would 
only affect the behaviour of that driver, and should be easy to spot in 
patch review anyway. Any complexity in core code in an attempt to make 
any difference to that seems about as worthwhile as trying to hold back 
the tide.

Thanks,
Robin.
  
Jason Gunthorpe Oct. 4, 2023, 9:35 p.m. UTC | #5
On Wed, Oct 04, 2023 at 06:23:05PM +0100, Robin Murphy wrote:

> Sure, that would then leave two callsites for __iommu_group_domain_alloc(),
> which might in turn justify leaving it factored out, but the endgame is to
> get a device to resolve dev_iommu_ops(), so if we're starting with a
> suitable device in hand then using its group to immediately re-derive a
> device again when we *could* trivially pass the device straight through is
> silly and overcomplicated.

Well, it is the same sort of interface as attaching a domain to a
group. The group stuff is like that.
 
> However it turns out that we can't easily touch the group here at all
> without then needing dependent changes in VFIO, 

Why? VFIO only puts its weird groups on the mdev bus and so they will
never hit this.

> more scope creep at this point I'm going to respin this as a purely sideways
> step that sticks to resolving ops from a bus, and save any further
> device/instance niceties for step 2 when I have to touch all the external
> callers anyway.

One thing that I've recently realized is that this patch will break
everything when combined with the patches I'm sending to start doing
finalize during domain_alloc_paging() ops. It will randomly select a
device on the bus which is probably the wrong smmu for the device we
intend to attach to later.

So, to keep everything together, the special bus domain alloc path has
to continue to pass in a NULL dev to domain_alloc_paging().
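
On the driver side that ends up looking something like this (sketch,
names invented, assuming struct my_domain embeds a struct iommu_domain):

static struct iommu_domain *my_domain_alloc_paging(struct device *dev)
{
	struct my_domain *md = kzalloc(sizeof(*md), GFP_KERNEL);

	if (!md)
		return NULL;
	/*
	 * dev is NULL on the legacy iommu_domain_alloc(bus) path, so
	 * finalizing against a specific IOMMU instance must be deferred
	 * to the first attach in that case.
	 */
	if (dev && my_domain_finalise(md, dev)) {
		kfree(md);
		return NULL;
	}
	return &md->domain;
}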

That is certainly a pretty good reason why the above can't use the
existing group call, and then you can make the case that it doesn't have
enough users to exist anymore.

> But that's the argument that makes no sense - it happens to be the case in
> the current code that all default domain allocation sites are already buried
> in several layers worth of group-based machinery, but that in itself holds
> no significance to the allocation process. 

Well yes, but also it is what it is. The group stuff gets everywhere in
undesired ways. The original version of this did write it the way you
suggest, and Kevin came up with the idea to have a group alloc domain. I
tried it and it was indeed cleaner: one place to do the Quite Strange
thing of randomly choosing a device in a group, and combined with the
get-group-ops function it allowed all the group-centric code to stay
group-centric until the very fringe.

Keeping it clear that the dev was in fact just made up in a lot of call
paths is a nice bonus.

> only existed in Jason's head". If you don't want an IOMMU driver to be able
> to think the particular device is important, don't pass a bloody device! If
> the only intended purpose is for the driver to resolve dev->iommu instance
> data, then have the core do that and just pass the dev_iommu or iommu_dev
> directly; ambiguity solved.

Oh, I wanted to do that! I even looked at doing it. But it doesn't work
for amd and smmuv3 :( They need to know whether the group has any
PASID-capable devices in order to choose a PASID-compatible RID domain
for DMA API use, and you can't get that from the iommu_dev alone.

> If one is expected to look at subtleties 2 or 3 levels down the callchain of
> an API in order to understand how to implement that API correctly, that is
> *a badly designed API*, and piling on more "correctness theatre" code to
> attempt to highlight the bad design does not address the real problem there.

Oh, I agree, this group stuff is troubled like that. It confuses all
kinds of unrelated things into one bundle.

The group stuff exists at the core layer where everything wants to
operate on groups. We attach devices to groups, we allocate domains
for groups. Then the device level works on devices, not groups, and we
have this confusion.

> If you've noticed, between here and patch #3 I already end up effectively
> enforcing that devices in the same group must have identical device ops -

Doesn't it just fall out of the probe rework? All devices are now
attached to a domain during probe. If the single domain in the group
fails attach because of your patch 4 then the device will not probe.

Unraveling that would require multiple domains per group, which
seems like a huge undertaking :)

> So AFAICS the problem you're desperate to address is a theoretical
> one of a driver author going out of their way to implement a single
> .domain_alloc_paging implementation badly, in a way which would only
> affect the behaviour of that driver, and should be easy to spot in
> patch review anyway. Any complexity in core code in an attempt to
> make any difference to that seems about as worthwhile as trying to
> hold back the tide.

Well, not quite... there are some quite obvious things that can trip up
a driver. The requirement for a special RID IOPTE format to get PASID
support is tricky. The drivers are all functionally OK because we link
PASID support to singleton groups, but even that took a long time to
arrive at.

So, please try a v5, let's see what it looks like with the
domain_alloc_paging and semi-locking fix?

Jason
  

Patch

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 7c79a58ef010..c5b5408d1dd7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -95,8 +95,8 @@  static const char * const iommu_group_resv_type_string[] = {
 static int iommu_bus_notifier(struct notifier_block *nb,
 			      unsigned long action, void *data);
 static void iommu_release_device(struct device *dev);
-static struct iommu_domain *
-__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type);
+static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
+						 unsigned type);
 static int __iommu_attach_device(struct iommu_domain *domain,
 				 struct device *dev);
 static int __iommu_attach_group(struct iommu_domain *domain,
@@ -1763,7 +1763,7 @@  __iommu_group_alloc_default_domain(struct iommu_group *group, int req_type)
 {
 	if (group->default_domain && group->default_domain->type == req_type)
 		return group->default_domain;
-	return __iommu_group_domain_alloc(group, req_type);
+	return __iommu_domain_alloc(iommu_group_first_dev(group), req_type);
 }
 
 /*
@@ -2082,10 +2082,10 @@  void iommu_set_fault_handler(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_set_fault_handler);
 
-static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
-						 struct device *dev,
-						 unsigned int type)
+static struct iommu_domain *__iommu_domain_alloc(struct device *dev,
+						 unsigned type)
 {
+	const struct iommu_ops *ops = dev_iommu_ops(dev);
 	struct iommu_domain *domain;
 	unsigned int alloc_type = type & IOMMU_DOMAIN_ALLOC_FLAGS;
 
@@ -2120,20 +2120,30 @@  static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
 	return domain;
 }
 
-static struct iommu_domain *
-__iommu_group_domain_alloc(struct iommu_group *group, unsigned int type)
+static int __iommu_domain_alloc_dev(struct device *dev, void *data)
 {
-	struct device *dev = iommu_group_first_dev(group);
+	struct device **alloc_dev = data;
 
-	return __iommu_domain_alloc(dev_iommu_ops(dev), dev, type);
+	if (!dev_has_iommu(dev))
+		return 0;
+
+	WARN_ONCE(*alloc_dev && dev_iommu_ops(dev) != dev_iommu_ops(*alloc_dev),
+		  "Multiple IOMMU drivers present, which the public IOMMU API can't fully support yet. You may still need to disable one or more to get the expected result here, sorry!\n");
+
+	*alloc_dev = dev;
+	return 0;
 }
 
 struct iommu_domain *iommu_domain_alloc(const struct bus_type *bus)
 {
-	if (bus == NULL || bus->iommu_ops == NULL)
+	struct device *dev = NULL;
+
+	/* We always check the whole bus, so the return value isn't useful */
+	bus_for_each_dev(bus, NULL, &dev, __iommu_domain_alloc_dev);
+	if (!dev)
 		return NULL;
-	return __iommu_domain_alloc(bus->iommu_ops, NULL,
-				    IOMMU_DOMAIN_UNMANAGED);
+
+	return __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
 }
 EXPORT_SYMBOL_GPL(iommu_domain_alloc);
 
@@ -3256,18 +3266,22 @@  void iommu_device_unuse_default_domain(struct device *dev)
 
 static int __iommu_group_alloc_blocking_domain(struct iommu_group *group)
 {
+	struct device *dev = iommu_group_first_dev(group);
+
 	if (group->blocking_domain)
 		return 0;
 
-	group->blocking_domain =
-		__iommu_group_domain_alloc(group, IOMMU_DOMAIN_BLOCKED);
+	/* noiommu groups should never be here */
+	if (WARN_ON(!dev_has_iommu(dev)))
+		return -ENODEV;
+
+	group->blocking_domain = __iommu_domain_alloc(dev, IOMMU_DOMAIN_BLOCKED);
 	if (!group->blocking_domain) {
 		/*
 		 * For drivers that do not yet understand IOMMU_DOMAIN_BLOCKED
 		 * create an empty domain instead.
 		 */
-		group->blocking_domain = __iommu_group_domain_alloc(
-			group, IOMMU_DOMAIN_UNMANAGED);
+		group->blocking_domain = __iommu_domain_alloc(dev, IOMMU_DOMAIN_UNMANAGED);
 		if (!group->blocking_domain)
 			return -EINVAL;
 	}