bus: fsl-mc: don't assume child devices are all fsl-mc devices

Message ID 20230127131636.20889-1-laurentiu.tudor@nxp.com
State New
Series bus: fsl-mc: don't assume child devices are all fsl-mc devices

Commit Message

Laurentiu Tudor Jan. 27, 2023, 1:16 p.m. UTC
  From: Laurentiu Tudor <laurentiu.tudor@nxp.com>

Changes in VFIO caused a pseudo-device to be created as a child of
fsl-mc devices, causing a crash [1] when trying to bind an fsl-mc
device to VFIO. Fix this by checking the device type when enumerating
fsl-mc child devices.

[1]
Modules linked in:
Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
CPU: 6 PID: 1289 Comm: sh Not tainted 6.2.0-rc5-00047-g7c46948a6e9c #2
Hardware name: NXP Layerscape LX2160ARDB (DT)
pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mc_send_command+0x24/0x1f0
lr : dprc_get_obj_region+0xfc/0x1c0
sp : ffff80000a88b900
x29: ffff80000a88b900 x28: ffff48a9429e1400 x27: 00000000000002b2
x26: ffff48a9429e1718 x25: 0000000000000000 x24: 0000000000000000
x23: ffffd59331ba3918 x22: ffffd59331ba3000 x21: 0000000000000000
x20: ffff80000a88b9b8 x19: 0000000000000000 x18: 0000000000000001
x17: 7270642f636d2d6c x16: 73662e3030303030 x15: ffffffffffffffff
x14: ffffd59330f1d668 x13: ffff48a8727dc389 x12: ffff48a8727dc386
x11: 0000000000000002 x10: 00008ceaf02f35d4 x9 : 0000000000000012
x8 : 0000000000000000 x7 : 0000000000000006 x6 : ffff80000a88bab0
x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff80000a88b9e8
x2 : ffff80000a88b9e8 x1 : 0000000000000000 x0 : ffff48a945142b80
Call trace:
 mc_send_command+0x24/0x1f0
 dprc_get_obj_region+0xfc/0x1c0
 fsl_mc_device_add+0x340/0x590
 fsl_mc_obj_device_add+0xd0/0xf8
 dprc_scan_objects+0x1c4/0x340
 dprc_scan_container+0x38/0x60
 vfio_fsl_mc_probe+0x9c/0xf8
 fsl_mc_driver_probe+0x24/0x70
 really_probe+0xbc/0x2a8
 __driver_probe_device+0x78/0xe0
 device_driver_attach+0x30/0x68
 bind_store+0xa8/0x130
 drv_attr_store+0x24/0x38
 sysfs_kf_write+0x44/0x60
 kernfs_fop_write_iter+0x128/0x1b8
 vfs_write+0x334/0x448
 ksys_write+0x68/0xf0
 __arm64_sys_write+0x1c/0x28
 invoke_syscall+0x44/0x108
 el0_svc_common.constprop.1+0x94/0xf8
 do_el0_svc+0x38/0xb0
 el0_svc+0x20/0x50
 el0t_64_sync_handler+0x98/0xc0
 el0t_64_sync+0x174/0x178
Code: aa0103f4 a9025bf5 d5384100 b9400801 (79401260)
---[ end trace 0000000000000000 ]---

Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
---
 drivers/bus/fsl-mc/dprc-driver.c | 6 ++++++
 1 file changed, 6 insertions(+)
  

Comments

Greg KH Jan. 31, 2023, 11:56 a.m. UTC | #1
On Fri, Jan 27, 2023 at 03:16:36PM +0200, laurentiu.tudor@nxp.com wrote:
> From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> 
> Changes in VFIO caused a pseudo-device to be created as child of
> fsl-mc devices causing a crash [1] when trying to bind a fsl-mc
> device to VFIO. Fix this by checking the device type when enumerating
> fsl-mc child devices.

What changes?  What commit id does this fix?  Does it need to be
backported?

And what type of "pseudo device" is being created?  Why would it be
passed to this driver if it's the wrong type?

this feels wrong...

thanks,

greg k-h
  
Laurentiu Tudor Feb. 1, 2023, 11:50 a.m. UTC | #2
Hi Greg,

On 1/31/2023 1:56 PM, Greg KH wrote:
> On Fri, Jan 27, 2023 at 03:16:36PM +0200, laurentiu.tudor@nxp.com wrote:
>> From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
>>
>> Changes in VFIO caused a pseudo-device to be created as child of
>> fsl-mc devices causing a crash [1] when trying to bind a fsl-mc
>> device to VFIO. Fix this by checking the device type when enumerating
>> fsl-mc child devices.
> 
> What changes?  What commit id does this fix?  Does it need to be
> backported?

There were a lot of changes in the VFIO area but I'd point at this 
commit [1].

I'll resend the patch with a "Fixes:" tag pointing at this commit if 
that's ok with you.

> And what type of "pseudo device" is being created? 
> Why would it be passed to this driver if it's the wrong type?

It's not passed to the driver per se. The problem shows up when the
driver calls device_for_each_child() [2] and the callback blindly
assumes that every enumerated child device is an fsl-mc device. The
patch just adds a check for this case.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3c28a76124b25882411f005924be73795b6ef078
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/bus/fsl-mc/dprc-driver.c?#n96

---
Thanks & Best Regards, Laurentiu
  
Greg KH Feb. 1, 2023, 1:04 p.m. UTC | #3
On Wed, Feb 01, 2023 at 01:50:11PM +0200, Laurentiu Tudor wrote:
> Hi Greg,
> 
> On 1/31/2023 1:56 PM, Greg KH wrote:
> > On Fri, Jan 27, 2023 at 03:16:36PM +0200, laurentiu.tudor@nxp.com wrote:
> > > From: Laurentiu Tudor <laurentiu.tudor@nxp.com>
> > > 
> > > Changes in VFIO caused a pseudo-device to be created as child of
> > > fsl-mc devices causing a crash [1] when trying to bind a fsl-mc
> > > device to VFIO. Fix this by checking the device type when enumerating
> > > fsl-mc child devices.
> > 
> > What changes?  What commit id does this fix?  Does it need to be
> > backported?
> 
> There were a lot of changes in the VFIO area but I'd point at this commit
> [1].
> 
> I'll resend the patch with a "Fixes:" tag pointing at this commit if that's
> ok with you.

Please do.

> > And what type of "pseudo device" is being created? Why would it be
> > passed to this driver if it's the wrong type?
> 
> It's not passed to the driver per se. The problem shows up when the
> driver calls device_for_each_child() [2] and the callback blindly
> assumes that every enumerated child device is an fsl-mc device. The
> patch just adds a check for this case.

Ah, that makes more sense, sorry for the noise.

greg k-h
  

Patch

diff --git a/drivers/bus/fsl-mc/dprc-driver.c b/drivers/bus/fsl-mc/dprc-driver.c
index 4c84be378bf2..ec5f26a45641 100644
--- a/drivers/bus/fsl-mc/dprc-driver.c
+++ b/drivers/bus/fsl-mc/dprc-driver.c
@@ -45,6 +45,9 @@ static int __fsl_mc_device_remove_if_not_in_mc(struct device *dev, void *data)
 	struct fsl_mc_child_objs *objs;
 	struct fsl_mc_device *mc_dev;
 
+	if (!dev_is_fsl_mc(dev))
+		return 0;
+
 	mc_dev = to_fsl_mc_device(dev);
 	objs = data;
 
@@ -64,6 +67,9 @@ static int __fsl_mc_device_remove_if_not_in_mc(struct device *dev, void *data)
 
 static int __fsl_mc_device_remove(struct device *dev, void *data)
 {
+	if (!dev_is_fsl_mc(dev))
+		return 0;
+
 	fsl_mc_device_remove(to_fsl_mc_device(dev));
 	return 0;
 }
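
The guard added by the patch above can be illustrated outside the kernel with a minimal userspace sketch. Everything in it is a hypothetical stand-in, not kernel code: `struct device` and `struct bus_type` are reduced to the one field needed, `struct mock_mc_device`, `to_mock_mc_device()`, `dev_is_mock_mc()` and `for_each_child()` only mimic the shape of `struct fsl_mc_device`, `to_fsl_mc_device()`, `dev_is_fsl_mc()` and `device_for_each_child()`. It also assumes that the type check is a comparison of the device's bus pointer, which is how `dev_is_fsl_mc()` appears to be defined. The point is only that the callback must verify the type before converting the generic device pointer to its container.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct bus_type { const char *name; };
struct device { const struct bus_type *bus; };

static const struct bus_type mock_mc_bus = { "mock-fsl-mc" };

/* Mimics struct fsl_mc_device: the generic device is embedded, not first. */
struct mock_mc_device {
	int obj_id;
	struct device dev;
};

/* container_of-style conversion; only valid if dev really is embedded
 * in a struct mock_mc_device. */
#define to_mock_mc_device(d) \
	((struct mock_mc_device *)((char *)(d) - \
				   offsetof(struct mock_mc_device, dev)))

/* Assumed shape of the type check: compare the bus pointer. */
static int dev_is_mock_mc(const struct device *dev)
{
	return dev->bus == &mock_mc_bus;
}

/* Callback in the style of __fsl_mc_device_remove(): skip foreign children. */
static int count_mc_child(struct device *dev, void *data)
{
	int *count = data;

	if (!dev_is_mock_mc(dev))
		return 0;	/* not ours: do not convert, keep iterating */

	printf("mc child with obj_id %d\n", to_mock_mc_device(dev)->obj_id);
	(*count)++;
	return 0;
}

/* Simplified iteration: stop early if the callback returns non-zero. */
static int for_each_child(struct device **children, size_t n,
			  int (*fn)(struct device *, void *), void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret = fn(children[i], data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Two mock mc children plus one pseudo-device of another type. */
int count_mc_children_demo(void)
{
	struct mock_mc_device a = { .obj_id = 1, .dev = { .bus = &mock_mc_bus } };
	struct mock_mc_device b = { .obj_id = 2, .dev = { .bus = &mock_mc_bus } };
	struct device pseudo = { .bus = NULL };
	struct device *children[] = { &a.dev, &pseudo, &b.dev };
	int count = 0;

	for_each_child(children, 3, count_mc_child, &count);
	return count;
}
```

Calling `count_mc_children_demo()` from a `main()` returns 2: the pseudo-device is skipped by the guard. Without the guard, `to_mock_mc_device()` would be applied to a standalone `struct device` and would yield a pointer to memory that is not a `struct mock_mc_device`, which is the kind of invalid access seen in the oops.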