char: xillybus: Prevent use-after-free due to race condition

Message ID 2e5cbdfe-f6cd-d24f-9785-55176af6c975@gmail.com
State New
Headers
Series char: xillybus: Prevent use-after-free due to race condition |

Commit Message

Eli Billauer Oct. 26, 2022, 8:52 a.m. UTC
  xillybus_find_inode() is called by xillybus_open() and xillyusb_open()
to translate the inode's major and minor into a pointer to a relevant
data structure and an index.

But with xillyusb_open(), the data structure can be freed by
xillyusb_disconnect() during an unintentional time gap between the
release of the mutex that is taken by xillybus_find_inode() and the
mutex that is subsequently taken by xillyusb_open().

To fix this, xillybus_find_inode() supplies the pointer to the mutex that
it has locked (when returning success), so xillyusb_open() releases this
mutex only after obtaining the mutex that is specific to a device file.
This ensures that xillyusb_disconnect() won't release anything that is in
use.

This manipulation of mutexes poses no risk for a deadlock, because in
all usage scenarios, @unit_mutex (which is locked by xillybus_find_inode)
is always taken when no other mutex is locked. Hence a consistent locking
order is guaranteed.

xillybus_open() unlocks this mutex immediately, as this driver doesn't
support hot unplugging anyhow.

Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
Signed-off-by: Eli Billauer <eli.billauer@gmail.com>
---
 drivers/char/xillybus/xillybus_class.c | 8 +++++---
 drivers/char/xillybus/xillybus_class.h | 2 ++
 drivers/char/xillybus/xillybus_core.c  | 6 +++++-
 drivers/char/xillybus/xillyusb.c       | 4 +++-
 4 files changed, 15 insertions(+), 5 deletions(-)
  

Comments

Greg KH Oct. 26, 2022, 12:07 p.m. UTC | #1
On Wed, Oct 26, 2022 at 11:52:40AM +0300, Eli Billauer wrote:
> xillybus_find_inode() is called by xillybus_open() and xillyusb_open()
> to translate the inode's major and minor into a pointer to a relevant
> data structure and an index.
> 
> But with xillyusb_open(), the data structure can be freed by
> xillyusb_disconnect() during an unintentional time gap between the
> release of the mutex that is taken by xillybus_find_inode() and the
> mutex that is subsequently taken by xillyusb_open().
> 
> To fix this, xillybus_find_inode() supplies the pointer to the mutex that
> it has locked (when returning success), so xillyusb_open() releases this
> mutex only after obtaining the mutex that is specific to a device file.
> This ensures that xillyusb_disconnect() won't release anything that is in
> use.

That's really odd, and not normal at all.  We don't pass around mutexes
like this as how do you know if that's allowed?

> 
> This manipulation of mutexes poses no risk for a deadlock, because in
> all usage scenarios, @unit_mutex (which is locked by xillybus_find_inode)
> is always taken when no other mutex is locked. Hence a consistent locking
> order is guaranteed.
> 
> xillybus_open() unlocks this mutex immediately, as this driver doesn't
> support hot unplugging anyhow.
> 
> Reported-by: Hyunwoo Kim <imv4bel@gmail.com>
> Signed-off-by: Eli Billauer <eli.billauer@gmail.com>
> ---
> drivers/char/xillybus/xillybus_class.c | 8 +++++---
> drivers/char/xillybus/xillybus_class.h | 2 ++
> drivers/char/xillybus/xillybus_core.c  | 6 +++++-
> drivers/char/xillybus/xillyusb.c       | 4 +++-
> 4 files changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/char/xillybus/xillybus_class.c b/drivers/char/xillybus/xillybus_class.c
> index 0f238648dcfe..c846dc3ed225 100644
> --- a/drivers/char/xillybus/xillybus_class.c
> +++ b/drivers/char/xillybus/xillybus_class.c
> @@ -211,6 +211,7 @@ void xillybus_cleanup_chrdev(void *private_data,
> EXPORT_SYMBOL(xillybus_cleanup_chrdev);
> 
> int xillybus_find_inode(struct inode *inode,
> +			struct mutex **to_unlock,

To unlock when?  Who unlocks it?  What is the lifespan here?

Why can't it just be part of the structure?

> 			void **private_data, int *index)
> {
> 	int minor = iminor(inode);
> @@ -227,13 +228,14 @@ int xillybus_find_inode(struct inode *inode,
> 			break;
> 		}
> 
> -	mutex_unlock(&unit_mutex);
> -
> -	if (!unit)
> +	if (!unit) {
> +		mutex_unlock(&unit_mutex);
> 		return -ENODEV;
> +	}
> 
> 	*private_data = unit->private_data;
> 	*index = minor - unit->lowest_minor;
> +	*to_unlock = &unit_mutex;

Why are you wanting the caller to unlock this?  It's a global mutex (for
the whole file), this feels really odd.

What is this function supposed to be doing?  You only return an int, and
you have some odd opaque void pointer being set.  That feels wrong and
is not a normal design at all.

confused,

greg k-h
  
Alan Stern Oct. 26, 2022, 3:02 p.m. UTC | #2
On Wed, Oct 26, 2022 at 11:52:40AM +0300, Eli Billauer wrote:
> xillybus_find_inode() is called by xillybus_open() and xillyusb_open()
> to translate the inode's major and minor into a pointer to a relevant
> data structure and an index.
> 
> But with xillyusb_open(), the data structure can be freed by
> xillyusb_disconnect() during an unintentional time gap between the
> release of the mutex that is taken by xillybus_find_inode() and the
> mutex that is subsequently taken by xillyusb_open().
> 
> To fix this, xillybus_find_inode() supplies the pointer to the mutex that
> it has locked (when returning success), so xillyusb_open() releases this
> mutex only after obtaining the mutex that is specific to a device file.
> This ensures that xillyusb_disconnect() won't release anything that is in
> use.

The standard way of handling this problem is different from this.  The 
driver defines a private mutex, and it ensures that any routine calling 
*_find_inode() holds the mutex.  It also ensures that the mutex is held 
while a new device is being registered and while a device is being 
removed.

Even that won't fix all the synchronization problems.  A process can 
open a device, and then after the device has been removed the process 
can still try to access the device.  The driver needs to ensure that 
such accesses are not allowed.

Alan Stern
  
Eli Billauer Oct. 27, 2022, 8:01 a.m. UTC | #3
Hello Greg,

I'm going to follow Alan's advice, and add a mutex instead of juggling 
with the existing one. So I'll prepare a new patch with a completely 
different solution. I've answered your questions below, even though I'm 
not sure any of that is still relevant.

On 26/10/2022 15:07, Greg KH wrote:

>> int xillybus_find_inode(struct inode *inode,
>> +			struct mutex **to_unlock,
> 
> To unlock when?  Who unlocks it?  What is the lifespan here?

xillybus_find_inode() finds the structure that represents a Xillybus 
(PCIe / OF) or XillyUSB (USB) device. The idea was to delay the unlock 
of the mutex until other means have been taken to prevent that structure 
from being freed. Which was virtually immediately after 
xillybus_find_inode() returns, but gave XillyUSB's driver a chance to 
acquire another mutex before releasing this one.

XillyUSB's driver can't prevent the release of this structure before it 
knows which structure it is (that's xillybus_find_inode()'s job to find 
out). But because xillybus_find_inode() unlocks its mutex before 
returning, there is a short period of time where the structure is 
unprotected. So I thought, let's extend the life of the first mutex just 
a little bit to keep the whole thing watertight.

> 
> Why can't it just be part of the structure?

The problem is that this structure is either a struct xilly_endpoint 
(for PCIe / OF) or a struct xillyusb_dev (for USB), and these have 
virtually nothing in common. xillybus_find_inode() is used by two 
completely different drivers that are grouped into a single class.

>>
>> 	*private_data = unit->private_data;
>> 	*index = minor - unit->lowest_minor;
>> +	*to_unlock = &unit_mutex;
> 
> Why are you wanting the caller to unlock this?  It's a global mutex (for
> the whole file), this feels really odd.

As mentioned above, this gives the caller (i.e. XillyUSB's driver) a 
chance to ensure that the structure isn't freed as a result of the 
device's disconnection.
> 
> What is this function supposed to be doing?  You only return an int, and
> you have some odd opaque void pointer being set.  That feels wrong and
> is not a normal design at all.

xillybus_find_inode() finds the device's structure, based upon the 
inode's major and minor.

The opaque void pointer is the structure that represents the device, 
either a PCIe / OF device or a XillyUSB device. Because of the 
difference between a driver for PCIe / OF and a USB driver, these are 
different structs, and hence represented by a void pointer. The int just 
says which of the device files, that are covered by the struct, has been 
opened.

Once again, all this is history, as I'm preparing a new patch, which is 
going to add a separate mutex.

Regards,
    Eli
  
Eli Billauer Oct. 27, 2022, 8:05 a.m. UTC | #4
Hello Alan,

On 26/10/2022 18:02, Alan Stern wrote:
> On Wed, Oct 26, 2022 at 11:52:40AM +0300, Eli Billauer wrote:
>> To fix this, xillybus_find_inode() supplies the pointer to the mutex that
>> it has locked (when returning success), so xillyusb_open() releases this
>> mutex only after obtaining the mutex that is specific to a device file.
>> This ensures that xillyusb_disconnect() won't release anything that is in
>> use.
> 
> The standard way of handling this problem is different from this.  The
> driver defines a private mutex, and it ensures that any routine calling
> *_find_inode() holds the mutex.  It also ensures that the mutex is held
> while a new device is being registered and while a device is being
> removed.

Thanks, I'm going to follow that advice in my v2 patch.

> 
> Even that won't fix all the synchronization problems.  A process can
> open a device, and then after the device has been removed the process
> can still try to access the device.  The driver needs to ensure that
> such accesses are not allowed.

Indeed. For that purpose, the relevant struct has a kref_counter, and an 
error flag that indicates a removal among others, along with mutexes. 
The problem is the time gap from the moment that the struct has been 
found by xillybus_find_inode() until it has been secured with the kref.

A new mutex is going to solve that.

Regards,
   Eli
  

Patch

diff --git a/drivers/char/xillybus/xillybus_class.c b/drivers/char/xillybus/xillybus_class.c
index 0f238648dcfe..c846dc3ed225 100644
--- a/drivers/char/xillybus/xillybus_class.c
+++ b/drivers/char/xillybus/xillybus_class.c
@@ -211,6 +211,7 @@  void xillybus_cleanup_chrdev(void *private_data,
 EXPORT_SYMBOL(xillybus_cleanup_chrdev);
 
 int xillybus_find_inode(struct inode *inode,
+			struct mutex **to_unlock,
 			void **private_data, int *index)
 {
 	int minor = iminor(inode);
@@ -227,13 +228,14 @@  int xillybus_find_inode(struct inode *inode,
 			break;
 		}
 
-	mutex_unlock(&unit_mutex);
-
-	if (!unit)
+	if (!unit) {
+		mutex_unlock(&unit_mutex);
 		return -ENODEV;
+	}
 
 	*private_data = unit->private_data;
 	*index = minor - unit->lowest_minor;
+	*to_unlock = &unit_mutex;
 
 	return 0;
 }
diff --git a/drivers/char/xillybus/xillybus_class.h b/drivers/char/xillybus/xillybus_class.h
index 5dbfdfc95c65..7cf89ac8bb32 100644
--- a/drivers/char/xillybus/xillybus_class.h
+++ b/drivers/char/xillybus/xillybus_class.h
@@ -12,6 +12,7 @@ 
 #include <linux/device.h>
 #include <linux/fs.h>
 #include <linux/module.h>
+#include <linux/mutex.h>
 
 int xillybus_init_chrdev(struct device *dev,
 			 const struct file_operations *fops,
@@ -25,6 +26,7 @@  void xillybus_cleanup_chrdev(void *private_data,
 			     struct device *dev);
 
 int xillybus_find_inode(struct inode *inode,
+			struct mutex **to_unlock, /* Mutex to release */
 			void **private_data, int *index);
 
 #endif /* __XILLYBUS_CLASS_H */
diff --git a/drivers/char/xillybus/xillybus_core.c b/drivers/char/xillybus/xillybus_core.c
index 11b7c4749274..7fd35f055310 100644
--- a/drivers/char/xillybus/xillybus_core.c
+++ b/drivers/char/xillybus/xillybus_core.c
@@ -1431,11 +1431,15 @@  static int xillybus_open(struct inode *inode, struct file *filp)
 	struct xilly_endpoint *endpoint;
 	struct xilly_channel *channel;
 	int index;
+	struct mutex *to_unlock; /* Mutex locked by xillybus_find_inode() */
 
-	rc = xillybus_find_inode(inode, (void **)&endpoint, &index);
+	rc = xillybus_find_inode(inode, &to_unlock,
+				 (void **)&endpoint, &index);
 	if (rc)
 		return rc;
 
+	mutex_unlock(to_unlock);
+
 	if (endpoint->fatal_error)
 		return -EIO;
 
diff --git a/drivers/char/xillybus/xillyusb.c b/drivers/char/xillybus/xillyusb.c
index 39bcbfd908b4..63e94d935067 100644
--- a/drivers/char/xillybus/xillyusb.c
+++ b/drivers/char/xillybus/xillyusb.c
@@ -1236,8 +1236,9 @@  static int xillyusb_open(struct inode *inode, struct file *filp)
 	struct xillyusb_endpoint *out_ep = NULL;
 	int rc;
 	int index;
+	struct mutex *to_unlock; /* Mutex locked by xillybus_find_inode() */
 
-	rc = xillybus_find_inode(inode, (void **)&xdev, &index);
+	rc = xillybus_find_inode(inode, &to_unlock, (void **)&xdev, &index);
 	if (rc)
 		return rc;
 
@@ -1245,6 +1246,7 @@  static int xillyusb_open(struct inode *inode, struct file *filp)
 	filp->private_data = chan;
 
 	mutex_lock(&chan->lock);
+	mutex_unlock(to_unlock);
 
 	rc = -ENODEV;