[1/1] usb: gadget: f_hid: Conduct proper refcounting on shared f_hidg pointer

Message ID 20221117120813.1257583-1-lee@kernel.org
State New
Headers
Series [1/1] usb: gadget: f_hid: Conduct proper refcounting on shared f_hidg pointer |

Commit Message

Lee Jones Nov. 17, 2022, 12:08 p.m. UTC
  A reference to struct f_hidg is shared with this driver's associated
character device handling component without provision for life-time
handling.  In some circumstances, this can lead to unwanted
behaviour depending on the order in which things are torn down.

Utilise, the reference counting functionality already provided by the
implanted character device structure to ensure the struct f_hidg's
memory is only freed once the character device handling has finished
with it.

Signed-off-by: Lee Jones <lee@kernel.org>
---
 drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------
 1 file changed, 37 insertions(+), 10 deletions(-)
  

Comments

Greg KH Nov. 17, 2022, 12:50 p.m. UTC | #1
On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> A reference to struct f_hidg is shared with this driver's associated
> character device handling component without provision for life-time
> handling.  In some circumstances, this can lead to unwanted
> behaviour depending on the order in which things are torn down.
> 
> Utilise, the reference counting functionality already provided by the
> implanted character device structure to ensure the struct f_hidg's
> memory is only freed once the character device handling has finished
> with it.
> 
> Signed-off-by: Lee Jones <lee@kernel.org>
> ---
>  drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------
>  1 file changed, 37 insertions(+), 10 deletions(-)

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
a patch that has triggered this response.  He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created.  Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- This looks like a new version of a previously submitted patch, but you
  did not list below the --- line any changes from the previous version.
  Please read the section entitled "The canonical patch format" in the
  kernel file, Documentation/SubmittingPatches for what needs to be done
  here to properly describe this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot
  
Greg KH Nov. 17, 2022, 12:50 p.m. UTC | #2
On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> +{
> +	return !!kref_read(&hidg->cdev.kobj.kref);
> +}

Ick, sorry, no, that's not going to work and is not allowed at all.
That's some major layering violations there, AND it can change after you
get the value as well.

greg k-h
  
Lee Jones Nov. 17, 2022, 1:26 p.m. UTC | #3
On Thu, 17 Nov 2022, Greg KH wrote:

> On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > A reference to struct f_hidg is shared with this driver's associated
> > character device handling component without provision for life-time
> > handling.  In some circumstances, this can lead to unwanted
> > behaviour depending on the order in which things are torn down.
> > 
> > Utilise, the reference counting functionality already provided by the
> > implanted character device structure to ensure the struct f_hidg's
> > memory is only freed once the character device handling has finished
> > with it.
> > 
> > Signed-off-by: Lee Jones <lee@kernel.org>
> > ---
> >  drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------
> >  1 file changed, 37 insertions(+), 10 deletions(-)
> 
> Hi,
> 
> This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
> a patch that has triggered this response.  He used to manually respond
> to these common problems, but in order to save his sanity (he kept
> writing the same thing over and over, yet to different people), I was
> created.  Hopefully you will not take offence and will fix the problem
> in your patch and resubmit it so that it can be accepted into the Linux
> kernel tree.
> 
> You are receiving this message because of the following common error(s)
> as indicated below:
> 
> - This looks like a new version of a previously submitted patch, but you
>   did not list below the --- line any changes from the previous version.
>   Please read the section entitled "The canonical patch format" in the
>   kernel file, Documentation/SubmittingPatches for what needs to be done
>   here to properly describe this.
> 
> If you wish to discuss this problem further, or you have questions about
> how to resolve this issue, please feel free to respond to this email and
> Greg will reply once he has dug out from the pending patches received
> from other developers.
> 
> thanks,
> 
> greg k-h's patch email bot

This is a completely new solution to the same problem.

I'm treating this as a brand new submission.
  
Lee Jones Nov. 17, 2022, 1:46 p.m. UTC | #4
On Thu, 17 Nov 2022, Greg KH wrote:

> On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> > +{
> > +	return !!kref_read(&hidg->cdev.kobj.kref);
> > +}
> 
> Ick, sorry, no, that's not going to work and is not allowed at all.
> That's some major layering violations there, AND it can change after you
> get the value as well.

This cdev belongs solely to this driver.  Hence the *.*.* and not
*->*->*.  What is preventing us from reading our own data?  If we
cannot do this directly, can I create an API to do it 'officially'?

I do, however, appreciate that a little locking wouldn't go amiss.

If this solution is not acceptable either, then we're left up the
creak without a paddle.  The rules you've communicated are not
compatible with each other.

Rule 1: Only one item in a data structure can reference count.

Due to the embedded cdev struct, this rules out my first solution of
giving f_hidg its own kref so that it can conduct its own life-time
management.

A potential option to satisfy this rule would be to remove the cdev
attribute and create its data dynamically instead.  However, the
staticness of cdev is used to obtain f_hidg (with container_of()) in
the character device handling component, so it cannot be removed.

Rule 2: Reading the present refcount causes a laying violation

So we're essentially saying, if data is already being reference
counted, you have to use the present implementation instead of adding
additional counts, but there is no way to actually do that, which
kind of puts us at stalemate.

Since this is a genuine issue, doing anything is not really an option
either.  So where do we go from here?
  
Alan Stern Nov. 17, 2022, 4:47 p.m. UTC | #5
On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote:
> On Thu, 17 Nov 2022, Greg KH wrote:
> 
> > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > > +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> > > +{
> > > +	return !!kref_read(&hidg->cdev.kobj.kref);
> > > +}
> > 
> > Ick, sorry, no, that's not going to work and is not allowed at all.
> > That's some major layering violations there, AND it can change after you
> > get the value as well.
> 
> This cdev belongs solely to this driver.  Hence the *.*.* and not
> *->*->*.  What is preventing us from reading our own data?  If we
> cannot do this directly, can I create an API to do it 'officially'?
> 
> I do, however, appreciate that a little locking wouldn't go amiss.
> 
> If this solution is not acceptable either, then we're left up the
> creak without a paddle.  The rules you've communicated are not
> compatible with each other.
> 
> Rule 1: Only one item in a data structure can reference count.
> 
> Due to the embedded cdev struct, this rules out my first solution of
> giving f_hidg its own kref so that it can conduct its own life-time
> management.
> 
> A potential option to satisfy this rule would be to remove the cdev
> attribute and create its data dynamically instead.  However, the
> staticness of cdev is used to obtain f_hidg (with container_of()) in
> the character device handling component, so it cannot be removed.

You have not understood this rule correctly.  Only one item in a data 
structure can hold a reference count _for that structure_.  But several 
items in a structure can hold reference counts for themselves.

So for example, you could put a kref in f_hidg which would hold the 
reference count for the f_hidg structure, while at the same time 
including an embedded cdev with its own reference counter.  The point is 
that the refcount in the embedded cdev refers to the lifetime of the 
cdev, not the lifetime of the f_hidg.

To make this work properly, you have to do two additional things:

	When the cdev's refcount is initialized, increment the kref
	in f_hidg.

	When the cdev's refcount drops to 0, decrement the kref (and
	release f_hidg if the kref hits 0).

Alan Stern

> Rule 2: Reading the present refcount causes a laying violation
> 
> So we're essentially saying, if data is already being reference
> counted, you have to use the present implementation instead of adding
> additional counts, but there is no way to actually do that, which
> kind of puts us at stalemate.
> 
> Since this is a genuine issue, doing anything is not really an option
> either.  So where do we go from here?
> 
> -- 
> Lee Jones [李琼斯]
  
Greg KH Nov. 17, 2022, 5:38 p.m. UTC | #6
On Thu, Nov 17, 2022 at 01:26:21PM +0000, Lee Jones wrote:
> On Thu, 17 Nov 2022, Greg KH wrote:
> 
> > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > > A reference to struct f_hidg is shared with this driver's associated
> > > character device handling component without provision for life-time
> > > handling.  In some circumstances, this can lead to unwanted
> > > behaviour depending on the order in which things are torn down.
> > > 
> > > Utilise, the reference counting functionality already provided by the
> > > implanted character device structure to ensure the struct f_hidg's
> > > memory is only freed once the character device handling has finished
> > > with it.
> > > 
> > > Signed-off-by: Lee Jones <lee@kernel.org>
> > > ---
> > >  drivers/usb/gadget/function/f_hid.c | 47 +++++++++++++++++++++++------
> > >  1 file changed, 37 insertions(+), 10 deletions(-)
> > 
> > Hi,
> > 
> > This is the friendly patch-bot of Greg Kroah-Hartman.  You have sent him
> > a patch that has triggered this response.  He used to manually respond
> > to these common problems, but in order to save his sanity (he kept
> > writing the same thing over and over, yet to different people), I was
> > created.  Hopefully you will not take offence and will fix the problem
> > in your patch and resubmit it so that it can be accepted into the Linux
> > kernel tree.
> > 
> > You are receiving this message because of the following common error(s)
> > as indicated below:
> > 
> > - This looks like a new version of a previously submitted patch, but you
> >   did not list below the --- line any changes from the previous version.
> >   Please read the section entitled "The canonical patch format" in the
> >   kernel file, Documentation/SubmittingPatches for what needs to be done
> >   here to properly describe this.
> > 
> > If you wish to discuss this problem further, or you have questions about
> > how to resolve this issue, please feel free to respond to this email and
> > Greg will reply once he has dug out from the pending patches received
> > from other developers.
> > 
> > thanks,
> > 
> > greg k-h's patch email bot
> 
> This is a completely new solution to the same problem.
> 
> I'm treating this as a brand new submission.

With the identical subject line, which plays havoc with tools :(

This is a v2, next should be v3 please.

thanks,

greg k-h
  
Lee Jones Nov. 18, 2022, 8:54 a.m. UTC | #7
On Thu, 17 Nov 2022, Alan Stern wrote:

> On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote:
> > On Thu, 17 Nov 2022, Greg KH wrote:
> > 
> > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> > > > +{
> > > > +	return !!kref_read(&hidg->cdev.kobj.kref);
> > > > +}
> > > 
> > > Ick, sorry, no, that's not going to work and is not allowed at all.
> > > That's some major layering violations there, AND it can change after you
> > > get the value as well.
> > 
> > This cdev belongs solely to this driver.  Hence the *.*.* and not
> > *->*->*.  What is preventing us from reading our own data?  If we
> > cannot do this directly, can I create an API to do it 'officially'?
> > 
> > I do, however, appreciate that a little locking wouldn't go amiss.
> > 
> > If this solution is not acceptable either, then we're left up the
> > creak without a paddle.  The rules you've communicated are not
> > compatible with each other.
> > 
> > Rule 1: Only one item in a data structure can reference count.
> > 
> > Due to the embedded cdev struct, this rules out my first solution of
> > giving f_hidg its own kref so that it can conduct its own life-time
> > management.
> > 
> > A potential option to satisfy this rule would be to remove the cdev
> > attribute and create its data dynamically instead.  However, the
> > staticness of cdev is used to obtain f_hidg (with container_of()) in
> > the character device handling component, so it cannot be removed.
> 
> You have not understood this rule correctly.  Only one item in a data 
> structure can hold a reference count _for that structure_.  But several 
> items in a structure can hold reference counts for themselves.

Here was the review comment I was working to on this patch [0]:

 "While at first glance, it seems that f_hidg is not reference
  counted, it really is, with the embedded "struct cdev" a few lines
  above this.

  That is the reference count that should control the lifecycle of
  this object, not another reference here in the "outer layer"
  structure."

> So for example, you could put a kref in f_hidg which would hold the 
> reference count for the f_hidg structure, while at the same time 
> including an embedded cdev with its own reference counter.  The point is 
> that the refcount in the embedded cdev refers to the lifetime of the 
> cdev, not the lifetime of the f_hidg.

This was the approach in the original submission [1], which during
review I was told was unacceptable for the aforementioned reason.

[0] https://lore.kernel.org/all/Y1PnoMvDmZMqXScw@kroah.com/
[1] https://lore.kernel.org/all/20221017112737.230772-1-lee@kernel.org/

> To make this work properly, you have to do two additional things:
> 
> 	When the cdev's refcount is initialized, increment the kref
> 	in f_hidg.
> 
> 	When the cdev's refcount drops to 0, decrement the kref (and
> 	release f_hidg if the kref hits 0).

More than happy to revisit the first solution with Greg's blessing.
  
Alan Stern Nov. 18, 2022, 3:59 p.m. UTC | #8
On Fri, Nov 18, 2022 at 08:54:53AM +0000, Lee Jones wrote:
> On Thu, 17 Nov 2022, Alan Stern wrote:
> 
> > On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote:
> > > On Thu, 17 Nov 2022, Greg KH wrote:
> > > 
> > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> > > > > +{
> > > > > +	return !!kref_read(&hidg->cdev.kobj.kref);
> > > > > +}
> > > > 
> > > > Ick, sorry, no, that's not going to work and is not allowed at all.
> > > > That's some major layering violations there, AND it can change after you
> > > > get the value as well.
> > > 
> > > This cdev belongs solely to this driver.  Hence the *.*.* and not
> > > *->*->*.  What is preventing us from reading our own data?  If we
> > > cannot do this directly, can I create an API to do it 'officially'?
> > > 
> > > I do, however, appreciate that a little locking wouldn't go amiss.
> > > 
> > > If this solution is not acceptable either, then we're left up the
> > > creak without a paddle.  The rules you've communicated are not
> > > compatible with each other.
> > > 
> > > Rule 1: Only one item in a data structure can reference count.
> > > 
> > > Due to the embedded cdev struct, this rules out my first solution of
> > > giving f_hidg its own kref so that it can conduct its own life-time
> > > management.
> > > 
> > > A potential option to satisfy this rule would be to remove the cdev
> > > attribute and create its data dynamically instead.  However, the
> > > staticness of cdev is used to obtain f_hidg (with container_of()) in
> > > the character device handling component, so it cannot be removed.
> > 
> > You have not understood this rule correctly.  Only one item in a data 
> > structure can hold a reference count _for that structure_.  But several 
> > items in a structure can hold reference counts for themselves.
> 
> Here was the review comment I was working to on this patch [0]:
> 
>  "While at first glance, it seems that f_hidg is not reference
>   counted, it really is, with the embedded "struct cdev" a few lines
>   above this.
> 
>   That is the reference count that should control the lifecycle of
>   this object, not another reference here in the "outer layer"
>   structure."

It's worth noting that the review comment goes on to say:

 "But, the cdev api is tricky and messy and not really set up to control
  the lifecycle of objects it is embedded in."

This is a good indication that a separate reference counter really is 
needed (in fact it almost contradicts what was written above).

> > So for example, you could put a kref in f_hidg which would hold the 
> > reference count for the f_hidg structure, while at the same time 
> > including an embedded cdev with its own reference counter.  The point is 
> > that the refcount in the embedded cdev refers to the lifetime of the 
> > cdev, not the lifetime of the f_hidg.
> 
> This was the approach in the original submission [1], which during
> review I was told was unacceptable for the aforementioned reason.
> 
> [0] https://lore.kernel.org/all/Y1PnoMvDmZMqXScw@kroah.com/
> [1] https://lore.kernel.org/all/20221017112737.230772-1-lee@kernel.org/
> 
> > To make this work properly, you have to do two additional things:
> > 
> > 	When the cdev's refcount is initialized, increment the kref
> > 	in f_hidg.
> > 
> > 	When the cdev's refcount drops to 0, decrement the kref (and
> > 	release f_hidg if the kref hits 0).
> 
> More than happy to revisit the first solution with Greg's blessing.

Okay, let's see what Greg thinks after he reads this discussion.

Alan Stern
  
John Keeping Nov. 18, 2022, 4:37 p.m. UTC | #9
On Fri, Nov 18, 2022 at 10:59:42AM -0500, Alan Stern wrote:
> On Fri, Nov 18, 2022 at 08:54:53AM +0000, Lee Jones wrote:
> > On Thu, 17 Nov 2022, Alan Stern wrote:
> > 
> > > On Thu, Nov 17, 2022 at 01:46:26PM +0000, Lee Jones wrote:
> > > > On Thu, 17 Nov 2022, Greg KH wrote:
> > > > 
> > > > > On Thu, Nov 17, 2022 at 12:08:13PM +0000, Lee Jones wrote:
> > > > > > +static inline bool f_hidg_is_open(struct f_hidg *hidg)
> > > > > > +{
> > > > > > +	return !!kref_read(&hidg->cdev.kobj.kref);
> > > > > > +}
> > > > > 
> > > > > Ick, sorry, no, that's not going to work and is not allowed at all.
> > > > > That's some major layering violations there, AND it can change after you
> > > > > get the value as well.
> > > > 
> > > > This cdev belongs solely to this driver.  Hence the *.*.* and not
> > > > *->*->*.  What is preventing us from reading our own data?  If we
> > > > cannot do this directly, can I create an API to do it 'officially'?
> > > > 
> > > > I do, however, appreciate that a little locking wouldn't go amiss.
> > > > 
> > > > If this solution is not acceptable either, then we're left up the
> > > > creak without a paddle.  The rules you've communicated are not
> > > > compatible with each other.
> > > > 
> > > > Rule 1: Only one item in a data structure can reference count.
> > > > 
> > > > Due to the embedded cdev struct, this rules out my first solution of
> > > > giving f_hidg its own kref so that it can conduct its own life-time
> > > > management.
> > > > 
> > > > A potential option to satisfy this rule would be to remove the cdev
> > > > attribute and create its data dynamically instead.  However, the
> > > > staticness of cdev is used to obtain f_hidg (with container_of()) in
> > > > the character device handling component, so it cannot be removed.
> > > 
> > > You have not understood this rule correctly.  Only one item in a data 
> > > structure can hold a reference count _for that structure_.  But several 
> > > items in a structure can hold reference counts for themselves.
> > 
> > Here was the review comment I was working to on this patch [0]:
> > 
> >  "While at first glance, it seems that f_hidg is not reference
> >   counted, it really is, with the embedded "struct cdev" a few lines
> >   above this.
> > 
> >   That is the reference count that should control the lifecycle of
> >   this object, not another reference here in the "outer layer"
> >   structure."
> 
> It's worth noting that the review comment goes on to say:
> 
>  "But, the cdev api is tricky and messy and not really set up to control
>   the lifecycle of objects it is embedded in."
> 
> This is a good indication that a separate reference counter really is 
> needed (in fact it almost contradicts what was written above).

I don't think it's at all simple to fix this - I posted a series
addressing the lifetime issues here a few years ago but didn't chase it
up and there was no feedback:

	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/

That includes a patch to remove the embedded struct cdev and manage its
lifetime separately, which I think is needed as there are two different
struct device objects here and we cannot tie their lifetimes together.
  
Alan Stern Nov. 18, 2022, 9:07 p.m. UTC | #10
On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote:
> I don't think it's at all simple to fix this - I posted a series
> addressing the lifetime issues here a few years ago but didn't chase it
> up and there was no feedback:
> 
> 	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/
> 
> That includes a patch to remove the embedded struct cdev and manage its
> lifetime separately, which I think is needed as there are two different
> struct device objects here and we cannot tie their lifetimes together.

I still don't have a clear picture of what the real problem is.  Lee's 
original patch description just said "external references are presently 
not tracked", with no details about what those external references are. 
Why not add just proper cdev_get() and cdev_put() calls to whatever code 
handles those external references, so that they _are_ tracked?

What are the two different struct device objects?  Why do their 
lifetimes need to be tied together?  If you do need to tie their 
lifetimes somehow, why not simply make one of them (the one which is 
logically allowed to be shorter-lived) hold a reference to the other?

Alan Stern
  
John Keeping Nov. 20, 2022, 5:22 p.m. UTC | #11
On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote:
> On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote:
> > I don't think it's at all simple to fix this - I posted a series
> > addressing the lifetime issues here a few years ago but didn't chase it
> > up and there was no feedback:
> > 
> > 	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/
> > 
> > That includes a patch to remove the embedded struct cdev and manage its
> > lifetime separately, which I think is needed as there are two different
> > struct device objects here and we cannot tie their lifetimes together.
> 
> I still don't have a clear picture of what the real problem is.  Lee's 
> original patch description just said "external references are presently 
> not tracked", with no details about what those external references are. 
> Why not add just proper cdev_get() and cdev_put() calls to whatever code 
> handles those external references, so that they _are_ tracked?
> 
> What are the two different struct device objects?  Why do their 
> lifetimes need to be tied together?  If you do need to tie their 
> lifetimes somehow, why not simply make one of them (the one which is 
> logically allowed to be shorter-lived) hold a reference to the other?

The problem is that we have a struct cdev embedded in f_hidg but the
lifetime of f_hidg is not tied to any kobject so we can't solve this in
the right way by setting the parent kobject of the cdev.

While refcounting struct f_hidg is necessary, it's not sufficient
because the only way to keep it alive long enough for the final
kobject_put() on the embedded cdev is to tie the lifetime to a kobject
of its own and there is no suitable object as this is not the model
followed by gadget function instances.

To show the problem (using libusbgx's example commands for conciseness,
but obviously this can be done via configfs directly):

	$ gadget-hid
	$ exec 3<> /dev/hidg0
	$ gadget-vid-pid-remove
	$ exec 3<&-

	==================================================================
	BUG: KASAN: use-after-free in kobject_put+0x24/0x250
	Read of size 1 at addr c61784a0 by task sh/264

	CPU: 1 PID: 264 Comm: sh Tainted: G        W          6.0.5 #1
	Hardware name: Rockchip (Device Tree)
	 unwind_backtrace from show_stack+0x10/0x14
	 show_stack from dump_stack_lvl+0x58/0x70
	 dump_stack_lvl from print_report+0x58/0x4dc
	 print_report from kasan_report+0x88/0xa8
	 kasan_report from kobject_put+0x24/0x250
	 kobject_put from __fput+0x1e4/0x358
	 __fput from task_work_run+0xb4/0xc4
	 task_work_run from do_work_pending+0x4d4/0x524
	 do_work_pending from slow_work_pending+0xc/0x20
	Exception stack(0xf284bfb0 to 0xf284bff8)
	bfa0:                                     00000000 00000003 02292190 00000000
	bfc0: 02293fc8 aed4e1d0 00000001 00000006 022925a0 00000000 022925a0 022923f4
	bfe0: 00532f38 b6864840 0049c218 aec57e38 60000010 0000000b

	Allocated by task 341:
	 kasan_set_track+0x20/0x28
	 ____kasan_kmalloc+0x80/0x88
	 hidg_alloc+0x24/0x1f0
	 usb_get_function+0x28/0x48
	 config_usb_cfg_link+0x90/0x114
	 configfs_symlink+0x24c/0x5d8
	 vfs_symlink+0x58/0x74
	 do_symlinkat+0xdc/0x16c
	 ret_fast_syscall+0x0/0x1c

	Freed by task 352:
	 kasan_set_track+0x20/0x28
	 kasan_set_free_info+0x28/0x34
	 ____kasan_slab_free+0xf8/0x108
	 __kasan_slab_free+0x10/0x18
	 slab_free_freelist_hook+0x9c/0xfc
	 kfree+0x118/0x258
	 hidg_free+0x44/0x6c
	 config_usb_cfg_unlink+0xd4/0xf4
	 configfs_unlink+0x118/0x15c
	 vfs_unlink+0xd8/0x1c4
	 do_unlinkat+0x180/0x28c
	 ret_fast_syscall+0x0/0x1c

	The buggy address belongs to the object at c6178400
	 which belongs to the cache kmalloc-512 of size 512
	The buggy address is located 160 bytes inside of
	 512-byte region [c6178400, c6178600)
  
Alan Stern Nov. 20, 2022, 8:46 p.m. UTC | #12
On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote:
> On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote:
> > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote:
> > > I don't think it's at all simple to fix this - I posted a series
> > > addressing the lifetime issues here a few years ago but didn't chase it
> > > up and there was no feedback:
> > > 
> > > 	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/
> > > 
> > > That includes a patch to remove the embedded struct cdev and manage its
> > > lifetime separately, which I think is needed as there are two different
> > > struct device objects here and we cannot tie their lifetimes together.
> > 
> > I still don't have a clear picture of what the real problem is.  Lee's 
> > original patch description just said "external references are presently 
> > not tracked", with no details about what those external references are. 
> > Why not add just proper cdev_get() and cdev_put() calls to whatever code 
> > handles those external references, so that they _are_ tracked?
> > 
> > What are the two different struct device objects?  Why do their 
> > lifetimes need to be tied together?  If you do need to tie their 
> > lifetimes somehow, why not simply make one of them (the one which is 
> > logically allowed to be shorter-lived) hold a reference to the other?
> 
> The problem is that we have a struct cdev embedded in f_hidg but the
> lifetime of f_hidg is not tied to any kobject so we can't solve this in
> the right way by setting the parent kobject of the cdev.
> 
> While refcounting struct f_hidg is necessary, it's not sufficient
> because the only way to keep it alive long enough for the final
> kobject_put() on the embedded cdev is to tie the lifetime to a kobject
> of its own and there is no suitable object as this is not the model
> followed by gadget function instances.

I see.  The solution is simple: Embed a struct device in struct f_hidg, 
and call cdev_device_add() to add the device and the cdev.  This will 
automatically make the device the parent of the cdev, so the device's 
refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
the f_hidg's lifetime to the device's, so the device's release routine 
can safely deallocate the entire f_hidg structure.

The parent of the new struct device should be set to &gadget->dev.  If 
you can't think of a better name for the device, you could simply append 
":I" to the parent's name, where I is the interface number, or even 
append ":C.I" where C is the config number (like we do on the host 
side).

Alan Stern
  
Lee Jones Nov. 21, 2022, 12:30 p.m. UTC | #13
On Sun, 20 Nov 2022, Alan Stern wrote:

> On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote:
> > On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote:
> > > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote:
> > > > I don't think it's at all simple to fix this - I posted a series
> > > > addressing the lifetime issues here a few years ago but didn't chase it
> > > > up and there was no feedback:
> > > > 
> > > > 	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/
> > > > 
> > > > That includes a patch to remove the embedded struct cdev and manage its
> > > > lifetime separately, which I think is needed as there are two different
> > > > struct device objects here and we cannot tie their lifetimes together.
> > > 
> > > I still don't have a clear picture of what the real problem is.  Lee's 
> > > original patch description just said "external references are presently 
> > > not tracked", with no details about what those external references are. 
> > > Why not add just proper cdev_get() and cdev_put() calls to whatever code 
> > > handles those external references, so that they _are_ tracked?
> > > 
> > > What are the two different struct device objects?  Why do their 
> > > lifetimes need to be tied together?  If you do need to tie their 
> > > lifetimes somehow, why not simply make one of them (the one which is 
> > > logically allowed to be shorter-lived) hold a reference to the other?
> > 
> > The problem is that we have a struct cdev embedded in f_hidg but the
> > lifetime of f_hidg is not tied to any kobject so we can't solve this in
> > the right way by setting the parent kobject of the cdev.
> > 
> > While refcounting struct f_hidg is necessary, it's not sufficient
> > because the only way to keep it alive long enough for the final
> > kobject_put() on the embedded cdev is to tie the lifetime to a kobject
> > of its own and there is no suitable object as this is not the model
> > followed by gadget function instances.
> 
> I see.  The solution is simple: Embed a struct device in struct f_hidg, 
> and call cdev_device_add() to add the device and the cdev.  This will 
> automatically make the device the parent of the cdev, so the device's 
> refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
> the f_hidg's lifetime to the device's, so the device's release routine 
> can safely deallocate the entire f_hidg structure.
> 
> The parent of the new struct device should be set to &gadget->dev.  If 
> you can't think of a better name for the device, you could simply append 
> ":I" to the parent's name, where I is the interface number, or even 
> append ":C.I" where C is the config number (like we do on the host 
> side).

Thanks for the suggestions Alan.

Not long finished speaking with Greg about this.  He seems okay with
the approach.  I'll knock something together and get a "v1" (*wink*)
out shortly.
  
John Keeping Nov. 21, 2022, 12:38 p.m. UTC | #14
On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote:
> On Sun, Nov 20, 2022 at 05:22:19PM +0000, John Keeping wrote:
> > On Fri, Nov 18, 2022 at 04:07:24PM -0500, Alan Stern wrote:
> > > On Fri, Nov 18, 2022 at 04:37:32PM +0000, John Keeping wrote:
> > > > I don't think it's at all simple to fix this - I posted a series
> > > > addressing the lifetime issues here a few years ago but didn't chase it
> > > > up and there was no feedback:
> > > > 
> > > > 	https://lore.kernel.org/linux-usb/20191028114228.3679219-1-john@metanate.com/
> > > > 
> > > > That includes a patch to remove the embedded struct cdev and manage its
> > > > lifetime separately, which I think is needed as there are two different
> > > > struct device objects here and we cannot tie their lifetimes together.
> > > 
> > > I still don't have a clear picture of what the real problem is.  Lee's 
> > > original patch description just said "external references are presently 
> > > not tracked", with no details about what those external references are. 
> > > Why not add just proper cdev_get() and cdev_put() calls to whatever code 
> > > handles those external references, so that they _are_ tracked?
> > > 
> > > What are the two different struct device objects?  Why do their 
> > > lifetimes need to be tied together?  If you do need to tie their 
> > > lifetimes somehow, why not simply make one of them (the one which is 
> > > logically allowed to be shorter-lived) hold a reference to the other?
> > 
> > The problem is that we have a struct cdev embedded in f_hidg but the
> > lifetime of f_hidg is not tied to any kobject so we can't solve this in
> > the right way by setting the parent kobject of the cdev.
> > 
> > While refcounting struct f_hidg is necessary, it's not sufficient
> > because the only way to keep it alive long enough for the final
> > kobject_put() on the embedded cdev is to tie the lifetime to a kobject
> > of its own and there is no suitable object as this is not the model
> > followed by gadget function instances.
> 
> I see.  The solution is simple: Embed a struct device in struct f_hidg, 
> and call cdev_device_add() to add the device and the cdev.  This will 
> automatically make the device the parent of the cdev, so the device's 
> refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
> the f_hidg's lifetime to the device's, so the device's release routine 
> can safely deallocate the entire f_hidg structure.
> 
> The parent of the new struct device should be set to &gadget->dev.  If 
> you can't think of a better name for the device, you could simply append 
> ":I" to the parent's name, where I is the interface number, or even 
> append ":C.I" where C is the config number (like we do on the host 
> side).

There is no gadget->dev at the time struct f_hidg is allocated.

AFAICT the device is the UDC, which is only associated with the gadget
when it is bound.  The functions are allocated earlier than this and I
can't see any device associated with struct usb_function_instance.

The patch below does fix the issue, but I'm wondering if there's a
deeper problem here that can only be properly solved by adding some
device/kobject hierarchy to the config side of things.

-- >8 --
Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev

The embedded struct cdev does not have its lifetime correctly tied to
the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
is held open while the gadget is deleted.

This can readily be replicated with libusbgx's example programs (for
conciseness - operating directly via configfs is equivalent):

	gadget-hid
	exec 3<> /dev/hidg0
	gadget-vid-pid-remove
	exec 3<&-

Add a device to f_hidg so that the cdev can properly take a reference to
this and the structure will be freed only when the child device is
released.

[Also fix refcount leak on an error path.]

Signed-off-by: John Keeping <john@metanate.com>
---
 drivers/usb/gadget/function/f_hid.c | 35 ++++++++++++++++++++++++-----
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
index ca0a7d9eaa34..7ae97e5c4d20 100644
--- a/drivers/usb/gadget/function/f_hid.c
+++ b/drivers/usb/gadget/function/f_hid.c
@@ -71,6 +71,7 @@ struct f_hidg {
 	wait_queue_head_t		write_queue;
 	struct usb_request		*req;
 
+	struct device			dev;
 	int				minor;
 	struct cdev			cdev;
 	struct usb_function		func;
@@ -84,6 +85,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f)
 	return container_of(f, struct f_hidg, func);
 }
 
+static void hidg_release(struct device *dev)
+{
+	struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);
+
+	kfree(hidg->set_report_buf);
+	kfree(hidg);
+}
+
 /*-------------------------------------------------------------------------*/
 /*                           Static descriptors                            */
 
@@ -999,6 +1008,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
 
 	/* create char device */
 	cdev_init(&hidg->cdev, &f_hidg_fops);
+	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);
 	dev = MKDEV(major, hidg->minor);
 	status = cdev_add(&hidg->cdev, dev, 1);
 	if (status)
@@ -1244,9 +1254,7 @@ static void hidg_free(struct usb_function *f)
 
 	hidg = func_to_hidg(f);
 	opts = container_of(f->fi, struct f_hid_opts, func_inst);
-	kfree(hidg->report_desc);
-	kfree(hidg->set_report_buf);
-	kfree(hidg);
+	device_unregister(&hidg->dev);
 	mutex_lock(&opts->lock);
 	--opts->refcnt;
 	mutex_unlock(&opts->lock);
@@ -1266,6 +1274,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
 {
 	struct f_hidg *hidg;
 	struct f_hid_opts *opts;
+	int ret;
 
 	/* allocate and initialize one new instance */
 	hidg = kzalloc(sizeof(*hidg), GFP_KERNEL);
@@ -1277,17 +1286,27 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
 	mutex_lock(&opts->lock);
 	++opts->refcnt;
 
+	device_initialize(&hidg->dev);
+	hidg->dev.release = hidg_release;
+	ret = dev_set_name(&hidg->dev, "hidg%d", hidg->minor);
+	if (ret) {
+		--opts->refcnt;
+		mutex_unlock(&opts->lock);
+		return ERR_PTR(ret);
+	}
+
 	hidg->minor = opts->minor;
 	hidg->bInterfaceSubClass = opts->subclass;
 	hidg->bInterfaceProtocol = opts->protocol;
 	hidg->report_length = opts->report_length;
 	hidg->report_desc_length = opts->report_desc_length;
 	if (opts->report_desc) {
-		hidg->report_desc = kmemdup(opts->report_desc,
+		hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,
 					    opts->report_desc_length,
 					    GFP_KERNEL);
 		if (!hidg->report_desc) {
-			kfree(hidg);
+			put_device(&hidg->dev);
+			--opts->refcnt;
 			mutex_unlock(&opts->lock);
 			return ERR_PTR(-ENOMEM);
 		}
@@ -1307,6 +1326,12 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
 	/* this could be made configurable at some point */
 	hidg->qlen	   = 4;
 
+	ret = device_add(&hidg->dev);
+	if (ret) {
+		put_device(&hidg->dev);
+		return ERR_PTR(ret);
+	}
+
 	return &hidg->func;
 }
  
Alan Stern Nov. 21, 2022, 4:18 p.m. UTC | #15
On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote:
> On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote:
> > I see.  The solution is simple: Embed a struct device in struct f_hidg, 
> > and call cdev_device_add() to add the device and the cdev.  This will 
> > automatically make the device the parent of the cdev, so the device's 
> > refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
> > the f_hidg's lifetime to the device's, so the device's release routine 
> > can safely deallocate the entire f_hidg structure.
> > 
> > The parent of the new struct device should be set to &gadget->dev.  If 
> > you can't think of a better name for the device, you could simply append 
> > ":I" to the parent's name, where I is the interface number, or even 
> > append ":C.I" where C is the config number (like we do on the host 
> > side).
> 
> There is no gadget->dev at the time struct f_hidg is allocated.
> 
> AFAICT the device is the UDC, which is only associated with the gadget
> when it is bound.  The functions are allocated earlier than this and I
> can't see any device associated with struct usb_function_instance.

Ah, that's a shame.  All right, so be it.

> The patch below does fix the issue, but I'm wondering if there's a
> deeper problem here that can only be properly solved by adding some
> device/kobject hierarchy to the config side of things.

I don't believe there's any deeper problem.  If someone wants to have an 
fhidg device as a child of the gadget, I think it could be added and 
removed in the ->set_alt() and ->disable() callbacks.  Or maybe the 
->bind() and ->unbind() callbacks (I've never had to work with configfs 
so I'm not clear on the details).

> -- >8 --
> Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev
> 
> The embedded struct cdev does not have its lifetime correctly tied to
> the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
> is held open while the gadget is deleted.
> 
> This can readily be replicated with libusbgx's example programs (for
> conciseness - operating directly via configfs is equivalent):
> 
> 	gadget-hid
> 	exec 3<> /dev/hidg0
> 	gadget-vid-pid-remove
> 	exec 3<&-
> 
> Add a device to f_hidg so that the cdev can properly take a reference to
> this and the structure will be freed only when the child device is
> released.
> 
> [Also fix refcount leak on an error path.]
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  drivers/usb/gadget/function/f_hid.c | 35 ++++++++++++++++++++++++-----
>  1 file changed, 30 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
> index ca0a7d9eaa34..7ae97e5c4d20 100644
> --- a/drivers/usb/gadget/function/f_hid.c
> +++ b/drivers/usb/gadget/function/f_hid.c
> @@ -71,6 +71,7 @@ struct f_hidg {
>  	wait_queue_head_t		write_queue;
>  	struct usb_request		*req;
>  
> +	struct device			dev;
>  	int				minor;
>  	struct cdev			cdev;
>  	struct usb_function		func;
> @@ -84,6 +85,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f)
>  	return container_of(f, struct f_hidg, func);
>  }
>  
> +static void hidg_release(struct device *dev)
> +{
> +	struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);
> +
> +	kfree(hidg->set_report_buf);
> +	kfree(hidg);
> +}
> +
>  /*-------------------------------------------------------------------------*/
>  /*                           Static descriptors                            */
>  
> @@ -999,6 +1008,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
>  
>  	/* create char device */
>  	cdev_init(&hidg->cdev, &f_hidg_fops);
> +	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);
>  	dev = MKDEV(major, hidg->minor);
>  	status = cdev_add(&hidg->cdev, dev, 1);

Instead of doing it by hand like this, why not call cdev_device_add()?

>  	if (status)
> @@ -1244,9 +1254,7 @@ static void hidg_free(struct usb_function *f)
>  
>  	hidg = func_to_hidg(f);
>  	opts = container_of(f->fi, struct f_hid_opts, func_inst);
> -	kfree(hidg->report_desc);
> -	kfree(hidg->set_report_buf);
> -	kfree(hidg);
> +	device_unregister(&hidg->dev);

Are you certain at this point that hidg->dev has been registered?  Even 
on all the error paths?

>  	mutex_lock(&opts->lock);
>  	--opts->refcnt;
>  	mutex_unlock(&opts->lock);
> @@ -1266,6 +1274,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  {
>  	struct f_hidg *hidg;
>  	struct f_hid_opts *opts;
> +	int ret;
>  
>  	/* allocate and initialize one new instance */
>  	hidg = kzalloc(sizeof(*hidg), GFP_KERNEL);
> @@ -1277,17 +1286,27 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  	mutex_lock(&opts->lock);
>  	++opts->refcnt;
>  
> +	device_initialize(&hidg->dev);
> +	hidg->dev.release = hidg_release;
> +	ret = dev_set_name(&hidg->dev, "hidg%d", hidg->minor);
> +	if (ret) {
> +		--opts->refcnt;
> +		mutex_unlock(&opts->lock);
> +		return ERR_PTR(ret);
> +	}
> +
>  	hidg->minor = opts->minor;
>  	hidg->bInterfaceSubClass = opts->subclass;
>  	hidg->bInterfaceProtocol = opts->protocol;
>  	hidg->report_length = opts->report_length;
>  	hidg->report_desc_length = opts->report_desc_length;
>  	if (opts->report_desc) {
> -		hidg->report_desc = kmemdup(opts->report_desc,
> +		hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,
>  					    opts->report_desc_length,
>  					    GFP_KERNEL);
>  		if (!hidg->report_desc) {
> -			kfree(hidg);
> +			put_device(&hidg->dev);
> +			--opts->refcnt;
>  			mutex_unlock(&opts->lock);
>  			return ERR_PTR(-ENOMEM);
>  		}
> @@ -1307,6 +1326,12 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  	/* this could be made configurable at some point */
>  	hidg->qlen	   = 4;
>  
> +	ret = device_add(&hidg->dev);
> +	if (ret) {
> +		put_device(&hidg->dev);
> +		return ERR_PTR(ret);
> +	}

Why do this here instead of when the cdev is registered?  Or to put it 
another way, why not add the two of them at the same time by calling 
cdev_device_add() rather than adding them at two separate places?

Alan Stern

> +
>  	return &hidg->func;
>  }
>  
> -- 
> 2.38.1
>
  
John Keeping Nov. 21, 2022, 6:54 p.m. UTC | #16
On Mon, Nov 21, 2022 at 11:18:34AM -0500, Alan Stern wrote:
> On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote:
> > On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote:
> > > I see.  The solution is simple: Embed a struct device in struct f_hidg, 
> > > and call cdev_device_add() to add the device and the cdev.  This will 
> > > automatically make the device the parent of the cdev, so the device's 
> > > refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
> > > the f_hidg's lifetime to the device's, so the device's release routine 
> > > can safely deallocate the entire f_hidg structure.
> > > 
> > > The parent of the new struct device should be set to &gadget->dev.  If 
> > > you can't think of a better name for the device, you could simply append 
> > > ":I" to the parent's name, where I is the interface number, or even 
> > > append ":C.I" where C is the config number (like we do on the host 
> > > side).
> > 
> > There is no gadget->dev at the time struct f_hidg is allocated.
> > 
> > AFAICT the device is the UDC, which is only associated with the gadget
> > when it is bound.  The functions are allocated earlier than this and I
> > can't see any device associated with struct usb_function_instance.
> 
> Ah, that's a shame.  All right, so be it.
> 
> > The patch below does fix the issue, but I'm wondering if there's a
> > deeper problem here that can only be properly solved by adding some
> > device/kobject hierarchy to the config side of things.
> 
> I don't believe there's any deeper problem.  If someone wants to have an 
> fhidg device as a child of the gadget, I think it could be added and 
> removed in the ->set_alt() and ->disable() callbacks.  Or maybe the 
> ->bind() and ->unbind() callbacks (I've never had to work with configfs 
> so I'm not clear on the details).

It turns out there's already a device being created here, just not
associated with the structure.  Your suggestions around
cdev_device_add() made me spot what's going on with that so the actual
fix is to pull its lifetime up to match struct f_hidg.

-- >8 --
Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev

The embedded struct cdev does not have its lifetime correctly tied to
the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
is held open while the gadget is deleted.

This can readily be replicated with libusbgx's example programs (for
conciseness - operating directly via configfs is equivalent):

	gadget-hid
	exec 3<> /dev/hidg0
	gadget-vid-pid-remove
	exec 3<&-

Pull the existing device up in to struct f_hidg and make use of the
cdev_device_{add,del}() helpers.  This changes the lifetime of the
device object to match struct f_hidg, but note that it is still added
and deleted at the same time.

[Also fix refcount leak on an error path.]

Signed-off-by: John Keeping <john@metanate.com>
---
 drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++-------------
 1 file changed, 28 insertions(+), 22 deletions(-)

diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
index ca0a7d9eaa34..0b94668a3812 100644
--- a/drivers/usb/gadget/function/f_hid.c
+++ b/drivers/usb/gadget/function/f_hid.c
@@ -71,7 +71,7 @@ struct f_hidg {
 	wait_queue_head_t		write_queue;
 	struct usb_request		*req;
 
-	int				minor;
+	struct device			dev;
 	struct cdev			cdev;
 	struct usb_function		func;
 
@@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f)
 	return container_of(f, struct f_hidg, func);
 }
 
+static void hidg_release(struct device *dev)
+{
+	struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);
+
+	kfree(hidg->set_report_buf);
+	kfree(hidg);
+}
+
 /*-------------------------------------------------------------------------*/
 /*                           Static descriptors                            */
 
@@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
 	struct usb_ep		*ep;
 	struct f_hidg		*hidg = func_to_hidg(f);
 	struct usb_string	*us;
-	struct device		*device;
 	int			status;
-	dev_t			dev;
 
 	/* maybe allocate device-global string IDs, and patch descriptors */
 	us = usb_gstrings_attach(c->cdev, ct_func_strings,
@@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
 
 	/* create char device */
 	cdev_init(&hidg->cdev, &f_hidg_fops);
-	dev = MKDEV(major, hidg->minor);
-	status = cdev_add(&hidg->cdev, dev, 1);
+	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);
+	status = cdev_device_add(&hidg->cdev, &hidg->dev);
 	if (status)
 		goto fail_free_descs;
 
-	device = device_create(hidg_class, NULL, dev, NULL,
-			       "%s%d", "hidg", hidg->minor);
-	if (IS_ERR(device)) {
-		status = PTR_ERR(device);
-		goto del;
-	}
-
 	return 0;
-del:
-	cdev_del(&hidg->cdev);
 fail_free_descs:
 	usb_free_all_descriptors(f);
 fail:
@@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f)
 
 	hidg = func_to_hidg(f);
 	opts = container_of(f->fi, struct f_hid_opts, func_inst);
-	kfree(hidg->report_desc);
-	kfree(hidg->set_report_buf);
-	kfree(hidg);
+	put_device(&hidg->dev);
 	mutex_lock(&opts->lock);
 	--opts->refcnt;
 	mutex_unlock(&opts->lock);
@@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f)
 {
 	struct f_hidg *hidg = func_to_hidg(f);
 
-	device_destroy(hidg_class, MKDEV(major, hidg->minor));
-	cdev_del(&hidg->cdev);
+	cdev_device_del(&hidg->cdev);
 
 	usb_free_all_descriptors(f);
 }
@@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
 {
 	struct f_hidg *hidg;
 	struct f_hid_opts *opts;
+	int ret;
 
 	/* allocate and initialize one new instance */
 	hidg = kzalloc(sizeof(*hidg), GFP_KERNEL);
@@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
 	mutex_lock(&opts->lock);
 	++opts->refcnt;
 
-	hidg->minor = opts->minor;
+	device_initialize(&hidg->dev);
+	hidg->dev.release = hidg_release;
+	hidg->dev.class = hidg_class;
+	hidg->dev.devt = MKDEV(major, opts->minor);
+	ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor);
+	if (ret) {
+		--opts->refcnt;
+		mutex_unlock(&opts->lock);
+		return ERR_PTR(ret);
+	}
+
 	hidg->bInterfaceSubClass = opts->subclass;
 	hidg->bInterfaceProtocol = opts->protocol;
 	hidg->report_length = opts->report_length;
 	hidg->report_desc_length = opts->report_desc_length;
 	if (opts->report_desc) {
-		hidg->report_desc = kmemdup(opts->report_desc,
+		hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,
 					    opts->report_desc_length,
 					    GFP_KERNEL);
 		if (!hidg->report_desc) {
-			kfree(hidg);
+			put_device(&hidg->dev);
+			--opts->refcnt;
 			mutex_unlock(&opts->lock);
 			return ERR_PTR(-ENOMEM);
 		}
  
Alan Stern Nov. 21, 2022, 7:17 p.m. UTC | #17
On Mon, Nov 21, 2022 at 06:54:55PM +0000, John Keeping wrote:
> It turns out there's already a device being created here, just not
> associated with the structure.  Your suggestions around
> cdev_device_add() made me spot what's going on with that so the actual
> fix is to pull its lifetime up to match struct f_hidg.

It's not obvious from the patch what device was already being created, 
but never mind...

> -- >8 --
> Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev
> 
> The embedded struct cdev does not have its lifetime correctly tied to
> the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
> is held open while the gadget is deleted.
> 
> This can readily be replicated with libusbgx's example programs (for
> conciseness - operating directly via configfs is equivalent):
> 
> 	gadget-hid
> 	exec 3<> /dev/hidg0
> 	gadget-vid-pid-remove
> 	exec 3<&-
> 
> Pull the existing device up in to struct f_hidg and make use of the
> cdev_device_{add,del}() helpers.  This changes the lifetime of the
> device object to match struct f_hidg, but note that it is still added
> and deleted at the same time.
> 
> [Also fix refcount leak on an error path.]
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++-------------
>  1 file changed, 28 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
> index ca0a7d9eaa34..0b94668a3812 100644
> --- a/drivers/usb/gadget/function/f_hid.c
> +++ b/drivers/usb/gadget/function/f_hid.c

> @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
>  
>  	/* create char device */
>  	cdev_init(&hidg->cdev, &f_hidg_fops);
> -	dev = MKDEV(major, hidg->minor);
> -	status = cdev_add(&hidg->cdev, dev, 1);
> +	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);

This line isn't needed; cdev_device_add() does it for you because 
hidg->dev.devt has been set.

> +	status = cdev_device_add(&hidg->cdev, &hidg->dev);
>  	if (status)
>  		goto fail_free_descs;

> @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  	mutex_lock(&opts->lock);
>  	++opts->refcnt;
>  
> -	hidg->minor = opts->minor;
> +	device_initialize(&hidg->dev);
> +	hidg->dev.release = hidg_release;
> +	hidg->dev.class = hidg_class;
> +	hidg->dev.devt = MKDEV(major, opts->minor);
> +	ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor);
> +	if (ret) {
> +		--opts->refcnt;
> +		mutex_unlock(&opts->lock);
> +		return ERR_PTR(ret);
> +	}
> +

Otherwise this looks okay (although I don't know any of the details of 
how fhidg works, so you shouldn't take my word for it).

Alan Stern
  
Lee Jones Nov. 22, 2022, 8:31 a.m. UTC | #18
On Mon, 21 Nov 2022, John Keeping wrote:
65;6999;1c
> On Mon, Nov 21, 2022 at 11:18:34AM -0500, Alan Stern wrote:
> > On Mon, Nov 21, 2022 at 12:38:37PM +0000, John Keeping wrote:
> > > On Sun, Nov 20, 2022 at 03:46:26PM -0500, Alan Stern wrote:
> > > > I see.  The solution is simple: Embed a struct device in struct f_hidg, 
> > > > and call cdev_device_add() to add the device and the cdev.  This will 
> > > > automatically make the device the parent of the cdev, so the device's 
> > > > refcount won't go to 0 until the cdev's refcount does.  Then you can tie 
> > > > the f_hidg's lifetime to the device's, so the device's release routine 
> > > > can safely deallocate the entire f_hidg structure.
> > > > 
> > > > The parent of the new struct device should be set to &gadget->dev.  If 
> > > > you can't think of a better name for the device, you could simply append 
> > > > ":I" to the parent's name, where I is the interface number, or even 
> > > > append ":C.I" where C is the config number (like we do on the host 
> > > > side).
> > > 
> > > There is no gadget->dev at the time struct f_hidg is allocated.
> > > 
> > > AFAICT the device is the UDC, which is only associated with the gadget
> > > when it is bound.  The functions are allocated earlier than this and I
> > > can't see any device associated with struct usb_function_instance.
> > 
> > Ah, that's a shame.  All right, so be it.
> > 
> > > The patch below does fix the issue, but I'm wondering if there's a
> > > deeper problem here that can only be properly solved by adding some
> > > device/kobject hierarchy to the config side of things.
> > 
> > I don't believe there's any deeper problem.  If someone wants to have an 
> > fhidg device as a child of the gadget, I think it could be added and 
> > removed in the ->set_alt() and ->disable() callbacks.  Or maybe the 
> > ->bind() and ->unbind() callbacks (I've never had to work with configfs 
> > so I'm not clear on the details).
> 
> It turns out there's already a device being created here, just not
> associated with the structure.  Your suggestions around
> cdev_device_add() made me spot what's going on with that so the actual
> fix is to pull its lifetime up to match struct f_hidg.
> 
> -- >8 --
> Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev
> 
> The embedded struct cdev does not have its lifetime correctly tied to
> the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
> is held open while the gadget is deleted.
> 
> This can readily be replicated with libusbgx's example programs (for
> conciseness - operating directly via configfs is equivalent):
> 
> 	gadget-hid
> 	exec 3<> /dev/hidg0
> 	gadget-vid-pid-remove
> 	exec 3<&-
> 
> Pull the existing device up in to struct f_hidg and make use of the
> cdev_device_{add,del}() helpers.  This changes the lifetime of the
> device object to match struct f_hidg, but note that it is still added
> and deleted at the same time.

This is much better, thanks for re-spinning.

> [Also fix refcount leak on an error path.]
> 
> Signed-off-by: John Keeping <john@metanate.com>
> ---
>  drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++-------------
>  1 file changed, 28 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
> index ca0a7d9eaa34..0b94668a3812 100644
> --- a/drivers/usb/gadget/function/f_hid.c
> +++ b/drivers/usb/gadget/function/f_hid.c
> @@ -71,7 +71,7 @@ struct f_hidg {
>  	wait_queue_head_t		write_queue;
>  	struct usb_request		*req;
>  
> -	int				minor;
> +	struct device			dev;
>  	struct cdev			cdev;
>  	struct usb_function		func;
>  
> @@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f)
>  	return container_of(f, struct f_hidg, func);
>  }
>  
> +static void hidg_release(struct device *dev)
> +{
> +	struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);

Could we store/fetch this with dev_{set,get}_drvdata(), and make
hidg->dev a pointer reducing the size of the struct f_hidg.

> +	kfree(hidg->set_report_buf);
> +	kfree(hidg);
> +}
> +
>  /*-------------------------------------------------------------------------*/
>  /*                           Static descriptors                            */
>  
> @@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
>  	struct usb_ep		*ep;
>  	struct f_hidg		*hidg = func_to_hidg(f);
>  	struct usb_string	*us;
> -	struct device		*device;
>  	int			status;
> -	dev_t			dev;
>  
>  	/* maybe allocate device-global string IDs, and patch descriptors */
>  	us = usb_gstrings_attach(c->cdev, ct_func_strings,
> @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
>  
>  	/* create char device */
>  	cdev_init(&hidg->cdev, &f_hidg_fops);
> -	dev = MKDEV(major, hidg->minor);
> -	status = cdev_add(&hidg->cdev, dev, 1);
> +	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);

cdev_device_add() should take care of this, so long as:

    if (dev->devt)
        dev_set_parent(cdev, &dev->kobj);

> +	status = cdev_device_add(&hidg->cdev, &hidg->dev);
>  	if (status)
>  		goto fail_free_descs;
>  
> -	device = device_create(hidg_class, NULL, dev, NULL,
> -			       "%s%d", "hidg", hidg->minor);
> -	if (IS_ERR(device)) {
> -		status = PTR_ERR(device);
> -		goto del;
> -	}
> -
>  	return 0;
> -del:
> -	cdev_del(&hidg->cdev);
>  fail_free_descs:
>  	usb_free_all_descriptors(f);
>  fail:
> @@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f)
>  
>  	hidg = func_to_hidg(f);
>  	opts = container_of(f->fi, struct f_hid_opts, func_inst);
> -	kfree(hidg->report_desc);
> -	kfree(hidg->set_report_buf);
> -	kfree(hidg);
> +	put_device(&hidg->dev);
>  	mutex_lock(&opts->lock);
>  	--opts->refcnt;
>  	mutex_unlock(&opts->lock);
> @@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f)
>  {
>  	struct f_hidg *hidg = func_to_hidg(f);
>  
> -	device_destroy(hidg_class, MKDEV(major, hidg->minor));
> -	cdev_del(&hidg->cdev);
> +	cdev_device_del(&hidg->cdev);
>  
>  	usb_free_all_descriptors(f);
>  }
> @@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  {
>  	struct f_hidg *hidg;
>  	struct f_hid_opts *opts;
> +	int ret;
>  
>  	/* allocate and initialize one new instance */
>  	hidg = kzalloc(sizeof(*hidg), GFP_KERNEL);
> @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
>  	mutex_lock(&opts->lock);
>  	++opts->refcnt;
>  
> -	hidg->minor = opts->minor;
> +	device_initialize(&hidg->dev);
> +	hidg->dev.release = hidg_release;
> +	hidg->dev.class = hidg_class;
> +	hidg->dev.devt = MKDEV(major, opts->minor);
> +	ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor);
> +	if (ret) {
> +		--opts->refcnt;

Since we're holding the opts lock at this point, is there anything
preventing us from incrementing the refcnt at the end, just before
giving up the lock, thus saving 2 decrements in the error paths?

> +		mutex_unlock(&opts->lock);
> +		return ERR_PTR(ret);
> +	}
> +
>  	hidg->bInterfaceSubClass = opts->subclass;
>  	hidg->bInterfaceProtocol = opts->protocol;
>  	hidg->report_length = opts->report_length;
>  	hidg->report_desc_length = opts->report_desc_length;
>  	if (opts->report_desc) {
> -		hidg->report_desc = kmemdup(opts->report_desc,
> +		hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,

Nice.

>  					    opts->report_desc_length,
>  					    GFP_KERNEL);
>  		if (!hidg->report_desc) {
> -			kfree(hidg);
> +			put_device(&hidg->dev);
> +			--opts->refcnt;
>  			mutex_unlock(&opts->lock);
>  			return ERR_PTR(-ENOMEM);
>  		}

Thanks for doing this John, your work is appreciated.
  
John Keeping Nov. 22, 2022, 11:52 a.m. UTC | #19
On Mon, Nov 21, 2022 at 02:17:56PM -0500, Alan Stern wrote:
> On Mon, Nov 21, 2022 at 06:54:55PM +0000, John Keeping wrote:
> > It turns out there's already a device being created here, just not
> > associated with the structure.  Your suggestions around
> > cdev_device_add() made me spot what's going on with that so the actual
> > fix is to pull its lifetime up to match struct f_hidg.
> 
> It's not obvious from the patch what device was already being created, 
> but never mind...

The patch has:

-       device = device_create(hidg_class, NULL, dev, NULL,
-                              "%s%d", "hidg", hidg->minor);

but this device was not previously associated with the cdev (apart from
indirectly via dev_t).

> > -- >8 --
> > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev
> > 
> > The embedded struct cdev does not have its lifetime correctly tied to
> > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
> > is held open while the gadget is deleted.
> > 
> > This can readily be replicated with libusbgx's example programs (for
> > conciseness - operating directly via configfs is equivalent):
> > 
> > 	gadget-hid
> > 	exec 3<> /dev/hidg0
> > 	gadget-vid-pid-remove
> > 	exec 3<&-
> > 
> > Pull the existing device up in to struct f_hidg and make use of the
> > cdev_device_{add,del}() helpers.  This changes the lifetime of the
> > device object to match struct f_hidg, but note that it is still added
> > and deleted at the same time.
> > 
> > [Also fix refcount leak on an error path.]
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> >  drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++-------------
> >  1 file changed, 28 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
> > index ca0a7d9eaa34..0b94668a3812 100644
> > --- a/drivers/usb/gadget/function/f_hid.c
> > +++ b/drivers/usb/gadget/function/f_hid.c
> 
> > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
> >  
> >  	/* create char device */
> >  	cdev_init(&hidg->cdev, &f_hidg_fops);
> > -	dev = MKDEV(major, hidg->minor);
> > -	status = cdev_add(&hidg->cdev, dev, 1);
> > +	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);
> 
> This line isn't needed; cdev_device_add() does it for you because 
> hidg->dev.devt has been set.

Thanks, I'll drop this line.
  
John Keeping Nov. 22, 2022, 11:55 a.m. UTC | #20
On Tue, Nov 22, 2022 at 08:31:08AM +0000, Lee Jones wrote:
> On Mon, 21 Nov 2022, John Keeping wrote:
> > Subject: [PATCH] usb: gadget: f_hid: fix f_hidg lifetime vs cdev
> > 
> > The embedded struct cdev does not have its lifetime correctly tied to
> > the enclosing struct f_hidg, so there is a use-after-free if /dev/hidgN
> > is held open while the gadget is deleted.
> > 
> > This can readily be replicated with libusbgx's example programs (for
> > conciseness - operating directly via configfs is equivalent):
> > 
> > 	gadget-hid
> > 	exec 3<> /dev/hidg0
> > 	gadget-vid-pid-remove
> > 	exec 3<&-
> > 
> > Pull the existing device up in to struct f_hidg and make use of the
> > cdev_device_{add,del}() helpers.  This changes the lifetime of the
> > device object to match struct f_hidg, but note that it is still added
> > and deleted at the same time.
> 
> This is much better, thanks for re-spinning.
> 
> > [Also fix refcount leak on an error path.]
> > 
> > Signed-off-by: John Keeping <john@metanate.com>
> > ---
> >  drivers/usb/gadget/function/f_hid.c | 50 ++++++++++++++++-------------
> >  1 file changed, 28 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
> > index ca0a7d9eaa34..0b94668a3812 100644
> > --- a/drivers/usb/gadget/function/f_hid.c
> > +++ b/drivers/usb/gadget/function/f_hid.c
> > @@ -71,7 +71,7 @@ struct f_hidg {
> >  	wait_queue_head_t		write_queue;
> >  	struct usb_request		*req;
> >  
> > -	int				minor;
> > +	struct device			dev;
> >  	struct cdev			cdev;
> >  	struct usb_function		func;
> >  
> > @@ -84,6 +84,14 @@ static inline struct f_hidg *func_to_hidg(struct usb_function *f)
> >  	return container_of(f, struct f_hidg, func);
> >  }
> >  
> > +static void hidg_release(struct device *dev)
> > +{
> > +	struct f_hidg *hidg = container_of(dev, struct f_hidg, dev);
> 
> Could we store/fetch this with dev_{set,get}_drvdata(), and make
> hidg->dev a pointer reducing the size of the struct f_hidg.

That will reduce the size of struct f_hidg, but we'll have an extra
allocation for the device object, so I don't think that's a real
benefit.

It seems simpler to keep a single allocation and embed the device.

> > +	kfree(hidg->set_report_buf);
> > +	kfree(hidg);
> > +}
> > +
> >  /*-------------------------------------------------------------------------*/
> >  /*                           Static descriptors                            */
> >  
> > @@ -904,9 +912,7 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
> >  	struct usb_ep		*ep;
> >  	struct f_hidg		*hidg = func_to_hidg(f);
> >  	struct usb_string	*us;
> > -	struct device		*device;
> >  	int			status;
> > -	dev_t			dev;
> >  
> >  	/* maybe allocate device-global string IDs, and patch descriptors */
> >  	us = usb_gstrings_attach(c->cdev, ct_func_strings,
> > @@ -999,21 +1005,12 @@ static int hidg_bind(struct usb_configuration *c, struct usb_function *f)
> >  
> >  	/* create char device */
> >  	cdev_init(&hidg->cdev, &f_hidg_fops);
> > -	dev = MKDEV(major, hidg->minor);
> > -	status = cdev_add(&hidg->cdev, dev, 1);
> > +	cdev_set_parent(&hidg->cdev, &hidg->dev.kobj);
> 
> cdev_device_add() should take care of this, so long as:
> 
>     if (dev->devt)
>         dev_set_parent(cdev, &dev->kobj);

Thanks, I'll change this.

> > +	status = cdev_device_add(&hidg->cdev, &hidg->dev);
> >  	if (status)
> >  		goto fail_free_descs;
> >  
> > -	device = device_create(hidg_class, NULL, dev, NULL,
> > -			       "%s%d", "hidg", hidg->minor);
> > -	if (IS_ERR(device)) {
> > -		status = PTR_ERR(device);
> > -		goto del;
> > -	}
> > -
> >  	return 0;
> > -del:
> > -	cdev_del(&hidg->cdev);
> >  fail_free_descs:
> >  	usb_free_all_descriptors(f);
> >  fail:
> > @@ -1244,9 +1241,7 @@ static void hidg_free(struct usb_function *f)
> >  
> >  	hidg = func_to_hidg(f);
> >  	opts = container_of(f->fi, struct f_hid_opts, func_inst);
> > -	kfree(hidg->report_desc);
> > -	kfree(hidg->set_report_buf);
> > -	kfree(hidg);
> > +	put_device(&hidg->dev);
> >  	mutex_lock(&opts->lock);
> >  	--opts->refcnt;
> >  	mutex_unlock(&opts->lock);
> > @@ -1256,8 +1251,7 @@ static void hidg_unbind(struct usb_configuration *c, struct usb_function *f)
> >  {
> >  	struct f_hidg *hidg = func_to_hidg(f);
> >  
> > -	device_destroy(hidg_class, MKDEV(major, hidg->minor));
> > -	cdev_del(&hidg->cdev);
> > +	cdev_device_del(&hidg->cdev);
> >  
> >  	usb_free_all_descriptors(f);
> >  }
> > @@ -1266,6 +1260,7 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
> >  {
> >  	struct f_hidg *hidg;
> >  	struct f_hid_opts *opts;
> > +	int ret;
> >  
> >  	/* allocate and initialize one new instance */
> >  	hidg = kzalloc(sizeof(*hidg), GFP_KERNEL);
> > @@ -1277,17 +1272,28 @@ static struct usb_function *hidg_alloc(struct usb_function_instance *fi)
> >  	mutex_lock(&opts->lock);
> >  	++opts->refcnt;
> >  
> > -	hidg->minor = opts->minor;
> > +	device_initialize(&hidg->dev);
> > +	hidg->dev.release = hidg_release;
> > +	hidg->dev.class = hidg_class;
> > +	hidg->dev.devt = MKDEV(major, opts->minor);
> > +	ret = dev_set_name(&hidg->dev, "hidg%d", opts->minor);
> > +	if (ret) {
> > +		--opts->refcnt;
> 
> Since we're holding the opts lock at this point, is there anything
> preventing us from incrementing the refcnt at the end, just before
> giving up the lock, thus saving 2 decrements in the error paths?

This makes sense.  I'll respin this as a series and include a patch
tidying up the error handling as a final step.

> > +		mutex_unlock(&opts->lock);
> > +		return ERR_PTR(ret);
> > +	}
> > +
> >  	hidg->bInterfaceSubClass = opts->subclass;
> >  	hidg->bInterfaceProtocol = opts->protocol;
> >  	hidg->report_length = opts->report_length;
> >  	hidg->report_desc_length = opts->report_desc_length;
> >  	if (opts->report_desc) {
> > -		hidg->report_desc = kmemdup(opts->report_desc,
> > +		hidg->report_desc = devm_kmemdup(&hidg->dev, opts->report_desc,
> 
> Nice.
> 
> >  					    opts->report_desc_length,
> >  					    GFP_KERNEL);
> >  		if (!hidg->report_desc) {
> > -			kfree(hidg);
> > +			put_device(&hidg->dev);
> > +			--opts->refcnt;
> >  			mutex_unlock(&opts->lock);
> >  			return ERR_PTR(-ENOMEM);
> >  		}
> 
> Thanks for doing this John, your work is appreciated.
  

Patch

diff --git a/drivers/usb/gadget/function/f_hid.c b/drivers/usb/gadget/function/f_hid.c
index ca0a7d9eaa34e..5da8f44d47d9d 100644
--- a/drivers/usb/gadget/function/f_hid.c
+++ b/drivers/usb/gadget/function/f_hid.c
@@ -77,6 +77,8 @@  struct f_hidg {
 
 	struct usb_ep			*in_ep;
 	struct usb_ep			*out_ep;
+
+	bool				gc;
 };
 
 static inline struct f_hidg *func_to_hidg(struct usb_function *f)
@@ -84,6 +86,11 @@  static inline struct f_hidg *func_to_hidg(struct usb_function *f)
 	return container_of(f, struct f_hidg, func);
 }
 
+static inline bool f_hidg_is_open(struct f_hidg *hidg)
+{
+	return !!kref_read(&hidg->cdev.kobj.kref);
+}
+
 /*-------------------------------------------------------------------------*/
 /*                           Static descriptors                            */
 
@@ -273,6 +280,18 @@  static struct usb_gadget_strings *ct_func_strings[] = {
 	NULL,
 };
 
+static void hidg_free_resources(struct f_hidg *hidg)
+{
+       struct f_hid_opts *opts = container_of(hidg->func.fi, struct f_hid_opts, func_inst);
+
+       mutex_lock(&opts->lock);
+       kfree(hidg->report_desc);
+       kfree(hidg->set_report_buf);
+       kfree(hidg);
+       --opts->refcnt;
+       mutex_unlock(&opts->lock);
+}
+
 /*-------------------------------------------------------------------------*/
 /*                              Char Device                                */
 
@@ -539,7 +558,16 @@  static __poll_t f_hidg_poll(struct file *file, poll_table *wait)
 
 static int f_hidg_release(struct inode *inode, struct file *fd)
 {
+	struct f_hidg *hidg  = fd->private_data;
+
+	if (hidg->gc) {
+		/* Gadget has already been disconnected and we are the last f_hidg user */
+		cdev_del(&hidg->cdev);
+		hidg_free_resources(hidg);
+	}
+
 	fd->private_data = NULL;
+
 	return 0;
 }
 
@@ -1239,17 +1267,16 @@  static struct usb_function_instance *hidg_alloc_inst(void)
 
 static void hidg_free(struct usb_function *f)
 {
-	struct f_hidg *hidg;
-	struct f_hid_opts *opts;
+	struct f_hidg *hidg = func_to_hidg(f);
 
-	hidg = func_to_hidg(f);
-	opts = container_of(f->fi, struct f_hid_opts, func_inst);
-	kfree(hidg->report_desc);
-	kfree(hidg->set_report_buf);
-	kfree(hidg);
-	mutex_lock(&opts->lock);
-	--opts->refcnt;
-	mutex_unlock(&opts->lock);
+	if (f_hidg_is_open(hidg))
+		/*
+		 * Gadget disconnected whilst an open dev node exists.
+		 * Delay freeing resources until it closes.
+		 */
+		hidg->gc = true;
+	else
+		hidg_free_resources(hidg);
 }
 
 static void hidg_unbind(struct usb_configuration *c, struct usb_function *f)