[net,v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

Message ID 20230313090002.3308025-1-zyytlz.wz@163.com
State New
Headers
Series [net,v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition |

Commit Message

Zheng Wang March 13, 2023, 9 a.m. UTC
  In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
to init priv->rings and bound &ring->work with p9_xen_response.

When it calls xen_9pfs_front_event_handler to handle IRQ requests,
it will finally call schedule_work to start the work.

When we call xen_9pfs_front_remove to remove the driver, there
may be a sequence as follows:

Fix it by finishing the work before cleanup in xen_9pfs_front_free.

Note that, this bug is found by static analysis, which might be
false positive.

CPU0                  CPU1

                     |p9_xen_response
xen_9pfs_front_remove|
  xen_9pfs_front_free|
kfree(priv)          |
//free priv          |
                     |p9_tag_lookup
                     |//use priv->client

Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
---
v2:
- fix type error of ring found by kernel test robot
---
 net/9p/trans_xen.c | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Michal Swiatkowski March 13, 2023, 1:54 p.m. UTC | #1
On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> to init priv->rings and bound &ring->work with p9_xen_response.
> 
> When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> it will finally call schedule_work to start the work.
> 
> When we call xen_9pfs_front_remove to remove the driver, there
> may be a sequence as follows:
> 
> Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> 
> Note that, this bug is found by static analysis, which might be
> false positive.
> 
> CPU0                  CPU1
> 
>                      |p9_xen_response
> xen_9pfs_front_remove|
>   xen_9pfs_front_free|
> kfree(priv)          |
> //free priv          |
>                      |p9_tag_lookup
>                      |//use priv->client
> 
> Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
> ---
> v2:
> - fix type error of ring found by kernel test robot
> ---
>  net/9p/trans_xen.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> index c64050e839ac..83764431c066 100644
> --- a/net/9p/trans_xen.c
> +++ b/net/9p/trans_xen.c
> @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
>  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
>  {
>  	int i, j;
> +	struct xen_9pfs_dataring *ring = NULL;
Move it before int i, j to have RCT.

>  
>  	write_lock(&xen_9pfs_lock);
>  	list_del(&priv->list);
>  	write_unlock(&xen_9pfs_lock);
>  
>  	for (i = 0; i < priv->num_rings; i++) {
> +		/*cancel work*/
It isn't needed I think, the function cancel_work_sync() tells everything
here.

> +		ring = &priv->rings[i];
> +		cancel_work_sync(&ring->work);
> +
>  		if (!priv->rings[i].intf)
>  			break;
>  		if (priv->rings[i].irq > 0)
> -- 
> 2.25.1
  
Zheng Hacker March 13, 2023, 2:01 p.m. UTC | #2
Michal Swiatkowski <michal.swiatkowski@linux.intel.com> 于2023年3月13日周一 21:54写道:
>
> On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > to init priv->rings and bound &ring->work with p9_xen_response.
> >
> > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > it will finally call schedule_work to start the work.
> >
> > When we call xen_9pfs_front_remove to remove the driver, there
> > may be a sequence as follows:
> >
> > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> >
> > Note that, this bug is found by static analysis, which might be
> > false positive.
> >
> > CPU0                  CPU1
> >
> >                      |p9_xen_response
> > xen_9pfs_front_remove|
> >   xen_9pfs_front_free|
> > kfree(priv)          |
> > //free priv          |
> >                      |p9_tag_lookup
> >                      |//use priv->client
> >
> > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
> > ---
> > v2:
> > - fix type error of ring found by kernel test robot
> > ---
> >  net/9p/trans_xen.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > index c64050e839ac..83764431c066 100644
> > --- a/net/9p/trans_xen.c
> > +++ b/net/9p/trans_xen.c
> > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> >  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> >  {
> >       int i, j;
> > +     struct xen_9pfs_dataring *ring = NULL;
> Move it before int i, j to have RCT.
>
> >
> >       write_lock(&xen_9pfs_lock);
> >       list_del(&priv->list);
> >       write_unlock(&xen_9pfs_lock);
> >
> >       for (i = 0; i < priv->num_rings; i++) {
> > +             /*cancel work*/
> It isn't needed I think, the function cancel_work_sync() tells everything
> here.
>

Get it, will remove it in the next version of patch.

Best regards,
Zheng
  
Michal Swiatkowski March 13, 2023, 2:07 p.m. UTC | #3
On Mon, Mar 13, 2023 at 10:01:21PM +0800, Zheng Hacker wrote:
> Michal Swiatkowski <michal.swiatkowski@linux.intel.com> 于2023年3月13日周一 21:54写道:
> >
> > On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > > to init priv->rings and bound &ring->work with p9_xen_response.
> > >
> > > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > > it will finally call schedule_work to start the work.
> > >
> > > When we call xen_9pfs_front_remove to remove the driver, there
> > > may be a sequence as follows:
> > >
> > > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> > >
> > > Note that, this bug is found by static analysis, which might be
> > > false positive.
> > >
> > > CPU0                  CPU1
> > >
> > >                      |p9_xen_response
> > > xen_9pfs_front_remove|
> > >   xen_9pfs_front_free|
> > > kfree(priv)          |
> > > //free priv          |
> > >                      |p9_tag_lookup
> > >                      |//use priv->client
> > >
> > > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > > Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
> > > ---
> > > v2:
> > > - fix type error of ring found by kernel test robot
> > > ---
> > >  net/9p/trans_xen.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > > index c64050e839ac..83764431c066 100644
> > > --- a/net/9p/trans_xen.c
> > > +++ b/net/9p/trans_xen.c
> > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > >  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > >  {
> > >       int i, j;
> > > +     struct xen_9pfs_dataring *ring = NULL;
> > Move it before int i, j to have RCT.
> >
Sorry I didn't notice it before, move the definition to for loop.

> > >
> > >       write_lock(&xen_9pfs_lock);
> > >       list_del(&priv->list);
> > >       write_unlock(&xen_9pfs_lock);
> > >
> > >       for (i = 0; i < priv->num_rings; i++) {
Here:
struct xen_9pfs_dataring *ring = &priv->rings[i];

> > > +             /*cancel work*/
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
> >
> 
> Get it, will remove it in the next version of patch.
> 
> Best regards,
> Zheng
  
Zheng Hacker March 13, 2023, 2:10 p.m. UTC | #4
Michal Swiatkowski <michal.swiatkowski@linux.intel.com> 于2023年3月13日周一 22:08写道:
>
> On Mon, Mar 13, 2023 at 10:01:21PM +0800, Zheng Hacker wrote:
> > Michal Swiatkowski <michal.swiatkowski@linux.intel.com> 于2023年3月13日周一 21:54写道:
> > >
> > > On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > > > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > > > to init priv->rings and bound &ring->work with p9_xen_response.
> > > >
> > > > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > > > it will finally call schedule_work to start the work.
> > > >
> > > > When we call xen_9pfs_front_remove to remove the driver, there
> > > > may be a sequence as follows:
> > > >
> > > > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> > > >
> > > > Note that, this bug is found by static analysis, which might be
> > > > false positive.
> > > >
> > > > CPU0                  CPU1
> > > >
> > > >                      |p9_xen_response
> > > > xen_9pfs_front_remove|
> > > >   xen_9pfs_front_free|
> > > > kfree(priv)          |
> > > > //free priv          |
> > > >                      |p9_tag_lookup
> > > >                      |//use priv->client
> > > >
> > > > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > > > Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
> > > > ---
> > > > v2:
> > > > - fix type error of ring found by kernel test robot
> > > > ---
> > > >  net/9p/trans_xen.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > > > index c64050e839ac..83764431c066 100644
> > > > --- a/net/9p/trans_xen.c
> > > > +++ b/net/9p/trans_xen.c
> > > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > > >  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > > >  {
> > > >       int i, j;
> > > > +     struct xen_9pfs_dataring *ring = NULL;
> > > Move it before int i, j to have RCT.
> > >
> Sorry I didn't notice it before, move the definition to for loop.
>

Get it, will correct it in the next patch. Thanks for your notice :)

> > > >
> > > >       write_lock(&xen_9pfs_lock);
> > > >       list_del(&priv->list);
> > > >       write_unlock(&xen_9pfs_lock);
> > > >
> > > >       for (i = 0; i < priv->num_rings; i++) {
> Here:
> struct xen_9pfs_dataring *ring = &priv->rings[i];
>

Best regards,
Zheng
  
Jakub Kicinski March 13, 2023, 9:30 p.m. UTC | #5
On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> >  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> >  {
> >  	int i, j;
> > +	struct xen_9pfs_dataring *ring = NULL;  
> Move it before int i, j to have RCT.
> 
> >  
> >  	write_lock(&xen_9pfs_lock);
> >  	list_del(&priv->list);
> >  	write_unlock(&xen_9pfs_lock);
> >  
> >  	for (i = 0; i < priv->num_rings; i++) {
> > +		/*cancel work*/  
> It isn't needed I think, the function cancel_work_sync() tells everything
> here.

Note that 9p is more storage than networking, so this patch is likely
to go via a different tree than us.
  
Dominique Martinet March 13, 2023, 10:07 p.m. UTC | #6
Jakub Kicinski wrote on Mon, Mar 13, 2023 at 02:30:54PM -0700:
> On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > >  	for (i = 0; i < priv->num_rings; i++) {
> > > +		/*cancel work*/  
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
> 
> Note that 9p is more storage than networking, so this patch is likely
> to go via a different tree than us.

Any review done is useful anyway ;)

Either Eric or me will take the patch, but in the past such fixes have
sometimes also been taken into the net tree; honestly I wouldn't mind a
bit more "rule" here as it's a bit weird that some of our patches are Cc
to fsdevel@ (fs/ from fs/9p) and the other half netdev@ (net/ from
net/9p), but afaict the MAINTAINERS syntax doesn't have a way of
excluding e.g. net/9p from the `NETWORKING [GENERAL]` group so I guess
we just have to live with that.

There's little enough volume and netdev automation sends a mail when a
patch is picked up, so as long as there's no conflict (large majority of
the cases) such fixes can go either way as far as I'm concerned.
  
Zheng Hacker March 14, 2023, 1:07 a.m. UTC | #7
Jakub Kicinski <kuba@kernel.org> 于2023年3月14日周二 05:30写道:
>
> On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > >  static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > >  {
> > >     int i, j;
> > > +   struct xen_9pfs_dataring *ring = NULL;
> > Move it before int i, j to have RCT.
> >
> > >
> > >     write_lock(&xen_9pfs_lock);
> > >     list_del(&priv->list);
> > >     write_unlock(&xen_9pfs_lock);
> > >
> > >     for (i = 0; i < priv->num_rings; i++) {
> > > +           /*cancel work*/
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
>
> Note that 9p is more storage than networking, so this patch is likely
> to go via a different tree than us.

Sorry I got confused.

Best regards,
Zheng
  
Zheng Hacker March 14, 2023, 1:14 a.m. UTC | #8
<asmadeus@codewreck.org> 于2023年3月14日周二 06:08写道:
>
> Jakub Kicinski wrote on Mon, Mar 13, 2023 at 02:30:54PM -0700:
> > On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > > >   for (i = 0; i < priv->num_rings; i++) {
> > > > +         /*cancel work*/
> > > It isn't needed I think, the function cancel_work_sync() tells everything
> > > here.
> >
> > Note that 9p is more storage than networking, so this patch is likely
> > to go via a different tree than us.
>
> Any review done is useful anyway ;)
>
> Either Eric or me will take the patch, but in the past such fixes have
> sometimes also been taken into the net tree; honestly I wouldn't mind a
> bit more "rule" here as it's a bit weird that some of our patches are Cc
> to fsdevel@ (fs/ from fs/9p) and the other half netdev@ (net/ from
> net/9p), but afaict the MAINTAINERS syntax doesn't have a way of
> excluding e.g. net/9p from the `NETWORKING [GENERAL]` group so I guess
> we just have to live with that.

Dear Dominique,

Sorry for my confusion and thanks for your patient explanation. I'll take care
of it when submitting a fix to the linux kernel in the future.

>
> There's little enough volume and netdev automation sends a mail when a
> patch is picked up, so as long as there's no conflict (large majority of
> the cases) such fixes can go either way as far as I'm concerned.
>
Thanks again for your effort. Hope you have a good day :)

Best regards,
Zheng
  

Patch

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index c64050e839ac..83764431c066 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -274,12 +274,17 @@  static const struct xenbus_device_id xen_9pfs_front_ids[] = {
 static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
 {
 	int i, j;
+	struct xen_9pfs_dataring *ring = NULL;
 
 	write_lock(&xen_9pfs_lock);
 	list_del(&priv->list);
 	write_unlock(&xen_9pfs_lock);
 
 	for (i = 0; i < priv->num_rings; i++) {
+		/*cancel work*/
+		ring = &priv->rings[i];
+		cancel_work_sync(&ring->work);
+
 		if (!priv->rings[i].intf)
 			break;
 		if (priv->rings[i].irq > 0)