[v2] scsi: qedi: Fix use after free bug in qedi_remove due to race condition

Message ID 20230413033422.28003-1-zyytlz.wz@163.com
State New
Series [v2] scsi: qedi: Fix use after free bug in qedi_remove due to race condition

Commit Message

Zheng Wang April 13, 2023, 3:34 a.m. UTC
qedi_probe calls __qedi_probe, which binds &qedi->recovery_work
to qedi_recovery_handler and &qedi->board_disable_work
to qedi_board_disable_work.

qedi_schedule_recovery_handler eventually calls
schedule_delayed_work to start the work.

If qedi_remove is called to remove the driver while that work is
still pending or running, the following sequence can lead to a
use-after-free:

CPU0                  CPU1

                     |qedi_recovery_handler
qedi_remove          |
  __qedi_remove      |
iscsi_host_free      |
scsi_host_put        |
//free shost         |
                     |iscsi_host_for_each_session
                     |//use qedi->shost

Fix it by cancelling the work, and waiting for any running instance
to finish, before the cleanup in __qedi_remove.

Fixes: 4b1068f5d74b ("scsi: qedi: Add MFW error recovery process")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
---
v2:
- remove an unnecessary comment, as suggested by Mike Christie
- cancel the work after qedi_ops->stop and qedi_ops->ll2->stop, which
  ensures that no more work can be queued, as suggested by Manish Rangankar
---
 drivers/scsi/qedi/qedi_main.c | 3 +++
 1 file changed, 3 insertions(+)
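
The sequence in the commit message can be modelled in plain userspace C.
This is only a single-threaded sketch under stated assumptions, not kernel
code: struct host, struct adapter, workqueue_run, remove_unpatched and
remove_patched are hypothetical names, a "freed" flag stands in for the
released memory so that nothing is actually dereferenced after free, and
only the "work still pending at remove time" case is covered (the real
cancel_delayed_work_sync also waits for a handler that is already running).

```c
/* Userspace model only. Every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct host {
	bool freed;     /* stands in for the memory having been released */
	int  sessions;  /* what the handler would iterate over */
};

struct adapter {
	struct host *shost;  /* models qedi->shost */
	bool work_pending;   /* models a queued recovery work item */
};

/* Models the work handler. Returns the number of sessions visited,
 * or -1 if it would have touched a host that was already freed. */
static int recovery_handler(struct adapter *a)
{
	assert(a != NULL && a->shost != NULL);
	if (a->shost->freed)
		return -1;
	return a->shost->sessions;
}

/* Models the workqueue firing. Runs the handler only if work is
 * pending; returns 0 when there was nothing to run. */
static int workqueue_run(struct adapter *a)
{
	if (!a->work_pending)
		return 0;
	a->work_pending = false;
	return recovery_handler(a);
}

/* Models cancelling a work item that has not started yet. */
static void cancel_work(struct adapter *a)
{
	a->work_pending = false;
}

/* Teardown without the fix: the host goes away, the work stays queued. */
static void remove_unpatched(struct adapter *a)
{
	a->shost->freed = true;
}

/* Teardown with the fix: cancel first, then release the host. */
static void remove_patched(struct adapter *a)
{
	cancel_work(a);
	a->shost->freed = true;
}
```

With work pending at remove time, workqueue_run reports -1 after
remove_unpatched (the handler reached a freed host) and 0 after
remove_patched (nothing left to run).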
  

Comments

Manish Rangankar April 20, 2023, 5:49 a.m. UTC | #1
Thanks,

Acked-by: Manish Rangankar <mrangankar@marvell.com>
  
Mike Christie April 20, 2023, 3:38 p.m. UTC | #2
On 4/12/23 10:34 PM, Zheng Wang wrote:
> v2:
> - remove unnecessary comment suggested by Mike Christie and cancel the work
> after qedi_ops->stop and qedi_ops->ll2->stop which ensure there is no more
> work suggested by Manish Rangankar

Looks ok to me now. Thanks.

Reviewed-by: Mike Christie <michael.christie@oracle.com>
  
Zheng Hacker April 21, 2023, 2:45 a.m. UTC | #3
Manish Rangankar <mrangankar@marvell.com> wrote on Thu, Apr 20, 2023 at 13:49:
> Thanks,
>
> Acked-by: Manish Rangankar <mrangankar@marvell.com>
>

Thanks for your review.

Best regards,
Zheng
  
Zheng Hacker April 21, 2023, 2:45 a.m. UTC | #4
Mike Christie <michael.christie@oracle.com> wrote on Thu, Apr 20, 2023 at 23:39:
> Look ok to me now. Thanks.
>
> Reviewed-by: Mike Christie <michael.christie@oracle.com>

Thanks for your review.

Best regards,
Zheng
  
Martin K. Petersen April 25, 2023, 3:32 a.m. UTC | #5
Zheng,

> In qedi_probe, it calls __qedi_probe, which bound &qedi->recovery_work
> with qedi_recovery_handler and bound &qedi->board_disable_work
> with qedi_board_disable_work.

Applied to 6.4/scsi-staging, thanks!
  

Patch

diff --git a/drivers/scsi/qedi/qedi_main.c b/drivers/scsi/qedi/qedi_main.c
index f2ee49756df8..45d359554182 100644
--- a/drivers/scsi/qedi/qedi_main.c
+++ b/drivers/scsi/qedi/qedi_main.c
@@ -2450,6 +2450,9 @@  static void __qedi_remove(struct pci_dev *pdev, int mode)
 		qedi_ops->ll2->stop(qedi->cdev);
 	}
 
+	cancel_delayed_work_sync(&qedi->recovery_work);
+	cancel_delayed_work_sync(&qedi->board_disable_work);
+
 	qedi_free_iscsi_pf_param(qedi);
 
 	rval = qedi_ops->common->update_drv_state(qedi->cdev, false);
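
The hunk places the two cancel calls after qedi_ops->ll2->stop, which the v2
changelog explains as making sure no more work can be queued. A minimal
userspace sketch of why that order matters follows. It is an assumption-laden
model, not driver code: struct adapter, error_event, stop_source and both
teardown functions are hypothetical names, and "source_running" stands in for
whatever path in the driver may schedule the recovery work.

```c
/* Userspace model only. Every name below is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct adapter {
	bool source_running;  /* models the path that can schedule recovery */
	bool work_pending;    /* models a queued recovery work item */
	bool host_freed;      /* models the host having been released */
};

/* Models an error notification: it queues recovery work only while
 * the source is still running. Returns true if work was queued. */
static bool error_event(struct adapter *a)
{
	assert(a != NULL);
	if (!a->source_running)
		return false;
	a->work_pending = true;
	return true;
}

static void stop_source(struct adapter *a)
{
	a->source_running = false;
}

static void cancel_work(struct adapter *a)
{
	a->work_pending = false;
}

/* Cancel before stopping the source: an event arriving in between
 * re-queues the work. Returns true if work outlives the host. */
static bool teardown_cancel_first(struct adapter *a)
{
	cancel_work(a);
	error_event(a);      /* late event, still accepted */
	stop_source(a);
	a->host_freed = true;
	return a->work_pending;
}

/* Stop the source, then cancel (the order used by the patch): a late
 * event is ignored, so nothing can be queued again after the cancel. */
static bool teardown_stop_first(struct adapter *a)
{
	stop_source(a);
	error_event(a);      /* late event, ignored */
	cancel_work(a);
	a->host_freed = true;
	return a->work_pending;
}
```

In this model teardown_cancel_first leaves work pending after the host is
released, while teardown_stop_first does not.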