[-next,v3] md/raid5-cache: fix a deadlock in r5l_exit_log()

Message ID 20230708091727.1417894-1-yukuai1@huaweicloud.com
State New

Commit Message

Yu Kuai July 8, 2023, 9:17 a.m. UTC
  From: Yu Kuai <yukuai3@huawei.com>

Commit b13015af94cf ("md/raid5-cache: Clear conf->log after finishing
work") introduced a new problem:

// caller holds reconfig_mutex
r5l_exit_log
 flush_work(&log->disable_writeback_work)
			r5c_disable_writeback_async
			 wait_event
			  /*
			   * conf->log is not NULL, and mddev_trylock()
			   * will fail, wait_event() can never pass.
			   */
 conf->log = NULL

Fix this problem by setting 'conf->log' to NULL before wake_up(), as it
used to be, so that wait_event() in r5c_disable_writeback_async() can
exit. In the meantime, move md_unregister_thread() forward so that the
null-ptr-deref fixed by that commit stays fixed.
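
The ordering can be illustrated outside the kernel. What follows is only a
minimal user-space sketch using POSIX threads, not the kernel code: the
struct fake_conf type and the functions disable_writeback_worker() and
exit_log_fixed() are hypothetical names made up for this example, a
condition variable stands in for sb_wait, pthread_join() stands in for
flush_work(), and the wait condition is simplified to "log is NULL or the
trylock succeeds".

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct fake_conf {
	pthread_mutex_t reconfig_mutex;	/* plays the role of reconfig_mutex */
	pthread_mutex_t wait_lock;	/* protects 'log' and 'sb_wait' */
	pthread_cond_t sb_wait;		/* plays the role of mddev->sb_wait */
	void *log;			/* plays the role of conf->log */
};

/* Simplified analogue of the wait in r5c_disable_writeback_async(). */
void *disable_writeback_worker(void *arg)
{
	struct fake_conf *conf = arg;
	bool locked = false;

	pthread_mutex_lock(&conf->wait_lock);
	/* Keep waiting while log is set and reconfig_mutex cannot be taken. */
	while (conf->log != NULL &&
	       !(locked = (pthread_mutex_trylock(&conf->reconfig_mutex) == 0)))
		pthread_cond_wait(&conf->sb_wait, &conf->wait_lock);
	pthread_mutex_unlock(&conf->wait_lock);

	if (locked)
		pthread_mutex_unlock(&conf->reconfig_mutex);
	return NULL;
}

/* Teardown in the patched order. Returns 0 if the worker exited. */
int exit_log_fixed(void)
{
	struct fake_conf conf;
	pthread_t worker;
	int dummy_log = 0;
	int ret;

	pthread_mutex_init(&conf.reconfig_mutex, NULL);
	pthread_mutex_init(&conf.wait_lock, NULL);
	pthread_cond_init(&conf.sb_wait, NULL);
	conf.log = &dummy_log;

	/* The caller holds reconfig_mutex, so the worker's trylock fails. */
	pthread_mutex_lock(&conf.reconfig_mutex);
	ret = pthread_create(&worker, NULL, disable_writeback_worker, &conf);
	if (ret == 0) {
		pthread_mutex_lock(&conf.wait_lock);
		conf.log = NULL;			/* conf->log = NULL */
		pthread_cond_broadcast(&conf.sb_wait);	/* wake_up() */
		pthread_mutex_unlock(&conf.wait_lock);

		/* Analogue of flush_work(): returns because log is NULL. */
		pthread_join(worker, NULL);
	}
	pthread_mutex_unlock(&conf.reconfig_mutex);

	pthread_cond_destroy(&conf.sb_wait);
	pthread_mutex_destroy(&conf.wait_lock);
	pthread_mutex_destroy(&conf.reconfig_mutex);
	return ret == 0 ? 0 : -1;
}
```

Calling exit_log_fixed() from a small main() and building with -pthread
should return 0. If the join were moved before the 'conf.log = NULL'
assignment, which mirrors the order introduced by b13015af94cf, the
worker would never leave its loop and the join would block forever.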

Fixes: b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---

Changes in v3:
 - Use a different solution.

 drivers/md/raid5-cache.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
  

Comments

Song Liu July 29, 2023, 10:44 a.m. UTC | #1
On Sat, Jul 8, 2023 at 5:19 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Commit b13015af94cf ("md/raid5-cache: Clear conf->log after finishing
> work") introduced a new problem:
>
> // caller holds reconfig_mutex
> r5l_exit_log
>  flush_work(&log->disable_writeback_work)
>                         r5c_disable_writeback_async
>                          wait_event
>                           /*
>                            * conf->log is not NULL, and mddev_trylock()
>                            * will fail, wait_event() can never pass.
>                            */
>  conf->log = NULL
>
> Fix this problem by setting 'conf->log' to NULL before wake_up(), as it
> used to be, so that wait_event() in r5c_disable_writeback_async() can
> exit. In the meantime, move md_unregister_thread() forward so that the
> null-ptr-deref fixed by that commit stays fixed.
>
> Fixes: b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work")
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Applied to md-next. Thanks!

Song

> ---
>
> Changes in v3:
>  - Use a different solution.
>
>  drivers/md/raid5-cache.c | 9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
> index 47ba7d9e81e1..2eac4a50d99b 100644
> --- a/drivers/md/raid5-cache.c
> +++ b/drivers/md/raid5-cache.c
> @@ -3168,12 +3168,15 @@ void r5l_exit_log(struct r5conf *conf)
>  {
>         struct r5l_log *log = conf->log;
>
> -       /* Ensure disable_writeback_work wakes up and exits */
> -       wake_up(&conf->mddev->sb_wait);
> -       flush_work(&log->disable_writeback_work);
>         md_unregister_thread(&log->reclaim_thread);
>
> +       /*
> +        * 'reconfig_mutex' is held by caller, set 'conf->log' to NULL to
> +        * ensure disable_writeback_work wakes up and exits.
> +        */
>         conf->log = NULL;
> +       wake_up(&conf->mddev->sb_wait);
> +       flush_work(&log->disable_writeback_work);
>
>         mempool_exit(&log->meta_pool);
>         bioset_exit(&log->bs);
> --
> 2.39.2
>
  

Patch

diff --git a/drivers/md/raid5-cache.c b/drivers/md/raid5-cache.c
index 47ba7d9e81e1..2eac4a50d99b 100644
--- a/drivers/md/raid5-cache.c
+++ b/drivers/md/raid5-cache.c
@@ -3168,12 +3168,15 @@  void r5l_exit_log(struct r5conf *conf)
 {
 	struct r5l_log *log = conf->log;
 
-	/* Ensure disable_writeback_work wakes up and exits */
-	wake_up(&conf->mddev->sb_wait);
-	flush_work(&log->disable_writeback_work);
 	md_unregister_thread(&log->reclaim_thread);
 
+	/*
+	 * 'reconfig_mutex' is held by caller, set 'conf->log' to NULL to
+	 * ensure disable_writeback_work wakes up and exits.
+	 */
 	conf->log = NULL;
+	wake_up(&conf->mddev->sb_wait);
+	flush_work(&log->disable_writeback_work);
 
 	mempool_exit(&log->meta_pool);
 	bioset_exit(&log->bs);