[v2,1/6] md: fix missing flush of sync_work

Message ID 20231124075953.1932764-2-yukuai1@huaweicloud.com
State New
Headers
Series md: bugfix and cleanup for sync_thread |

Commit Message

Yu Kuai Nov. 24, 2023, 7:59 a.m. UTC
  From: Yu Kuai <yukuai3@huawei.com>

Commit ac619781967b ("md: use separate work_struct for md_start_sync()")
use a new sync_work to replace del_work, however, stop_sync_thread() and
__md_stop_writes() was trying to wait for sync_thread to be done, hence
they should switch to use sync_work as well.

Noted that md_start_sync() from sync_work will grab 'reconfig_mutex',
hence other contex can't held the same lock to flush work, and this will
be fixed in later patches.

Fixes: ac619781967b ("md: use separate work_struct for md_start_sync()")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/md.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
  

Comments

Song Liu Nov. 28, 2023, 12:02 a.m. UTC | #1
On Fri, Nov 24, 2023 at 12:00 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Commit ac619781967b ("md: use separate work_struct for md_start_sync()")
> use a new sync_work to replace del_work, however, stop_sync_thread() and
> __md_stop_writes() was trying to wait for sync_thread to be done, hence
> they should switch to use sync_work as well.
>
> Noted that md_start_sync() from sync_work will grab 'reconfig_mutex',
> hence other contex can't held the same lock to flush work, and this will
> be fixed in later patches.
>
> Fixes: ac619781967b ("md: use separate work_struct for md_start_sync()")

This fix should go via md-fixes branch. Please send it separately.

Thanks,
Song

> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/md.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 09686d8db983..1701e2fb219f 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -4865,7 +4865,7 @@ static void stop_sync_thread(struct mddev *mddev)
>                 return;
>         }
>
> -       if (work_pending(&mddev->del_work))
> +       if (work_pending(&mddev->sync_work))
>                 flush_workqueue(md_misc_wq);
>
>         set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> @@ -6273,7 +6273,7 @@ static void md_clean(struct mddev *mddev)
>  static void __md_stop_writes(struct mddev *mddev)
>  {
>         set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
> -       if (work_pending(&mddev->del_work))
> +       if (work_pending(&mddev->sync_work))
>                 flush_workqueue(md_misc_wq);
>         if (mddev->sync_thread) {
>                 set_bit(MD_RECOVERY_INTR, &mddev->recovery);
> --
> 2.39.2
>
  
Yu Kuai Nov. 28, 2023, 2:07 a.m. UTC | #2
Hi,

在 2023/11/28 8:02, Song Liu 写道:
> On Fri, Nov 24, 2023 at 12:00 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Commit ac619781967b ("md: use separate work_struct for md_start_sync()")
>> use a new sync_work to replace del_work, however, stop_sync_thread() and
>> __md_stop_writes() was trying to wait for sync_thread to be done, hence
>> they should switch to use sync_work as well.
>>
>> Noted that md_start_sync() from sync_work will grab 'reconfig_mutex',
>> hence other contex can't held the same lock to flush work, and this will
>> be fixed in later patches.
>>
>> Fixes: ac619781967b ("md: use separate work_struct for md_start_sync()")
> 
> This fix should go via md-fixes branch. Please send it separately.

This patch alone is not good, there are follow up problems to be fixed
completely after patch 5. Can this patchset applied to md-fixes?

Or I can split patch 1,4 and 5 for md-fixes, and keep others to md-next.

Thanks,
Kuai

> 
> Thanks,
> Song
> 
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/md.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>> index 09686d8db983..1701e2fb219f 100644
>> --- a/drivers/md/md.c
>> +++ b/drivers/md/md.c
>> @@ -4865,7 +4865,7 @@ static void stop_sync_thread(struct mddev *mddev)
>>                  return;
>>          }
>>
>> -       if (work_pending(&mddev->del_work))
>> +       if (work_pending(&mddev->sync_work))
>>                  flush_workqueue(md_misc_wq);
>>
>>          set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>> @@ -6273,7 +6273,7 @@ static void md_clean(struct mddev *mddev)
>>   static void __md_stop_writes(struct mddev *mddev)
>>   {
>>          set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>> -       if (work_pending(&mddev->del_work))
>> +       if (work_pending(&mddev->sync_work))
>>                  flush_workqueue(md_misc_wq);
>>          if (mddev->sync_thread) {
>>                  set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>> --
>> 2.39.2
>>
> .
>
  
Song Liu Nov. 28, 2023, 3:48 a.m. UTC | #3
On Mon, Nov 27, 2023 at 6:07 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> 在 2023/11/28 8:02, Song Liu 写道:
> > On Fri, Nov 24, 2023 at 12:00 AM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> >>
> >> From: Yu Kuai <yukuai3@huawei.com>
> >>
> >> Commit ac619781967b ("md: use separate work_struct for md_start_sync()")
> >> use a new sync_work to replace del_work, however, stop_sync_thread() and
> >> __md_stop_writes() was trying to wait for sync_thread to be done, hence
> >> they should switch to use sync_work as well.
> >>
> >> Noted that md_start_sync() from sync_work will grab 'reconfig_mutex',
> >> hence other contex can't held the same lock to flush work, and this will
> >> be fixed in later patches.
> >>
> >> Fixes: ac619781967b ("md: use separate work_struct for md_start_sync()")
> >
> > This fix should go via md-fixes branch. Please send it separately.
>
> This patch alone is not good, there are follow up problems to be fixed
> completely after patch 5. Can this patchset applied to md-fixes?
>
> Or I can split patch 1,4 and 5 for md-fixes, and keep others to md-next.

Yes, please split the patches into two sets.

Thanks,
Song
  

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 09686d8db983..1701e2fb219f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4865,7 +4865,7 @@  static void stop_sync_thread(struct mddev *mddev)
 		return;
 	}
 
-	if (work_pending(&mddev->del_work))
+	if (work_pending(&mddev->sync_work))
 		flush_workqueue(md_misc_wq);
 
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
@@ -6273,7 +6273,7 @@  static void md_clean(struct mddev *mddev)
 static void __md_stop_writes(struct mddev *mddev)
 {
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	if (work_pending(&mddev->del_work))
+	if (work_pending(&mddev->sync_work))
 		flush_workqueue(md_misc_wq);
 	if (mddev->sync_thread) {
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);