[RFC,v4,13/14] dm: wait for IO completion before removing dm device

Message ID 20240130021843.3608859-14-yukuai1@huaweicloud.com
State New
Headers
Series dm-raid: fix v6.7 regressions |

Commit Message

Yu Kuai Jan. 30, 2024, 2:18 a.m. UTC
  From: Yu Kuai <yukuai3@huawei.com>

__dm_destroy() guarantee that device openers is zero, and then
only call 'presuspend' and 'postsuspend' for the target. For
request-based dm, 'md->holders' will be grabbed for each rq and
__dm_destroy() will wait for 'md->holders' to be zero. However, for
bio-based device, __dm_destroy() doesn't wait for all bios to be done.

Fix this problem by calling dm_wait_for_completion() to wail for all
inflight IO to be done, like what dm_suspend() does.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/dm.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Mikulas Patocka Jan. 30, 2024, 11:46 a.m. UTC | #1
On Tue, 30 Jan 2024, Yu Kuai wrote:

> From: Yu Kuai <yukuai3@huawei.com>
> 
> __dm_destroy() guarantee that device openers is zero, and then
> only call 'presuspend' and 'postsuspend' for the target. For
> request-based dm, 'md->holders' will be grabbed for each rq and
> __dm_destroy() will wait for 'md->holders' to be zero. However, for
> bio-based device, __dm_destroy() doesn't wait for all bios to be done.
> 
> Fix this problem by calling dm_wait_for_completion() to wail for all
> inflight IO to be done, like what dm_suspend() does.

If the number of openers is zero, it is guaranteed that there are no bios 
in flight. Therefore, we don't have to wait for them.

If there are bios in flight, it is a bug in the code that issues the bios. 
You can put WARN_ON(dm_in_flight_bios(md)) there.

Mikulas

> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
> ---
>  drivers/md/dm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 8dcabf84d866..2c0eae67d0f1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -58,6 +58,7 @@ static DEFINE_IDR(_minor_idr);
>  static DEFINE_SPINLOCK(_minor_lock);
>  
>  static void do_deferred_remove(struct work_struct *w);
> +static int dm_wait_for_completion(struct mapped_device *md, unsigned int task_state);
>  
>  static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
>  
> @@ -2495,6 +2496,8 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>  	if (!dm_suspended_md(md)) {
>  		dm_table_presuspend_targets(map);
>  		set_bit(DMF_SUSPENDED, &md->flags);
> +		if (wait)
> +			dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
>  		set_bit(DMF_POST_SUSPENDING, &md->flags);
>  		dm_table_postsuspend_targets(map);
>  	}
> -- 
> 2.39.2
>
  
Yu Kuai Jan. 30, 2024, 1:05 p.m. UTC | #2
Hi,

在 2024/01/30 19:46, Mikulas Patocka 写道:
> 
> 
> On Tue, 30 Jan 2024, Yu Kuai wrote:
> 
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> __dm_destroy() guarantee that device openers is zero, and then
>> only call 'presuspend' and 'postsuspend' for the target. For
>> request-based dm, 'md->holders' will be grabbed for each rq and
>> __dm_destroy() will wait for 'md->holders' to be zero. However, for
>> bio-based device, __dm_destroy() doesn't wait for all bios to be done.
>>
>> Fix this problem by calling dm_wait_for_completion() to wail for all
>> inflight IO to be done, like what dm_suspend() does.
> 
> If the number of openers is zero, it is guaranteed that there are no bios
> in flight. Therefore, we don't have to wait for them.
> 
> If there are bios in flight, it is a bug in the code that issues the bios.
> You can put WARN_ON(dm_in_flight_bios(md)) there.

I add this patch because while testing, there is a problem that is
hard to reporduce, as I mentioned in the other thread. I'll add BUG_ON()
and try if I can still reporduce this problem without triggering it.

Thanks,
Kuai

[12504.959682] BUG bio-296 (Not tainted): Object already free
[12504.960239] 
-----------------------------------------------------------------------------
[12504.960239]
[12504.961209] Allocated in mempool_alloc+0xe8/0x270 age=30 cpu=1 pid=203288
[12504.961905]  kmem_cache_alloc+0x36a/0x3b0
[12504.962324]  mempool_alloc+0xe8/0x270
[12504.962712]  bio_alloc_bioset+0x3b5/0x920
[12504.963129]  bio_alloc_clone+0x3e/0x160
[12504.963533]  alloc_io+0x3d/0x1f0
[12504.963876]  dm_submit_bio+0x12f/0xa30
[12504.964267]  __submit_bio+0x9c/0xe0
[12504.964639]  submit_bio_noacct_nocheck+0x25a/0x570
[12504.965136]  submit_bio_wait+0xc2/0x160
[12504.965535]  blkdev_issue_zeroout+0x19b/0x2e0
[12504.965991]  ext4_init_inode_table+0x246/0x560
[12504.966462]  ext4_lazyinit_thread+0x750/0xbe0
[12504.966922]  kthread+0x1b4/0x1f0
> 
> Mikulas
> 
>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>> ---
>>   drivers/md/dm.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>> index 8dcabf84d866..2c0eae67d0f1 100644
>> --- a/drivers/md/dm.c
>> +++ b/drivers/md/dm.c
>> @@ -58,6 +58,7 @@ static DEFINE_IDR(_minor_idr);
>>   static DEFINE_SPINLOCK(_minor_lock);
>>   
>>   static void do_deferred_remove(struct work_struct *w);
>> +static int dm_wait_for_completion(struct mapped_device *md, unsigned int task_state);
>>   
>>   static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
>>   
>> @@ -2495,6 +2496,8 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>>   	if (!dm_suspended_md(md)) {
>>   		dm_table_presuspend_targets(map);
>>   		set_bit(DMF_SUSPENDED, &md->flags);
>> +		if (wait)
>> +			dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
>>   		set_bit(DMF_POST_SUSPENDING, &md->flags);
>>   		dm_table_postsuspend_targets(map);
>>   	}
>> -- 
>> 2.39.2
>>
> 
> .
>
  
Yu Kuai Jan. 31, 2024, 1:35 a.m. UTC | #3
Hi,

在 2024/01/30 21:05, Yu Kuai 写道:
> Hi,
> 
> 在 2024/01/30 19:46, Mikulas Patocka 写道:
>>
>>
>> On Tue, 30 Jan 2024, Yu Kuai wrote:
>>
>>> From: Yu Kuai <yukuai3@huawei.com>
>>>
>>> __dm_destroy() guarantee that device openers is zero, and then
>>> only call 'presuspend' and 'postsuspend' for the target. For
>>> request-based dm, 'md->holders' will be grabbed for each rq and
>>> __dm_destroy() will wait for 'md->holders' to be zero. However, for
>>> bio-based device, __dm_destroy() doesn't wait for all bios to be done.
>>>
>>> Fix this problem by calling dm_wait_for_completion() to wail for all
>>> inflight IO to be done, like what dm_suspend() does.
>>
>> If the number of openers is zero, it is guaranteed that there are no bios
>> in flight. Therefore, we don't have to wait for them.
>>
>> If there are bios in flight, it is a bug in the code that issues the 
>> bios.
>> You can put WARN_ON(dm_in_flight_bios(md)) there.
> 
> I add this patch because while testing, there is a problem that is
> hard to reporduce, as I mentioned in the other thread. I'll add BUG_ON()
> and try if I can still reporduce this problem without triggering it.
> 
> Thanks,
> Kuai
> 
> [12504.959682] BUG bio-296 (Not tainted): Object already free
> [12504.960239] 
> ----------------------------------------------------------------------------- 
> 
> [12504.960239]
> [12504.961209] Allocated in mempool_alloc+0xe8/0x270 age=30 cpu=1 
> pid=203288
> [12504.961905]  kmem_cache_alloc+0x36a/0x3b0
> [12504.962324]  mempool_alloc+0xe8/0x270
> [12504.962712]  bio_alloc_bioset+0x3b5/0x920
> [12504.963129]  bio_alloc_clone+0x3e/0x160
> [12504.963533]  alloc_io+0x3d/0x1f0
> [12504.963876]  dm_submit_bio+0x12f/0xa30
> [12504.964267]  __submit_bio+0x9c/0xe0
> [12504.964639]  submit_bio_noacct_nocheck+0x25a/0x570
> [12504.965136]  submit_bio_wait+0xc2/0x160
> [12504.965535]  blkdev_issue_zeroout+0x19b/0x2e0
> [12504.965991]  ext4_init_inode_table+0x246/0x560
> [12504.966462]  ext4_lazyinit_thread+0x750/0xbe0
> [12504.966922]  kthread+0x1b4/0x1f0

After adding the BUG_ON(), I can still reporducing this BUG, this really
looks like a BUG, and I don't think this is related to dm-raid. Perhaps
you guys can take a look?

Thanks,
Kuai

>>
>> Mikulas
>>
>>> Signed-off-by: Yu Kuai <yukuai3@huawei.com>
>>> ---
>>>   drivers/md/dm.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
>>> index 8dcabf84d866..2c0eae67d0f1 100644
>>> --- a/drivers/md/dm.c
>>> +++ b/drivers/md/dm.c
>>> @@ -58,6 +58,7 @@ static DEFINE_IDR(_minor_idr);
>>>   static DEFINE_SPINLOCK(_minor_lock);
>>>   static void do_deferred_remove(struct work_struct *w);
>>> +static int dm_wait_for_completion(struct mapped_device *md, unsigned 
>>> int task_state);
>>>   static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
>>> @@ -2495,6 +2496,8 @@ static void __dm_destroy(struct mapped_device 
>>> *md, bool wait)
>>>       if (!dm_suspended_md(md)) {
>>>           dm_table_presuspend_targets(map);
>>>           set_bit(DMF_SUSPENDED, &md->flags);
>>> +        if (wait)
>>> +            dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
>>>           set_bit(DMF_POST_SUSPENDING, &md->flags);
>>>           dm_table_postsuspend_targets(map);
>>>       }
>>> -- 
>>> 2.39.2
>>>
>>
>> .
>>
> 
> .
>
  

Patch

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 8dcabf84d866..2c0eae67d0f1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -58,6 +58,7 @@  static DEFINE_IDR(_minor_idr);
 static DEFINE_SPINLOCK(_minor_lock);
 
 static void do_deferred_remove(struct work_struct *w);
+static int dm_wait_for_completion(struct mapped_device *md, unsigned int task_state);
 
 static DECLARE_WORK(deferred_remove_work, do_deferred_remove);
 
@@ -2495,6 +2496,8 @@  static void __dm_destroy(struct mapped_device *md, bool wait)
 	if (!dm_suspended_md(md)) {
 		dm_table_presuspend_targets(map);
 		set_bit(DMF_SUSPENDED, &md->flags);
+		if (wait)
+			dm_wait_for_completion(md, TASK_UNINTERRUPTIBLE);
 		set_bit(DMF_POST_SUSPENDING, &md->flags);
 		dm_table_postsuspend_targets(map);
 	}