diff mbox series

[v5,1/2] md/raid10: fix incorrect done of recovery

Message ID	20230601062424.3613218-2-linan666@huaweicloud.com
State	New
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; From: linan666@huaweicloud.com To: song@kernel.org, neilb@suse.de Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan122@huawei.com, yukuai3@huawei.com, yi.zhang@huawei.com, houtao1@huawei.com, yangerkun@huawei.com Subject: [PATCH v5 1/2] md/raid10: fix incorrect done of recovery Date: Thu, 1 Jun 2023 14:24:23 +0800 Message-Id: <20230601062424.3613218-2-linan666@huaweicloud.com> In-Reply-To: <20230601062424.3613218-1-linan666@huaweicloud.com> References: <20230601062424.3613218-1-linan666@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	raid10 bugfix \| [v5,0/2] raid10 bugfix [v5,1/2] md/raid10: fix incorrect done of recovery [v5,2/2] md/raid10: fix io loss while replacement replace rdev

Commit Message

Li Nan June 1, 2023, 6:24 a.m. UTC

  From: Li Nan <linan122@huawei.com>

In raid10_sync_request(), if data cannot be read from any disk for
recovery, it will go to 'giveup' and let 'chunks_skipped' + 1. After
multiple 'giveup', when 'chunks_skipped >= geo.raid_disks', it will
return 'max_sector', indicating that the recovery has been completed.
However, the recovery is just aborted and the data remains inconsistent.

Fix it by setting mirror->recovery_disabled, which will prevent the spare
disk from being added to this mirror. The same issue also exists during
resync, it will be fixed afterwards.

Signed-off-by: Li Nan <linan122@huawei.com>
---
 drivers/md/raid10.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

Comments

Paul Menzel June 1, 2023, 7:06 a.m. UTC | #1

Dear Li,


Thank you for your patch.

Am 01.06.23 um 08:24 schrieb linan666@huaweicloud.com:
> From: Li Nan <linan122@huawei.com>

Unfortunately, I do not understand your commit message summary “fix 
incorrect done of recovery”. Maybe:

Do not add sparse disk when recovery aborts

> In raid10_sync_request(), if data cannot be read from any disk for
> recovery, it will go to 'giveup' and let 'chunks_skipped' + 1. After
> multiple 'giveup', when 'chunks_skipped >= geo.raid_disks', it will
> return 'max_sector', indicating that the recovery has been completed.
> However, the recovery is just aborted and the data remains inconsistent.
> 
> Fix it by setting mirror->recovery_disabled, which will prevent the spare
> disk from being added to this mirror. The same issue also exists during
> resync, it will be fixed afterwards.
> 
> Signed-off-by: Li Nan <linan122@huawei.com>
> ---
>   drivers/md/raid10.c | 18 +++++++++++++++++-
>   1 file changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
> index d93d8cb2b620..3ba1516ea160 100644
> --- a/drivers/md/raid10.c
> +++ b/drivers/md/raid10.c
> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   	int chunks_skipped = 0;
>   	sector_t chunk_mask = conf->geo.chunk_mask;
>   	int page_idx = 0;
> +	int error_disk = -1;
>   
>   	/*
>   	 * Allow skipping a full rebuild for incremental assembly
> @@ -3386,7 +3387,20 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   		return reshape_request(mddev, sector_nr, skipped);
>   
>   	if (chunks_skipped >= conf->geo.raid_disks) {
> -		/* if there has been nothing to do on any drive,
> +		pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
> +			test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?  "resync" : "recovery");
> +		if (error_disk >= 0 &&
> +		    !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
> +			/*
> +			 * recovery fail, set mirrors.recovory_disabled,

recov*e*ry

> +			 * device shouldn't be added to there.
> +			 */
> +			conf->mirrors[error_disk].recovery_disabled =
> +						mddev->recovery_disabled;
> +			return 0;
> +		}
> +		/*
> +		 * if there has been nothing to do on any drive,
>   		 * then there is nothing to do at all..

Just one dot/period at the end?

>   		 */
>   		*skipped = 1;
> @@ -3638,6 +3652,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
>   						       mdname(mddev));
>   					mirror->recovery_disabled
>   						= mddev->recovery_disabled;
> +				} else {
> +					error_disk = i;
>   				}
>   				put_buf(r10_bio);
>   				if (rb2)


Kind regards,

Paul

Li Nan June 2, 2023, 6:57 a.m. UTC | #2

在 2023/6/1 15:06, Paul Menzel 写道:
> Dear Li,
> 
> 
> Thank you for your patch.
> 
> Am 01.06.23 um 08:24 schrieb linan666@huaweicloud.com:
>> From: Li Nan <linan122@huawei.com>
> 
> Unfortunately, I do not understand your commit message summary “fix 
> incorrect done of recovery”. Maybe:
> 
> Do not add sparse disk when recovery aborts
> 

"recovery fail" is better?

>> In raid10_sync_request(), if data cannot be read from any disk for
>> recovery, it will go to 'giveup' and let 'chunks_skipped' + 1. After
>> multiple 'giveup', when 'chunks_skipped >= geo.raid_disks', it will
>> return 'max_sector', indicating that the recovery has been completed.
>> However, the recovery is just aborted and the data remains inconsistent.
>>
>> Fix it by setting mirror->recovery_disabled, which will prevent the spare
>> disk from being added to this mirror. The same issue also exists during
>> resync, it will be fixed afterwards.
>>
>> Signed-off-by: Li Nan <linan122@huawei.com>
>> ---
>>   drivers/md/raid10.c | 18 +++++++++++++++++-
>>   1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
>> index d93d8cb2b620..3ba1516ea160 100644
>> --- a/drivers/md/raid10.c
>> +++ b/drivers/md/raid10.c
>> @@ -3303,6 +3303,7 @@ static sector_t raid10_sync_request(struct mddev 
>> *mddev, sector_t sector_nr,
>>       int chunks_skipped = 0;
>>       sector_t chunk_mask = conf->geo.chunk_mask;
>>       int page_idx = 0;
>> +    int error_disk = -1;
>>       /*
>>        * Allow skipping a full rebuild for incremental assembly
>> @@ -3386,7 +3387,20 @@ static sector_t raid10_sync_request(struct 
>> mddev *mddev, sector_t sector_nr,
>>           return reshape_request(mddev, sector_nr, skipped);
>>       if (chunks_skipped >= conf->geo.raid_disks) {
>> -        /* if there has been nothing to do on any drive,
>> +        pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
>> +            test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?  "resync" 
>> : "recovery");
>> +        if (error_disk >= 0 &&
>> +            !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
>> +            /*
>> +             * recovery fail, set mirrors.recovory_disabled,
> 
> recov*e*ry
> 
>> +             * device shouldn't be added to there.
>> +             */
>> +            conf->mirrors[error_disk].recovery_disabled =
>> +                        mddev->recovery_disabled;
>> +            return 0;
>> +        }
>> +        /*
>> +         * if there has been nothing to do on any drive,
>>            * then there is nothing to do at all..
> 
> Just one dot/period at the end?
>
Thanks for your suggestion. I will change it in next version.

Paul Menzel June 2, 2023, 8:33 a.m. UTC | #3

Dear Li Nan,


Am 02.06.23 um 08:57 schrieb Li Nan:

> 在 2023/6/1 15:06, Paul Menzel 写道:

>> Am 01.06.23 um 08:24 schrieb linan666@huaweicloud.com:
>>> From: Li Nan <linan122@huawei.com>
>>
>> Unfortunately, I do not understand your commit message summary “fix 
>> incorrect done of recovery”. Maybe:
>>
>> Do not add sparse disk when recovery aborts
> 
> "recovery fail" is better?

I think the grammar is incorrect, and it should be fail*s*.

[…]


Kind regards,

Paul

Li Nan June 2, 2023, 9:23 a.m. UTC | #4

在 2023/6/2 16:33, Paul Menzel 写道:
> Dear Li Nan,
> 
> 
> Am 02.06.23 um 08:57 schrieb Li Nan:
> 
>> 在 2023/6/1 15:06, Paul Menzel 写道:
> 
>>> Am 01.06.23 um 08:24 schrieb linan666@huaweicloud.com:
>>>> From: Li Nan <linan122@huawei.com>
>>>
>>> Unfortunately, I do not understand your commit message summary “fix 
>>> incorrect done of recovery”. Maybe:
>>>
>>> Do not add sparse disk when recovery aborts
>>
>> "recovery fail" is better?
> 
> I think the grammar is incorrect, and it should be fail*s*.
> 

fix in v7. :)

diff mbox series

Patch

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index d93d8cb2b620..3ba1516ea160 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3303,6 +3303,7 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 	int chunks_skipped = 0;
 	sector_t chunk_mask = conf->geo.chunk_mask;
 	int page_idx = 0;
+	int error_disk = -1;
 
 	/*
 	 * Allow skipping a full rebuild for incremental assembly
@@ -3386,7 +3387,20 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		return reshape_request(mddev, sector_nr, skipped);
 
 	if (chunks_skipped >= conf->geo.raid_disks) {
-		/* if there has been nothing to do on any drive,
+		pr_err("md/raid10:%s: %s fail\n", mdname(mddev),
+			test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ?  "resync" : "recovery");
+		if (error_disk >= 0 &&
+		    !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
+			/*
+			 * recovery fail, set mirrors.recovory_disabled,
+			 * device shouldn't be added to there.
+			 */
+			conf->mirrors[error_disk].recovery_disabled =
+						mddev->recovery_disabled;
+			return 0;
+		}
+		/*
+		 * if there has been nothing to do on any drive,
 		 * then there is nothing to do at all..
 		 */
 		*skipped = 1;
@@ -3638,6 +3652,8 @@  static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 						       mdname(mddev));
 					mirror->recovery_disabled
 						= mddev->recovery_disabled;
+				} else {
+					error_disk = i;
 				}
 				put_buf(r10_bio);
 				if (rb2)