[v2,0/6] fix error flag covered by journal recovery

Message ID	20230210032044.146115-1-yebin@huaweicloud.com
Headers	Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; From: Ye Bin <yebin@huaweicloud.com> To: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org Cc: linux-kernel@vger.kernel.org, jack@suse.cz, Ye Bin <yebin10@huawei.com> Subject: [PATCH v2 0/6] fix error flag covered by journal recovery Date: Fri, 10 Feb 2023 11:20:38 +0800 Message-Id: <20230210032044.146115-1-yebin@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	fix error flag covered by journal recovery \| [v2,0/6] fix error flag covered by journal recovery [v2,1/6] jbd2: introduce callback for recovery journal [v2,2/6] ext4: introudce helper for jounral recover handle [v2,3/6] jbd2: do extra handle when do journal recovery [v2,4/6] ext4: remove backup for super block when recovery journal [v2,5/6] ext4: fix super block checksum error [v2,6/6] ext4: make sure fs error flag setted before clear journal error

Message

Ye Bin Feb. 10, 2023, 3:20 a.m. UTC

  From: Ye Bin <yebin10@huawei.com>

Diff v2 vs v1:
Move the calls to 'j_replay_prepare_callback' and 'j_replay_end_callback'
from ext4_load_journal() to jbd2_journal_recover().

During fault injection testing, the following issue was observed:
EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro

EXT4-fs (dm-5): recovery complete
EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro

Without any file system check in between, the file system appears clean on
the second mount. In theory, the kernel should never clear the fs error
flag on its own. In errors=remount-ro mode the last super block is written
directly to disk rather than through the journal, so the super block copy
in the journal is not up to date. During journal recovery, the up-to-date
super block is overwritten by the journal data. If every super block write
after journal recovery then fails, the file system error flag is lost, and
"fsck -a" will not do a deep repair of the file system.
To solve this issue, the super block needs extra handling during journal
recovery.
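
The sequence described above can be modelled in a few lines of userspace C.
This is only a toy sketch, not kernel code: the toy_* names and the reduced
struct are hypothetical, and only the EXT4_VALID_FS/EXT4_ERROR_FS bit values
are taken from ext4. It shows a direct super block write setting the error
flag, and a later journal replay overwriting it with the stale journaled
copy.

```c
#include <assert.h>
#include <stdint.h>

#define EXT4_VALID_FS 0x0001	/* same bit values as ext4's s_state flags */
#define EXT4_ERROR_FS 0x0002

/* Hypothetical, heavily reduced stand-in for the on-disk super block. */
struct toy_sb {
	uint16_t s_state;
};

/* errors=remount-ro path: the flag goes straight to disk, not via the journal. */
static void toy_commit_error_directly(struct toy_sb *disk)
{
	assert(disk);
	disk->s_state |= EXT4_ERROR_FS;
}

/* Journal replay: the journaled (older) copy overwrites the on-disk one. */
static void toy_replay(struct toy_sb *disk, const struct toy_sb *journaled)
{
	assert(disk && journaled);
	*disk = *journaled;
}

/* Returns nonzero if the error flag survives a direct write followed by replay. */
static int toy_flag_survives_replay(void)
{
	struct toy_sb journaled = { .s_state = EXT4_VALID_FS };
	struct toy_sb disk = journaled;

	toy_commit_error_directly(&disk);	/* disk now carries EXT4_ERROR_FS */
	toy_replay(&disk, &journaled);		/* the stale journaled copy wins */
	return (disk.s_state & EXT4_ERROR_FS) != 0;
}
```

In this model toy_flag_survives_replay() reports that the flag is gone,
which matches the "clean" second mount in the log above.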

Ye Bin (6):
  jbd2: introduce callback for recovery journal
  ext4: introudce helper for jounral recover handle
  jbd2: do extra handle when do journal recovery
  ext4: remove backup for super block when recovery journal
  ext4: fix super block checksum error
  ext4: make sure fs error flag setted before clear journal error

 fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/ext4_jbd2.h  |  2 ++
 fs/ext4/super.c      | 18 ++++--------
 fs/jbd2/recovery.c   | 27 ++++++++++++++++++
 include/linux/jbd2.h | 11 ++++++++
 5 files changed, 112 insertions(+), 12 deletions(-)

Comments

Jan Kara Feb. 10, 2023, 11:56 a.m. UTC | #1

Hello!

On Fri 10-02-23 11:20:38, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
> 
> Diff v2 vs v1:
> Move the calls to 'j_replay_prepare_callback' and 'j_replay_end_callback'
> from ext4_load_journal() to jbd2_journal_recover().
> 
> During fault injection testing, the following issue was observed:
> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
> 
> EXT4-fs (dm-5): recovery complete
> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
> 
> Without any file system check in between, the file system appears clean on
> the second mount. In theory, the kernel should never clear the fs error
> flag on its own. In errors=remount-ro mode the last super block is written
> directly to disk rather than through the journal, so the super block copy
> in the journal is not up to date. During journal recovery, the up-to-date
> super block is overwritten by the journal data. If every super block write
> after journal recovery then fails, the file system error flag is lost, and
> "fsck -a" will not do a deep repair of the file system.
> To solve this issue, the super block needs extra handling during journal
> recovery.

Thanks for the patches. Looking through them, I think this is a bit of
overengineering for the problem at hand. The only thing that is really
worth preserving so that it is not lost after journal replay is the error
information. So in ext4_load_journal() I would just save it if
EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
after journal replay. Sure, if the superblock write during journal replay
succeeds but the write restoring the error information fails, we will lose
the error information, but that is so unlikely in practice that I don't
think it is really worth complicating the code for it. Also, the only
downside is that we will lose the information that there is some error in
the filesystem - we'll soon find that out again anyway :).

								Honza
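
The save-and-restore approach suggested above could look roughly like the
following userspace sketch. It is not the code of any posted patch:
toy_sb, toy_replay() and toy_load_journal() are hypothetical stand-ins, and
only EXT4_ERROR_FS and the field names s_state and s_error_count mirror
real ext4 names.

```c
#include <assert.h>
#include <stdint.h>

#define EXT4_VALID_FS 0x0001	/* same bit values as ext4's s_state flags */
#define EXT4_ERROR_FS 0x0002

/* Hypothetical, heavily reduced stand-in for struct ext4_super_block. */
struct toy_sb {
	uint16_t s_state;
	uint32_t s_error_count;
};

/* Stand-in for journal replay: the journaled copy overwrites the super block. */
static void toy_replay(struct toy_sb *es, const struct toy_sb *journaled)
{
	*es = *journaled;
}

/*
 * Save the error information before replay and put it back afterwards.
 * Returns 1 if error information had to be restored, 0 otherwise.
 */
static int toy_load_journal(struct toy_sb *es, const struct toy_sb *journaled)
{
	struct toy_sb saved;
	int had_error;

	assert(es && journaled);
	had_error = (es->s_state & EXT4_ERROR_FS) != 0;
	saved = *es;			/* snapshot taken before replay */

	toy_replay(es, journaled);	/* may wipe the error flag */

	if (had_error) {
		es->s_state |= EXT4_ERROR_FS;
		es->s_error_count = saved.s_error_count;
	}
	return had_error;
}
```

In the kernel, the restored fields would still have to be written back to
disk, and that write is the step that can fail in the unlikely case
discussed above.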

> 
> Ye Bin (6):
>   jbd2: introduce callback for recovery journal
>   ext4: introudce helper for jounral recover handle
>   jbd2: do extra handle when do journal recovery
>   ext4: remove backup for super block when recovery journal
>   ext4: fix super block checksum error
>   ext4: make sure fs error flag setted before clear journal error
> 
>  fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
>  fs/ext4/ext4_jbd2.h  |  2 ++
>  fs/ext4/super.c      | 18 ++++--------
>  fs/jbd2/recovery.c   | 27 ++++++++++++++++++
>  include/linux/jbd2.h | 11 ++++++++
>  5 files changed, 112 insertions(+), 12 deletions(-)
> 
> -- 
> 2.31.1
>

Zhang Yi Feb. 10, 2023, 12:47 p.m. UTC | #2

On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
> 
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> Diff v2 vs v1:
>> Move the calls to 'j_replay_prepare_callback' and 'j_replay_end_callback'
>> from ext4_load_journal() to jbd2_journal_recover().
>>
>> During fault injection testing, the following issue was observed:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without any file system check in between, the file system appears clean on
>> the second mount. In theory, the kernel should never clear the fs error
>> flag on its own. In errors=remount-ro mode the last super block is written
>> directly to disk rather than through the journal, so the super block copy
>> in the journal is not up to date. During journal recovery, the up-to-date
>> super block is overwritten by the journal data. If every super block write
>> after journal recovery then fails, the file system error flag is lost, and
>> "fsck -a" will not do a deep repair of the file system.
>> To solve this issue, the super block needs extra handling during journal
>> recovery.
> 
> Thanks for the patches. Looking through them, I think this is a bit of
> overengineering for the problem at hand. The only thing that is really
> worth preserving so that it is not lost after journal replay is the error
> information. So in ext4_load_journal() I would just save it if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure, if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will lose
> the error information, but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also, the only
> downside is that we will lose the information that there is some error in
> the filesystem - we'll soon find that out again anyway :).
> 

I agree. Please also add an error message if restoring the error
information fails, so that we know what happened.

Thanks,
Yi.

Ye Bin Feb. 15, 2023, 1:14 a.m. UTC | #3

On 2023/2/10 19:56, Jan Kara wrote:
> Hello!
>
> On Fri 10-02-23 11:20:38, Ye Bin wrote:
>> From: Ye Bin <yebin10@huawei.com>
>>
>> Diff v2 vs v1:
>> Move the calls to 'j_replay_prepare_callback' and 'j_replay_end_callback'
>> from ext4_load_journal() to jbd2_journal_recover().
>>
>> During fault injection testing, the following issue was observed:
>> EXT4-fs (dm-5): warning: mounting fs with errors, running e2fsck is recommended
>> EXT4-fs (dm-5): Errors on filesystem, clearing orphan list.
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> EXT4-fs (dm-5): recovery complete
>> EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: data_err=abort,errors=remount-ro
>>
>> Without any file system check in between, the file system appears clean on
>> the second mount. In theory, the kernel should never clear the fs error
>> flag on its own. In errors=remount-ro mode the last super block is written
>> directly to disk rather than through the journal, so the super block copy
>> in the journal is not up to date. During journal recovery, the up-to-date
>> super block is overwritten by the journal data. If every super block write
>> after journal recovery then fails, the file system error flag is lost, and
>> "fsck -a" will not do a deep repair of the file system.
>> To solve this issue, the super block needs extra handling during journal
>> recovery.
> Thanks for the patches. Looking through them, I think this is a bit of
> overengineering for the problem at hand. The only thing that is really
> worth preserving so that it is not lost after journal replay is the error
> information. So in ext4_load_journal() I would just save it if
> EXT4_ERROR_FS is set in es->s_state before journal replay and restore it
> after journal replay. Sure, if the superblock write during journal replay
> succeeds but the write restoring the error information fails, we will lose
> the error information, but that is so unlikely in practice that I don't
> think it is really worth complicating the code for it. Also, the only
> downside is that we will lose the information that there is some error in
> the filesystem - we'll soon find that out again anyway :).
>
> 								Honza

Yes, this solution seems a little cumbersome, but it is the only one I
could think of to solve the loss of error information.
I re-analyzed the issue scenario. The error information was not recorded
in the last journaled super block, so the error flag is not updated when
the super block is subsequently written. However, when the orphan list is
processed, the file system errors are recorded in memory and the orphan
list is cleared directly, resulting in file system inconsistencies. To
solve the above issue, I have sent a V3 patch.

>> Ye Bin (6):
>>    jbd2: introduce callback for recovery journal
>>    ext4: introudce helper for jounral recover handle
>>    jbd2: do extra handle when do journal recovery
>>    ext4: remove backup for super block when recovery journal
>>    ext4: fix super block checksum error
>>    ext4: make sure fs error flag setted before clear journal error
>>
>>   fs/ext4/ext4_jbd2.c  | 66 ++++++++++++++++++++++++++++++++++++++++++++
>>   fs/ext4/ext4_jbd2.h  |  2 ++
>>   fs/ext4/super.c      | 18 ++++--------
>>   fs/jbd2/recovery.c   | 27 ++++++++++++++++++
>>   include/linux/jbd2.h | 11 ++++++++
>>   5 files changed, 112 insertions(+), 12 deletions(-)
>>
>> -- 
>> 2.31.1
>>