[v3,1/2] ext4: commit super block if fs record error when journal record without error
Message ID | 20230214022905.765088-2-yebin@huaweicloud.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp2708108wrn; Mon, 13 Feb 2023 18:16:22 -0800 (PST) X-Google-Smtp-Source: AK7set8NH6Xa9aSYkDIZUi+Wx8t4DbOV+gMMcZ3i/DPPIT1arbi1AOsJ8UzPKihlVQ/+u5dZej3T X-Received: by 2002:a62:1a05:0:b0:593:ed9c:9f07 with SMTP id a5-20020a621a05000000b00593ed9c9f07mr417147pfa.27.1676340982653; Mon, 13 Feb 2023 18:16:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1676340982; cv=none; d=google.com; s=arc-20160816; b=DGGTrjO9ToxU07exe9/McCLmJ9LXHVK09NYxiR1/FZaKw0xyFjUJCEck3LiL05c/5l agys7tfTlRSPU07PzxNnoZvbHNaKXkRCyHIz44Pkh0n9zRiuVmnFAkoBjqzoi2/HKVCW oYPg63MAsYlLogtDl6W71Kx1rD1GDkcBJ+NIUoWLY6OjVk/xENeEXq03GIlo2boGof7y qQpr8r++QmWkYYS6ivptDitVr4/+jJ0NZzOvYKQVucxo3dCK5qkUI6uBS2qtjXd5Yd5T pv6ZDd0aErn/rP1kTpJJBA9dkoB5B226dJ0Xx5K26dlJk08p7jnzpmybLKoLnE7YMS3i l21Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=tKYy2SDFiNcLF6buKGmqFbAlMORNzamYAW+pNZFXv1c=; b=rJE5aF+fEuiK3hexI3l7tRQlyELiRXWG+HMkz7bcYQSKvYAv7w74Pjgwq72X9KYRu5 YjfGTQcNXtN7+eK7vGomZ+DqUScLEyF1qGPUq9Zs/f4QYMQgVivMr8fHzSHbWwxJfZRH ++jp+0kcZl4kzBdLWYJWoRgFpNLCnquJweGUckrpmWjzSnOC43oj/TllI14wyxdRq5fb VCXs+N1A42L1JqvFlbPj7EmeuQ79A7UBEhequVo4Jr+tZoaM6WpvKIgnbSWkIBWVbs2j Ue4GM8iA2fA2B8JUkK7WW0gPDPGzVX4BYqZxZxW1EwjLtZ7Y8Bq1cWh87QpqssAVS/2r 8gPw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v129-20020a622f87000000b0059d96ba730dsi11868767pfv.110.2023.02.13.18.16.09; Mon, 13 Feb 2023 18:16:22 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231224AbjBNCFg (ORCPT <rfc822;tebrre53rla2o@gmail.com> + 99 others); Mon, 13 Feb 2023 21:05:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229934AbjBNCFc (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 13 Feb 2023 21:05:32 -0500 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C83D113E2; Mon, 13 Feb 2023 18:05:29 -0800 (PST) Received: from mail02.huawei.com (unknown [172.30.67.143]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4PG4Kv6bmNz4f3jHy; Tue, 14 Feb 2023 10:05:23 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.127.227]) by APP3 (Coremail) with SMTP id _Ch0CgCnUiBk7OpjvrkRDQ--.53521S5; Tue, 14 Feb 2023 10:05:26 +0800 (CST) From: Ye Bin <yebin@huaweicloud.com> To: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org Cc: linux-kernel@vger.kernel.org, jack@suse.cz, Ye Bin <yebin10@huawei.com> Subject: [PATCH v3 1/2] ext4: commit super block if fs record error when journal record without error Date: Tue, 14 Feb 2023 10:29:04 +0800 Message-Id: <20230214022905.765088-2-yebin@huaweicloud.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230214022905.765088-1-yebin@huaweicloud.com> References: <20230214022905.765088-1-yebin@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-CM-TRANSID: _Ch0CgCnUiBk7OpjvrkRDQ--.53521S5 X-Coremail-Antispam: 1UD129KBjvJXoWrtrWUJry7uF48tr48ZFWrZrb_yoW8Jr48p3 95Arn7AFZ0vF17CwsrJwsrXFyvg34Fka4UWr1Sk3Z3Aa9xtr9IvrZ8tF15AFWjgrW8Ww1r X34UGay7G3s5Kr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUvGb4IE77IF4wAFF20E14v26r4j6ryUM7CY07I20VC2zVCF04k2 6cxKx2IYs7xG6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUGw A2048vs2IY020Ec7CjxVAFwI0_JFI_Gr1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxS w2x7M28EF7xvwVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxV W8Jr0_Cr1UM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6rxl6s0DM2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMc Ij6xIIjxv20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_ Jr0_Gr1lF7xvr2IYc2Ij64vIr41l42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr 0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY 17CE14v26r126r1DMIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Jr0_JF4lIxAIcV C0I7IYx2IY6xkF7I0E14v26r4j6F4UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY 6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa 73UjIFyTuYvjxUzoGQUUUUU X-CM-SenderInfo: p1hex046kxt4xhlfz01xgou0bp/ X-CFilter-Loop: Reflected X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,KHOP_HELO_FCRDNS, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1757770922354289951?= X-GMAIL-MSGID: =?utf-8?q?1757770922354289951?= |
Series |
fix error flag covered by journal recovery
|
|
Commit Message
Ye Bin
Feb. 14, 2023, 2:29 a.m. UTC
From: Ye Bin <yebin10@huawei.com> Now, 'es->s_state' maybe covered by recover journal. And journal errno maybe not recorded in journal sb as IO error. ext4_update_super() only update error information when 'sbi->s_add_error_count' large than zero. Then 'EXT4_ERROR_FS' flag maybe lost. To solve above issue commit error information after recover journal. Signed-off-by: Ye Bin <yebin10@huawei.com> --- fs/ext4/super.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
Comments
On 2023/2/14 10:29, Ye Bin wrote: > From: Ye Bin <yebin10@huawei.com> > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > maybe not recorded in journal sb as IO error. ext4_update_super() only > update error information when 'sbi->s_add_error_count' large than zero. > Then 'EXT4_ERROR_FS' flag maybe lost. > To solve above issue commit error information after recover journal. > > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/ext4/super.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index dc3907dff13a..b94754ba8556 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > goto err_out; > } > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > + err = ext4_commit_super(sb); > + if (err) { > + ext4_msg(sb, KERN_ERR, > + "Failed to commit error information, please repair fs force!"); > + goto err_out; > + } > + } > + > EXT4_SB(sb)->s_journal = journal; > err = ext4_clear_journal_err(sb, es); > if (err) { I think we don't need such a complicated judgment, after the journal replay and saving the error info, if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just add this flag directly to es->s_state. This way the EXT4_ERROR_FS flag and the error message will be written to disk the next time ext4_commit_super() is executed. The code change is as follows: diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 260c1b3e3ef2..341b11c589b3 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block *sb, memcpy(((char *) es) + EXT4_S_ERR_START, save, EXT4_S_ERR_LEN); kfree(save); + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS); } if (err) {
On 2023/2/16 15:17, Baokun Li wrote: > On 2023/2/14 10:29, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index dc3907dff13a..b94754ba8556 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >> super_block *sb, >> goto err_out; >> } >> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >> + err = ext4_commit_super(sb); >> + if (err) { >> + ext4_msg(sb, KERN_ERR, >> + "Failed to commit error information, please repair >> fs force!"); >> + goto err_out; >> + } >> + } >> + >> EXT4_SB(sb)->s_journal = journal; >> err = ext4_clear_journal_err(sb, es); >> if (err) { > I think we don't need such a complicated judgment, after the journal > replay and saving the error info, > if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just > add this flag directly to es->s_state. > This way the EXT4_ERROR_FS flag and the error message will be written > to disk the next time Thanks for your suggestion. There are two reasons for this: 1. We want to write the error mark to the disk as soon as possible. 2. Here we deal with the case where there is no error mark bit but there is an error record. In this case, the file system should be marked with an error and the user should be prompted. > ext4_commit_super() is executed. The code change is as follows: > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 260c1b3e3ef2..341b11c589b3 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block > *sb, > memcpy(((char *) es) + EXT4_S_ERR_START, > save, EXT4_S_ERR_LEN); > kfree(save); > + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state > & EXT4_ERROR_FS); > } > > if (err) { >
On 2023/2/16 15:44, yebin (H) wrote: > > > On 2023/2/16 15:17, Baokun Li wrote: >> On 2023/2/14 10:29, Ye Bin wrote: >>> From: Ye Bin <yebin10@huawei.com> >>> >>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>> update error information when 'sbi->s_add_error_count' large than zero. >>> Then 'EXT4_ERROR_FS' flag maybe lost. >>> To solve above issue commit error information after recover journal. >>> >>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>> --- >>> fs/ext4/super.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index dc3907dff13a..b94754ba8556 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >>> super_block *sb, >>> goto err_out; >>> } >>> + if (unlikely(es->s_error_count && >>> !jbd2_journal_errno(journal) && >>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>> + err = ext4_commit_super(sb); >>> + if (err) { >>> + ext4_msg(sb, KERN_ERR, >>> + "Failed to commit error information, please repair >>> fs force!"); >>> + goto err_out; >>> + } >>> + } >>> + >>> EXT4_SB(sb)->s_journal = journal; >>> err = ext4_clear_journal_err(sb, es); >>> if (err) { >> I think we don't need such a complicated judgment, after the journal >> replay and saving the error info, >> if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just >> add this flag directly to es->s_state. >> This way the EXT4_ERROR_FS flag and the error message will be written >> to disk the next time > > Thanks for your suggestion. There are two reasons for this: > 1. We want to write the error mark to the disk as soon as possible. > 2. Here we deal with the case where there is no error mark bit but > there is an error record. > In this case, the file system should be marked with an error and the > user should be prompted. The EXT4_ERROR_FS flag is always written to disk at the same time as the error info, except when the journal is replayed. During journal replay the error info is additionally copied to disk first, and the EXT4_ERROR_FS flag in the sbi is not written to disk until the ext4_put_super() is called. It is only when a failure occurs during this time that there is an error info but no EXT4_ERROR_FS flag. So we just need to make sure that the EXT4_ERROR_FS flag is also written to disk at the same time as the error info after the journal replay. >> ext4_commit_super() is executed. The code change is as follows: >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index 260c1b3e3ef2..341b11c589b3 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block >> *sb, >> memcpy(((char *) es) + EXT4_S_ERR_START, >> save, EXT4_S_ERR_LEN); >> kfree(save); >> + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state >> & EXT4_ERROR_FS); >> } >> >> if (err) { >> > > >
On 2023/2/16 17:17, Baokun Li wrote: > On 2023/2/16 15:44, yebin (H) wrote: >> >> >> On 2023/2/16 15:17, Baokun Li wrote: >>> On 2023/2/14 10:29, Ye Bin wrote: >>>> From: Ye Bin <yebin10@huawei.com> >>>> >>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>> update error information when 'sbi->s_add_error_count' large than >>>> zero. >>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>> To solve above issue commit error information after recover journal. >>>> >>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>> --- >>>> fs/ext4/super.c | 12 ++++++++++++ >>>> 1 file changed, 12 insertions(+) >>>> >>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>> index dc3907dff13a..b94754ba8556 100644 >>>> --- a/fs/ext4/super.c >>>> +++ b/fs/ext4/super.c >>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >>>> super_block *sb, >>>> goto err_out; >>>> } >>>> + if (unlikely(es->s_error_count && >>>> !jbd2_journal_errno(journal) && >>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>> + err = ext4_commit_super(sb); >>>> + if (err) { >>>> + ext4_msg(sb, KERN_ERR, >>>> + "Failed to commit error information, please >>>> repair fs force!"); >>>> + goto err_out; >>>> + } >>>> + } >>>> + >>>> EXT4_SB(sb)->s_journal = journal; >>>> err = ext4_clear_journal_err(sb, es); >>>> if (err) { >>> I think we don't need such a complicated judgment, after the journal >>> replay and saving the error info, >>> if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just >>> add this flag directly to es->s_state. >>> This way the EXT4_ERROR_FS flag and the error message will be >>> written to disk the next time >> >> Thanks for your suggestion. There are two reasons for this: >> 1. We want to write the error mark to the disk as soon as possible. >> 2. Here we deal with the case where there is no error mark bit but >> there is an error record. >> In this case, the file system should be marked with an error and the >> user should be prompted. > The EXT4_ERROR_FS flag is always written to disk at the same time as > the error info, > except when the journal is replayed. During journal replay the error > info is additionally > copied to disk first, and the EXT4_ERROR_FS flag in the sbi is not > written to disk until > the ext4_put_super() is called. It is only when a failure occurs > during this time that > there is an error info but no EXT4_ERROR_FS flag. So we just need to > make sure that > the EXT4_ERROR_FS flag is also written to disk at the same time as the > error info > after the journal replay. The situation you said is based on the situation after the repair. What about the existing image with such inconsistency? >>> ext4_commit_super() is executed. The code change is as follows: >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index 260c1b3e3ef2..341b11c589b3 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct >>> super_block *sb, >>> memcpy(((char *) es) + EXT4_S_ERR_START, >>> save, EXT4_S_ERR_LEN); >>> kfree(save); >>> + es->s_state |= >>> cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS); >>> } >>> >>> if (err) { >>> >> >> >>
On Tue 14-02-23 10:29:04, Ye Bin wrote: > From: Ye Bin <yebin10@huawei.com> > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > maybe not recorded in journal sb as IO error. ext4_update_super() only > update error information when 'sbi->s_add_error_count' large than zero. > Then 'EXT4_ERROR_FS' flag maybe lost. > To solve above issue commit error information after recover journal. > > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/ext4/super.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index dc3907dff13a..b94754ba8556 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > goto err_out; > } > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > + err = ext4_commit_super(sb); > + if (err) { > + ext4_msg(sb, KERN_ERR, > + "Failed to commit error information, please repair fs force!"); > + goto err_out; > + } > + } > + Hum, I'm not sure I follow here. If journal replay has overwritten the superblock (and thus the stored error info), then I'd expect es->s_error_count got overwritten (possibly to 0) as well. And this is actually relatively realistic scenario with errors=remount-ro behavior when the first fs error happens. What I intended in my original suggestion was to save es->s_error_count, es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before doing journal replay in ext4_load_journal() and then after journal replay merge this info back to the superblock - if EXT4_ERROR_FS was set, set it now as well, take max of old and new s_error_count, set s_first_error_* if it is now unset, set s_last_error_* if stored timestamp is newer than current timestamp. Or am I overengineering it now? :) Honza
On 2023/2/17 1:31, Jan Kara wrote: > On Tue 14-02-23 10:29:04, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) > Hum, I'm not sure I follow here. If journal replay has overwritten the > superblock (and thus the stored error info), then I'd expect > es->s_error_count got overwritten (possibly to 0) as well. And this is > actually relatively realistic scenario with errors=remount-ro behavior when > the first fs error happens. > > What I intended in my original suggestion was to save es->s_error_count, > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > doing journal replay in ext4_load_journal() and then after journal replay > merge this info back to the superblock - if EXT4_ERROR_FS was set, set it > now as well, take max of old and new s_error_count, set s_first_error_* if > it is now unset, set s_last_error_* if stored timestamp is newer than > current timestamp. > > Or am I overengineering it now? :) > > Honza This is exactly how the code is designed now! The code has now saved all the above information except EXT4_ERROR_FS by the following two pieces of logic, as follows: ---------------- In struct ext4_super_block ---------------- 1412 #define EXT4_S_ERR_START offsetof(struct ext4_super_block, s_error_count) 1413 __le32 s_error_count; /* number of fs errors */ 1414 __le32 s_first_error_time; /* first time an error happened */ 1415 __le32 s_first_error_ino; /* inode involved in first error */ 1416 __le64 s_first_error_block; /* block involved of first error */ 1417 __u8 s_first_error_func[32] __nonstring; /* function where the error happened */ 1418 __le32 s_first_error_line; /* line number where error happened */ 1419 __le32 s_last_error_time; /* most recent time of an error */ 1420 __le32 s_last_error_ino; /* inode involved in last error */ 1421 __le32 s_last_error_line; /* line number where error happened */ 1422 __le64 s_last_error_block; /* block involved of last error */ 1423 __u8 s_last_error_func[32] __nonstring; /* function where the error happened */ 1424 #define EXT4_S_ERR_END offsetof(struct ext4_super_block, s_mount_opts) ----------------------------------------------------------- ---------------- In ext4_load_journal() ---------------- 5929 char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL); 5930 if (save) 5931 memcpy(save, ((char *) es) + 5932 EXT4_S_ERR_START, EXT4_S_ERR_LEN); 5933 err = jbd2_journal_load(journal); 5934 if (save) 5935 memcpy(((char *) es) + EXT4_S_ERR_START, 5936 save, EXT4_S_ERR_LEN); 5937 kfree(save); -------------------------------------------------------- As you said, we should also save EXT4_ERROR_FS to es->s_state. But we are not saving this now, so I think we just need to add `es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS);` to save the possible EXT4_ERROR_FS flag after copying the error area data to es.
On 2023/2/17 1:31, Jan Kara wrote: > On Tue 14-02-23 10:29:04, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index dc3907dff13a..b94754ba8556 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >> goto err_out; >> } >> >> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >> + err = ext4_commit_super(sb); >> + if (err) { >> + ext4_msg(sb, KERN_ERR, >> + "Failed to commit error information, please repair fs force!"); >> + goto err_out; >> + } >> + } >> + > Hum, I'm not sure I follow here. If journal replay has overwritten the > superblock (and thus the stored error info), then I'd expect > es->s_error_count got overwritten (possibly to 0) as well. And this is > actually relatively realistic scenario with errors=remount-ro behavior when > the first fs error happens. > > What I intended in my original suggestion was to save es->s_error_count, > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > doing journal replay in ext4_load_journal() and then after journal replay > merge this info back to the superblock Actually,commit 1c13d5c08728 ("ext4: Save error information to the superblock for analysis") already merged error info back to the superblock after journal replay except 'es->s_state'. The problem I have now is that the error flag in the journal superblock was not recorded, but the error message was recorded in the superblock. So it leads to ext4_clear_journal_err() does not detect errors and marks the file system as an error. Because ext4_update_super() is only set error flag when 'sbi->s_add_error_count > 0'. Although 'sbi->s_mount_state' is written to the super block when umount, but it is also conditional. So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) && !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i prefer to mark fs as error if it contain detail error info without EXT4_ERROR_FS. > - if EXT4_ERROR_FS was set, set it > now as well, take max of old and new s_error_count, set s_first_error_* if > it is now unset, set s_last_error_* if stored timestamp is newer than > current timestamp. > > Or am I overengineering it now? :) > > Honza
On Fri 17-02-23 09:44:57, yebin (H) wrote: > On 2023/2/17 1:31, Jan Kara wrote: > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > From: Ye Bin <yebin10@huawei.com> > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > update error information when 'sbi->s_add_error_count' large than zero. > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > To solve above issue commit error information after recover journal. > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > --- > > > fs/ext4/super.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > index dc3907dff13a..b94754ba8556 100644 > > > --- a/fs/ext4/super.c > > > +++ b/fs/ext4/super.c > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > goto err_out; > > > } > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > + err = ext4_commit_super(sb); > > > + if (err) { > > > + ext4_msg(sb, KERN_ERR, > > > + "Failed to commit error information, please repair fs force!"); > > > + goto err_out; > > > + } > > > + } > > > + > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > superblock (and thus the stored error info), then I'd expect > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > actually relatively realistic scenario with errors=remount-ro behavior when > > the first fs error happens. > > > > What I intended in my original suggestion was to save es->s_error_count, > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > doing journal replay in ext4_load_journal() and then after journal replay > > merge this info back to the superblock > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > superblock for analysis") > already merged error info back to the superblock after journal replay except > 'es->s_state'. > The problem I have now is that the error flag in the journal superblock was > not recorded, > but the error message was recorded in the superblock. So it leads to > ext4_clear_journal_err() > does not detect errors and marks the file system as an error. Because > ext4_update_super() is > only set error flag when 'sbi->s_add_error_count > 0'. Although > 'sbi->s_mount_state' is > written to the super block when umount, but it is also conditional. > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > && > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > prefer to mark fs as error if it contain detail error info without > EXT4_ERROR_FS. Aha, thanks for explanation! So now I finally understand what the problem exactly is. I'm not sure relying on es->s_error_count is too good. Probably it works but I'd be calmer if when saving error info we also did: bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); copy other info err = jbd2_journal_load(journal); restore other info if (error_fs) es->s_state |= cpu_to_le16(EXT4_ERROR_FS); /* Write out restored error information to the superblock */ err2 = ext4_commit_super(sb); and be done with this. I don't think trying to optimize away the committing of the superblock when we had to replay the journal is really worth it. Does this solve your concerns? Honza
On 2023/2/17 18:56, Jan Kara wrote: > On Fri 17-02-23 09:44:57, yebin (H) wrote: >> On 2023/2/17 1:31, Jan Kara wrote: >>> On Tue 14-02-23 10:29:04, Ye Bin wrote: >>>> From: Ye Bin <yebin10@huawei.com> >>>> >>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>> update error information when 'sbi->s_add_error_count' large than zero. >>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>> To solve above issue commit error information after recover journal. >>>> >>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>> --- >>>> fs/ext4/super.c | 12 ++++++++++++ >>>> 1 file changed, 12 insertions(+) >>>> >>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>> index dc3907dff13a..b94754ba8556 100644 >>>> --- a/fs/ext4/super.c >>>> +++ b/fs/ext4/super.c >>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >>>> goto err_out; >>>> } >>>> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>> + err = ext4_commit_super(sb); >>>> + if (err) { >>>> + ext4_msg(sb, KERN_ERR, >>>> + "Failed to commit error information, please repair fs force!"); >>>> + goto err_out; >>>> + } >>>> + } >>>> + >>> Hum, I'm not sure I follow here. If journal replay has overwritten the >>> superblock (and thus the stored error info), then I'd expect >>> es->s_error_count got overwritten (possibly to 0) as well. And this is >>> actually relatively realistic scenario with errors=remount-ro behavior when >>> the first fs error happens. >>> >>> What I intended in my original suggestion was to save es->s_error_count, >>> es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before >>> doing journal replay in ext4_load_journal() and then after journal replay >>> merge this info back to the superblock >> Actually,commit 1c13d5c08728 ("ext4: Save error information to the >> superblock for analysis") >> already merged error info back to the superblock after journal replay except >> 'es->s_state'. >> The problem I have now is that the error flag in the journal superblock was >> not recorded, >> but the error message was recorded in the superblock. So it leads to >> ext4_clear_journal_err() >> does not detect errors and marks the file system as an error. Because >> ext4_update_super() is >> only set error flag when 'sbi->s_add_error_count > 0'. Although >> 'sbi->s_mount_state' is >> written to the super block when umount, but it is also conditional. >> So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) >> && >> !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store >> 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i >> prefer to mark fs as error if it contain detail error info without >> EXT4_ERROR_FS. > Aha, thanks for explanation! So now I finally understand what the problem > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > it works but I'd be calmer if when saving error info we also did: > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > copy other info > err = jbd2_journal_load(journal); > restore other info > if (error_fs) > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > /* Write out restored error information to the superblock */ > err2 = ext4_commit_super(sb); > > and be done with this. I don't think trying to optimize away the committing > of the superblock when we had to replay the journal is really worth it. > > Does this solve your concerns? > > Honza Thanks for your suggestion. I think if journal super block has 'j_errno' ext4_clear_journal_err() will commit error info. The scenario we need to deal with is:(1) journal super block has no 'j_errno'; (2) super block has detail error info, but 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has no error flag and the newest super block has error flag. so we need to store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' has 'EXT4_ERROR_FS', it means super block in journal has error flag, so we do not need to store error flag in super block. I don't know if I can explain my idea of repair.
On Sat 18-02-23 10:18:42, yebin (H) wrote: > On 2023/2/17 18:56, Jan Kara wrote: > > On Fri 17-02-23 09:44:57, yebin (H) wrote: > > > On 2023/2/17 1:31, Jan Kara wrote: > > > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > > > From: Ye Bin <yebin10@huawei.com> > > > > > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > > > update error information when 'sbi->s_add_error_count' large than zero. > > > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > > > To solve above issue commit error information after recover journal. > > > > > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > > > --- > > > > > fs/ext4/super.c | 12 ++++++++++++ > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > > > index dc3907dff13a..b94754ba8556 100644 > > > > > --- a/fs/ext4/super.c > > > > > +++ b/fs/ext4/super.c > > > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > > > goto err_out; > > > > > } > > > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > > + err = ext4_commit_super(sb); > > > > > + if (err) { > > > > > + ext4_msg(sb, KERN_ERR, > > > > > + "Failed to commit error information, please repair fs force!"); > > > > > + goto err_out; > > > > > + } > > > > > + } > > > > > + > > > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > > > superblock (and thus the stored error info), then I'd expect > > > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > > > actually relatively realistic scenario with errors=remount-ro behavior when > > > > the first fs error happens. > > > > > > > > What I intended in my original suggestion was to save es->s_error_count, > > > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > > > doing journal replay in ext4_load_journal() and then after journal replay > > > > merge this info back to the superblock > > > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > > > superblock for analysis") > > > already merged error info back to the superblock after journal replay except > > > 'es->s_state'. > > > The problem I have now is that the error flag in the journal superblock was > > > not recorded, > > > but the error message was recorded in the superblock. So it leads to > > > ext4_clear_journal_err() > > > does not detect errors and marks the file system as an error. Because > > > ext4_update_super() is > > > only set error flag when 'sbi->s_add_error_count > 0'. Although > > > 'sbi->s_mount_state' is > > > written to the super block when umount, but it is also conditional. > > > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > > > && > > > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > > > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > > > prefer to mark fs as error if it contain detail error info without > > > EXT4_ERROR_FS. > > Aha, thanks for explanation! So now I finally understand what the problem > > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > > it works but I'd be calmer if when saving error info we also did: > > > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > > > copy other info > > err = jbd2_journal_load(journal); > > restore other info > > if (error_fs) > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > /* Write out restored error information to the superblock */ > > err2 = ext4_commit_super(sb); > > > > and be done with this. I don't think trying to optimize away the committing > > of the superblock when we had to replay the journal is really worth it. > > > > Does this solve your concerns? > Thanks for your suggestion. > > I think if journal super block has 'j_errno' ext4_clear_journal_err() > will commit error info. The scenario we need to deal with is:(1) journal > super block has no 'j_errno'; (2) super block has detail error info, but > 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has > no error flag and the newest super block has error flag. But my code above should be handling this situation you describe - the check: error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); will check the newest state in the superblock before journal replay. Then we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the superblock version in the journal didn't have it. So we do: if (error_fs) es->s_state |= cpu_to_le16(EXT4_ERROR_FS); which makes sure EXT4_ERROR_FS is set either if it was set in the journal or in the newest superblock version before journal replay. > so we need to > store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' > has 'EXT4_ERROR_FS', it means super block in journal has error flag, so > we do not need to store error flag in super block. Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in the journal has that flag? During mount, we load the superblock directly from the first block in the filesystem and until we call jbd2_journal_load(), es->s_state contains this newest value of the superblock state. What am I missing? Honza
On 2023/2/27 19:20, Jan Kara wrote: > On Sat 18-02-23 10:18:42, yebin (H) wrote: >> On 2023/2/17 18:56, Jan Kara wrote: >>> On Fri 17-02-23 09:44:57, yebin (H) wrote: >>>> On 2023/2/17 1:31, Jan Kara wrote: >>>>> On Tue 14-02-23 10:29:04, Ye Bin wrote: >>>>>> From: Ye Bin <yebin10@huawei.com> >>>>>> >>>>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>>>> update error information when 'sbi->s_add_error_count' large than zero. >>>>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>>>> To solve above issue commit error information after recover journal. >>>>>> >>>>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>>>> --- >>>>>> fs/ext4/super.c | 12 ++++++++++++ >>>>>> 1 file changed, 12 insertions(+) >>>>>> >>>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>>>> index dc3907dff13a..b94754ba8556 100644 >>>>>> --- a/fs/ext4/super.c >>>>>> +++ b/fs/ext4/super.c >>>>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >>>>>> goto err_out; >>>>>> } >>>>>> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >>>>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>>>> + err = ext4_commit_super(sb); >>>>>> + if (err) { >>>>>> + ext4_msg(sb, KERN_ERR, >>>>>> + "Failed to commit error information, please repair fs force!"); >>>>>> + goto err_out; >>>>>> + } >>>>>> + } >>>>>> + >>>>> Hum, I'm not sure I follow here. If journal replay has overwritten the >>>>> superblock (and thus the stored error info), then I'd expect >>>>> es->s_error_count got overwritten (possibly to 0) as well. And this is >>>>> actually relatively realistic scenario with errors=remount-ro behavior when >>>>> the first fs error happens. >>>>> >>>>> What I intended in my original suggestion was to save es->s_error_count, >>>>> es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before >>>>> doing journal replay in ext4_load_journal() and then after journal replay >>>>> merge this info back to the superblock >>>> Actually,commit 1c13d5c08728 ("ext4: Save error information to the >>>> superblock for analysis") >>>> already merged error info back to the superblock after journal replay except >>>> 'es->s_state'. >>>> The problem I have now is that the error flag in the journal superblock was >>>> not recorded, >>>> but the error message was recorded in the superblock. So it leads to >>>> ext4_clear_journal_err() >>>> does not detect errors and marks the file system as an error. Because >>>> ext4_update_super() is >>>> only set error flag when 'sbi->s_add_error_count > 0'. Although >>>> 'sbi->s_mount_state' is >>>> written to the super block when umount, but it is also conditional. >>>> So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) >>>> && >>>> !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store >>>> 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i >>>> prefer to mark fs as error if it contain detail error info without >>>> EXT4_ERROR_FS. >>> Aha, thanks for explanation! So now I finally understand what the problem >>> exactly is. I'm not sure relying on es->s_error_count is too good. Probably >>> it works but I'd be calmer if when saving error info we also did: >>> >>> bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); >>> >>> copy other info >>> err = jbd2_journal_load(journal); >>> restore other info >>> if (error_fs) >>> es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>> /* Write out restored error information to the superblock */ >>> err2 = ext4_commit_super(sb); >>> >>> and be done with this. I don't think trying to optimize away the committing >>> of the superblock when we had to replay the journal is really worth it. >>> >>> Does this solve your concerns? >> Thanks for your suggestion. >> >> I think if journal super block has 'j_errno' ext4_clear_journal_err() >> will commit error info. The scenario we need to deal with is:(1) journal >> super block has no 'j_errno'; (2) super block has detail error info, but >> 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has >> no error flag and the newest super block has error flag. > But my code above should be handling this situation you describe - the > check: > > error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); Here, we do not need to backup 'error_fs', as 'EXT4_SB(sb)->s_mount_state' already record this flag when fs has error flag before do journal replay. > will check the newest state in the superblock before journal replay. Then > we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the > superblock version in the journal didn't have it. So we do: > > if (error_fs) > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > which makes sure EXT4_ERROR_FS is set either if it was set in the journal > or in the newest superblock version before journal replay. My modification is to deal with the situation we missed, and I don't want to introduce an invalid super block submission. If you think my judgment is too obscure, I can send another version according to your suggestion. >> so we need to >> store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' >> has 'EXT4_ERROR_FS', it means super block in journal has error flag, so >> we do not need to store error flag in super block. > Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in > the journal has that flag? During mount, we load the superblock directly > from the first block in the filesystem and until we call > jbd2_journal_load(), es->s_state contains this newest value of the > superblock state. What am I missing? After jbd2_journal_load() 'es->s_state' is already covered by the super block in journal. If there is super block in journal and 'es->s_state' has error flag this means super block in journal has error flag. If there is no super block in journal and 'es->s_state' has error flag, this means super block already has error flag.In both cases we can do nothing. > > Honza
On Tue 28-02-23 10:24:26, yebin (H) wrote: > On 2023/2/27 19:20, Jan Kara wrote: > > On Sat 18-02-23 10:18:42, yebin (H) wrote: > > > On 2023/2/17 18:56, Jan Kara wrote: > > > > On Fri 17-02-23 09:44:57, yebin (H) wrote: > > > > > On 2023/2/17 1:31, Jan Kara wrote: > > > > > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > > > > > From: Ye Bin <yebin10@huawei.com> > > > > > > > > > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > > > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > > > > > update error information when 'sbi->s_add_error_count' large than zero. > > > > > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > > > > > To solve above issue commit error information after recover journal. > > > > > > > > > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > > > > > --- > > > > > > > fs/ext4/super.c | 12 ++++++++++++ > > > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > > > > > index dc3907dff13a..b94754ba8556 100644 > > > > > > > --- a/fs/ext4/super.c > > > > > > > +++ b/fs/ext4/super.c > > > > > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > > > > > goto err_out; > > > > > > > } > > > > > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > > > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > > > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > > > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > > > > + err = ext4_commit_super(sb); > > > > > > > + if (err) { > > > > > > > + ext4_msg(sb, KERN_ERR, > > > > > > > + "Failed to commit error information, please repair fs force!"); > > > > > > > + goto err_out; > > > > > > > + } > > > > > > > + } > > > > > > > + > > > > > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > > > > > superblock (and thus the stored error info), then I'd expect > > > > > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > > > > > actually relatively realistic scenario with errors=remount-ro behavior when > > > > > > the first fs error happens. > > > > > > > > > > > > What I intended in my original suggestion was to save es->s_error_count, > > > > > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > > > > > doing journal replay in ext4_load_journal() and then after journal replay > > > > > > merge this info back to the superblock > > > > > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > > > > > superblock for analysis") > > > > > already merged error info back to the superblock after journal replay except > > > > > 'es->s_state'. > > > > > The problem I have now is that the error flag in the journal superblock was > > > > > not recorded, > > > > > but the error message was recorded in the superblock. So it leads to > > > > > ext4_clear_journal_err() > > > > > does not detect errors and marks the file system as an error. Because > > > > > ext4_update_super() is > > > > > only set error flag when 'sbi->s_add_error_count > 0'. Although > > > > > 'sbi->s_mount_state' is > > > > > written to the super block when umount, but it is also conditional. > > > > > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > > > > > && > > > > > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > > > > > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > > > > > prefer to mark fs as error if it contain detail error info without > > > > > EXT4_ERROR_FS. > > > > Aha, thanks for explanation! So now I finally understand what the problem > > > > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > > > > it works but I'd be calmer if when saving error info we also did: > > > > > > > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > > > > > > > copy other info > > > > err = jbd2_journal_load(journal); > > > > restore other info > > > > if (error_fs) > > > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > /* Write out restored error information to the superblock */ > > > > err2 = ext4_commit_super(sb); > > > > > > > > and be done with this. I don't think trying to optimize away the committing > > > > of the superblock when we had to replay the journal is really worth it. > > > > > > > > Does this solve your concerns? > > > Thanks for your suggestion. > > > > > > I think if journal super block has 'j_errno' ext4_clear_journal_err() > > > will commit error info. The scenario we need to deal with is:(1) journal > > > super block has no 'j_errno'; (2) super block has detail error info, but > > > 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has > > > no error flag and the newest super block has error flag. > > But my code above should be handling this situation you describe - the > > check: > > > > error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > Here, we do not need to backup 'error_fs', as > 'EXT4_SB(sb)->s_mount_state' already record this flag when fs has error > flag before do journal replay. Yeah, right. We don't need error_fs variable and can just look at EXT4_SB(sb)->s_mount_state. > > will check the newest state in the superblock before journal replay. Then > > we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the > > superblock version in the journal didn't have it. So we do: > > > > if (error_fs) > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > which makes sure EXT4_ERROR_FS is set either if it was set in the journal > > or in the newest superblock version before journal replay. > > My modification is to deal with the situation we missed, and I don't want > to introduce an invalid super block submission. If you think my judgment > is too obscure, I can send another version according to your suggestion. So is the extra superblock write your only concern? Honestly, I prefer code simplicity over saved one superblock write in case we had to replay the journal (which should be rare anyway). If you look at the code, we can writeout superblock several times in that mount path anyway. > > > so we need to > > > store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' > > > has 'EXT4_ERROR_FS', it means super block in journal has error flag, so > > > we do not need to store error flag in super block. > > Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in > > the journal has that flag? During mount, we load the superblock directly > > from the first block in the filesystem and until we call > > jbd2_journal_load(), es->s_state contains this newest value of the > > superblock state. What am I missing? > After jbd2_journal_load() 'es->s_state' is already covered by the super > block in journal. If there is super block in journal and 'es->s_state' > has error flag this means super block in journal has error flag. If there > is no super block in journal and 'es->s_state' has error flag, this means > super block already has error flag.In both cases we can do nothing. If what you wanted to say: "It is not necessary to write the superblock if EXT4_ERROR_FS is already set." I tend to agree although not 100% because journal replay could result in rewinding 's_last_error_*' fields and so writing superblock would still make sense even if EXT4_ERROR_FS is set in es->s_state after journal reply. That's why I wouldn't complicate things and just always write the superblock after journal replay. Honza
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index dc3907dff13a..b94754ba8556 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, goto err_out; } + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); + err = ext4_commit_super(sb); + if (err) { + ext4_msg(sb, KERN_ERR, + "Failed to commit error information, please repair fs force!"); + goto err_out; + } + } + EXT4_SB(sb)->s_journal = journal; err = ext4_clear_journal_err(sb, es); if (err) {