From patchwork Mon Nov 20 11:13:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhihao Cheng X-Patchwork-Id: 166930 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9910:0:b0:403:3b70:6f57 with SMTP id i16csp1956156vqn; Sun, 19 Nov 2023 19:20:33 -0800 (PST) X-Google-Smtp-Source: AGHT+IEZvXdwkERQIpLTGRZ9ULhrIfcdb5SG7y0L8bwTfGQaiNXZ1rLL8jObkzuA3rTeZpIOZxua X-Received: by 2002:a05:6a20:2446:b0:187:b7eb:944d with SMTP id t6-20020a056a20244600b00187b7eb944dmr8395742pzc.13.1700450433500; Sun, 19 Nov 2023 19:20:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700450433; cv=none; d=google.com; s=arc-20160816; b=dq2Nyy5MSPF0bqFf0kkV2RBIWh1r42ZTJtW5l/D4b7t9LrGYQ8b4Vn5oS6yhIV4PXd YMRImF7qjKj2X75Crb2IFpBDzB0ddOYC/pIR0vHAsQBipAVmSEFJ50mjJb2sOywQVNbu 2Eoib6uU83Q8cGNPMkG03ufmWl5extH56iYHhzT7Cr7Lpc+AvM909cOZNW3Dfc9uof6G QTlHq7mZ3Q0zpTLUf3VAKoJ3iCzUSSfCLJNMz5nwS9sbeDAGoYm5JujNKKa/RCtFEBdJ 4b+Yfwnol86t1EpTW6x7eYR7PB2FPot0qgdScriLy0+9gKfWMWnHYTfg80CildvVpQbW sGuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=K0pTdeW9kn3t8ne+2EO0W06wp4y5vYhyjWGUYlA3Ldo=; fh=8iuAhxKZ4zRmu4yJ8kyLM0sJOjlCiDjhtzZdjPnU8ps=; b=0HqloszpmgiFhTyfGpUvqQrWIVghD4TyaxcVOsanNcn1qEMZZWxEjEithX/AY3gYsT Pfbm5zYE3rHBDVzOUw1xFotHi09h6iOR6PKe1iP6Bk0t2GrvvEdQbiU5iKhkSNptBmYe orIRJ2J7aHO56ZZUfqjusy/VOE5GRptwo/525AF37x+LGYccenBoaV6bUwwgpRpFM/Vv FHg+u0UbUoTu9Zcw9N90jsNzujtgwJpO6fS+Qt5ZE/kowjmOyFGTNQjqQ1XIo8jrUBHy IwJWxH8ZSy97trULnos857NuQR/de1pzUKew7OYO5Li9CyI944hnRgR9v9XaRkEzhfVd 1SFw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from fry.vger.email (fry.vger.email. [2620:137:e000::3:8]) by mx.google.com with ESMTPS id y188-20020a6364c5000000b005c203ad2343si6728699pgb.94.2023.11.19.19.20.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 19 Nov 2023 19:20:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) client-ip=2620:137:e000::3:8; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:8 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id 700AC809AFD7; Sun, 19 Nov 2023 19:20:25 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231882AbjKTDT5 (ORCPT + 28 others); Sun, 19 Nov 2023 22:19:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54718 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231814AbjKTDTy (ORCPT ); Sun, 19 Nov 2023 22:19:54 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BDBECD3 for ; Sun, 19 Nov 2023 19:19:49 -0800 (PST) Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.55]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4SYXmQ11N8zWhYP; Mon, 20 Nov 2023 11:19:18 +0800 (CST) Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov 2023 11:19:47 +0800 From: Zhihao Cheng To: , , , , CC: , Subject: [PATCH 1/3] ubifs: Check @c->dirty_[n|p]n_cnt and @c->nroot state under @c->lp_mutex Date: Mon, 20 Nov 2023 19:13:45 +0800 Message-ID: <20231120111347.2254153-2-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com> References: <20231120111347.2254153-1-chengzhihao1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.104.67] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemm000013.china.huawei.com (7.193.23.81) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-0.8 required=5.0 tests=DATE_IN_FUTURE_06_12, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Sun, 19 Nov 2023 19:20:25 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1783051513604523625 X-GMAIL-MSGID: 1783051513604523625 The checking of @c->nroot->flags and @c->dirty_[n|p]n_cnt in function nothing_to_commit() is not atomic, which could be raced with modifying of lpt, for example: P1 P2 P3 run_gc ubifs_garbage_collect do_commit ubifs_return_leb ubifs_lpt_lookup_dirty dirty_cow_nnode do_commit nothing_to_commit if (test_bit(DIRTY_CNODE, &c->nroot->flags) // false test_and_set_bit(DIRTY_CNODE, &nnode->flags) c->dirty_nn_cnt += 1 ubifs_assert(c, c->dirty_nn_cnt == 0) // false ! Fetch a reproducer in Link: UBIFS error (ubi0:0 pid 2747): ubifs_assert_failed UBIFS assert failed: c->dirty_pn_cnt == 0, in fs/ubifs/commit.c Call Trace: ubifs_ro_mode+0x58/0x70 [ubifs] ubifs_assert_failed+0x6a/0x90 [ubifs] do_commit+0x5b7/0x930 [ubifs] ubifs_run_commit+0xc6/0x1a0 [ubifs] ubifs_sync_fs+0xd8/0x110 [ubifs] sync_filesystem+0xb4/0x120 do_syscall_64+0x6f/0x140 Fix it by checking @c->dirty_[n|p]n_cnt and @c->nroot state with @c->lp_mutex locked. Fixes: 944fdef52ca9 ("UBIFS: do not start the commit if there is nothing to commit") Link: https://bugzilla.kernel.org/show_bug.cgi?id=218162 Signed-off-by: Zhihao Cheng --- fs/ubifs/commit.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c index c4fc1047fc07..5b3a840098b0 100644 --- a/fs/ubifs/commit.c +++ b/fs/ubifs/commit.c @@ -69,6 +69,14 @@ static int nothing_to_commit(struct ubifs_info *c) if (c->zroot.znode && ubifs_zn_dirty(c->zroot.znode)) return 0; + /* + * Increasing @c->dirty_pn_cnt/@c->dirty_nn_cnt and marking + * nnodes/pnodes as dirty in run_gc() could race with following + * checking, which leads inconsistent states between @c->nroot + * and @c->dirty_pn_cnt/@c->dirty_nn_cnt, holding @c->lp_mutex + * to avoid that. + */ + mutex_lock(&c->lp_mutex); /* * Even though the TNC is clean, the LPT tree may have dirty nodes. For * example, this may happen if the budgeting subsystem invoked GC to @@ -76,12 +84,15 @@ static int nothing_to_commit(struct ubifs_info *c) * free space. In this case GC would just change the lprops of this * LEB (by turning all space into free space) and unmap it. */ - if (c->nroot && test_bit(DIRTY_CNODE, &c->nroot->flags)) + if (c->nroot && test_bit(DIRTY_CNODE, &c->nroot->flags)) { + mutex_unlock(&c->lp_mutex); return 0; + } ubifs_assert(c, atomic_long_read(&c->dirty_zn_cnt) == 0); ubifs_assert(c, c->dirty_pn_cnt == 0); ubifs_assert(c, c->dirty_nn_cnt == 0); + mutex_unlock(&c->lp_mutex); return 1; } From patchwork Mon Nov 20 11:13:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Zhihao Cheng X-Patchwork-Id: 166928 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9910:0:b0:403:3b70:6f57 with SMTP id i16csp1956024vqn; Sun, 19 Nov 2023 19:20:08 -0800 (PST) X-Google-Smtp-Source: AGHT+IGsrjoSHUaWuDJUfTS8RNwoAcCmFrIxrJQ8GNJdBljaY1b0RdHh0nMt5NWiHcVChORGIvWi X-Received: by 2002:a17:90b:1bca:b0:27d:3f08:cc21 with SMTP id oa10-20020a17090b1bca00b0027d3f08cc21mr7255775pjb.5.1700450408402; Sun, 19 Nov 2023 19:20:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700450408; cv=none; d=google.com; s=arc-20160816; b=ZKU3N/3f5Z0AIQSVBiwpbmzeJ/0NfJN3G/J4v6LGM/ULdx80/2NthaoHj7cxH0DQqd WiO7bVv2ikMFsDNrLSDZGIG1KM1fKJwkfh5ogMKPH6c1UKbBj4r3Yoqs7kAGP6HjVpz8 OtTul1I90QdjzFw9j9LUEXoMNjHo5MAjffP1g0iLMPGZedKQC/3dnOvzsBPWRNqJi8kr fGAIs+ciNeOSZJwkLhzz8MBefsv7khRN1qyMZ/nIyKnQIs7FDkkg9M1tpNTGrJeuwqRQ lSF6NIcC5d11epuRHv//o/7nZ3VeKf2x7ICX0oY2yCOwQqu8SuOyUZG+DbvcIrf214pI 8zJA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=htadz3AQoJ4Ze6NyzhsujKFld5EWVXe6VoVeHRjMA5I=; fh=8iuAhxKZ4zRmu4yJ8kyLM0sJOjlCiDjhtzZdjPnU8ps=; b=w+Ny5SWH1r3AjN5+mTk7g3dvrrCRDrpc59gLNzM6SoeD6C4yzR7SbW1t7fnZlcSib4 pzAKjO0bO7A0lkKZ14MVl3LZPixb/MFlSN/pKiuxGZqrNBzQ9J7ZgMH062dDJ2dLpbqJ dXB6YeQYwyg3G4tqvxh7uvXQ//WfsODJ+hbrI2R8yHJ075ay2/dlJIoSj/GsfkcvOZv5 INVOLvC5p7vfOtVTlmVj+qhHIS4i+KP/c8MTMLataUaEaOCfsvAWdJVz1pR8TKQapofV 8jOOHnxRjAZ/UzEtXzKRMfSjkGGmffJVsSuyeKAtrAITxzXht5bEaPxLi8ADqiSSpiMw BBDQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id ci3-20020a17090afc8300b002839a6e8812si6434970pjb.36.2023.11.19.19.20.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 19 Nov 2023 19:20:08 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 6A335802C7D7; Sun, 19 Nov 2023 19:20:07 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231890AbjKTDT7 (ORCPT + 28 others); Sun, 19 Nov 2023 22:19:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231862AbjKTDTz (ORCPT ); Sun, 19 Nov 2023 22:19:55 -0500 Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFC1F137 for ; Sun, 19 Nov 2023 19:19:50 -0800 (PST) Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.53]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4SYXj30Chkz1P6j8; Mon, 20 Nov 2023 11:16:23 +0800 (CST) Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov 2023 11:19:48 +0800 From: Zhihao Cheng To: , , , , CC: , Subject: [PATCH 2/3] ubifs: ubifs_tnc_locate: Drop TNC mutex lockless reading path Date: Mon, 20 Nov 2023 19:13:46 +0800 Message-ID: <20231120111347.2254153-3-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com> References: <20231120111347.2254153-1-chengzhihao1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.104.67] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemm000013.china.huawei.com (7.193.23.81) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-1.0 required=5.0 tests=BAYES_00,DATE_IN_FUTURE_06_12, RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Sun, 19 Nov 2023 19:20:07 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1783051487307323991 X-GMAIL-MSGID: 1783051487307323991 This problem is described in [1], and Tao has proposed a solution in [2]. There exists a race between node reading and gc, which is shown as: P1 P2 ubifs_tnc_locate zbr.lnum = lnum_a GC // node is moved from lnum_a to lnum_b journal head switching // lnum_a becomes a bud ubifs_get_wbuf(c, zbr.lnum) // true ubifs_tnc_read_node ubifs_read_node_wbuf // read data from lnum_a check node failed ! There are two ways of reading node(See ubifs_tnc_read_node()): 1. Reading from wbuf. The node is written in wbuf(in mem), and wbuf is not written back to flash. 2. Otherwise, reading from flash. In most cases, ubifs_tnc_read_node() is invoked with TNC mutex locked, but except the lockless path in ubifs_tnc_locate() which is imported by commit 601c0bc46753("UBIFS: allow for racing between GC and TNC"). Function ubifs_tnc_locate() is mainly used for path lookup and file reading, VFS has inode/dentry/page cache for multiple times reading, the lockless optimization only works for first reading. Based on the discussion in [2], this patch simply drops the TNC mutex lockless reading path in ubifs_tnc_locate(). Fetch a reproducer in [3]. [1] https://lore.kernel.org/all/fda84926-09d1-1fc7-4b78-99e0d04508bc@huawei.com/T/ [2] https://lore.kernel.org/linux-mtd/20200305092205.127758-1-houtao1@huawei.com/ [3] https://bugzilla.kernel.org/show_bug.cgi?id=218163 Fixes: 601c0bc46753 ("UBIFS: allow for racing between GC and TNC") Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Reported-by: 李傲傲 (Carson Li1/9542) Analyzed-by: Hou Tao Signed-off-by: Zhihao Cheng --- fs/ubifs/tnc.c | 31 +++---------------------------- 1 file changed, 3 insertions(+), 28 deletions(-) diff --git a/fs/ubifs/tnc.c b/fs/ubifs/tnc.c index f4728e65d1bd..7b7d75ed3ec7 100644 --- a/fs/ubifs/tnc.c +++ b/fs/ubifs/tnc.c @@ -1478,11 +1478,10 @@ static int maybe_leb_gced(struct ubifs_info *c, int lnum, int gc_seq1) int ubifs_tnc_locate(struct ubifs_info *c, const union ubifs_key *key, void *node, int *lnum, int *offs) { - int found, n, err, safely = 0, gc_seq1; + int found, n, err; struct ubifs_znode *znode; - struct ubifs_zbranch zbr, *zt; + struct ubifs_zbranch *zt; -again: mutex_lock(&c->tnc_mutex); found = ubifs_lookup_level0(c, key, &znode, &n); if (!found) { @@ -1505,31 +1504,7 @@ int ubifs_tnc_locate(struct ubifs_info *c, const union ubifs_key *key, err = tnc_read_hashed_node(c, zt, node); goto out; } - if (safely) { - err = ubifs_tnc_read_node(c, zt, node); - goto out; - } - /* Drop the TNC mutex prematurely and race with garbage collection */ - zbr = znode->zbranch[n]; - gc_seq1 = c->gc_seq; - mutex_unlock(&c->tnc_mutex); - - if (ubifs_get_wbuf(c, zbr.lnum)) { - /* We do not GC journal heads */ - err = ubifs_tnc_read_node(c, &zbr, node); - return err; - } - - err = fallible_read_node(c, key, &zbr, node); - if (err <= 0 || maybe_leb_gced(c, zbr.lnum, gc_seq1)) { - /* - * The node may have been GC'ed out from under us so try again - * while keeping the TNC mutex locked. - */ - safely = 1; - goto again; - } - return 0; + err = ubifs_tnc_read_node(c, zt, node); out: mutex_unlock(&c->tnc_mutex); From patchwork Mon Nov 20 11:13:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhihao Cheng X-Patchwork-Id: 166929 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9910:0:b0:403:3b70:6f57 with SMTP id i16csp1956045vqn; Sun, 19 Nov 2023 19:20:13 -0800 (PST) X-Google-Smtp-Source: AGHT+IFEpxI6a5ex19uVZK+YsoVCVrLnQSvr+KljZVkaeTIg9AAbnNqtXqysLTl/t5wcL33SDYv2 X-Received: by 2002:a05:6a20:3944:b0:188:20e0:2bca with SMTP id r4-20020a056a20394400b0018820e02bcamr4765133pzg.13.1700450413332; Sun, 19 Nov 2023 19:20:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1700450413; cv=none; d=google.com; s=arc-20160816; b=RW6uFGZF6BV52lMxBScyhFH8RVJvaTtHg7SNjXM1xFEPnVUcvizSYWuFLS/KAFO5DW hYCJlrXBkRbkNl8mveCt+OdanParpKirGpZBQcLgd4lio/ErYAjDV/NIgY2ZCzOCte7A +UefZMQUXjHhXwzHbuRgSJYl5JDgPXPD2nB47maBPw/mzM7zqlE02vYvZp9t4NBiLJY+ pXI4dNUiThbqymS1nWvfNFzoEQDcPDYcUBPNv7CPd9Cy2ra7dKQ1lF18VP8q+74X5bkP s/GUF1yDZAbSHMXjADAFEw5lig0v3Cwsb5/rn+DkYs/Sr1ITn4zgAdJb846Zu9Y1dln5 r+pg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=ZoI42SeOXddVf1ARPcLSmtIeIR6Wjv/p1YfHxNDRPkg=; fh=8iuAhxKZ4zRmu4yJ8kyLM0sJOjlCiDjhtzZdjPnU8ps=; b=KN+C7wjhHqlPshwzg3NBWV9iHRGt/XliU7ydPzDAap+QLxkH3Or6xRXIjq+6FrIrFJ XVT/lPWhJiicSd5Lj9p8oO3KBMzhtEPOMHXOg7mP0910kdSQ3uydF8bFBtMdlV4cubY+ QDxv+9td8/f69VcI0HIwdNgD5yjJhr88grXUx8yIc9n1L3x+BWe0aRhlw4pnTJtIIqLO 5GcTgcTHzAs9rZ8+PD+vnUNxHZGebTe0QfJBSOU81iv0TpbNGtAu1FUlTqttZo3HiDfk HCUnD4TIrHeZdYE2iQCN/xriHNE293T2LOuwnSTGw3skn8OkZeQ0ViLdrxvVIxYjnomv u6Iw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id q13-20020a170902dacd00b001cf65739fd9si182928plx.431.2023.11.19.19.20.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 19 Nov 2023 19:20:13 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 957E5802C7E0; Sun, 19 Nov 2023 19:20:12 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231912AbjKTDUB (ORCPT + 28 others); Sun, 19 Nov 2023 22:20:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231874AbjKTDT4 (ORCPT ); Sun, 19 Nov 2023 22:19:56 -0500 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7758513E for ; Sun, 19 Nov 2023 19:19:51 -0800 (PST) Received: from kwepemm000013.china.huawei.com (unknown [172.30.72.53]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4SYXj26WJTzsRH6; Mon, 20 Nov 2023 11:16:22 +0800 (CST) Received: from huawei.com (10.175.104.67) by kwepemm000013.china.huawei.com (7.193.23.81) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 20 Nov 2023 11:19:48 +0800 From: Zhihao Cheng To: , , , , CC: , Subject: [PATCH 3/3] ubifs: Don't stop retrying even if committing times exceeds two Date: Mon, 20 Nov 2023 19:13:47 +0800 Message-ID: <20231120111347.2254153-4-chengzhihao1@huawei.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231120111347.2254153-1-chengzhihao1@huawei.com> References: <20231120111347.2254153-1-chengzhihao1@huawei.com> MIME-Version: 1.0 X-Originating-IP: [10.175.104.67] X-ClientProxiedBy: dggems703-chm.china.huawei.com (10.3.19.180) To kwepemm000013.china.huawei.com (7.193.23.81) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-3.3 required=5.0 tests=BAYES_00,DATE_IN_FUTURE_06_12, RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Sun, 19 Nov 2023 19:20:12 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1783051492441645365 X-GMAIL-MSGID: 1783051492441645365 Recently we catched ENOSPC returned by make_reservation() while doing fsstress on UBIFS, we got following information when it occurred (See details in Link): UBIFS error (ubi0:0 pid 3640152): make_reservation [ubifs]: cannot reserve 112 bytes in jhead 2, error -28 CPU: 2 PID: 3640152 Comm: kworker/u16:2 Tainted: G B W Hardware name: Hisilicon PhosphorHi1230 EMU (DT) Workqueue: writeback wb_workfn (flush-ubifs_0_0) Call trace: dump_stack+0x114/0x198 make_reservation+0x564/0x610 [ubifs] ubifs_jnl_write_data+0x328/0x48c [ubifs] do_writepage+0x2a8/0x3e4 [ubifs] ubifs_writepage+0x16c/0x374 [ubifs] generic_writepages+0xb4/0x114 do_writepages+0xcc/0x11c writeback_sb_inodes+0x2d0/0x564 wb_writeback+0x20c/0x2b4 wb_workfn+0x404/0x510 process_one_work+0x304/0x4ac worker_thread+0x31c/0x4e4 kthread+0x23c/0x290 (pid 3640152) Budgeting info: data budget sum 17576, total budget sum 17768 budg_data_growth 4144, budg_dd_growth 13432, budg_idx_growth 192 min_idx_lebs 13, old_idx_sz 988640, uncommitted_idx 0 page_budget 4144, inode_budget 160, dent_budget 312 nospace 0, nospace_rp 0 dark_wm 8192, dead_wm 4096, max_idx_node_sz 192 freeable_cnt 0, calc_idx_sz 988640, idx_gc_cnt 0 dirty_pg_cnt 4, dirty_zn_cnt 0, clean_zn_cnt 4811 gc_lnum 21, ihead_lnum 14 jhead 0 (GC) LEB 16 jhead 1 (base) LEB 34 jhead 2 (data) LEB 23 bud LEB 16 bud LEB 23 bud LEB 34 old bud LEB 33 old bud LEB 31 old bud LEB 15 commit state 4 Budgeting predictions: available: 33832, outstanding 17576, free 15356 (pid 3640152) start dumping LEB properties (pid 3640152) Lprops statistics: empty_lebs 3, idx_lebs 11 taken_empty_lebs 1, total_free 1253376, total_dirty 2445736 total_used 3438712, total_dark 65536, total_dead 17248 LEB 15 free 0 dirty 248000 used 5952 (taken) LEB 16 free 110592 dirty 896 used 142464 (taken, jhead 0 (GC)) LEB 21 free 253952 dirty 0 used 0 (taken, GC LEB) LEB 23 free 0 dirty 248104 used 5848 (taken, jhead 2 (data)) LEB 29 free 253952 dirty 0 used 0 (empty) LEB 33 free 0 dirty 253952 used 0 (taken) LEB 34 free 217088 dirty 36544 used 320 (taken, jhead 1 (base)) LEB 37 free 253952 dirty 0 used 0 (empty) OTHERS: index lebs, zero-available non-index lebs According to the budget algorithm, there are 5 LEBs reserved for budget: three journal heads(16,23,34), 1 GC LEB(21) and 1 deletion LEB(can be used in make_reservation()). There are 2 empty LEBs used for index nodes, which is calculated as min_idx_lebs - idx_lebs = 2. In theory, LEB 15 and 33 should be reclaimed as free state after committing, but it is now in taken state. After looking the realization of reserve_space(), there's a possible situation: LEB 15: free 2000 dirty 248000 used 3952 (jhead 2) LEB 23: free 2000 dirty 248104 used 3848 (bud, taken) LEB 33: free 2000 dirty 251952 used 0 (bud, taken) wb_workfn wb_workfn_2 do_writepage // write 3000 bytes ubifs_jnl_write_data make_reservation reserve_space ubifs_garbage_collect ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken nospc_retries++ // 1 ubifs_run_commit do_commit LEB 15: free 2000 dirty 248000 used 3952 (jhead 2) LEB 23: free 2000 dirty 248104 used 3848 (dirty) LEB 33: free 2000 dirty 251952 used 0 (dirty) do_writepage // write 2000 bytes for 3 times ubifs_jnl_write_data // grabs 15\23\33 LEB 15: free 0 dirty 248000 used 5952 (bud, taken) LEB 23: free 0 dirty 248104 used 5848 (jhead 2) LEB 33: free 0 dirty 253952 used 0 (bud, taken) reserve_space ubifs_garbage_collect ubifs_find_dirty_leb // ret ENOSPC, dirty LEBs are taken if (nospc_retries++ < 2) // false ubifs_ro_mode ! Fetch a reproducer in Link. The dirty LEBs could be grabbed by other writeback threads, which fails finding dirty LEBs of GC in current thread, so make_reservation() should try many times to invoke GC&&committing, but current realization limits the times of retrying as 'nospc_retries'(twice). Actually the limitation of retrying times should be removed from make_reservation(), but considering the existence of 'cmt_retries' in case of infinite loop, just remove 'nospc_retries' and set the limitation of retrying times same as 'cmt_retries'. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218164 Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Signed-off-by: Zhihao Cheng --- fs/ubifs/journal.c | 26 ++++++++++---------------- 1 file changed, 10 insertions(+), 16 deletions(-) diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c index f0a5538c84b0..ec42d42f06dc 100644 --- a/fs/ubifs/journal.c +++ b/fs/ubifs/journal.c @@ -310,7 +310,7 @@ static int write_head(struct ubifs_info *c, int jhead, void *buf, int len, */ static int make_reservation(struct ubifs_info *c, int jhead, int len) { - int err, cmt_retries = 0, nospc_retries = 0; + int err, cmt_retries = 0; again: down_read(&c->commit_sem); @@ -323,21 +323,14 @@ static int make_reservation(struct ubifs_info *c, int jhead, int len) if (err == -ENOSPC) { /* * GC could not make any progress. We should try to commit - * once because it could make some dirty space and GC would - * make progress, so make the error -EAGAIN so that the below - * will commit and re-try. - */ - if (nospc_retries++ < 2) { - dbg_jnl("no space, retry"); - err = -EAGAIN; - } - - /* - * This means that the budgeting is incorrect. We always have - * to be able to write to the media, because all operations are - * budgeted. Deletions are not budgeted, though, but we reserve - * an extra LEB for them. + * because it could make some dirty space and GC would make + * progress, so that the below will commit and re-try. This + * process could repeat many times(cmt_retries < 128), + * journal heads in other threads could grab the dirty lebs, + * which fails finding dirty LEBs of GC in current thread. */ + dbg_jnl("no space, retry"); + err = -EAGAIN; } if (err != -EAGAIN) @@ -350,7 +343,8 @@ static int make_reservation(struct ubifs_info *c, int jhead, int len) if (cmt_retries > 128) { /* * This should not happen unless the journal size limitations - * are too tough. + * are too tough, or the racing between GC and journal heads + * switching is too frequent when space is nearly run out. */ ubifs_err(c, "stuck in space allocation"); err = -ENOSPC;