Message ID | 20230613203947.2745943-1-jaegeuk@kernel.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp836001vqr; Tue, 13 Jun 2023 14:32:43 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5lxBrAjBbl1w3A0LmuEkGCEsXgGU5cpGQp1NbM3BjKk37Cz61RKmefa9jLdMJIVrQZwmin X-Received: by 2002:a17:907:705:b0:974:56aa:6dce with SMTP id xb5-20020a170907070500b0097456aa6dcemr10466139ejb.46.1686691963415; Tue, 13 Jun 2023 14:32:43 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686691963; cv=none; d=google.com; s=arc-20160816; b=M9WPGSoe5+rghqdOtqXjx0Wjnf783d/TzlQ3+16lM3EudmLMDGy1NXOsS7YqcTVLUs zUh6yoj+oifvVaC5IYCjdHH/1R6WMzLD+mBbTaQybzuPHMso3jMymT5K/98NYj+zfT3p 41ORXq7ETy6A0W8axNdqVpBlzQLqrBd1fqryYwtx1Ix17uSWRLWeEcvnqr7h9LWUKoju 7RdIWoTcKNlruVow4bZuWllOIrx3Bm33TxeNTRPYjMc5+pQ1c5NhGFkVHVg0FW7x8bzo K7XvtGmIBZ0cypVLO682eBGBajyT1VciHX1kycBDbqfefT6J5gh2IHayt0kOqI+vr4Rf MY6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=HYEPHnKJZJwBllOOaE+wWwPcndapSn+SEdVnQ29R52I=; b=CZOShWpr0uDGcdLlRRYN/UZt1whLgPBOfsUuLt3anN2MN0erd6ItNgmbTglSaHfL2a 6DXu2z0bPOaemxAIwNT7hxf0qrraB3c9itMMg4kkisutvsCQZHdFQuXsg1WGK0Dk/n+S Hdsr2fpu4YnQhkq0xK7ROAlvS/GijM7m9O1m78Gwtj40I4pJnZGJZwGKe3IPIiwGhtbn VpbmWypxG5QWIZdQoxGfF7OLDDUTl6ZeJclca66+ofdZZXt6LooyKR5FtXXzfQ1/RlF1 1TxkR5EX0wsPVh5jS4v59AeQZAem5xbxdIW8sUsnfo4y0s+2XtThSvY8wR2LHGV8p4xd ixrA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=nZ9YV6tj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id h21-20020a1709063c1500b00977cc7cec71si3379921ejg.285.2023.06.13.14.32.19; Tue, 13 Jun 2023 14:32:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=nZ9YV6tj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233857AbjFMUkR (ORCPT <rfc822;lekhanya01809@gmail.com> + 99 others); Tue, 13 Jun 2023 16:40:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233864AbjFMUjv (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 13 Jun 2023 16:39:51 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A392EB8 for <linux-kernel@vger.kernel.org>; Tue, 13 Jun 2023 13:39:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 383D061337 for <linux-kernel@vger.kernel.org>; Tue, 13 Jun 2023 20:39:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 910F0C433C0; Tue, 13 Jun 2023 20:39:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1686688789; bh=sLZ+P360Imf7bwyICslM2xpcRXFrq+wunkzXqiYuC6c=; h=From:To:Cc:Subject:Date:From; b=nZ9YV6tj3Y7La98xNEpw85oJMLL0gU6jsgf/cp7ApkpVAIGvgK6noVqsGG3xyy9Gu qcoPD2tHLvUPYABIyRfS7fHlQU9EEILPo0vYwdBWx2sCCWEkQIZjcMJNFeO9eqj2sP 73Qo3VEL0dr2ed1HPLZUgIVLNeAkLCRKj6dvVtchxtkHTNfnyn/IcY4u0gsI8qel8k mS5CKjoi8WAUQYkx8m7mDD1zIOMl6B8vDuaCeRRLrF38IBPr/da4aC7HQuA16C5rtS CkWfC0E0PkN0djNLLccF3WAUuZXccRiDUJ05GJ4hhRaHyhA3miYfEQE1/sg5rX/Vsi smDV0j5ZU5Vxw== From: Jaegeuk Kim <jaegeuk@kernel.org> To: linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net Cc: Jaegeuk Kim <jaegeuk@kernel.org> Subject: [PATCH] f2fs: do not issue small discard commands during checkpoint Date: Tue, 13 Jun 2023 13:39:47 -0700 Message-ID: <20230613203947.2745943-1-jaegeuk@kernel.org> X-Mailer: git-send-email 2.41.0.162.gfafddb0af9-goog MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768624712492201042?= X-GMAIL-MSGID: =?utf-8?q?1768624712492201042?= |
Series |
f2fs: do not issue small discard commands during checkpoint
|
|
Commit Message
Jaegeuk Kim
June 13, 2023, 8:39 p.m. UTC
If there're huge # of small discards, this will increase checkpoint latency
insanely. Let's issue small discards only by trim.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
fs/f2fs/segment.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Comments
Hi Jaegeuk, > If there're huge # of small discards, this will increase checkpoint latency > insanely. Let's issue small discards only by trim. > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > --- > fs/f2fs/segment.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > index 0c0c033c4bdd..ef46bb085385 100644 > --- a/fs/f2fs/segment.c > +++ b/fs/f2fs/segment.c > @@ -2178,7 +2178,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > } > mutex_unlock(&dirty_i->seglist_lock); > > - if (!f2fs_block_unit_discard(sbi)) > + if (!f2fs_block_unit_discard(sbi) || !force) If we don't handle the discard entries here, dcc->entry_list will still have them, so stale discard entries may be handled by trim, causing incorrect discards to be issued. Therefore, I think this patch should also prevent the creation of discard entries in add_discard_addrs(), unless it is a checkpoint caused by trim. This would further reduce the latency caused by the creation of a discard entry. Thanks, Daejun
If there're huge # of small discards, this will increase checkpoint latency
insanely. Let's issue small discards only by trim.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
---
Change log from v1:
- move the skip logic to avoid dangling objects
fs/f2fs/segment.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 8c7af8b4fc47..0457d620011f 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
len = next_pos - cur_pos;
if (f2fs_sb_has_blkzoned(sbi) ||
- (force && len < cpc->trim_minlen))
+ !force || len < cpc->trim_minlen)
goto skip;
f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
Hi Daejun, On 06/14, Daejun Park wrote: > Hi Jaegeuk, > > > If there're huge # of small discards, this will increase checkpoint latency > > insanely. Let's issue small discards only by trim. > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > --- > > fs/f2fs/segment.c | 5 ++--- > > 1 file changed, 2 insertions(+), 3 deletions(-) > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > index 0c0c033c4bdd..ef46bb085385 100644 > > --- a/fs/f2fs/segment.c > > +++ b/fs/f2fs/segment.c > > @@ -2178,7 +2178,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > } > > mutex_unlock(&dirty_i->seglist_lock); > > > > - if (!f2fs_block_unit_discard(sbi)) > > + if (!f2fs_block_unit_discard(sbi) || !force) > > If we don't handle the discard entries here, dcc->entry_list will still have them, > so stale discard entries may be handled by trim, causing incorrect discards to be issued. > Therefore, I think this patch should also prevent the creation of discard entries > in add_discard_addrs(), unless it is a checkpoint caused by trim. > This would further reduce the latency caused by the creation of a discard entry. I found this causes some objects were not reclaimed when removing the module. Hence I'm testing v2. > > Thanks, > Daejun >
On 2023/6/15 0:10, Jaegeuk Kim wrote: > If there're huge # of small discards, this will increase checkpoint latency > insanely. Let's issue small discards only by trim. > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > --- > > Change log from v1: > - move the skip logic to avoid dangling objects > > fs/f2fs/segment.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > index 8c7af8b4fc47..0457d620011f 100644 > --- a/fs/f2fs/segment.c > +++ b/fs/f2fs/segment.c > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > len = next_pos - cur_pos; > > if (f2fs_sb_has_blkzoned(sbi) || > - (force && len < cpc->trim_minlen)) > + !force || len < cpc->trim_minlen) > goto skip; Sorry for late reply. We have a configuration for such case, what do you think of setting max_small_discards to zero? otherwise, w/ above change, max_small_discards logic may be broken? What: /sys/fs/f2fs/<disk>/max_small_discards Date: November 2013 Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> Description: Controls the issue rate of discard commands that consist of small blocks less than 2MB. The candidates to be discarded are cached until checkpoint is triggered, and issued during the checkpoint. By default, it is disabled with 0. Or, if we prefer to disable small_discards by default, what about below change: From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 From: Chao Yu <chao@kernel.org> Date: Mon, 3 Jul 2023 09:06:53 +0800 Subject: [PATCH] Revert "f2fs: enable small discard by default" This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order to disable small discard by default, so that if there're huge number of small discards, it will decrease checkpoint's latency obviously. Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint"), due to it breaks small discard feature which may be configured via sysfs entry max_small_discards. Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") Signed-off-by: Chao Yu <chao@kernel.org> --- fs/f2fs/segment.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 14c822e5c9c9..0a313368f18b 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, len = next_pos - cur_pos; if (f2fs_sb_has_blkzoned(sbi) || - !force || len < cpc->trim_minlen) + (force && len < cpc->trim_minlen)) goto skip; f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) atomic_set(&dcc->queued_discard, 0); atomic_set(&dcc->discard_cmd_cnt, 0); dcc->nr_discards = 0; - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; + dcc->max_discards = 0; dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME;
On 07/03, Chao Yu wrote: > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > If there're huge # of small discards, this will increase checkpoint latency > > insanely. Let's issue small discards only by trim. > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > --- > > > > Change log from v1: > > - move the skip logic to avoid dangling objects > > > > fs/f2fs/segment.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > index 8c7af8b4fc47..0457d620011f 100644 > > --- a/fs/f2fs/segment.c > > +++ b/fs/f2fs/segment.c > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > len = next_pos - cur_pos; > > if (f2fs_sb_has_blkzoned(sbi) || > > - (force && len < cpc->trim_minlen)) > > + !force || len < cpc->trim_minlen) > > goto skip; > > Sorry for late reply. > > We have a configuration for such case, what do you think of setting > max_small_discards to zero? otherwise, w/ above change, max_small_discards > logic may be broken? > > What: /sys/fs/f2fs/<disk>/max_small_discards > Date: November 2013 > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > Description: Controls the issue rate of discard commands that consist of small > blocks less than 2MB. The candidates to be discarded are cached until > checkpoint is triggered, and issued during the checkpoint. > By default, it is disabled with 0. > > Or, if we prefer to disable small_discards by default, what about below change: I think small_discards is fine, but need to avoid long checkpoint latency only. > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > From: Chao Yu <chao@kernel.org> > Date: Mon, 3 Jul 2023 09:06:53 +0800 > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > to disable small discard by default, so that if there're huge number of > small discards, it will decrease checkpoint's latency obviously. > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > commands during checkpoint"), due to it breaks small discard feature which > may be configured via sysfs entry max_small_discards. > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > Signed-off-by: Chao Yu <chao@kernel.org> > --- > fs/f2fs/segment.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > index 14c822e5c9c9..0a313368f18b 100644 > --- a/fs/f2fs/segment.c > +++ b/fs/f2fs/segment.c > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > len = next_pos - cur_pos; > > if (f2fs_sb_has_blkzoned(sbi) || > - !force || len < cpc->trim_minlen) > + (force && len < cpc->trim_minlen)) > goto skip; > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > atomic_set(&dcc->queued_discard, 0); > atomic_set(&dcc->discard_cmd_cnt, 0); > dcc->nr_discards = 0; > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > + dcc->max_discards = 0; > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > -- > 2.40.1 > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/4 18:53, Jaegeuk Kim wrote: > On 07/03, Chao Yu wrote: >> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>> If there're huge # of small discards, this will increase checkpoint latency >>> insanely. Let's issue small discards only by trim. >>> >>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>> --- >>> >>> Change log from v1: >>> - move the skip logic to avoid dangling objects >>> >>> fs/f2fs/segment.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>> index 8c7af8b4fc47..0457d620011f 100644 >>> --- a/fs/f2fs/segment.c >>> +++ b/fs/f2fs/segment.c >>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>> len = next_pos - cur_pos; >>> if (f2fs_sb_has_blkzoned(sbi) || >>> - (force && len < cpc->trim_minlen)) >>> + !force || len < cpc->trim_minlen) >>> goto skip; >> >> Sorry for late reply. >> >> We have a configuration for such case, what do you think of setting >> max_small_discards to zero? otherwise, w/ above change, max_small_discards >> logic may be broken? >> >> What: /sys/fs/f2fs/<disk>/max_small_discards >> Date: November 2013 >> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >> Description: Controls the issue rate of discard commands that consist of small >> blocks less than 2MB. The candidates to be discarded are cached until >> checkpoint is triggered, and issued during the checkpoint. >> By default, it is disabled with 0. >> >> Or, if we prefer to disable small_discards by default, what about below change: > > I think small_discards is fine, but need to avoid long checkpoint latency only. I didn't get you, do you mean we can still issue small discard by fstrim, so small_discards functionality is fine? Thanks, > >> >> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >> From: Chao Yu <chao@kernel.org> >> Date: Mon, 3 Jul 2023 09:06:53 +0800 >> Subject: [PATCH] Revert "f2fs: enable small discard by default" >> >> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >> to disable small discard by default, so that if there're huge number of >> small discards, it will decrease checkpoint's latency obviously. >> >> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >> commands during checkpoint"), due to it breaks small discard feature which >> may be configured via sysfs entry max_small_discards. >> >> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >> Signed-off-by: Chao Yu <chao@kernel.org> >> --- >> fs/f2fs/segment.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >> index 14c822e5c9c9..0a313368f18b 100644 >> --- a/fs/f2fs/segment.c >> +++ b/fs/f2fs/segment.c >> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >> len = next_pos - cur_pos; >> >> if (f2fs_sb_has_blkzoned(sbi) || >> - !force || len < cpc->trim_minlen) >> + (force && len < cpc->trim_minlen)) >> goto skip; >> >> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >> atomic_set(&dcc->queued_discard, 0); >> atomic_set(&dcc->discard_cmd_cnt, 0); >> dcc->nr_discards = 0; >> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >> + dcc->max_discards = 0; >> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >> -- >> 2.40.1 >> >> >> >>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 07/04, Chao Yu wrote: > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > On 07/03, Chao Yu wrote: > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > insanely. Let's issue small discards only by trim. > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > --- > > > > > > > > Change log from v1: > > > > - move the skip logic to avoid dangling objects > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > --- a/fs/f2fs/segment.c > > > > +++ b/fs/f2fs/segment.c > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > len = next_pos - cur_pos; > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > - (force && len < cpc->trim_minlen)) > > > > + !force || len < cpc->trim_minlen) > > > > goto skip; > > > > > > Sorry for late reply. > > > > > > We have a configuration for such case, what do you think of setting > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > logic may be broken? > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > Date: November 2013 > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > Description: Controls the issue rate of discard commands that consist of small > > > blocks less than 2MB. The candidates to be discarded are cached until > > > checkpoint is triggered, and issued during the checkpoint. > > > By default, it is disabled with 0. > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > I didn't get you, do you mean we can still issue small discard by > fstrim, so small_discards functionality is fine? You got the point. > > Thanks, > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > From: Chao Yu <chao@kernel.org> > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > to disable small discard by default, so that if there're huge number of > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > commands during checkpoint"), due to it breaks small discard feature which > > > may be configured via sysfs entry max_small_discards. > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > --- > > > fs/f2fs/segment.c | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > index 14c822e5c9c9..0a313368f18b 100644 > > > --- a/fs/f2fs/segment.c > > > +++ b/fs/f2fs/segment.c > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > len = next_pos - cur_pos; > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > - !force || len < cpc->trim_minlen) > > > + (force && len < cpc->trim_minlen)) > > > goto skip; > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > atomic_set(&dcc->queued_discard, 0); > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > dcc->nr_discards = 0; > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > + dcc->max_discards = 0; > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > -- > > > 2.40.1 > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/6 1:30, Jaegeuk Kim wrote: > On 07/04, Chao Yu wrote: >> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>> On 07/03, Chao Yu wrote: >>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>> insanely. Let's issue small discards only by trim. >>>>> >>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>> --- >>>>> >>>>> Change log from v1: >>>>> - move the skip logic to avoid dangling objects >>>>> >>>>> fs/f2fs/segment.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>> >>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>> --- a/fs/f2fs/segment.c >>>>> +++ b/fs/f2fs/segment.c >>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>> len = next_pos - cur_pos; >>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>> - (force && len < cpc->trim_minlen)) >>>>> + !force || len < cpc->trim_minlen) >>>>> goto skip; >>>> >>>> Sorry for late reply. >>>> >>>> We have a configuration for such case, what do you think of setting >>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>> logic may be broken? >>>> >>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>> Date: November 2013 >>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>> Description: Controls the issue rate of discard commands that consist of small >>>> blocks less than 2MB. The candidates to be discarded are cached until >>>> checkpoint is triggered, and issued during the checkpoint. >>>> By default, it is disabled with 0. >>>> >>>> Or, if we prefer to disable small_discards by default, what about below change: >>> >>> I think small_discards is fine, but need to avoid long checkpoint latency only. >> >> I didn't get you, do you mean we can still issue small discard by >> fstrim, so small_discards functionality is fine? > > You got the point. Well, actually, what I mean is max_small_discards sysfs entry's functionality is broken. Now, the entry can not be used to control number of small discards committed by checkpoint. I think there is another way to achieve "avoid long checkpoint latency caused by committing huge # of small discards", the way is we can set max_small_discards to small value or zero, w/ such configuration, it will take checkpoint much less time or no time to committing small discard due to below control logic: f2fs_flush_sit_entries() { ... if (!(cpc->reason & CP_DISCARD)) { cpc->trim_start = segno; add_discard_addrs(sbi, cpc, false); } ... } add_discard_addrs() { ... while (force || SM_I(sbi)->dcc_info->nr_discards <= SM_I(sbi)->dcc_info->max_discards) { It will break the loop once nr_discards is larger than max_discards, if max_discards is set to zero, checkpoint won't take time to handle small discards. ... if (!de) { de = f2fs_kmem_cache_alloc(discard_entry_slab, GFP_F2FS_ZERO, true, NULL); de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); list_add_tail(&de->list, head); } ... } ... Thanks, > >> >> Thanks, >> >>> >>>> >>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>> From: Chao Yu <chao@kernel.org> >>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>> >>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>> to disable small discard by default, so that if there're huge number of >>>> small discards, it will decrease checkpoint's latency obviously. >>>> >>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>> commands during checkpoint"), due to it breaks small discard feature which >>>> may be configured via sysfs entry max_small_discards. >>>> >>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>> --- >>>> fs/f2fs/segment.c | 4 ++-- >>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>> index 14c822e5c9c9..0a313368f18b 100644 >>>> --- a/fs/f2fs/segment.c >>>> +++ b/fs/f2fs/segment.c >>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>> len = next_pos - cur_pos; >>>> >>>> if (f2fs_sb_has_blkzoned(sbi) || >>>> - !force || len < cpc->trim_minlen) >>>> + (force && len < cpc->trim_minlen)) >>>> goto skip; >>>> >>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>> atomic_set(&dcc->queued_discard, 0); >>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>> dcc->nr_discards = 0; >>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>> + dcc->max_discards = 0; >>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>> -- >>>> 2.40.1 >>>> >>>> >>>> >>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/6 8:46, Chao Yu wrote: > On 2023/7/6 1:30, Jaegeuk Kim wrote: >> On 07/04, Chao Yu wrote: >>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>> On 07/03, Chao Yu wrote: >>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>> insanely. Let's issue small discards only by trim. >>>>>> >>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>> --- >>>>>> >>>>>> Change log from v1: >>>>>> - move the skip logic to avoid dangling objects >>>>>> >>>>>> fs/f2fs/segment.c | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>> --- a/fs/f2fs/segment.c >>>>>> +++ b/fs/f2fs/segment.c >>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>> len = next_pos - cur_pos; >>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>> - (force && len < cpc->trim_minlen)) >>>>>> + !force || len < cpc->trim_minlen) >>>>>> goto skip; >>>>> >>>>> Sorry for late reply. >>>>> >>>>> We have a configuration for such case, what do you think of setting >>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>> logic may be broken? >>>>> >>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>> Date: November 2013 >>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>> Description: Controls the issue rate of discard commands that consist of small >>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>> checkpoint is triggered, and issued during the checkpoint. >>>>> By default, it is disabled with 0. >>>>> >>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>> >>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>> >>> I didn't get you, do you mean we can still issue small discard by >>> fstrim, so small_discards functionality is fine? >> >> You got the point. > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > is broken. Now, the entry can not be used to control number of small discards > committed by checkpoint. > > I think there is another way to achieve "avoid long checkpoint latency caused > by committing huge # of small discards", the way is we can set max_small_discards > to small value or zero, w/ such configuration, it will take checkpoint much less > time or no time to committing small discard due to below control logic: Jaegeuk, any comments? Thanks, > > f2fs_flush_sit_entries() > { > ... > if (!(cpc->reason & CP_DISCARD)) { > cpc->trim_start = segno; > add_discard_addrs(sbi, cpc, false); > } > ... > } > > add_discard_addrs() > { > ... > while (force || SM_I(sbi)->dcc_info->nr_discards <= > SM_I(sbi)->dcc_info->max_discards) { > > It will break the loop once nr_discards is larger than max_discards, if > max_discards is set to zero, checkpoint won't take time to handle small discards. > > ... > if (!de) { > de = f2fs_kmem_cache_alloc(discard_entry_slab, > GFP_F2FS_ZERO, true, NULL); > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > list_add_tail(&de->list, head); > } > ... > } > ... > > Thanks, > >> >>> >>> Thanks, >>> >>>> >>>>> >>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>> From: Chao Yu <chao@kernel.org> >>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>> >>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>> to disable small discard by default, so that if there're huge number of >>>>> small discards, it will decrease checkpoint's latency obviously. >>>>> >>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>> may be configured via sysfs entry max_small_discards. >>>>> >>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>> --- >>>>> fs/f2fs/segment.c | 4 ++-- >>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>> --- a/fs/f2fs/segment.c >>>>> +++ b/fs/f2fs/segment.c >>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>> len = next_pos - cur_pos; >>>>> >>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>> - !force || len < cpc->trim_minlen) >>>>> + (force && len < cpc->trim_minlen)) >>>>> goto skip; >>>>> >>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>> atomic_set(&dcc->queued_discard, 0); >>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>> dcc->nr_discards = 0; >>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>> + dcc->max_discards = 0; >>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>> -- >>>>> 2.40.1 >>>>> >>>>> >>>>> >>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > _______________________________________________ > Linux-f2fs-devel mailing list > Linux-f2fs-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
On 07/06, Chao Yu wrote: > On 2023/7/6 1:30, Jaegeuk Kim wrote: > > On 07/04, Chao Yu wrote: > > > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > > > On 07/03, Chao Yu wrote: > > > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > > > insanely. Let's issue small discards only by trim. > > > > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > > > --- > > > > > > > > > > > > Change log from v1: > > > > > > - move the skip logic to avoid dangling objects > > > > > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > > > --- a/fs/f2fs/segment.c > > > > > > +++ b/fs/f2fs/segment.c > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > len = next_pos - cur_pos; > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > - (force && len < cpc->trim_minlen)) > > > > > > + !force || len < cpc->trim_minlen) > > > > > > goto skip; > > > > > > > > > > Sorry for late reply. > > > > > > > > > > We have a configuration for such case, what do you think of setting > > > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > > > logic may be broken? > > > > > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > > > Date: November 2013 > > > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > > > Description: Controls the issue rate of discard commands that consist of small > > > > > blocks less than 2MB. The candidates to be discarded are cached until > > > > > checkpoint is triggered, and issued during the checkpoint. > > > > > By default, it is disabled with 0. > > > > > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > > > > > I didn't get you, do you mean we can still issue small discard by > > > fstrim, so small_discards functionality is fine? > > > > You got the point. > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > is broken. Now, the entry can not be used to control number of small discards > committed by checkpoint. Could you descrbie this problem first? > > I think there is another way to achieve "avoid long checkpoint latency caused > by committing huge # of small discards", the way is we can set max_small_discards > to small value or zero, w/ such configuration, it will take checkpoint much less > time or no time to committing small discard due to below control logic: > > f2fs_flush_sit_entries() > { > ... > if (!(cpc->reason & CP_DISCARD)) { > cpc->trim_start = segno; > add_discard_addrs(sbi, cpc, false); > } > ... > } > > add_discard_addrs() > { > ... > while (force || SM_I(sbi)->dcc_info->nr_discards <= > SM_I(sbi)->dcc_info->max_discards) { > > It will break the loop once nr_discards is larger than max_discards, if > max_discards is set to zero, checkpoint won't take time to handle small discards. > > ... > if (!de) { > de = f2fs_kmem_cache_alloc(discard_entry_slab, > GFP_F2FS_ZERO, true, NULL); > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > list_add_tail(&de->list, head); > } > ... > } > ... > > Thanks, > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > > > From: Chao Yu <chao@kernel.org> > > > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > > > to disable small discard by default, so that if there're huge number of > > > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > > > commands during checkpoint"), due to it breaks small discard feature which > > > > > may be configured via sysfs entry max_small_discards. > > > > > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > > > --- > > > > > fs/f2fs/segment.c | 4 ++-- > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > index 14c822e5c9c9..0a313368f18b 100644 > > > > > --- a/fs/f2fs/segment.c > > > > > +++ b/fs/f2fs/segment.c > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > len = next_pos - cur_pos; > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > - !force || len < cpc->trim_minlen) > > > > > + (force && len < cpc->trim_minlen)) > > > > > goto skip; > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > > > atomic_set(&dcc->queued_discard, 0); > > > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > > > dcc->nr_discards = 0; > > > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > > > + dcc->max_discards = 0; > > > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > > > -- > > > > > 2.40.1 > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/12 0:37, Jaegeuk Kim wrote: > On 07/06, Chao Yu wrote: >> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>> On 07/04, Chao Yu wrote: >>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>> On 07/03, Chao Yu wrote: >>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>> insanely. Let's issue small discards only by trim. >>>>>>> >>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>> --- >>>>>>> >>>>>>> Change log from v1: >>>>>>> - move the skip logic to avoid dangling objects >>>>>>> >>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>> --- a/fs/f2fs/segment.c >>>>>>> +++ b/fs/f2fs/segment.c >>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>> len = next_pos - cur_pos; >>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>> + !force || len < cpc->trim_minlen) >>>>>>> goto skip; >>>>>> >>>>>> Sorry for late reply. >>>>>> >>>>>> We have a configuration for such case, what do you think of setting >>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>> logic may be broken? >>>>>> >>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>> Date: November 2013 >>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>> By default, it is disabled with 0. >>>>>> >>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>> >>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>> >>>> I didn't get you, do you mean we can still issue small discard by >>>> fstrim, so small_discards functionality is fine? >>> >>> You got the point. >> >> Well, actually, what I mean is max_small_discards sysfs entry's functionality >> is broken. Now, the entry can not be used to control number of small discards >> committed by checkpoint. > > Could you descrbie this problem first? Oh, alright, actually, I've described this problem literally, but maybe it's not clear, let me give some examples as below: echo 0 > /sys/fs/f2fs/vdb/max_small_discards xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" xfs_io /mnt/f2fs/file -c "fpunch 0 4k" sync cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 echo 100 > /sys/fs/f2fs/vdb/max_small_discards rm /mnt/f2fs/file xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" xfs_io /mnt/f2fs/file -c "fpunch 0 4k" sync cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 Before the patch: Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 3 1 . . . . . . After the patch: Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . Discard pend list(Show diacrd_cmd count on each entry, .:not exist): 0 . . . . . . . . So, now max_small_discards can not be used to control small discard number cached by checkpoint. Thanks, > >> >> I think there is another way to achieve "avoid long checkpoint latency caused >> by committing huge # of small discards", the way is we can set max_small_discards >> to small value or zero, w/ such configuration, it will take checkpoint much less >> time or no time to committing small discard due to below control logic: >> >> f2fs_flush_sit_entries() >> { >> ... >> if (!(cpc->reason & CP_DISCARD)) { >> cpc->trim_start = segno; >> add_discard_addrs(sbi, cpc, false); >> } >> ... >> } >> >> add_discard_addrs() >> { >> ... >> while (force || SM_I(sbi)->dcc_info->nr_discards <= >> SM_I(sbi)->dcc_info->max_discards) { >> >> It will break the loop once nr_discards is larger than max_discards, if >> max_discards is set to zero, checkpoint won't take time to handle small discards. >> >> ... >> if (!de) { >> de = f2fs_kmem_cache_alloc(discard_entry_slab, >> GFP_F2FS_ZERO, true, NULL); >> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >> list_add_tail(&de->list, head); >> } >> ... >> } >> ... >> >> Thanks, >> >>> >>>> >>>> Thanks, >>>> >>>>> >>>>>> >>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>> From: Chao Yu <chao@kernel.org> >>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>> >>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>> to disable small discard by default, so that if there're huge number of >>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>> >>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>> may be configured via sysfs entry max_small_discards. >>>>>> >>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>> --- >>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>> --- a/fs/f2fs/segment.c >>>>>> +++ b/fs/f2fs/segment.c >>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>> len = next_pos - cur_pos; >>>>>> >>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>> - !force || len < cpc->trim_minlen) >>>>>> + (force && len < cpc->trim_minlen)) >>>>>> goto skip; >>>>>> >>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>> dcc->nr_discards = 0; >>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>> + dcc->max_discards = 0; >>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>> -- >>>>>> 2.40.1 >>>>>> >>>>>> >>>>>> >>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 07/12, Chao Yu wrote: > On 2023/7/12 0:37, Jaegeuk Kim wrote: > > On 07/06, Chao Yu wrote: > > > On 2023/7/6 1:30, Jaegeuk Kim wrote: > > > > On 07/04, Chao Yu wrote: > > > > > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > > > > > On 07/03, Chao Yu wrote: > > > > > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > > > > > insanely. Let's issue small discards only by trim. > > > > > > > > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > > > > > --- > > > > > > > > > > > > > > > > Change log from v1: > > > > > > > > - move the skip logic to avoid dangling objects > > > > > > > > > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > len = next_pos - cur_pos; > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > - (force && len < cpc->trim_minlen)) > > > > > > > > + !force || len < cpc->trim_minlen) > > > > > > > > goto skip; > > > > > > > > > > > > > > Sorry for late reply. > > > > > > > > > > > > > > We have a configuration for such case, what do you think of setting > > > > > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > > > > > logic may be broken? > > > > > > > > > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > > > > > Date: November 2013 > > > > > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > > > > > Description: Controls the issue rate of discard commands that consist of small > > > > > > > blocks less than 2MB. The candidates to be discarded are cached until > > > > > > > checkpoint is triggered, and issued during the checkpoint. > > > > > > > By default, it is disabled with 0. > > > > > > > > > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > > > > > > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > > > > > > > > > I didn't get you, do you mean we can still issue small discard by > > > > > fstrim, so small_discards functionality is fine? > > > > > > > > You got the point. > > > > > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > > > is broken. Now, the entry can not be used to control number of small discards > > > committed by checkpoint. > > > > Could you descrbie this problem first? > > Oh, alright, actually, I've described this problem literally, but maybe it's not > clear, let me give some examples as below: > > echo 0 > /sys/fs/f2fs/vdb/max_small_discards > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > sync > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > echo 100 > /sys/fs/f2fs/vdb/max_small_discards > rm /mnt/f2fs/file > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > sync > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > Before the patch: > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > 0 . . . . . . . . > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > 0 3 1 . . . . . . > > After the patch: > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > 0 . . . . . . . . > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > 0 . . . . . . . . > > So, now max_small_discards can not be used to control small discard number > cached by checkpoint. Since we do not submit small discards anymore during checkpoint. Why not relying on the discard thread to issue them? > > Thanks, > > > > > > > > > I think there is another way to achieve "avoid long checkpoint latency caused > > > by committing huge # of small discards", the way is we can set max_small_discards > > > to small value or zero, w/ such configuration, it will take checkpoint much less > > > time or no time to committing small discard due to below control logic: > > > > > > f2fs_flush_sit_entries() > > > { > > > ... > > > if (!(cpc->reason & CP_DISCARD)) { > > > cpc->trim_start = segno; > > > add_discard_addrs(sbi, cpc, false); > > > } > > > ... > > > } > > > > > > add_discard_addrs() > > > { > > > ... > > > while (force || SM_I(sbi)->dcc_info->nr_discards <= > > > SM_I(sbi)->dcc_info->max_discards) { > > > > > > It will break the loop once nr_discards is larger than max_discards, if > > > max_discards is set to zero, checkpoint won't take time to handle small discards. > > > > > > ... > > > if (!de) { > > > de = f2fs_kmem_cache_alloc(discard_entry_slab, > > > GFP_F2FS_ZERO, true, NULL); > > > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > > > list_add_tail(&de->list, head); > > > } > > > ... > > > } > > > ... > > > > > > Thanks, > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > > > > > From: Chao Yu <chao@kernel.org> > > > > > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > > > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > > > > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > > > > > to disable small discard by default, so that if there're huge number of > > > > > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > > > > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > > > > > commands during checkpoint"), due to it breaks small discard feature which > > > > > > > may be configured via sysfs entry max_small_discards. > > > > > > > > > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > > > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > > > > > --- > > > > > > > fs/f2fs/segment.c | 4 ++-- > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > index 14c822e5c9c9..0a313368f18b 100644 > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > - !force || len < cpc->trim_minlen) > > > > > > > + (force && len < cpc->trim_minlen)) > > > > > > > goto skip; > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > > > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > > > > > atomic_set(&dcc->queued_discard, 0); > > > > > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > > > > > dcc->nr_discards = 0; > > > > > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > > > > > + dcc->max_discards = 0; > > > > > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > > > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > > > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > > > > > -- > > > > > > > 2.40.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/12 23:55, Jaegeuk Kim wrote: > On 07/12, Chao Yu wrote: >> On 2023/7/12 0:37, Jaegeuk Kim wrote: >>> On 07/06, Chao Yu wrote: >>>> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>>>> On 07/04, Chao Yu wrote: >>>>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>>>> On 07/03, Chao Yu wrote: >>>>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>>>> insanely. Let's issue small discards only by trim. >>>>>>>>> >>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>>>> --- >>>>>>>>> >>>>>>>>> Change log from v1: >>>>>>>>> - move the skip logic to avoid dangling objects >>>>>>>>> >>>>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>> >>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>> len = next_pos - cur_pos; >>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>>>> + !force || len < cpc->trim_minlen) >>>>>>>>> goto skip; >>>>>>>> >>>>>>>> Sorry for late reply. >>>>>>>> >>>>>>>> We have a configuration for such case, what do you think of setting >>>>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>>>> logic may be broken? >>>>>>>> >>>>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>>>> Date: November 2013 >>>>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>>>> By default, it is disabled with 0. >>>>>>>> >>>>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>>>> >>>>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>>>> >>>>>> I didn't get you, do you mean we can still issue small discard by >>>>>> fstrim, so small_discards functionality is fine? >>>>> >>>>> You got the point. >>>> >>>> Well, actually, what I mean is max_small_discards sysfs entry's functionality >>>> is broken. Now, the entry can not be used to control number of small discards >>>> committed by checkpoint. >>> >>> Could you descrbie this problem first? >> >> Oh, alright, actually, I've described this problem literally, but maybe it's not >> clear, let me give some examples as below: >> >> echo 0 > /sys/fs/f2fs/vdb/max_small_discards >> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >> sync >> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >> >> echo 100 > /sys/fs/f2fs/vdb/max_small_discards >> rm /mnt/f2fs/file >> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >> sync >> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >> >> Before the patch: >> >> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >> 0 . . . . . . . . >> >> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >> 0 3 1 . . . . . . >> >> After the patch: >> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >> 0 . . . . . . . . >> >> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >> 0 . . . . . . . . >> >> So, now max_small_discards can not be used to control small discard number >> cached by checkpoint. Let me explain more: Previously, we have two mechanisms to cache & submit small discards: a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint will cache small discard candidates w/ configured maximum number. b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM interface is asynchronized, so it won't submit discard directly. Finally, discard thread will submit them in background periodically. So what I mean is the mechanism a) is broken, since no matter how we configure the sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small discard candidates any more. So, it needs to fix max_small_discards sysfs functionality? or just drop the functionality? > > Since we do not submit small discards anymore during checkpoint. Why not relying > on the discard thread to issue them? Sorry, I'm not sure I get your point, do you mean max_small_discards functionality is obsoleted, so it recommended to use fstrim to cache & submit small discards? Let me know, if I'm missing something or misunderstanding the point. Thanks, > >> >> Thanks, >> >>> >>>> >>>> I think there is another way to achieve "avoid long checkpoint latency caused >>>> by committing huge # of small discards", the way is we can set max_small_discards >>>> to small value or zero, w/ such configuration, it will take checkpoint much less >>>> time or no time to committing small discard due to below control logic: >>>> >>>> f2fs_flush_sit_entries() >>>> { >>>> ... >>>> if (!(cpc->reason & CP_DISCARD)) { >>>> cpc->trim_start = segno; >>>> add_discard_addrs(sbi, cpc, false); >>>> } >>>> ... >>>> } >>>> >>>> add_discard_addrs() >>>> { >>>> ... >>>> while (force || SM_I(sbi)->dcc_info->nr_discards <= >>>> SM_I(sbi)->dcc_info->max_discards) { >>>> >>>> It will break the loop once nr_discards is larger than max_discards, if >>>> max_discards is set to zero, checkpoint won't take time to handle small discards. >>>> >>>> ... >>>> if (!de) { >>>> de = f2fs_kmem_cache_alloc(discard_entry_slab, >>>> GFP_F2FS_ZERO, true, NULL); >>>> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >>>> list_add_tail(&de->list, head); >>>> } >>>> ... >>>> } >>>> ... >>>> >>>> Thanks, >>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> >>>>>>>> >>>>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>>>> From: Chao Yu <chao@kernel.org> >>>>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>>>> >>>>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>>>> to disable small discard by default, so that if there're huge number of >>>>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>>>> >>>>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>>>> may be configured via sysfs entry max_small_discards. >>>>>>>> >>>>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>>>> --- >>>>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>> >>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>> len = next_pos - cur_pos; >>>>>>>> >>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>> - !force || len < cpc->trim_minlen) >>>>>>>> + (force && len < cpc->trim_minlen)) >>>>>>>> goto skip; >>>>>>>> >>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>>>> dcc->nr_discards = 0; >>>>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>>>> + dcc->max_discards = 0; >>>>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>>>> -- >>>>>>>> 2.40.1 >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
Comments? On 2023/7/13 9:31, Chao Yu wrote: > On 2023/7/12 23:55, Jaegeuk Kim wrote: >> On 07/12, Chao Yu wrote: >>> On 2023/7/12 0:37, Jaegeuk Kim wrote: >>>> On 07/06, Chao Yu wrote: >>>>> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>>>>> On 07/04, Chao Yu wrote: >>>>>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>>>>> On 07/03, Chao Yu wrote: >>>>>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>>>>> insanely. Let's issue small discards only by trim. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>>>>> --- >>>>>>>>>> >>>>>>>>>> Change log from v1: >>>>>>>>>> - move the skip logic to avoid dangling objects >>>>>>>>>> >>>>>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>>> >>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>>>>> + !force || len < cpc->trim_minlen) >>>>>>>>>> goto skip; >>>>>>>>> >>>>>>>>> Sorry for late reply. >>>>>>>>> >>>>>>>>> We have a configuration for such case, what do you think of setting >>>>>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>>>>> logic may be broken? >>>>>>>>> >>>>>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>>>>> Date: November 2013 >>>>>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>>>>> By default, it is disabled with 0. >>>>>>>>> >>>>>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>>>>> >>>>>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>>>>> >>>>>>> I didn't get you, do you mean we can still issue small discard by >>>>>>> fstrim, so small_discards functionality is fine? >>>>>> >>>>>> You got the point. >>>>> >>>>> Well, actually, what I mean is max_small_discards sysfs entry's functionality >>>>> is broken. Now, the entry can not be used to control number of small discards >>>>> committed by checkpoint. >>>> >>>> Could you descrbie this problem first? >>> >>> Oh, alright, actually, I've described this problem literally, but maybe it's not >>> clear, let me give some examples as below: >>> >>> echo 0 > /sys/fs/f2fs/vdb/max_small_discards >>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>> sync >>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>> >>> echo 100 > /sys/fs/f2fs/vdb/max_small_discards >>> rm /mnt/f2fs/file >>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>> sync >>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>> >>> Before the patch: >>> >>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>> 0 . . . . . . . . >>> >>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>> 0 3 1 . . . . . . >>> >>> After the patch: >>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>> 0 . . . . . . . . >>> >>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>> 0 . . . . . . . . >>> >>> So, now max_small_discards can not be used to control small discard number >>> cached by checkpoint. > > Let me explain more: > > Previously, we have two mechanisms to cache & submit small discards: > > a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint > will cache small discard candidates w/ configured maximum number. > > b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard > candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM > interface is asynchronized, so it won't submit discard directly. > > Finally, discard thread will submit them in background periodically. > > So what I mean is the mechanism a) is broken, since no matter how we configure the > sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small > discard candidates any more. > > So, it needs to fix max_small_discards sysfs functionality? or just drop the > functionality? > >> >> Since we do not submit small discards anymore during checkpoint. Why not relying >> on the discard thread to issue them? > > Sorry, I'm not sure I get your point, do you mean max_small_discards functionality > is obsoleted, so it recommended to use fstrim to cache & submit small discards? > > Let me know, if I'm missing something or misunderstanding the point. > > Thanks, > >> >>> >>> Thanks, >>> >>>> >>>>> >>>>> I think there is another way to achieve "avoid long checkpoint latency caused >>>>> by committing huge # of small discards", the way is we can set max_small_discards >>>>> to small value or zero, w/ such configuration, it will take checkpoint much less >>>>> time or no time to committing small discard due to below control logic: >>>>> >>>>> f2fs_flush_sit_entries() >>>>> { >>>>> ... >>>>> if (!(cpc->reason & CP_DISCARD)) { >>>>> cpc->trim_start = segno; >>>>> add_discard_addrs(sbi, cpc, false); >>>>> } >>>>> ... >>>>> } >>>>> >>>>> add_discard_addrs() >>>>> { >>>>> ... >>>>> while (force || SM_I(sbi)->dcc_info->nr_discards <= >>>>> SM_I(sbi)->dcc_info->max_discards) { >>>>> >>>>> It will break the loop once nr_discards is larger than max_discards, if >>>>> max_discards is set to zero, checkpoint won't take time to handle small discards. >>>>> >>>>> ... >>>>> if (!de) { >>>>> de = f2fs_kmem_cache_alloc(discard_entry_slab, >>>>> GFP_F2FS_ZERO, true, NULL); >>>>> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >>>>> list_add_tail(&de->list, head); >>>>> } >>>>> ... >>>>> } >>>>> ... >>>>> >>>>> Thanks, >>>>> >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>>>>> From: Chao Yu <chao@kernel.org> >>>>>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>>>>> >>>>>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>>>>> to disable small discard by default, so that if there're huge number of >>>>>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>>>>> >>>>>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>>>>> may be configured via sysfs entry max_small_discards. >>>>>>>>> >>>>>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>>>>> --- >>>>>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>> len = next_pos - cur_pos; >>>>>>>>> >>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>> - !force || len < cpc->trim_minlen) >>>>>>>>> + (force && len < cpc->trim_minlen)) >>>>>>>>> goto skip; >>>>>>>>> >>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>>>>> dcc->nr_discards = 0; >>>>>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>>>>> + dcc->max_discards = 0; >>>>>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>>>>> -- >>>>>>>>> 2.40.1 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > _______________________________________________ > Linux-f2fs-devel mailing list > Linux-f2fs-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
On 07/13, Chao Yu wrote: > On 2023/7/12 23:55, Jaegeuk Kim wrote: > > On 07/12, Chao Yu wrote: > > > On 2023/7/12 0:37, Jaegeuk Kim wrote: > > > > On 07/06, Chao Yu wrote: > > > > > On 2023/7/6 1:30, Jaegeuk Kim wrote: > > > > > > On 07/04, Chao Yu wrote: > > > > > > > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > > > > > > > On 07/03, Chao Yu wrote: > > > > > > > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > > > > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > > > > > > > insanely. Let's issue small discards only by trim. > > > > > > > > > > > > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > Change log from v1: > > > > > > > > > > - move the skip logic to avoid dangling objects > > > > > > > > > > > > > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > > - (force && len < cpc->trim_minlen)) > > > > > > > > > > + !force || len < cpc->trim_minlen) > > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > Sorry for late reply. > > > > > > > > > > > > > > > > > > We have a configuration for such case, what do you think of setting > > > > > > > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > > > > > > > logic may be broken? > > > > > > > > > > > > > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > > > > > > > Date: November 2013 > > > > > > > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > > > > > > > Description: Controls the issue rate of discard commands that consist of small > > > > > > > > > blocks less than 2MB. The candidates to be discarded are cached until > > > > > > > > > checkpoint is triggered, and issued during the checkpoint. > > > > > > > > > By default, it is disabled with 0. > > > > > > > > > > > > > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > > > > > > > > > > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > > > > > > > > > > > > > I didn't get you, do you mean we can still issue small discard by > > > > > > > fstrim, so small_discards functionality is fine? > > > > > > > > > > > > You got the point. > > > > > > > > > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > > > > > is broken. Now, the entry can not be used to control number of small discards > > > > > committed by checkpoint. > > > > > > > > Could you descrbie this problem first? > > > > > > Oh, alright, actually, I've described this problem literally, but maybe it's not > > > clear, let me give some examples as below: > > > > > > echo 0 > /sys/fs/f2fs/vdb/max_small_discards > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > sync > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > echo 100 > /sys/fs/f2fs/vdb/max_small_discards > > > rm /mnt/f2fs/file > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > sync > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > Before the patch: > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > 0 . . . . . . . . > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > 0 3 1 . . . . . . > > > > > > After the patch: > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > 0 . . . . . . . . > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > 0 . . . . . . . . > > > > > > So, now max_small_discards can not be used to control small discard number > > > cached by checkpoint. > > Let me explain more: > > Previously, we have two mechanisms to cache & submit small discards: > > a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint > will cache small discard candidates w/ configured maximum number. > > b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard > candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM > interface is asynchronized, so it won't submit discard directly. > > Finally, discard thread will submit them in background periodically. > > So what I mean is the mechanism a) is broken, since no matter how we configure the > sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small > discard candidates any more. Ok, it seems what I encountered before was adding this small discard even after issuing it by checkpoint. Thoughts? node=0], cp [data=0, node=0, meta=0], app [read=0 (direct=0, buffered=0), mapped=0], compr(buffered=0, mapped=0)], fs [data=0, (gc_data=0, cdata=0), node=0, meta=0] f2fs_discard-25-752 [003] ..... 9744.173085: f2fs_iostat_latency: dev = (254,51), iotype [peak lat.(ms)/avg lat.(ms)/count], rd_data [0/0/0], rd_node [0/0/0], rd_meta [0/0/0], wr_sync_data [0/0/0], wr_sync_node [0/0/0], wr_sync_meta [0/0/0], wr_async_data [0/0/0], wr_async_node [0/0/0], wr_async_meta [0/0/0] f2fs_discard-25-752 [003] ..... 9744.173089: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173126: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173140: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173154: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173169: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 f2fs_discard-25-752 [003] ..... 9744.173183: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175272: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0a9, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175345: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175352: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175354: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175360: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175362: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 f2fs_discard-25-752 [004] ..... 9744.175367: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.228775: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.228950: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.228965: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.228994: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.229006: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.229017: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.229028: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.229041: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230811: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230926: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230929: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230932: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230940: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230945: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230947: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 f2fs_discard-25-752 [002] ..... 9744.230952: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 > > So, it needs to fix max_small_discards sysfs functionality? or just drop the > functionality? > > > > > Since we do not submit small discards anymore during checkpoint. Why not relying > > on the discard thread to issue them? > > Sorry, I'm not sure I get your point, do you mean max_small_discards functionality > is obsoleted, so it recommended to use fstrim to cache & submit small discards? > > Let me know, if I'm missing something or misunderstanding the point. > > Thanks, > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > I think there is another way to achieve "avoid long checkpoint latency caused > > > > > by committing huge # of small discards", the way is we can set max_small_discards > > > > > to small value or zero, w/ such configuration, it will take checkpoint much less > > > > > time or no time to committing small discard due to below control logic: > > > > > > > > > > f2fs_flush_sit_entries() > > > > > { > > > > > ... > > > > > if (!(cpc->reason & CP_DISCARD)) { > > > > > cpc->trim_start = segno; > > > > > add_discard_addrs(sbi, cpc, false); > > > > > } > > > > > ... > > > > > } > > > > > > > > > > add_discard_addrs() > > > > > { > > > > > ... > > > > > while (force || SM_I(sbi)->dcc_info->nr_discards <= > > > > > SM_I(sbi)->dcc_info->max_discards) { > > > > > > > > > > It will break the loop once nr_discards is larger than max_discards, if > > > > > max_discards is set to zero, checkpoint won't take time to handle small discards. > > > > > > > > > > ... > > > > > if (!de) { > > > > > de = f2fs_kmem_cache_alloc(discard_entry_slab, > > > > > GFP_F2FS_ZERO, true, NULL); > > > > > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > > > > > list_add_tail(&de->list, head); > > > > > } > > > > > ... > > > > > } > > > > > ... > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > > > > > > > From: Chao Yu <chao@kernel.org> > > > > > > > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > > > > > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > > > > > > > > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > > > > > > > to disable small discard by default, so that if there're huge number of > > > > > > > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > > > > > > > > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > > > > > > > commands during checkpoint"), due to it breaks small discard feature which > > > > > > > > > may be configured via sysfs entry max_small_discards. > > > > > > > > > > > > > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > > > > > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > > > > > > > --- > > > > > > > > > fs/f2fs/segment.c | 4 ++-- > > > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > index 14c822e5c9c9..0a313368f18b 100644 > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > - !force || len < cpc->trim_minlen) > > > > > > > > > + (force && len < cpc->trim_minlen)) > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > > > > > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > > > > > > > atomic_set(&dcc->queued_discard, 0); > > > > > > > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > > > > > > > dcc->nr_discards = 0; > > > > > > > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > > > > > > > + dcc->max_discards = 0; > > > > > > > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > > > > > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > > > > > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > > > > > > > -- > > > > > > > > > 2.40.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/7/22 4:23, Jaegeuk Kim wrote: > On 07/13, Chao Yu wrote: >> On 2023/7/12 23:55, Jaegeuk Kim wrote: >>> On 07/12, Chao Yu wrote: >>>> On 2023/7/12 0:37, Jaegeuk Kim wrote: >>>>> On 07/06, Chao Yu wrote: >>>>>> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>>>>>> On 07/04, Chao Yu wrote: >>>>>>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>>>>>> On 07/03, Chao Yu wrote: >>>>>>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>>>>>> insanely. Let's issue small discards only by trim. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>>>>>> --- >>>>>>>>>>> >>>>>>>>>>> Change log from v1: >>>>>>>>>>> - move the skip logic to avoid dangling objects >>>>>>>>>>> >>>>>>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>>>>>> + !force || len < cpc->trim_minlen) >>>>>>>>>>> goto skip; >>>>>>>>>> >>>>>>>>>> Sorry for late reply. >>>>>>>>>> >>>>>>>>>> We have a configuration for such case, what do you think of setting >>>>>>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>>>>>> logic may be broken? >>>>>>>>>> >>>>>>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>>>>>> Date: November 2013 >>>>>>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>>>>>> By default, it is disabled with 0. >>>>>>>>>> >>>>>>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>>>>>> >>>>>>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>>>>>> >>>>>>>> I didn't get you, do you mean we can still issue small discard by >>>>>>>> fstrim, so small_discards functionality is fine? >>>>>>> >>>>>>> You got the point. >>>>>> >>>>>> Well, actually, what I mean is max_small_discards sysfs entry's functionality >>>>>> is broken. Now, the entry can not be used to control number of small discards >>>>>> committed by checkpoint. >>>>> >>>>> Could you descrbie this problem first? >>>> >>>> Oh, alright, actually, I've described this problem literally, but maybe it's not >>>> clear, let me give some examples as below: >>>> >>>> echo 0 > /sys/fs/f2fs/vdb/max_small_discards >>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>> sync >>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>> >>>> echo 100 > /sys/fs/f2fs/vdb/max_small_discards >>>> rm /mnt/f2fs/file >>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>> sync >>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>> >>>> Before the patch: >>>> >>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>> 0 . . . . . . . . >>>> >>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>> 0 3 1 . . . . . . >>>> >>>> After the patch: >>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>> 0 . . . . . . . . >>>> >>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>> 0 . . . . . . . . >>>> >>>> So, now max_small_discards can not be used to control small discard number >>>> cached by checkpoint. >> >> Let me explain more: >> >> Previously, we have two mechanisms to cache & submit small discards: >> >> a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint >> will cache small discard candidates w/ configured maximum number. >> >> b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard >> candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM >> interface is asynchronized, so it won't submit discard directly. >> >> Finally, discard thread will submit them in background periodically. >> >> So what I mean is the mechanism a) is broken, since no matter how we configure the >> sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small >> discard candidates any more. > > Ok, it seems what I encountered before was adding this small discard even > after issuing it by checkpoint. Thoughts? Do you mean: in f2fs_clear_prefree_segments(), small discard may overlap segment granularity discard? e.g. - f2fs_clear_prefree_segments - f2fs_issue_discard(0, 512) --- segment granularity discard - f2fs_issue_discard(0, 1) --- small discard - f2fs_issue_discard(5, 1) --- small discard Thanks, > > node=0], cp [data=0, node=0, meta=0], app [read=0 (direct=0, buffered=0), mapped=0], compr(buffered=0, mapped=0)], fs [data=0, (gc_data=0, cdata=0), node=0, meta=0] > f2fs_discard-25-752 [003] ..... 9744.173085: f2fs_iostat_latency: dev = (254,51), iotype [peak lat.(ms)/avg lat.(ms)/count], rd_data [0/0/0], rd_node [0/0/0], rd_meta [0/0/0], wr_sync_data [0/0/0], wr_sync_node [0/0/0], wr_sync_meta [0/0/0], wr_async_data [0/0/0], wr_async_node [0/0/0], wr_async_meta [0/0/0] > f2fs_discard-25-752 [003] ..... 9744.173089: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173126: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173140: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173154: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173169: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 > f2fs_discard-25-752 [003] ..... 9744.173183: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175272: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0a9, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175345: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175352: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175354: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175360: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175362: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 > f2fs_discard-25-752 [004] ..... 9744.175367: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.228775: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.228950: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.228965: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.228994: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.229006: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.229017: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.229028: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.229041: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230811: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230926: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230929: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230932: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230940: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230945: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230947: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 > f2fs_discard-25-752 [002] ..... 9744.230952: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 > >> >> So, it needs to fix max_small_discards sysfs functionality? or just drop the >> functionality? >> >>> >>> Since we do not submit small discards anymore during checkpoint. Why not relying >>> on the discard thread to issue them? >> >> Sorry, I'm not sure I get your point, do you mean max_small_discards functionality >> is obsoleted, so it recommended to use fstrim to cache & submit small discards? >> >> Let me know, if I'm missing something or misunderstanding the point. >> >> Thanks, >> >>> >>>> >>>> Thanks, >>>> >>>>> >>>>>> >>>>>> I think there is another way to achieve "avoid long checkpoint latency caused >>>>>> by committing huge # of small discards", the way is we can set max_small_discards >>>>>> to small value or zero, w/ such configuration, it will take checkpoint much less >>>>>> time or no time to committing small discard due to below control logic: >>>>>> >>>>>> f2fs_flush_sit_entries() >>>>>> { >>>>>> ... >>>>>> if (!(cpc->reason & CP_DISCARD)) { >>>>>> cpc->trim_start = segno; >>>>>> add_discard_addrs(sbi, cpc, false); >>>>>> } >>>>>> ... >>>>>> } >>>>>> >>>>>> add_discard_addrs() >>>>>> { >>>>>> ... >>>>>> while (force || SM_I(sbi)->dcc_info->nr_discards <= >>>>>> SM_I(sbi)->dcc_info->max_discards) { >>>>>> >>>>>> It will break the loop once nr_discards is larger than max_discards, if >>>>>> max_discards is set to zero, checkpoint won't take time to handle small discards. >>>>>> >>>>>> ... >>>>>> if (!de) { >>>>>> de = f2fs_kmem_cache_alloc(discard_entry_slab, >>>>>> GFP_F2FS_ZERO, true, NULL); >>>>>> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >>>>>> list_add_tail(&de->list, head); >>>>>> } >>>>>> ... >>>>>> } >>>>>> ... >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>>>>>> From: Chao Yu <chao@kernel.org> >>>>>>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>>>>>> >>>>>>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>>>>>> to disable small discard by default, so that if there're huge number of >>>>>>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>>>>>> >>>>>>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>>>>>> may be configured via sysfs entry max_small_discards. >>>>>>>>>> >>>>>>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>>>>>> --- >>>>>>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>> >>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>> - !force || len < cpc->trim_minlen) >>>>>>>>>> + (force && len < cpc->trim_minlen)) >>>>>>>>>> goto skip; >>>>>>>>>> >>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>>>>>> dcc->nr_discards = 0; >>>>>>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>>>>>> + dcc->max_discards = 0; >>>>>>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>>>>>> -- >>>>>>>>>> 2.40.1 >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 07/25, Chao Yu wrote: > On 2023/7/22 4:23, Jaegeuk Kim wrote: > > On 07/13, Chao Yu wrote: > > > On 2023/7/12 23:55, Jaegeuk Kim wrote: > > > > On 07/12, Chao Yu wrote: > > > > > On 2023/7/12 0:37, Jaegeuk Kim wrote: > > > > > > On 07/06, Chao Yu wrote: > > > > > > > On 2023/7/6 1:30, Jaegeuk Kim wrote: > > > > > > > > On 07/04, Chao Yu wrote: > > > > > > > > > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > > > > > > > > > On 07/03, Chao Yu wrote: > > > > > > > > > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > > > > > > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > > > > > > > > > insanely. Let's issue small discards only by trim. > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > > > > > Change log from v1: > > > > > > > > > > > > - move the skip logic to avoid dangling objects > > > > > > > > > > > > > > > > > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > > > > - (force && len < cpc->trim_minlen)) > > > > > > > > > > > > + !force || len < cpc->trim_minlen) > > > > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > > > > > Sorry for late reply. > > > > > > > > > > > > > > > > > > > > > > We have a configuration for such case, what do you think of setting > > > > > > > > > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > > > > > > > > > logic may be broken? > > > > > > > > > > > > > > > > > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > > > > > > > > > Date: November 2013 > > > > > > > > > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > > > > > > > > > Description: Controls the issue rate of discard commands that consist of small > > > > > > > > > > > blocks less than 2MB. The candidates to be discarded are cached until > > > > > > > > > > > checkpoint is triggered, and issued during the checkpoint. > > > > > > > > > > > By default, it is disabled with 0. > > > > > > > > > > > > > > > > > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > > > > > > > > > > > > > > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > > > > > > > > > > > > > > > > > I didn't get you, do you mean we can still issue small discard by > > > > > > > > > fstrim, so small_discards functionality is fine? > > > > > > > > > > > > > > > > You got the point. > > > > > > > > > > > > > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > > > > > > > is broken. Now, the entry can not be used to control number of small discards > > > > > > > committed by checkpoint. > > > > > > > > > > > > Could you descrbie this problem first? > > > > > > > > > > Oh, alright, actually, I've described this problem literally, but maybe it's not > > > > > clear, let me give some examples as below: > > > > > > > > > > echo 0 > /sys/fs/f2fs/vdb/max_small_discards > > > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > > > sync > > > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > > > > > echo 100 > /sys/fs/f2fs/vdb/max_small_discards > > > > > rm /mnt/f2fs/file > > > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > > > sync > > > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > > > > > Before the patch: > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > 0 . . . . . . . . > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > 0 3 1 . . . . . . > > > > > > > > > > After the patch: > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > 0 . . . . . . . . > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > 0 . . . . . . . . > > > > > > > > > > So, now max_small_discards can not be used to control small discard number > > > > > cached by checkpoint. > > > > > > Let me explain more: > > > > > > Previously, we have two mechanisms to cache & submit small discards: > > > > > > a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint > > > will cache small discard candidates w/ configured maximum number. > > > > > > b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard > > > candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM > > > interface is asynchronized, so it won't submit discard directly. > > > > > > Finally, discard thread will submit them in background periodically. > > > > > > So what I mean is the mechanism a) is broken, since no matter how we configure the > > > sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small > > > discard candidates any more. > > > > Ok, it seems what I encountered before was adding this small discard even > > after issuing it by checkpoint. Thoughts? > > Do you mean: in f2fs_clear_prefree_segments(), small discard may overlap > segment granularity discard? I didn't dig enough tho, don't think so. Somehow I got a loop as below which said the same LBA was issued and added back repeatedly, not seen this short log unfortunately. > > e.g. > - f2fs_clear_prefree_segments > - f2fs_issue_discard(0, 512) --- segment granularity discard > - f2fs_issue_discard(0, 1) --- small discard > - f2fs_issue_discard(5, 1) --- small discard > > Thanks, > > > > > node=0], cp [data=0, node=0, meta=0], app [read=0 (direct=0, buffered=0), mapped=0], compr(buffered=0, mapped=0)], fs [data=0, (gc_data=0, cdata=0), node=0, meta=0] > > f2fs_discard-25-752 [003] ..... 9744.173085: f2fs_iostat_latency: dev = (254,51), iotype [peak lat.(ms)/avg lat.(ms)/count], rd_data [0/0/0], rd_node [0/0/0], rd_meta [0/0/0], wr_sync_data [0/0/0], wr_sync_node [0/0/0], wr_sync_meta [0/0/0], wr_async_data [0/0/0], wr_async_node [0/0/0], wr_async_meta [0/0/0] > > f2fs_discard-25-752 [003] ..... 9744.173089: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173126: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173140: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173154: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173169: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 > > f2fs_discard-25-752 [003] ..... 9744.173183: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175272: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0a9, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175345: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0c6, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175352: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c116, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175354: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c121, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175360: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c15b, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175362: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c16e, blklen = 0x1 > > f2fs_discard-25-752 [004] ..... 9744.175367: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c181, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.228775: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.228950: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.228965: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.228994: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.229006: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.229017: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.229028: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.229041: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230811: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c197, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230926: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1ad, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230929: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b1, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230932: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c1b3, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230940: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c27a, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230945: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2a3, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230947: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c2ab, blklen = 0x1 > > f2fs_discard-25-752 [002] ..... 9744.230952: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c304, blklen = 0x1 > > > > > > > > So, it needs to fix max_small_discards sysfs functionality? or just drop the > > > functionality? > > > > > > > > > > > Since we do not submit small discards anymore during checkpoint. Why not relying > > > > on the discard thread to issue them? > > > > > > Sorry, I'm not sure I get your point, do you mean max_small_discards functionality > > > is obsoleted, so it recommended to use fstrim to cache & submit small discards? > > > > > > Let me know, if I'm missing something or misunderstanding the point. > > > > > > Thanks, > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > I think there is another way to achieve "avoid long checkpoint latency caused > > > > > > > by committing huge # of small discards", the way is we can set max_small_discards > > > > > > > to small value or zero, w/ such configuration, it will take checkpoint much less > > > > > > > time or no time to committing small discard due to below control logic: > > > > > > > > > > > > > > f2fs_flush_sit_entries() > > > > > > > { > > > > > > > ... > > > > > > > if (!(cpc->reason & CP_DISCARD)) { > > > > > > > cpc->trim_start = segno; > > > > > > > add_discard_addrs(sbi, cpc, false); > > > > > > > } > > > > > > > ... > > > > > > > } > > > > > > > > > > > > > > add_discard_addrs() > > > > > > > { > > > > > > > ... > > > > > > > while (force || SM_I(sbi)->dcc_info->nr_discards <= > > > > > > > SM_I(sbi)->dcc_info->max_discards) { > > > > > > > > > > > > > > It will break the loop once nr_discards is larger than max_discards, if > > > > > > > max_discards is set to zero, checkpoint won't take time to handle small discards. > > > > > > > > > > > > > > ... > > > > > > > if (!de) { > > > > > > > de = f2fs_kmem_cache_alloc(discard_entry_slab, > > > > > > > GFP_F2FS_ZERO, true, NULL); > > > > > > > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > > > > > > > list_add_tail(&de->list, head); > > > > > > > } > > > > > > > ... > > > > > > > } > > > > > > > ... > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > From: Chao Yu <chao@kernel.org> > > > > > > > > > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > > > > > > > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > > > > > > > > > > > > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > > > > > > > > > to disable small discard by default, so that if there're huge number of > > > > > > > > > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > > > > > > > > > > > > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > > > > > > > > > commands during checkpoint"), due to it breaks small discard feature which > > > > > > > > > > > may be configured via sysfs entry max_small_discards. > > > > > > > > > > > > > > > > > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > > > > > > > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > > > > > > > > > --- > > > > > > > > > > > fs/f2fs/segment.c | 4 ++-- > > > > > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > > > index 14c822e5c9c9..0a313368f18b 100644 > > > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > > > - !force || len < cpc->trim_minlen) > > > > > > > > > > > + (force && len < cpc->trim_minlen)) > > > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > > > > > > > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > > > > > > > > > atomic_set(&dcc->queued_discard, 0); > > > > > > > > > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > > > > > > > > > dcc->nr_discards = 0; > > > > > > > > > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > > > > > > > > > + dcc->max_discards = 0; > > > > > > > > > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > > > > > > > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > > > > > > > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > > > > > > > > > -- > > > > > > > > > > > 2.40.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/8/5 4:52, Jaegeuk Kim wrote: > On 07/25, Chao Yu wrote: >> On 2023/7/22 4:23, Jaegeuk Kim wrote: >>> On 07/13, Chao Yu wrote: >>>> On 2023/7/12 23:55, Jaegeuk Kim wrote: >>>>> On 07/12, Chao Yu wrote: >>>>>> On 2023/7/12 0:37, Jaegeuk Kim wrote: >>>>>>> On 07/06, Chao Yu wrote: >>>>>>>> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>>>>>>>> On 07/04, Chao Yu wrote: >>>>>>>>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>>>>>>>> On 07/03, Chao Yu wrote: >>>>>>>>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>>>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>>>>>>>> insanely. Let's issue small discards only by trim. >>>>>>>>>>>>> >>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>>>>>>>> --- >>>>>>>>>>>>> >>>>>>>>>>>>> Change log from v1: >>>>>>>>>>>>> - move the skip logic to avoid dangling objects >>>>>>>>>>>>> >>>>>>>>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>>>>>> >>>>>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>>>>>>>> + !force || len < cpc->trim_minlen) >>>>>>>>>>>>> goto skip; >>>>>>>>>>>> >>>>>>>>>>>> Sorry for late reply. >>>>>>>>>>>> >>>>>>>>>>>> We have a configuration for such case, what do you think of setting >>>>>>>>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>>>>>>>> logic may be broken? >>>>>>>>>>>> >>>>>>>>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>>>>>>>> Date: November 2013 >>>>>>>>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>>>>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>>>>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>>>>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>>>>>>>> By default, it is disabled with 0. >>>>>>>>>>>> >>>>>>>>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>>>>>>>> >>>>>>>>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>>>>>>>> >>>>>>>>>> I didn't get you, do you mean we can still issue small discard by >>>>>>>>>> fstrim, so small_discards functionality is fine? >>>>>>>>> >>>>>>>>> You got the point. >>>>>>>> >>>>>>>> Well, actually, what I mean is max_small_discards sysfs entry's functionality >>>>>>>> is broken. Now, the entry can not be used to control number of small discards >>>>>>>> committed by checkpoint. >>>>>>> >>>>>>> Could you descrbie this problem first? >>>>>> >>>>>> Oh, alright, actually, I've described this problem literally, but maybe it's not >>>>>> clear, let me give some examples as below: >>>>>> >>>>>> echo 0 > /sys/fs/f2fs/vdb/max_small_discards >>>>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>>>> sync >>>>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>>>> >>>>>> echo 100 > /sys/fs/f2fs/vdb/max_small_discards >>>>>> rm /mnt/f2fs/file >>>>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>>>> sync >>>>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>>>> >>>>>> Before the patch: >>>>>> >>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>> 0 . . . . . . . . >>>>>> >>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>> 0 3 1 . . . . . . >>>>>> >>>>>> After the patch: >>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>> 0 . . . . . . . . >>>>>> >>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>> 0 . . . . . . . . >>>>>> >>>>>> So, now max_small_discards can not be used to control small discard number >>>>>> cached by checkpoint. >>>> >>>> Let me explain more: >>>> >>>> Previously, we have two mechanisms to cache & submit small discards: >>>> >>>> a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint >>>> will cache small discard candidates w/ configured maximum number. >>>> >>>> b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard >>>> candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM >>>> interface is asynchronized, so it won't submit discard directly. >>>> >>>> Finally, discard thread will submit them in background periodically. >>>> >>>> So what I mean is the mechanism a) is broken, since no matter how we configure the >>>> sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small >>>> discard candidates any more. >>> >>> Ok, it seems what I encountered before was adding this small discard even >>> after issuing it by checkpoint. Thoughts? >> >> Do you mean: in f2fs_clear_prefree_segments(), small discard may overlap >> segment granularity discard? > > I didn't dig enough tho, don't think so. Somehow I got a loop as below which > said the same LBA was issued and added back repeatedly, not seen this short log > unfortunately. > >> >> e.g. >> - f2fs_clear_prefree_segments >> - f2fs_issue_discard(0, 512) --- segment granularity discard >> - f2fs_issue_discard(0, 1) --- small discard >> - f2fs_issue_discard(5, 1) --- small discard >> >> Thanks, [snip] >>> f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 >>> f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 I don't see any loop via above log, am I missing something? The traces were printed in below call paths, and it printed the same LBAs as expected? - issue_discard_thread - __issue_discard_cmd - __submit_discard_cmd - trace_f2fs_issue_discard 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 - __wait_all_discard_cmd - __wait_discard_cmd_range - __remove_discard_cmd - trace_f2fs_remove_discard 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 Thanks, >>> >>>> >>>> So, it needs to fix max_small_discards sysfs functionality? or just drop the >>>> functionality? >>>> >>>>> >>>>> Since we do not submit small discards anymore during checkpoint. Why not relying >>>>> on the discard thread to issue them? >>>> >>>> Sorry, I'm not sure I get your point, do you mean max_small_discards functionality >>>> is obsoleted, so it recommended to use fstrim to cache & submit small discards? >>>> >>>> Let me know, if I'm missing something or misunderstanding the point. >>>> >>>> Thanks, >>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> >>>>>>>> >>>>>>>> I think there is another way to achieve "avoid long checkpoint latency caused >>>>>>>> by committing huge # of small discards", the way is we can set max_small_discards >>>>>>>> to small value or zero, w/ such configuration, it will take checkpoint much less >>>>>>>> time or no time to committing small discard due to below control logic: >>>>>>>> >>>>>>>> f2fs_flush_sit_entries() >>>>>>>> { >>>>>>>> ... >>>>>>>> if (!(cpc->reason & CP_DISCARD)) { >>>>>>>> cpc->trim_start = segno; >>>>>>>> add_discard_addrs(sbi, cpc, false); >>>>>>>> } >>>>>>>> ... >>>>>>>> } >>>>>>>> >>>>>>>> add_discard_addrs() >>>>>>>> { >>>>>>>> ... >>>>>>>> while (force || SM_I(sbi)->dcc_info->nr_discards <= >>>>>>>> SM_I(sbi)->dcc_info->max_discards) { >>>>>>>> >>>>>>>> It will break the loop once nr_discards is larger than max_discards, if >>>>>>>> max_discards is set to zero, checkpoint won't take time to handle small discards. >>>>>>>> >>>>>>>> ... >>>>>>>> if (!de) { >>>>>>>> de = f2fs_kmem_cache_alloc(discard_entry_slab, >>>>>>>> GFP_F2FS_ZERO, true, NULL); >>>>>>>> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >>>>>>>> list_add_tail(&de->list, head); >>>>>>>> } >>>>>>>> ... >>>>>>>> } >>>>>>>> ... >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>>>>>>>> From: Chao Yu <chao@kernel.org> >>>>>>>>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>>>>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>>>>>>>> >>>>>>>>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>>>>>>>> to disable small discard by default, so that if there're huge number of >>>>>>>>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>>>>>>>> >>>>>>>>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>>>>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>>>>>>>> may be configured via sysfs entry max_small_discards. >>>>>>>>>>>> >>>>>>>>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>>>>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>>>>>>>> --- >>>>>>>>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>>>> >>>>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>>>> - !force || len < cpc->trim_minlen) >>>>>>>>>>>> + (force && len < cpc->trim_minlen)) >>>>>>>>>>>> goto skip; >>>>>>>>>>>> >>>>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>>>>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>>>>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>>>>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>>>>>>>> dcc->nr_discards = 0; >>>>>>>>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>>>>>>>> + dcc->max_discards = 0; >>>>>>>>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>>>>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>>>>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>>>>>>>> -- >>>>>>>>>>>> 2.40.1 >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 08/07, Chao Yu wrote: > On 2023/8/5 4:52, Jaegeuk Kim wrote: > > On 07/25, Chao Yu wrote: > > > On 2023/7/22 4:23, Jaegeuk Kim wrote: > > > > On 07/13, Chao Yu wrote: > > > > > On 2023/7/12 23:55, Jaegeuk Kim wrote: > > > > > > On 07/12, Chao Yu wrote: > > > > > > > On 2023/7/12 0:37, Jaegeuk Kim wrote: > > > > > > > > On 07/06, Chao Yu wrote: > > > > > > > > > On 2023/7/6 1:30, Jaegeuk Kim wrote: > > > > > > > > > > On 07/04, Chao Yu wrote: > > > > > > > > > > > On 2023/7/4 18:53, Jaegeuk Kim wrote: > > > > > > > > > > > > On 07/03, Chao Yu wrote: > > > > > > > > > > > > > On 2023/6/15 0:10, Jaegeuk Kim wrote: > > > > > > > > > > > > > > If there're huge # of small discards, this will increase checkpoint latency > > > > > > > > > > > > > > insanely. Let's issue small discards only by trim. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > > > > > > > > > > > > > > > Change log from v1: > > > > > > > > > > > > > > - move the skip logic to avoid dangling objects > > > > > > > > > > > > > > > > > > > > > > > > > > > > fs/f2fs/segment.c | 2 +- > > > > > > > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > > > > > > index 8c7af8b4fc47..0457d620011f 100644 > > > > > > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > > > > > > - (force && len < cpc->trim_minlen)) > > > > > > > > > > > > > > + !force || len < cpc->trim_minlen) > > > > > > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > > > > > > > > > Sorry for late reply. > > > > > > > > > > > > > > > > > > > > > > > > > > We have a configuration for such case, what do you think of setting > > > > > > > > > > > > > max_small_discards to zero? otherwise, w/ above change, max_small_discards > > > > > > > > > > > > > logic may be broken? > > > > > > > > > > > > > > > > > > > > > > > > > > What: /sys/fs/f2fs/<disk>/max_small_discards > > > > > > > > > > > > > Date: November 2013 > > > > > > > > > > > > > Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> > > > > > > > > > > > > > Description: Controls the issue rate of discard commands that consist of small > > > > > > > > > > > > > blocks less than 2MB. The candidates to be discarded are cached until > > > > > > > > > > > > > checkpoint is triggered, and issued during the checkpoint. > > > > > > > > > > > > > By default, it is disabled with 0. > > > > > > > > > > > > > > > > > > > > > > > > > > Or, if we prefer to disable small_discards by default, what about below change: > > > > > > > > > > > > > > > > > > > > > > > > I think small_discards is fine, but need to avoid long checkpoint latency only. > > > > > > > > > > > > > > > > > > > > > > I didn't get you, do you mean we can still issue small discard by > > > > > > > > > > > fstrim, so small_discards functionality is fine? > > > > > > > > > > > > > > > > > > > > You got the point. > > > > > > > > > > > > > > > > > > Well, actually, what I mean is max_small_discards sysfs entry's functionality > > > > > > > > > is broken. Now, the entry can not be used to control number of small discards > > > > > > > > > committed by checkpoint. > > > > > > > > > > > > > > > > Could you descrbie this problem first? > > > > > > > > > > > > > > Oh, alright, actually, I've described this problem literally, but maybe it's not > > > > > > > clear, let me give some examples as below: > > > > > > > > > > > > > > echo 0 > /sys/fs/f2fs/vdb/max_small_discards > > > > > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > > > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > > > > > sync > > > > > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > > > > > > > > > echo 100 > /sys/fs/f2fs/vdb/max_small_discards > > > > > > > rm /mnt/f2fs/file > > > > > > > xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" > > > > > > > xfs_io /mnt/f2fs/file -c "fpunch 0 4k" > > > > > > > sync > > > > > > > cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 > > > > > > > > > > > > > > Before the patch: > > > > > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > > > 0 . . . . . . . . > > > > > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > > > 0 3 1 . . . . . . > > > > > > > > > > > > > > After the patch: > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > > > 0 . . . . . . . . > > > > > > > > > > > > > > Discard pend list(Show diacrd_cmd count on each entry, .:not exist): > > > > > > > 0 . . . . . . . . > > > > > > > > > > > > > > So, now max_small_discards can not be used to control small discard number > > > > > > > cached by checkpoint. > > > > > > > > > > Let me explain more: > > > > > > > > > > Previously, we have two mechanisms to cache & submit small discards: > > > > > > > > > > a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint > > > > > will cache small discard candidates w/ configured maximum number. > > > > > > > > > > b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard > > > > > candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM > > > > > interface is asynchronized, so it won't submit discard directly. > > > > > > > > > > Finally, discard thread will submit them in background periodically. > > > > > > > > > > So what I mean is the mechanism a) is broken, since no matter how we configure the > > > > > sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small > > > > > discard candidates any more. > > > > > > > > Ok, it seems what I encountered before was adding this small discard even > > > > after issuing it by checkpoint. Thoughts? > > > > > > Do you mean: in f2fs_clear_prefree_segments(), small discard may overlap > > > segment granularity discard? > > > > I didn't dig enough tho, don't think so. Somehow I got a loop as below which > > said the same LBA was issued and added back repeatedly, not seen this short log > > unfortunately. > > > > > > > > e.g. > > > - f2fs_clear_prefree_segments > > > - f2fs_issue_discard(0, 512) --- segment granularity discard > > > - f2fs_issue_discard(0, 1) --- small discard > > > - f2fs_issue_discard(5, 1) --- small discard > > > > > > Thanks, > > [snip] > > > > > f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > > > > f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > > I don't see any loop via above log, am I missing something? > > The traces were printed in below call paths, and it printed the same LBAs as > expected? I said the log was cut to show the loop enoughly. I'll try to reproduce this later locally. > > - issue_discard_thread > - __issue_discard_cmd > - __submit_discard_cmd > - trace_f2fs_issue_discard > 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > - __wait_all_discard_cmd > - __wait_discard_cmd_range > - __remove_discard_cmd > - trace_f2fs_remove_discard > 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 > > Thanks, > > > > > > > > > > > > > > > So, it needs to fix max_small_discards sysfs functionality? or just drop the > > > > > functionality? > > > > > > > > > > > > > > > > > Since we do not submit small discards anymore during checkpoint. Why not relying > > > > > > on the discard thread to issue them? > > > > > > > > > > Sorry, I'm not sure I get your point, do you mean max_small_discards functionality > > > > > is obsoleted, so it recommended to use fstrim to cache & submit small discards? > > > > > > > > > > Let me know, if I'm missing something or misunderstanding the point. > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think there is another way to achieve "avoid long checkpoint latency caused > > > > > > > > > by committing huge # of small discards", the way is we can set max_small_discards > > > > > > > > > to small value or zero, w/ such configuration, it will take checkpoint much less > > > > > > > > > time or no time to committing small discard due to below control logic: > > > > > > > > > > > > > > > > > > f2fs_flush_sit_entries() > > > > > > > > > { > > > > > > > > > ... > > > > > > > > > if (!(cpc->reason & CP_DISCARD)) { > > > > > > > > > cpc->trim_start = segno; > > > > > > > > > add_discard_addrs(sbi, cpc, false); > > > > > > > > > } > > > > > > > > > ... > > > > > > > > > } > > > > > > > > > > > > > > > > > > add_discard_addrs() > > > > > > > > > { > > > > > > > > > ... > > > > > > > > > while (force || SM_I(sbi)->dcc_info->nr_discards <= > > > > > > > > > SM_I(sbi)->dcc_info->max_discards) { > > > > > > > > > > > > > > > > > > It will break the loop once nr_discards is larger than max_discards, if > > > > > > > > > max_discards is set to zero, checkpoint won't take time to handle small discards. > > > > > > > > > > > > > > > > > > ... > > > > > > > > > if (!de) { > > > > > > > > > de = f2fs_kmem_cache_alloc(discard_entry_slab, > > > > > > > > > GFP_F2FS_ZERO, true, NULL); > > > > > > > > > de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); > > > > > > > > > list_add_tail(&de->list, head); > > > > > > > > > } > > > > > > > > > ... > > > > > > > > > } > > > > > > > > > ... > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 > > > > > > > > > > > > > From: Chao Yu <chao@kernel.org> > > > > > > > > > > > > > Date: Mon, 3 Jul 2023 09:06:53 +0800 > > > > > > > > > > > > > Subject: [PATCH] Revert "f2fs: enable small discard by default" > > > > > > > > > > > > > > > > > > > > > > > > > > This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order > > > > > > > > > > > > > to disable small discard by default, so that if there're huge number of > > > > > > > > > > > > > small discards, it will decrease checkpoint's latency obviously. > > > > > > > > > > > > > > > > > > > > > > > > > > Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard > > > > > > > > > > > > > commands during checkpoint"), due to it breaks small discard feature which > > > > > > > > > > > > > may be configured via sysfs entry max_small_discards. > > > > > > > > > > > > > > > > > > > > > > > > > > Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") > > > > > > > > > > > > > Signed-off-by: Chao Yu <chao@kernel.org> > > > > > > > > > > > > > --- > > > > > > > > > > > > > fs/f2fs/segment.c | 4 ++-- > > > > > > > > > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c > > > > > > > > > > > > > index 14c822e5c9c9..0a313368f18b 100644 > > > > > > > > > > > > > --- a/fs/f2fs/segment.c > > > > > > > > > > > > > +++ b/fs/f2fs/segment.c > > > > > > > > > > > > > @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, > > > > > > > > > > > > > len = next_pos - cur_pos; > > > > > > > > > > > > > > > > > > > > > > > > > > if (f2fs_sb_has_blkzoned(sbi) || > > > > > > > > > > > > > - !force || len < cpc->trim_minlen) > > > > > > > > > > > > > + (force && len < cpc->trim_minlen)) > > > > > > > > > > > > > goto skip; > > > > > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, > > > > > > > > > > > > > @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) > > > > > > > > > > > > > atomic_set(&dcc->queued_discard, 0); > > > > > > > > > > > > > atomic_set(&dcc->discard_cmd_cnt, 0); > > > > > > > > > > > > > dcc->nr_discards = 0; > > > > > > > > > > > > > - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; > > > > > > > > > > > > > + dcc->max_discards = 0; > > > > > > > > > > > > > dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; > > > > > > > > > > > > > dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; > > > > > > > > > > > > > dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; > > > > > > > > > > > > > -- > > > > > > > > > > > > > 2.40.1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
On 2023/8/8 3:55, Jaegeuk Kim wrote: > On 08/07, Chao Yu wrote: >> On 2023/8/5 4:52, Jaegeuk Kim wrote: >>> On 07/25, Chao Yu wrote: >>>> On 2023/7/22 4:23, Jaegeuk Kim wrote: >>>>> On 07/13, Chao Yu wrote: >>>>>> On 2023/7/12 23:55, Jaegeuk Kim wrote: >>>>>>> On 07/12, Chao Yu wrote: >>>>>>>> On 2023/7/12 0:37, Jaegeuk Kim wrote: >>>>>>>>> On 07/06, Chao Yu wrote: >>>>>>>>>> On 2023/7/6 1:30, Jaegeuk Kim wrote: >>>>>>>>>>> On 07/04, Chao Yu wrote: >>>>>>>>>>>> On 2023/7/4 18:53, Jaegeuk Kim wrote: >>>>>>>>>>>>> On 07/03, Chao Yu wrote: >>>>>>>>>>>>>> On 2023/6/15 0:10, Jaegeuk Kim wrote: >>>>>>>>>>>>>>> If there're huge # of small discards, this will increase checkpoint latency >>>>>>>>>>>>>>> insanely. Let's issue small discards only by trim. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> >>>>>>>>>>>>>>> --- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Change log from v1: >>>>>>>>>>>>>>> - move the skip logic to avoid dangling objects >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> fs/f2fs/segment.c | 2 +- >>>>>>>>>>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>>>>>>> index 8c7af8b4fc47..0457d620011f 100644 >>>>>>>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>>>>>>> - (force && len < cpc->trim_minlen)) >>>>>>>>>>>>>>> + !force || len < cpc->trim_minlen) >>>>>>>>>>>>>>> goto skip; >>>>>>>>>>>>>> >>>>>>>>>>>>>> Sorry for late reply. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We have a configuration for such case, what do you think of setting >>>>>>>>>>>>>> max_small_discards to zero? otherwise, w/ above change, max_small_discards >>>>>>>>>>>>>> logic may be broken? >>>>>>>>>>>>>> >>>>>>>>>>>>>> What: /sys/fs/f2fs/<disk>/max_small_discards >>>>>>>>>>>>>> Date: November 2013 >>>>>>>>>>>>>> Contact: "Jaegeuk Kim" <jaegeuk.kim@samsung.com> >>>>>>>>>>>>>> Description: Controls the issue rate of discard commands that consist of small >>>>>>>>>>>>>> blocks less than 2MB. The candidates to be discarded are cached until >>>>>>>>>>>>>> checkpoint is triggered, and issued during the checkpoint. >>>>>>>>>>>>>> By default, it is disabled with 0. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Or, if we prefer to disable small_discards by default, what about below change: >>>>>>>>>>>>> >>>>>>>>>>>>> I think small_discards is fine, but need to avoid long checkpoint latency only. >>>>>>>>>>>> >>>>>>>>>>>> I didn't get you, do you mean we can still issue small discard by >>>>>>>>>>>> fstrim, so small_discards functionality is fine? >>>>>>>>>>> >>>>>>>>>>> You got the point. >>>>>>>>>> >>>>>>>>>> Well, actually, what I mean is max_small_discards sysfs entry's functionality >>>>>>>>>> is broken. Now, the entry can not be used to control number of small discards >>>>>>>>>> committed by checkpoint. >>>>>>>>> >>>>>>>>> Could you descrbie this problem first? >>>>>>>> >>>>>>>> Oh, alright, actually, I've described this problem literally, but maybe it's not >>>>>>>> clear, let me give some examples as below: >>>>>>>> >>>>>>>> echo 0 > /sys/fs/f2fs/vdb/max_small_discards >>>>>>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>>>>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>>>>>> sync >>>>>>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>>>>>> >>>>>>>> echo 100 > /sys/fs/f2fs/vdb/max_small_discards >>>>>>>> rm /mnt/f2fs/file >>>>>>>> xfs_io -f /mnt/f2fs/file -c "pwrite 0 2m" -c "fsync" >>>>>>>> xfs_io /mnt/f2fs/file -c "fpunch 0 4k" >>>>>>>> sync >>>>>>>> cat /proc/fs/f2fs/vdb/discard_plist_info |head -2 >>>>>>>> >>>>>>>> Before the patch: >>>>>>>> >>>>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>>>> 0 . . . . . . . . >>>>>>>> >>>>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>>>> 0 3 1 . . . . . . >>>>>>>> >>>>>>>> After the patch: >>>>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>>>> 0 . . . . . . . . >>>>>>>> >>>>>>>> Discard pend list(Show diacrd_cmd count on each entry, .:not exist): >>>>>>>> 0 . . . . . . . . >>>>>>>> >>>>>>>> So, now max_small_discards can not be used to control small discard number >>>>>>>> cached by checkpoint. >>>>>> >>>>>> Let me explain more: >>>>>> >>>>>> Previously, we have two mechanisms to cache & submit small discards: >>>>>> >>>>>> a) set max small discard number in /sys/fs/f2fs/vdb/max_small_discards, and checkpoint >>>>>> will cache small discard candidates w/ configured maximum number. >>>>>> >>>>>> b) call FITRIM ioctl, also, checkpoint in f2fs_trim_fs() will cache small discard >>>>>> candidates w/ configured discard granularity, but w/o limitation of number. FSTRIM >>>>>> interface is asynchronized, so it won't submit discard directly. >>>>>> >>>>>> Finally, discard thread will submit them in background periodically. >>>>>> >>>>>> So what I mean is the mechanism a) is broken, since no matter how we configure the >>>>>> sysfs entry /sys/fs/f2fs/vdb/max_small_discards, checkpoint will not cache small >>>>>> discard candidates any more. >>>>> >>>>> Ok, it seems what I encountered before was adding this small discard even >>>>> after issuing it by checkpoint. Thoughts? >>>> >>>> Do you mean: in f2fs_clear_prefree_segments(), small discard may overlap >>>> segment granularity discard? >>> >>> I didn't dig enough tho, don't think so. Somehow I got a loop as below which >>> said the same LBA was issued and added back repeatedly, not seen this short log >>> unfortunately. >>> >>>> >>>> e.g. >>>> - f2fs_clear_prefree_segments >>>> - f2fs_issue_discard(0, 512) --- segment granularity discard >>>> - f2fs_issue_discard(0, 1) --- small discard >>>> - f2fs_issue_discard(5, 1) --- small discard >>>> >>>> Thanks, >> >> [snip] >> >>>>> f2fs_discard-25-752 [003] ..... 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 >>>>> f2fs_discard-25-752 [004] ..... 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 >> >> I don't see any loop via above log, am I missing something? >> >> The traces were printed in below call paths, and it printed the same LBAs as >> expected? > > I said the log was cut to show the loop enoughly. I'll try to reproduce this Oh, I misunderstood it... > later locally. Alright. Thanks, > >> >> - issue_discard_thread >> - __issue_discard_cmd >> - __submit_discard_cmd >> - trace_f2fs_issue_discard >> 9744.173111: f2fs_issue_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 >> - __wait_all_discard_cmd >> - __wait_discard_cmd_range >> - __remove_discard_cmd >> - trace_f2fs_remove_discard >> 9744.175348: f2fs_remove_discard: dev = (254,51), blkstart = 0x18c0ca, blklen = 0x1 >> >> Thanks, >> >>>>> >>>>>> >>>>>> So, it needs to fix max_small_discards sysfs functionality? or just drop the >>>>>> functionality? >>>>>> >>>>>>> >>>>>>> Since we do not submit small discards anymore during checkpoint. Why not relying >>>>>>> on the discard thread to issue them? >>>>>> >>>>>> Sorry, I'm not sure I get your point, do you mean max_small_discards functionality >>>>>> is obsoleted, so it recommended to use fstrim to cache & submit small discards? >>>>>> >>>>>> Let me know, if I'm missing something or misunderstanding the point. >>>>>> >>>>>> Thanks, >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> I think there is another way to achieve "avoid long checkpoint latency caused >>>>>>>>>> by committing huge # of small discards", the way is we can set max_small_discards >>>>>>>>>> to small value or zero, w/ such configuration, it will take checkpoint much less >>>>>>>>>> time or no time to committing small discard due to below control logic: >>>>>>>>>> >>>>>>>>>> f2fs_flush_sit_entries() >>>>>>>>>> { >>>>>>>>>> ... >>>>>>>>>> if (!(cpc->reason & CP_DISCARD)) { >>>>>>>>>> cpc->trim_start = segno; >>>>>>>>>> add_discard_addrs(sbi, cpc, false); >>>>>>>>>> } >>>>>>>>>> ... >>>>>>>>>> } >>>>>>>>>> >>>>>>>>>> add_discard_addrs() >>>>>>>>>> { >>>>>>>>>> ... >>>>>>>>>> while (force || SM_I(sbi)->dcc_info->nr_discards <= >>>>>>>>>> SM_I(sbi)->dcc_info->max_discards) { >>>>>>>>>> >>>>>>>>>> It will break the loop once nr_discards is larger than max_discards, if >>>>>>>>>> max_discards is set to zero, checkpoint won't take time to handle small discards. >>>>>>>>>> >>>>>>>>>> ... >>>>>>>>>> if (!de) { >>>>>>>>>> de = f2fs_kmem_cache_alloc(discard_entry_slab, >>>>>>>>>> GFP_F2FS_ZERO, true, NULL); >>>>>>>>>> de->start_blkaddr = START_BLOCK(sbi, cpc->trim_start); >>>>>>>>>> list_add_tail(&de->list, head); >>>>>>>>>> } >>>>>>>>>> ... >>>>>>>>>> } >>>>>>>>>> ... >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> From eb89d9b56e817e3046d7fa17165b12416f09d456 Mon Sep 17 00:00:00 2001 >>>>>>>>>>>>>> From: Chao Yu <chao@kernel.org> >>>>>>>>>>>>>> Date: Mon, 3 Jul 2023 09:06:53 +0800 >>>>>>>>>>>>>> Subject: [PATCH] Revert "f2fs: enable small discard by default" >>>>>>>>>>>>>> >>>>>>>>>>>>>> This reverts commit d618ebaf0aa83d175658aea5291e0c459d471d39 in order >>>>>>>>>>>>>> to disable small discard by default, so that if there're huge number of >>>>>>>>>>>>>> small discards, it will decrease checkpoint's latency obviously. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also, this patch reverts 9ac00e7cef10 ("f2fs: do not issue small discard >>>>>>>>>>>>>> commands during checkpoint"), due to it breaks small discard feature which >>>>>>>>>>>>>> may be configured via sysfs entry max_small_discards. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Fixes: 9ac00e7cef10 ("f2fs: do not issue small discard commands during checkpoint") >>>>>>>>>>>>>> Signed-off-by: Chao Yu <chao@kernel.org> >>>>>>>>>>>>>> --- >>>>>>>>>>>>>> fs/f2fs/segment.c | 4 ++-- >>>>>>>>>>>>>> 1 file changed, 2 insertions(+), 2 deletions(-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c >>>>>>>>>>>>>> index 14c822e5c9c9..0a313368f18b 100644 >>>>>>>>>>>>>> --- a/fs/f2fs/segment.c >>>>>>>>>>>>>> +++ b/fs/f2fs/segment.c >>>>>>>>>>>>>> @@ -2193,7 +2193,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, >>>>>>>>>>>>>> len = next_pos - cur_pos; >>>>>>>>>>>>>> >>>>>>>>>>>>>> if (f2fs_sb_has_blkzoned(sbi) || >>>>>>>>>>>>>> - !force || len < cpc->trim_minlen) >>>>>>>>>>>>>> + (force && len < cpc->trim_minlen)) >>>>>>>>>>>>>> goto skip; >>>>>>>>>>>>>> >>>>>>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos, >>>>>>>>>>>>>> @@ -2269,7 +2269,7 @@ static int create_discard_cmd_control(struct f2fs_sb_info *sbi) >>>>>>>>>>>>>> atomic_set(&dcc->queued_discard, 0); >>>>>>>>>>>>>> atomic_set(&dcc->discard_cmd_cnt, 0); >>>>>>>>>>>>>> dcc->nr_discards = 0; >>>>>>>>>>>>>> - dcc->max_discards = MAIN_SEGS(sbi) << sbi->log_blocks_per_seg; >>>>>>>>>>>>>> + dcc->max_discards = 0; >>>>>>>>>>>>>> dcc->max_discard_request = DEF_MAX_DISCARD_REQUEST; >>>>>>>>>>>>>> dcc->min_discard_issue_time = DEF_MIN_DISCARD_ISSUE_TIME; >>>>>>>>>>>>>> dcc->mid_discard_issue_time = DEF_MID_DISCARD_ISSUE_TIME; >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> 2.40.1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c index 0c0c033c4bdd..ef46bb085385 100644 --- a/fs/f2fs/segment.c +++ b/fs/f2fs/segment.c @@ -2178,7 +2178,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, } mutex_unlock(&dirty_i->seglist_lock); - if (!f2fs_block_unit_discard(sbi)) + if (!f2fs_block_unit_discard(sbi) || !force) goto wakeup; /* send small discards */ @@ -2192,8 +2192,7 @@ void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi, sbi->blocks_per_seg, cur_pos); len = next_pos - cur_pos; - if (f2fs_sb_has_blkzoned(sbi) || - (force && len < cpc->trim_minlen)) + if (f2fs_sb_has_blkzoned(sbi) || len < cpc->trim_minlen) goto skip; f2fs_issue_discard(sbi, entry->start_blkaddr + cur_pos,