From patchwork Sat Jan 27 01:58:01 2024
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 192939
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [PATCH v3 02/26] ext4: convert to exclusive lock while inserting delalloc extents
Date: Sat, 27 Jan 2024 09:58:01 +0800
Message-Id: <20240127015825.1608160-3-yi.zhang@huaweicloud.com>
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References:
<20240127015825.1608160-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

ext4_da_map_blocks() only holds i_data_sem in shared mode and does not
hold i_rwsem when inserting delalloc extents, so it can race with
another querying path of ext4_map_blocks() that runs without i_rwsem,
e.g. the buffered read path.
Suppose a buffered read of a file range that contains just a hole, with
no cached extent tree, races with a delayed buffered write to the same
area (or a nearby area belonging to the same hole); the new delalloc
extent can then be overwritten by a hole extent.

 pread()                                  pwrite()
  filemap_read_folio()
   ext4_mpage_readpages()
    ext4_map_blocks()
     down_read(i_data_sem)
     ext4_ext_determine_hole()
     //find hole
     ext4_ext_put_gap_in_cache()
      ext4_es_find_extent_range()
      //no delalloc extent
                                           ext4_da_map_blocks()
                                            down_read(i_data_sem)
                                            ext4_insert_delayed_block()
                                            //insert delalloc extent
      ext4_es_insert_extent()
      //overwrite delalloc extent to hole

This race can leave the delalloc extents tree inconsistent and the
reserved space counter incorrect. Fix this by holding i_data_sem in
exclusive mode when adding a new delalloc extent in
ext4_da_map_blocks().

Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yi
Suggested-by: Jan Kara
Reviewed-by: Jan Kara
---
 fs/ext4/inode.c | 25 +++++++++++--------------
 1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5b0d3075be12..142c67f5c7fc 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1703,10 +1703,8 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,

 	/* Lookup extent status tree firstly */
 	if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) {
-		if (ext4_es_is_hole(&es)) {
-			down_read(&EXT4_I(inode)->i_data_sem);
+		if (ext4_es_is_hole(&es))
 			goto add_delayed;
-		}

 		/*
 		 * Delayed extent could be allocated by fallocate.
@@ -1748,8 +1746,10 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 		retval = ext4_ext_map_blocks(NULL, inode, map, 0);
 	else
 		retval = ext4_ind_map_blocks(NULL, inode, map, 0);
-	if (retval < 0)
-		goto out_unlock;
+	if (retval < 0) {
+		up_read(&EXT4_I(inode)->i_data_sem);
+		return retval;
+	}

 	if (retval > 0) {
 		unsigned int status;
@@ -1765,24 +1765,21 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 			EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
 		ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
 				      map->m_pblk, status);
-		goto out_unlock;
+		up_read(&EXT4_I(inode)->i_data_sem);
+		return retval;
 	}
+	up_read(&EXT4_I(inode)->i_data_sem);

 add_delayed:
-	/*
-	 * XXX: __block_prepare_write() unmaps passed block,
-	 * is it OK?
-	 */
+	down_write(&EXT4_I(inode)->i_data_sem);
 	retval = ext4_insert_delayed_block(inode, map->m_lblk);
+	up_write(&EXT4_I(inode)->i_data_sem);
 	if (retval)
-		goto out_unlock;
+		return retval;

 	map_bh(bh, inode->i_sb, invalid_block);
 	set_buffer_new(bh);
 	set_buffer_delay(bh);
-
-out_unlock:
-	up_read((&EXT4_I(inode)->i_data_sem));
 	return retval;
 }

From patchwork Sat Jan 27 01:58:02 2024
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 192941
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [PATCH v3 03/26] ext4: correct the hole length returned by ext4_map_blocks()
Date: Sat, 27 Jan 2024 09:58:02 +0800
Message-Id: <20240127015825.1608160-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
From: Zhang Yi

In ext4_map_blocks(), if we cannot find a mapping range in the extent
cache, we call ext4_ext_map_blocks() to search the real extent path and
ext4_ext_determine_hole() to determine the hole range. But if the
queried range is partially or completely overlapped by a delalloc
extent, it cannot be found in the real extent path, so the returned
hole length may be incorrect.
Fortunately, ext4_ext_put_gap_in_cache() already handles delalloc
extents, but it searches from the expanded hole_start rather than from
the queried range, so the delalloc extent it finds may not be the one
overlapping the queried range; in addition, it does not adjust the hole
length. Let's just remove ext4_ext_put_gap_in_cache(), and handle
delalloc and insert the adjusted hole extent in
ext4_ext_determine_hole().

Signed-off-by: Zhang Yi
Suggested-by: Jan Kara
Reviewed-by: Jan Kara
---
 fs/ext4/extents.c | 111 +++++++++++++++++++++++++++++-----------------
 1 file changed, 70 insertions(+), 41 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index d5efe076d3d3..e0b7e48c4c67 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2229,7 +2229,7 @@ static int ext4_fill_es_cache_info(struct inode *inode,

 /*
- * ext4_ext_determine_hole - determine hole around given block
+ * ext4_ext_find_hole - find hole around given block according to the given path
  * @inode: inode we lookup in
  * @path: path in extent tree to @lblk
  * @lblk: pointer to logical block around which we want to determine hole
@@ -2241,9 +2241,9 @@ static int ext4_fill_es_cache_info(struct inode *inode,
  * The function returns the length of a hole starting at @lblk. We update @lblk
  * to the beginning of the hole if we managed to find it.
 */
-static ext4_lblk_t ext4_ext_determine_hole(struct inode *inode,
-					   struct ext4_ext_path *path,
-					   ext4_lblk_t *lblk)
+static ext4_lblk_t ext4_ext_find_hole(struct inode *inode,
+				      struct ext4_ext_path *path,
+				      ext4_lblk_t *lblk)
 {
 	int depth = ext_depth(inode);
 	struct ext4_extent *ex;
@@ -2270,30 +2270,6 @@ static ext4_lblk_t ext4_ext_determine_hole(struct inode *inode,
 	return len;
 }

-/*
- * ext4_ext_put_gap_in_cache:
- * calculate boundaries of the gap that the requested block fits into
- * and cache this gap
- */
-static void
-ext4_ext_put_gap_in_cache(struct inode *inode, ext4_lblk_t hole_start,
-			  ext4_lblk_t hole_len)
-{
-	struct extent_status es;
-
-	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
-				  hole_start + hole_len - 1, &es);
-	if (es.es_len) {
-		/* There's delayed extent containing lblock? */
-		if (es.es_lblk <= hole_start)
-			return;
-		hole_len = min(es.es_lblk - hole_start, hole_len);
-	}
-	ext_debug(inode, " -> %u:%u\n", hole_start, hole_len);
-	ext4_es_insert_extent(inode, hole_start, hole_len, ~0,
-			      EXTENT_STATUS_HOLE);
-}
-
 /*
  * ext4_ext_rm_idx:
  * removes index from the index block.
@@ -4062,6 +4038,69 @@ static int get_implied_cluster_alloc(struct super_block *sb,
 	return 0;
 }

+/*
+ * Determine hole length around the given logical block, first try to
+ * locate and expand the hole from the given @path, and then adjust it
+ * if it's partially or completely converted to delayed extents, insert
+ * it into the extent cache tree if it's indeed a hole, finally return
+ * the length of the determined extent.
+ */
+static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
+						  struct ext4_ext_path *path,
+						  ext4_lblk_t lblk)
+{
+	ext4_lblk_t hole_start, len;
+	struct extent_status es;
+
+	hole_start = lblk;
+	len = ext4_ext_find_hole(inode, path, &hole_start);
+again:
+	ext4_es_find_extent_range(inode, &ext4_es_is_delayed, hole_start,
+				  hole_start + len - 1, &es);
+	if (!es.es_len)
+		goto insert_hole;
+
+	/*
+	 * There's a delalloc extent in the hole, handle it if the delalloc
+	 * extent is in front of, behind and straddle the queried range.
+	 */
+	if (lblk >= es.es_lblk + es.es_len) {
+		/*
+		 * The delalloc extent is in front of the queried range,
+		 * find again from the queried start block.
+		 */
+		len -= lblk - hole_start;
+		hole_start = lblk;
+		goto again;
+	} else if (in_range(lblk, es.es_lblk, es.es_len)) {
+		/*
+		 * The delalloc extent containing lblk, it must have been
+		 * added after ext4_map_blocks() checked the extent status
+		 * tree, adjust the length to the delalloc extent's after
+		 * lblk.
+		 */
+		len = es.es_lblk + es.es_len - lblk;
+		return len;
+	} else {
+		/*
+		 * The delalloc extent is partially or completely behind
+		 * the queried range, update hole length until the
+		 * beginning of the delalloc extent.
+		 */
+		len = min(es.es_lblk - hole_start, len);
+	}
+
+insert_hole:
+	/* Put just found gap into cache to speed up subsequent requests */
+	ext_debug(inode, " -> %u:%u\n", hole_start, len);
+	ext4_es_insert_extent(inode, hole_start, len, ~0, EXTENT_STATUS_HOLE);
+
+	/* Update hole_len to reflect hole size after lblk */
+	if (hole_start != lblk)
+		len -= lblk - hole_start;
+
+	return len;
+}

 /*
  * Block allocation/map/preallocation routine for extents based files
@@ -4179,22 +4218,12 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 	 * we couldn't try to create block if create flag is zero
 	 */
 	if ((flags & EXT4_GET_BLOCKS_CREATE) == 0) {
-		ext4_lblk_t hole_start, hole_len;
+		ext4_lblk_t len;

-		hole_start = map->m_lblk;
-		hole_len = ext4_ext_determine_hole(inode, path, &hole_start);
-		/*
-		 * put just found gap into cache to speed up
-		 * subsequent requests
-		 */
-		ext4_ext_put_gap_in_cache(inode, hole_start, hole_len);
+		len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk);

-		/* Update hole_len to reflect hole size after map->m_lblk */
-		if (hole_start != map->m_lblk)
-			hole_len -= map->m_lblk - hole_start;
 		map->m_pblk = 0;
-		map->m_len = min_t(unsigned int, map->m_len, hole_len);
-
+		map->m_len = min_t(unsigned int, map->m_len, len);
 		goto out;
 	}

From patchwork Sat Jan 27 01:58:04 2024
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 192943
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [PATCH v3 05/26] ext4: make ext4_map_blocks() distinguish delalloc only extent
Date: Sat, 27 Jan 2024 09:58:04 +0800
Message-Id: <20240127015825.1608160-6-yi.zhang@huaweicloud.com>
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
From: Zhang Yi

Add a new map flag, EXT4_MAP_DELAYED, to indicate that the mapped range
is delayed allocated only (not unwritten), and make ext4_map_blocks()
able to distinguish it, no longer mixing it with holes.
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/ext4.h    | 4 +++-
 fs/ext4/extents.c | 7 +++++--
 fs/ext4/inode.c   | 2 ++
 3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index a5d784872303..55195909d32f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -252,8 +252,10 @@ struct ext4_allocation_request {
 #define EXT4_MAP_MAPPED		BIT(BH_Mapped)
 #define EXT4_MAP_UNWRITTEN	BIT(BH_Unwritten)
 #define EXT4_MAP_BOUNDARY	BIT(BH_Boundary)
+#define EXT4_MAP_DELAYED	BIT(BH_Delay)
 #define EXT4_MAP_FLAGS		(EXT4_MAP_NEW | EXT4_MAP_MAPPED |\
-				 EXT4_MAP_UNWRITTEN | EXT4_MAP_BOUNDARY)
+				 EXT4_MAP_UNWRITTEN | EXT4_MAP_BOUNDARY |\
+				 EXT4_MAP_DELAYED)

 struct ext4_map_blocks {
 	ext4_fsblk_t m_pblk;
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index e0b7e48c4c67..6b64319a7df8 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4076,8 +4076,11 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode,
 		/*
 		 * The delalloc extent containing lblk, it must have been
 		 * added after ext4_map_blocks() checked the extent status
-		 * tree, adjust the length to the delalloc extent's after
-		 * lblk.
+		 * tree so we are not holding i_rwsem and delalloc info is
+		 * only stabilized by i_data_sem we are going to release
+		 * soon. Don't modify the extent status tree and report
+		 * extent as a hole, just adjust the length to the delalloc
+		 * extent's after lblk.
 		 */
 		len = es.es_lblk + es.es_len - lblk;
 		return len;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 1b5e6409f958..c141bf6d8db2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -515,6 +515,8 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 			map->m_len = retval;
 		} else if (ext4_es_is_delayed(&es) || ext4_es_is_hole(&es)) {
 			map->m_pblk = 0;
+			map->m_flags |= ext4_es_is_delayed(&es) ?
+					EXT4_MAP_DELAYED : 0;
 			retval = es.es_len - (map->m_lblk - es.es_lblk);
 			if (retval > map->m_len)
 				retval = map->m_len;
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [RFC PATCH v3 08/26] iomap: add pos and dirty_len into trace_iomap_writepage_map
Date: Sat, 27 Jan 2024 09:58:07 +0800
Message-Id: <20240127015825.1608160-9-yi.zhang@huaweicloud.com>
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

Since commit "iomap: map multiple blocks at a time", we can map multiple blocks at once, and dirty_len indicates the expected map length; map_len won't be larger than it.
pos and dirty_len describe the dirty range that should be mapped for writeback; adding them to trace_iomap_writepage_map() makes it more useful for debugging.

Signed-off-by: Zhang Yi
Reviewed-by: Christoph Hellwig
---
 fs/iomap/buffered-io.c |  2 +-
 fs/iomap/trace.h       | 43 +++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 2ae936e5af74..9a9f1bfe80b4 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1806,7 +1806,7 @@ static int iomap_writepage_map_blocks(struct iomap_writepage_ctx *wpc,
 		error = wpc->ops->map_blocks(wpc, inode, pos, dirty_len);
 		if (error)
 			break;
-		trace_iomap_writepage_map(inode, &wpc->iomap);
+		trace_iomap_writepage_map(inode, pos, dirty_len, &wpc->iomap);
 
 		map_len = min_t(u64, dirty_len,
 				wpc->iomap.offset + wpc->iomap.length - pos);

diff --git a/fs/iomap/trace.h b/fs/iomap/trace.h
index c16fd55f5595..3ef694f9489f 100644
--- a/fs/iomap/trace.h
+++ b/fs/iomap/trace.h
@@ -154,7 +154,48 @@ DEFINE_EVENT(iomap_class, name,	\
 	TP_ARGS(inode, iomap))
 DEFINE_IOMAP_EVENT(iomap_iter_dstmap);
 DEFINE_IOMAP_EVENT(iomap_iter_srcmap);
-DEFINE_IOMAP_EVENT(iomap_writepage_map);
+
+TRACE_EVENT(iomap_writepage_map,
+	TP_PROTO(struct inode *inode, u64 pos, unsigned int dirty_len,
+		 struct iomap *iomap),
+	TP_ARGS(inode, pos, dirty_len, iomap),
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(u64, ino)
+		__field(u64, pos)
+		__field(u64, dirty_len)
+		__field(u64, addr)
+		__field(loff_t, offset)
+		__field(u64, length)
+		__field(u16, type)
+		__field(u16, flags)
+		__field(dev_t, bdev)
+	),
+	TP_fast_assign(
+		__entry->dev = inode->i_sb->s_dev;
+		__entry->ino = inode->i_ino;
+		__entry->pos = pos;
+		__entry->dirty_len = dirty_len;
+		__entry->addr = iomap->addr;
+		__entry->offset = iomap->offset;
+		__entry->length = iomap->length;
+		__entry->type = iomap->type;
+		__entry->flags = iomap->flags;
+		__entry->bdev = iomap->bdev ? iomap->bdev->bd_dev : 0;
+	),
+	TP_printk("dev %d:%d ino 0x%llx bdev %d:%d pos 0x%llx dirty len 0x%llx "
+		  "addr 0x%llx offset 0x%llx length 0x%llx type %s flags %s",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  __entry->ino,
+		  MAJOR(__entry->bdev), MINOR(__entry->bdev),
+		  __entry->pos,
+		  __entry->dirty_len,
+		  __entry->addr,
+		  __entry->offset,
+		  __entry->length,
+		  __print_symbolic(__entry->type, IOMAP_TYPE_STRINGS),
+		  __print_flags(__entry->flags, "|", IOMAP_F_FLAGS_STRINGS))
+);
 
 TRACE_EVENT(iomap_iter,
 	TP_PROTO(struct iomap_iter *iter, const void *ops,
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [RFC PATCH v3 09/26] ext4: allow inserting delalloc extents with multi-blocks
Date: Sat, 27 Jan 2024 09:58:08 +0800
Message-Id: <20240127015825.1608160-10-yi.zhang@huaweicloud.com>
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
From: Zhang Yi

Introduce a new helper, ext4_insert_delayed_blocks(), to replace ext4_insert_delayed_block() so that we can add multiple delayed blocks to the extent status tree at once. For now, it does not support the bigalloc feature yet. Also rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent(), which matches the naming style of the other ext4_es_{insert|remove}_extent() functions.

Signed-off-by: Zhang Yi
---
 fs/ext4/extents_status.c    | 26 ++++++++++++++-----------
 fs/ext4/extents_status.h    |  4 ++--
 fs/ext4/inode.c             | 39 ++++++++++++++++++++++---------------
 include/trace/events/ext4.h | 12 +++++++-----
 4 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 4a00e2f019d9..324a6b0a6283 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -2052,19 +2052,21 @@ bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk)
 }
 
 /*
- * ext4_es_insert_delayed_block - adds a delayed block to the extents status
- *                                tree, adding a pending reservation where
- *                                needed
+ * ext4_es_insert_delayed_extent - adds delayed blocks to the extents status
+ *                                 tree, adding a pending reservation where
+ *                                 needed
 *
 * @inode - file containing the newly added block
- * @lblk - logical block to be added
+ * @lblk - first logical block to be added
+ * @len - length of blocks to be added
 * @allocated - indicates whether a physical cluster has been allocated for
 *              the logical cluster that contains the block
 */
-void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
-				  bool allocated)
+void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
+				   unsigned int len, bool allocated)
 {
 	struct extent_status newes;
+	ext4_lblk_t end = lblk + len - 1;
 	int err1 = 0, err2 = 0, err3 = 0;
 	struct extent_status *es1 = NULL;
 	struct extent_status *es2 = NULL;
@@ -2073,13 +2075,15 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
 
-	es_debug("add [%u/1) delayed to extent status tree of inode %lu\n",
-		 lblk, inode->i_ino);
+	es_debug("add [%u/%u) delayed to extent status tree of inode %lu\n",
+		 lblk, len, inode->i_ino);
+	if (!len)
+		return;
 
 	newes.es_lblk = lblk;
-	newes.es_len = 1;
+	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, ~0, EXTENT_STATUS_DELAYED);
-	trace_ext4_es_insert_delayed_block(inode, &newes, allocated);
+	trace_ext4_es_insert_delayed_extent(inode, &newes, allocated);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -2092,7 +2096,7 @@ void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
 	pr = __alloc_pending(true);
 	write_lock(&EXT4_I(inode)->i_es_lock);
 
-	err1 = __es_remove_extent(inode, lblk, lblk, NULL, es1);
+	err1 = __es_remove_extent(inode, lblk, end, NULL, es1);
 	if (err1 != 0)
 		goto error;
 	/* Free preallocated extent if it didn't get used. */

diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index d9847a4a25db..24493e682ab4 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -249,8 +249,8 @@ extern void ext4_exit_pending(void);
 extern void ext4_init_pending_tree(struct ext4_pending_tree *tree);
 extern void ext4_remove_pending(struct inode *inode, ext4_lblk_t lblk);
 extern bool ext4_is_pending(struct inode *inode, ext4_lblk_t lblk);
-extern void ext4_es_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk,
-					 bool allocated);
+extern void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
+					  unsigned int len, bool allocated);
 extern unsigned int ext4_es_delayed_clu(struct inode *inode, ext4_lblk_t lblk,
 					ext4_lblk_t len);
 extern void ext4_clear_inode_es(struct inode *inode);

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0458d7f0c059..bc29c2e92750 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1452,7 +1452,7 @@ static int ext4_journalled_write_end(struct file *file,
 /*
  * Reserve space for a single cluster
  */
-static int ext4_da_reserve_space(struct inode *inode)
+static int ext4_da_reserve_space(struct inode *inode, unsigned int len)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	struct ext4_inode_info *ei = EXT4_I(inode);
@@ -1463,18 +1463,18 @@ static int ext4_da_reserve_space(struct inode *inode)
 	 * us from metadata over-estimation, though we may go over by
 	 * a small amount in the end.  Here we just reserve for data.
 	 */
-	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, 1));
+	ret = dquot_reserve_block(inode, EXT4_C2B(sbi, len));
 	if (ret)
 		return ret;
 
 	spin_lock(&ei->i_block_reservation_lock);
-	if (ext4_claim_free_clusters(sbi, 1, 0)) {
+	if (ext4_claim_free_clusters(sbi, len, 0)) {
 		spin_unlock(&ei->i_block_reservation_lock);
-		dquot_release_reservation_block(inode, EXT4_C2B(sbi, 1));
+		dquot_release_reservation_block(inode, EXT4_C2B(sbi, len));
 		return -ENOSPC;
 	}
-	ei->i_reserved_data_blocks++;
-	trace_ext4_da_reserve_space(inode);
+	ei->i_reserved_data_blocks += len;
+	trace_ext4_da_reserve_space(inode, len);
 	spin_unlock(&ei->i_block_reservation_lock);
 
 	return 0;       /* success */
@@ -1620,18 +1620,21 @@ static void ext4_print_free_blocks(struct inode *inode)
 	return;
 }
 
+
 /*
- * ext4_insert_delayed_block - adds a delayed block to the extents status
- *                             tree, incrementing the reserved cluster/block
- *                             count or making a pending reservation
- *                             where needed
+ * ext4_insert_delayed_blocks - adds multi-delayed blocks to the extents
+ *                              status tree, incrementing the reserved
+ *                              cluster/block count or making a pending
+ *                              reservation where needed.
 *
 * @inode - file containing the newly added block
- * @lblk - logical block to be added
+ * @lblk - start logical block to be added
+ * @len - length of blocks to be added
 *
 * Returns 0 on success, negative error code on failure.
 */
-static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
+static int ext4_insert_delayed_blocks(struct inode *inode, ext4_lblk_t lblk,
+				      ext4_lblk_t len)
 {
 	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 	int ret;
@@ -1649,10 +1652,14 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 	 * extents status tree doesn't get a match.
 	 */
 	if (sbi->s_cluster_ratio == 1) {
-		ret = ext4_da_reserve_space(inode);
+		ret = ext4_da_reserve_space(inode, len);
 		if (ret != 0)   /* ENOSPC */
 			return ret;
 	} else {   /* bigalloc */
+		/* TODO: support bigalloc for multi-blocks. */
+		if (len != 1)
+			return -EOPNOTSUPP;
+
 		if (!ext4_es_scan_clu(inode, &ext4_es_is_delonly, lblk)) {
 			if (!ext4_es_scan_clu(inode,
 					      &ext4_es_is_mapped, lblk)) {
@@ -1661,7 +1668,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 				if (ret < 0)
 					return ret;
 				if (ret == 0) {
-					ret = ext4_da_reserve_space(inode);
+					ret = ext4_da_reserve_space(inode, 1);
 					if (ret != 0)   /* ENOSPC */
 						return ret;
 				} else {
@@ -1673,7 +1680,7 @@ static int ext4_insert_delayed_block(struct inode *inode, ext4_lblk_t lblk)
 		}
 	}
 
-	ext4_es_insert_delayed_block(inode, lblk, allocated);
+	ext4_es_insert_delayed_extent(inode, lblk, len, allocated);
 	return 0;
 }
 
@@ -1774,7 +1781,7 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock,
 
 add_delayed:
 	down_write(&EXT4_I(inode)->i_data_sem);
-	retval = ext4_insert_delayed_block(inode, map->m_lblk);
+	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
 	up_write(&EXT4_I(inode)->i_data_sem);
 	if (retval)
 		return retval;

diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 65029dfb92fb..53aa7a7fb3be 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -1249,14 +1249,15 @@ TRACE_EVENT(ext4_da_update_reserve_space,
 );
 
 TRACE_EVENT(ext4_da_reserve_space,
-	TP_PROTO(struct inode *inode),
+	TP_PROTO(struct inode *inode, int reserved_blocks),
 
-	TP_ARGS(inode),
+	TP_ARGS(inode, reserved_blocks),
 
 	TP_STRUCT__entry(
 		__field(	dev_t,	dev			)
 		__field(	ino_t,	ino			)
 		__field(	__u64,	i_blocks		)
+		__field(	int,	reserved_blocks		)
 		__field(	int,	reserved_data_blocks	)
 		__field(	__u16,	mode			)
 	),
@@ -1265,16 +1266,17 @@ TRACE_EVENT(ext4_da_reserve_space,
 		__entry->dev = inode->i_sb->s_dev;
 		__entry->ino = inode->i_ino;
 		__entry->i_blocks = inode->i_blocks;
+		__entry->reserved_blocks = reserved_blocks;
 		__entry->reserved_data_blocks = EXT4_I(inode)->i_reserved_data_blocks;
 		__entry->mode = inode->i_mode;
 	),
 
-	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu "
+	TP_printk("dev %d,%d ino %lu mode 0%o i_blocks %llu reserved_blocks %u "
 		  "reserved_data_blocks %d",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->mode, __entry->i_blocks,
-		  __entry->reserved_data_blocks)
+		  __entry->reserved_blocks, __entry->reserved_data_blocks)
 );
 
 TRACE_EVENT(ext4_da_release_space,
@@ -2481,7 +2483,7 @@ TRACE_EVENT(ext4_es_shrink,
 		  __entry->scan_time, __entry->nr_skipped, __entry->retried)
 );
 
-TRACE_EVENT(ext4_es_insert_delayed_block,
+TRACE_EVENT(ext4_es_insert_delayed_extent,
 
 	TP_PROTO(struct inode *inode, struct extent_status *es,
 		 bool allocated),
dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8D8FA2110F; Sat, 27 Jan 2024 02:02:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706320972; cv=none; b=gIfB3no91v22l9h5xIRvrLeq39tvj/M/jqG2hfzf80OA7hie6SuDpZalK0NUxpzKHmSyef2+Mtuzwrc7A9R38HHdhPcu/y8bP0HK9O0OaW4DNLTJc6wLqswM+iCeMYqezSjhvXd6pqAAczZaKOjEmVoZtceUJcbZPT7mNxoakT8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706320972; c=relaxed/simple; bh=Djzt3gG9VGFpPyJJtELMB+1UlRsfxWfHCTLPbGz+oKk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=P8xGD4hZnNGfQRewzHOV1c3uqPwqheOjWuBIddA3IlwKMJ/0qywMqZj3nXajyL2hPFWbJ4IyojsabLLfM4n53xn4UwsiU3VNAEDYDEjMhUXkr6ORpLqHEMvzjEsMPixr3htmqEHokibmnrjuK0OR1w/09L5H+Ov1jmLnsCVEc+s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TMHrj47R7z4f3k6D; Sat, 27 Jan 2024 10:02:45 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 01B511A016E; Sat, 27 Jan 2024 10:02:48 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAX5g40ZLRlGJtmCA--.7377S17; Sat, 27 Jan 2024 10:02:47 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, 
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
 adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com,
 hch@infradead.org, djwong@kernel.org, willy@infradead.org,
 zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
 chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [RFC PATCH v3 13/26] ext4: use reserved metadata blocks when
 splitting extent in endio
Date: Sat, 27 Jan 2024 09:58:12 +0800
Message-Id: <20240127015825.1608160-14-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
From: Zhang Yi

ext4_da_reserve_space() currently reserves delalloc space only for data
blocks, not for the metadata blocks needed when those data blocks are
allocated. Likewise, with the dioread_nolock mount option enabled, no
metadata blocks are reserved for the unwritten-to-written extent
conversion. To prevent data loss when metadata allocation fails during
writeback, we reserve 2% of the space or 4096 blocks for metadata, and
use EXT4_GET_BLOCKS_PRE_IO to perform the potential split in advance.
But both methods are best-effort only: if space truly runs out, there is
no difference between splitting the extent at writeback time and at I/O
completion time; either way data is lost. The proper fix is to reserve
enough space for metadata up front. Until then, we can at least make
sure that things do not get worse if we postpone splitting the extent to
endio. Moreover, once the regular file buffered write path is converted
from buffer_head to iomap, splitting extents in endio becomes mandatory.
In preparation for that conversion, use the reserved space in endio too.
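The 2%-or-4096-block reservation heuristic described above can be sketched in isolation (a userspace illustration under my reading of the commit message; `resv_meta_blocks` is an invented name, not the ext4 helper):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: keep back 2% of the filesystem blocks for metadata,
 * capped at 4096 blocks, mirroring the heuristic described above. */
uint64_t resv_meta_blocks(uint64_t total_blocks)
{
    uint64_t two_percent = total_blocks / 50;   /* 2% of all blocks */

    return two_percent < 4096 ? two_percent : 4096;
}
```

On a small filesystem the 2% figure dominates; on anything large the cap keeps the reservation bounded.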
Signed-off-by: Zhang Yi
---
 fs/ext4/extents.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 6b64319a7df8..48d6d125ec37 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3722,7 +3722,8 @@ static int ext4_convert_unwritten_extents_endio(handle_t *handle,
 			     (unsigned long long)map->m_lblk, map->m_len);
 #endif
 	err = ext4_split_convert_extents(handle, inode, map, ppath,
-					 EXT4_GET_BLOCKS_CONVERT);
+					 EXT4_GET_BLOCKS_CONVERT |
+					 EXT4_GET_BLOCKS_METADATA_NOFAIL);
 	if (err < 0)
 		return err;
 	path = ext4_find_extent(inode, map->m_lblk, ppath, 0);

From patchwork Sat Jan 27 01:58:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhang Yi
X-Patchwork-Id: 192947
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
 adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com,
 hch@infradead.org, djwong@kernel.org, willy@infradead.org,
 zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
 chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com
Subject: [RFC PATCH v3 19/26] ext4: implement writeback iomap path
Date: Sat, 27 Jan 2024 09:58:18 +0800
Message-Id: <20240127015825.1608160-20-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
From: Zhang Yi

Implement the buffered writeback iomap path: the map_blocks() and
prepare_ioend() callbacks in iomap_writeback_ops and the corresponding
end-I/O processing. Add ext4_iomap_map_blocks() to query the mapping
status, start a journal handle, and allocate new blocks if none are
allocated yet. Add ext4_iomap_prepare_ioend() to register the end-I/O
handler that converts unwritten extents to mapped extents.

To reduce the number of mapping calls, we map an entire delayed extent
at a time. This means we may map more blocks than expected, so the
length has to be calculated carefully against wbc->range_end. We also
cannot avoid splitting extents in endio; we allow that because endio
already uses reserved blocks, so it carries no greater risk of data
loss than splitting before submitting the I/O.

Since we always allocate unwritten extents and keep them until the data
has been written to disk, the data never has to be written back before
the metadata is committed and there is no risk of exposing stale data,
which makes the data=ordered journal mode unnecessary. We no longer need
to attach data to the jinode, and the journal thread no longer has to
write any dependent data. At the same time, we need not reserve journal
credits for the extent status conversion, and we can postpone the
i_disksize update to endio.
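Because completed ioends are sorted and physically contiguous neighbors are merged before conversion, the number of unwritten-extent conversions in the completion path described above depends only on the gaps between completed ranges. A hedged userspace sketch of that bookkeeping (all names are invented for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model: given sorted, completed I/O ranges, contiguous
 * ranges collapse into a single conversion, analogous to iomap's ioend
 * merging before the endio extent conversion described above. */
size_t count_merged_conversions(const uint64_t *off, const uint64_t *len,
                                size_t n)
{
    if (n == 0)
        return 0;

    size_t conversions = 1;
    uint64_t end = off[0] + len[0];

    for (size_t i = 1; i < n; i++) {
        if (off[i] != end)      /* gap: a new, separate conversion */
            conversions++;
        end = off[i] + len[i];
    }
    return conversions;
}
```

Two back-to-back 4 KiB writes cost one conversion; a hole between them costs two.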
Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h      |   4 +
 fs/ext4/ext4_jbd2.c |   6 ++
 fs/ext4/extents.c   |  23 +++---
 fs/ext4/inode.c     | 190 +++++++++++++++++++++++++++++++++++++++++++-
 fs/ext4/page-io.c   | 107 +++++++++++++++++++++++++
 fs/ext4/super.c     |   2 +
 6 files changed, 322 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 03cdcf3d86a5..eaf29bade606 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1147,6 +1147,8 @@ struct ext4_inode_info {
 	 */
 	struct list_head i_rsv_conversion_list;
 	struct work_struct i_rsv_conversion_work;
+	struct list_head i_iomap_ioend_list;
+	struct work_struct i_iomap_ioend_work;
 	atomic_t i_unwritten; /* Nr. of inflight conversions pending */
 
 	spinlock_t i_block_reservation_lock;
@@ -3755,6 +3757,8 @@ int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *page,
 		size_t len);
 extern struct ext4_io_end_vec *ext4_alloc_io_end_vec(ext4_io_end_t *io_end);
 extern struct ext4_io_end_vec *ext4_last_io_end_vec(ext4_io_end_t *io_end);
+extern void ext4_iomap_end_io(struct work_struct *work);
+extern void ext4_iomap_end_bio(struct bio *bio);
 
 /* mmp.c */
 extern int ext4_multi_mount_protect(struct super_block *, ext4_fsblk_t);
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index d1a2e6624401..94c8073b49e7 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -11,6 +11,12 @@ int ext4_inode_journal_mode(struct inode *inode)
 {
 	if (EXT4_JOURNAL(inode) == NULL)
 		return EXT4_INODE_WRITEBACK_DATA_MODE;	/* writeback */
+	/*
+	 * Ordered mode is no longer needed for inodes that use the
+	 * iomap path; always use writeback mode.
+	 */
+	if (ext4_test_inode_state(inode, EXT4_STATE_BUFFERED_IOMAP))
+		return EXT4_INODE_WRITEBACK_DATA_MODE;	/* writeback */
 	/* We do not support data journalling with delayed allocation */
 	if (!S_ISREG(inode->i_mode) ||
 	    ext4_test_inode_flag(inode, EXT4_INODE_EA_INODE) ||
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 48d6d125ec37..46805b8e7bdc 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3708,18 +3708,23 @@ static int ext4_convert_unwritten_extents_endio(handle_t *handle,
 	ext_debug(inode, "logical block %llu, max_blocks %u\n",
 		  (unsigned long long)ee_block, ee_len);
 
-	/* If extent is larger than requested it is a clear sign that we still
-	 * have some extent state machine issues left. So extent_split is still
-	 * required.
-	 * TODO: Once all related issues will be fixed this situation should be
-	 * illegal.
+	/*
+	 * Inodes that use the buffered iomap path need to split extents
+	 * in endio; other inodes do not.
+	 *
+	 * TODO: Reserve enough space for splitting extents, always split
+	 * extents here, and remove this warning entirely.
 	 */
 	if (ee_block != map->m_lblk || ee_len > map->m_len) {
 #ifdef CONFIG_EXT4_DEBUG
-		ext4_warning(inode->i_sb, "Inode (%ld) finished: extent logical block %llu,"
-			     " len %u; IO logical block %llu, len %u",
-			     inode->i_ino, (unsigned long long)ee_block, ee_len,
-			     (unsigned long long)map->m_lblk, map->m_len);
+		if (!ext4_test_inode_state(inode, EXT4_STATE_BUFFERED_IOMAP)) {
+			ext4_warning(inode->i_sb,
+				"Inode (%ld) finished: extent logical block %llu, "
+				"len %u; IO logical block %llu, len %u",
+				inode->i_ino, (unsigned long long)ee_block,
+				ee_len, (unsigned long long)map->m_lblk,
+				map->m_len);
+		}
 #endif
 	err = ext4_split_convert_extents(handle, inode, map, ppath,
 					 EXT4_GET_BLOCKS_CONVERT |
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c48aca637896..8ee0eca6a1e8 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -43,6 +43,7 @@
 #include 
 
 #include "ext4_jbd2.h"
+#include "ext4_extents.h"
 #include "xattr.h"
 #include "acl.h"
 #include "truncate.h"
@@ -3705,10 +3706,197 @@ static void ext4_iomap_readahead(struct readahead_control *rac)
 	iomap_readahead(rac, &ext4_iomap_buffered_read_ops);
 }
 
+struct ext4_writeback_ctx {
+	struct iomap_writepage_ctx ctx;
+	struct writeback_control *wbc;
+	unsigned int data_seq;
+};
+
+static int ext4_iomap_map_one_extent(struct inode *inode,
+				     struct ext4_map_blocks *map)
+{
+	struct extent_status es;
+	handle_t *handle = NULL;
+	int credits, map_flags;
+	int retval;
+
+	credits = ext4_da_writepages_trans_blocks(inode);
+	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, credits);
+	if (IS_ERR(handle))
+		return PTR_ERR(handle);
+
+	map->m_flags = 0;
+	/*
+	 * To protect against races with truncate, we have to look up the
+	 * extent status and map blocks under i_data_sem; otherwise the
+	 * delalloc extent could be stale.
+	 */
+	down_write(&EXT4_I(inode)->i_data_sem);
+	if (likely(ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es))) {
+		retval = es.es_len - (map->m_lblk - es.es_lblk);
+		map->m_len = min_t(unsigned int, retval, map->m_len);
+
+		if (likely(ext4_es_is_delonly(&es))) {
+			trace_ext4_da_write_pages_extent(inode, map);
+			/*
+			 * Call ext4_map_create_blocks() to allocate any delayed
+			 * allocation blocks. We may need more metadata blocks,
+			 * but we must not fail: we are in writeback, there is
+			 * nothing we could do about a failure, and it might
+			 * result in data loss. So use reserved blocks to
+			 * allocate metadata if possible.
+			 *
+			 * We pass in the magic EXT4_GET_BLOCKS_DELALLOC_RESERVE,
+			 * which indicates that the blocks and quota have already
+			 * been checked when the data was copied into the page
+			 * cache.
+			 */
+			map_flags = EXT4_GET_BLOCKS_CREATE_UNWRIT_EXT |
+				    EXT4_GET_BLOCKS_METADATA_NOFAIL |
+				    EXT4_GET_BLOCKS_DELALLOC_RESERVE;
+
+			retval = ext4_map_create_blocks(handle, inode, map,
+							map_flags);
+		}
+		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
+			map->m_pblk = ext4_es_pblock(&es) + map->m_lblk -
+				      es.es_lblk;
+			map->m_flags = ext4_es_is_written(&es) ?
+				       EXT4_MAP_MAPPED : EXT4_MAP_UNWRITTEN;
+		}
+	} else {
+		retval = ext4_map_query_blocks(handle, inode, map, 0);
+	}
+
+	up_write(&EXT4_I(inode)->i_data_sem);
+	ext4_journal_stop(handle);
+	return retval < 0 ? retval : 0;
+}
+
+static int ext4_iomap_map_blocks(struct iomap_writepage_ctx *wpc,
+				 struct inode *inode, loff_t offset,
+				 unsigned int dirty_len)
+{
+	struct ext4_writeback_ctx *ewpc =
+			container_of(wpc, struct ext4_writeback_ctx, ctx);
+	struct super_block *sb = inode->i_sb;
+	struct journal_s *journal = EXT4_SB(sb)->s_journal;
+	struct ext4_inode_info *ei = EXT4_I(inode);
+	struct ext4_map_blocks map;
+	unsigned int blkbits = inode->i_blkbits;
+	unsigned int index = offset >> blkbits;
+	unsigned int end, len;
+	int ret;
+
+	if (unlikely(ext4_forced_shutdown(inode->i_sb)))
+		return -EIO;
+
+	/* Check validity of the cached writeback mapping. */
+	if (offset >= wpc->iomap.offset &&
+	    offset < wpc->iomap.offset + wpc->iomap.length &&
+	    ewpc->data_seq == READ_ONCE(ei->i_es_seq))
+		return 0;
+
+	end = min_t(unsigned int,
+		    (ewpc->wbc->range_end >> blkbits), (UINT_MAX - 1));
+	len = (end > index + dirty_len) ? end - index + 1 : dirty_len;
+
+retry:
+	map.m_lblk = index;
+	map.m_len = min_t(unsigned int, EXT_UNWRITTEN_MAX_LEN, len);
+	ret = ext4_map_blocks(NULL, inode, &map, 0);
+	if (ret < 0)
+		return ret;
+
+	/*
+	 * If the map isn't a delalloc extent, it must be a hole or
+	 * already allocated.
+	 */
+	if (!(map.m_flags & EXT4_MAP_DELAYED))
+		goto out;
+
+	/* Map one delalloc extent. */
+	ret = ext4_iomap_map_one_extent(inode, &map);
+	if (ret < 0) {
+		if (ext4_forced_shutdown(sb))
+			return ret;
+
+		/*
+		 * Retry transient ENOSPC errors: if
+		 * ext4_count_free_clusters() is non-zero, a commit
+		 * should free up blocks.
+		 */
+		if (ret == -ENOSPC && ext4_count_free_clusters(sb)) {
+			jbd2_journal_force_commit_nested(journal);
+			goto retry;
+		}
+
+		ext4_msg(sb, KERN_CRIT,
+			 "Delayed block allocation failed for "
+			 "inode %lu at logical offset %llu with "
+			 "max blocks %u with error %d",
+			 inode->i_ino, (unsigned long long)map.m_lblk,
+			 (unsigned int)map.m_len, -ret);
+		ext4_msg(sb, KERN_CRIT,
+			 "This should not happen!! Data will "
+			 "be lost\n");
+		if (ret == -ENOSPC)
+			ext4_print_free_blocks(inode);
+		return ret;
+	}
+out:
+	ewpc->data_seq = READ_ONCE(ei->i_es_seq);
+	ext4_set_iomap(inode, &wpc->iomap, &map, offset,
+		       map.m_len << blkbits, 0);
+	return 0;
+}
+
+static int ext4_iomap_prepare_ioend(struct iomap_ioend *ioend, int status)
+{
+	struct ext4_inode_info *ei = EXT4_I(ioend->io_inode);
+
+	/* Need to convert unwritten extents when I/Os are completed. */
+	if (ioend->io_type == IOMAP_UNWRITTEN ||
+	    ioend->io_offset + ioend->io_size > READ_ONCE(ei->i_disksize))
+		ioend->io_bio.bi_end_io = ext4_iomap_end_bio;
+
+	return status;
+}
+
+static void ext4_iomap_discard_folio(struct folio *folio, loff_t pos)
+{
+	struct inode *inode = folio->mapping->host;
+
+	ext4_iomap_punch_delalloc(inode, pos,
+			folio_pos(folio) + folio_size(folio) - pos);
+}
+
+static const struct iomap_writeback_ops ext4_writeback_ops = {
+	.map_blocks = ext4_iomap_map_blocks,
+	.prepare_ioend = ext4_iomap_prepare_ioend,
+	.discard_folio = ext4_iomap_discard_folio,
+};
+
 static int ext4_iomap_writepages(struct address_space *mapping,
 				 struct writeback_control *wbc)
 {
-	return 0;
+	struct inode *inode = mapping->host;
+	struct super_block *sb = inode->i_sb;
+	long nr = wbc->nr_to_write;
+	int alloc_ctx, ret;
+	struct ext4_writeback_ctx ewpc = {
+		.wbc = wbc,
+	};
+
+	if (unlikely(ext4_forced_shutdown(sb)))
+		return -EIO;
+
+	alloc_ctx = ext4_writepages_down_read(sb);
+	trace_ext4_writepages(inode, wbc);
+	ret = iomap_writepages(mapping, wbc, &ewpc.ctx, &ext4_writeback_ops);
+	trace_ext4_writepages_result(inode, wbc, ret, nr - wbc->nr_to_write);
+	ext4_writepages_up_read(sb, alloc_ctx);
+
+	return ret;
 }
 
 /*
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index dfdd7e5cf038..a499208500e4 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -565,3 +566,109 @@ int ext4_bio_write_folio(struct ext4_io_submit *io, struct folio *folio,
 
 	return 0;
 }
+
+static void ext4_iomap_finish_ioend(struct iomap_ioend *ioend)
+{
+	struct inode *inode = ioend->io_inode;
+	struct ext4_inode_info *ei = EXT4_I(inode);
+	loff_t pos = ioend->io_offset;
+	size_t size = ioend->io_size;
+	loff_t new_disksize;
+	handle_t *handle;
+	int credits;
+	int ret, err;
+
+	ret = blk_status_to_errno(ioend->io_bio.bi_status);
+	if (unlikely(ret))
+		goto out;
+
+	/*
+	 * We may need to convert up to one extent per block in
+	 * the page and we may dirty the inode.
+	 */
+	credits = ext4_chunk_trans_blocks(inode,
+			EXT4_MAX_BLOCKS(size, pos, inode->i_blkbits));
+	handle = ext4_journal_start(inode, EXT4_HT_EXT_CONVERT, credits);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out_err;
+	}
+
+	if (ioend->io_type == IOMAP_UNWRITTEN) {
+		ret = ext4_convert_unwritten_extents(handle, inode, pos, size);
+		if (ret)
+			goto out_journal;
+	}
+
+	/*
+	 * Update on-disk size after IO is completed. Races with
+	 * truncate are avoided by checking i_size under i_data_sem.
+	 */
+	new_disksize = pos + size;
+	if (new_disksize > READ_ONCE(ei->i_disksize)) {
+		down_write(&ei->i_data_sem);
+		new_disksize = min(new_disksize, i_size_read(inode));
+		if (new_disksize > ei->i_disksize)
+			ei->i_disksize = new_disksize;
+		up_write(&ei->i_data_sem);
+		ret = ext4_mark_inode_dirty(handle, inode);
+		if (ret)
+			EXT4_ERROR_INODE_ERR(inode, -ret,
+					     "Failed to mark inode dirty");
+	}
+
+out_journal:
+	err = ext4_journal_stop(handle);
+	if (!ret)
+		ret = err;
+out_err:
+	if (ret < 0 && !ext4_forced_shutdown(inode->i_sb)) {
+		ext4_msg(inode->i_sb, KERN_EMERG,
+			 "failed to convert unwritten extents to "
+			 "written extents or update inode size -- "
+			 "potential data loss! (inode %lu, error %d)",
+			 inode->i_ino, ret);
+	}
+out:
+	iomap_finish_ioends(ioend, ret);
+}
+
+/*
+ * Work to process completed buffered iomap IO, converting unwritten
+ * extents to mapped extents.
+ */
+void ext4_iomap_end_io(struct work_struct *work)
+{
+	struct ext4_inode_info *ei = container_of(work, struct ext4_inode_info,
+						  i_iomap_ioend_work);
+	struct iomap_ioend *ioend;
+	struct list_head ioend_list;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+	list_replace_init(&ei->i_iomap_ioend_list, &ioend_list);
+	spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+
+	iomap_sort_ioends(&ioend_list);
+	while (!list_empty(&ioend_list)) {
+		ioend = list_entry(ioend_list.next, struct iomap_ioend, io_list);
+		list_del_init(&ioend->io_list);
+		iomap_ioend_try_merge(ioend, &ioend_list);
+		ext4_iomap_finish_ioend(ioend);
+	}
+}
+
+void ext4_iomap_end_bio(struct bio *bio)
+{
+	struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
+	struct ext4_inode_info *ei = EXT4_I(ioend->io_inode);
+	struct ext4_sb_info *sbi = EXT4_SB(ioend->io_inode->i_sb);
+	unsigned long flags;
+
+	/* Only reserved conversions from writeback should enter here */
+	spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+	if (list_empty(&ei->i_iomap_ioend_list))
+		queue_work(sbi->rsv_conversion_wq, &ei->i_iomap_ioend_work);
+	list_add_tail(&ioend->io_list, &ei->i_iomap_ioend_list);
+	spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d4afaae8ab38..811a46f83bb9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1431,11 +1431,13 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 #endif
 	ei->jinode = NULL;
 	INIT_LIST_HEAD(&ei->i_rsv_conversion_list);
+	INIT_LIST_HEAD(&ei->i_iomap_ioend_list);
 	spin_lock_init(&ei->i_completed_io_lock);
 	ei->i_sync_tid = 0;
 	ei->i_datasync_tid = 0;
 	atomic_set(&ei->i_unwritten, 0);
 	INIT_WORK(&ei->i_rsv_conversion_work, ext4_end_io_rsv_work);
+	INIT_WORK(&ei->i_iomap_ioend_work, ext4_iomap_end_io);
 	ext4_fc_init_inode(&ei->vfs_inode);
 	mutex_init(&ei->i_fc_lock);
 	return &ei->vfs_inode;
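The i_disksize update rule used in ext4_iomap_finish_ioend() above (the on-disk size only ever advances, and never past the in-core i_size, to avoid racing with truncate) can be checked in isolation. A userspace sketch with invented names, not the kernel helper itself:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the on-disk size update after I/O completion: the on-disk
 * size may only grow, and is clamped to the in-core i_size so a
 * concurrent truncate cannot be undone. Invented names. */
uint64_t update_disksize(uint64_t disksize, uint64_t isize,
                         uint64_t pos, uint64_t size)
{
    uint64_t new_disksize = pos + size;

    if (new_disksize > isize)
        new_disksize = isize;   /* clamp to in-core i_size */
    return new_disksize > disksize ? new_disksize : disksize;
}
```

A completion that lands entirely below the current on-disk size is a no-op; one that extends past i_size is clamped back to it.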