From patchwork Sat Jan 27 01:58:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zhang Yi X-Patchwork-Id: 192939 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:2395:b0:106:343:edcb with SMTP id gw21csp267026dyb; Fri, 26 Jan 2024 18:04:52 -0800 (PST) X-Google-Smtp-Source: AGHT+IFwVn0PDHkK0lvQgePv+U6IpUQiERJIDqyVVygUNdPJdpJ8kOxcJUvzDxFppttkGPzh7CWh X-Received: by 2002:a05:622a:1ce:b0:42a:518c:4e0d with SMTP id t14-20020a05622a01ce00b0042a518c4e0dmr924013qtw.16.1706321091928; Fri, 26 Jan 2024 18:04:51 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706321091; cv=pass; d=google.com; s=arc-20160816; b=NdarFVDuCO+QyiUbKooHj884G87j2viWzDOqRU7T90pUJtowvtmxBc1Cnsm5uFPlLs UHUTNW+4NjwhEEhZlktoWkpTOi1QJz7fNU3w25VUWz2AMAi8/eApSroTiKXHOKY0mcHO CxILPThqVzrexgxN/cD1NxC6O+40f9UGQgIuHTbIZ5YPGArHnrAUtDmjiG0e4UjieqO9 HuWvq3tpfYEb68kohbZ9JJT+ONCpVu5eGVHTjo8X/m5LHtV85ticOAXW3iQQY4GvFgzT FeXIfWSV3cjEw9cdd3F6BTEUk/wQ1uz7ck2h5e6z1X8VJWIafFLjboNf+M2Fjxdh+Hx9 8OAQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=LDz98XSxYiJRQC1cVb6S82GBYBYojLb8QpdBTLPL8OI=; fh=UhGiR19HeAIu8tzXykjZgtAKLLMPZj8YX6gPScUOsdE=; b=KDCzjXd4/0qAWrQxn6HkIxK5RZJJf13lea9GVgrupfGN8lH1KlyAa+isq6qdaBfQTD 83cb6pnFHgHMDf76T9a13ckusPYRvjQxpH9vhUfD4bcAOSDJW5b37ZuU/6bpSKKQkwK1 jLGBYiZFvF8upqr7pILpSDPywWraGdzivL3tDxAcwCXbFDI4TXpmCaK9d3Wz1qPH1JXg QNrqWQCNnZHQyvPCZXOnAJKa+JEMelCZA7HgG/113x28J6kZX0Upj0MHJvUR7PTBm6yF esnM0y24xt6AqaWaUMlCF8vsj9a8sVvH1BGwY9iGV114bAIzuGGGtI4OgaLHd/FNii6T /NWA== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-40982-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-40982-ouuuleilei=gmail.com@vger.kernel.org" Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id u42-20020a05622a19aa00b0042a87e8a5f3si1240098qtc.739.2024.01.26.18.04.51 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jan 2024 18:04:51 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-40982-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=huaweicloud.com); spf=pass (google.com: domain of linux-kernel+bounces-40982-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-40982-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 893A11C23012 for ; Sat, 27 Jan 2024 02:04:51 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 890E11DDD9; Sat, 27 Jan 2024 02:02:48 +0000 (UTC) Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC154B65D; Sat, 27 Jan 2024 02:02:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706320966; cv=none; b=O+Jvoj7od3m5Ov8nCRED761cIj6yLOZpRs/S8gol9/8vxOXl7zs7n7xebsYRgzntRTiRoU6vw6WFWXQ1Q6nDtvbxNkMz4iLb5GCPgibcwWRwdFv52CugWH0C/GKHKb8504xqMKh/aXAvfSYSTptUoXRRtcQFhaaREi+ktEeGFZg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706320966; c=relaxed/simple; bh=8NzAsROSLBAk7ya48ZTzZy30thoEIq5MDcHiLLmh0Yg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=RjhMNhevbCB4rwQqtCyzAszIdHR05fgWFMJkshPePUPn0hekhvvSdXwENQzFXUmCPk4MfPYg6maRccVtduSyioMC3wZnHjtYYsC1PYmU+paB/Q463r40tbW9SS5G+IAmU9Z4DQ0Irvb/z/TgPGWGSIx2XTWOBq3M7bOmjldOQ4w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TMHrV2Y0Nz4f3lfX; Sat, 27 Jan 2024 10:02:34 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id B0C4C1A0171; Sat, 27 Jan 2024 10:02:40 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgAX5g40ZLRlGJtmCA--.7377S6; Sat, 27 Jan 2024 10:02:40 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ritesh.list@gmail.com, hch@infradead.org, djwong@kernel.org, willy@infradead.org, zokeefe@google.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, chengzhihao1@huawei.com, yukuai3@huawei.com, wangkefeng.wang@huawei.com Subject: [PATCH v3 02/26] ext4: convert to exclusive lock while inserting delalloc extents Date: Sat, 27 Jan 2024 09:58:01 +0800 Message-Id: <20240127015825.1608160-3-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240127015825.1608160-1-yi.zhang@huaweicloud.com> References: <20240127015825.1608160-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-CM-TRANSID: cCh0CgAX5g40ZLRlGJtmCA--.7377S6 X-Coremail-Antispam: 1UD129KBjvJXoWxJw4DZFy8uF4kXr1fZry3XFb_yoW5Zr4kpr ZIkFyfCr1UW3ykuayIqr17XF1xG3WUJFW7GFZxGF4UZFyUAFnaqF1jyF1aqFyftrZ7AF4Y qFW0qry5uayUCrDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPj14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8Jw CI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUoSdgDUUU U X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789207345236511934 X-GMAIL-MSGID: 1789207345236511934 From: Zhang Yi ext4_da_map_blocks() only hold i_data_sem in shared mode and i_rwsem when inserting delalloc extents, it could be raced by another querying path of ext4_map_blocks() without i_rwsem, .e.g buffered read path. Suppose we buffered read a file containing just a hole, and without any cached extents tree, then it is raced by another delayed buffered write to the same area or the near area belongs to the same hole, and the new delalloc extent could be overwritten to a hole extent. pread() pwrite() filemap_read_folio() ext4_mpage_readpages() ext4_map_blocks() down_read(i_data_sem) ext4_ext_determine_hole() //find hole ext4_ext_put_gap_in_cache() ext4_es_find_extent_range() //no delalloc extent ext4_da_map_blocks() down_read(i_data_sem) ext4_insert_delayed_block() //insert delalloc extent ext4_es_insert_extent() //overwrite delalloc extent to hole This race could lead to inconsistent delalloc extents tree and incorrect reserved space counter. Fix this by converting to hold i_data_sem in exclusive mode when adding a new delalloc extent in ext4_da_map_blocks(). Cc: stable@vger.kernel.org Signed-off-by: Zhang Yi Suggested-by: Jan Kara Reviewed-by: Jan Kara --- fs/ext4/inode.c | 25 +++++++++++-------------- 1 file changed, 11 insertions(+), 14 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5b0d3075be12..142c67f5c7fc 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1703,10 +1703,8 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, /* Lookup extent status tree firstly */ if (ext4_es_lookup_extent(inode, iblock, NULL, &es)) { - if (ext4_es_is_hole(&es)) { - down_read(&EXT4_I(inode)->i_data_sem); + if (ext4_es_is_hole(&es)) goto add_delayed; - } /* * Delayed extent could be allocated by fallocate. @@ -1748,8 +1746,10 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, retval = ext4_ext_map_blocks(NULL, inode, map, 0); else retval = ext4_ind_map_blocks(NULL, inode, map, 0); - if (retval < 0) - goto out_unlock; + if (retval < 0) { + up_read(&EXT4_I(inode)->i_data_sem); + return retval; + } if (retval > 0) { unsigned int status; @@ -1765,24 +1765,21 @@ static int ext4_da_map_blocks(struct inode *inode, sector_t iblock, EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN; ext4_es_insert_extent(inode, map->m_lblk, map->m_len, map->m_pblk, status); - goto out_unlock; + up_read(&EXT4_I(inode)->i_data_sem); + return retval; } + up_read(&EXT4_I(inode)->i_data_sem); add_delayed: - /* - * XXX: __block_prepare_write() unmaps passed block, - * is it OK? - */ + down_write(&EXT4_I(inode)->i_data_sem); retval = ext4_insert_delayed_block(inode, map->m_lblk); + up_write(&EXT4_I(inode)->i_data_sem); if (retval) - goto out_unlock; + return retval; map_bh(bh, inode->i_sb, invalid_block); set_buffer_new(bh); set_buffer_delay(bh); - -out_unlock: - up_read((&EXT4_I(inode)->i_data_sem)); return retval; }