From patchwork Thu Feb 8 16:58:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Luis Henriques
X-Patchwork-Id: 198485
From: Luis Henriques
To: "Theodore Y.
 Ts'o", Andreas Dilger
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, Luis Henriques, Daniel Dawson
Subject: [RFC PATCH] ext4: destroy inline data immediately when converting to extent
Date: Thu, 8 Feb 2024 16:58:07 +0000
Message-ID: <20240208165808.5494-1-lhenriques@suse.de>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

When writing to an inode that has inline data and the amount of data written
exceeds the maximum inline data length, that data is un-inlined, i.e. it is
converted into an extent. However, when delayed allocation is enabled the
destruction of the data is postponed until the data writeback. This causes
consistency problems. Here's a very simple test case, run on a filesystem
with delayed allocation and inline data features enabled:

  $ dd if=/dev/zero of=test-file bs=64 count=3 status=none
  $ lsattr test-file
  ------------------N--- test-file

The 'lsattr' command shows that the file has data stored inline.
However, that is not correct because writing 192 bytes (3 * 64) has forced
the data to be un-inlined. Doing a 'sync' before running 'lsattr' fixes it.
Note that this bug doesn't happen if the filesystem is mounted using the
'nodelalloc' option.

(There's a similar test case using read() instead in the bugzilla entry
linked below.)

This patch fixes the issue in the delayed allocation path by destroying the
inline data immediately after converting it to an extent instead of delaying
that operation until the writeback. This is done by invoking function
ext4_destroy_inline_data_nolock(), which cleans up the remaining data
structures, including clearing the INLINE_DATA inode flag and setting the
EXTENTS inode flag.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200681
Cc: Daniel Dawson
Signed-off-by: Luis Henriques
---
Hi!

I'm sending this patch as an RFC because, although I've done a good amount of
testing, I'm still not convinced it is correct. I.e. there may exist a good
reason for postponing the call to ext4_destroy_inline_data_nolock() that I'm
failing to see. Please let me know what you think.

 fs/ext4/inline.c | 20 ++++++++++----------
 fs/ext4/inode.c  | 18 +-----------------
 2 files changed, 11 insertions(+), 27 deletions(-)

diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index d5bd1e3a5d36..e19a176cfc93 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -830,11 +830,12 @@ int ext4_write_inline_data_end(struct inode *inode, loff_t pos, unsigned len,
  * update and dirty so that ext4_da_writepages can handle it. We don't
  * need to start the journal since the file's metadata isn't changed now.
  */
-static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
+static int ext4_da_convert_inline_data_to_extent(handle_t *handle,
+						 struct address_space *mapping,
 						 struct inode *inode,
 						 void **fsdata)
 {
-	int ret = 0, inline_size;
+	int ret = 0, inline_size, no_expand;
 	struct folio *folio;
 
 	folio = __filemap_get_folio(mapping, 0, FGP_WRITEBEGIN,
@@ -842,7 +843,7 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 	if (IS_ERR(folio))
 		return PTR_ERR(folio);
 
-	down_read(&EXT4_I(inode)->xattr_sem);
+	ext4_write_lock_xattr(inode, &no_expand);
 	if (!ext4_has_inline_data(inode)) {
 		ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
 		goto out;
@@ -859,20 +860,18 @@ static int ext4_da_convert_inline_data_to_extent(struct address_space *mapping,
 	ret = __block_write_begin(&folio->page, 0, inline_size,
 				  ext4_da_get_block_prep);
 	if (ret) {
-		up_read(&EXT4_I(inode)->xattr_sem);
+		ext4_write_unlock_xattr(inode, &no_expand);
 		folio_unlock(folio);
 		folio_put(folio);
 		ext4_truncate_failed_write(inode);
 		return ret;
 	}
 
-	folio_mark_dirty(folio);
-	folio_mark_uptodate(folio);
-	ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
+	ret = ext4_destroy_inline_data_nolock(handle, inode);
 	*fsdata = (void *)CONVERT_INLINE_DATA;
 
 out:
-	up_read(&EXT4_I(inode)->xattr_sem);
+	ext4_write_unlock_xattr(inode, &no_expand);
 	if (folio) {
 		folio_unlock(folio);
 		folio_put(folio);
@@ -916,10 +915,11 @@ int ext4_da_write_inline_data_begin(struct address_space *mapping,
 		goto out_journal;
 
 	if (ret == -ENOSPC) {
-		ext4_journal_stop(handle);
-		ret = ext4_da_convert_inline_data_to_extent(mapping,
+		ret = ext4_da_convert_inline_data_to_extent(handle,
+							    mapping,
 							    inode,
 							    fsdata);
+		ext4_journal_stop(handle);
 		if (ret == -ENOSPC &&
 		    ext4_should_retry_alloc(inode->i_sb, &retries))
 			goto retry_journal;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2ccf3b5e3a7c..43fa930fafa0 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2548,23 +2548,7 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
 		goto out_writepages;
 	}
 
-	/*
-	 * If we have inline data and arrive here, it means that
-	 * we will soon create the block for the 1st page, so
-	 * we'd better clear the inline data here.
-	 */
-	if (ext4_has_inline_data(inode)) {
-		/* Just inode will be modified... */
-		handle = ext4_journal_start(inode, EXT4_HT_INODE, 1);
-		if (IS_ERR(handle)) {
-			ret = PTR_ERR(handle);
-			goto out_writepages;
-		}
-		BUG_ON(ext4_test_inode_state(inode,
-		       EXT4_STATE_MAY_INLINE_DATA));
-		ext4_destroy_inline_data(handle, inode);
-		ext4_journal_stop(handle);
-	}
+	WARN_ON_ONCE(ext4_has_inline_data(inode));
 
 	/*
 	 * data=journal mode does not do delalloc so we just need to writeout /