Message ID | 20231128053202.29007-3-zhangjiachen.jaycee@bytedance.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:ce62:0:b0:403:3b70:6f57 with SMTP id o2csp3698461vqx; Mon, 27 Nov 2023 21:35:22 -0800 (PST) X-Google-Smtp-Source: AGHT+IGItASGC+5wE59ugbJkf6GZ2WiV9FexM2E+VyIwAwK13196kGbH18+NOY8PyVmDBk4gfcMF X-Received: by 2002:a17:90a:190e:b0:285:8cb1:7f53 with SMTP id 14-20020a17090a190e00b002858cb17f53mr13082422pjg.31.1701149722460; Mon, 27 Nov 2023 21:35:22 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701149722; cv=none; d=google.com; s=arc-20160816; b=XASub3caZTeju86/hO4AJzqGZ8ufDrw0h5w9quqTjn6FakulKc3oVwoiMDxHz+1Ya+ q9gyGBKoc+s34PmG16IbJwUHLD4xP3FA53Ww0SgD5Ko6Uw0sptswdd0et5sI5X0GlTun Shof2Zf+BSJwRy9zTijABWAXiLklFgux3FC1Wp/J57diNZjZY+QhiXrPFa/RPuHnIvH+ bnkuufmreT5RBDqqmSyTPm0twzifd/MX3TW3+YhOfz+g0jiqY8iPDsmDUezBoG5fnhc5 r8RGDNrHlLCN8250ASZqtQWVLvp4Ojbo3M7asmrIorxeeEDaBQ1r28aMswnGLyiQbLtB /i7g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=PRqsnYQ6nV0dzX6Vx06lKgWxOV6XUUCzggjpXFerXKM=; fh=/A4uQK4zZ4z4hFyiYOoCGRCW2xUQ8LWPawF3BhAqto0=; b=v898r4Jmk73Uea+Vv0CfmgwoA8NQ6KmfWfV6gWIPTAXGumj4wNc8HxBGlaE4viAuTX sCd/r8poSuYHx5ZjlEUb4/3BUZVYiqcrPJhAvwYdlQzg5HF86S5h+eZJG2/MKHt2+JwA 5lJThEClqSDScvZwa5xi1Torf4dfzNbPJOlm+4dzxOFC3jMx/f8gL0mCK50x0gMpNE3K G9ez2ULL6wAQwHCT0+QROMji2uM9f6a1zpptSCWLi8wbb9tOOojWKPZUPiXx+CzQvMxi rb16KD7ebAVHqk4pt9dZvyQoPNkjfgI07oPXf9yd4UtR/llZMC524T9rsHeTXDfPB/Uu VUHQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=MFB9k4mW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from fry.vger.email (fry.vger.email. [23.128.96.38]) by mx.google.com with ESMTPS id ha21-20020a17090af3d500b0028595b2d53bsi8247887pjb.90.2023.11.27.21.35.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Nov 2023 21:35:22 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) client-ip=23.128.96.38; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=MFB9k4mW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.38 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by fry.vger.email (Postfix) with ESMTP id D815A8078601; Mon, 27 Nov 2023 21:35:19 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at fry.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234627AbjK1Fe6 (ORCPT <rfc822;toshivichauhan@gmail.com> + 99 others); Tue, 28 Nov 2023 00:34:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46626 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343552AbjK1Fex (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 28 Nov 2023 00:34:53 -0500 Received: from mail-pl1-x629.google.com (mail-pl1-x629.google.com [IPv6:2607:f8b0:4864:20::629]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C293DE for <linux-kernel@vger.kernel.org>; Mon, 27 Nov 2023 21:34:38 -0800 (PST) Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1cf7a8ab047so38875575ad.1 for <linux-kernel@vger.kernel.org>; Mon, 27 Nov 2023 21:34:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1701149678; x=1701754478; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PRqsnYQ6nV0dzX6Vx06lKgWxOV6XUUCzggjpXFerXKM=; b=MFB9k4mWSQv3i3ltZ0GgzWfKQZHjbG9ZrBpeRijLSZKtffqDCMVd6PdK7C+H1xnYy+ mnYSd+ZaOyp6QARthaZGS0BkKDx3o7r6Tfbeimc2eFcwfZuPHI6d0GrY+diF9bNpbpMU 2/pGMPwuo939+BVPZmXf/hs9BjQTTxIPrdz+nQYcaahJNps7l2ZyjW3utV063M70zPlG J/1s5dUkecy9SzsvacUlkrHghiH+Mk7gO5NEqSISQYl5czkOfdtuHs3ln1n3GTm83ZlH QljufujAVvTWjlgoIa1njlTrdgi9XVT3Y65bvOqHKMWkXznfk452ZVzRozr3AOeBJR49 MqAw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701149678; x=1701754478; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PRqsnYQ6nV0dzX6Vx06lKgWxOV6XUUCzggjpXFerXKM=; b=NM8EQFNMao2frWPa5sxF0cKGEXjdTYoqYVkSaVqUeW9AnyRjneR+XKc0I26SEBCEk2 pbSgiBB9R/5O/RFND5IWbAE0a4eKTHLFl7VhPmDlMgg85+l/Q5ELVcYDZLXw/Z6jH5x1 aLXBu/jv7mbDC4vnmN6in5scpcb8Pv36fIKEwZtON/s+b7FjTTkvRNiJ61+p2aBczyYj QQr6Owjn9ZoYOGoFNRxwjOYWjjUdAxs5rgPAfY0RIjJ+I4ffQVLM49zmICHSBomrncbT UjMqpfd+/5DnWThKleS64jtahIH4GZoL0whh5D8KoKRDnBoMcDZLPucdjT40wa1szWY1 FxVg== X-Gm-Message-State: AOJu0Yy9dQnoNgqtvuVk8VAcS43aWIoY9xBlvSrgPhq0CnJJUTn9ffe1 sEV29PYNKYRA2IWzC1efWz7JMg== X-Received: by 2002:a17:903:183:b0:1ca:a290:4c0c with SMTP id z3-20020a170903018300b001caa2904c0cmr16142204plg.16.1701149678085; Mon, 27 Nov 2023 21:34:38 -0800 (PST) Received: from localhost.localdomain ([61.213.176.14]) by smtp.gmail.com with ESMTPSA id u12-20020a17090341cc00b001cfb6bef8fesm5372899ple.186.2023.11.27.21.34.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Nov 2023 21:34:37 -0800 (PST) From: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com> To: Chandan Babu R <chandan.babu@oracle.com>, "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <dchinner@redhat.com>, Allison Henderson <allison.henderson@oracle.com>, Zhang Tianci <zhangtianci.1997@bytedance.com>, Brian Foster <bfoster@redhat.com>, Ben Myers <bpm@sgi.com>, linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org, xieyongji@bytedance.com, me@jcix.top Subject: [PATCH 2/2] xfs: update dir3 leaf block metadata after swap Date: Tue, 28 Nov 2023 13:32:02 +0800 Message-Id: <20231128053202.29007-3-zhangjiachen.jaycee@bytedance.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20231128053202.29007-1-zhangjiachen.jaycee@bytedance.com> References: <20231128053202.29007-1-zhangjiachen.jaycee@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Mon, 27 Nov 2023 21:35:20 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1783784771439935093 X-GMAIL-MSGID: 1783784771439935093 |
Series |
Fixes for ENOSPC xfs_remove
|
|
Commit Message
Jiachen Zhang
Nov. 28, 2023, 5:32 a.m. UTC
From: Zhang Tianci <zhangtianci.1997@bytedance.com> xfs_da3_swap_lastblock() copy the last block content to the dead block, but do not update the metadata in it. We need update some metadata for some kinds of type block, such as dir3 leafn block records its blkno, we shall update it to the dead block blkno. Otherwise, before write the xfs_buf to disk, the verify_write() will fail in blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. We will get this warning: XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 XFS (dm-0): Unmount and run xfs_repair XFS (dm-0): First 128 bytes of corrupted metadata buffer: 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem XFS (dm-0): Please umount the filesystem and rectify the problem(s) From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record its blkno is 0x1a0. Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> --- fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
Comments
On Tue, Nov 28, 2023 at 01:32:02PM +0800, Jiachen Zhang wrote: > From: Zhang Tianci <zhangtianci.1997@bytedance.com> > > xfs_da3_swap_lastblock() copy the last block content to the dead block, > but do not update the metadata in it. We need update some metadata > for some kinds of type block, such as dir3 leafn block records its > blkno, we shall update it to the dead block blkno. Otherwise, > before write the xfs_buf to disk, the verify_write() will fail in > blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. Do you have a reproducer for this? It would be very helpful to add it to xfstests. > > We will get this warning: > > XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 > XFS (dm-0): Unmount and run xfs_repair > XFS (dm-0): First 128 bytes of corrupted metadata buffer: > 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... > 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ > 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. > 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... > 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... > 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... > 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. > 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. > XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 > XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem > XFS (dm-0): Please umount the filesystem and rectify the problem(s) > > >From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record > its blkno is 0x1a0. > > Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") > Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> > --- > fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c > index e576560b46e9..35f70e4c6447 100644 > --- a/fs/xfs/libxfs/xfs_da_btree.c > +++ b/fs/xfs/libxfs/xfs_da_btree.c > @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( > * Copy the last block into the dead buffer and log it. > */ > memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); > - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); > dead_info = dead_buf->b_addr; > + /* > + * Update the moved block's blkno if it's a dir3 leaf block > + */ > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { > + struct xfs_da3_blkinfo *dap = (struct xfs_da3_blkinfo *)dead_info; > + > + dap->blkno = cpu_to_be64(dead_buf->b_bn); > + } > + xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); The fix here looks correct to me, but also a little ugly and ad-hoc. At last we should be using container_of and not casts for getting from a xfs_da_blkinfo to a xfs_da3_blkinfo (even if there is bad precedence for the cast in existing code). But I think it would be useful to add a helper that stamps in the blkno in for a caller that only has as xfs_da_blkinfo but no xfs_da3_blkinfo and use in all the places that do it currently in an open coded fashion e.g. xfs_da3_root_join, xfs_da3_root_split, xfs_attr3_leaf_to_node. That should probably be done on top of the small backportable fix.
Hi Jiachen, kernel test robot noticed the following build errors: [auto build test ERROR on xfs-linux/for-next] [also build test ERROR on linus/master v6.7-rc3 next-20231128] [cannot apply to djwong-xfs/djwong-devel] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Jiachen-Zhang/xfs-ensure-tmp_logflags-is-initialized-in-xfs_bmap_del_extent_real/20231128-135955 base: https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next patch link: https://lore.kernel.org/r/20231128053202.29007-3-zhangjiachen.jaycee%40bytedance.com patch subject: [PATCH 2/2] xfs: update dir3 leaf block metadata after swap config: powerpc64-randconfig-r081-20231128 (https://download.01.org/0day-ci/archive/20231128/202311281800.GTI1cgFY-lkp@intel.com/config) compiler: powerpc64-linux-gcc (GCC) 13.2.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231128/202311281800.GTI1cgFY-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202311281800.GTI1cgFY-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from include/linux/byteorder/big_endian.h:5, from arch/powerpc/include/uapi/asm/byteorder.h:14, from arch/powerpc/include/asm/qspinlock_types.h:6, from arch/powerpc/include/asm/spinlock_types.h:10, from include/linux/spinlock_types_raw.h:7, from include/linux/ratelimit_types.h:7, from include/linux/printk.h:9, from include/asm-generic/bug.h:22, from arch/powerpc/include/asm/bug.h:116, from include/linux/bug.h:5, from include/linux/fortify-string.h:5, from include/linux/string.h:295, from include/linux/uuid.h:11, from fs/xfs/xfs_linux.h:10, from fs/xfs/xfs.h:22, from fs/xfs/libxfs/xfs_da_btree.c:7: fs/xfs/libxfs/xfs_da_btree.c: In function 'xfs_da3_swap_lastblock': >> fs/xfs/libxfs/xfs_da_btree.c:2330:50: error: 'struct xfs_buf' has no member named 'b_bn' 2330 | dap->blkno = cpu_to_be64(dead_buf->b_bn); | ^~ include/uapi/linux/byteorder/big_endian.h:38:51: note: in definition of macro '__cpu_to_be64' 38 | #define __cpu_to_be64(x) ((__force __be64)(__u64)(x)) | ^ fs/xfs/libxfs/xfs_da_btree.c:2330:30: note: in expansion of macro 'cpu_to_be64' 2330 | dap->blkno = cpu_to_be64(dead_buf->b_bn); | ^~~~~~~~~~~ vim +2330 fs/xfs/libxfs/xfs_da_btree.c 2254 2255 /* 2256 * Ick. We need to always be able to remove a btree block, even 2257 * if there's no space reservation because the filesystem is full. 2258 * This is called if xfs_bunmapi on a btree block fails due to ENOSPC. 2259 * It swaps the target block with the last block in the file. The 2260 * last block in the file can always be removed since it can't cause 2261 * a bmap btree split to do that. 2262 */ 2263 STATIC int 2264 xfs_da3_swap_lastblock( 2265 struct xfs_da_args *args, 2266 xfs_dablk_t *dead_blknop, 2267 struct xfs_buf **dead_bufp) 2268 { 2269 struct xfs_da_blkinfo *dead_info; 2270 struct xfs_da_blkinfo *sib_info; 2271 struct xfs_da_intnode *par_node; 2272 struct xfs_da_intnode *dead_node; 2273 struct xfs_dir2_leaf *dead_leaf2; 2274 struct xfs_da_node_entry *btree; 2275 struct xfs_da3_icnode_hdr par_hdr; 2276 struct xfs_inode *dp; 2277 struct xfs_trans *tp; 2278 struct xfs_mount *mp; 2279 struct xfs_buf *dead_buf; 2280 struct xfs_buf *last_buf; 2281 struct xfs_buf *sib_buf; 2282 struct xfs_buf *par_buf; 2283 xfs_dahash_t dead_hash; 2284 xfs_fileoff_t lastoff; 2285 xfs_dablk_t dead_blkno; 2286 xfs_dablk_t last_blkno; 2287 xfs_dablk_t sib_blkno; 2288 xfs_dablk_t par_blkno; 2289 int error; 2290 int w; 2291 int entno; 2292 int level; 2293 int dead_level; 2294 2295 trace_xfs_da_swap_lastblock(args); 2296 2297 dead_buf = *dead_bufp; 2298 dead_blkno = *dead_blknop; 2299 tp = args->trans; 2300 dp = args->dp; 2301 w = args->whichfork; 2302 ASSERT(w == XFS_DATA_FORK); 2303 mp = dp->i_mount; 2304 lastoff = args->geo->freeblk; 2305 error = xfs_bmap_last_before(tp, dp, &lastoff, w); 2306 if (error) 2307 return error; 2308 if (XFS_IS_CORRUPT(mp, lastoff == 0)) 2309 return -EFSCORRUPTED; 2310 /* 2311 * Read the last block in the btree space. 2312 */ 2313 last_blkno = (xfs_dablk_t)lastoff - args->geo->fsbcount; 2314 error = xfs_da3_node_read(tp, dp, last_blkno, &last_buf, w); 2315 if (error) 2316 return error; 2317 /* 2318 * Copy the last block into the dead buffer and log it. 2319 */ 2320 memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); 2321 dead_info = dead_buf->b_addr; 2322 /* 2323 * Update the moved block's blkno if it's a dir3 leaf block 2324 */ 2325 if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || 2326 dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || 2327 dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { 2328 struct xfs_da3_blkinfo *dap = (struct xfs_da3_blkinfo *)dead_info; 2329 > 2330 dap->blkno = cpu_to_be64(dead_buf->b_bn); 2331 } 2332 xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); 2333 /* 2334 * Get values from the moved block. 2335 */ 2336 if (dead_info->magic == cpu_to_be16(XFS_DIR2_LEAFN_MAGIC) || 2337 dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC)) { 2338 struct xfs_dir3_icleaf_hdr leafhdr; 2339 struct xfs_dir2_leaf_entry *ents; 2340 2341 dead_leaf2 = (xfs_dir2_leaf_t *)dead_info; 2342 xfs_dir2_leaf_hdr_from_disk(dp->i_mount, &leafhdr, 2343 dead_leaf2); 2344 ents = leafhdr.ents; 2345 dead_level = 0; 2346 dead_hash = be32_to_cpu(ents[leafhdr.count - 1].hashval); 2347 } else { 2348 struct xfs_da3_icnode_hdr deadhdr; 2349 2350 dead_node = (xfs_da_intnode_t *)dead_info; 2351 xfs_da3_node_hdr_from_disk(dp->i_mount, &deadhdr, dead_node); 2352 btree = deadhdr.btree; 2353 dead_level = deadhdr.level; 2354 dead_hash = be32_to_cpu(btree[deadhdr.count - 1].hashval); 2355 } 2356 sib_buf = par_buf = NULL; 2357 /* 2358 * If the moved block has a left sibling, fix up the pointers. 2359 */ 2360 if ((sib_blkno = be32_to_cpu(dead_info->back))) { 2361 error = xfs_da3_node_read(tp, dp, sib_blkno, &sib_buf, w); 2362 if (error) 2363 goto done; 2364 sib_info = sib_buf->b_addr; 2365 if (XFS_IS_CORRUPT(mp, 2366 be32_to_cpu(sib_info->forw) != last_blkno || 2367 sib_info->magic != dead_info->magic)) { 2368 error = -EFSCORRUPTED; 2369 goto done; 2370 } 2371 sib_info->forw = cpu_to_be32(dead_blkno); 2372 xfs_trans_log_buf(tp, sib_buf, 2373 XFS_DA_LOGRANGE(sib_info, &sib_info->forw, 2374 sizeof(sib_info->forw))); 2375 sib_buf = NULL; 2376 } 2377 /* 2378 * If the moved block has a right sibling, fix up the pointers. 2379 */ 2380 if ((sib_blkno = be32_to_cpu(dead_info->forw))) { 2381 error = xfs_da3_node_read(tp, dp, sib_blkno, &sib_buf, w); 2382 if (error) 2383 goto done; 2384 sib_info = sib_buf->b_addr; 2385 if (XFS_IS_CORRUPT(mp, 2386 be32_to_cpu(sib_info->back) != last_blkno || 2387 sib_info->magic != dead_info->magic)) { 2388 error = -EFSCORRUPTED; 2389 goto done; 2390 } 2391 sib_info->back = cpu_to_be32(dead_blkno); 2392 xfs_trans_log_buf(tp, sib_buf, 2393 XFS_DA_LOGRANGE(sib_info, &sib_info->back, 2394 sizeof(sib_info->back))); 2395 sib_buf = NULL; 2396 } 2397 par_blkno = args->geo->leafblk; 2398 level = -1; 2399 /* 2400 * Walk down the tree looking for the parent of the moved block. 2401 */ 2402 for (;;) { 2403 error = xfs_da3_node_read(tp, dp, par_blkno, &par_buf, w); 2404 if (error) 2405 goto done; 2406 par_node = par_buf->b_addr; 2407 xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node); 2408 if (XFS_IS_CORRUPT(mp, 2409 level >= 0 && level != par_hdr.level + 1)) { 2410 error = -EFSCORRUPTED; 2411 goto done; 2412 } 2413 level = par_hdr.level; 2414 btree = par_hdr.btree; 2415 for (entno = 0; 2416 entno < par_hdr.count && 2417 be32_to_cpu(btree[entno].hashval) < dead_hash; 2418 entno++) 2419 continue; 2420 if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) { 2421 error = -EFSCORRUPTED; 2422 goto done; 2423 } 2424 par_blkno = be32_to_cpu(btree[entno].before); 2425 if (level == dead_level + 1) 2426 break; 2427 xfs_trans_brelse(tp, par_buf); 2428 par_buf = NULL; 2429 } 2430 /* 2431 * We're in the right parent block. 2432 * Look for the right entry. 2433 */ 2434 for (;;) { 2435 for (; 2436 entno < par_hdr.count && 2437 be32_to_cpu(btree[entno].before) != last_blkno; 2438 entno++) 2439 continue; 2440 if (entno < par_hdr.count) 2441 break; 2442 par_blkno = par_hdr.forw; 2443 xfs_trans_brelse(tp, par_buf); 2444 par_buf = NULL; 2445 if (XFS_IS_CORRUPT(mp, par_blkno == 0)) { 2446 error = -EFSCORRUPTED; 2447 goto done; 2448 } 2449 error = xfs_da3_node_read(tp, dp, par_blkno, &par_buf, w); 2450 if (error) 2451 goto done; 2452 par_node = par_buf->b_addr; 2453 xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node); 2454 if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) { 2455 error = -EFSCORRUPTED; 2456 goto done; 2457 } 2458 btree = par_hdr.btree; 2459 entno = 0; 2460 } 2461 /* 2462 * Update the parent entry pointing to the moved block. 2463 */ 2464 btree[entno].before = cpu_to_be32(dead_blkno); 2465 xfs_trans_log_buf(tp, par_buf, 2466 XFS_DA_LOGRANGE(par_node, &btree[entno].before, 2467 sizeof(btree[entno].before))); 2468 *dead_blknop = last_blkno; 2469 *dead_bufp = last_buf; 2470 return 0; 2471 done: 2472 if (par_buf) 2473 xfs_trans_brelse(tp, par_buf); 2474 if (sib_buf) 2475 xfs_trans_brelse(tp, sib_buf); 2476 xfs_trans_brelse(tp, last_buf); 2477 return error; 2478 } 2479
Hi Jiachen, kernel test robot noticed the following build errors: [auto build test ERROR on xfs-linux/for-next] [also build test ERROR on linus/master v6.7-rc3 next-20231128] [cannot apply to djwong-xfs/djwong-devel] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Jiachen-Zhang/xfs-ensure-tmp_logflags-is-initialized-in-xfs_bmap_del_extent_real/20231128-135955 base: https://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git for-next patch link: https://lore.kernel.org/r/20231128053202.29007-3-zhangjiachen.jaycee%40bytedance.com patch subject: [PATCH 2/2] xfs: update dir3 leaf block metadata after swap config: i386-randconfig-141-20231128 (https://download.01.org/0day-ci/archive/20231128/202311281904.r45MkLJq-lkp@intel.com/config) compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231128/202311281904.r45MkLJq-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202311281904.r45MkLJq-lkp@intel.com/ All errors (new ones prefixed by >>): >> fs/xfs/libxfs/xfs_da_btree.c:2330:38: error: no member named 'b_bn' in 'struct xfs_buf' dap->blkno = cpu_to_be64(dead_buf->b_bn); ~~~~~~~~ ^ include/linux/byteorder/generic.h:92:21: note: expanded from macro 'cpu_to_be64' #define cpu_to_be64 __cpu_to_be64 ^ include/uapi/linux/byteorder/little_endian.h:38:53: note: expanded from macro '__cpu_to_be64' #define __cpu_to_be64(x) ((__force __be64)__swab64((x))) ^ include/uapi/linux/swab.h:128:54: note: expanded from macro '__swab64' #define __swab64(x) (__u64)__builtin_bswap64((__u64)(x)) ^ 1 error generated. vim +2330 fs/xfs/libxfs/xfs_da_btree.c 2254 2255 /* 2256 * Ick. We need to always be able to remove a btree block, even 2257 * if there's no space reservation because the filesystem is full. 2258 * This is called if xfs_bunmapi on a btree block fails due to ENOSPC. 2259 * It swaps the target block with the last block in the file. The 2260 * last block in the file can always be removed since it can't cause 2261 * a bmap btree split to do that. 2262 */ 2263 STATIC int 2264 xfs_da3_swap_lastblock( 2265 struct xfs_da_args *args, 2266 xfs_dablk_t *dead_blknop, 2267 struct xfs_buf **dead_bufp) 2268 { 2269 struct xfs_da_blkinfo *dead_info; 2270 struct xfs_da_blkinfo *sib_info; 2271 struct xfs_da_intnode *par_node; 2272 struct xfs_da_intnode *dead_node; 2273 struct xfs_dir2_leaf *dead_leaf2; 2274 struct xfs_da_node_entry *btree; 2275 struct xfs_da3_icnode_hdr par_hdr; 2276 struct xfs_inode *dp; 2277 struct xfs_trans *tp; 2278 struct xfs_mount *mp; 2279 struct xfs_buf *dead_buf; 2280 struct xfs_buf *last_buf; 2281 struct xfs_buf *sib_buf; 2282 struct xfs_buf *par_buf; 2283 xfs_dahash_t dead_hash; 2284 xfs_fileoff_t lastoff; 2285 xfs_dablk_t dead_blkno; 2286 xfs_dablk_t last_blkno; 2287 xfs_dablk_t sib_blkno; 2288 xfs_dablk_t par_blkno; 2289 int error; 2290 int w; 2291 int entno; 2292 int level; 2293 int dead_level; 2294 2295 trace_xfs_da_swap_lastblock(args); 2296 2297 dead_buf = *dead_bufp; 2298 dead_blkno = *dead_blknop; 2299 tp = args->trans; 2300 dp = args->dp; 2301 w = args->whichfork; 2302 ASSERT(w == XFS_DATA_FORK); 2303 mp = dp->i_mount; 2304 lastoff = args->geo->freeblk; 2305 error = xfs_bmap_last_before(tp, dp, &lastoff, w); 2306 if (error) 2307 return error; 2308 if (XFS_IS_CORRUPT(mp, lastoff == 0)) 2309 return -EFSCORRUPTED; 2310 /* 2311 * Read the last block in the btree space. 2312 */ 2313 last_blkno = (xfs_dablk_t)lastoff - args->geo->fsbcount; 2314 error = xfs_da3_node_read(tp, dp, last_blkno, &last_buf, w); 2315 if (error) 2316 return error; 2317 /* 2318 * Copy the last block into the dead buffer and log it. 2319 */ 2320 memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); 2321 dead_info = dead_buf->b_addr; 2322 /* 2323 * Update the moved block's blkno if it's a dir3 leaf block 2324 */ 2325 if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || 2326 dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || 2327 dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { 2328 struct xfs_da3_blkinfo *dap = (struct xfs_da3_blkinfo *)dead_info; 2329 > 2330 dap->blkno = cpu_to_be64(dead_buf->b_bn); 2331 } 2332 xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); 2333 /* 2334 * Get values from the moved block. 2335 */ 2336 if (dead_info->magic == cpu_to_be16(XFS_DIR2_LEAFN_MAGIC) || 2337 dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC)) { 2338 struct xfs_dir3_icleaf_hdr leafhdr; 2339 struct xfs_dir2_leaf_entry *ents; 2340 2341 dead_leaf2 = (xfs_dir2_leaf_t *)dead_info; 2342 xfs_dir2_leaf_hdr_from_disk(dp->i_mount, &leafhdr, 2343 dead_leaf2); 2344 ents = leafhdr.ents; 2345 dead_level = 0; 2346 dead_hash = be32_to_cpu(ents[leafhdr.count - 1].hashval); 2347 } else { 2348 struct xfs_da3_icnode_hdr deadhdr; 2349 2350 dead_node = (xfs_da_intnode_t *)dead_info; 2351 xfs_da3_node_hdr_from_disk(dp->i_mount, &deadhdr, dead_node); 2352 btree = deadhdr.btree; 2353 dead_level = deadhdr.level; 2354 dead_hash = be32_to_cpu(btree[deadhdr.count - 1].hashval); 2355 } 2356 sib_buf = par_buf = NULL; 2357 /* 2358 * If the moved block has a left sibling, fix up the pointers. 2359 */ 2360 if ((sib_blkno = be32_to_cpu(dead_info->back))) { 2361 error = xfs_da3_node_read(tp, dp, sib_blkno, &sib_buf, w); 2362 if (error) 2363 goto done; 2364 sib_info = sib_buf->b_addr; 2365 if (XFS_IS_CORRUPT(mp, 2366 be32_to_cpu(sib_info->forw) != last_blkno || 2367 sib_info->magic != dead_info->magic)) { 2368 error = -EFSCORRUPTED; 2369 goto done; 2370 } 2371 sib_info->forw = cpu_to_be32(dead_blkno); 2372 xfs_trans_log_buf(tp, sib_buf, 2373 XFS_DA_LOGRANGE(sib_info, &sib_info->forw, 2374 sizeof(sib_info->forw))); 2375 sib_buf = NULL; 2376 } 2377 /* 2378 * If the moved block has a right sibling, fix up the pointers. 2379 */ 2380 if ((sib_blkno = be32_to_cpu(dead_info->forw))) { 2381 error = xfs_da3_node_read(tp, dp, sib_blkno, &sib_buf, w); 2382 if (error) 2383 goto done; 2384 sib_info = sib_buf->b_addr; 2385 if (XFS_IS_CORRUPT(mp, 2386 be32_to_cpu(sib_info->back) != last_blkno || 2387 sib_info->magic != dead_info->magic)) { 2388 error = -EFSCORRUPTED; 2389 goto done; 2390 } 2391 sib_info->back = cpu_to_be32(dead_blkno); 2392 xfs_trans_log_buf(tp, sib_buf, 2393 XFS_DA_LOGRANGE(sib_info, &sib_info->back, 2394 sizeof(sib_info->back))); 2395 sib_buf = NULL; 2396 } 2397 par_blkno = args->geo->leafblk; 2398 level = -1; 2399 /* 2400 * Walk down the tree looking for the parent of the moved block. 2401 */ 2402 for (;;) { 2403 error = xfs_da3_node_read(tp, dp, par_blkno, &par_buf, w); 2404 if (error) 2405 goto done; 2406 par_node = par_buf->b_addr; 2407 xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node); 2408 if (XFS_IS_CORRUPT(mp, 2409 level >= 0 && level != par_hdr.level + 1)) { 2410 error = -EFSCORRUPTED; 2411 goto done; 2412 } 2413 level = par_hdr.level; 2414 btree = par_hdr.btree; 2415 for (entno = 0; 2416 entno < par_hdr.count && 2417 be32_to_cpu(btree[entno].hashval) < dead_hash; 2418 entno++) 2419 continue; 2420 if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) { 2421 error = -EFSCORRUPTED; 2422 goto done; 2423 } 2424 par_blkno = be32_to_cpu(btree[entno].before); 2425 if (level == dead_level + 1) 2426 break; 2427 xfs_trans_brelse(tp, par_buf); 2428 par_buf = NULL; 2429 } 2430 /* 2431 * We're in the right parent block. 2432 * Look for the right entry. 2433 */ 2434 for (;;) { 2435 for (; 2436 entno < par_hdr.count && 2437 be32_to_cpu(btree[entno].before) != last_blkno; 2438 entno++) 2439 continue; 2440 if (entno < par_hdr.count) 2441 break; 2442 par_blkno = par_hdr.forw; 2443 xfs_trans_brelse(tp, par_buf); 2444 par_buf = NULL; 2445 if (XFS_IS_CORRUPT(mp, par_blkno == 0)) { 2446 error = -EFSCORRUPTED; 2447 goto done; 2448 } 2449 error = xfs_da3_node_read(tp, dp, par_blkno, &par_buf, w); 2450 if (error) 2451 goto done; 2452 par_node = par_buf->b_addr; 2453 xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node); 2454 if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) { 2455 error = -EFSCORRUPTED; 2456 goto done; 2457 } 2458 btree = par_hdr.btree; 2459 entno = 0; 2460 } 2461 /* 2462 * Update the parent entry pointing to the moved block. 2463 */ 2464 btree[entno].before = cpu_to_be32(dead_blkno); 2465 xfs_trans_log_buf(tp, par_buf, 2466 XFS_DA_LOGRANGE(par_node, &btree[entno].before, 2467 sizeof(btree[entno].before))); 2468 *dead_blknop = last_blkno; 2469 *dead_bufp = last_buf; 2470 return 0; 2471 done: 2472 if (par_buf) 2473 xfs_trans_brelse(tp, par_buf); 2474 if (sib_buf) 2475 xfs_trans_brelse(tp, sib_buf); 2476 xfs_trans_brelse(tp, last_buf); 2477 return error; 2478 } 2479
On Tue, Nov 28, 2023 at 01:32:02PM +0800, Jiachen Zhang wrote: > From: Zhang Tianci <zhangtianci.1997@bytedance.com> > > xfs_da3_swap_lastblock() copy the last block content to the dead block, > but do not update the metadata in it. We need update some metadata > for some kinds of type block, such as dir3 leafn block records its > blkno, we shall update it to the dead block blkno. Otherwise, > before write the xfs_buf to disk, the verify_write() will fail in > blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. > > We will get this warning: > > XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 > XFS (dm-0): Unmount and run xfs_repair > XFS (dm-0): First 128 bytes of corrupted metadata buffer: > 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... > 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ > 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. > 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... > 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... > 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... > 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. > 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. > XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 > XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem > XFS (dm-0): Please umount the filesystem and rectify the problem(s) > > From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record > its blkno is 0x1a0. > > Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") > Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> > --- > fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c > index e576560b46e9..35f70e4c6447 100644 > --- a/fs/xfs/libxfs/xfs_da_btree.c > +++ b/fs/xfs/libxfs/xfs_da_btree.c > @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( > * Copy the last block into the dead buffer and log it. > */ > memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); > - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); > dead_info = dead_buf->b_addr; > + /* > + * Update the moved block's blkno if it's a dir3 leaf block > + */ > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { a.k.a. if (xfs_has_crc(mp)) { i.e. this is not specific to the buffer type being processed, it's specific to v4 vs v5 on-disk format. Hence it's a fs-feature check, not a block magic number check. Cheers, Dave.
On Wed, Nov 29, 2023 at 10:15:52AM +1100, Dave Chinner wrote: > > + /* > > + * Update the moved block's blkno if it's a dir3 leaf block > > + */ > > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { > > a.k.a. > > if (xfs_has_crc(mp)) { > > i.e. this is not specific to the buffer type being processed, it's > specific to v4 vs v5 on-disk format. Hence it's a fs-feature check, > not a block magic number check. We have these magic based checks in quite a few places right now, so I'm not surprised that Jiachen picked it up from there..
On Wed, Nov 29, 2023 at 10:15:52AM +1100, Dave Chinner wrote: > On Tue, Nov 28, 2023 at 01:32:02PM +0800, Jiachen Zhang wrote: > > From: Zhang Tianci <zhangtianci.1997@bytedance.com> > > > > xfs_da3_swap_lastblock() copy the last block content to the dead block, > > but do not update the metadata in it. We need update some metadata > > for some kinds of type block, such as dir3 leafn block records its > > blkno, we shall update it to the dead block blkno. Otherwise, > > before write the xfs_buf to disk, the verify_write() will fail in > > blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. > > > > We will get this warning: > > > > XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 > > XFS (dm-0): Unmount and run xfs_repair > > XFS (dm-0): First 128 bytes of corrupted metadata buffer: > > 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... > > 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ > > 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. > > 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... > > 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... > > 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... > > 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. > > 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. > > XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 > > XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem > > XFS (dm-0): Please umount the filesystem and rectify the problem(s) > > > > From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record > > its blkno is 0x1a0. > > > > Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") > > Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> > > --- > > fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c > > index e576560b46e9..35f70e4c6447 100644 > > --- a/fs/xfs/libxfs/xfs_da_btree.c > > +++ b/fs/xfs/libxfs/xfs_da_btree.c > > @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( > > * Copy the last block into the dead buffer and log it. > > */ > > memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); > > - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); > > dead_info = dead_buf->b_addr; > > + /* > > + * Update the moved block's blkno if it's a dir3 leaf block > > + */ > > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { On second thought, does this code have to handle XFS_DA3_NODE_MAGIC as well? > > a.k.a. > > if (xfs_has_crc(mp)) { > > i.e. this is not specific to the buffer type being processed, it's > specific to v4 vs v5 on-disk format. Hence it's a fs-feature check, > not a block magic number check. in which case Dave's comment is correct, yes? --D > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com >
On Wed, Nov 29, 2023 at 2:34 PM Darrick J. Wong <djwong@kernel.org> wrote: > > On Wed, Nov 29, 2023 at 10:15:52AM +1100, Dave Chinner wrote: > > On Tue, Nov 28, 2023 at 01:32:02PM +0800, Jiachen Zhang wrote: > > > From: Zhang Tianci <zhangtianci.1997@bytedance.com> > > > > > > xfs_da3_swap_lastblock() copy the last block content to the dead block, > > > but do not update the metadata in it. We need update some metadata > > > for some kinds of type block, such as dir3 leafn block records its > > > blkno, we shall update it to the dead block blkno. Otherwise, > > > before write the xfs_buf to disk, the verify_write() will fail in > > > blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. > > > > > > We will get this warning: > > > > > > XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 > > > XFS (dm-0): Unmount and run xfs_repair > > > XFS (dm-0): First 128 bytes of corrupted metadata buffer: > > > 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... > > > 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ > > > 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. > > > 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... > > > 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... > > > 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... > > > 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. > > > 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. > > > XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 > > > XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem > > > XFS (dm-0): Please umount the filesystem and rectify the problem(s) > > > > > > From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record > > > its blkno is 0x1a0. > > > > > > Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") > > > Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> > > > --- > > > fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- > > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c > > > index e576560b46e9..35f70e4c6447 100644 > > > --- a/fs/xfs/libxfs/xfs_da_btree.c > > > +++ b/fs/xfs/libxfs/xfs_da_btree.c > > > @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( > > > * Copy the last block into the dead buffer and log it. > > > */ > > > memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); > > > - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); > > > dead_info = dead_buf->b_addr; > > > + /* > > > + * Update the moved block's blkno if it's a dir3 leaf block > > > + */ > > > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { > > On second thought, does this code have to handle XFS_DA3_NODE_MAGIC as > well? I think the node block is not too possible the last block, but it's harmless to add this check. I would use Dave's suggestion to check xfs's crc-feature rather than magic. I think it's equivalent in this function. We will send the v2 patchset soon. Thanks. > > > > > a.k.a. > > > > if (xfs_has_crc(mp)) { > > > > i.e. this is not specific to the buffer type being processed, it's > > specific to v4 vs v5 on-disk format. Hence it's a fs-feature check, > > not a block magic number check. > > in which case Dave's comment is correct, yes? > > --D > > > Cheers, > > > > Dave. > > -- > > Dave Chinner > > david@fromorbit.com > >
On Tue, Nov 28, 2023 at 10:34:09PM -0800, Christoph Hellwig wrote: > On Wed, Nov 29, 2023 at 10:15:52AM +1100, Dave Chinner wrote: > > > + /* > > > + * Update the moved block's blkno if it's a dir3 leaf block > > > + */ > > > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { > > > > a.k.a. > > > > if (xfs_has_crc(mp)) { > > > > i.e. this is not specific to the buffer type being processed, it's > > specific to v4 vs v5 on-disk format. Hence it's a fs-feature check, > > not a block magic number check. > > We have these magic based checks in quite a few places right now, > so I'm not surprised that Jiachen picked it up from there.. Yes, but that doesn't mean the magic number check has been used correctly here. That is, we use the magic number check when the code has a conditional on the type of buffer being processed (i.e. what block type are we operating on? e.g. DANODE vs LEAFN as is checked a few lines further down in this code). When the conditional is "what on-disk format are we operating on?" such as when we are decoding headers or running verifiers, we use xfs_has_crc() because we can't trust magic numbers to be correct prior to validation. Hence we use xfs_has_crc() to determine how to decode/encode/verify the structure header, not the magic number in the block. Cheers, Dave.
On Tue, Nov 28, 2023 at 10:34:33PM -0800, Darrick J. Wong wrote: > On Wed, Nov 29, 2023 at 10:15:52AM +1100, Dave Chinner wrote: > > On Tue, Nov 28, 2023 at 01:32:02PM +0800, Jiachen Zhang wrote: > > > From: Zhang Tianci <zhangtianci.1997@bytedance.com> > > > > > > xfs_da3_swap_lastblock() copy the last block content to the dead block, > > > but do not update the metadata in it. We need update some metadata > > > for some kinds of type block, such as dir3 leafn block records its > > > blkno, we shall update it to the dead block blkno. Otherwise, > > > before write the xfs_buf to disk, the verify_write() will fail in > > > blk_hdr->blkno != xfs_buf->b_bn, then xfs will be shutdown. > > > > > > We will get this warning: > > > > > > XFS (dm-0): Metadata corruption detected at xfs_dir3_leaf_verify+0xa8/0xe0 [xfs], xfs_dir3_leafn block 0x178 > > > XFS (dm-0): Unmount and run xfs_repair > > > XFS (dm-0): First 128 bytes of corrupted metadata buffer: > > > 00000000e80f1917: 00 80 00 0b 00 80 00 07 3d ff 00 00 00 00 00 00 ........=....... > > > 000000009604c005: 00 00 00 00 00 00 01 a0 00 00 00 00 00 00 00 00 ................ > > > 000000006b6fb2bf: e4 44 e3 97 b5 64 44 41 8b 84 60 0e 50 43 d9 bf .D...dDA..`.PC.. > > > 00000000678978a2: 00 00 00 00 00 00 00 83 01 73 00 93 00 00 00 00 .........s...... > > > 00000000b28b247c: 99 29 1d 38 00 00 00 00 99 29 1d 40 00 00 00 00 .).8.....).@.... > > > 000000002b2a662c: 99 29 1d 48 00 00 00 00 99 49 11 00 00 00 00 00 .).H.....I...... > > > 00000000ea2ffbb8: 99 49 11 08 00 00 45 25 99 49 11 10 00 00 48 fe .I....E%.I....H. > > > 0000000069e86440: 99 49 11 18 00 00 4c 6b 99 49 11 20 00 00 4d 97 .I....Lk.I. ..M. > > > XFS (dm-0): xfs_do_force_shutdown(0x8) called from line 1423 of file fs/xfs/xfs_buf.c. Return address = 00000000c0ff63c1 > > > XFS (dm-0): Corruption of in-memory data detected. Shutting down filesystem > > > XFS (dm-0): Please umount the filesystem and rectify the problem(s) > > > > > > From the log above, we know xfs_buf->b_no is 0x178, but the block's hdr record > > > its blkno is 0x1a0. > > > > > > Fixes: 24df33b45ecf ("xfs: add CRC checking to dir2 leaf blocks") > > > Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com> > > > --- > > > fs/xfs/libxfs/xfs_da_btree.c | 12 +++++++++++- > > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > > > diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c > > > index e576560b46e9..35f70e4c6447 100644 > > > --- a/fs/xfs/libxfs/xfs_da_btree.c > > > +++ b/fs/xfs/libxfs/xfs_da_btree.c > > > @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( > > > * Copy the last block into the dead buffer and log it. > > > */ > > > memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); > > > - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); > > > dead_info = dead_buf->b_addr; > > > + /* > > > + * Update the moved block's blkno if it's a dir3 leaf block > > > + */ > > > + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || > > > + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { > > On second thought, does this code have to handle XFS_DA3_NODE_MAGIC as > well? Yes, it does. The code to decode the block as a danode instead of leaf1/leafn block is a few lines further down. This code does not support ATTR_LEAF blocks, however. xfs_da_shrink_inode() will only try to swap blocks on the data fork, never on the attribute fork. That's largely a moot issue, though. -Dave.
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..35f70e4c6447 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2318,8 +2318,18 @@ xfs_da3_swap_lastblock( * Copy the last block into the dead buffer and log it. */ memcpy(dead_buf->b_addr, last_buf->b_addr, args->geo->blksize); - xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); dead_info = dead_buf->b_addr; + /* + * Update the moved block's blkno if it's a dir3 leaf block + */ + if (dead_info->magic == cpu_to_be16(XFS_DIR3_LEAF1_MAGIC) || + dead_info->magic == cpu_to_be16(XFS_DIR3_LEAFN_MAGIC) || + dead_info->magic == cpu_to_be16(XFS_ATTR3_LEAF_MAGIC)) { + struct xfs_da3_blkinfo *dap = (struct xfs_da3_blkinfo *)dead_info; + + dap->blkno = cpu_to_be64(dead_buf->b_bn); + } + xfs_trans_log_buf(tp, dead_buf, 0, args->geo->blksize - 1); /* * Get values from the moved block. */