Message ID | 20221222020244.1821308-1-jun.nie@linaro.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:e747:0:0:0:0:0 with SMTP id c7csp94004wrn; Wed, 21 Dec 2022 18:24:12 -0800 (PST) X-Google-Smtp-Source: AMrXdXuEvyuzOSyOMgBnYaF4a5jTPam3f21sX5tDAW+S7ERzxSBNhn0f2j5WPACcM7ggmRCutKME X-Received: by 2002:a17:906:5dd2:b0:7d5:29e1:15ea with SMTP id p18-20020a1709065dd200b007d529e115eamr3675375ejv.8.1671675852194; Wed, 21 Dec 2022 18:24:12 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1671675852; cv=none; d=google.com; s=arc-20160816; b=j+fO0NovnwFJdOUzYfn23QiwPZr2wNov2lEGHfSsC0LHZtrmfdDPK6It4AcuALBpv7 WEDAbuJ0JPFMR9qMnjrQrs0ChwVpUvQzcbGBsEpqwdaeuFITmlGjLwfWHUNAvR+4ODqN qZcGUQAuUPS8jizBgOuEGgdS8KYDpHdjD4mBmeaGIpXZ7U8XSDW023r9Mpq4h77UlwV/ GIbO89cYuMWqdILPD+B+Zm80t8vbzJNmLKi7pFHiJfJNcKE4fbt5WLc+Lt6FhLuE4zms rFtAMEm6Q09dGOdn1mMPfZLQqs4SRhwfZCvIyVE0UpXNdDx0vJ9YhNj4XzAuOy5lXgZe bRiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:to:from:dkim-signature; bh=/vSB+l3+h0STRS9DPPu2XELvIiGiSoWBn/FlK1C83SA=; b=IEXS4Rqo207YVDDHkbB0bD9RA/QAyhFw64NyieRo1d+dFxHrHbtdl9Uw1lz1p8DOcA +QrPL/1aV/gP3CtRVbkctb5yAiZRo37S4nSUF7ui6LUAqeHa2QDHxIDraXz/1grTnqfz 69/tIeUGaeRJoP1mca4Ew5HyVBzvjTGn2XxQ9VIUStt8qdnDXF82Myr1B6/t2pAQc7cX thgNfS2hu+C3nL+tk1n89odCq9bRH+HCmn+SZqgZUjUhMIfuEeIp/vLrflv2AO74LG/C UXXhs4uNkGPDALmdcA8isX/B2dHBC8S9V6prPy64bG7Dp4EnAsNEdlINfBO7vlt7OUiZ yjZg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=ou9acgID; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id fg3-20020a056402548300b0047a42ff5dd9si7812180edb.328.2022.12.21.18.23.48; Wed, 21 Dec 2022 18:24:12 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=ou9acgID; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229601AbiLVCCl (ORCPT <rfc822;pacteraone@gmail.com> + 99 others); Wed, 21 Dec 2022 21:02:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229620AbiLVCCj (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 21 Dec 2022 21:02:39 -0500 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E139220F6D for <linux-kernel@vger.kernel.org>; Wed, 21 Dec 2022 18:02:38 -0800 (PST) Received: by mail-pl1-x632.google.com with SMTP id jn22so628756plb.13 for <linux-kernel@vger.kernel.org>; Wed, 21 Dec 2022 18:02:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:from:to:cc:subject:date:message-id:reply-to; bh=/vSB+l3+h0STRS9DPPu2XELvIiGiSoWBn/FlK1C83SA=; b=ou9acgIDlwQ3xDhECuqvM25H6FtaND3bPaqdjPAqWdAhqo7Y83YVVOiGn0oWKpRBS4 vdgdqILkVAYHVfYdf3pof6pdXyJCljnEs1wptCI9ej7Cr5ZAZl7t3PNkhtzsg1W3bURS 2HV6er5F+bAYJRN1hxlb+2zSVOYMkz7cfO219sfs7x/Gq2WOpVPAOuneveYdLQEqPNo2 /u7t31kWI4LUDAVT50QZFdyQZlAbgClBC1npmARQJs8QRnxoa/Y+JaRC1ofFSMs1nIk8 eSyFzxaLsJjIdkigx9S0+zBa5ISSuOse5qFZ5m7kcu4uxDC5ykAF2+Vzkbq69ZX0hlll tKow== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=/vSB+l3+h0STRS9DPPu2XELvIiGiSoWBn/FlK1C83SA=; b=0uU9PXaqoLdzY760wOb7vl4L/uonGBQuRkw4FP+3qAO0BHzyAxOAcnZ/2y0euMuxO/ 8JkODUgxdNTmBMs1691jmqcmnpwW0r6CX+gkzVXMKqWFLmDYPWnlqm0VTFl0wIqKiMxW fI7UwBIZxZbpXHv0ZHenLYLfrdGLXoGkPu2iUuL2Pv6urfZs+uJ4mc3mtHlQaiMNqeSi 1i3Mjax4BKY9yLRXYt6FcEp7spEPBcciNbLs0Z9iu11Qu628yaXopsNhPgTeqLMnJKCv LrMzGDSJnf+ca+7jFsEoMKdA+1ZxVALZWAFKXgKnHDoqzAPAlvAyg4a8axwlmaFndWIw yS4A== X-Gm-Message-State: AFqh2kqXqck/JVL3shc7HyR6G5A+f6WMXTgQo+1EfFNHHeROZWXxFjCa 7500VqTAhvJOMxFCcRqHFmpywg== X-Received: by 2002:a17:902:82c8:b0:191:4389:f8fb with SMTP id u8-20020a17090282c800b001914389f8fbmr3938992plz.65.1671674558276; Wed, 21 Dec 2022 18:02:38 -0800 (PST) Received: from niej-dt-7B47.. (80.251.214.228.16clouds.com. [80.251.214.228]) by smtp.gmail.com with ESMTPSA id k15-20020a170902c40f00b001869b988d93sm12154909plk.187.2022.12.21.18.02.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Dec 2022 18:02:37 -0800 (PST) From: Jun Nie <jun.nie@linaro.org> To: tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] ext4: fix underflow in group bitmap calculation Date: Thu, 22 Dec 2022 10:02:44 +0800 Message-Id: <20221222020244.1821308-1-jun.nie@linaro.org> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752879178154607870?= X-GMAIL-MSGID: =?utf-8?q?1752879178154607870?= |
Series |
ext4: fix underflow in group bitmap calculation
|
|
Commit Message
Jun Nie
Dec. 22, 2022, 2:02 a.m. UTC
There is case that s_first_data_block is not 0 and block nr is smaller than
s_first_data_block when calculating group bitmap during allocation. This
underflow make index exceed es->s_groups_count in ext4_get_group_info()
and trigger the BUG_ON.
Fix it with protection of underflow.
Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()")
Link: https://syzkaller.appspot.com/bug?id=79d5768e9bfe362911ac1a5057a36fc6b5c30002
Reported-by: syzbot+6be2b977c89f79b6b153@syzkaller.appspotmail.com
Signed-off-by: Jun Nie <jun.nie@linaro.org>
---
fs/ext4/balloc.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
On Thu, Dec 22, 2022 at 10:02:44AM +0800, Jun Nie wrote: > There is case that s_first_data_block is not 0 and block nr is smaller than > s_first_data_block when calculating group bitmap during allocation. This > underflow make index exceed es->s_groups_count in ext4_get_group_info() > and trigger the BUG_ON. > > Fix it with protection of underflow. When was this happening, and why? If blocknr is less than s_first_data_block, this is either a insufficient input validation, insufficient validation to detection file system corruption. or some other kernel bug. Looking quickly at the code and the repro, it appears that issue is that FS_IOC_GETFSMAP is getting passed a stating physical block of 0 in fmh_keys[0] when on a file system with a blocksize of 1k (in which case s_first_data_block is 1). It's unclear to me what FS_IOC_GETFSMAP should *do* when passed a value which requests that it provide a mapping for a block which is out of bounds (either too big, or too small)?. Should it return an error? Should it simply not return a mapping? The map page for ioctl_getfsmap() doesn't shed any light on this question. Darrick, you designed the interface and wrote most of fs/ext4/fsmap.c. Can you let us know what is supposed to happen in this case? Many thanks!! > Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()") This makes ***no*** sense; the commit in question is from 2006, which means that in some jourisdictions it's old enough to drive a car. :-) Futhermore, all it does is move the function from an inline function to a C file (in this case, balloc.c). It also long predates introduction of FS_IOC_GETFSMAP support, which was in 2017. I'm guessing you just did a "git blame" and blindly assumed that whatever commit last touched the C code in question was what introduced the problem? Anyway, please try to understand what is going on instead of doing the moral equivalent of taking a sledgehammer to the code until the reproducer stops triggering a BUG. It's not enough to shut up the reproducer; you should understand what is happening, and why, and then strive to find the best fix to the problem. Papering over problems in the end will result in more fragile code, and the goal of syzkaller is to improve kernel quality. But syzkaller is just a tool and used wrongly, it can have the opposite effect. Regards, - Ted
On Thu, Dec 22, 2022 at 12:41:58PM -0500, Theodore Ts'o wrote: > On Thu, Dec 22, 2022 at 10:02:44AM +0800, Jun Nie wrote: > > There is case that s_first_data_block is not 0 and block nr is smaller than > > s_first_data_block when calculating group bitmap during allocation. This > > underflow make index exceed es->s_groups_count in ext4_get_group_info() > > and trigger the BUG_ON. > > > > Fix it with protection of underflow. > > When was this happening, and why? If blocknr is less than > s_first_data_block, this is either a insufficient input validation, > insufficient validation to detection file system corruption. or some > other kernel bug. > > Looking quickly at the code and the repro, it appears that issue is > that FS_IOC_GETFSMAP is getting passed a stating physical block of 0 > in fmh_keys[0] when on a file system with a blocksize of 1k (in which > case s_first_data_block is 1). It's unclear to me what Question -- on a 1k-block filesystem, are the first 1024 bytes of the device *reserved* by ext4 for whatever bootloader crud goes in there? Or is that space undefined in the filesystem specification? I never did figure that out when I was writing the ondisk specification that's in the kernel, but maybe you remember? > FS_IOC_GETFSMAP should *do* when passed a value which requests that it > provide a mapping for a block which is out of bounds (either too big, > or too small)?. Should it return an error? Should it simply not > return a mapping? The map page for ioctl_getfsmap() doesn't shed any > light on this question. > > Darrick, you designed the interface and wrote most of fs/ext4/fsmap.c. > Can you let us know what is supposed to happen in this case? Many > thanks!! If those first 1024 bytes are defined to be reserved in the ondisk format, then you could return a mapping for those bytes with the owner code set to EXT4_FMR_OWN_UNKNOWN. If, however, the space is undefined, then going off this statement in the manpage: "For example, if the low key (fsmap_head.fmh_keys[0]) is set to (8:0, 36864, 0, 0, 0), the filesystem will only return records for extents starting at or above 36 KiB on disk." I think the 'at or above' clause means that ext4 should not pass back any mapping for the byte range 0-1023 on a 1k-block filesystem. If the low key is set to (8:0, 0, 0, 0, 0) and high key is set to (8:0, 1023, 0, 0, 0) then ext4 shouldn't return any mapping at all, because there's no space usage defined for that region of the disk. If the low key is set to (8:0, 0, 0, 0, 0) and high key is set to all ones, then ext4 can return mappings for the primary superblock at offset 1024. --D > > > Fixes: 72b64b594081ef ("ext4 uninline ext4_get_group_no_and_offset()") > > This makes ***no*** sense; the commit in question is from 2006, which > means that in some jourisdictions it's old enough to drive a car. :-) > Futhermore, all it does is move the function from an inline function > to a C file (in this case, balloc.c). It also long predates > introduction of FS_IOC_GETFSMAP support, which was in 2017. > > I'm guessing you just did a "git blame" and blindly assumed that > whatever commit last touched the C code in question was what > introduced the problem? > > Anyway, please try to understand what is going on instead of doing the > moral equivalent of taking a sledgehammer to the code until the > reproducer stops triggering a BUG. It's not enough to shut up the > reproducer; you should understand what is happening, and why, and then > strive to find the best fix to the problem. Papering over problems in > the end will result in more fragile code, and the goal of syzkaller is > to improve kernel quality. But syzkaller is just a tool and used > wrongly, it can have the opposite effect. > > Regards, > > - Ted
On Thu, Dec 22, 2022 at 10:08:59AM -0800, Darrick J. Wong wrote: > > Question -- on a 1k-block filesystem, are the first 1024 bytes of the > device *reserved* by ext4 for whatever bootloader crud goes in there? > Or is that space undefined in the filesystem specification? > > I never did figure that out when I was writing the ondisk specification > that's in the kernel, but maybe you remember? That's an interesting (and philosophical) question. The ext2 file system never had a formal specification, and this part of the file system format was devised by Remy Card before I had gotten involved with ext2. (I first got started writing e2fsprogs; which replaced the previous file system utilities, which were forked from minix's tools, and which were quite inefficient.) In favor of it being undefined, the first 1024 bytes are not part of any block group in an ext2 file system with a 1k block size. (The first block group is composed of physical blocks 1 through 8192 inclusive when the block size is 1k. Whereas if the blocksize is 4k, the first block group is composed of physical blocks 0 through 32767.) In addition, the status of the first 1024 bytes is not controlled by an ext2 block allocation bitmap. One could also argue that to the extent that ext2 was derived the ext file system, which in turn was derived from Minix --- and Minix File System (which does have a specification, explicitly states that "block 0" is reserved for the Bootloader, with "Block 1" being the location of the superblock. But Minix only supports a 1k blocksize, and doesn't have the concept of FFS-style block (cylinder) groups. So I'd come down on the side which states that the first 1024 bytes are "undefined" on a 1k block file system. (One could also aruge that they are "undefined" on a 2k and 4k block file system, but the first 1024 bytes are part of "block 0", and on 2k and 4k block file systems, "block 0" is part of a block group.) > If those first 1024 bytes are defined to be reserved in the ondisk > format, then you could return a mapping for those bytes with the owner > code set to EXT4_FMR_OWN_UNKNOWN. > > If, however, the space is undefined, then going off this statement in > the manpage: > > "For example, if the low key (fsmap_head.fmh_keys[0]) is set to (8:0, > 36864, 0, 0, 0), the filesystem will only return records for extents > starting at or above 36 KiB on disk." > > I think the 'at or above' clause means that ext4 should not pass back > any mapping for the byte range 0-1023 on a 1k-block filesystem. Sure, sounds good to me. - Ted
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c index 8ff4b9192a9f..177ef6bd635a 100644 --- a/fs/ext4/balloc.c +++ b/fs/ext4/balloc.c @@ -56,7 +56,8 @@ void ext4_get_group_no_and_offset(struct super_block *sb, ext4_fsblk_t blocknr, struct ext4_super_block *es = EXT4_SB(sb)->s_es; ext4_grpblk_t offset; - blocknr = blocknr - le32_to_cpu(es->s_first_data_block); + blocknr = blocknr > le32_to_cpu(es->s_first_data_block) ? + blocknr - le32_to_cpu(es->s_first_data_block) : 0; offset = do_div(blocknr, EXT4_BLOCKS_PER_GROUP(sb)) >> EXT4_SB(sb)->s_cluster_bits; if (offsetp)