Message ID | 20221128154434.4177442-9-linan122@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp5750388wrr; Mon, 28 Nov 2022 07:28:23 -0800 (PST) X-Google-Smtp-Source: AA0mqf6my/HQw0Sxa42jySsia6tQGNmc8xY1CHk+O4hQcaSMBkZKKEvVtpHi9Q30g/H+61BJnc7a X-Received: by 2002:a50:f602:0:b0:469:4e4f:4827 with SMTP id c2-20020a50f602000000b004694e4f4827mr40311396edn.214.1669649303787; Mon, 28 Nov 2022 07:28:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669649303; cv=none; d=google.com; s=arc-20160816; b=o6iAcm8MY5Sp/k8RT69zo0n2rb0dSFv05OFH0oAIJbfZfUl5iKghEAyBeIHcQlnn3e pXHDqTc5WoO8VVQFKNBi85Lpc+cMjVxnfCpWURdJHWQJX8v7AWxTpXxN6gC5nf0X04Qa pTz/+BiYHt0onmQjCuW8ceFKoecwEYLjtLQZYTx3P2UYVc3yZm7wezLhrJGmdG7L6Aov kflamcqb5bgDqXRHel8Ps+/YoMGF6i6gYWjQs085SMEUeTncCJfnge/zE5eD2L/C2VPL DMS4lVdslxh4KZP5OP0lvvJpdbgOzK2SUuxoM1xTSpNvFEq8C6ysxRPg07pQiKrucAfa x1QQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=DFv0pknN1CNLR3dxTrIaXMFbPHMpYDqm9AJmGvSasDE=; b=P3xDeBXYWz3MZkp5HszbDI3w1xO1T7hYVzNDjH0mfR1zFoXbJ2NcesCZk1XRfc54am V5L8fyxb+HdBlqt9pnignKFk11NYy8wcEt+RJIyV390VEvViajEpUnIFDsYe278d+cWC rzM3MfVnSzvBaJSKazeDa5MLdO7N87wVfNenrnLhQXgQqzkhOG2ePmadw82bbxN5rK2s N9zas6ppzlNcrbAFVPjk+Twbw9slciu9PigQgKxIlR6WtBKID576ZRcpe49phD1QFPFD oZdEkVkt6KHeLVmAXaDjx1PmXlK4NFi//BjYFH2+IOQERxBWtEo+re0uNBI9XqReFCoC pCgg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id g5-20020a1709065d0500b007ada2efa6b7si11520092ejt.567.2022.11.28.07.27.58; Mon, 28 Nov 2022 07:28:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231964AbiK1PYB (ORCPT <rfc822;gah0developer@gmail.com> + 99 others); Mon, 28 Nov 2022 10:24:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60164 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231551AbiK1PXc (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 28 Nov 2022 10:23:32 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 617072DFC; Mon, 28 Nov 2022 07:23:32 -0800 (PST) Received: from dggpeml500020.china.huawei.com (unknown [172.30.72.55]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4NLTk15F8fzHw8m; Mon, 28 Nov 2022 23:22:49 +0800 (CST) Received: from dggpeml500003.china.huawei.com (7.185.36.200) by dggpeml500020.china.huawei.com (7.185.36.88) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 28 Nov 2022 23:23:30 +0800 Received: from huawei.com (10.175.127.227) by dggpeml500003.china.huawei.com (7.185.36.200) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Mon, 28 Nov 2022 23:23:30 +0800 From: Li Nan <linan122@huawei.com> To: <tj@kernel.org>, <josef@toxicpanda.com>, <axboe@kernel.dk> CC: <cgroups@vger.kernel.org>, <linux-block@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linan122@huawei.com>, <yukuai3@huawei.com>, <yi.zhang@huawei.com> Subject: [PATCH -next 8/8] block: fix null-pointer dereference in ioc_pd_init Date: Mon, 28 Nov 2022 23:44:34 +0800 Message-ID: <20221128154434.4177442-9-linan122@huawei.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20221128154434.4177442-1-linan122@huawei.com> References: <20221128154434.4177442-1-linan122@huawei.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.175.127.227] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggpeml500003.china.huawei.com (7.185.36.200) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750754188451215720?= X-GMAIL-MSGID: =?utf-8?q?1750754188451215720?= |
Series |
iocost bugfix
|
|
Commit Message
Li Nan
Nov. 28, 2022, 3:44 p.m. UTC
Remove block device when iocost is initializing may cause
null-pointer dereference:
CPU1 CPU2
ioc_qos_write
blkcg_conf_open_bdev
blkdev_get_no_open
kobject_get_unless_zero
blk_iocost_init
rq_qos_add
del_gendisk
rq_qos_exit
q->rq_qos = rqos->next
//iocost is removed from q->roqs
blkcg_activate_policy
pd_init_fn
ioc_pd_init
ioc = q_to_ioc(blkg->q)
//cant find iocost and return null
Fix problem by moving rq_qos_exit() to disk_release(). ioc_qos_write() get
bd_device.kobj in blkcg_conf_open_bdev(), so disk_release will not be
actived until iocost initialization is complited.
Signed-off-by: Li Nan <linan122@huawei.com>
---
block/genhd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Hi Li, Thank you for the patch! Yet something to improve: [auto build test ERROR on next-20221128] url: https://github.com/intel-lab-lkp/linux/commits/Li-Nan/iocost-bugfix/20221128-232536 patch link: https://lore.kernel.org/r/20221128154434.4177442-9-linan122%40huawei.com patch subject: [PATCH -next 8/8] block: fix null-pointer dereference in ioc_pd_init config: arm-randconfig-r046-20221128 compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/44da423d235b6c23ecf6a38325d0429cfac20ee2 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Li-Nan/iocost-bugfix/20221128-232536 git checkout 44da423d235b6c23ecf6a38325d0429cfac20ee2 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): block/genhd.c: In function 'disk_release': >> block/genhd.c:1170:21: error: 'q' undeclared (first use in this function); did you mean 'rq'? 1170 | rq_qos_exit(q); | ^ | rq block/genhd.c:1170:21: note: each undeclared identifier is reported only once for each function it appears in vim +1170 block/genhd.c 1136 1137 /** 1138 * disk_release - releases all allocated resources of the gendisk 1139 * @dev: the device representing this disk 1140 * 1141 * This function releases all allocated resources of the gendisk. 1142 * 1143 * Drivers which used __device_add_disk() have a gendisk with a request_queue 1144 * assigned. Since the request_queue sits on top of the gendisk for these 1145 * drivers we also call blk_put_queue() for them, and we expect the 1146 * request_queue refcount to reach 0 at this point, and so the request_queue 1147 * will also be freed prior to the disk. 1148 * 1149 * Context: can sleep 1150 */ 1151 static void disk_release(struct device *dev) 1152 { 1153 struct gendisk *disk = dev_to_disk(dev); 1154 1155 might_sleep(); 1156 WARN_ON_ONCE(disk_live(disk)); 1157 1158 /* 1159 * To undo the all initialization from blk_mq_init_allocated_queue in 1160 * case of a probe failure where add_disk is never called we have to 1161 * call blk_mq_exit_queue here. We can't do this for the more common 1162 * teardown case (yet) as the tagset can be gone by the time the disk 1163 * is released once it was added. 1164 */ 1165 if (queue_is_mq(disk->queue) && 1166 test_bit(GD_OWNS_QUEUE, &disk->state) && 1167 !test_bit(GD_ADDED, &disk->state)) 1168 blk_mq_exit_queue(disk->queue); 1169 > 1170 rq_qos_exit(q); 1171 blkcg_exit_disk(disk); 1172 1173 bioset_exit(&disk->bio_split); 1174 1175 disk_release_events(disk); 1176 kfree(disk->random); 1177 disk_free_zone_bitmaps(disk); 1178 xa_destroy(&disk->part_tbl); 1179 1180 disk->queue->disk = NULL; 1181 blk_put_queue(disk->queue); 1182 1183 if (test_bit(GD_ADDED, &disk->state) && disk->fops->free_disk) 1184 disk->fops->free_disk(disk); 1185 1186 iput(disk->part0->bd_inode); /* frees the disk */ 1187 } 1188
On Mon, Nov 28, 2022 at 11:44:34PM +0800, Li Nan wrote:
> Fix problem by moving rq_qos_exit() to disk_release().
No, that now means it is removed to later. You need to add proper
synchronization.
Hi Li, Thank you for the patch! Yet something to improve: [auto build test ERROR on next-20221128] url: https://github.com/intel-lab-lkp/linux/commits/Li-Nan/iocost-bugfix/20221128-232536 patch link: https://lore.kernel.org/r/20221128154434.4177442-9-linan122%40huawei.com patch subject: [PATCH -next 8/8] block: fix null-pointer dereference in ioc_pd_init config: s390-randconfig-r044-20221128 compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6e4cea55f0d1104408b26ac574566a0e4de48036) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install s390 cross compiling tool for clang build # apt-get install binutils-s390x-linux-gnu # https://github.com/intel-lab-lkp/linux/commit/44da423d235b6c23ecf6a38325d0429cfac20ee2 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Li-Nan/iocost-bugfix/20221128-232536 git checkout 44da423d235b6c23ecf6a38325d0429cfac20ee2 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from block/genhd.c:28: In file included from block/blk-throttle.h:4: In file included from block/blk-cgroup-rwstat.h:9: In file included from block/blk-cgroup.h:20: In file included from include/linux/blk-mq.h:8: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __raw_readb(PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu' #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x)) ^ include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16' #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) ^ In file included from block/genhd.c:28: In file included from block/blk-throttle.h:4: In file included from block/blk-cgroup-rwstat.h:9: In file included from block/blk-cgroup.h:20: In file included from include/linux/blk-mq.h:8: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr)); ~~~~~~~~~~ ^ include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu' #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x)) ^ include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32' #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x)) ^ In file included from block/genhd.c:28: In file included from block/blk-throttle.h:4: In file included from block/blk-cgroup-rwstat.h:9: In file included from block/blk-cgroup.h:20: In file included from include/linux/blk-mq.h:8: In file included from include/linux/scatterlist.h:9: In file included from arch/s390/include/asm/io.h:75: include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writeb(value, PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr); ~~~~~~~~~~ ^ include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] readsl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesb(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesw(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic] writesl(PCI_IOBASE + addr, buffer, count); ~~~~~~~~~~ ^ >> block/genhd.c:1170:14: error: use of undeclared identifier 'q' rq_qos_exit(q); ^ 12 warnings and 1 error generated. vim +/q +1170 block/genhd.c 1136 1137 /** 1138 * disk_release - releases all allocated resources of the gendisk 1139 * @dev: the device representing this disk 1140 * 1141 * This function releases all allocated resources of the gendisk. 1142 * 1143 * Drivers which used __device_add_disk() have a gendisk with a request_queue 1144 * assigned. Since the request_queue sits on top of the gendisk for these 1145 * drivers we also call blk_put_queue() for them, and we expect the 1146 * request_queue refcount to reach 0 at this point, and so the request_queue 1147 * will also be freed prior to the disk. 1148 * 1149 * Context: can sleep 1150 */ 1151 static void disk_release(struct device *dev) 1152 { 1153 struct gendisk *disk = dev_to_disk(dev); 1154 1155 might_sleep(); 1156 WARN_ON_ONCE(disk_live(disk)); 1157 1158 /* 1159 * To undo the all initialization from blk_mq_init_allocated_queue in 1160 * case of a probe failure where add_disk is never called we have to 1161 * call blk_mq_exit_queue here. We can't do this for the more common 1162 * teardown case (yet) as the tagset can be gone by the time the disk 1163 * is released once it was added. 1164 */ 1165 if (queue_is_mq(disk->queue) && 1166 test_bit(GD_OWNS_QUEUE, &disk->state) && 1167 !test_bit(GD_ADDED, &disk->state)) 1168 blk_mq_exit_queue(disk->queue); 1169 > 1170 rq_qos_exit(q); 1171 blkcg_exit_disk(disk); 1172 1173 bioset_exit(&disk->bio_split); 1174 1175 disk_release_events(disk); 1176 kfree(disk->random); 1177 disk_free_zone_bitmaps(disk); 1178 xa_destroy(&disk->part_tbl); 1179 1180 disk->queue->disk = NULL; 1181 blk_put_queue(disk->queue); 1182 1183 if (test_bit(GD_ADDED, &disk->state) && disk->fops->free_disk) 1184 disk->fops->free_disk(disk); 1185 1186 iput(disk->part0->bd_inode); /* frees the disk */ 1187 } 1188
Hi, 在 2022/11/29 22:25, Christoph Hellwig 写道: > On Mon, Nov 28, 2022 at 11:44:34PM +0800, Li Nan wrote: >> Fix problem by moving rq_qos_exit() to disk_release(). > > No, that now means it is removed to later. You need to add proper > synchronization. > . > Can you explain a bit more? Maybe I'm being noob, here disk is about to be freed, and I can think of any contention. Thanks, Kuai
On Wed, Nov 30, 2022 at 09:32:58AM +0800, Yu Kuai wrote: > > No, that now means it is removed to later. You need to add proper > > synchronization. > > . > > > > Can you explain a bit more? Maybe I'm being noob, here disk is about to > be freed, and I can think of any contention. Right now we need synchronization with e.g. open_mutex and a check for a dead disk, which I suggst to add insted of creating a lifetime imbalance.
diff --git a/block/genhd.c b/block/genhd.c index dcf200bcbd3e..c264da49eaaa 100644 --- a/block/genhd.c +++ b/block/genhd.c @@ -656,7 +656,6 @@ void del_gendisk(struct gendisk *disk) elevator_exit(q); mutex_unlock(&q->sysfs_lock); } - rq_qos_exit(q); blk_mq_unquiesce_queue(q); /* @@ -1168,6 +1167,7 @@ static void disk_release(struct device *dev) !test_bit(GD_ADDED, &disk->state)) blk_mq_exit_queue(disk->queue); + rq_qos_exit(q); blkcg_exit_disk(disk); bioset_exit(&disk->bio_split);