Message ID | 20231127062116.2355129-2-yukuai1@huaweicloud.com |
---|---|
State | New |
Headers |
From: Yu Kuai <yukuai1@huaweicloud.com>
To: hch@infradead.org, ming.lei@redhat.com, axboe@kernel.dk, roger.pau@citrix.com, colyli@suse.de, kent.overstreet@gmail.com, joern@lazybastard.org, miquel.raynal@bootlin.com, richard@nod.at, vigneshr@ti.com, sth@linux.ibm.com, hoeppner@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, jejb@linux.ibm.com, martin.petersen@oracle.com, clm@fb.com, josef@toxicpanda.com, dsterba@suse.com, viro@zeniv.linux.org.uk, brauner@kernel.org, nico@fluxnic.net, xiang@kernel.org, chao@kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, agruenba@redhat.com, jack@suse.com, konishi.ryusuke@gmail.com, dchinner@redhat.com, linux@weissschuh.net, min15.li@samsung.com, yukuai3@huawei.com, dlemoal@kernel.org, willy@infradead.org, akpm@linux-foundation.org, hare@suse.de, p.raghav@samsung.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, xen-devel@lists.xenproject.org, linux-bcache@vger.kernel.org, linux-mtd@lists.infradead.org, linux-s390@vger.kernel.org, linux-scsi@vger.kernel.org, linux-bcachefs@vger.kernel.org, linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-erofs@lists.ozlabs.org, linux-ext4@vger.kernel.org, gfs2@lists.linux.dev, linux-nilfs@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH block/for-next v2 01/16] block: add a new helper to get inode from block_device
Date: Mon, 27 Nov 2023 14:21:01 +0800
Message-Id: <20231127062116.2355129-2-yukuai1@huaweicloud.com>
In-Reply-To: <20231127062116.2355129-1-yukuai1@huaweicloud.com>
References: <20231127062116.2355129-1-yukuai1@huaweicloud.com> |
Series |
block: remove field 'bd_inode' from block_device
|
|
Commit Message
Yu Kuai
Nov. 27, 2023, 6:21 a.m. UTC
From: Yu Kuai <yukuai3@huawei.com>

block_device is allocated from bdev_alloc() by bdev_alloc_inode(), and
currently block_device contains a pointer to its inode, even though the
two are allocated together in a single struct bdev_inode:

bdev_alloc
 inode = new_inode()
 // inode is &bdev_inode->vfs_inode
 bdev = I_BDEV(inode)
 // bdev is &bdev_inode->bdev
 bdev->bd_inode = inode

Add a new helper that computes the address of the inode from bdev with
an add operation (pointer arithmetic) instead of a memory access, which
is more efficient.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/bdev.c              |  5 -----
 include/linux/blk_types.h | 12 ++++++++++++
 2 files changed, 12 insertions(+), 5 deletions(-)
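The "add operation instead of memory access" idea in the commit message can be tried out in userspace. The following is only a minimal sketch under stated assumptions: the structures are cut down to one field each, container_of() is a simplified stand-in for the kernel macro (the real one also does type checking), and bdev_inode_demo() is a hypothetical name used for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Cut-down stand-ins; the real kernel structures have many more fields. */
struct inode { long i_size; };
struct block_device { int bd_partno; };

/* Both objects live in one allocation, as in block/bdev.c. */
struct bdev_inode {
	struct block_device bdev;
	struct inode vfs_inode;
};

/* inode -> bdev: subtract the offset of vfs_inode, then take &bdev. */
static inline struct block_device *I_BDEV(struct inode *inode)
{
	return &container_of(inode, struct bdev_inode, vfs_inode)->bdev;
}

/* bdev -> inode: pure address arithmetic, no pointer is loaded from memory. */
static inline struct inode *bdev_inode(struct block_device *bdev)
{
	struct bdev_inode *bi = container_of(bdev, struct bdev_inode, bdev);

	return &bi->vfs_inode;
}

/* Hypothetical demo: round-trip between the two embedded objects. */
long bdev_inode_demo(void)
{
	struct bdev_inode bi = {
		.bdev = { .bd_partno = 1 },
		.vfs_inode = { .i_size = 4096 },
	};

	assert(I_BDEV(&bi.vfs_inode) == &bi.bdev);
	assert(bdev_inode(&bi.bdev) == &bi.vfs_inode);
	return bdev_inode(&bi.bdev)->i_size;
}
```

Because the offset between the two members is a compile-time constant, the helper needs no stored bd_inode pointer at all, which is what lets the series remove the field.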
Comments
On Mon, Nov 27, 2023 at 02:21:01PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> block_device is allocated from bdev_alloc() by bdev_alloc_inode(), and
> currently block_device contains a pointer that points to the address of
> inode, while such inode is allocated together:

This is going the wrong way. Nothing outside of core block layer code
should ever directly use the bdev inode. We've been rather sloppy
and added a lot of direct references to it, but they really need to
go away and be replaced with well-defined high-level operations on
struct block_device. Once that is done we can remove the bd_inode
pointer, but replacing it with something that pokes even more deeply
into bdev internals is a bad idea.
Hi,

On 2023/11/27 15:21, Christoph Hellwig wrote:
> On Mon, Nov 27, 2023 at 02:21:01PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> block_device is allocated from bdev_alloc() by bdev_alloc_inode(), and
>> currently block_device contains a pointer that points to the address of
>> inode, while such inode is allocated together:
>
> This is going the wrong way. Nothing outside of core block layer code
> should ever directly use the bdev inode. We've been rather sloppy
> and added a lot of direct references to it, but they really need to
> go away and be replaced with well-defined high-level operations on
> struct block_device. Once that is done we can remove the bd_inode
> pointer, but replacing it with something that pokes even more deeply
> into bdev internals is a bad idea.

Thanks for the advice. However, after collecting how other modules are
using the bdev inode, I have two main questions:

1) Is it okay to add new helpers that take a bdev for the following APIs?
If so, then almost all filesystems and drivers can avoid accessing
bd_inode directly.

errseq_check(&bdev->bd_inode->i_mapping->wb_err, wb_err);
errseq_check_and_advance(&bdev->bd_inode->i_mapping->wb_err, &wb_err);
mapping_gfp_constraint(bdev->bd_inode->i_mapping, gfp);
i_size_read(bdev->bd_inode)
find_get_page(bdev->bd_inode->i_mapping, offset);
find_or_create_page(bdev->bd_inode->i_mapping, index, gfp);
read_cache_page_gfp(bdev->bd_inode->i_mapping, index, gfp);
invalidate_inode_pages2(bdev->bd_inode->i_mapping);
invalidate_inode_pages2_range(bdev->bd_inode->i_mapping, start, end);
read_mapping_folio(bdev->bd_inode->i_mapping, index, file);
read_mapping_page(bdev->bd_inode->i_mapping, index, file);
balance_dirty_pages_ratelimited(bdev->bd_inode->i_mapping)
file_ra_state_init(ra, bdev->bd_inode->i_mapping);
page_cache_sync_readahead(bdev->bd_inode->i_mapping, ra, file, index, req_count);
inode_to_bdi(bdev->bd_inode)

2) In fs/buffer.c there are some special usages like the following, for
which I don't think adding a helper is a good fit:

spin_lock(&bd_inode->i_mapping->private_lock);

Is it okay to move the following APIs from fs/buffer.c directly to
block/bdev.c?

__find_get_block
bdev_getblk

Thanks,
Kuai
On Mon, Nov 27, 2023 at 09:07:22PM +0800, Yu Kuai wrote:
> 1) Is it okay to add new helpers that take a bdev for the following APIs?

For some we already have them (e.g. bdev_nr_bytes to read the bdev
size), for some we need to add them. The big thing that seems to
stick out is the page cache API, and I think that is where we need to
define maintainable APIs for file systems and others to use the
block device page cache. Probably only in folio versions and not
pages, if we're touching the code anyway.

> 2) In fs/buffer.c there are some special usages like the following, for
> which I don't think adding a helper is a good fit:
>
> spin_lock(&bd_inode->i_mapping->private_lock);
>
> Is it okay to move the following APIs from fs/buffer.c directly to
> block/bdev.c?
>
> __find_get_block
> bdev_getblk

I'm not sure moving is a good idea, but we might end up with some kind
of low-level access from buffer.c, be that special helpers, a separate
header or something else. Let's sort out the rest of the kernel first.
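The direction suggested in this reply (callers go through helpers that take a struct block_device instead of dereferencing bd_inode) can be sketched in userspace as follows. This is only an illustration under stated assumptions: the structures are reduced mock-ups, bdev_nr_bytes() is modeled on the existing kernel helper, and bdev_gfp_constraint() and bdev_helpers_demo() are hypothetical names invented here, not APIs proposed in the thread.

```c
#include <assert.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/* Reduced mock-ups of the kernel structures, for illustration only. */
struct address_space { unsigned int gfp_mask; };
struct inode { long long i_size; struct address_space *i_mapping; };
struct block_device {
	unsigned long long bd_nr_sectors;
	struct inode *bd_inode;	/* the field the series wants to remove */
};

/* Modeled on the existing helper: size comes from the bdev, not the inode. */
static inline long long bdev_nr_bytes(const struct block_device *bdev)
{
	return (long long)(bdev->bd_nr_sectors << SECTOR_SHIFT);
}

/*
 * Hypothetical wrapper (name invented for this sketch): callers pass the
 * bdev, and only this one helper knows how to reach the mapping.
 */
static inline unsigned int bdev_gfp_constraint(const struct block_device *bdev,
					       unsigned int gfp)
{
	return bdev->bd_inode->i_mapping->gfp_mask & gfp;
}

/* Hypothetical demo: a caller that never touches bd_inode itself. */
long long bdev_helpers_demo(void)
{
	struct address_space mapping = { .gfp_mask = 0x0f };
	struct inode inode = { .i_size = 8 << SECTOR_SHIFT, .i_mapping = &mapping };
	struct block_device bdev = { .bd_nr_sectors = 8, .bd_inode = &inode };

	assert(bdev_nr_bytes(&bdev) == inode.i_size);
	assert(bdev_gfp_constraint(&bdev, 0x3c) == 0x0c);
	return bdev_nr_bytes(&bdev);
}
```

With every access funneled through helpers like these, the way a bdev reaches its inode and mapping becomes an internal detail that the block layer can change in one place.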
Hi,

On 2023/11/28 0:32, Christoph Hellwig wrote:
> On Mon, Nov 27, 2023 at 09:07:22PM +0800, Yu Kuai wrote:
>> 1) Is it okay to add new helpers that take a bdev for the following APIs?
>
> For some we already have them (e.g. bdev_nr_bytes to read the bdev
> size), for some we need to add them. The big thing that seems to
> stick out is the page cache API, and I think that is where we need to
> define maintainable APIs for file systems and others to use the
> block device page cache. Probably only in folio versions and not
> pages, if we're touching the code anyway.

Thanks for the advice! Just to check that I understand correctly: do you
mean that all other filesystems and drivers that are using the page
versions can safely switch to the folio versions now?

By the way, my original idea was to add a new field 'bd_flags' in
block_device, and then add a new bit so that bio_check_ro() will only
warn once for each partition. Now that this patchset will be quite
complex, I'll add a new bool field 'bd_ro_warned' to fix the above
problem first, and then add 'bd_flags' once this patchset is done.

Thanks,
Kuai

>> 2) In fs/buffer.c there are some special usages like the following, for
>> which I don't think adding a helper is a good fit:
>>
>> spin_lock(&bd_inode->i_mapping->private_lock);
>>
>> Is it okay to move the following APIs from fs/buffer.c directly to
>> block/bdev.c?
>>
>> __find_get_block
>> bdev_getblk
>
> I'm not sure moving is a good idea, but we might end up with some kind
> of low-level access from buffer.c, be that special helpers, a separate
> header or something else. Let's sort out the rest of the kernel first.
On Tue, Nov 28, 2023 at 09:35:56AM +0800, Yu Kuai wrote:
> Thanks for the advice! Just to check that I understand correctly: do you
> mean that all other filesystems and drivers that are using the page
> versions can safely switch to the folio versions now?

If you never allocate a high-order folio, pages are identical to folios.
So yes, we can do folio-based interfaces only, and also use that as
an opportunity to convert over the callers.

> By the way, my original idea was to add a new field 'bd_flags' in
> block_device, and then add a new bit so that bio_check_ro() will only
> warn once for each partition. Now that this patchset will be quite
> complex, I'll add a new bool field 'bd_ro_warned' to fix the above
> problem first, and then add 'bd_flags' once this patchset is done.

Yes, please do a minimal version if you can find space where the
rmw cycles don't cause damage to neighbouring fields. Or just leave
the current set of warnings in if it's too hard.
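The warn-once idea discussed above can be sketched in userspace like this. It is a rough illustration only: block_device_demo is a mock structure, bd_ro_warned is the field name taken from the thread, and warn_ro_write_once() and ro_warn_demo() are hypothetical names; the real bio_check_ro() does more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Mock structure for illustration; not the real struct block_device. */
struct block_device_demo {
	bool bd_read_only;
	/*
	 * A separate bool is its own memory location in C, so setting it is
	 * a plain store on most architectures. A bit in a shared flags word
	 * would need a read-modify-write that can clobber concurrent updates
	 * to neighbouring bits unless it is serialized or done atomically.
	 */
	bool bd_ro_warned;
};

/* Hypothetical helper: returns true only when a warning was emitted. */
static bool warn_ro_write_once(struct block_device_demo *bdev)
{
	if (!bdev->bd_read_only || bdev->bd_ro_warned)
		return false;

	bdev->bd_ro_warned = true;
	fprintf(stderr, "Trying to write to read-only block-device\n");
	return true;
}

/* Hypothetical demo: three writes to a read-only partition, one warning. */
int ro_warn_demo(void)
{
	struct block_device_demo part = { .bd_read_only = true };
	int warnings = 0;

	for (int i = 0; i < 3; i++)
		warnings += warn_ro_write_once(&part);

	assert(part.bd_ro_warned);
	return warnings;
}
```

The sketch ignores concurrency: two writers racing on the first write could both print the warning, which is harmless for a diagnostic message but is the kind of trade-off the placement of the new field has to account for.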
diff --git a/block/bdev.c b/block/bdev.c
index e4cfb7adb645..7509389095b7 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -30,11 +30,6 @@
 #include "../fs/internal.h"
 #include "blk.h"
 
-struct bdev_inode {
-	struct block_device bdev;
-	struct inode vfs_inode;
-};
-
 static inline struct bdev_inode *BDEV_I(struct inode *inode)
 {
 	return container_of(inode, struct bdev_inode, vfs_inode);
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d5c5e59ddbd2..06de8393dcd1 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -85,6 +85,18 @@ struct block_device {
 #define bdev_kobj(_bdev) \
 	(&((_bdev)->bd_device.kobj))
 
+struct bdev_inode {
+	struct block_device bdev;
+	struct inode vfs_inode;
+};
+
+static inline struct inode *bdev_inode(struct block_device *bdev)
+{
+	struct bdev_inode *bi = container_of(bdev, struct bdev_inode, bdev);
+
+	return &bi->vfs_inode;
+}
+
 /*
  * Block error status values.  See block/blk-core:blk_errors for the details.
  * Alpha cannot write a byte atomically, so we need to use 32-bit value.