From patchwork Fri Jan 6 12:53:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 40118
From: Jingbo Xu
To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org
Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH 1/6] erofs: remove unused device mapping in the meta routine
Date: Fri, 6 Jan 2023 20:53:25 +0800
Message-Id: <20230106125330.55529-2-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
References: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Currently two anonymous inodes (inode and anon_inode in struct
erofs_fscache) may be used for each blob. The former is only for
bootstrap and is used as the address_space of the page cache, while the
latter is used for both bootstrap and data blobs when share domain mode
is enabled, and behaves as a sentinel in the shared domain.

In preparation for the upcoming page cache sharing support, a following
patch will unify these two anonymous inodes. That is, the unified
anonymous inode acts not only as the address_space of the page cache,
but also as a sentinel in share domain mode.

However, the current meta routine can't work once that change is
applied. The current meta routine performs a device mapping, which
requires the superblock of the filesystem. Currently the superblock is
derived from the input meta folio, which is reasonable since the
anonymous inode (used for the address_space of the page cache) is always
allocated from the filesystem's sb. After the anonymous inodes are
unified, that is no longer always true. For example, in share domain
mode the unified anonymous inode will be allocated from the pseudo mnt,
and the superblock derived from the folio is actually a pseudo sb, which
can't be used for the device mapping at all.

As for the meta routine itself, metadata currently always resides on
bootstrap, which means the device mapping is not needed so far. After
removing the redundant device mapping logic, we can derive the required
fscache_ctx from the anonymous inode's i_private.

Signed-off-by: Jingbo Xu
---
 fs/erofs/fscache.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 014e20962376..03de4dc99302 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -164,18 +164,8 @@ static int erofs_fscache_read_folios_async(struct fscache_cookie *cookie,
 static int erofs_fscache_meta_read_folio(struct file *data, struct folio *folio)
 {
 	int ret;
-	struct super_block *sb = folio_mapping(folio)->host->i_sb;
+	struct erofs_fscache *ctx = folio_mapping(folio)->host->i_private;
 	struct erofs_fscache_request *req;
-	struct erofs_map_dev mdev = {
-		.m_deviceid = 0,
-		.m_pa = folio_pos(folio),
-	};
-
-	ret = erofs_map_dev(sb, &mdev);
-	if (ret) {
-		folio_unlock(folio);
-		return ret;
-	}
 
 	req = erofs_fscache_req_alloc(folio_mapping(folio),
 				folio_pos(folio), folio_size(folio));
@@ -184,8 +174,8 @@ static int erofs_fscache_meta_read_folio(struct file *data, struct folio *folio)
 		return PTR_ERR(req);
 	}
 
-	ret = erofs_fscache_read_folios_async(mdev.m_fscache->cookie,
-				req, mdev.m_pa, folio_size(folio));
+	ret = erofs_fscache_read_folios_async(ctx->cookie, req,
+				folio_pos(folio), folio_size(folio));
 	if (ret)
 		req->error = ret;
 
@@ -469,6 +459,7 @@ struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
 		inode->i_size = OFFSET_MAX;
 		inode->i_mapping->a_ops = &erofs_fscache_meta_aops;
 		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
+		inode->i_private = ctx;
 
 		ctx->inode = inode;
 	}

From patchwork Fri Jan 6 12:53:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 40116
From: Jingbo Xu
To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org
Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH 2/6] erofs: unify anonymous inodes for blob
Date: Fri, 6 Jan 2023 20:53:26 +0800
Message-Id: <20230106125330.55529-3-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
References: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Currently only bootstrap allocates an anonymous inode for the
address_space of the page cache. In preparation for the upcoming page
cache sharing support, as the first step, always allocate an anonymous
inode for this purpose for both bootstrap and data blobs.

Since an anonymous inode is now also allocated for data blobs, release
these anonymous inodes when .put_super() is called, or we'll get a
"VFS: Busy inodes after unmount." warning.

Similarly, the fscache contexts for data blobs are initialized prior to
the root inode, thus .kill_sb() shall also contain the cleanup routine,
so that these fscache contexts can be cleaned up when mount fails while
the root inode has not been initialized yet.

Also remove the redundant set_nlink() when initializing the anonymous
inode, since i_nlink has already been initialized to 1 when the inode
gets allocated.

At that point two anonymous inodes (inode and anon_inode in struct
erofs_fscache) may be used for each blob. The former is used as the
address_space of the page cache, while the latter is used as a sentinel
in the shared domain. In preparation for the upcoming page cache sharing
support, unify these two anonymous inodes.

Signed-off-by: Jingbo Xu
---
 fs/erofs/fscache.c  | 97 +++++++++++++++++++++------------------------
 fs/erofs/internal.h |  6 +--
 fs/erofs/super.c    |  2 +
 3 files changed, 49 insertions(+), 56 deletions(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 03de4dc99302..4d7785a70926 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -318,8 +318,6 @@ const struct address_space_operations erofs_fscache_access_aops = {
 
 static void erofs_fscache_domain_put(struct erofs_domain *domain)
 {
-	if (!domain)
-		return;
 	mutex_lock(&erofs_domain_list_lock);
 	if (refcount_dec_and_test(&domain->ref)) {
 		list_del(&domain->list);
@@ -423,12 +421,13 @@ static int erofs_fscache_register_domain(struct super_block *sb)
 
 static
 struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
-						    char *name,
-						    unsigned int flags)
+						    char *name)
 {
 	struct fscache_volume *volume = EROFS_SB(sb)->volume;
 	struct erofs_fscache *ctx;
 	struct fscache_cookie *cookie;
+	struct super_block *psb = sb;
+	struct inode *inode;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -442,33 +441,29 @@ struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
 		ret = -EINVAL;
 		goto err;
 	}
-
 	fscache_use_cookie(cookie, false);
-	ctx->cookie = cookie;
-
-	if (flags & EROFS_REG_COOKIE_NEED_INODE) {
-		struct inode *const inode = new_inode(sb);
 
-		if (!inode) {
-			erofs_err(sb, "failed to get anon inode for %s", name);
-			ret = -ENOMEM;
-			goto err_cookie;
-		}
-
-		set_nlink(inode, 1);
-		inode->i_size = OFFSET_MAX;
-		inode->i_mapping->a_ops = &erofs_fscache_meta_aops;
-		mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
-		inode->i_private = ctx;
-
-		ctx->inode = inode;
+	if (EROFS_SB(sb)->domain_id)
+		psb = erofs_pseudo_mnt->mnt_sb;
+	inode = new_inode(psb);
+	if (!inode) {
+		erofs_err(sb, "failed to get anon inode for %s", name);
+		ret = -ENOMEM;
+		goto err_cookie;
 	}
 
+	inode->i_size = OFFSET_MAX;
+	inode->i_mapping->a_ops = &erofs_fscache_meta_aops;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
+	inode->i_private = ctx;
+
+	ctx->cookie = cookie;
+	ctx->inode = inode;
 	return ctx;
 
 err_cookie:
-	fscache_unuse_cookie(ctx->cookie, NULL, NULL);
-	fscache_relinquish_cookie(ctx->cookie, false);
+	fscache_unuse_cookie(cookie, NULL, NULL);
+	fscache_relinquish_cookie(cookie, false);
 err:
 	kfree(ctx);
 	return ERR_PTR(ret);
@@ -485,38 +480,34 @@ static void erofs_fscache_relinquish_cookie(struct erofs_fscache *ctx)
 
 static
 struct erofs_fscache *erofs_fscache_domain_init_cookie(struct super_block *sb,
-						       char *name,
-						       unsigned int flags)
+						       char *name)
 {
-	int err;
-	struct inode *inode;
 	struct erofs_fscache *ctx;
 	struct erofs_domain *domain = EROFS_SB(sb)->domain;
 
-	ctx = erofs_fscache_acquire_cookie(sb, name, flags);
+	ctx = erofs_fscache_acquire_cookie(sb, name);
 	if (IS_ERR(ctx))
 		return ctx;
 
 	ctx->name = kstrdup(name, GFP_KERNEL);
 	if (!ctx->name) {
-		err = -ENOMEM;
-		goto out;
+		erofs_fscache_relinquish_cookie(ctx);
+		return ERR_PTR(-ENOMEM);
 	}
 
-	inode = new_inode(erofs_pseudo_mnt->mnt_sb);
-	if (!inode) {
-		err = -ENOMEM;
-		goto out;
-	}
+	/*
+	 * In share domain scenarios, the unified anonymous inode is not only
+	 * used as the address_space of shared page cache, but also a sentinel
+	 * in pseudo mount. The initial refcount is used for the former and
+	 * will be killed when the cookie finally gets relinquished. For the
+	 * latter, the refcount is increased every time the cookie in the domain
+	 * gets referred to.
+	 */
+	igrab(ctx->inode);
 
 	ctx->domain = domain;
-	ctx->anon_inode = inode;
-	inode->i_private = ctx;
 	refcount_inc(&domain->ref);
 	return ctx;
-out:
-	erofs_fscache_relinquish_cookie(ctx);
-	return ERR_PTR(err);
 }
 
 static
@@ -547,7 +538,7 @@ struct erofs_fscache *erofs_domain_register_cookie(struct super_block *sb,
 		return ctx;
 	}
 	spin_unlock(&psb->s_inode_list_lock);
-	ctx = erofs_fscache_domain_init_cookie(sb, name, flags);
+	ctx = erofs_fscache_domain_init_cookie(sb, name);
 	mutex_unlock(&erofs_domain_cookies_lock);
 	return ctx;
 }
@@ -558,7 +549,7 @@ struct erofs_fscache *erofs_fscache_register_cookie(struct super_block *sb,
 {
 	if (EROFS_SB(sb)->domain_id)
 		return erofs_domain_register_cookie(sb, name, flags);
-	return erofs_fscache_acquire_cookie(sb, name, flags);
+	return erofs_fscache_acquire_cookie(sb, name);
 }
 
 void erofs_fscache_unregister_cookie(struct erofs_fscache *ctx)
@@ -568,18 +559,21 @@ void erofs_fscache_unregister_cookie(struct erofs_fscache *ctx)
 
 	if (!ctx)
 		return;
+
 	domain = ctx->domain;
 	if (domain) {
 		mutex_lock(&erofs_domain_cookies_lock);
-		drop = atomic_read(&ctx->anon_inode->i_count) == 1;
-		iput(ctx->anon_inode);
+		/* drop the ref for the sentinel in pseudo mount */
+		iput(ctx->inode);
+		drop = atomic_read(&ctx->inode->i_count) == 1;
+		if (drop)
+			erofs_fscache_relinquish_cookie(ctx);
 		mutex_unlock(&erofs_domain_cookies_lock);
-		if (!drop)
-			return;
+		if (drop)
+			erofs_fscache_domain_put(domain);
+	} else {
+		erofs_fscache_relinquish_cookie(ctx);
 	}
-
-	erofs_fscache_relinquish_cookie(ctx);
-	erofs_fscache_domain_put(domain);
 }
 
 int erofs_fscache_register_fs(struct super_block *sb)
@@ -587,7 +581,7 @@ int erofs_fscache_register_fs(struct super_block *sb)
 	int ret;
 	struct erofs_sb_info *sbi = EROFS_SB(sb);
 	struct erofs_fscache *fscache;
-	unsigned int flags;
+	unsigned int flags = 0;
 
 	if (sbi->domain_id)
 		ret = erofs_fscache_register_domain(sb);
@@ -606,7 +600,6 @@ int erofs_fscache_register_fs(struct super_block *sb)
 	 *
 	 * Acquired domain/volume will be relinquished in kill_sb() on error.
 	 */
-	flags = EROFS_REG_COOKIE_NEED_INODE;
 	if (sbi->domain_id)
 		flags |= EROFS_REG_COOKIE_NEED_NOEXIST;
 	fscache = erofs_fscache_register_cookie(sb, sbi->fsid, flags);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index bb8501c0ff5b..b3d04bc2d279 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -108,8 +108,7 @@ struct erofs_domain {
 
 struct erofs_fscache {
 	struct fscache_cookie *cookie;
-	struct inode *inode;
-	struct inode *anon_inode;
+	struct inode *inode;	/* anonymous indoe for the blob */
 	struct erofs_domain *domain;
 	char *name;
 };
@@ -604,8 +603,7 @@ static inline int z_erofs_load_lzma_config(struct super_block *sb,
 #endif	/* !CONFIG_EROFS_FS_ZIP */
 
 /* flags for erofs_fscache_register_cookie() */
-#define EROFS_REG_COOKIE_NEED_INODE	1
-#define EROFS_REG_COOKIE_NEED_NOEXIST	2
+#define EROFS_REG_COOKIE_NEED_NOEXIST	1
 
 /* fscache.c */
 #ifdef CONFIG_EROFS_FS_ONDEMAND
diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index 481788c24a68..4ce00285a2e4 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -970,6 +970,8 @@ static void erofs_put_super(struct super_block *sb)
 	iput(sbi->packed_inode);
 	sbi->packed_inode = NULL;
 #endif
+	erofs_free_dev_context(sbi->devs);
+	sbi->devs = NULL;
 	erofs_fscache_unregister_fs(sb);
 }
 

From patchwork Fri Jan 6 12:53:27 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 40115
From: Jingbo Xu
To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org
Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH 3/6] erofs: alloc anonymous file for blob in share domain mode
Date: Fri, 6 Jan 2023 20:53:27 +0800
Message-Id: <20230106125330.55529-4-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
References: <20230106125330.55529-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

In preparation for the upcoming mmap support based on page cache
sharing, allocate an anonymous file for each blob, so that the
associated vma can be linked to the blob later.

Since page cache sharing will be enabled only in share domain mode,
prepare the anonymous file only in share domain mode.

Signed-off-by: Jingbo Xu
---
 fs/erofs/fscache.c  | 24 +++++++++++++++++++++++-
 fs/erofs/internal.h |  1 +
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 4d7785a70926..ea276884f043 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -4,6 +4,8 @@
  * Copyright (C) 2022, Bytedance Inc. All rights reserved.
  */
 #include
+#include
+#include
 #include "internal.h"
 
 static DEFINE_MUTEX(erofs_domain_list_lock);
@@ -316,6 +318,8 @@ const struct address_space_operations erofs_fscache_access_aops = {
 	.readahead = erofs_fscache_readahead,
 };
 
+static const struct file_operations erofs_fscache_meta_fops = {};
+
 static void erofs_fscache_domain_put(struct erofs_domain *domain)
 {
 	mutex_lock(&erofs_domain_list_lock);
@@ -428,6 +432,7 @@ struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
 	struct fscache_cookie *cookie;
 	struct super_block *psb = sb;
 	struct inode *inode;
+	struct file *file;
 	int ret;
 
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
@@ -457,10 +462,24 @@ struct erofs_fscache *erofs_fscache_acquire_cookie(struct super_block *sb,
 	mapping_set_gfp_mask(inode->i_mapping, GFP_NOFS);
 	inode->i_private = ctx;
 
+	if (EROFS_SB(sb)->domain_id) {
+		ihold(inode);
+		file = alloc_file_pseudo(inode, erofs_pseudo_mnt, "[erofs]",
+				O_RDONLY, &erofs_fscache_meta_fops);
+		if (IS_ERR(file)) {
+			ret = PTR_ERR(file);
+			iput(inode);
+			goto err_inode;
+		}
+		ctx->file = file;
+	}
+
 	ctx->cookie = cookie;
 	ctx->inode = inode;
 	return ctx;
 
+err_inode:
+	iput(inode);
 err_cookie:
 	fscache_unuse_cookie(cookie, NULL, NULL);
 	fscache_relinquish_cookie(cookie, false);
@@ -473,6 +492,8 @@ static void erofs_fscache_relinquish_cookie(struct erofs_fscache *ctx)
 {
 	fscache_unuse_cookie(ctx->cookie, NULL, NULL);
 	fscache_relinquish_cookie(ctx->cookie, false);
+	if (ctx->file)
+		fput(ctx->file);
 	iput(ctx->inode);
 	kfree(ctx->name);
 	kfree(ctx);
@@ -565,7 +586,8 @@ void erofs_fscache_unregister_cookie(struct erofs_fscache *ctx)
 		mutex_lock(&erofs_domain_cookies_lock);
 		/* drop the ref for the sentinel in pseudo mount */
 		iput(ctx->inode);
-		drop = atomic_read(&ctx->inode->i_count) == 1;
+		/* one initial ref, and one ref for anonymous file */
+		drop = atomic_read(&ctx->inode->i_count) == 2;
 		if (drop)
 			erofs_fscache_relinquish_cookie(ctx);
 		mutex_unlock(&erofs_domain_cookies_lock);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index b3d04bc2d279..24d471fe2fa4 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -109,6 +109,7 @@ struct erofs_domain {
 struct erofs_fscache {
 	struct fscache_cookie *cookie;
 	struct inode *inode;	/* anonymous indoe for the blob */
+	struct file *file;	/* anonymous file */
 	struct erofs_domain *domain;
 	char *name;
 };

From patchwork Fri Jan 6 12:53:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 40117
ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 206-20020a6302d7000000b004a92e6dba98si1358727pgc.624.2023.01.06.04.55.02; Fri, 06 Jan 2023 04:55:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234230AbjAFMyL (ORCPT + 99 others); Fri, 6 Jan 2023 07:54:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41244 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233792AbjAFMxk (ORCPT ); Fri, 6 Jan 2023 07:53:40 -0500 Received: from out30-54.freemail.mail.aliyun.com (out30-54.freemail.mail.aliyun.com [115.124.30.54]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 21E7D66981; Fri, 6 Jan 2023 04:53:37 -0800 (PST) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R111e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=6;SR=0;TI=SMTPD_---0VZ-Jgw5_1673009614; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0VZ-Jgw5_1673009614) by smtp.aliyun-inc.com; Fri, 06 Jan 2023 20:53:35 +0800 From: Jingbo Xu To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org, 
linux-fsdevel@vger.kernel.org Subject: [RFC PATCH 4/6] erofs: implement .read_iter for page cache sharing Date: Fri, 6 Jan 2023 20:53:28 +0800 Message-Id: <20230106125330.55529-5-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.19.1.6.gb485710b In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com> References: <20230106125330.55529-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1754277834979241135?= X-GMAIL-MSGID: =?utf-8?q?1754277834979241135?= When page cache sharing enabled, page caches are managed in the address space of blobs rather than erofs inodes. All erofs inodes sharing one chunk will refer to and share the page cache in the blob's address space. 
Signed-off-by: Jingbo Xu
---
 fs/erofs/fscache.c  | 64 +++++++++++++++++++++++++++++++++++++++++++++
 fs/erofs/internal.h |  1 +
 2 files changed, 65 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index ea276884f043..1f2a42dd1590 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -320,6 +320,70 @@ const struct address_space_operations erofs_fscache_access_aops = {
 
 static const struct file_operations erofs_fscache_meta_fops = {};
 
+static ssize_t erofs_fscache_share_file_read_iter(struct kiocb *iocb,
+						  struct iov_iter *to)
+{
+	struct file *filp = iocb->ki_filp;
+	struct inode *inode = file_inode(filp);
+	/* since page cache sharing is enabled only when i_size <= chunk_size */
+	struct erofs_map_blocks map = {}; /* .m_la = 0 */
+	struct erofs_map_dev mdev;
+	struct folio *folio;
+	ssize_t already_read = 0;
+	int ret = 0;
+
+	/* no need to take the (shared) inode lock on a read-only filesystem */
+	if (!iov_iter_count(to))
+		return 0;
+
+	if (IS_DAX(inode) || iocb->ki_flags & IOCB_DIRECT)
+		return -EOPNOTSUPP;
+
+	ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
+	if (ret)
+		return ret;
+
+	mdev = (struct erofs_map_dev) {
+		.m_deviceid = map.m_deviceid,
+		.m_pa = map.m_pa,
+	};
+	ret = erofs_map_dev(inode->i_sb, &mdev);
+	if (ret)
+		return ret;
+
+	do {
+		size_t bytes, copied, offset, fsize;
+		pgoff_t index = (mdev.m_pa >> PAGE_SHIFT) + (iocb->ki_pos >> PAGE_SHIFT);
+
+		folio = read_cache_folio(mdev.m_fscache->inode->i_mapping, index, NULL, NULL);
+		if (IS_ERR(folio)) {
+			ret = PTR_ERR(folio);
+			break;
+		}
+
+		fsize = folio_size(folio);
+		offset = iocb->ki_pos & (fsize - 1);
+		bytes = min_t(size_t, inode->i_size - iocb->ki_pos, iov_iter_count(to));
+		bytes = min_t(size_t, bytes, fsize - offset);
+		copied = copy_folio_to_iter(folio, offset, bytes, to);
+		folio_put(folio);
+		iocb->ki_pos += copied;
+		already_read += copied;
+		if (copied < bytes) {
+			ret = -EFAULT;
+			break;
+		}
+	} while (iov_iter_count(to) && iocb->ki_pos < inode->i_size);
+
+	file_accessed(filp);
+	return already_read ? already_read : ret;
+}
+
+const struct file_operations erofs_fscache_share_file_fops = {
+	.llseek		= generic_file_llseek,
+	.read_iter	= erofs_fscache_share_file_read_iter,
+};
+
 static void erofs_fscache_domain_put(struct erofs_domain *domain)
 {
 	mutex_lock(&erofs_domain_list_lock);
diff --git a/fs/erofs/internal.h b/fs/erofs/internal.h
index 24d471fe2fa4..386e2fd4c025 100644
--- a/fs/erofs/internal.h
+++ b/fs/erofs/internal.h
@@ -617,6 +617,7 @@ struct erofs_fscache *erofs_fscache_register_cookie(struct super_block *sb,
 void erofs_fscache_unregister_cookie(struct erofs_fscache *fscache);
 
 extern const struct address_space_operations erofs_fscache_access_aops;
+extern const struct file_operations erofs_fscache_share_file_fops;
 #else
 static inline int erofs_fscache_register_fs(struct super_block *sb)
 {

From patchwork Fri Jan 6 12:53:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu X-Patchwork-Id: 40119 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp808440wrt; Fri, 6 Jan 2023 04:56:45 -0800 (PST) X-Google-Smtp-Source: AMrXdXtJ+dUnWzIQfYH9DuQUjAg/vpGYc4ZIvCcQWjFyT0Mq4qx7PAhmj2mxR6XNetngszLybb61 X-Received: by 2002:a17:90a:e7c1:b0:226:23e4:fd9e with SMTP id kb1-20020a17090ae7c100b0022623e4fd9emr31816962pjb.18.1673009805571; Fri, 06 Jan 2023 04:56:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673009805; cv=none; d=google.com; s=arc-20160816; b=dR/zxkK0ciLLdHRLI/kFQ1wrvM+qsovYxBW9TVeiGdFwKZELcHCVHnZu3x/JC++ICA t1Q1PkLHrQ034OuatY56l6ulRUBL2NkOVMUPBMlUDZlEGV9tHEtI/PPLtQ4OFb+My9vZ EGn/LseqNt+/LS19cvW6kkB0a6ahKiliYl7d8qkGJ40jvtr5cRyC1ZpkXhyst1Er6M9T xjQQ+ZrDFlQdmWfZcu5lBGwi7R/bqhB7qFv6Bs+O5DKWGTNa5v55qSHHaLyGuVlbt3h8 4oC22qFo6y2RYhLj1qW6xARL1MeD6+vzR9UCLbm0UJyauMgDJ6tE3qIErsDZ0DgPwlmn qU0Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=srHps/lUyKaEXgs7RIdre+N+YWkkeUNqcr0IA+cF1k4=; b=uSv5VTUEoqOa2H67YtNdpColgYaN0jXttaDUTixq1uBC6e25zW8kcT52kbKI9oHeUv 1LiClkNggVzfG4iGxRCaQa0tfratqEopiliecl3dIIJfKWMiZWdmxNeqNhnhkueaZVIU 3mc7U77BI3wwGqFo72H0wJ8gKCnRSS7wWpqUIS6MenW1pnXo8WCe2kOu7zXfQNBJH4Aj r2xRL+wGDAjtxPcekCXLAkJ4t3XZJ50zf+o5HEY7PBMSkc45/Jqc3bbUkN4u6T00EQNN gSslpMpx+CsCNy41PciFqrwhRpY/mZ7hZX4dOA4LR9KoKCobDnoxpkAY1CGXYxzGLzRu jp5g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e11-20020a17090a9a8b00b0020d4dc7fa97si1286879pjp.110.2023.01.06.04.56.32; Fri, 06 Jan 2023 04:56:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234252AbjAFMyQ (ORCPT + 99 others); Fri, 6 Jan 2023 07:54:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41462 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233796AbjAFMxl (ORCPT ); Fri, 6 Jan 2023 07:53:41 -0500 Received: from out199-3.us.a.mail.aliyun.com (out199-3.us.a.mail.aliyun.com [47.90.199.3]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5B6B736E8; Fri, 6 Jan 2023 04:53:39 -0800 (PST) 
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R421e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=6;SR=0;TI=SMTPD_---0VZ-Iakh_1673009615; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0VZ-Iakh_1673009615) by smtp.aliyun-inc.com; Fri, 06 Jan 2023 20:53:36 +0800 From: Jingbo Xu To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH 5/6] erofs: implement .mmap for page cache sharing Date: Fri, 6 Jan 2023 20:53:29 +0800 Message-Id: <20230106125330.55529-6-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.19.1.6.gb485710b In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com> References: <20230106125330.55529-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1754277930047750364?= X-GMAIL-MSGID: =?utf-8?q?1754277930047750364?=

In mmap(2), replace vma->vm_file with the anonymous file associated with
the blob, so that the vma is linked to the address_space of the blob.

One thing worth noting is that we return an error early in mmap(2) if
users attempt to map beyond the file size.  Normally filesystems don't
restrict this in mmap(2): the check is done in the fault handler, and
SIGBUS is delivered to users if they actually access the area beyond the
end of the file.  However, since vma->vm_file has been changed to the
anonymous file in mmap(2), there is no way to derive the size of the
original file in the fault handler.  As the file size is immutable in a
read-only filesystem, let's fail early in mmap(2) in this case.

Signed-off-by: Jingbo Xu
---
 fs/erofs/fscache.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/fs/erofs/fscache.c b/fs/erofs/fscache.c
index 1f2a42dd1590..98341d4d9c0d 100644
--- a/fs/erofs/fscache.c
+++ b/fs/erofs/fscache.c
@@ -379,9 +379,46 @@ static ssize_t erofs_fscache_share_file_read_iter(struct kiocb *iocb,
 	return already_read ? already_read : ret;
 }
 
+static const struct vm_operations_struct erofs_fscache_share_file_vm_ops = {
+	.fault = filemap_fault,
+};
+
+static int erofs_fscache_share_file_mmap(struct file *file,
+					 struct vm_area_struct *vma)
+{
+	struct inode *inode = file_inode(file);
+	/* since page cache sharing is enabled only when i_size <= chunk_size */
+	struct erofs_map_blocks map = {}; /* .m_la = 0 */
+	struct erofs_map_dev mdev;
+	int ret;
+
+	if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
+		return -EINVAL;
+
+	ret = erofs_map_blocks(inode, &map, EROFS_GET_BLOCKS_RAW);
+	if (ret)
+		return ret;
+
+	mdev = (struct erofs_map_dev) {
+		.m_deviceid = map.m_deviceid,
+		.m_pa = map.m_pa,
+	};
+	ret = erofs_map_dev(inode->i_sb, &mdev);
+	if (ret)
+		return ret;
+
+	vma_set_file(vma, mdev.m_fscache->file);
+	vma->vm_pgoff = (mdev.m_pa >> PAGE_SHIFT) + vma->vm_pgoff;
+	vma->vm_ops = &erofs_fscache_share_file_vm_ops;
+
+	file_accessed(file);
+	return 0;
+}
+
 const struct file_operations erofs_fscache_share_file_fops = {
 	.llseek		= generic_file_llseek,
 	.read_iter	= erofs_fscache_share_file_read_iter,
+	.mmap		= erofs_fscache_share_file_mmap,
 };
 
 static void erofs_fscache_domain_put(struct erofs_domain *domain)

From patchwork Fri Jan 6 12:53:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jingbo Xu 
X-Patchwork-Id: 40120 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp808441wrt; Fri, 6 Jan 2023 04:56:45 -0800 (PST) X-Google-Smtp-Source: AMrXdXs39435VsBXwQbQVHqywp9ryxr8492dlQMCjjPR47DoiLExY+jpLy4ZtRQAGyLHPa8sc1VK X-Received: by 2002:a05:6a00:1d03:b0:580:149a:5650 with SMTP id a3-20020a056a001d0300b00580149a5650mr58381192pfx.22.1673009805576; Fri, 06 Jan 2023 04:56:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673009805; cv=none; d=google.com; s=arc-20160816; b=Iw65Gi+GelalpX1JHOxKlmm77f83V8S78uoH92zEMFct/Zz7zhcq75IpkOzksxEh/l njvC7rFwAStDssPbChySn9egPcDdT7cAkJ4oTJcUB8T/p+jf8DGtN8JLrImmtLAyiCXc 4je6F6Jk2ed0koTZHAUocnE4G1MBCMfox802QyBMK6rV6JK4aRPofK80Zhx0bKIDOXJq Rqschxwvivj9tJ/W8KN+0FYIkziNbMBSkm+CbwcsuL5JcB5+vTcvwoDBwyR2mOLamlPu /4FoJQfZy1eSjBtydD0k+SHi6oYprrXXul5aOQd+eY6nNl8ICYVrgac6GiZFpBuxOZwy UyzA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=PSEiY+7Ervdpu4TtZoid50RmtZ2hhcRRzf8ShkPHGpQ=; b=aNTImwZC5ui5cROcxZ65UeWHlZDWNVUJY6wof+/WOZAMbXuUAnC94Z/4c5m/qUV8ro 35t/2dzsYEh9XNjSRtOmbiuUY1xp50Nw6MDNcSNQ2vc9874jIH3+1gdOn4gNUbESAPEq LYPlcYnklachBBHZdcNNp4FCindclVrftz7sJa5tLVYFuRzQSekP0oi/M4A6BSoxON0q r0/uKAzNANeeuVk052/zzIys2aQ4qEBDbGaJxuBOcZAwrHpTAcTzsZ5xT32uRVDdNPu0 GU4iwtgBJI3gSbzxOZQ6vTWdSTRLL1+ouXjdXC2mmeyMQPno9xpiDF5QL0cummVhyc+t bdAw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id j125-20020a625583000000b00557a43656c6si1224103pfb.109.2023.01.06.04.56.32; Fri, 06 Jan 2023 04:56:45 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234291AbjAFMyU (ORCPT + 99 others); Fri, 6 Jan 2023 07:54:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233864AbjAFMxo (ORCPT ); Fri, 6 Jan 2023 07:53:44 -0500 Received: from out30-133.freemail.mail.aliyun.com (out30-133.freemail.mail.aliyun.com [115.124.30.133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2BCE268CA1; Fri, 6 Jan 2023 04:53:41 -0800 (PST) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R421e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045192;MF=jefflexu@linux.alibaba.com;NM=1;PH=DS;RN=6;SR=0;TI=SMTPD_---0VZ-H5ZQ_1673009616; Received: from localhost(mailfrom:jefflexu@linux.alibaba.com fp:SMTPD_---0VZ-H5ZQ_1673009616) by smtp.aliyun-inc.com; Fri, 06 Jan 2023 20:53:37 +0800 From: Jingbo Xu To: xiang@kernel.org, chao@kernel.org, linux-erofs@lists.ozlabs.org Cc: huyue2@coolpad.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH 6/6] erofs: enable page cache sharing in fscache mode Date: Fri, 6 Jan 2023 20:53:30 +0800 Message-Id: <20230106125330.55529-7-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.19.1.6.gb485710b In-Reply-To: <20230106125330.55529-1-jefflexu@linux.alibaba.com> References: 
<20230106125330.55529-1-jefflexu@linux.alibaba.com> MIME-Version: 1.0 X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1754277929918412361?= X-GMAIL-MSGID: =?utf-8?q?1754277929918412361?=

Erofs supports chunk deduplication to reduce disk usage.  Furthermore, we
can make inodes share the page cache of these deduplicated chunks to
reduce memory usage.  This is particularly useful in container scenarios,
where deduplication is essential for container images.

This can be achieved by managing the page cache of deduplicated chunks in
the blob's address space.  In this way, all inodes sharing a deduplicated
chunk refer to and share the page cache in the blob's address space.

So far there are two restrictions on enabling this feature.

The page cache sharing feature also supports .mmap().  The reverse
mapping requires that a vma is linked to exactly one inode and cannot be
shared among inodes.  As the vma is eventually linked to the blob's
address space when page cache sharing is enabled, this restriction means
that the mapped file area cannot be mapped to multiple blobs.  Thus page
cache sharing can only be enabled for files that are mapped to one blob.

The chunk based data layout guarantees that a chunk does not cross the
device (blob) boundary.  Thus, in the chunk based data layout, files no
larger than the chunk size are guaranteed to be mapped to one blob.  As
the chunk size is tunable on a per-file basis, this restriction can be
relaxed at image build time.  As long as we ensure that the file cannot
be deduplicated, the file's chunk size can be set to a reasonable value
larger than the file size, so that the page cache sharing feature can be
enabled on this file later.

The second restriction is that EROFS_BLKSIZ must be a multiple of
PAGE_SIZE to avoid data leakage.  Otherwise unrelated data may be exposed
at the end of the last page, since file data is laid out in units of
EROFS_BLKSIZ in the image.

Signed-off-by: Jingbo Xu
---
 fs/erofs/inode.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index d3b8736fa124..8fe9b29422b5 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -241,6 +241,29 @@ static int erofs_fill_symlink(struct inode *inode, void *kaddr,
 	return 0;
 }
 
+static bool erofs_can_share_page_cache(struct inode *inode)
+{
+	struct erofs_inode *vi = EROFS_I(inode);
+
+	/* enable page cache sharing only in share domain mode */
+	if (!erofs_is_fscache_mode(inode->i_sb) ||
+	    !EROFS_SB(inode->i_sb)->domain_id)
+		return false;
+
+	if (vi->datalayout != EROFS_INODE_CHUNK_BASED)
+		return false;
+
+	/* avoid crossing multiple devices/blobs */
+	if (inode->i_size > 1UL << vi->chunkbits)
+		return false;
+
+	/* avoid data leakage in mmap routine */
+	if (EROFS_BLKSIZ % PAGE_SIZE)
+		return false;
+
+	return true;
+}
+
 static int erofs_fill_inode(struct inode *inode)
 {
 	struct erofs_inode *vi = EROFS_I(inode);
@@ -262,6 +285,10 @@ static int erofs_fill_inode(struct inode *inode)
 		inode->i_op = &erofs_generic_iops;
 		if (erofs_inode_is_data_compressed(vi->datalayout))
 			inode->i_fop = &generic_ro_fops;
+#ifdef CONFIG_EROFS_FS_ONDEMAND
+		else if (erofs_can_share_page_cache(inode))
+			inode->i_fop = &erofs_fscache_share_file_fops;
+#endif
 		else
 			inode->i_fop = &erofs_file_fops;
 		break;
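
The eligibility rules from the commit message can be exercised outside the
kernel with a small userspace sketch.  This is an illustration under stated
assumptions rather than the patch itself: struct sketch_inode and
sketch_can_share() are hypothetical stand-ins for the erofs inode state and
for erofs_can_share_page_cache(), and the block size and page size are
passed in as parameters instead of being compile-time constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the state the in-kernel check consults. */
struct sketch_inode {
	bool fscache_mode;	/* mounted in fscache mode */
	bool has_domain;	/* a share domain id is configured */
	bool chunk_based;	/* chunk based data layout */
	uint64_t size;		/* file size in bytes */
	unsigned int chunkbits;	/* log2 of the per-file chunk size */
};

/* Mirrors the order of the checks described in the commit message. */
static bool sketch_can_share(const struct sketch_inode *in,
			     uint64_t blksiz, uint64_t page_size)
{
	/* sharing only makes sense within a share domain */
	if (!in->fscache_mode || !in->has_domain)
		return false;

	if (!in->chunk_based)
		return false;

	/* a file larger than one chunk may span several blobs */
	if (in->size > (UINT64_C(1) << in->chunkbits))
		return false;

	/* a block smaller than a page could expose a neighbour's data */
	if (blksiz % page_size)
		return false;

	return true;
}
```

With 4 KiB pages, for example, a chunk based file of exactly 4096 bytes and
chunkbits of 12 qualifies when the block size is 4096, while the same file
with a 1024-byte block size does not, because the rest of its last page in
the blob could hold another file's blocks.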