Message ID | 20230424123250.125404-1-jefflexu@linux.alibaba.com |
---|---|
State | New |
Headers |
From: Jingbo Xu <jefflexu@linux.alibaba.com> To: miklos@szeredi.hu, vgoyal@redhat.com, linux-fsdevel@vger.kernel.org Cc: gerry@linux.alibaba.com, linux-kernel@vger.kernel.org Subject: [PATCH] fuse: fix return value of inode_inline_reclaim_one_dmap in error path Date: Mon, 24 Apr 2023 20:32:50 +0800 Message-Id: <20230424123250.125404-1-jefflexu@linux.alibaba.com> X-Mailer: git-send-email 2.19.1.6.gb485710b List-ID: <linux-kernel.vger.kernel.org> |
Series |
fuse: fix return value of inode_inline_reclaim_one_dmap in error path
|
|
Commit Message
Jingbo Xu
April 24, 2023, 12:32 p.m. UTC
When range already got reclaimed by somebody else, return NULL so that
the caller could retry to allocate or reclaim another range, instead of
mistakenly returning the range already got reclaimed and reused by
others.
Reported-by: Liu Jiang <gerry@linux.alibaba.com>
Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
fs/fuse/dax.c | 1 +
1 file changed, 1 insertion(+)
Comments
ping...

On 4/24/23 8:32 PM, Jingbo Xu wrote:
> When range already got reclaimed by somebody else, return NULL so that
> the caller could retry to allocate or reclaim another range, instead of
> mistakenly returning the range already got reclaimed and reused by
> others.
>
> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
> ---
>  fs/fuse/dax.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
> index 8e74f278a3f6..59aadfd89ee5 100644
> --- a/fs/fuse/dax.c
> +++ b/fs/fuse/dax.c
> @@ -985,6 +985,7 @@ inode_inline_reclaim_one_dmap(struct fuse_conn_dax *fcd, struct inode *inode,
>  	node = interval_tree_iter_first(&fi->dax->tree, start_idx, start_idx);
>  	/* Range already got reclaimed by somebody else */
>  	if (!node) {
> +		dmap = NULL;
>  		if (retry)
>  			*retry = true;
>  		goto out_write_dmap_sem;
On Mon, Apr 24, 2023 at 08:32:50PM +0800, Jingbo Xu wrote:
> When range already got reclaimed by somebody else, return NULL so that
> the caller could retry to allocate or reclaim another range, instead of
> mistakenly returning the range already got reclaimed and reused by
> others.
>
> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>

Hi Jingbo,

This patch looks correct to me.

Are you able to reproduce the problem? Or are you fixing it based on
code inspection?

How are you testing this? We don't have a virtiofsd DAX implementation
in rust virtiofsd yet.

I am not sure how to test this change now. We had out of tree patches
in qemu and now qemu has gotten rid of the C version of virtiofsd so these
patches might not even work now.

Thanks
Vivek

> ---
>  fs/fuse/dax.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
> index 8e74f278a3f6..59aadfd89ee5 100644
> --- a/fs/fuse/dax.c
> +++ b/fs/fuse/dax.c
> @@ -985,6 +985,7 @@ inode_inline_reclaim_one_dmap(struct fuse_conn_dax *fcd, struct inode *inode,
>  	node = interval_tree_iter_first(&fi->dax->tree, start_idx, start_idx);
>  	/* Range already got reclaimed by somebody else */
>  	if (!node) {
> +		dmap = NULL;
>  		if (retry)
>  			*retry = true;
>  		goto out_write_dmap_sem;
> --
> 2.19.1.6.gb485710b
>
On 6/1/23 4:03 AM, Vivek Goyal wrote:
> On Mon, Apr 24, 2023 at 08:32:50PM +0800, Jingbo Xu wrote:
>> When range already got reclaimed by somebody else, return NULL so that
>> the caller could retry to allocate or reclaim another range, instead of
>> mistakenly returning the range already got reclaimed and reused by
>> others.
>>
>> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
>> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
>> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
>
> Hi Jingbo,
>
> This patch looks correct to me.
>
> Are you able to reproduce the problem? Or are you fixing it based on
> code inspection?

It was spotted by Liu Jiang during code review. Not tested yet.

>
> How are you testing this? We don't have a virtiofsd DAX implementation
> in rust virtiofsd yet.
>
> I am not sure how to test this change now. We had out of tree patches
> in qemu and now qemu has gotten rid of the C version of virtiofsd so these
> patches might not even work now.

Yeah, this exception path may not be easy to test, as it is only
triggered in the race condition. I have the old branch (of qemu) with
support for DAX, and maybe I could try to reproduce the exception path
by configuring a limited DAX window and a heavy IO workload.
On Thu, Jun 01, 2023 at 09:45:52AM +0800, Jingbo Xu wrote:
>
>
> On 6/1/23 4:03 AM, Vivek Goyal wrote:
> > On Mon, Apr 24, 2023 at 08:32:50PM +0800, Jingbo Xu wrote:
> >> When range already got reclaimed by somebody else, return NULL so that
> >> the caller could retry to allocate or reclaim another range, instead of
> >> mistakenly returning the range already got reclaimed and reused by
> >> others.
> >>
> >> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
> >> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
> >> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
> >
> > Hi Jingbo,
> >
> > This patch looks correct to me.
> >
> > Are you able to reproduce the problem? Or are you fixing it based on
> > code inspection?
>
> It was spotted by Liu Jiang during code review. Not tested yet.
>
> >
> > How are you testing this? We don't have a virtiofsd DAX implementation
> > in rust virtiofsd yet.
> >
> > I am not sure how to test this change now. We had out of tree patches
> > in qemu and now qemu has gotten rid of the C version of virtiofsd so these
> > patches might not even work now.
>
> Yeah, this exception path may not be easy to test, as it is only
> triggered in the race condition. I have the old branch (of qemu) with
> support for DAX, and maybe I could try to reproduce the exception path
> by configuring a limited DAX window and a heavy IO workload.

That would be great. Please test it with a really small DAX window size.
Also put some pr_debug() statements to make sure you are hitting this
particular path during testing.

Thanks
Vivek
On 6/1/23 7:45 PM, Vivek Goyal wrote:
> On Thu, Jun 01, 2023 at 09:45:52AM +0800, Jingbo Xu wrote:
>>
>>
>> On 6/1/23 4:03 AM, Vivek Goyal wrote:
>>> On Mon, Apr 24, 2023 at 08:32:50PM +0800, Jingbo Xu wrote:
>>>> When range already got reclaimed by somebody else, return NULL so that
>>>> the caller could retry to allocate or reclaim another range, instead of
>>>> mistakenly returning the range already got reclaimed and reused by
>>>> others.
>>>>
>>>> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
>>>> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
>>>> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
>>>
>>> Hi Jingbo,
>>>
>>> This patch looks correct to me.
>>>
>>> Are you able to reproduce the problem? Or are you fixing it based on
>>> code inspection?
>>
>> It was spotted by Liu Jiang during code review. Not tested yet.
>>
>>>
>>> How are you testing this? We don't have a virtiofsd DAX implementation
>>> in rust virtiofsd yet.
>>>
>>> I am not sure how to test this change now. We had out of tree patches
>>> in qemu and now qemu has gotten rid of the C version of virtiofsd so these
>>> patches might not even work now.
>>
>> Yeah, this exception path may not be easy to test, as it is only
>> triggered in the race condition. I have the old branch (of qemu) with
>> support for DAX, and maybe I could try to reproduce the exception path
>> by configuring a limited DAX window and a heavy IO workload.
>
> That would be great. Please test it with a really small DAX window size.
> Also put some pr_debug() statements to make sure you are hitting this
> particular path during testing.

Got it. Thanks.
On 6/1/23 7:45 PM, Vivek Goyal wrote:
> On Thu, Jun 01, 2023 at 09:45:52AM +0800, Jingbo Xu wrote:
>>
>>
>> On 6/1/23 4:03 AM, Vivek Goyal wrote:
>>> On Mon, Apr 24, 2023 at 08:32:50PM +0800, Jingbo Xu wrote:
>>>> When range already got reclaimed by somebody else, return NULL so that
>>>> the caller could retry to allocate or reclaim another range, instead of
>>>> mistakenly returning the range already got reclaimed and reused by
>>>> others.
>>>>
>>>> Reported-by: Liu Jiang <gerry@linux.alibaba.com>
>>>> Fixes: 9a752d18c85a ("virtiofs: add logic to free up a memory range")
>>>> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
>>>
>>> Hi Jingbo,
>>>
>>> This patch looks correct to me.
>>>
>>> Are you able to reproduce the problem? Or are you fixing it based on
>>> code inspection?
>>
>> It was spotted by Liu Jiang during code review. Not tested yet.
>>
>>>
>>> How are you testing this? We don't have a virtiofsd DAX implementation
>>> in rust virtiofsd yet.
>>>
>>> I am not sure how to test this change now. We had out of tree patches
>>> in qemu and now qemu has gotten rid of the C version of virtiofsd so these
>>> patches might not even work now.
>>
>> Yeah, this exception path may not be easy to test, as it is only
>> triggered in the race condition. I have the old branch (of qemu) with
>> support for DAX, and maybe I could try to reproduce the exception path
>> by configuring a limited DAX window and a heavy IO workload.
>
> That would be great. Please test it with a really small DAX window size.
> Also put some pr_debug() statements to make sure you are hitting this
> particular path during testing.

I tried to reproduce it but failed. It seems the race is theoretically
impossible.

In theory, the race occurs when a freeable dmap is found in the inode's
interval tree by the first query, but turns out to have been removed from
the tree by the time of the second query. However, the whole procedure
runs with filemap_invalidate_lock(inode->i_mapping) held in
inode_inline_reclaim_one_dmap().

Since every operation that deletes a dmap from the inode's interval tree
is also performed with filemap_invalidate_lock(inode->i_mapping) held,
e.g. inside inode_inline_reclaim_one_dmap() and lookup_and_reclaim_dmap(),
the race above seems impossible.
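The locking argument can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_inode, try_remove_range() and second_lookup_sees_range() are hypothetical names that do not exist in fs/fuse/dax.c, and a plain boolean flag stands in for filemap_invalidate_lock(inode->i_mapping). It shows just one thing: if every deletion path has to take the same lock, a deletion attempted between the two queries cannot succeed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of one inode: a flag standing in for
 * filemap_invalidate_lock(inode->i_mapping) and a single tracked range.
 */
struct model_inode {
	bool invalidate_locked;	/* true while the modeled lock is held */
	bool range_in_tree;	/* range still present in the modeled tree */
};

/*
 * Modeled deletion path. It must take the lock first, so it makes no
 * progress while somebody else holds it (in the kernel it would block).
 * Returns true if the range was removed.
 */
static bool try_remove_range(struct model_inode *mi)
{
	if (mi->invalidate_locked)
		return false;

	mi->invalidate_locked = true;
	mi->range_in_tree = false;
	mi->invalidate_locked = false;
	return true;
}

/*
 * Modeled reclaimer. It holds the lock across both queries and a
 * competing deletion is attempted in between. Returns true if the
 * second query still sees the range found by the first one.
 */
static bool second_lookup_sees_range(struct model_inode *mi)
{
	bool first, second;

	assert(!mi->invalidate_locked);
	mi->invalidate_locked = true;

	first = mi->range_in_tree;	/* first query */
	(void)try_remove_range(mi);	/* competing deletion attempt */
	second = mi->range_in_tree;	/* second query */

	mi->invalidate_locked = false;
	return first && second;
}
```

In this model the second query can only miss the range if some deletion path skips the lock, which matches the observation that the !node branch looks unreachable in practice.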
diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c
index 8e74f278a3f6..59aadfd89ee5 100644
--- a/fs/fuse/dax.c
+++ b/fs/fuse/dax.c
@@ -985,6 +985,7 @@ inode_inline_reclaim_one_dmap(struct fuse_conn_dax *fcd, struct inode *inode,
 	node = interval_tree_iter_first(&fi->dax->tree, start_idx, start_idx);
 	/* Range already got reclaimed by somebody else */
 	if (!node) {
+		dmap = NULL;
 		if (retry)
 			*retry = true;
 		goto out_write_dmap_sem;
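
To make the effect of the one-line change concrete, here is a minimal userspace model of the error path. It is not the kernel code: struct range, lookup() and reclaim_one() are hypothetical stand-ins, and the fixed flag exists only so that the behavior with and without "dmap = NULL" can be compared side by side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DAX mapping tracked in an interval tree. */
struct range {
	int start_idx;
	bool in_tree;	/* still present in the modeled tree */
};

/* Modeled second lookup: finds the range only if it is still in the tree. */
static struct range *lookup(struct range *r)
{
	return (r && r->in_tree) ? r : NULL;
}

/*
 * Modeled reclaim. `candidate` plays the role of the dmap picked by the
 * first pass. When the second lookup fails, the unfixed variant falls
 * through to the exit label with the stale pointer still in `dmap`.
 */
static struct range *reclaim_one(struct range *candidate, bool fixed,
				 bool *retry)
{
	struct range *dmap = candidate;
	struct range *node;

	assert(candidate != NULL);
	node = lookup(candidate);

	/* Range already got reclaimed by somebody else */
	if (!node) {
		if (fixed)
			dmap = NULL;	/* the change made by the patch */
		if (retry)
			*retry = true;
		goto out;
	}

	/* Normal path: take the range out of the tree and hand it over. */
	node->in_tree = false;
out:
	return dmap;
}
```

With fixed set to false, a caller would get back a pointer to a range it does not own while also being told to retry; with fixed set to true it gets NULL and can only retry.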