Message ID | 71897125-e570-46ce-946a-d4729725e28f@kernel.dk |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:cae8:0:b0:403:3b70:6f57 with SMTP id r8csp1072176vqu; Mon, 25 Sep 2023 02:05:39 -0700 (PDT) X-Google-Smtp-Source: AGHT+IENR0ywska+82avlYc7gp3MBl69r2/CGx1tJB1W/XEDoRxb61AUmLy/ddi2Jn3LyOboMsBQ X-Received: by 2002:a17:902:e54e:b0:1c5:b1cc:fe0e with SMTP id n14-20020a170902e54e00b001c5b1ccfe0emr4965482plf.53.1695632739508; Mon, 25 Sep 2023 02:05:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695632739; cv=none; d=google.com; s=arc-20160816; b=pJMu0czFqiRpCUSfsfpxWpsV2epQvE0kh2TeUHo7cZHWBrVTC3V9nvKUXNB91x7bN4 KFvscJCBKXb2jpQvGkk/QPs9YNIsWSeevGvUhhZGyQF0fHDJk5JQbn4+K//BvJgOALKM OaWD+BvsLjxftSAHD1S0lxajJIrzOFsI0aQICq7V5HeYHFsqB9XT2wMGjMOyw0ZVE+MS qF0V+BIs2H63JWNBRYWXWdu4T7DN3YriHc2UCu0EloC7+5jrmKmeEMUxRtit82TFvhMJ Q83i9rPgEfG380TaEV5qQ6ox8GRZGcFrWIZVo4mSNLv5FliZjC6ashVZspSXX4YnRs+S Q8Fw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:cc:subject:from:to :content-language:user-agent:mime-version:date:message-id :dkim-signature; bh=lGoflC/MDGiByLKpoE4UCpKhG9wCXgC3jrpF9mnawmc=; fh=uUeoF+0GaSUySnt5NJ8yi4FuCTrYD5hh+36uZY7EIak=; b=apcJiLxJqGGOXzV29pdofDobHLYvISnVw+6Zs3/Ii92JSvZ3+ISMWZRL1Jlcm9QeR4 pjHL17NzTVaHpclCYyhgVfWNVnbBHJk2I3cSxY2MFqIRt+xcck5sOWk25X8hTpCQg0o3 yXG44i8v53tN5vZvreHAS+FzERx3bwARNxpwqGpqLDRE1b3um5mfFyOJIR/KS9AwcEMW Ljx3amYKKXjymOSw+JcPNjzuvAYxoFPp+Wa9kqincmS/Qisb7Ekvedyo435eyFy+CL8k nZUZttXRZ1rAVTCcHvDaL6J2pbvry21pzJ89dFbZ5xAsWDKw6btvAM05QXn9YPF6NCof 74MQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20230601.gappssmtp.com header.s=20230601 header.b=ECMthtLk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id t4-20020a17090340c400b001bc434b9d95si9832762pld.156.2023.09.25.02.05.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 25 Sep 2023 02:05:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20230601.gappssmtp.com header.s=20230601 header.b=ECMthtLk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id BEE1580477A4; Sun, 24 Sep 2023 23:23:46 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232164AbjIYGXr (ORCPT <rfc822;ezelljr.billy@gmail.com> + 30 others); Mon, 25 Sep 2023 02:23:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49980 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232240AbjIYGXd (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 25 Sep 2023 02:23:33 -0400 Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFED226A0 for <linux-kernel@vger.kernel.org>; Sun, 24 Sep 2023 23:21:39 -0700 (PDT) Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-40471c054f9so18560645e9.0 for <linux-kernel@vger.kernel.org>; Sun, 24 Sep 2023 23:21:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20230601.gappssmtp.com; s=20230601; t=1695622897; x=1696227697; darn=vger.kernel.org; h=content-transfer-encoding:cc:subject:from:to:content-language :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=lGoflC/MDGiByLKpoE4UCpKhG9wCXgC3jrpF9mnawmc=; b=ECMthtLkjF0hBBM6O/lZvvlfnDcHj5vR4Lxe+D45C7tZDbaykHERGtOEudoribQwOl Fte35//cCpWckzrS07O64yvry6jMO7VfOKtLeD/nXLDiv4Ysfkf+H89g1LNrj1z+yqOT 2I18D6Y96LGQEkTX0YHq8CBhSFjxCwXQXdq3HR3fjFYO2ESM8TMMf2E8WFwpGpGwdPup XZLZmKwIZdtOhxGa5aszprHKFkSXUEiPr/yUWolFA3Ywh6qOHjxnGhu8WpG7cMtjaH2c 2AkCOgUwfisfgx/0KcsWXaeaUjfgFWCRkxWW6LnHiDTfwMkT8kuRNTF18tyYP5NqQBI2 LJzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1695622897; x=1696227697; h=content-transfer-encoding:cc:subject:from:to:content-language :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=lGoflC/MDGiByLKpoE4UCpKhG9wCXgC3jrpF9mnawmc=; b=HBAT1djwmFrZLhAKp0bUfZ20k6k+WLIbR9mvxHHuozFTZLGJ5MrtPI1wqpRPd6VVz3 9kpx3ghen5/yvZrCq31FzIGyJdM5kMhmXAejp+lxiUWTty6mo74O70dA0fpMHzZUJraz RXcKcoYIiTMbOYImbXnhbySj5BOmzKIfqQzqzsOLv+5HfyGX4XtOaDQXky46AOc/X6KU MlLHbEPvbribZ7HRmRQo2eKRTCgiyba94qEdjWLS/Wv9DvquGmC1ILS3Nsypg1dZ88kq Ap2wOiSBpPB8LlEvF3PsW1cEtW5TIWc+F1L6eY9rgW+pYMrlXLVVFNys5Wx9XkkeyKJi us2Q== X-Gm-Message-State: AOJu0YxAViUFHrHMIoJHKPP8DdccOIJfJzFXrx5iJ2LYFCAjGUe3rrG1 Bn3sppw0eTNQ9/bw5rTe/ruTSQ== X-Received: by 2002:a05:600c:4e05:b0:405:3cc1:e115 with SMTP id b5-20020a05600c4e0500b004053cc1e115mr5041954wmq.3.1695622896458; Sun, 24 Sep 2023 23:21:36 -0700 (PDT) Received: from [172.20.13.88] ([45.147.210.162]) by smtp.gmail.com with ESMTPSA id u23-20020a05600c211700b004042dbb8925sm14054942wml.38.2023.09.24.23.21.35 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sun, 24 Sep 2023 23:21:35 -0700 (PDT) Message-ID: <71897125-e570-46ce-946a-d4729725e28f@kernel.dk> Date: Mon, 25 Sep 2023 00:21:35 -0600 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: Amir Goldstein <amir73il@gmail.com>, LKML <linux-kernel@vger.kernel.org>, linux-unionfs@vger.kernel.org From: Jens Axboe <axboe@kernel.dk> Subject: [PATCH] ovl: disable IOCB_DIO_CALLER_COMP Cc: Zorro Lang <zlang@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Sun, 24 Sep 2023 23:23:46 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777999795438022345 X-GMAIL-MSGID: 1777999795438022345 |
Series |
ovl: disable IOCB_DIO_CALLER_COMP
|
|
Commit Message
Jens Axboe
Sept. 25, 2023, 6:21 a.m. UTC
overlayfs copies the kiocb flags when it sets up a new kiocb to handle
a write, but it doesn't properly support dealing with the deferred
caller completions of the kiocb. This means it doesn't get the final
write completion value, and hence will complete the write with '0' as
the result.
We could support the caller completions in overlayfs, but for now let's
just disable them in the generated write kiocb.
Reported-by: Zorro Lang <zlang@redhat.com>
Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/
Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
Comments
On Mon, Sep 25, 2023 at 9:21 AM Jens Axboe <axboe@kernel.dk> wrote: > > overlayfs copies the kiocb flags when it sets up a new kiocb to handle > a write, but it doesn't properly support dealing with the deferred > caller completions of the kiocb. This means it doesn't get the final > write completion value, and hence will complete the write with '0' as > the result. > > We could support the caller completions in overlayfs, but for now let's > just disable them in the generated write kiocb. > > Reported-by: Zorro Lang <zlang@redhat.com> > Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/ > Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP") > Signed-off-by: Jens Axboe <axboe@kernel.dk> > Thanks for fixing this Jens! If you or Christian want to send this fix to Linus, you have my ACK. On the bright side, I am glad that you are aware of the overlayfs "kiocb_clone" use case, which delegates/forwards the io request to another file in another fs. I have already posted an RFC [1] for moving this functionality to common vfs code. My main goal was to expose it to other filesystem (fuse), but a very desired side effect is that this functionality gets more vfs reviewer eyes and then the chances of catching a regression like this one during review of vfs changes hopefully increases. As for test coverage, I need to check why my tests did not catch this - I suspect fsx may not have been rebuilt with io_uring support, but not sure (not near workstation atm). If you would like to add overlayfs to your test coverage, as Zorro explained, it is as simple as running ./check -overlay with your existing fstests config. ./check -overlay is a relatively faster test run because many of the tests do _notrun on overlayfs. I don't have to tell you that io_uring code will end up running on overlayfs in many container workloads, so it is not a niche setup. Thanks, Amir. [1] https://lore.kernel.org/linux-fsdevel/20230912185408.3343163-1-amir73il@gmail.com/ > --- > > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c > index 4193633c4c7a..693971d20280 100644 > --- a/fs/overlayfs/file.c > +++ b/fs/overlayfs/file.c > @@ -391,6 +391,12 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter) > if (!ovl_should_sync(OVL_FS(inode->i_sb))) > ifl &= ~(IOCB_DSYNC | IOCB_SYNC); > > + /* > + * Overlayfs doesn't support deferred completions, don't copy > + * this property in case it is set by the issuer. > + */ > + ifl &= ~IOCB_DIO_CALLER_COMP; > + > old_cred = ovl_override_creds(file_inode(file)->i_sb); > if (is_sync_kiocb(iocb)) { > file_start_write(real.file); > > -- > Jens Axboe > >
On 9/25/23 3:18 AM, Amir Goldstein wrote: > On Mon, Sep 25, 2023 at 9:21?AM Jens Axboe <axboe@kernel.dk> wrote: >> >> overlayfs copies the kiocb flags when it sets up a new kiocb to handle >> a write, but it doesn't properly support dealing with the deferred >> caller completions of the kiocb. This means it doesn't get the final >> write completion value, and hence will complete the write with '0' as >> the result. >> >> We could support the caller completions in overlayfs, but for now let's >> just disable them in the generated write kiocb. >> >> Reported-by: Zorro Lang <zlang@redhat.com> >> Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/ >> Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP") >> Signed-off-by: Jens Axboe <axboe@kernel.dk> >> > > Thanks for fixing this Jens! > If you or Christian want to send this fix to Linus, you have my ACK. No problem - and thanks, maybe Christian can pick this one up? I tentatively queued it up here just so I don't forget it: https://git.kernel.dk/cgit/linux/log/?h=ovl-kiocb > On the bright side, I am glad that you are aware of the overlayfs > "kiocb_clone" use case, which delegates/forwards the io request to > another file in another fs. > > I have already posted an RFC [1] for moving this functionality to > common vfs code. My main goal was to expose it to other filesystem > (fuse), but a very desired side effect is that this functionality gets > more vfs reviewer eyes and then the chances of catching a regression > like this one during review of vfs changes hopefully increases. Ah that's great! Yeah it's a bit hidden in there if you don't know about it, and I did grep today when writing this patch to ensure we didn't have any others like it. So I think we're good for now, at least. > As for test coverage, I need to check why my tests did not catch > this - I suspect fsx may not have been rebuilt with io_uring support, > but not sure (not near workstation atm). I'm guessing it's because you don't have liburing installed on the test box, then fsx etc don't get built with io_uring support in xfstests. > If you would like to add overlayfs to your test coverage, as Zorro > explained, it is as simple as running ./check -overlay with your > existing fstests config. > ./check -overlay is a relatively faster test run because many of the > tests do _notrun on overlayfs. > I don't have to tell you that io_uring code will end up running on > overlayfs in many container workloads, so it is not a niche setup. Will add it to the mix! Thanks for the details.
On Mon, 25 Sep 2023 00:21:35 -0600, Jens Axboe wrote: > overlayfs copies the kiocb flags when it sets up a new kiocb to handle > a write, but it doesn't properly support dealing with the deferred > caller completions of the kiocb. This means it doesn't get the final > write completion value, and hence will complete the write with '0' as > the result. > > We could support the caller completions in overlayfs, but for now let's > just disable them in the generated write kiocb. > > [...] Applied to the vfs.fixes branch of the vfs/vfs.git tree. Patches in the vfs.fixes branch should appear in linux-next soon. Please report any outstanding bugs that were missed during review in a new review to the original patch series allowing us to drop it. It's encouraged to provide Acked-bys and Reviewed-bys even though the patch has now been applied. If possible patch trailers will be updated. Note that commit hashes shown below are subject to change due to rebase, trailer updates or similar. If in doubt, please check the listed branch. tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git branch: vfs.fixes [1/1] ovl: disable IOCB_DIO_CALLER_COMP https://git.kernel.org/vfs/vfs/c/2d1b3bbc3dd5
> No problem - and thanks, maybe Christian can pick this one up? I
Snatched it I've got a pile I need to send to Linus this week anyway.
(Thanks for the Cc, Amir!)
On Sep 25, 2023, at 11:39 AM, Christian Brauner <brauner@kernel.org> wrote: > > >> >> No problem - and thanks, maybe Christian can pick this one up? I > > Snatched it I've got a pile I need to send to Linus this week anyway. > (Thanks for the Cc, Amir!) Perfect, thanks!
On Mon, Sep 25, 2023 at 12:21:35AM -0600, Jens Axboe wrote: > overlayfs copies the kiocb flags when it sets up a new kiocb to handle > a write, but it doesn't properly support dealing with the deferred > caller completions of the kiocb. This means it doesn't get the final > write completion value, and hence will complete the write with '0' as > the result. > > We could support the caller completions in overlayfs, but for now let's > just disable them in the generated write kiocb. > > Reported-by: Zorro Lang <zlang@redhat.com> > Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/ > Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP") > Signed-off-by: Jens Axboe <axboe@kernel.dk> > > --- Thanks Jens, the fstests generic/617 works on latest linux kernel with this patch now. > > diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c > index 4193633c4c7a..693971d20280 100644 > --- a/fs/overlayfs/file.c > +++ b/fs/overlayfs/file.c > @@ -391,6 +391,12 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter) > if (!ovl_should_sync(OVL_FS(inode->i_sb))) > ifl &= ~(IOCB_DSYNC | IOCB_SYNC); > > + /* > + * Overlayfs doesn't support deferred completions, don't copy > + * this property in case it is set by the issuer. > + */ > + ifl &= ~IOCB_DIO_CALLER_COMP; > + > old_cred = ovl_override_creds(file_inode(file)->i_sb); > if (is_sync_kiocb(iocb)) { > file_start_write(real.file); > > -- > Jens Axboe > > >
On 9/26/23 9:23 AM, Zorro Lang wrote: > On Mon, Sep 25, 2023 at 12:21:35AM -0600, Jens Axboe wrote: >> overlayfs copies the kiocb flags when it sets up a new kiocb to handle >> a write, but it doesn't properly support dealing with the deferred >> caller completions of the kiocb. This means it doesn't get the final >> write completion value, and hence will complete the write with '0' as >> the result. >> >> We could support the caller completions in overlayfs, but for now let's >> just disable them in the generated write kiocb. >> >> Reported-by: Zorro Lang <zlang@redhat.com> >> Link: https://lore.kernel.org/io-uring/20230924142754.ejwsjen5pvyc32l4@dell-per750-06-vm-08.rhts.eng.pek2.redhat.com/ >> Fixes: 8c052fb3002e ("iomap: support IOCB_DIO_CALLER_COMP") >> Signed-off-by: Jens Axboe <axboe@kernel.dk> >> >> --- > > Thanks Jens, the fstests generic/617 works on latest linux kernel with > this patch now. Excellent, thanks for re-testing.
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 4193633c4c7a..693971d20280 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -391,6 +391,12 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter) if (!ovl_should_sync(OVL_FS(inode->i_sb))) ifl &= ~(IOCB_DSYNC | IOCB_SYNC); + /* + * Overlayfs doesn't support deferred completions, don't copy + * this property in case it is set by the issuer. + */ + ifl &= ~IOCB_DIO_CALLER_COMP; + old_cred = ovl_override_creds(file_inode(file)->i_sb); if (is_sync_kiocb(iocb)) { file_start_write(real.file);