Message ID | 20221113152439.2821942-1-feldsherov@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1725855wru; Sun, 13 Nov 2022 07:33:47 -0800 (PST) X-Google-Smtp-Source: AA0mqf4XN7SXqDupKAS2dF4SkZBUMGvTr1dSQNngS7o99m+yJ0ZUcaKdQBYvUhv3Pe/3ohpuyDhD X-Received: by 2002:a17:90a:a882:b0:214:1802:29f4 with SMTP id h2-20020a17090aa88200b00214180229f4mr10656191pjq.24.1668353626813; Sun, 13 Nov 2022 07:33:46 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668353626; cv=none; d=google.com; s=arc-20160816; b=p/kVW4mF6KW9b09uVundAcRJrNbDwJCbej162CM/9MfD8q3FZapJ1JDttTjIvhDk6L iqRU1HfbhtxNv/cLRtx4u2l3My/kfBMEKm38StQmZXagn3BmknznO/Mf2dg66YlbaHmQ Xq0WvATi35hnoXmHbsylt8fEmFSgT3Xdjt0cEGiuLHVcI/HoVpxL/zB1A4vz3AQ+5CDT Ud2P5+2XFUgG9gwaG7M7CvXAZ/3msL9UxCRtmwbyzukv2fXCsJkTYilbdR9bf4RoTj0z AizXDdKSf6NRO70mNN6grxTe8ajDG3WfQQI9UrWfO3AOHhvHKVms9gL1geShaK0FBEmZ FNaA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:mime-version:date :dkim-signature; bh=wESqFKb7C5s2QIMr5hhjMl9Ig7evvOrmD1dhNzXIW7o=; b=bgx82dgVSoCR0P5lHpk9+777WOOF1ZLHdW37qteRVB3WZAuBdfZC5YLmxKd81K3c6b qfQ2+MISlMtgWgVE6+INb+aDfy5Gf/HepanD4cEV3GUrD65e9ZnOLSQY8bmYhoIRr4H5 ubqjwrUsoIMSPQ4y5vIeKaT4DbLl6yEZhOB/z/PMlXWXdfEt2ptvyGGpK07XtccyfWx+ nHKEaf7NvGabQ8UAnebFrUjXS8rx0Gr9DvI19HyQ4Cq6ab0bT1nuzh0daTt6MVpYp1nM aVJxiDwknVQuWjo5YsmUbjZVeE6r+0hR3xaTOUvGLGw7LRm/u3ykkOVW4yvOhE1yTSyb OCQg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="oHeNec/6"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d4-20020a056a0010c400b005672abf1284si7861581pfu.161.2022.11.13.07.33.33; Sun, 13 Nov 2022 07:33:46 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b="oHeNec/6"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233401AbiKMPY5 (ORCPT <rfc822;winker.wchi@gmail.com> + 99 others); Sun, 13 Nov 2022 10:24:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41050 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233029AbiKMPYy (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sun, 13 Nov 2022 10:24:54 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06D91958F for <linux-kernel@vger.kernel.org>; Sun, 13 Nov 2022 07:24:54 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id j73-20020a25d24c000000b006dca101748bso8547844ybg.14 for <linux-kernel@vger.kernel.org>; Sun, 13 Nov 2022 07:24:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=wESqFKb7C5s2QIMr5hhjMl9Ig7evvOrmD1dhNzXIW7o=; b=oHeNec/6hDQHpPziy+sFHVcQQepLBtNOeV+UpNDq9YuXKjTOI3Y6ngmG3Je5yl55kT FgJD2CLza9PiZ9r4XK2zVoFdb/ANe77FPSHUDW5+CkVC2QTH46KjkqH1DNYGAd3ZyXbM OwsWSiagrMQfBdwIP7GC7rz2WKxRpfzmviURhryTUq8VrKGB/vIZS6dwwUpSAzhYIDXv V5wfuOL2c7fOxxjadOie4ysNlI8wFITKa8iuyJjk9hjoDexlntBADRcwLZGGVCLZffcr uePdi5e3c3i8948Jcz2WWluMckOs+cdF45cXtiJ2nK+mjOUHe1lxOrb20yTXtvNy7OHe ddRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=wESqFKb7C5s2QIMr5hhjMl9Ig7evvOrmD1dhNzXIW7o=; b=OjoDtRFoYidAAB1+QRYbw53EgH5a7+1HC3J+NW3awskqyPHWcymIejVsu0Sx3qZiem TQ041tbhEnvJiTwcSJQyAFZhMj3MinV8Z4/vUozZjlPINNtd30scu1qHd7OuXJfMHmer XXIEjxwAQu5eTsR8i6oOYI5p4EhO8d0KUUZGbWa2XB45PrGZ0rSoAOr8jkBwzUfliLRA X1OrcqWT6/8DgLjcTejKCXyuaiKUqInaljSazVV0nUY1b/I8LksaHSChJNhazihIdT1o g+kmOHO6/vIogkqXdhiTVvjVLm8j91sKw185vttSoxCzCas4dR6wroCOyRfNro2S1Kn5 bM+A== X-Gm-Message-State: ACrzQf0RDFWXrvUBp77BiJJVid93HzpXJni5wOB+u5HTUNfIOsOkrQQ8 rnPBS08pARJ+LLsG9+TXfpJ/MvVfQ8Jutr1n X-Received: from feldsherov-ws1.tlv.corp.google.com ([2620:0:1045:10:7dda:435d:701d:5257]) (user=feldsherov job=sendgmr) by 2002:a0d:fcc6:0:b0:349:7d12:7255 with SMTP id m189-20020a0dfcc6000000b003497d127255mr64188333ywf.427.1668353092929; Sun, 13 Nov 2022 07:24:52 -0800 (PST) Date: Sun, 13 Nov 2022 17:24:39 +0200 Mime-Version: 1.0 X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221113152439.2821942-1-feldsherov@google.com> Subject: [PATCH] fs: do not push freeing inode to b_dirty_time list From: Svyatoslav Feldsherov <feldsherov@google.com> To: Alexander Viro <viro@zeniv.linux.org.uk>, Lukas Czerner <lczerner@redhat.com>, "Theodore Ts'o" <tytso@mit.edu>, Jan Kara <jack@suse.cz> Cc: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com, oferz@google.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Svyatoslav Feldsherov <feldsherov@google.com> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749395572580883343?= X-GMAIL-MSGID: =?utf-8?q?1749395572580883343?= |
Series |
fs: do not push freeing inode to b_dirty_time list
|
|
Commit Message
Svyatoslav Feldsherov
Nov. 13, 2022, 3:24 p.m. UTC
After commit cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode
already has I_DIRTY_INODE") writeiback_single_inode can push inode with
I_DIRTY_TIME set to b_dirty_time list. In case of freeing inode with
I_DIRTY_TIME set this can happened after deletion of inode io_list at
evict. Stack trace is following.
evict
fat_evict_inode
fat_truncate_blocks
fat_flush_inodes
writeback_inode
sync_inode_metadata
writeback_single_inode
This will lead to use after free in flusher thread.
Fixes: cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE")
Reported-by: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com
Signed-off-by: Svyatoslav Feldsherov <feldsherov@google.com>
---
fs/fs-writeback.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Sun 13-11-22 17:24:39, Svyatoslav Feldsherov wrote: > After commit cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode > already has I_DIRTY_INODE") writeiback_single_inode can push inode with > I_DIRTY_TIME set to b_dirty_time list. In case of freeing inode with > I_DIRTY_TIME set this can happened after deletion of inode io_list at > evict. Stack trace is following. > > evict > fat_evict_inode > fat_truncate_blocks > fat_flush_inodes > writeback_inode > sync_inode_metadata > writeback_single_inode > > This will lead to use after free in flusher thread. > > Fixes: cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE") > Reported-by: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com > Signed-off-by: Svyatoslav Feldsherov <feldsherov@google.com> Thanks for the analysis! I was scratching my head over this syzbot report for a while and it didn't occur to me somebody could be calling writeback_single_inode() from the .evict callback. Also what contributes to the problem is that FAT calls sync_inode_metadata(inode, 0) so it is not marking this final flush as data integrity sync and so we happily leave the I_DIRTY_TIME bit set. > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > index 443f83382b9b..31c93cbdb3fe 100644 > --- a/fs/fs-writeback.c > +++ b/fs/fs-writeback.c > @@ -1718,7 +1718,7 @@ static int writeback_single_inode(struct inode *inode, > */ > if (!(inode->i_state & I_DIRTY_ALL)) > inode_cgwb_move_to_attached(inode, wb); > - else if (!(inode->i_state & I_SYNC_QUEUED)) { > + else if (!(inode->i_state & (I_SYNC_QUEUED | I_FREEING))) { > if ((inode->i_state & I_DIRTY)) > redirty_tail_locked(inode, wb); > else if (inode->i_state & I_DIRTY_TIME) { So even calling inode_cgwb_move_to_attached() is not safe when I_FREEING is already set. So I belive the I_FREEING bit check needs to be before this whole if block. I also think we should add some assertions into i_io_list handling functions to complain if I_FREEING bit is set to catch these problems earlier which means to be also more careful in __mark_inode_dirty(). But this is for a separate cleanup. Honza
Thank you for looking into this! On Mon, Nov 14, 2022 at 12:46 PM Jan Kara <jack@suse.cz> wrote: > > On Sun 13-11-22 17:24:39, Svyatoslav Feldsherov wrote: > > After commit cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode > > already has I_DIRTY_INODE") writeiback_single_inode can push inode with > > I_DIRTY_TIME set to b_dirty_time list. In case of freeing inode with > > I_DIRTY_TIME set this can happened after deletion of inode io_list at > > evict. Stack trace is following. > > > > evict > > fat_evict_inode > > fat_truncate_blocks > > fat_flush_inodes > > writeback_inode > > sync_inode_metadata > > writeback_single_inode > > > > This will lead to use after free in flusher thread. > > > > Fixes: cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE") > > Reported-by: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com > > Signed-off-by: Svyatoslav Feldsherov <feldsherov@google.com> > > Thanks for the analysis! I was scratching my head over this syzbot report > for a while and it didn't occur to me somebody could be calling > writeback_single_inode() from the .evict callback. > > Also what contributes to the problem is that FAT calls > sync_inode_metadata(inode, 0) so it is not marking this final flush as data > integrity sync and so we happily leave the I_DIRTY_TIME bit set. > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > index 443f83382b9b..31c93cbdb3fe 100644 > > --- a/fs/fs-writeback.c > > +++ b/fs/fs-writeback.c > > @@ -1718,7 +1718,7 @@ static int writeback_single_inode(struct inode *inode, > > */ > > if (!(inode->i_state & I_DIRTY_ALL)) > > inode_cgwb_move_to_attached(inode, wb); > > - else if (!(inode->i_state & I_SYNC_QUEUED)) { > > + else if (!(inode->i_state & (I_SYNC_QUEUED | I_FREEING))) { > > if ((inode->i_state & I_DIRTY)) > > redirty_tail_locked(inode, wb); > > else if (inode->i_state & I_DIRTY_TIME) { > > So even calling inode_cgwb_move_to_attached() is not safe when I_FREEING is > already set. So I belive the I_FREEING bit check needs to be before this > whole if block. Agree, let me move the I_FREEING check before this if block. The commit I am fixing didn't change this codepath, so I suspect there is an implicit invariant which keeps inode_cgwb_move_to_attached call safe. But I am 100% in favor of making I_FREEING check explicitly. > > I also think we should add some assertions into i_io_list handling > functions to complain if I_FREEING bit is set to catch these problems > earlier which means to be also more careful in __mark_inode_dirty(). But > this is for a separate cleanup. > > Honza Sounds reasonable. Will look into that afterwards. > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR -- Slava
On Mon 14-11-22 19:43:54, Svyatoslav Feldsherov wrote: > Thank you for looking into this! > > On Mon, Nov 14, 2022 at 12:46 PM Jan Kara <jack@suse.cz> wrote: > > > > On Sun 13-11-22 17:24:39, Svyatoslav Feldsherov wrote: > > > After commit cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode > > > already has I_DIRTY_INODE") writeiback_single_inode can push inode with > > > I_DIRTY_TIME set to b_dirty_time list. In case of freeing inode with > > > I_DIRTY_TIME set this can happened after deletion of inode io_list at > > > evict. Stack trace is following. > > > > > > evict > > > fat_evict_inode > > > fat_truncate_blocks > > > fat_flush_inodes > > > writeback_inode > > > sync_inode_metadata > > > writeback_single_inode > > > > > > This will lead to use after free in flusher thread. > > > > > > Fixes: cbfecb927f42 ("fs: record I_DIRTY_TIME even if inode already has I_DIRTY_INODE") > > > Reported-by: syzbot+6ba92bd00d5093f7e371@syzkaller.appspotmail.com > > > Signed-off-by: Svyatoslav Feldsherov <feldsherov@google.com> > > > > Thanks for the analysis! I was scratching my head over this syzbot report > > for a while and it didn't occur to me somebody could be calling > > writeback_single_inode() from the .evict callback. > > > > Also what contributes to the problem is that FAT calls > > sync_inode_metadata(inode, 0) so it is not marking this final flush as data > > integrity sync and so we happily leave the I_DIRTY_TIME bit set. > > > > > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c > > > index 443f83382b9b..31c93cbdb3fe 100644 > > > --- a/fs/fs-writeback.c > > > +++ b/fs/fs-writeback.c > > > @@ -1718,7 +1718,7 @@ static int writeback_single_inode(struct inode *inode, > > > */ > > > if (!(inode->i_state & I_DIRTY_ALL)) > > > inode_cgwb_move_to_attached(inode, wb); > > > - else if (!(inode->i_state & I_SYNC_QUEUED)) { > > > + else if (!(inode->i_state & (I_SYNC_QUEUED | I_FREEING))) { > > > if ((inode->i_state & I_DIRTY)) > > > redirty_tail_locked(inode, wb); > > > else if (inode->i_state & I_DIRTY_TIME) { > > > > So even calling inode_cgwb_move_to_attached() is not safe when I_FREEING is > > already set. So I belive the I_FREEING bit check needs to be before this > > whole if block. > > Agree, let me move the I_FREEING check before this if block. > The commit I am fixing didn't change this codepath, so I suspect there is an > implicit invariant which keeps inode_cgwb_move_to_attached call safe. > But I am 100% in favor of making I_FREEING check explicitly. Actually, as I've looked into fat_evict_inode() I don't see anything making that safe except for the fact that it may be more difficult for syzbot to excercise the per-memcg writeback path... > > I also think we should add some assertions into i_io_list handling > > functions to complain if I_FREEING bit is set to catch these problems > > earlier which means to be also more careful in __mark_inode_dirty(). But > > this is for a separate cleanup. > > Sounds reasonable. Will look into that afterwards. Thanks! Honza
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 443f83382b9b..31c93cbdb3fe 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -1718,7 +1718,7 @@ static int writeback_single_inode(struct inode *inode, */ if (!(inode->i_state & I_DIRTY_ALL)) inode_cgwb_move_to_attached(inode, wb); - else if (!(inode->i_state & I_SYNC_QUEUED)) { + else if (!(inode->i_state & (I_SYNC_QUEUED | I_FREEING))) { if ((inode->i_state & I_DIRTY)) redirty_tail_locked(inode, wb); else if (inode->i_state & I_DIRTY_TIME) {