[6/6] fs: add automatic kernel fs freeze / thaw and remove kthread freezing

Message ID 20230508011717.4034511-7-mcgrof@kernel.org
State New
Headers
Series vfs: provide automatic kernel freeze / resume |

Commit Message

Luis Chamberlain May 8, 2023, 1:17 a.m. UTC
  Add support to automatically handle freezing and thawing filesystems
during the kernel's suspend/resume cycle.

This is needed so that we properly really stop IO in flight without
races after userspace has been frozen. Without this we rely on
kthread freezing and its semantics are loose and error prone.
For instance, even though a kthread may use try_to_freeze() and end
up being frozen we have no way of being sure that everything that
has been spawned asynchronously from it (such as timers) have also
been stopped as well.

A long term advantage of also adding filesystem freeze / thawing
supporting during suspend / hibernation is that long term we may
be able to eventually drop the kernel's thread freezing completely
as it was originally added to stop disk IO in flight as we hibernate
or suspend.

This does not remove the superfluous freezer calls on all filesystems.
Each filesystem must remove all the kthread freezer stuff and peg
the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
flag.

Subsequent patches remove the kthread freezer usage from each
filesystem, one at a time to make all this work bisectable.
Once all filesystems remove the usage of the kthread freezer we
can remove the FS_AUTOFREEZE flag.

Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 fs/super.c             | 50 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h     | 14 ++++++++++++
 kernel/power/process.c | 15 ++++++++++++-
 3 files changed, 78 insertions(+), 1 deletion(-)
  

Comments

Dave Chinner May 9, 2023, 1:20 a.m. UTC | #1
On Sun, May 07, 2023 at 06:17:17PM -0700, Luis Chamberlain wrote:
> Add support to automatically handle freezing and thawing filesystems
> during the kernel's suspend/resume cycle.
> 
> This is needed so that we properly really stop IO in flight without
> races after userspace has been frozen. Without this we rely on
> kthread freezing and its semantics are loose and error prone.
> For instance, even though a kthread may use try_to_freeze() and end
> up being frozen we have no way of being sure that everything that
> has been spawned asynchronously from it (such as timers) have also
> been stopped as well.
> 
> A long term advantage of also adding filesystem freeze / thawing
> supporting during suspend / hibernation is that long term we may
> be able to eventually drop the kernel's thread freezing completely
> as it was originally added to stop disk IO in flight as we hibernate
> or suspend.
> 
> This does not remove the superfluous freezer calls on all filesystems.
> Each filesystem must remove all the kthread freezer stuff and peg
> the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
> flag.
> 
> Subsequent patches remove the kthread freezer usage from each
> filesystem, one at a time to make all this work bisectable.
> Once all filesystems remove the usage of the kthread freezer we
> can remove the FS_AUTOFREEZE flag.
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  fs/super.c             | 50 ++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h     | 14 ++++++++++++
>  kernel/power/process.c | 15 ++++++++++++-
>  3 files changed, 78 insertions(+), 1 deletion(-)

.....

> diff --git a/kernel/power/process.c b/kernel/power/process.c
> index cae81a87cc91..7ca7688f0b5d 100644
> --- a/kernel/power/process.c
> +++ b/kernel/power/process.c
> @@ -140,6 +140,16 @@ int freeze_processes(void)
>  
>  	BUG_ON(in_atomic());
>  
> +	pr_info("Freezing filesystems ... ");
> +	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
> +	if (error) {
> +		pr_cont("failed\n");
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> +		thaw_processes();
> +		return error;

That looks wrong. i.e. if the sb iteration fails to freeze a
filesystem (for whatever reason) then every userspace frozen
filesystem will be thawed by this call, right? i.e. it will thaw
more than just the filesystems frozen by the suspend freeze
iteration before it failed.

Don't we only want to thaw the superblocks we froze before the
failure occurred? i.e. the "undo" iteration needs to start from the
last superblock we successfully froze and then only walk to the tail
of the list we started from?

> +	}
> +	pr_cont("done.\n");
> +
>  	/*
>  	 * Now that the whole userspace is frozen we need to disable
>  	 * the OOM killer to disallow any further interference with
> @@ -149,8 +159,10 @@ int freeze_processes(void)
>  	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
>  		error = -EBUSY;
>  
> -	if (error)
> +	if (error) {
> +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
>  		thaw_processes();
> +	}

Does this also have the same problem? i.e. if fs_suspend_freeze_sb()
skips over superblocks that are already userspace frozen without any
error, then this will incorrectly thaw those userspace frozen
filesystems.

Cheers,

Dave.
  
Darrick J. Wong May 16, 2023, 3:17 p.m. UTC | #2
On Tue, May 09, 2023 at 11:20:13AM +1000, Dave Chinner wrote:
> On Sun, May 07, 2023 at 06:17:17PM -0700, Luis Chamberlain wrote:
> > Add support to automatically handle freezing and thawing filesystems
> > during the kernel's suspend/resume cycle.
> > 
> > This is needed so that we properly really stop IO in flight without
> > races after userspace has been frozen. Without this we rely on
> > kthread freezing and its semantics are loose and error prone.
> > For instance, even though a kthread may use try_to_freeze() and end
> > up being frozen we have no way of being sure that everything that
> > has been spawned asynchronously from it (such as timers) have also
> > been stopped as well.
> > 
> > A long term advantage of also adding filesystem freeze / thawing
> > supporting during suspend / hibernation is that long term we may
> > be able to eventually drop the kernel's thread freezing completely
> > as it was originally added to stop disk IO in flight as we hibernate
> > or suspend.
> > 
> > This does not remove the superfluous freezer calls on all filesystems.
> > Each filesystem must remove all the kthread freezer stuff and peg
> > the fs_type flags as supporting auto-freezing with the FS_AUTOFREEZE
> > flag.
> > 
> > Subsequent patches remove the kthread freezer usage from each
> > filesystem, one at a time to make all this work bisectable.
> > Once all filesystems remove the usage of the kthread freezer we
> > can remove the FS_AUTOFREEZE flag.
> > 
> > Reviewed-by: Jan Kara <jack@suse.cz>
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  fs/super.c             | 50 ++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/fs.h     | 14 ++++++++++++
> >  kernel/power/process.c | 15 ++++++++++++-
> >  3 files changed, 78 insertions(+), 1 deletion(-)
> 
> .....
> 
> > diff --git a/kernel/power/process.c b/kernel/power/process.c
> > index cae81a87cc91..7ca7688f0b5d 100644
> > --- a/kernel/power/process.c
> > +++ b/kernel/power/process.c
> > @@ -140,6 +140,16 @@ int freeze_processes(void)
> >  
> >  	BUG_ON(in_atomic());
> >  
> > +	pr_info("Freezing filesystems ... ");
> > +	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
> > +	if (error) {
> > +		pr_cont("failed\n");
> > +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> > +		thaw_processes();
> > +		return error;
> 
> That looks wrong. i.e. if the sb iteration fails to freeze a
> filesystem (for whatever reason) then every userspace frozen
> filesystem will be thawed by this call, right? i.e. it will thaw
> more than just the filesystems frozen by the suspend freeze
> iteration before it failed.
> 
> Don't we only want to thaw the superblocks we froze before the
> failure occurred? i.e. the "undo" iteration needs to start from the
> last superblock we successfully froze and then only walk to the tail
> of the list we started from?

I think fs_suspend_thaw_sb calls thaw_super(..., false), which will not
undo a userspace freeze.  So strictly speaking the answer to your
question is (AFAICT) "no it won't"

That said, I read this and also had a raised-eyebrow moment -- we
shouldn't (un?)touch superblocks that fs_suspend_freeze_sb didn't touch
in the first place.

I wonder if we should be using that NULL parameter to keep track of the
last super that fs_suspend_freeze_sb didn't fail on?

--D

> > +	}
> > +	pr_cont("done.\n");
> > +
> >  	/*
> >  	 * Now that the whole userspace is frozen we need to disable
> >  	 * the OOM killer to disallow any further interference with
> > @@ -149,8 +159,10 @@ int freeze_processes(void)
> >  	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
> >  		error = -EBUSY;
> >  
> > -	if (error)
> > +	if (error) {
> > +		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
> >  		thaw_processes();
> > +	}
> 
> Does this also have the same problem? i.e. if fs_suspend_freeze_sb()
> skips over superblocks that are already userspace frozen without any
> error, then this will incorrectly thaw those userspace frozen
> filesystems.
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
  

Patch

diff --git a/fs/super.c b/fs/super.c
index d5eab6b38b03..148674429ab8 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1910,3 +1910,53 @@  int sb_init_dio_done_wq(struct super_block *sb)
 		destroy_workqueue(wq);
 	return 0;
 }
+
+#ifdef CONFIG_PM_SLEEP
+static bool super_should_freeze(struct super_block *sb)
+{
+	if (!(sb->s_type->fs_flags & FS_AUTOFREEZE))
+		return false;
+	/*
+	 * We don't freeze virtual filesystems, we skip those filesystems with
+	 * no backing device.
+	 */
+	if (sb->s_bdi == &noop_backing_dev_info)
+		return false;
+
+	return true;
+}
+
+int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): freezing\n", sb->s_type->name, sb->s_id);
+
+	error = freeze_super(sb, false);
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to freeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+out:
+	return error;
+}
+
+int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
+{
+	int error = 0;
+
+	if (!super_should_freeze(sb))
+		goto out;
+
+	pr_info("%s (%s): thawing\n", sb->s_type->name, sb->s_id);
+
+	error = thaw_super(sb, false);
+	if (error && error != -EBUSY)
+		pr_notice("%s (%s): Unable to unfreeze, error=%d",
+			  sb->s_type->name, sb->s_id, error);
+out:
+	return error;
+}
+#endif /* CONFIG_PM_SLEEP */
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 22dd697ab703..92c85c8ec1ed 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2247,6 +2247,7 @@  struct file_system_type {
 #define FS_DISALLOW_NOTIFY_PERM	16	/* Disable fanotify permission events */
 #define FS_ALLOW_IDMAP         32      /* FS has been updated to handle vfs idmappings. */
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
+#define FS_AUTOFREEZE           (1<<16)	/*  temporary as we phase kthread freezer out */
 	int (*init_fs_context)(struct fs_context *);
 	const struct fs_parameter_spec *parameters;
 	struct dentry *(*mount) (struct file_system_type *, int,
@@ -2322,6 +2323,19 @@  extern int user_statfs(const char __user *, struct kstatfs *);
 extern int fd_statfs(int, struct kstatfs *);
 extern int freeze_super(struct super_block *super, bool usercall);
 extern int thaw_super(struct super_block *super, bool usercall);
+#ifdef CONFIG_PM_SLEEP
+int fs_suspend_freeze_sb(struct super_block *sb, void *priv);
+int fs_suspend_thaw_sb(struct super_block *sb, void *priv);
+#else
+static inline int fs_suspend_freeze_sb(struct super_block *sb, void *priv)
+{
+	return 0;
+}
+static inline int fs_suspend_thaw_sb(struct super_block *sb, void *priv)
+{
+	return 0;
+}
+#endif
 extern __printf(2, 3)
 int super_setup_bdi_name(struct super_block *sb, char *fmt, ...);
 extern int super_setup_bdi(struct super_block *sb);
diff --git a/kernel/power/process.c b/kernel/power/process.c
index cae81a87cc91..7ca7688f0b5d 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -140,6 +140,16 @@  int freeze_processes(void)
 
 	BUG_ON(in_atomic());
 
+	pr_info("Freezing filesystems ... ");
+	error = iterate_supers_reverse_excl(fs_suspend_freeze_sb, NULL);
+	if (error) {
+		pr_cont("failed\n");
+		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
+		thaw_processes();
+		return error;
+	}
+	pr_cont("done.\n");
+
 	/*
 	 * Now that the whole userspace is frozen we need to disable
 	 * the OOM killer to disallow any further interference with
@@ -149,8 +159,10 @@  int freeze_processes(void)
 	if (!error && !oom_killer_disable(msecs_to_jiffies(freeze_timeout_msecs)))
 		error = -EBUSY;
 
-	if (error)
+	if (error) {
+		iterate_supers_excl(fs_suspend_thaw_sb, NULL);
 		thaw_processes();
+	}
 	return error;
 }
 
@@ -188,6 +200,7 @@  void thaw_processes(void)
 	pm_nosig_freezing = false;
 
 	oom_killer_enable();
+	iterate_supers_excl(fs_suspend_thaw_sb, NULL);
 
 	pr_info("Restarting tasks ... ");