cgroup: always put cset in cgroup_css_set_put_fork

Message ID 20230521192953.229715-1-jsperbeck@google.com
State New

Commit Message

John Sperbeck May 21, 2023, 7:29 p.m. UTC
A successful call to cgroup_css_set_fork() will always have taken
a ref on kargs->cset (regardless of CLONE_INTO_CGROUP), so always
do a corresponding put in cgroup_css_set_put_fork().

Without this, a cset and its contained css structures will be
leaked for some fork failures.  The following script reproduces
the leak for a fork failure due to exceeding pids.max in the
pids controller.  A similar thing can happen if we jump to the
bad_fork_cancel_cgroup label in copy_process().

[ -z "$1" ] && echo "Usage: $0 pids-root" && exit 1
PID_ROOT=$1
CGROUP=$PID_ROOT/foo

[ -e $CGROUP ] && rmdir $CGROUP
mkdir $CGROUP
echo 5 > $CGROUP/pids.max
echo $$ > $CGROUP/cgroup.procs

fork_bomb()
{
	set -e
	for i in $(seq 10); do
		/bin/sleep 3600 &
	done
}

(fork_bomb) &
wait
echo $$ > $PID_ROOT/cgroup.procs
kill $(cat $CGROUP/cgroup.procs)
rmdir $CGROUP

Fixes: ef2c41cf38a7 ("clone3: allow spawning processes into cgroups")
Signed-off-by: John Sperbeck <jsperbeck@google.com>
---
 kernel/cgroup/cgroup.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)
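
The imbalance described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the model_* types and functions and leaked_refs() are hypothetical stand-ins invented for illustration, not kernel code, and the real functions also handle locking and the target cgroup, which are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct css_set and struct kernel_clone_args. */
struct model_cset {
	int refcount;
};

struct model_kargs {
	bool into_cgroup;		/* models kargs->flags & CLONE_INTO_CGROUP */
	struct model_cset *cset;	/* models kargs->cset */
};

/* Models a successful cgroup_css_set_fork(): a ref is always taken. */
static void model_set_fork(struct model_kargs *kargs, struct model_cset *cset)
{
	assert(cset != NULL);
	cset->refcount++;
	kargs->cset = cset;
}

/* Pre-patch shape: the put is nested inside the CLONE_INTO_CGROUP branch. */
static void model_put_fork_old(struct model_kargs *kargs)
{
	if (kargs->into_cgroup) {
		if (kargs->cset) {
			kargs->cset->refcount--;
			kargs->cset = NULL;
		}
	}
}

/* Post-patch shape: the put no longer depends on the flag. */
static void model_put_fork_new(struct model_kargs *kargs)
{
	if (kargs->cset) {
		kargs->cset->refcount--;
		kargs->cset = NULL;
	}
}

/* Returns how many refs remain after one simulated failed fork. */
int leaked_refs(bool into_cgroup, bool patched)
{
	struct model_cset cset = { .refcount = 0 };
	struct model_kargs kargs = { .into_cgroup = into_cgroup, .cset = NULL };

	model_set_fork(&kargs, &cset);
	if (patched)
		model_put_fork_new(&kargs);
	else
		model_put_fork_old(&kargs);

	return cset.refcount;
}
```

In this model, leaked_refs(false, false) returns 1, which corresponds to a failed plain fork on the unpatched code leaving a ref behind. The other three combinations return 0.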
  

Comments

Tejun Heo May 23, 2023, 12:57 a.m. UTC | #1
Applied to cgroup/for-6.4-fixes w/ stable cc'd.

Thanks.
  
Christian Brauner May 23, 2023, 9:58 a.m. UTC | #2
Reviewed-by: Christian Brauner <brauner@kernel.org>
  

Patch

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 625d7483951c..245cf62ce85a 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6486,19 +6486,18 @@  static int cgroup_css_set_fork(struct kernel_clone_args *kargs)
 static void cgroup_css_set_put_fork(struct kernel_clone_args *kargs)
 	__releases(&cgroup_threadgroup_rwsem) __releases(&cgroup_mutex)
 {
+	struct cgroup *cgrp = kargs->cgrp;
+	struct css_set *cset = kargs->cset;
+
 	cgroup_threadgroup_change_end(current);
 
-	if (kargs->flags & CLONE_INTO_CGROUP) {
-		struct cgroup *cgrp = kargs->cgrp;
-		struct css_set *cset = kargs->cset;
+	if (cset) {
+		put_css_set(cset);
+		kargs->cset = NULL;
+	}
 
+	if (kargs->flags & CLONE_INTO_CGROUP) {
 		cgroup_unlock();
-
-		if (cset) {
-			put_css_set(cset);
-			kargs->cset = NULL;
-		}
-
 		if (cgrp) {
 			cgroup_put(cgrp);
 			kargs->cgrp = NULL;