cgroup: fix missing cpus_read_{lock,unlock}() in cgroup_transfer_tasks()

Message ID 20230517074545.2045035-1-qi.zheng@linux.dev
State New

Commit Message

Qi Zheng May 17, 2023, 7:45 a.m. UTC
  From: Qi Zheng <zhengqi.arch@bytedance.com>

Commit 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock()
deadlock") fixed the deadlock between cgroup_threadgroup_rwsem and
cpus_read_lock() by introducing cgroup_attach_{lock,unlock}() and removing
cpus_read_{lock,unlock}() from cpuset_attach(). But cgroup_transfer_tasks()
was missed: it still takes only cgroup_threadgroup_rwsem, so cpuset_attach()
runs without cpus_read_lock() held and triggers the following warning:

 WARNING: CPU: 0 PID: 589 at kernel/cpu.c:526 lockdep_assert_cpus_held+0x32/0x40
 CPU: 0 PID: 589 Comm: kworker/1:4 Not tainted 6.4.0-rc2-next-20230517 #50
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Workqueue: events cpuset_hotplug_workfn
 RIP: 0010:lockdep_assert_cpus_held+0x32/0x40
 <...>
 Call Trace:
  <TASK>
  cpuset_attach+0x40/0x240
  cgroup_migrate_execute+0x452/0x5e0
  ? _raw_spin_unlock_irq+0x28/0x40
  cgroup_transfer_tasks+0x1f3/0x360
  ? find_held_lock+0x32/0x90
  ? cpuset_hotplug_workfn+0xc81/0xed0
  cpuset_hotplug_workfn+0xcb1/0xed0
  ? process_one_work+0x248/0x5b0
  process_one_work+0x2b9/0x5b0
  worker_thread+0x56/0x3b0
  ? process_one_work+0x5b0/0x5b0
  kthread+0xf1/0x120
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>

Fix this by using the cgroup_attach_{lock,unlock}() helpers in
cgroup_transfer_tasks().

Fixes: 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
Reported-by: Zhao Gongyi <zhaogongyi@bytedance.com>
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
 kernel/cgroup/cgroup-v1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
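
For readers unfamiliar with the helper, here is a minimal userspace sketch of
the locking change. It is a model, not kernel code: every model_* name is
hypothetical, the "locks" only record whether they are held, and the ordering
inside model_cgroup_attach_lock() (CPU hotplug read lock first, then the
rwsem) is an assumption about the helper that this patch does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel locks; they only track "held" state. */
static int cpus_read_depth;          /* models cpus_read_lock() nesting */
static bool threadgroup_write_held;  /* models cgroup_threadgroup_rwsem */

static void model_cpus_read_lock(void)
{
	cpus_read_depth++;
}

static void model_cpus_read_unlock(void)
{
	assert(cpus_read_depth > 0);
	cpus_read_depth--;
}

static void model_percpu_down_write(void)
{
	assert(!threadgroup_write_held);
	threadgroup_write_held = true;
}

static void model_percpu_up_write(void)
{
	assert(threadgroup_write_held);
	threadgroup_write_held = false;
}

/* Assumed shape of cgroup_attach_lock(): hotplug read lock, then rwsem. */
static void model_cgroup_attach_lock(bool lock_threadgroup)
{
	model_cpus_read_lock();
	if (lock_threadgroup)
		model_percpu_down_write();
}

/* Release in the reverse order of acquisition. */
static void model_cgroup_attach_unlock(bool lock_threadgroup)
{
	if (lock_threadgroup)
		model_percpu_up_write();
	model_cpus_read_unlock();
}

/* Models the lockdep_assert_cpus_held() check reached from cpuset_attach(). */
static bool model_cpuset_attach_ok(void)
{
	return cpus_read_depth > 0;
}

/* Before the patch: only the rwsem is taken, so the check fails. */
static bool model_transfer_old(void)
{
	bool ok;

	model_percpu_down_write();
	ok = model_cpuset_attach_ok();
	model_percpu_up_write();
	return ok;
}

/* After the patch: the helper also holds the hotplug read lock. */
static bool model_transfer_new(void)
{
	bool ok;

	model_cgroup_attach_lock(true);
	ok = model_cpuset_attach_ok();
	model_cgroup_attach_unlock(true);
	return ok;
}
```

In this model, model_transfer_old() returns false (the analogue of the
warning above) and model_transfer_new() returns true, with both locks
released again on return.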
  

Comments

Muchun Song May 17, 2023, 7:51 a.m. UTC | #1
> On May 17, 2023, at 15:45, Qi Zheng <qi.zheng@linux.dev> wrote:
> 
> From: Qi Zheng <zhengqi.arch@bytedance.com>
> 
> The commit 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock()
> deadlock") fixed the deadlock between cgroup_threadgroup_rwsem and
> cpus_read_lock() by introducing cgroup_attach_{lock,unlock}() and removing
> cpus_read_{lock,unlock}() from cpuset_attach(). But cgroup_transfer_tasks()
> was missed and not handled, which will cause the following warning:
> 
> WARNING: CPU: 0 PID: 589 at kernel/cpu.c:526 lockdep_assert_cpus_held+0x32/0x40
> CPU: 0 PID: 589 Comm: kworker/1:4 Not tainted 6.4.0-rc2-next-20230517 #50
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> Workqueue: events cpuset_hotplug_workfn
> RIP: 0010:lockdep_assert_cpus_held+0x32/0x40
> <...>
> Call Trace:
>  <TASK>
>  cpuset_attach+0x40/0x240
>  cgroup_migrate_execute+0x452/0x5e0
>  ? _raw_spin_unlock_irq+0x28/0x40
>  cgroup_transfer_tasks+0x1f3/0x360
>  ? find_held_lock+0x32/0x90
>  ? cpuset_hotplug_workfn+0xc81/0xed0
>  cpuset_hotplug_workfn+0xcb1/0xed0
>  ? process_one_work+0x248/0x5b0
>  process_one_work+0x2b9/0x5b0
>  worker_thread+0x56/0x3b0
>  ? process_one_work+0x5b0/0x5b0
>  kthread+0xf1/0x120
>  ? kthread_complete_and_exit+0x20/0x20
>  ret_from_fork+0x1f/0x30
>  </TASK>
> 
> So just use the cgroup_attach_{lock,unlock}() helper to fix it.
> 
> Fixes: 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
> Reported-by: Zhao Gongyi <zhaogongyi@bytedance.com>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Acked-by: Muchun Song <songmuchun@bytedance.com>

Thanks.
  
Qi Zheng May 22, 2023, 2:23 a.m. UTC | #2
On 2023/5/17 15:51, Muchun Song wrote:
> 
> 
>> On May 17, 2023, at 15:45, Qi Zheng <qi.zheng@linux.dev> wrote:
>>
>> From: Qi Zheng <zhengqi.arch@bytedance.com>
>>
>> The commit 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock()
>> deadlock") fixed the deadlock between cgroup_threadgroup_rwsem and
>> cpus_read_lock() by introducing cgroup_attach_{lock,unlock}() and removing
>> cpus_read_{lock,unlock}() from cpuset_attach(). But cgroup_transfer_tasks()
>> was missed and not handled, which will cause the following warning:
>>
>> WARNING: CPU: 0 PID: 589 at kernel/cpu.c:526 lockdep_assert_cpus_held+0x32/0x40
>> CPU: 0 PID: 589 Comm: kworker/1:4 Not tainted 6.4.0-rc2-next-20230517 #50
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
>> Workqueue: events cpuset_hotplug_workfn
>> RIP: 0010:lockdep_assert_cpus_held+0x32/0x40
>> <...>
>> Call Trace:
>>   <TASK>
>>   cpuset_attach+0x40/0x240
>>   cgroup_migrate_execute+0x452/0x5e0
>>   ? _raw_spin_unlock_irq+0x28/0x40
>>   cgroup_transfer_tasks+0x1f3/0x360
>>   ? find_held_lock+0x32/0x90
>>   ? cpuset_hotplug_workfn+0xc81/0xed0
>>   cpuset_hotplug_workfn+0xcb1/0xed0
>>   ? process_one_work+0x248/0x5b0
>>   process_one_work+0x2b9/0x5b0
>>   worker_thread+0x56/0x3b0
>>   ? process_one_work+0x5b0/0x5b0
>>   kthread+0xf1/0x120
>>   ? kthread_complete_and_exit+0x20/0x20
>>   ret_from_fork+0x1f/0x30
>>   </TASK>
>>
>> So just use the cgroup_attach_{lock,unlock}() helper to fix it.
>>
>> Fixes: 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
>> Reported-by: Zhao Gongyi <zhaogongyi@bytedance.com>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> 
> Acked-by: Muchun Song <songmuchun@bytedance.com>

Thanks. Hi Tejun, can this patch be applied?

> 
> Thanks.
>
  
Tejun Heo May 22, 2023, 6:45 p.m. UTC | #3
On Wed, May 17, 2023 at 07:45:45AM +0000, Qi Zheng wrote:
> From: Qi Zheng <zhengqi.arch@bytedance.com>
> 
> The commit 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock()
> deadlock") fixed the deadlock between cgroup_threadgroup_rwsem and
> cpus_read_lock() by introducing cgroup_attach_{lock,unlock}() and removing
> cpus_read_{lock,unlock}() from cpuset_attach(). But cgroup_transfer_tasks()
> was missed and not handled, which will cause the following warning:
> 
>  WARNING: CPU: 0 PID: 589 at kernel/cpu.c:526 lockdep_assert_cpus_held+0x32/0x40
>  CPU: 0 PID: 589 Comm: kworker/1:4 Not tainted 6.4.0-rc2-next-20230517 #50
>  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
>  Workqueue: events cpuset_hotplug_workfn
>  RIP: 0010:lockdep_assert_cpus_held+0x32/0x40
>  <...>
>  Call Trace:
>   <TASK>
>   cpuset_attach+0x40/0x240
>   cgroup_migrate_execute+0x452/0x5e0
>   ? _raw_spin_unlock_irq+0x28/0x40
>   cgroup_transfer_tasks+0x1f3/0x360
>   ? find_held_lock+0x32/0x90
>   ? cpuset_hotplug_workfn+0xc81/0xed0
>   cpuset_hotplug_workfn+0xcb1/0xed0
>   ? process_one_work+0x248/0x5b0
>   process_one_work+0x2b9/0x5b0
>   worker_thread+0x56/0x3b0
>   ? process_one_work+0x5b0/0x5b0
>   kthread+0xf1/0x120
>   ? kthread_complete_and_exit+0x20/0x20
>   ret_from_fork+0x1f/0x30
>   </TASK>
> 
> So just use the cgroup_attach_{lock,unlock}() helper to fix it.
> 
> Fixes: 4f7e7236435c ("cgroup: Fix threadgroup_rwsem <-> cpus_read_lock() deadlock")
> Reported-by: Zhao Gongyi <zhaogongyi@bytedance.com>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Applied to cgroup/for-6.4-fixes w/ Fixes tag updated to 05c7b7a92cc8 (the
commit that 4f7e7236435c fixes) and stable tag added.

Thanks.
  

Patch

diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index aeef06c465ef..5407241dbb45 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -108,7 +108,7 @@  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 
 	cgroup_lock();
 
-	percpu_down_write(&cgroup_threadgroup_rwsem);
+	cgroup_attach_lock(true);
 
 	/* all tasks in @from are being moved, all csets are source */
 	spin_lock_irq(&css_set_lock);
@@ -144,7 +144,7 @@  int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 	} while (task && !ret);
 out_err:
 	cgroup_migrate_finish(&mgctx);
-	percpu_up_write(&cgroup_threadgroup_rwsem);
+	cgroup_attach_unlock(true);
 	cgroup_unlock();
 	return ret;
 }