Message ID | 20230414125532.14958-3-wander@redhat.com |
---|---|
State | New |
Headers |
From: Wander Lairson Costa <wander@redhat.com> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Ian Rogers <irogers@google.com>, Adrian Hunter <adrian.hunter@intel.com>, Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>, Boqun Feng <boqun.feng@gmail.com>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, Valentin Schneider <vschneid@redhat.com>, "Eric W. Biederman" <ebiederm@xmission.com>, Wander Lairson Costa <wander@redhat.com>, Michael Ellerman <mpe@ellerman.id.au>, Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg@redhat.com>, Kefeng Wang <wangkefeng.wang@huawei.com>, "Liam R. Howlett" <Liam.Howlett@oracle.com>, Kees Cook <keescook@chromium.org>, Christian Brauner <brauner@kernel.org>, Andrei Vagin <avagin@gmail.com>, Shakeel Butt <shakeelb@google.com>, linux-kernel@vger.kernel.org (open list), linux-perf-users@vger.kernel.org (open list:PERFORMANCE EVENTS SUBSYSTEM) Cc: Hu Chunyu <chuhu@redhat.com>, Paul McKenney <paulmck@kernel.org>, Thomas Gleixner <tglx@linutronix.de> Subject: [PATCH v6 2/3] sched/task: Add the put_task_struct_atomic_safe() function Date: Fri, 14 Apr 2023 09:55:28 -0300 Message-Id: <20230414125532.14958-3-wander@redhat.com> In-Reply-To: <20230414125532.14958-1-wander@redhat.com> References: <20230414125532.14958-1-wander@redhat.com> |
Series | Introduce put_task_struct_atomic_sleep() |
Commit Message
Wander Lairson Costa
April 14, 2023, 12:55 p.m. UTC
Due to the possibility of indirectly acquiring sleeping locks, it is
unsafe to call put_task_struct() in atomic contexts when the kernel is
compiled with PREEMPT_RT.
To mitigate this issue, this commit introduces
put_task_struct_atomic_safe(), which schedules __put_task_struct()
through call_rcu() when PREEMPT_RT is enabled. While a workqueue would
be a more natural approach, we cannot allocate dynamic memory from
atomic context in PREEMPT_RT, making the code more complex.
This implementation ensures safe execution in atomic contexts and
avoids any potential issues that may arise from using the non-atomic
version.
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reported-by: Hu Chunyu <chuhu@redhat.com>
Cc: Paul McKenney <paulmck@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
---
include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++
kernel/fork.c | 8 ++++++++
2 files changed, 39 insertions(+)
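A minimal sketch of the kind of call site this helper is meant for is shown below. The structure, lock, and function names are hypothetical and not taken from this series; the point is only that the reference is dropped inside a region that stays atomic even on PREEMPT_RT, where plain put_task_struct() could end up acquiring sleeping locks through __put_task_struct().

```c
#include <linux/sched/task.h>
#include <linux/spinlock.h>

/* Hypothetical example structure, not part of this series. */
struct example_slot {
	raw_spinlock_t		lock;
	struct task_struct	*owner;	/* counted reference, may be NULL */
};

/*
 * Drop the owner reference while the raw spinlock is held. This region is
 * atomic even on PREEMPT_RT, so a plain put_task_struct() here could sleep
 * via __put_task_struct(); put_task_struct_atomic_safe() instead defers the
 * final free to process context through call_rcu() on PREEMPT_RT and falls
 * back to put_task_struct() on other configurations.
 */
static void example_clear_owner(struct example_slot *slot)
{
	struct task_struct *task;
	unsigned long flags;

	raw_spin_lock_irqsave(&slot->lock, flags);
	task = slot->owner;
	slot->owner = NULL;
	if (task)
		put_task_struct_atomic_safe(task);
	raw_spin_unlock_irqrestore(&slot->lock, flags);
}
```

On a non-PREEMPT_RT kernel the helper compiles down to an ordinary put_task_struct() call, so callers only pay for the RCU deferral when PREEMPT_RT is enabled.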
Comments
On 4/14/23 08:55, Wander Lairson Costa wrote: > Due to the possibility of indirectly acquiring sleeping locks, it is > unsafe to call put_task_struct() in atomic contexts when the kernel is > compiled with PREEMPT_RT. > > To mitigate this issue, this commit introduces > put_task_struct_atomic_safe(), which schedules __put_task_struct() > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > be a more natural approach, we cannot allocate dynamic memory from > atomic context in PREEMPT_RT, making the code more complex. > > This implementation ensures safe execution in atomic contexts and > avoids any potential issues that may arise from using the non-atomic > version. > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > Reported-by: Hu Chunyu <chuhu@redhat.com> > Cc: Paul McKenney <paulmck@kernel.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > kernel/fork.c | 8 ++++++++ > 2 files changed, 39 insertions(+) > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > index b597b97b1f8f..5c13b83d7008 100644 > --- a/include/linux/sched/task.h > +++ b/include/linux/sched/task.h > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > void put_task_struct_rcu_user(struct task_struct *task); > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > + > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > +{ > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > + /* > + * Decrement the refcount explicitly to avoid unnecessarily > + * calling call_rcu. > + */ > + if (refcount_dec_and_test(&task->usage)) > + /* > + * under PREEMPT_RT, we can't call put_task_struct > + * in atomic context because it will indirectly > + * acquire sleeping locks. > + * call_rcu() will schedule delayed_put_task_struct_rcu() delayed_put_task_struct_rcu()? > + * to be called in process context. > + * > + * __put_task_struct() is called called when "called called"? > + * refcount_dec_and_test(&t->usage) succeeds. > + * > + * This means that it can't "conflict" with > + * put_task_struct_rcu_user() which abuses ->rcu the same > + * way; rcu_users has a reference so task->usage can't be > + * zero after rcu_users 1 -> 0 transition. Note that put_task_struct_rcu_user() isn't the only user of task->rcu. delayed_free_task() in kernel/fork.c also uses it, though it is only called in the error case. Still you may need to take a look to make sure that there is no conflict. Cheers, Longman
On Mon, Apr 17, 2023 at 02:51:45PM -0400, Waiman Long wrote: > > On 4/14/23 08:55, Wander Lairson Costa wrote: > > Due to the possibility of indirectly acquiring sleeping locks, it is > > unsafe to call put_task_struct() in atomic contexts when the kernel is > > compiled with PREEMPT_RT. > > > > To mitigate this issue, this commit introduces > > put_task_struct_atomic_safe(), which schedules __put_task_struct() > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > > be a more natural approach, we cannot allocate dynamic memory from > > atomic context in PREEMPT_RT, making the code more complex. > > > > This implementation ensures safe execution in atomic contexts and > > avoids any potential issues that may arise from using the non-atomic > > version. > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > Reported-by: Hu Chunyu <chuhu@redhat.com> > > Cc: Paul McKenney <paulmck@kernel.org> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > --- > > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > > kernel/fork.c | 8 ++++++++ > > 2 files changed, 39 insertions(+) > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > > index b597b97b1f8f..5c13b83d7008 100644 > > --- a/include/linux/sched/task.h > > +++ b/include/linux/sched/task.h > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > void put_task_struct_rcu_user(struct task_struct *task); > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > > + > > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > > +{ > > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > > + /* > > + * Decrement the refcount explicitly to avoid unnecessarily > > + * calling call_rcu. > > + */ > > + if (refcount_dec_and_test(&task->usage)) > > + /* > > + * under PREEMPT_RT, we can't call put_task_struct > > + * in atomic context because it will indirectly > > + * acquire sleeping locks. > > + * call_rcu() will schedule delayed_put_task_struct_rcu() > delayed_put_task_struct_rcu()? > > + * to be called in process context. > > + * > > + * __put_task_struct() is called called when > "called called"? > > + * refcount_dec_and_test(&t->usage) succeeds. > > + * > > + * This means that it can't "conflict" with > > + * put_task_struct_rcu_user() which abuses ->rcu the same > > + * way; rcu_users has a reference so task->usage can't be > > + * zero after rcu_users 1 -> 0 transition. > > Note that put_task_struct_rcu_user() isn't the only user of task->rcu. > delayed_free_task() in kernel/fork.c also uses it, though it is only called > in the error case. Still you may need to take a look to make sure that there > is no conflict. > delayed_free_task() is called when a process fails to start. Therefore, AFAICT, there is no way it can conflict with put_task_struct(). > Cheers, > Longman >
On 4/18/23 10:18, Wander Lairson Costa wrote: > On Mon, Apr 17, 2023 at 02:51:45PM -0400, Waiman Long wrote: >> On 4/14/23 08:55, Wander Lairson Costa wrote: >>> Due to the possibility of indirectly acquiring sleeping locks, it is >>> unsafe to call put_task_struct() in atomic contexts when the kernel is >>> compiled with PREEMPT_RT. >>> >>> To mitigate this issue, this commit introduces >>> put_task_struct_atomic_safe(), which schedules __put_task_struct() >>> through call_rcu() when PREEMPT_RT is enabled. While a workqueue would >>> be a more natural approach, we cannot allocate dynamic memory from >>> atomic context in PREEMPT_RT, making the code more complex. >>> >>> This implementation ensures safe execution in atomic contexts and >>> avoids any potential issues that may arise from using the non-atomic >>> version. >>> >>> Signed-off-by: Wander Lairson Costa <wander@redhat.com> >>> Reported-by: Hu Chunyu <chuhu@redhat.com> >>> Cc: Paul McKenney <paulmck@kernel.org> >>> Cc: Thomas Gleixner <tglx@linutronix.de> >>> --- >>> include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ >>> kernel/fork.c | 8 ++++++++ >>> 2 files changed, 39 insertions(+) >>> >>> diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h >>> index b597b97b1f8f..5c13b83d7008 100644 >>> --- a/include/linux/sched/task.h >>> +++ b/include/linux/sched/task.h >>> @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) >>> void put_task_struct_rcu_user(struct task_struct *task); >>> +extern void __delayed_put_task_struct(struct rcu_head *rhp); >>> + >>> +static inline void put_task_struct_atomic_safe(struct task_struct *task) >>> +{ >>> + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { >>> + /* >>> + * Decrement the refcount explicitly to avoid unnecessarily >>> + * calling call_rcu. >>> + */ >>> + if (refcount_dec_and_test(&task->usage)) >>> + /* >>> + * under PREEMPT_RT, we can't call put_task_struct >>> + * in atomic context because it will indirectly >>> + * acquire sleeping locks. >>> + * call_rcu() will schedule delayed_put_task_struct_rcu() >> delayed_put_task_struct_rcu()? >>> + * to be called in process context. >>> + * >>> + * __put_task_struct() is called called when >> "called called"? >>> + * refcount_dec_and_test(&t->usage) succeeds. >>> + * >>> + * This means that it can't "conflict" with >>> + * put_task_struct_rcu_user() which abuses ->rcu the same >>> + * way; rcu_users has a reference so task->usage can't be >>> + * zero after rcu_users 1 -> 0 transition. >> Note that put_task_struct_rcu_user() isn't the only user of task->rcu. >> delayed_free_task() in kernel/fork.c also uses it, though it is only called >> in the error case. Still you may need to take a look to make sure that there >> is no conflict. >> > delayed_free_task() is called when a process fails to start. Therefore, AFAICT, > there is no way it can conflict with put_task_struct(). I think so too, but for completeness, you should document somewhere that it is a possible conflicting user. Cheers, Longman
On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote: > Due to the possibility of indirectly acquiring sleeping locks, it is > unsafe to call put_task_struct() in atomic contexts when the kernel is > compiled with PREEMPT_RT. > > To mitigate this issue, this commit introduces > put_task_struct_atomic_safe(), which schedules __put_task_struct() > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > be a more natural approach, we cannot allocate dynamic memory from > atomic context in PREEMPT_RT, making the code more complex. > > This implementation ensures safe execution in atomic contexts and > avoids any potential issues that may arise from using the non-atomic > version. > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > Reported-by: Hu Chunyu <chuhu@redhat.com> > Cc: Paul McKenney <paulmck@kernel.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > --- > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > kernel/fork.c | 8 ++++++++ > 2 files changed, 39 insertions(+) > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > index b597b97b1f8f..5c13b83d7008 100644 > --- a/include/linux/sched/task.h > +++ b/include/linux/sched/task.h > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > void put_task_struct_rcu_user(struct task_struct *task); > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > + > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > +{ > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > + /* > + * Decrement the refcount explicitly to avoid unnecessarily > + * calling call_rcu. > + */ > + if (refcount_dec_and_test(&task->usage)) > + /* > + * under PREEMPT_RT, we can't call put_task_struct > + * in atomic context because it will indirectly > + * acquire sleeping locks. > + * call_rcu() will schedule delayed_put_task_struct_rcu() > + * to be called in process context. > + * > + * __put_task_struct() is called called when > + * refcount_dec_and_test(&t->usage) succeeds. > + * > + * This means that it can't "conflict" with > + * put_task_struct_rcu_user() which abuses ->rcu the same > + * way; rcu_users has a reference so task->usage can't be > + * zero after rcu_users 1 -> 0 transition. > + */ > + call_rcu(&task->rcu, __delayed_put_task_struct); This will invoke __delayed_put_task_struct() with softirqs disabled. Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT? Thanx, Paul > + } else { > + put_task_struct(task); > + } > +} > + > /* Free all architecture-specific resources held by a thread. */ > void release_thread(struct task_struct *dead_task); > > diff --git a/kernel/fork.c b/kernel/fork.c > index 0c92f224c68c..9884794fe4b8 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk) > } > EXPORT_SYMBOL_GPL(__put_task_struct); > > +void __delayed_put_task_struct(struct rcu_head *rhp) > +{ > + struct task_struct *task = container_of(rhp, struct task_struct, rcu); > + > + __put_task_struct(task); > +} > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct); > + > void __init __weak arch_task_cache_init(void) { } > > /* > -- > 2.39.2 >
On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote: > > Due to the possibility of indirectly acquiring sleeping locks, it is > > unsafe to call put_task_struct() in atomic contexts when the kernel is > > compiled with PREEMPT_RT. > > > > To mitigate this issue, this commit introduces > > put_task_struct_atomic_safe(), which schedules __put_task_struct() > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > > be a more natural approach, we cannot allocate dynamic memory from > > atomic context in PREEMPT_RT, making the code more complex. > > > > This implementation ensures safe execution in atomic contexts and > > avoids any potential issues that may arise from using the non-atomic > > version. > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > Reported-by: Hu Chunyu <chuhu@redhat.com> > > Cc: Paul McKenney <paulmck@kernel.org> > > Cc: Thomas Gleixner <tglx@linutronix.de> > > --- > > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > > kernel/fork.c | 8 ++++++++ > > 2 files changed, 39 insertions(+) > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > > index b597b97b1f8f..5c13b83d7008 100644 > > --- a/include/linux/sched/task.h > > +++ b/include/linux/sched/task.h > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > > > void put_task_struct_rcu_user(struct task_struct *task); > > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > > + > > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > > +{ > > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > > + /* > > + * Decrement the refcount explicitly to avoid unnecessarily > > + * calling call_rcu. > > + */ > > + if (refcount_dec_and_test(&task->usage)) > > + /* > > + * under PREEMPT_RT, we can't call put_task_struct > > + * in atomic context because it will indirectly > > + * acquire sleeping locks. > > + * call_rcu() will schedule delayed_put_task_struct_rcu() > > + * to be called in process context. > > + * > > + * __put_task_struct() is called called when > > + * refcount_dec_and_test(&t->usage) succeeds. > > + * > > + * This means that it can't "conflict" with > > + * put_task_struct_rcu_user() which abuses ->rcu the same > > + * way; rcu_users has a reference so task->usage can't be > > + * zero after rcu_users 1 -> 0 transition. > > + */ > > + call_rcu(&task->rcu, __delayed_put_task_struct); > > This will invoke __delayed_put_task_struct() with softirqs disabled. > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT? > softirqs execute in thread context in PREEMPT_RT. We are good here. > Thanx, Paul > > > + } else { > > + put_task_struct(task); > > + } > > +} > > + > > /* Free all architecture-specific resources held by a thread. 
*/ > > void release_thread(struct task_struct *dead_task); > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > index 0c92f224c68c..9884794fe4b8 100644 > > --- a/kernel/fork.c > > +++ b/kernel/fork.c > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk) > > } > > EXPORT_SYMBOL_GPL(__put_task_struct); > > > > +void __delayed_put_task_struct(struct rcu_head *rhp) > > +{ > > + struct task_struct *task = container_of(rhp, struct task_struct, rcu); > > + > > + __put_task_struct(task); > > +} > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct); > > + > > void __init __weak arch_task_cache_init(void) { } > > > > /* > > -- > > 2.39.2 > > >
On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote: > On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote: > > > Due to the possibility of indirectly acquiring sleeping locks, it is > > > unsafe to call put_task_struct() in atomic contexts when the kernel is > > > compiled with PREEMPT_RT. > > > > > > To mitigate this issue, this commit introduces > > > put_task_struct_atomic_safe(), which schedules __put_task_struct() > > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > > > be a more natural approach, we cannot allocate dynamic memory from > > > atomic context in PREEMPT_RT, making the code more complex. > > > > > > This implementation ensures safe execution in atomic contexts and > > > avoids any potential issues that may arise from using the non-atomic > > > version. > > > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > > Reported-by: Hu Chunyu <chuhu@redhat.com> > > > Cc: Paul McKenney <paulmck@kernel.org> > > > Cc: Thomas Gleixner <tglx@linutronix.de> > > > --- > > > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > > > kernel/fork.c | 8 ++++++++ > > > 2 files changed, 39 insertions(+) > > > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > > > index b597b97b1f8f..5c13b83d7008 100644 > > > --- a/include/linux/sched/task.h > > > +++ b/include/linux/sched/task.h > > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > > > > > void put_task_struct_rcu_user(struct task_struct *task); > > > > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > > > + > > > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > > > +{ > > > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > > > + /* > > > + * Decrement the refcount explicitly to avoid unnecessarily > > > + * calling call_rcu. > > > + */ > > > + if (refcount_dec_and_test(&task->usage)) > > > + /* > > > + * under PREEMPT_RT, we can't call put_task_struct > > > + * in atomic context because it will indirectly > > > + * acquire sleeping locks. > > > + * call_rcu() will schedule delayed_put_task_struct_rcu() > > > + * to be called in process context. > > > + * > > > + * __put_task_struct() is called called when > > > + * refcount_dec_and_test(&t->usage) succeeds. > > > + * > > > + * This means that it can't "conflict" with > > > + * put_task_struct_rcu_user() which abuses ->rcu the same > > > + * way; rcu_users has a reference so task->usage can't be > > > + * zero after rcu_users 1 -> 0 transition. > > > + */ > > > + call_rcu(&task->rcu, __delayed_put_task_struct); > > > > This will invoke __delayed_put_task_struct() with softirqs disabled. > > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT? > > softirqs execute in thread context in PREEMPT_RT. We are good here. So the sleeping lock is a spinlock rather than (say) a mutex? Thanx, Paul > > > + } else { > > > + put_task_struct(task); > > > + } > > > +} > > > + > > > /* Free all architecture-specific resources held by a thread. 
*/ > > > void release_thread(struct task_struct *dead_task); > > > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > > index 0c92f224c68c..9884794fe4b8 100644 > > > --- a/kernel/fork.c > > > +++ b/kernel/fork.c > > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk) > > > } > > > EXPORT_SYMBOL_GPL(__put_task_struct); > > > > > > +void __delayed_put_task_struct(struct rcu_head *rhp) > > > +{ > > > + struct task_struct *task = container_of(rhp, struct task_struct, rcu); > > > + > > > + __put_task_struct(task); > > > +} > > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct); > > > + > > > void __init __weak arch_task_cache_init(void) { } > > > > > > /* > > > -- > > > 2.39.2 > > > > > >
On Mon, Apr 24, 2023 at 3:52 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote: > > On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote: > > > > Due to the possibility of indirectly acquiring sleeping locks, it is > > > > unsafe to call put_task_struct() in atomic contexts when the kernel is > > > > compiled with PREEMPT_RT. > > > > > > > > To mitigate this issue, this commit introduces > > > > put_task_struct_atomic_safe(), which schedules __put_task_struct() > > > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > > > > be a more natural approach, we cannot allocate dynamic memory from > > > > atomic context in PREEMPT_RT, making the code more complex. > > > > > > > > This implementation ensures safe execution in atomic contexts and > > > > avoids any potential issues that may arise from using the non-atomic > > > > version. > > > > > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > > > Reported-by: Hu Chunyu <chuhu@redhat.com> > > > > Cc: Paul McKenney <paulmck@kernel.org> > > > > Cc: Thomas Gleixner <tglx@linutronix.de> > > > > --- > > > > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > > > > kernel/fork.c | 8 ++++++++ > > > > 2 files changed, 39 insertions(+) > > > > > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > > > > index b597b97b1f8f..5c13b83d7008 100644 > > > > --- a/include/linux/sched/task.h > > > > +++ b/include/linux/sched/task.h > > > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > > > > > > > void put_task_struct_rcu_user(struct task_struct *task); > > > > > > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > > > > + > > > > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > > > > +{ > > > > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > > > > + /* > > > > + * Decrement the refcount explicitly to avoid unnecessarily > > > > + * calling call_rcu. > > > > + */ > > > > + if (refcount_dec_and_test(&task->usage)) > > > > + /* > > > > + * under PREEMPT_RT, we can't call put_task_struct > > > > + * in atomic context because it will indirectly > > > > + * acquire sleeping locks. > > > > + * call_rcu() will schedule delayed_put_task_struct_rcu() > > > > + * to be called in process context. > > > > + * > > > > + * __put_task_struct() is called called when > > > > + * refcount_dec_and_test(&t->usage) succeeds. > > > > + * > > > > + * This means that it can't "conflict" with > > > > + * put_task_struct_rcu_user() which abuses ->rcu the same > > > > + * way; rcu_users has a reference so task->usage can't be > > > > + * zero after rcu_users 1 -> 0 transition. > > > > + */ > > > > + call_rcu(&task->rcu, __delayed_put_task_struct); > > > > > > This will invoke __delayed_put_task_struct() with softirqs disabled. > > > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT? > > > > softirqs execute in thread context in PREEMPT_RT. We are good here. > > So the sleeping lock is a spinlock rather than (say) a mutex? > Yes, under PREEMPT_RT, spinlocks are implemented in terms of rtmutex. > Thanx, Paul > > > > > + } else { > > > > + put_task_struct(task); > > > > + } > > > > +} > > > > + > > > > /* Free all architecture-specific resources held by a thread. 
*/ > > > > void release_thread(struct task_struct *dead_task); > > > > > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > > > index 0c92f224c68c..9884794fe4b8 100644 > > > > --- a/kernel/fork.c > > > > +++ b/kernel/fork.c > > > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk) > > > > } > > > > EXPORT_SYMBOL_GPL(__put_task_struct); > > > > > > > > +void __delayed_put_task_struct(struct rcu_head *rhp) > > > > +{ > > > > + struct task_struct *task = container_of(rhp, struct task_struct, rcu); > > > > + > > > > + __put_task_struct(task); > > > > +} > > > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct); > > > > + > > > > void __init __weak arch_task_cache_init(void) { } > > > > > > > > /* > > > > -- > > > > 2.39.2 > > > > > > > > > >
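Wander's point above is that, with PREEMPT_RT, spinlock_t is backed by an rtmutex, so taking an ordinary spinlock can sleep. The sketch below is a condensed paraphrase of that substitution for illustration, not the literal kernel definitions:

```c
/*
 * Simplified illustration of the PREEMPT_RT lock substitution discussed
 * above (paraphrased, not the exact kernel source): spinlock_t becomes a
 * wrapper around an rtmutex, so spin_lock() may block and is unusable from
 * hard-atomic context, while raw_spinlock_t keeps the classic busy-waiting
 * behaviour on both configurations.
 */
#ifdef CONFIG_PREEMPT_RT
typedef struct spinlock {
	struct rt_mutex_base	lock;	/* a sleeping, PI-aware lock */
} spinlock_t;

#define spin_lock(lock)		rt_spin_lock(lock)	/* may block */
#define spin_unlock(lock)	rt_spin_unlock(lock)
#else
typedef struct spinlock {
	struct raw_spinlock	rlock;	/* busy-waits, never sleeps */
} spinlock_t;
#endif
```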
On Mon, 24 Apr 2023 11:52:52 -0700
"Paul E. McKenney" <paulmck@kernel.org> wrote:

> > softirqs execute in thread context in PREEMPT_RT. We are good here.
>
> So the sleeping lock is a spinlock rather than (say) a mutex?

local_bh_disable() on RT basically turns into:

	local_lock(&softirq_ctrl.lock);
	rcu_read_lock();

Which grabs a per CPU mutex that is taken by softirqs, and also calls
rcu_read_lock(). This allows bottom halves to still run as threads but
maintain the same synchronization as they do on mainline.

-- Steve
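Making Steve's description a little more concrete, the RT behaviour has roughly the following shape. This is a simplified sketch based on his summary; the field and helper names are illustrative rather than the exact kernel code:

```c
#include <linux/local_lock.h>
#include <linux/percpu.h>
#include <linux/rcupdate.h>

/* Per-CPU softirq serialization state, as described above (simplified). */
struct softirq_ctrl {
	local_lock_t	lock;
	int		cnt;
};

static DEFINE_PER_CPU(struct softirq_ctrl, softirq_ctrl) = {
	.lock = INIT_LOCAL_LOCK(softirq_ctrl.lock),
};

/*
 * On PREEMPT_RT, disabling bottom halves boils down to taking the per-CPU
 * local lock shared with the softirq threads and entering an RCU read-side
 * critical section; softirqs keep running in preemptible thread context
 * while the mainline serialization guarantees are preserved.
 */
static void rt_bh_disable_sketch(void)
{
	local_lock(&softirq_ctrl.lock);
	rcu_read_lock();
}

static void rt_bh_enable_sketch(void)
{
	rcu_read_unlock();
	local_unlock(&softirq_ctrl.lock);
}
```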
On Mon, Apr 24, 2023 at 05:34:29PM -0300, Wander Lairson Costa wrote: > On Mon, Apr 24, 2023 at 3:52 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Mon, Apr 24, 2023 at 03:43:09PM -0300, Wander Lairson Costa wrote: > > > On Mon, Apr 24, 2023 at 3:09 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Fri, Apr 14, 2023 at 09:55:28AM -0300, Wander Lairson Costa wrote: > > > > > Due to the possibility of indirectly acquiring sleeping locks, it is > > > > > unsafe to call put_task_struct() in atomic contexts when the kernel is > > > > > compiled with PREEMPT_RT. > > > > > > > > > > To mitigate this issue, this commit introduces > > > > > put_task_struct_atomic_safe(), which schedules __put_task_struct() > > > > > through call_rcu() when PREEMPT_RT is enabled. While a workqueue would > > > > > be a more natural approach, we cannot allocate dynamic memory from > > > > > atomic context in PREEMPT_RT, making the code more complex. > > > > > > > > > > This implementation ensures safe execution in atomic contexts and > > > > > avoids any potential issues that may arise from using the non-atomic > > > > > version. > > > > > > > > > > Signed-off-by: Wander Lairson Costa <wander@redhat.com> > > > > > Reported-by: Hu Chunyu <chuhu@redhat.com> > > > > > Cc: Paul McKenney <paulmck@kernel.org> > > > > > Cc: Thomas Gleixner <tglx@linutronix.de> > > > > > --- > > > > > include/linux/sched/task.h | 31 +++++++++++++++++++++++++++++++ > > > > > kernel/fork.c | 8 ++++++++ > > > > > 2 files changed, 39 insertions(+) > > > > > > > > > > diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h > > > > > index b597b97b1f8f..5c13b83d7008 100644 > > > > > --- a/include/linux/sched/task.h > > > > > +++ b/include/linux/sched/task.h > > > > > @@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr) > > > > > > > > > > void put_task_struct_rcu_user(struct task_struct *task); > > > > > > > > > > +extern void __delayed_put_task_struct(struct rcu_head *rhp); > > > > > + > > > > > +static inline void put_task_struct_atomic_safe(struct task_struct *task) > > > > > +{ > > > > > + if (IS_ENABLED(CONFIG_PREEMPT_RT)) { > > > > > + /* > > > > > + * Decrement the refcount explicitly to avoid unnecessarily > > > > > + * calling call_rcu. > > > > > + */ > > > > > + if (refcount_dec_and_test(&task->usage)) > > > > > + /* > > > > > + * under PREEMPT_RT, we can't call put_task_struct > > > > > + * in atomic context because it will indirectly > > > > > + * acquire sleeping locks. > > > > > + * call_rcu() will schedule delayed_put_task_struct_rcu() > > > > > + * to be called in process context. > > > > > + * > > > > > + * __put_task_struct() is called called when > > > > > + * refcount_dec_and_test(&t->usage) succeeds. > > > > > + * > > > > > + * This means that it can't "conflict" with > > > > > + * put_task_struct_rcu_user() which abuses ->rcu the same > > > > > + * way; rcu_users has a reference so task->usage can't be > > > > > + * zero after rcu_users 1 -> 0 transition. > > > > > + */ > > > > > + call_rcu(&task->rcu, __delayed_put_task_struct); > > > > > > > > This will invoke __delayed_put_task_struct() with softirqs disabled. > > > > Or do softirq-disabled contexts count as non-atomic in PREEMPT_RT? > > > > > > softirqs execute in thread context in PREEMPT_RT. We are good here. > > > > So the sleeping lock is a spinlock rather than (say) a mutex? > > Yes, under PREEMPT_RT, spinlocks are implemented in terms of rtmutex. Reviewed-by: Paul E. 
McKenney <paulmck@kernel.org> Thanx, Paul > > > > > + } else { > > > > > + put_task_struct(task); > > > > > + } > > > > > +} > > > > > + > > > > > /* Free all architecture-specific resources held by a thread. */ > > > > > void release_thread(struct task_struct *dead_task); > > > > > > > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > > > > index 0c92f224c68c..9884794fe4b8 100644 > > > > > --- a/kernel/fork.c > > > > > +++ b/kernel/fork.c > > > > > @@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk) > > > > > } > > > > > EXPORT_SYMBOL_GPL(__put_task_struct); > > > > > > > > > > +void __delayed_put_task_struct(struct rcu_head *rhp) > > > > > +{ > > > > > + struct task_struct *task = container_of(rhp, struct task_struct, rcu); > > > > > + > > > > > + __put_task_struct(task); > > > > > +} > > > > > +EXPORT_SYMBOL_GPL(__delayed_put_task_struct); > > > > > + > > > > > void __init __weak arch_task_cache_init(void) { } > > > > > > > > > > /* > > > > > -- > > > > > 2.39.2 > > > > > > > > > > > > > > >
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index b597b97b1f8f..5c13b83d7008 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -141,6 +141,37 @@ static inline void put_task_struct_many(struct task_struct *t, int nr)
 
 void put_task_struct_rcu_user(struct task_struct *task);
 
+extern void __delayed_put_task_struct(struct rcu_head *rhp);
+
+static inline void put_task_struct_atomic_safe(struct task_struct *task)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+		/*
+		 * Decrement the refcount explicitly to avoid unnecessarily
+		 * calling call_rcu.
+		 */
+		if (refcount_dec_and_test(&task->usage))
+			/*
+			 * under PREEMPT_RT, we can't call put_task_struct
+			 * in atomic context because it will indirectly
+			 * acquire sleeping locks.
+			 * call_rcu() will schedule delayed_put_task_struct_rcu()
+			 * to be called in process context.
+			 *
+			 * __put_task_struct() is called called when
+			 * refcount_dec_and_test(&t->usage) succeeds.
+			 *
+			 * This means that it can't "conflict" with
+			 * put_task_struct_rcu_user() which abuses ->rcu the same
+			 * way; rcu_users has a reference so task->usage can't be
+			 * zero after rcu_users 1 -> 0 transition.
+			 */
+			call_rcu(&task->rcu, __delayed_put_task_struct);
+	} else {
+		put_task_struct(task);
+	}
+}
+
 /* Free all architecture-specific resources held by a thread. */
 void release_thread(struct task_struct *dead_task);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0c92f224c68c..9884794fe4b8 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -854,6 +854,14 @@ void __put_task_struct(struct task_struct *tsk)
 }
 EXPORT_SYMBOL_GPL(__put_task_struct);
 
+void __delayed_put_task_struct(struct rcu_head *rhp)
+{
+	struct task_struct *task = container_of(rhp, struct task_struct, rcu);
+
+	__put_task_struct(task);
+}
+EXPORT_SYMBOL_GPL(__delayed_put_task_struct);
+
 void __init __weak arch_task_cache_init(void) { }
 
 /*