Message ID | 20240217012745.3446231-4-boqun.feng@gmail.com |
---|---|
State | New |
Headers |
From: Boqun Feng <boqun.feng@gmail.com>
To: linux-kernel@vger.kernel.org, rcu@vger.kernel.org
Cc: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>, "Paul E. McKenney" <paulmck@kernel.org>, Chen Zhongjin <chenzhongjin@huawei.com>, Yang Jihong <yangjihong1@huawei.com>, Boqun Feng <boqun.feng@gmail.com>, Frederic Weisbecker <frederic@kernel.org>, Neeraj Upadhyay <quic_neeraju@quicinc.com>, Joel Fernandes <joel@joelfernandes.org>, Josh Triplett <josh@joshtriplett.org>, Steven Rostedt <rostedt@goodmis.org>, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, Lai Jiangshan <jiangshanlai@gmail.com>, Zqiang <qiang.zhang1211@gmail.com>, Andrew Morton <akpm@linux-foundation.org>, Kent Overstreet <kent.overstreet@linux.dev>, Oleg Nesterov <oleg@redhat.com>, Heiko Carstens <hca@linux.ibm.com>, Christian Brauner <brauner@kernel.org>, Suren Baghdasaryan <surenb@google.com>, "Michael S. Tsirkin" <mst@redhat.com>, Mike Christie <michael.christie@oracle.com>, Mateusz Guzik <mjguzik@gmail.com>, Nicholas Piggin <npiggin@gmail.com>, Peng Zhang <zhangpeng.00@bytedance.com>
Subject: [PATCH v2 3/6] rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks
Date: Fri, 16 Feb 2024 17:27:38 -0800
Message-ID: <20240217012745.3446231-4-boqun.feng@gmail.com>
In-Reply-To: <20240217012745.3446231-1-boqun.feng@gmail.com>
References: <20240217012745.3446231-1-boqun.feng@gmail.com> |
Series | RCU tasks fixes for v6.9 |
Commit Message
Boqun Feng
Feb. 17, 2024, 1:27 a.m. UTC
From: "Paul E. McKenney" <paulmck@kernel.org> Holding a mutex across synchronize_rcu_tasks() and acquiring that same mutex in code called from do_exit() after its call to exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop() results in deadlock. This is by design, because tasks that are far enough into do_exit() are no longer present on the tasks list, making it a bit difficult for RCU Tasks to find them, let alone wait on them to do a voluntary context switch. However, such deadlocks are becoming more frequent. In addition, lockdep currently does not detect such deadlocks and they can be difficult to reproduce. In addition, if a task voluntarily context switches during that time (for example, if it blocks acquiring a mutex), then this task is in an RCU Tasks quiescent state. And with some adjustments, RCU Tasks could just as well take advantage of that fact. This commit therefore initializes the data structures that will be needed to rely on these quiescent states and to eliminate these deadlocks. Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@huawei.com/ Reported-by: Chen Zhongjin <chenzhongjin@huawei.com> Reported-by: Yang Jihong <yangjihong1@huawei.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Chen Zhongjin <chenzhongjin@huawei.com> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> --- init/init_task.c | 1 + kernel/fork.c | 1 + kernel/rcu/tasks.h | 2 ++ 3 files changed, 4 insertions(+)
Comments
On Fri, Feb 16, 2024 at 05:27:38PM -0800, Boqun Feng wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> [...]
>
> diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
> index b7d5f2757053..4a5d562e3189 100644
> --- a/kernel/rcu/tasks.h
> +++ b/kernel/rcu/tasks.h
> @@ -277,6 +277,8 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
>  		rtpcp->rtpp = rtp;
>  		if (!rtpcp->rtp_blkd_tasks.next)
>  			INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
> +		if (!rtpcp->rtp_exit_list.next)

I assume there can't be an exiting task concurrently at this point on
boot. Because kthreadd just got created and workqueues as well but
that's it, right? Or workqueues can die that early? Probably not.

> +			INIT_LIST_HEAD(&rtpcp->rtp_exit_list);

Because if tasks can exit concurrently, then we are in trouble :-)

Thanks.

>  	}
>
>  	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name,
> --
> 2.43.0
On Thu, Feb 22, 2024 at 05:21:03PM +0100, Frederic Weisbecker wrote:
> On Fri, Feb 16, 2024 at 05:27:38PM -0800, Boqun Feng wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > [...]
> >
> > +		if (!rtpcp->rtp_exit_list.next)
>
> I assume there can't be an exiting task concurrently at this point on
> boot. Because kthreadd just got created and workqueues as well but
> that's it, right? Or workqueues can die that early? Probably not.
>
> > +			INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
>
> Because if tasks can exit concurrently, then we are in trouble :-)

Tasks exiting at that point might be unconventional, but I don't see
anything that prevents them from doing so. So excellent catch, and
thank you very much!!!

My thought is to add the following patch to precede this one, which
initializes those lists at rcu_init() time. Would that work?

							Thanx, Paul

------------------------------------------------------------------------

commit 9a876aac8064dfd46c840e4bb6177e65f7964bb4
Author: Paul E. McKenney <paulmck@kernel.org>
Date:   Thu Feb 22 12:29:54 2024 -0800

    rcu-tasks: Initialize callback lists at rcu_init() time

    In order for RCU Tasks to reliably maintain per-CPU lists of exiting
    tasks, those lists must be initialized before it is possible for
    tasks to exit, especially given that the boot CPU is not necessarily
    CPU 0 (an example being, powerpc kexec() kernels). And at the time
    that rcu_init_tasks_generic() is called, a task could potentially
    exit, unconventional though that sort of thing might be.

    This commit therefore moves the calls to cblist_init_generic() from
    functions called from rcu_init_tasks_generic() to a new function
    named tasks_cblist_init_generic() that is invoked from rcu_init().

    This constituted a bug in a commit that never went to mainline, so
    there is no need for any backporting to -stable.

    Reported-by: Frederic Weisbecker <frederic@kernel.org>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 4e65a92e528e5..86fce206560e8 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -528,6 +528,12 @@ struct task_struct *get_rcu_tasks_gp_kthread(void);
 struct task_struct *get_rcu_tasks_rude_gp_kthread(void);
 #endif // # ifdef CONFIG_TASKS_RUDE_RCU
 
+#ifdef CONFIG_TASKS_RCU_GENERIC
+void tasks_cblist_init_generic(void);
+#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
+static inline void tasks_cblist_init_generic(void) { }
+#endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
+
 #define RCU_SCHEDULER_INACTIVE	0
 #define RCU_SCHEDULER_INIT	1
 #define RCU_SCHEDULER_RUNNING	2
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 866743e0796f4..e06e388e7c7e6 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -240,7 +240,6 @@ static const char *tasks_gp_state_getname(struct rcu_tasks *rtp)
 static void cblist_init_generic(struct rcu_tasks *rtp)
 {
 	int cpu;
-	unsigned long flags;
 	int lim;
 	int shift;
 
@@ -266,10 +265,8 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 		WARN_ON_ONCE(!rtpcp);
 		if (cpu)
 			raw_spin_lock_init(&ACCESS_PRIVATE(rtpcp, lock));
-		local_irq_save(flags);	// serialize initialization
 		if (rcu_segcblist_empty(&rtpcp->cblist))
 			rcu_segcblist_init(&rtpcp->cblist);
-		local_irq_restore(flags);
 		INIT_WORK(&rtpcp->rtp_work, rcu_tasks_invoke_cbs_wq);
 		rtpcp->cpu = cpu;
 		rtpcp->rtpp = rtp;
@@ -1153,7 +1150,6 @@ module_param(rcu_tasks_lazy_ms, int, 0444);
 
 static int __init rcu_spawn_tasks_kthread(void)
 {
-	cblist_init_generic(&rcu_tasks);
 	rcu_tasks.gp_sleep = HZ / 10;
 	rcu_tasks.init_fract = HZ / 10;
 	if (rcu_tasks_lazy_ms >= 0)
@@ -1340,7 +1336,6 @@ module_param(rcu_tasks_rude_lazy_ms, int, 0444);
 
 static int __init rcu_spawn_tasks_rude_kthread(void)
 {
-	cblist_init_generic(&rcu_tasks_rude);
 	rcu_tasks_rude.gp_sleep = HZ / 10;
 	if (rcu_tasks_rude_lazy_ms >= 0)
 		rcu_tasks_rude.lazy_jiffies = msecs_to_jiffies(rcu_tasks_rude_lazy_ms);
@@ -1972,7 +1967,6 @@ module_param(rcu_tasks_trace_lazy_ms, int, 0444);
 
 static int __init rcu_spawn_tasks_trace_kthread(void)
 {
-	cblist_init_generic(&rcu_tasks_trace);
 	if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB)) {
 		rcu_tasks_trace.gp_sleep = HZ / 10;
 		rcu_tasks_trace.init_fract = HZ / 10;
@@ -2144,6 +2138,24 @@ late_initcall(rcu_tasks_verify_schedule_work);
 static void rcu_tasks_initiate_self_tests(void) { }
 #endif /* #else #ifdef CONFIG_PROVE_RCU */
 
+void __init tasks_cblist_init_generic(void)
+{
+	lockdep_assert_irqs_disabled();
+	WARN_ON(num_online_cpus() > 1);
+
+#ifdef CONFIG_TASKS_RCU
+	cblist_init_generic(&rcu_tasks);
+#endif
+
+#ifdef CONFIG_TASKS_RUDE_RCU
+	cblist_init_generic(&rcu_tasks_rude);
+#endif
+
+#ifdef CONFIG_TASKS_TRACE_RCU
+	cblist_init_generic(&rcu_tasks_trace);
+#endif
+}
+
 void __init rcu_init_tasks_generic(void)
 {
 #ifdef CONFIG_TASKS_RCU
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index fec804b790803..705c0d16850aa 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -261,4 +261,5 @@ void __init rcu_init(void)
 {
 	open_softirq(RCU_SOFTIRQ, rcu_process_callbacks);
 	rcu_early_boot_tests();
+	tasks_cblist_init_generic();
 }
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 31f3a61f9c384..4f4aec64039f0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -5601,6 +5601,8 @@ void __init rcu_init(void)
 		(void)start_poll_synchronize_rcu_expedited();
 
 	rcu_test_sync_prims();
+
+	tasks_cblist_init_generic();
 }
 
 #include "tree_stall.h"
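[To summarize why this reordering closes the window Frederic spotted, the hedged sketch below shows the boot-time ordering Paul's patch establishes. Call sites not visible in the diff, namely the positions of rcu_init() and rcu_init_tasks_generic() within the boot sequence, are stated approximately from the commit message, not verified against source.]

/*
 * Rough boot-ordering sketch implied by the patch above (approximate):
 *
 * start_kernel()
 *   rcu_init()                       // boot CPU only, irqs disabled, per
 *     tasks_cblist_init_generic()    //   the WARN_ON() and lockdep assert:
 *                                    //   no task can exit concurrently
 *   ...                              // scheduler up, kthreads created,
 *                                    //   tasks may now exit at any time
 *   rcu_init_tasks_generic()         // later in boot: now only spawns the
 *     rcu_spawn_tasks_kthread(), ... //   GP kthreads; the per-CPU lists
 *                                    //   are already initialized
 */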
On Thu, Feb 22, 2024 at 12:41:55PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 22, 2024 at 05:21:03PM +0100, Frederic Weisbecker wrote:
> > On Fri, Feb 16, 2024 at 05:27:38PM -0800, Boqun Feng wrote:
> > > From: "Paul E. McKenney" <paulmck@kernel.org>
> > >
> > > [...]
> > >
> > > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > > Tested-by: Yang Jihong <yangjihong1@huawei.com>
> > > Tested-by: Chen Zhongjin <chenzhongjin@huawei.com>
> > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>

Looks good, thanks!

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
diff --git a/init/init_task.c b/init/init_task.c
index 7ecb458eb3da..4daee6d761c8 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -147,6 +147,7 @@ struct task_struct init_task __aligned(L1_CACHE_BYTES) = {
 	.rcu_tasks_holdout = false,
 	.rcu_tasks_holdout_list = LIST_HEAD_INIT(init_task.rcu_tasks_holdout_list),
 	.rcu_tasks_idle_cpu = -1,
+	.rcu_tasks_exit_list = LIST_HEAD_INIT(init_task.rcu_tasks_exit_list),
 #endif
 #ifdef CONFIG_TASKS_TRACE_RCU
 	.trc_reader_nesting = 0,
diff --git a/kernel/fork.c b/kernel/fork.c
index 0d944e92a43f..af7203be1d2d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1976,6 +1976,7 @@ static inline void rcu_copy_process(struct task_struct *p)
 	p->rcu_tasks_holdout = false;
 	INIT_LIST_HEAD(&p->rcu_tasks_holdout_list);
 	p->rcu_tasks_idle_cpu = -1;
+	INIT_LIST_HEAD(&p->rcu_tasks_exit_list);
 #endif /* #ifdef CONFIG_TASKS_RCU */
 #ifdef CONFIG_TASKS_TRACE_RCU
 	p->trc_reader_nesting = 0;
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index b7d5f2757053..4a5d562e3189 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -277,6 +277,8 @@ static void cblist_init_generic(struct rcu_tasks *rtp)
 		rtpcp->rtpp = rtp;
 		if (!rtpcp->rtp_blkd_tasks.next)
 			INIT_LIST_HEAD(&rtpcp->rtp_blkd_tasks);
+		if (!rtpcp->rtp_exit_list.next)
+			INIT_LIST_HEAD(&rtpcp->rtp_exit_list);
 	}
 
 	pr_info("%s: Setting shift to %d and lim to %d rcu_task_cb_adjust=%d.\n", rtp->name,
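[This patch only initializes the per-task and per-CPU lists; later patches in the series wire them up. As a rough illustration of the intended use, and only that, here is a hedged sketch of how an exiting task might enqueue itself: the sketch_* function name and the rcu_tasks.rtpcpu access are assumptions for illustration, not the series' actual code, while the lock and list fields come from the diffs above.]

/*
 * Hedged sketch, NOT part of this patch: an exiting task links itself
 * onto its CPU's rtp_exit_list so the RCU Tasks grace-period kthread
 * can still find it after do_exit() removes it from the tasks list.
 */
static void sketch_exit_tasks_rcu_start(void)	/* hypothetical name */
{
	unsigned long flags;
	struct rcu_tasks_percpu *rtpcp;
	struct task_struct *t = current;

	preempt_disable();
	rtpcp = this_cpu_ptr(rcu_tasks.rtpcpu);	/* field name assumed */
	raw_spin_lock_irqsave(&ACCESS_PRIVATE(rtpcp, lock), flags);
	/* rtp_exit_list is valid here thanks to the initialization above. */
	list_add(&t->rcu_tasks_exit_list, &rtpcp->rtp_exit_list);
	raw_spin_unlock_irqrestore(&ACCESS_PRIVATE(rtpcp, lock), flags);
	preempt_enable();
}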