From patchwork Fri Nov 25 13:54:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 26019 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp4037094wrr; Fri, 25 Nov 2022 06:00:15 -0800 (PST) X-Google-Smtp-Source: AA0mqf4kmR8UuKYAcMvmhNXKZQqjiMuFXMN0qyYnIw54m838xj7hiasa4kxgvJwehubQti7W/5si X-Received: by 2002:a05:6402:f07:b0:46a:7f29:1b15 with SMTP id i7-20020a0564020f0700b0046a7f291b15mr8186874eda.226.1669384815527; Fri, 25 Nov 2022 06:00:15 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669384815; cv=none; d=google.com; s=arc-20160816; b=f7UuZLBiFenU9Y5pxxK08DNrtkP7kNUWDCpmHBu1PfL7sIF5iZjLyRBKmMvmYkEzOC wMAAAXeljMF5fPT0MCo+T+kzRWtE/0p7Qpa3g/Fo7nWkdKCXTrc/LpN4DowvYAE0QqnI fKI/ngtEFdOSig5voBlBovPQienvtGKadHTBeHc55e06Il4tOKPmV3HyJWCFEuy33G0y oY/xSv/814hrRaVEVsfeIYujsL73fBquvyr7IpsjprDPLQOSIR103qA0yqI5e0zIZsdC kEQ7ckOWPo6ornkGCld4STATxXzY09ZRPwGUYFp/rZ8l4VAsaQ1JIJWtPiBd5AhUVVFp 1ksQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=QE3G8t8Zom7RmhjzRwd4LFBRicvyyB/1OabOFwIE7Ys=; b=kHlG3E0OpJws3xajx+QZmAJ/m7oENTRmLUpEP6zm9kHSJj9gKVKKGCClwwyMzCgMak VohEp+a+xeypNqiK/N8aiaa+EbfTMSW+n7HVrcLw4vevSyy+lhAq0ZfumzTuRrUvft9H uLAVisEB1azlt/JYc4ygebvzkYuxl6o314vb0Npl403qX8IdQ7JiTinx147zZ0+TxaMs rBiiDBvos/YXwRvwj4833O9/pAyRiofqw0b0HNRgYkwZ+YZAkMsDoXVC0oMyes+5Pj5+ GGVOJ5+1z6f+ws6xittmu61DdfSM9dLOM9vrjBnrHOL8DGp9P/LxZqeXIUr92bGrI5Hj UOsQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=JaRLG4y8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE 
dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id xd9-20020a170907078900b0079330b37fb5si3397881ejb.564.2022.11.25.05.59.48; Fri, 25 Nov 2022 06:00:15 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=JaRLG4y8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229808AbiKYNzN (ORCPT + 99 others); Fri, 25 Nov 2022 08:55:13 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50776 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229628AbiKYNzK (ORCPT ); Fri, 25 Nov 2022 08:55:10 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3660B21E; Fri, 25 Nov 2022 05:55:10 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CC87962457; Fri, 25 Nov 2022 13:55:09 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2BE43C433D6; Fri, 25 Nov 2022 13:55:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669384509; bh=Q8JjNQ5bUk4AZ2CfSjRxxwMV7KD/wuw64pWCocyDRgA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JaRLG4y8AXSHW+V/j1dU5Z7h2CjWGPm9A3Hve6Luaou4atZcm0ge2x8XLpWVJ3yV6 
m7EpeG4ynFyGNSgYyCFDOs3gR+Re6P2x6BaqO0eOd76kOKWV14Ojnl4iKjsHVkPFsm WMJtIw5d8oqBJLgNd7cGWS8ovokOyJKTGI/OeWuDWgSRmIa6mlN+asvF+vijYbfHlD nvsmj3RVm19vGniDoCTbTsDJvWz2WCPcdtTHsNLE21Qjf2g9VR+igHQIrlyRyf9d+b u/st5dXe6yQ2xtnj2cPHEghdY+Dq+ecDmDZD6frC7fCFU10cWTSjlk301WXkaZEz4G aK5negbtrDb1g==
From: Frederic Weisbecker
To: "Paul E . McKenney"
Cc: LKML, Frederic Weisbecker, "Eric W . Biederman", Neeraj Upadhyay, Oleg Nesterov, Pengfei Xu, Boqun Feng, Lai Jiangshan, rcu@vger.kernel.org
Subject: [PATCH 1/3] rcu-tasks: Improve comments explaining tasks_rcu_exit_srcu purpose
Date: Fri, 25 Nov 2022 14:54:58 +0100
Message-Id: <20221125135500.1653800-2-frederic@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221125135500.1653800-1-frederic@kernel.org>
References: <20221125135500.1653800-1-frederic@kernel.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750476852092279847?=
X-GMAIL-MSGID: =?utf-8?q?1750476852092279847?=

Make sure we don't need to look again into the depths of git blame in
order not to miss a subtle part about how rcu-tasks is dealing with
exiting tasks.

Suggested-by: Boqun Feng
Suggested-by: Neeraj Upadhyay
Suggested-by: Paul E. McKenney
Cc: Oleg Nesterov
Cc: Lai Jiangshan
Cc: Eric W. Biederman
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tasks.h | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 4a991311be9b..7deed6135f73 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -827,11 +827,21 @@ static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
 static void rcu_tasks_postscan(struct list_head *hop)
 {
 	/*
-	 * Wait for tasks that are in the process of exiting. This
-	 * does only part of the job, ensuring that all tasks that were
-	 * previously exiting reach the point where they have disabled
-	 * preemption, allowing the later synchronize_rcu() to finish
-	 * the job.
+	 * Exiting tasks may escape the tasklist scan. Those are vulnerable
+	 * until their final schedule() with TASK_DEAD state. To cope with
+	 * this, divide the fragile exit path part in two intersecting
+	 * read side critical sections:
+	 *
+	 * 1) An _SRCU_ read side starting before calling exit_notify(),
+	 *    which may remove the task from the tasklist, and ending after
+	 *    the final preempt_disable() call in do_exit().
+	 *
+	 * 2) An _RCU_ read side starting with the final preempt_disable()
+	 *    call in do_exit() and ending with the final call to schedule()
+	 *    with TASK_DEAD state.
+	 *
+	 * This handles the part 1). And postgp will handle part 2) with a
+	 * call to synchronize_rcu().
 	 */
 	synchronize_srcu(&tasks_rcu_exit_srcu);
 }
@@ -898,7 +908,10 @@ static void rcu_tasks_postgp(struct rcu_tasks *rtp)
 	 *
 	 * In addition, this synchronize_rcu() waits for exiting tasks
 	 * to complete their final preempt_disable() region of execution,
-	 * cleaning up after the synchronize_srcu() above.
+	 * cleaning up after synchronize_srcu(&tasks_rcu_exit_srcu),
+	 * enforcing the whole region before tasklist removal until
+	 * the final schedule() with TASK_DEAD state to be an RCU TASKS
+	 * read side critical section.
 	 */
 	synchronize_rcu();
 }
@@ -988,7 +1001,11 @@ void show_rcu_tasks_classic_gp_kthread(void)
 EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
 #endif // !defined(CONFIG_TINY_RCU)
 
-/* Do the srcu_read_lock() for the above synchronize_srcu(). */
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
 void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
 	preempt_disable();
@@ -996,7 +1013,11 @@ void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 	preempt_enable();
 }
 
-/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
 void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;

From patchwork Fri Nov 25 13:54:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 26020 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp4038131wrr; Fri, 25 Nov 2022 06:01:24 -0800 (PST) X-Google-Smtp-Source: AA0mqf5gBa+gARB5jHI8U+nJfFANrEw1/f4iY+3CE3ESIHhKMSHGGE7kEKahODG7SWWhmmWZxzAN X-Received: by 2002:a17:906:4c92:b0:78d:ad29:396f with SMTP id q18-20020a1709064c9200b0078dad29396fmr32284823eju.165.1669384878683; Fri, 25 Nov 2022 06:01:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669384878; cv=none; d=google.com; s=arc-20160816; b=QmUxTQb3gFXlxttH9znpzv78j8iAda+dvJUngy0cCcr4LvG+HkpFhZerbpeIAEdj3v xCef8zyqqt08Z4IEj1bS+yNOt2MI3konBab/qZh6WRjJCdwzn/WpNaq5ZfkXQ6xJ6ajR 2WrBaAreVBu/HylFM/ggp+tf4EYuKQ8fPH5ApjJ300ZukP2bhFbSeMr6wKNWUJojWgmq S6i4hRPgsKLAILwngtCfKSucmJusCqfscFuM1ORIhyNbbBhaNGt1++FAwQJ5Y1LuMOx1
o3wShUB5VNNcIKl8BEhIO7xJFwCmpcTd1iP4pxXSBUd24mr153NwGQImScQWdgTpCJqX fVxg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=x27sSVS1Lg0kLoQjmQESjOUYq/MQacKhznRORjG2m+w=; b=xmeZtLzUjBT0JCxFKq39MjIbpGf2ArTr89T+iQMw3ZFbLiEtagqF8LzhEK0PmNogyp A/yJGrgqjNSizc5GnEqpEV4WDB8Zh52Tg9LWxWI3VeBlz5aEAZHjlaKzd7tIorV/D6Tf qaR8FbUMqU9YCXDU+gjPDy9LSPNPW5ujKFEX1o9F9N7d3tHfmjfUz81PSAOrs2KtHl50 bfmjR/zyJGts0lSBCaADaqFwotXbjRE8B7qPp+sv3SPzTOnKsmxTt+jKOuzqWtwXwA8f QlWmPFd+U023DTS3UhjyM7qQjCm3tb0AIo+HN5yx7bgEFFXL5o2s3X/G1iqG3EQVYqW5 3U3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=PWrTou2x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id p15-20020a50cd8f000000b00462ab8923ccsi1968536edi.600.2022.11.25.06.00.45; Fri, 25 Nov 2022 06:01:18 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=PWrTou2x; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229686AbiKYNzP (ORCPT + 99 others); Fri, 25 Nov 2022 08:55:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229756AbiKYNzN (ORCPT ); Fri, 25 Nov 2022 08:55:13 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B54431AD8F; Fri, 25 Nov 2022 05:55:12 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 51DB36245A; Fri, 25 Nov 2022 13:55:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A4754C433C1; Fri, 25 Nov 2022 13:55:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669384511; bh=ALLACmzXC6lYKmELWY0sATEsOCq01HNEAVNMSA5sSNU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PWrTou2xJP0lqm6s5QPAORWmW0j8jfCxhnCMQeUbEiz3F0QA3AxuIpCRAkk6inw5p wEDcwPD1DlhwzX8kvszAKdvuVCvPhw+Dvkee6PRXx0kK9u+EG71tWNxOce0XX/WyOg 6I9TlVrT6Ds+SlvRt/8s1gXR8efW9OiazXwJwX0Uv8GBHlUx1UHellIjcY/tApFMOz 
gDfyGplUPEmCE0qcRmw1AyGPAcsYOKjecreO+qjAWsISQvsifq0uCCqmBnvsbWZ5t0 rjDY9oLXtHLIKgJe9vsrsBRNCqBRUkXaON89nIoi4DeUIM2KnFGrAvr8TNoTR9jDFa POLinGPxc460Q==
From: Frederic Weisbecker
To: "Paul E . McKenney"
Cc: LKML, Frederic Weisbecker, "Eric W . Biederman", Neeraj Upadhyay, Oleg Nesterov, Pengfei Xu, Boqun Feng, Lai Jiangshan, rcu@vger.kernel.org
Subject: [PATCH 2/3] rcu-tasks: Remove preemption disablement around srcu_read_[un]lock() calls
Date: Fri, 25 Nov 2022 14:54:59 +0100
Message-Id: <20221125135500.1653800-3-frederic@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221125135500.1653800-1-frederic@kernel.org>
References: <20221125135500.1653800-1-frederic@kernel.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750476918583357879?=
X-GMAIL-MSGID: =?utf-8?q?1750476918583357879?=

Ever since the following commit:

	5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()")

SRCU no longer relies on preemption being disabled in order to modify
the per-CPU counter. And even before that commit, preemption was
disabled from within the API itself.

Therefore, and after checking further, it appears to be safe to remove
the preemption disablement around __srcu_read_[un]lock() in
exit_tasks_rcu_start() and exit_tasks_rcu_finish().

Suggested-by: Boqun Feng
Suggested-by: Paul E. McKenney
Suggested-by: Neeraj Upadhyay
Cc: Lai Jiangshan
Signed-off-by: Frederic Weisbecker
---
 kernel/rcu/tasks.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 7deed6135f73..9a8114114b48 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1008,9 +1008,7 @@ EXPORT_SYMBOL_GPL(show_rcu_tasks_classic_gp_kthread);
  */
 void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
 {
-	preempt_disable();
 	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
-	preempt_enable();
 }
 
 /*
@@ -1022,9 +1020,7 @@ void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;
 
-	preempt_disable();
 	__srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx);
-	preempt_enable();
 	exit_tasks_rcu_finish_trace(t);
 }
 

From patchwork Fri Nov 25 13:55:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 26021 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp4038129wrr; Fri, 25 Nov 2022 06:01:24 -0800 (PST) X-Google-Smtp-Source: AA0mqf7HDuUxbrXU0PaPoSp362XcviAUvBY+JN+o6/HzeDS9MuAfUsK60PYZwsQky2Xi4TE66/1Y X-Received: by 2002:a2e:a4a3:0:b0:278:ecbe:ebba with SMTP id g3-20020a2ea4a3000000b00278ecbeebbamr7425719ljm.450.1669384878232; Fri, 25 Nov 2022 06:01:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669384878; cv=none; d=google.com; s=arc-20160816; b=g8EPd/cXMYWkulMG1JZIy4Ru1b+qhcmlszpLEuC+7Yxe547K1Ile/mHhBnMQFndG7n 6zf8v0A5qRqf1r434dUCKgCS8Cadpop1C9K/GIu5kPcgoNDvoCKlqTE1BHInd0F6h2FL Lu2CnoOZ5OT7QTOOFvi/LIFIpwATh0nRrveL7pe/z0I7y+HrTYwlFRN2A/piaVk82hv8 1sIVw3ULsElDKUZ8eO7Zsc+EPNP+qJ0nNS32l6Ugkt6ZiGw4ahjDgWiaTKYXrtYl+sAH eBoscdjrikm5DZCrdqEBkeYWZqDShgClQ3Ave0craREYg1pmYkUt/x2QPuvLxwHpgms9 1frg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=PiTnyDpu0Bq9hY67gaFSXvjjP0yYugjNnNzHkN1KhR8=; b=UWr9mb6tpI2FV+TVkxrs6gElJ+jv3L8Bwo/NDgMsqS1VHnkoZIb8Xzy8+3UCswkNk1 0C6tbH+blCashutHyVRo5dngHmwsX1zUzAiHiX1WC3w/6Pfp4/5R5Ys6slj4NqaIDedJ Htfsh4nd5s9aT5m9G2sMD/7Qjqs8FFyjvMIu8kHEWaXgUmFUSl0EJIfNItFsz2AN/b+J 0zbIJ0lBR9qq3KhZbIEYI/ahcSajz8nv23kYjQ0nX4iUoGJaKa3A0IIn0O7fHp8SwuGy oSlwnSnQZJEkBpyu3u6XDA1jmazt9j9JTR2yS2+CVCDxKmRwJsw+keOlhc/BtRNAezIn b2FQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=KouTAujd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y14-20020a50e60e000000b004602a1b7da9si2824202edm.133.2022.11.25.06.00.46; Fri, 25 Nov 2022 06:01:18 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=KouTAujd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229928AbiKYNz0 (ORCPT + 99 others); Fri, 25 Nov 2022 08:55:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51166 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229896AbiKYNzW (ORCPT ); Fri, 25 Nov 2022 08:55:22 -0500 Received: from dfw.source.kernel.org 
(dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3690221258; Fri, 25 Nov 2022 05:55:15 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CA49C6245B; Fri, 25 Nov 2022 13:55:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2D975C433D6; Fri, 25 Nov 2022 13:55:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1669384514; bh=02dI1lNmveQrMuT/Cc1xXAnRLUb8g7J9hwX6UzoKBFc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KouTAujdIoK+X5R0bHouk8vqaIt+Uy9J1A6+KgwwpYJ3AYYmoFXZnLIqJqWhthu+0 LB6nXrlZWPPq0lvRrz/IeIXqy2x0/ZH4aJoWJph3oOzxAYrOWqMc9PrfMy7/QEI4rG vSmOivxxi1G0DES8NNcOg/EYByyXiGT63lxyjmQ/BzupqnEP4dZmBQQWllDfjQYH0k BuTH/ydfgjl5T8rQdvCGwniEUmbZN5IdlZhpPH85qIO+5oqoNXXzKVXWmlZmHBZAzA AMJCvZ2gcpSUjnSFQ5RJ9166NT8dAHz0F9kiTSft1aZDe3g0D8c+LZynmvxEoNOEyS KqZCdzKoO/DBw== From: Frederic Weisbecker To: "Paul E . McKenney" Cc: LKML , Frederic Weisbecker , "Eric W . 
Biederman" , Neeraj Upadhyay , Oleg Nesterov , Pengfei Xu , Boqun Feng , Lai Jiangshan , rcu@vger.kernel.org
Subject: [PATCH 3/3] rcu-tasks: Fix synchronize_rcu_tasks() VS zap_pid_ns_processes()
Date: Fri, 25 Nov 2022 14:55:00 +0100
Message-Id: <20221125135500.1653800-4-frederic@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20221125135500.1653800-1-frederic@kernel.org>
References: <20221125135500.1653800-1-frederic@kernel.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750476917803718371?=
X-GMAIL-MSGID: =?utf-8?q?1750476917803718371?=

RCU Tasks and PID-namespace unshare can interact in do_exit() in a
complicated circular dependency:

1) TASK A calls unshare(CLONE_NEWPID). This creates a new PID namespace
   that every subsequent child of TASK A will belong to. But TASK A
   doesn't itself belong to that new PID namespace.

2) TASK A forks() and creates TASK B. TASK A stays attached to its PID
   namespace (let's say PID_NS1) and TASK B is the first task belonging
   to the new PID namespace created by unshare() (let's call it
   PID_NS2).

3) Since TASK B is the first task attached to PID_NS2, it becomes the
   PID_NS2 child reaper.

4) TASK A forks() again and creates TASK C, which gets attached to
   PID_NS2. Note how TASK C has TASK A as a parent (belonging to
   PID_NS1) but has TASK B (belonging to PID_NS2) as a pid_namespace
   child_reaper.

5) TASK B exits and since it is the child reaper for PID_NS2, it has to
   kill all other tasks attached to PID_NS2, and wait for all of them
   to die before getting reaped itself (zap_pid_ns_processes()).

6) TASK A calls synchronize_rcu_tasks() which leads to
   synchronize_srcu(&tasks_rcu_exit_srcu).

7) TASK B is waiting for TASK C to get reaped. But TASK B is under a
   tasks_rcu_exit_srcu SRCU critical section (exit_notify() is between
   exit_tasks_rcu_start() and exit_tasks_rcu_finish()), blocking
   TASK A.

8) TASK C exits and, since TASK A is its parent, waits for TASK A to
   reap it. But TASK A cannot do so, because it waits for TASK B, which
   in turn waits for TASK C.

Pid_namespace semantics can hardly be changed at this point. But the
coverage of tasks_rcu_exit_srcu can be reduced instead.

The current task is assumed not to be concurrently reapable at this
stage of exit_notify() and therefore tasks_rcu_exit_srcu can be
temporarily relaxed without breaking its constraints, providing a way
out of the deadlock scenario.

Fixes: 3f95aa81d265 ("rcu: Make TASKS_RCU handle tasks that are almost done exiting")
Reported-by: Pengfei Xu
Suggested-by: Boqun Feng
Suggested-by: Neeraj Upadhyay
Suggested-by: Paul E. McKenney
Cc: Oleg Nesterov
Cc: Lai Jiangshan
Cc: Eric W . Biederman
Signed-off-by: Frederic Weisbecker
---
 include/linux/rcupdate.h |  2 ++
 kernel/pid_namespace.c   | 17 +++++++++++++++++
 kernel/rcu/tasks.h       | 14 ++++++++++++--
 3 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 89b3036746d2..a19d91d5461c 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -238,6 +238,7 @@ void synchronize_rcu_tasks_rude(void);
 
 #define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
 void exit_tasks_rcu_start(void);
+void exit_tasks_rcu_stop(void);
 void exit_tasks_rcu_finish(void);
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
 #define rcu_tasks_classic_qs(t, preempt) do { } while (0)
@@ -246,6 +247,7 @@ void exit_tasks_rcu_finish(void);
 #define call_rcu_tasks call_rcu
 #define synchronize_rcu_tasks synchronize_rcu
 static inline void exit_tasks_rcu_start(void) { }
+static inline void exit_tasks_rcu_stop(void) { }
 static inline void exit_tasks_rcu_finish(void) { }
 #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
 
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index f4f8cb0435b4..fc21c5d5fd5d 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -244,7 +244,24 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
 		set_current_state(TASK_INTERRUPTIBLE);
 		if (pid_ns->pid_allocated == init_pids)
 			break;
+		/*
+		 * Release tasks_rcu_exit_srcu to avoid following deadlock:
+		 *
+		 * 1) TASK A unshare(CLONE_NEWPID)
+		 * 2) TASK A fork() twice -> TASK B (child reaper for new ns)
+		 *    and TASK C
+		 * 3) TASK B exits, kills TASK C, waits for TASK A to reap it
+		 * 4) TASK A calls synchronize_rcu_tasks()
+		 *                   -> synchronize_srcu(tasks_rcu_exit_srcu)
+		 * 5) *DEADLOCK*
+		 *
+		 * It is considered safe to release tasks_rcu_exit_srcu here
+		 * because we assume the current task can not be concurrently
+		 * reaped at this point.
+		 */
+		exit_tasks_rcu_stop();
 		schedule();
+		exit_tasks_rcu_start();
 	}
 	__set_current_state(TASK_RUNNING);
 
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 9a8114114b48..f9dfc2ece287 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1016,12 +1016,22 @@ void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
  * task is exiting and may be removed from the tasklist. See
  * corresponding synchronize_srcu() for further details.
  */
-void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
+void exit_tasks_rcu_stop(void) __releases(&tasks_rcu_exit_srcu)
 {
 	struct task_struct *t = current;
 
 	__srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx);
-	exit_tasks_rcu_finish_trace(t);
+}
+
+/*
+ * Contribute to protect against tasklist scan blind spot while the
+ * task is exiting and may be removed from the tasklist. See
+ * corresponding synchronize_srcu() for further details.
+ */
+void exit_tasks_rcu_finish(void)
+{
+	exit_tasks_rcu_stop();
+	exit_tasks_rcu_finish_trace(current);
 }
 
 #else /* #ifdef CONFIG_TASKS_RCU */