From: David Rheinsberg
To: linux-kernel@vger.kernel.org
Cc: Christian Brauner, Jan Kara, Kees Cook, Alexander Mikhalitsyn,
    Luca Boccassi, David Rheinsberg
Subject: [PATCH] pid: allow pidfds for reaped tasks
Date: Mon, 7 Aug 2023 10:52:03 +0200
Message-ID: <20230807085203.819772-1-david@readahead.eu>
X-Patchwork-Submitter: David Rheinsberg
X-Patchwork-Id: 131756

A pidfd can currently only be created for tasks that are thread-group
leaders and not reaped. This patch changes the pidfd-core to allow for
pidfds on reaped thread-group leaders as well.

A pidfd can outlive the task it refers to, and thus user-space must
already be prepared for the task underlying a pidfd to be gone by the
time it gets hold of the pidfd. For instance, resolving the pidfd to a
PID via the fdinfo must be prepared to read `-1`.

Despite user-space knowing that a pidfd might be stale, several kernel
APIs currently add another layer that checks for this. In particular,
SO_PEERPIDFD returns `EINVAL` if the peer-task was already reaped, but
returns a stale pidfd if the task is reaped immediately after the
respective alive-check.

This has the unfortunate effect that user-space now has two ways to
check for the exact same scenario: A syscall might return
EINVAL/ESRCH/... *or* the pidfd might be stale, even though there is no
particular reason to distinguish both cases. This also propagates
through user-space APIs, which pass on pidfds. They must be prepared to
pass on `-1` *or* the pidfd, because there is no guaranteed way to get a
stale pidfd from the kernel.

This patch changes the core pidfd helpers to allow creation of pidfds
even if the PID is no longer linked to any task.
This only affects one of the three pidfd users that currently exist:

 1) fanotify already tests for a linked TGID-task manually before
    creating the PIDFD, thus it is not directly affected by this change.
    However, note that the current fanotify code fails with an error if
    the target process is reaped exactly between the TGID-check in
    fanotify and the test in pidfd_prepare(). With this patch, this
    will no longer be the case.

 2) pidfd_open(2) calls find_get_pid() before creating the pidfd, thus
    it is also not directly affected by this change.
    Again, similar to fanotify, there is a race between the
    find_get_pid() call and pidfd_prepare(), which currently causes
    pidfd_open(2) to return EINVAL rather than ESRCH if the process is
    reaped just between those two checks. With this patch, this will no
    longer be the case.

 3) SO_PEERPIDFD will be affected by this change and from now on return
    stale pidfds rather than EINVAL if the respective peer task is
    reaped already.

Given that users of SO_PEERPIDFD must already deal with stale pidfds,
this change hopefully simplifies the API of SO_PEERPIDFD, and all
dependent user-space APIs (e.g., GetConnectionCredentials() on D-Bus
driver APIs). Also note that SO_PEERPIDFD is still pending to be
released with linux-6.5.

Signed-off-by: David Rheinsberg
---
 kernel/fork.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index d2e12b6d2b18..4dde19a8c264 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2161,7 +2161,7 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  * Allocate a new file that stashes @pid and reserve a new pidfd number in the
  * caller's file descriptor table. The pidfd is reserved but not installed yet.
  *
- * The helper verifies that @pid is used as a thread group leader.
+ * The helper verifies that @pid is/was used as a thread group leader.
  *
  * If this function returns successfully the caller is responsible to either
  * call fd_install() passing the returned pidfd and pidfd file as arguments in
@@ -2180,7 +2180,14 @@ static int __pidfd_prepare(struct pid *pid, unsigned int flags, struct file **re
  */
 int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret)
 {
-	if (!pid || !pid_has_task(pid, PIDTYPE_TGID))
+	if (!pid)
+		return -EINVAL;
+
+	/*
+	 * Non thread-group leaders cannot have pidfds, but we allow them for
+	 * reaped thread-group leaders.
+	 */
+	if (pid_has_task(pid, PIDTYPE_PID) && !pid_has_task(pid, PIDTYPE_TGID))
 		return -EINVAL;
 
 	return __pidfd_prepare(pid, flags, ret);
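
The effect of the changed check in pidfd_prepare() can be illustrated
with a small user-space model. This is only a sketch and not kernel
code: old_check() and new_check() are hypothetical names, and the two
booleans stand in for the results of pid_has_task(pid, PIDTYPE_PID) and
pid_has_task(pid, PIDTYPE_TGID). The NULL-pid case, which returns
-EINVAL before and after the patch, is left out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model of the check in pidfd_prepare(); not kernel code.
 *   has_pid_task  models pid_has_task(pid, PIDTYPE_PID)
 *   has_tgid_task models pid_has_task(pid, PIDTYPE_TGID)
 */

/* Before the patch: only a PID currently linked as TGID passes. */
static int old_check(bool has_pid_task, bool has_tgid_task)
{
	(void)has_pid_task; /* not consulted by the old check */
	return has_tgid_task ? 0 : -EINVAL;
}

/*
 * After the patch: reject only a PID that is still linked to a task
 * which is not a thread-group leader. A PID with no task linked at
 * all (the reaped case) passes.
 */
static int new_check(bool has_pid_task, bool has_tgid_task)
{
	if (has_pid_task && !has_tgid_task)
		return -EINVAL;
	return 0;
}
```

Under this model only the reaped case changes: old_check(false, false)
yields -EINVAL, while new_check(false, false) yields 0. A live
non-leader thread (true, false) is rejected by both variants, and a live
thread-group leader (true, true) is accepted by both.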