Message ID | 20240207114549.GA12697@redhat.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:168b:b0:106:860b:bbdd with SMTP id ma11csp2167848dyb; Wed, 7 Feb 2024 03:47:29 -0800 (PST) X-Google-Smtp-Source: AGHT+IHfm6DzsKHKGArvAuLdJIOiARk8IHNhcaAy38wD6tOnb1WpAWXC7nTgfVa1dc9brdDRm7DN X-Received: by 2002:a17:902:f68d:b0:1d9:ce46:6eb8 with SMTP id l13-20020a170902f68d00b001d9ce466eb8mr7877263plg.4.1707306449671; Wed, 07 Feb 2024 03:47:29 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1707306449; cv=pass; d=google.com; s=arc-20160816; b=t4P+u/H0NQJ7f9M7aiIC5iKkEEh43h4LqxNvO1TXRveo81U0oe9LqCRgrJQs8BUd5F 3lu9jrEpmvPgpsBN/95lKkQ27adlbyJkFPy8tVegKnaJGmn51cqaEh4PFLHC5xcrdWze j1v26e0OgTAGYtMG+KunaLPiILwp+8wviTDo5uKxaDBF81rjmTTtiaF3uYZPknNTET8H SvbcDhw7zOsnvCR9izkOJiBAVfpwRa6RejGSWCBzwECiMYXfaCjYGTrppC8fbR7CKOtd h/vLnvIESO22KavP4BVR30rvhk+rMS7w4vuLuHj8tcnf/GkdBmU3ur2Z5TR2KPfQ5u9L my2g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=user-agent:content-disposition:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:subject:cc:to:from :date:dkim-signature; bh=nvbuj/0u9tEQEJhqguXdXu/oFCwcAEaMYAW5rW/LIQU=; fh=QTSjB1NivGImapozUfrAxRYPH9GgV9z2A/VlUubNFhE=; b=GAmQuOyOnN98mmcFP502vt2DghRi44Jhaaqd4HPF2onbHxYdHwglMoV+yrQxAiBmz5 0lcBKhPmcNmGuJkYfi+4aCZnLCVuHM+wICX4cX/nDx2SrF6YuMfxMJXPVA1YTl1q5oWf wCbnoHuSMRo+UuYed2MMqC9vMOYfmCnFLDG6qmqylSq3iaqeBEXIBJ686hn8rBw85onH 0tN6y2lVXxL4sdYM6c3g3SJDgywrpX4glq5G9qK3EO7gkaOiyJJS51+9NFWcy5gPjsy2 YmoKwlLRDDQkeX7b8fbtqulcwMzJ3keVF8EiyEL+lLm0poSER0SNcaD3GNTCb7ZsatiG J4Pg==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=a1+SCfFb; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com X-Forwarded-Encrypted: i=2; AJvYcCVTFGiCkqzplHEp6NcYQCTiWHVimvTfPRUeRdWK6D+AeLwjlTArB1fTzx+WT8RG/8MJo1f/lI+OVjRbNgMAIgrRbDvEeA== Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [2604:1380:40f1:3f00::1]) by mx.google.com with ESMTPS id bo6-20020a056a02038600b005cdba9aa36esi1448211pgb.733.2024.02.07.03.47.29 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 07 Feb 2024 03:47:29 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) client-ip=2604:1380:40f1:3f00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=a1+SCfFb; arc=pass (i=1 spf=pass spfdomain=redhat.com dkim=pass dkdomain=redhat.com dmarc=pass fromdomain=redhat.com); spf=pass (google.com: domain of linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:40f1:3f00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-56416-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id B1621B20B4D for <ouuuleilei@gmail.com>; Wed, 7 Feb 2024 11:47:28 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id C76DF1D531; Wed, 7 Feb 2024 11:47:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="a1+SCfFb" Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2A5741B7E3 for <linux-kernel@vger.kernel.org>; Wed, 7 Feb 2024 11:47:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707306433; cv=none; b=r+MTosaeYHm0F70UVd47xesyB2jVShQWdv7TppCgJ3wFIcdk7H/lCMVE1IBdIqlRhnL8z38ew5SgY5nMfz1kQw7nVR6W8sAkb0s4DambvLvlQyJK8cIiPrqv/+CESgx3zz32drWjgcqrZDoxPgXYxB9JqgbqDLzHPEPHhIc8Ob4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707306433; c=relaxed/simple; bh=tjrQh1y4H5W6sx6xqeHy4JzMSxFgjnezbUH6ZJHM5NI=; h=Date:From:To:Cc:Subject:Message-ID:MIME-Version:Content-Type: Content-Disposition; b=klXR/+PvGbwRbxAEzs+yUlZBSsuUXNXG7Gh5DpdbZ/MKCH+rWwn2D3y7SxMYs4AEehhQ8gzazKYHosJWBX83D2BdLdBTJ0DD7Z4MESyJZ76SBcSHpGL7SP2GSCJ64pKIIQqwCWYIrnEgtwDfWSiFhCxblneOSezWQ/YJnn3umG4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=a1+SCfFb; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1707306429; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type; bh=nvbuj/0u9tEQEJhqguXdXu/oFCwcAEaMYAW5rW/LIQU=; b=a1+SCfFbIJhGzlOD8valK23DgWu7JYk8fdd0alY8o8ZjLFJRGEHzMVaIaGuZ6gJ3sYIlDs Pmibue2fZwnnsaWp4IWpgp7jrSSHJ77fEpQG093r2/IZumN5sT7NHY00nqc6mGn1tHLEmj tVj0DwLRcpVA1cmZMqQ0myAwWfPZLvc= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-435-aNrfZZEpPIqEjvdaQ2srow-1; Wed, 07 Feb 2024 06:47:08 -0500 X-MC-Unique: aNrfZZEpPIqEjvdaQ2srow-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F195185A588; Wed, 7 Feb 2024 11:47:07 +0000 (UTC) Received: from dhcp-27-174.brq.redhat.com (unknown [10.45.225.212]) by smtp.corp.redhat.com (Postfix) with SMTP id 5171F1C02EB3; Wed, 7 Feb 2024 11:47:06 +0000 (UTC) Received: by dhcp-27-174.brq.redhat.com (nbSMTP-1.00) for uid 1000 oleg@redhat.com; Wed, 7 Feb 2024 12:45:51 +0100 (CET) Date: Wed, 7 Feb 2024 12:45:49 +0100 From: Oleg Nesterov <oleg@redhat.com> To: Christian Brauner <brauner@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net>, "Eric W. Biederman" <ebiederm@xmission.com>, Tycho Andersen <tycho@tycho.pizza>, linux-api@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH] pidfd: change pidfd_send_signal() to respect PIDFD_THREAD Message-ID: <20240207114549.GA12697@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.24 (2015-08-30) X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.7 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1790240567578243134 X-GMAIL-MSGID: 1790240567578243134 |
Series |
pidfd: change pidfd_send_signal() to respect PIDFD_THREAD
|
|
Commit Message
Oleg Nesterov
Feb. 7, 2024, 11:45 a.m. UTC
Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any
pid_type to group_send_sig_info(), despite its name it should work fine
even if type = PIDTYPE_PID.
Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending
on PIDFD_THREAD.
While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian
expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/fork.c | 2 --
kernel/signal.c | 18 ++++++++++++------
2 files changed, 12 insertions(+), 8 deletions(-)
Comments
On Wed, Feb 07, 2024 at 12:45:49PM +0100, Oleg Nesterov wrote: > Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any > pid_type to group_send_sig_info(), despite its name it should work fine > even if type = PIDTYPE_PID. > > Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending > on PIDFD_THREAD. > > While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian > expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD. > > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- > kernel/fork.c | 2 -- > kernel/signal.c | 18 ++++++++++++------ > 2 files changed, 12 insertions(+), 8 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index cd61ca87d0e6..47b565598063 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -2051,8 +2051,6 @@ static void pidfd_show_fdinfo(struct seq_file *m, struct file *f) > > seq_put_decimal_ll(m, "Pid:\t", nr); > > - /* TODO: report PIDFD_THREAD */ > - > #ifdef CONFIG_PID_NS > seq_put_decimal_ll(m, "\nNSpid:\t", nr); > if (nr > 0) { > diff --git a/kernel/signal.c b/kernel/signal.c > index c3fac06937e2..e3edcd784e45 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -47,6 +47,7 @@ > #include <linux/cgroup.h> > #include <linux/audit.h> > #include <linux/sysctl.h> > +#include <uapi/linux/pidfd.h> > > #define CREATE_TRACE_POINTS > #include <trace/events/signal.h> > @@ -1478,7 +1479,8 @@ int __kill_pgrp_info(int sig, struct kernel_siginfo *info, struct pid *pgrp) > return ret; > } > > -int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > +static int kill_pid_info_type(int sig, struct kernel_siginfo *info, > + struct pid *pid, enum pid_type type) > { > int error = -ESRCH; > struct task_struct *p; > @@ -1487,11 +1489,10 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > rcu_read_lock(); > p = pid_task(pid, PIDTYPE_PID); > if (p) > - error = group_send_sig_info(sig, info, p, PIDTYPE_TGID); > + error = group_send_sig_info(sig, info, p, type); > rcu_read_unlock(); > if (likely(!p || error != -ESRCH)) > return error; > - > /* > * The task was unhashed in between, try again. If it > * is dead, pid_task() will return NULL, if we race with > @@ -1500,6 +1501,11 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > } > } > > +int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > +{ > + return kill_pid_info_type(sig, info, pid, PIDTYPE_TGID); > +} > + > static int kill_proc_info(int sig, struct kernel_siginfo *info, pid_t pid) > { > int error; > @@ -3890,6 +3896,7 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, > struct fd f; > struct pid *pid; > kernel_siginfo_t kinfo; > + enum pid_type type; > > /* Enforce flags be set to 0 until we add an extension. */ > if (flags) > @@ -3928,9 +3935,8 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, > prepare_kill_siginfo(sig, &kinfo); > } > > - /* TODO: respect PIDFD_THREAD */ > - ret = kill_pid_info(sig, &kinfo, pid); > - > + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; > + ret = kill_pid_info_type(sig, &kinfo, pid, type); If the user doesn't provide siginfo then the kernel fills in the info in prepare_kill_siginfo() a few lines above. That sets info->si_code to SI_USER even for the PIDFD_THREAD case. Whenever the info is filled in by the kernel it's not exactly userspace impersonating anything plus we know that what we're sending to is a pidfd by the type of the pidfd. So it feels like we should fill in SI_TKILL here as well? I would also suggest we update the obsolete comment on top of pidfd_send_signal() along the lines of: diff --git a/kernel/signal.c b/kernel/signal.c index e3edcd784e45..40df0c17abd7 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -3878,14 +3878,10 @@ static struct pid *pidfd_to_pid(const struct file *file) * @info: signal info * @flags: future flags * - * The syscall currently only signals via PIDTYPE_PID which covers - * kill(<positive-pid>, <signal>. It does not signal threads or process - * groups. - * In order to extend the syscall to threads and process groups the @flags - * argument should be used. In essence, the @flags argument will determine - * what is signaled and not the file descriptor itself. Put in other words, - * grouping is a property of the flags argument not a property of the file - * descriptor. + * If the @pidfd refers to a thread-group leader the signal is thread-group + * directed. If @pidfd referes to a thread then the signal is thread directed. + * In the future extension to @flags may be used to override the default scope + * of @pidfd. * * Return: 0 on success, negative errno on failure */
On 02/08, Christian Brauner wrote: > > On Wed, Feb 07, 2024 at 12:45:49PM +0100, Oleg Nesterov wrote: > > + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; > > + ret = kill_pid_info_type(sig, &kinfo, pid, type); > > If the user doesn't provide siginfo then the kernel fills in the info in > prepare_kill_siginfo() a few lines above. That sets info->si_code to > SI_USER even for the PIDFD_THREAD case. Whenever the info is filled in > by the kernel it's not exactly userspace impersonating anything plus we > know that what we're sending to is a pidfd by the type of the pidfd. So > it feels like we should fill in SI_TKILL here as well? Hmm. Agreed, will do, thanks. But then I think this needs another preparational 1/2 patch. prepare_kill_siginfo() should have a new arg so that do_tkill() could use it too. (offtopic, but may be the "Only allow sending arbitrary signals to yourself" check in pidfd_send_signal() needs another helper, do_rt_sigqueueinfo() does the same check). > I would also suggest we update the obsolete comment on top of > pidfd_send_signal() along the lines of: Ah, indeed, thanks. Oleg.
On 02/08, Christian Brauner wrote: > > I would also suggest we update the obsolete comment on top of > pidfd_send_signal() along the lines of: Yes, but... > + * If the @pidfd refers to a thread-group leader the signal is thread-group > + * directed. If @pidfd referes to a thread then the signal is thread directed. No, this depends on PIDFD_THREAD only. If it is set then the signal is always "thread directed" even if @pidfd refers to a thread-group leader. Otherwise the target task must be a group leader and the signal will be "thread-group directed". Right? Oleg.
On Thu, Feb 08, 2024 at 02:53:45PM +0100, Oleg Nesterov wrote: > On 02/08, Christian Brauner wrote: > > > > On Wed, Feb 07, 2024 at 12:45:49PM +0100, Oleg Nesterov wrote: > > > + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; > > > + ret = kill_pid_info_type(sig, &kinfo, pid, type); > > > > If the user doesn't provide siginfo then the kernel fills in the info in > > prepare_kill_siginfo() a few lines above. That sets info->si_code to > > SI_USER even for the PIDFD_THREAD case. Whenever the info is filled in > > by the kernel it's not exactly userspace impersonating anything plus we > > know that what we're sending to is a pidfd by the type of the pidfd. So > > it feels like we should fill in SI_TKILL here as well? > > Hmm. Agreed, will do, thanks. > > But then I think this needs another preparational 1/2 patch. > prepare_kill_siginfo() should have a new arg so that do_tkill() could > use it too. Agreed. > > (offtopic, but may be the "Only allow sending arbitrary signals to yourself" > check in pidfd_send_signal() needs another helper, do_rt_sigqueueinfo() > does the same check). Agreed.
On Thu, Feb 08, 2024 at 03:06:10PM +0100, Oleg Nesterov wrote: > On 02/08, Christian Brauner wrote: > > > > I would also suggest we update the obsolete comment on top of > > pidfd_send_signal() along the lines of: > > Yes, but... > > > + * If the @pidfd refers to a thread-group leader the signal is thread-group > > + * directed. If @pidfd referes to a thread then the signal is thread directed. > > No, this depends on PIDFD_THREAD only. Sorry, yes, that is what I meant to say but obviously wrote unclearly. I mean that if pidfd->f_flags & PIDFD_THREAD then it's thread-directed. That's what I meant by type of pidfd. Not type as in PIDTYPE_PID in struct pid. > > If it is set then the signal is always "thread directed" even if @pidfd refers > to a thread-group leader. > > Otherwise the target task must be a group leader and the signal will be > "thread-group directed". > > Right? Yes, please feel free to update the comment!
On 02/08, Oleg Nesterov wrote: > > On 02/08, Christian Brauner wrote: > > > > On Wed, Feb 07, 2024 at 12:45:49PM +0100, Oleg Nesterov wrote: > > > + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; > > > + ret = kill_pid_info_type(sig, &kinfo, pid, type); > > > > If the user doesn't provide siginfo then the kernel fills in the info in > > prepare_kill_siginfo() a few lines above. That sets info->si_code to > > SI_USER even for the PIDFD_THREAD case. Whenever the info is filled in > > by the kernel it's not exactly userspace impersonating anything plus we > > know that what we're sending to is a pidfd by the type of the pidfd. So > > it feels like we should fill in SI_TKILL here as well? > > Hmm. Agreed, will do, thanks. Cough... lets forget this patch for the moment. Is prepare_kill_siginfo() correct when we send a signal to the child pid namespace? si_pid = task_tgid_vnr(current) doesn't look right in this case but perhaps I am totally confused. And why do we need it at all? Can't sys_kill() and pidfd_send_signal() just use SEND_SIG_NOINFO? OK, I am sure I missed something. Will read this code tomorrow. Oleg.
> Is prepare_kill_siginfo() correct when we send a signal to the child > pid namespace? si_pid = task_tgid_vnr(current) doesn't look right in > this case but perhaps I am totally confused. > > And why do we need it at all? Can't sys_kill() and pidfd_send_signal() > just use SEND_SIG_NOINFO? Yeah, good point. I don't remember as it's been quite a while ago. My guess is that it just tried to mirror kill() itself without being aware of SEND_SIG_NOINFO. If you don't find anything wrong with this then switch it to SEND_SIG_NOINFO in a preparatory patch we can backport, please.
Oleg Nesterov <oleg@redhat.com> writes: > Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any > pid_type to group_send_sig_info(), despite its name it should work fine > even if type = PIDTYPE_PID. > > Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending > on PIDFD_THREAD. > > While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian > expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD. > I have a question here. Why is this based on group_send_sig_info instead of send_sig_info? AKA why does this look like sys_kill instead of sys_tgkill? In particular I am asking are the intended semantics that the signal is sent to a single thread in a thread group and placed in the per thread queue, or is the signal sent to the entire thread group and placed in the thread group signal queue? Does the type parameter handle that decision for us now? If so perhaps we should cleanup the helpers so that this easier to see when reading the code. Because honestly right now using group_send_sig_info when the intended target of the signal is not the entire thread group is very confusing when reading your change. Eric > Signed-off-by: Oleg Nesterov <oleg@redhat.com> > --- > kernel/fork.c | 2 -- > kernel/signal.c | 18 ++++++++++++------ > 2 files changed, 12 insertions(+), 8 deletions(-) > > diff --git a/kernel/fork.c b/kernel/fork.c > index cd61ca87d0e6..47b565598063 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -2051,8 +2051,6 @@ static void pidfd_show_fdinfo(struct seq_file *m, struct file *f) > > seq_put_decimal_ll(m, "Pid:\t", nr); > > - /* TODO: report PIDFD_THREAD */ > - > #ifdef CONFIG_PID_NS > seq_put_decimal_ll(m, "\nNSpid:\t", nr); > if (nr > 0) { > diff --git a/kernel/signal.c b/kernel/signal.c > index c3fac06937e2..e3edcd784e45 100644 > --- a/kernel/signal.c > +++ b/kernel/signal.c > @@ -47,6 +47,7 @@ > #include <linux/cgroup.h> > #include <linux/audit.h> > #include <linux/sysctl.h> > +#include <uapi/linux/pidfd.h> > > #define CREATE_TRACE_POINTS > #include <trace/events/signal.h> > @@ -1478,7 +1479,8 @@ int __kill_pgrp_info(int sig, struct kernel_siginfo *info, struct pid *pgrp) > return ret; > } > > -int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > +static int kill_pid_info_type(int sig, struct kernel_siginfo *info, > + struct pid *pid, enum pid_type type) > { > int error = -ESRCH; > struct task_struct *p; > @@ -1487,11 +1489,10 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > rcu_read_lock(); > p = pid_task(pid, PIDTYPE_PID); > if (p) > - error = group_send_sig_info(sig, info, p, PIDTYPE_TGID); > + error = group_send_sig_info(sig, info, p, type); > rcu_read_unlock(); > if (likely(!p || error != -ESRCH)) > return error; > - > /* > * The task was unhashed in between, try again. If it > * is dead, pid_task() will return NULL, if we race with > @@ -1500,6 +1501,11 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > } > } > > +int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) > +{ > + return kill_pid_info_type(sig, info, pid, PIDTYPE_TGID); > +} > + > static int kill_proc_info(int sig, struct kernel_siginfo *info, pid_t pid) > { > int error; > @@ -3890,6 +3896,7 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, > struct fd f; > struct pid *pid; > kernel_siginfo_t kinfo; > + enum pid_type type; > > /* Enforce flags be set to 0 until we add an extension. */ > if (flags) > @@ -3928,9 +3935,8 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, > prepare_kill_siginfo(sig, &kinfo); > } > > - /* TODO: respect PIDFD_THREAD */ > - ret = kill_pid_info(sig, &kinfo, pid); > - > + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; > + ret = kill_pid_info_type(sig, &kinfo, pid, type); > err: > fdput(f); > return ret;
On 02/08, Eric W. Biederman wrote: > > Oleg Nesterov <oleg@redhat.com> writes: > > > Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any > > pid_type to group_send_sig_info(), despite its name it should work fine > > even if type = PIDTYPE_PID. > > > > Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending > > on PIDFD_THREAD. > > > > While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian > > expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD. > > > > I have a question here. > > Why is this based on group_send_sig_info instead of send_sig_info? Well. send_sig_info() accepts "struct task_struct *", not "struct pid *", it doesn't do check_kill_permission(), and it doesn't handle the possible race with mt-exec. > In particular I am asking are the intended semantics that the signal is > sent to a single thread in a thread group and placed in the per thread > queue, or is the signal sent to the entire thread group and placed > in the thread group signal queue? This depends on PIDFD_THREAD. If it is set then the signal goes to the per thread queue. > Because honestly right now using group_send_sig_info when > the intended target of the signal is not the entire thread > group is very confusing when reading your change. Agreed, so perhaps it makes sense to rename it later. See despite its name it should work fine even if type = PIDTYPE_PID. in the changelog above. Oleg.
On 02/08, Christian Brauner wrote: > > > Is prepare_kill_siginfo() correct when we send a signal to the child > > pid namespace? si_pid = task_tgid_vnr(current) doesn't look right in > > this case but perhaps I am totally confused. > > > > And why do we need it at all? Can't sys_kill() and pidfd_send_signal() > > just use SEND_SIG_NOINFO? > > Yeah, good point. I don't remember as it's been quite a while ago. My > guess is that it just tried to mirror kill() itself without being aware > of SEND_SIG_NOINFO. If you don't find anything wrong with this then > switch it to SEND_SIG_NOINFO in a preparatory patch we can backport, > please. Yes, but I still feel I must have missed something. Will read this code tomorrow. And another note for the record before I forget this. We can probably improve and rename access_pidfd_pidns(). Currently it is only used by pidfd_send_signal() but pidns_install() looks like another user. Oleg.
On Thu, Feb 08, 2024 at 04:57:31PM +0100, Oleg Nesterov wrote: > On 02/08, Eric W. Biederman wrote: > > > > Oleg Nesterov <oleg@redhat.com> writes: > > > > > Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any > > > pid_type to group_send_sig_info(), despite its name it should work fine > > > even if type = PIDTYPE_PID. > > > > > > Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending > > > on PIDFD_THREAD. > > > > > > While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian > > > expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD. > > > > > > > I have a question here. > > > > Why is this based on group_send_sig_info instead of send_sig_info? > > Well. send_sig_info() accepts "struct task_struct *", not "struct pid *", > it doesn't do check_kill_permission(), and it doesn't handle the possible > race with mt-exec. > > > In particular I am asking are the intended semantics that the signal is > > sent to a single thread in a thread group and placed in the per thread > > queue, or is the signal sent to the entire thread group and placed > > in the thread group signal queue? > > This depends on PIDFD_THREAD. If it is set then the signal goes to > the per thread queue. > > > Because honestly right now using group_send_sig_info when > > the intended target of the signal is not the entire thread > > group is very confusing when reading your change. > > Agreed, so perhaps it makes sense to rename it later. See Agreed. The function seems misnamed and incorrectly documented. It just seems that it's never been used with PIDTYPE_PID but it's perfectly capable of doing that. So maybe just put a patch on top renaming it to send_sig_info_type() and remove the old comment. But I can live without renaming it for now as well.
On 02/08, Oleg Nesterov wrote: > > Is prepare_kill_siginfo() correct when we send a signal to the child > pid namespace? si_pid = task_tgid_vnr(current) doesn't look right Yes, but iiuc send_signal_locked() should fixup si_pid/si_uid, so it is not buggy. > And why do we need it at all? Can't sys_kill() and pidfd_send_signal() > just use SEND_SIG_NOINFO? Probably yes. And even do_tkill() can use SEND_SIG_NOINFO if we change __send_signal_locked() to check the type before ".si_code = SI_USER". but then TP_STORE_SIGINFO() needs some changes... I'll try to do this later, I do not want to mix this change with the PIDFD_THREAD changes. Oleg.
On 02/09, Christian Brauner wrote: > > On Thu, Feb 08, 2024 at 04:57:31PM +0100, Oleg Nesterov wrote: > > On 02/08, Eric W. Biederman wrote: > > > > > > Because honestly right now using group_send_sig_info when > > > the intended target of the signal is not the entire thread > > > group is very confusing when reading your change. > > > > Agreed, so perhaps it makes sense to rename it later. See > > Agreed. The function seems misnamed and incorrectly documented. It just > seems that it's never been used with PIDTYPE_PID but it's perfectly > capable of doing that. So maybe just put a patch on top renaming it to > send_sig_info_type() and remove the old comment. But I can live without > renaming it for now as well. OK, I'll update the comment, please see below. It should probably say more about the case when type > PIDTYPE_TGID, but this is "offtopic" for this patch. Oleg. --- a/kernel/signal.c +++ b/kernel/signal.c @@ -1437,7 +1437,8 @@ void lockdep_assert_task_sighand_held(struct task_struct *task) #endif /* - * send signal info to all the members of a group + * send signal info to all the members of a thread group or to the + * individual thread if type == PIDTYPE_PID. */ int group_send_sig_info(int sig, struct kernel_siginfo *info, struct task_struct *p, enum pid_type type)
On Fri, Feb 09, 2024 at 11:28:17AM +0100, Oleg Nesterov wrote: > On 02/08, Oleg Nesterov wrote: > > > > Is prepare_kill_siginfo() correct when we send a signal to the child > > pid namespace? si_pid = task_tgid_vnr(current) doesn't look right > > Yes, but iiuc send_signal_locked() should fixup si_pid/si_uid, so it > is not buggy. It must've been. Yesterday I realized that otherwise kill(2) would have been broken for a long time. I think this was originally fixed in commit 6588c1e3ff01 ("signals: SI_USER: Masquerade si_pid when crossing pid ns boundary").
diff --git a/kernel/fork.c b/kernel/fork.c index cd61ca87d0e6..47b565598063 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2051,8 +2051,6 @@ static void pidfd_show_fdinfo(struct seq_file *m, struct file *f) seq_put_decimal_ll(m, "Pid:\t", nr); - /* TODO: report PIDFD_THREAD */ - #ifdef CONFIG_PID_NS seq_put_decimal_ll(m, "\nNSpid:\t", nr); if (nr > 0) { diff --git a/kernel/signal.c b/kernel/signal.c index c3fac06937e2..e3edcd784e45 100644 --- a/kernel/signal.c +++ b/kernel/signal.c @@ -47,6 +47,7 @@ #include <linux/cgroup.h> #include <linux/audit.h> #include <linux/sysctl.h> +#include <uapi/linux/pidfd.h> #define CREATE_TRACE_POINTS #include <trace/events/signal.h> @@ -1478,7 +1479,8 @@ int __kill_pgrp_info(int sig, struct kernel_siginfo *info, struct pid *pgrp) return ret; } -int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) +static int kill_pid_info_type(int sig, struct kernel_siginfo *info, + struct pid *pid, enum pid_type type) { int error = -ESRCH; struct task_struct *p; @@ -1487,11 +1489,10 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) rcu_read_lock(); p = pid_task(pid, PIDTYPE_PID); if (p) - error = group_send_sig_info(sig, info, p, PIDTYPE_TGID); + error = group_send_sig_info(sig, info, p, type); rcu_read_unlock(); if (likely(!p || error != -ESRCH)) return error; - /* * The task was unhashed in between, try again. If it * is dead, pid_task() will return NULL, if we race with @@ -1500,6 +1501,11 @@ int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) } } +int kill_pid_info(int sig, struct kernel_siginfo *info, struct pid *pid) +{ + return kill_pid_info_type(sig, info, pid, PIDTYPE_TGID); +} + static int kill_proc_info(int sig, struct kernel_siginfo *info, pid_t pid) { int error; @@ -3890,6 +3896,7 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, struct fd f; struct pid *pid; kernel_siginfo_t kinfo; + enum pid_type type; /* Enforce flags be set to 0 until we add an extension. */ if (flags) @@ -3928,9 +3935,8 @@ SYSCALL_DEFINE4(pidfd_send_signal, int, pidfd, int, sig, prepare_kill_siginfo(sig, &kinfo); } - /* TODO: respect PIDFD_THREAD */ - ret = kill_pid_info(sig, &kinfo, pid); - + type = (f.file->f_flags & PIDFD_THREAD) ? PIDTYPE_PID : PIDTYPE_TGID; + ret = kill_pid_info_type(sig, &kinfo, pid, type); err: fdput(f); return ret;