[v3,0/1] pidfd: implement PIDFD_THREAD flag for pidfd_open()

Message ID 20240131132541.GA23632@redhat.com
State New
Headers

Commit Message

Oleg Nesterov Jan. 31, 2024, 1:25 p.m. UTC
  Please see the interdiff below.

Also, I updated the changelog to document that the behaviour of
pidfd_poll(PIDFD_THREAD, pid-of-group-leader) is not well defined
if a sub-thread execs.

Do you agree with this semantics?

Oleg.
---
  

Comments

Christian Brauner Jan. 31, 2024, 1:58 p.m. UTC | #1
On Wed, Jan 31, 2024 at 02:25:41PM +0100, Oleg Nesterov wrote:
> Please see the interdiff below.
> 
> Also, I updated the changelog to document that the behaviour of
> pidfd_poll(PIDFD_THREAD, pid-of-group-leader) is not well defined
> if a sub-thread execs.
> 
> Do you agree with this semantics?

Yeah, that seems fine to me.
  
Oleg Nesterov Jan. 31, 2024, 2:12 p.m. UTC | #2
Before I forget this...

After this patch we can easily add another feature, pidfd_poll()
can add, say, POLLHUP to poll_flags if the pid is "dead".

So the user can do

	poll(pidfd, { .revents = POLLHUP });

and it will block until release_task() is called and this pid is
no longer in use (pid_task() == NULL).

Do you think this can be useful?

Oleg.
  
Christian Brauner Jan. 31, 2024, 2:39 p.m. UTC | #3
On Wed, Jan 31, 2024 at 03:12:04PM +0100, Oleg Nesterov wrote:
> Before I forget this...
> 
> After this patch we can easily add another feature, pidfd_poll()
> can add, say, POLLHUP to poll_flags if the pid is "dead".
> 
> So the user can do
> 
> 	poll(pidfd, { .revents = POLLHUP });
> 
> and it will block until release_task() is called and this pid is
> no longer in use (pid_task() == NULL).
> 
> Do you think this can be useful?

Yeah, I think this is something that people would find useful. IIUC, it
would essentially allow them to do things like wait until a task has
been waited upon which for service managers is very useful information.

---

Btw, bigger picture for a second. You probably haven't kept track of
this but the fact that we got pidfds - thanks a lot to your review and
help - has made quite a huge difference in userspace. Since we last did
any meaningful work we got:

* systemd completely relying on pidfds to manage services to guard
  against any pid races.
* Extended dbus to allow authentication via pidfds.
* Extended policy kit to enable secure authentication of processes via pidfds.
* Language support for pidfds: Go, Rust etc.
* An endless number of tools that added support for them.
* glibc support for pidfd apis.

There's a bunch more. That literally obliterated whole bug classes.

I think with PIDFD_THREAD support it might make it interesting to use it
for some pthread management as well.
  
Oleg Nesterov Jan. 31, 2024, 4:10 p.m. UTC | #4
On 01/31, Christian Brauner wrote:
>
> On Wed, Jan 31, 2024 at 03:12:04PM +0100, Oleg Nesterov wrote:
> >
> > After this patch we can easily add another feature, pidfd_poll()
> > can add, say, POLLHUP to poll_flags if the pid is "dead".
> >
> > So the user can do
> >
> > 	poll(pidfd, { .revents = POLLHUP });
> >
> > and it will block until release_task() is called and this pid is
> > no longer in use (pid_task() == NULL).
> >
> > Do you think this can be useful?
>
> Yeah, I think this is something that people would find useful. IIUC, it
> would essentially allow them to do things like wait until a task has
> been waited upon

Exactly.

OK. I'll try to make the (hopefully simple) patch on top of this one
on Friday, if Tycho agrees with V3. Will be busy tomorrow.

> * systemd completely relying on pidfds to manage services to guard
>   against any pid races.
> * Extended dbus to allow authentication via pidfds.
> * Extended policy kit to enable secure authentication of processes via pidfds.
> * Language support for pidfds: Go, Rust etc.
> * An endless number of tools that added support for them.
> * glibc support for pidfd apis.
>
> There's a bunch more. That literally obliterated whole bug classes.

Thanks for this info!

Not that I ever thouhgt that pidfd is "useless", not at all, but as I said
(and as a Perl progammer ;) I simply do not know what people actually do with
pidfds ;)

Oleg.
  

Patch

diff --git a/fs/exec.c b/fs/exec.c
index 73e4045df271..0fd7e668c477 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1143,7 +1143,11 @@  static int de_thread(struct task_struct *tsk)
 
 		BUG_ON(leader->exit_state != EXIT_ZOMBIE);
 		leader->exit_state = EXIT_DEAD;
-
+		/*
+		 * leader and tsk exhanged their pids, the old pid dies,
+		 * wake up the PIDFD_THREAD waiters.
+		 */
+		do_notify_pidfd(leader);
 		/*
 		 * We are going to release_task()->ptrace_unlink() silently,
 		 * the tracer can sleep in do_wait(). EXIT_DEAD guarantees
diff --git a/include/linux/pid.h b/include/linux/pid.h
index e6a041cb8bac..8124d57752b9 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -70,10 +70,11 @@  extern const struct file_operations pidfd_fops;
 
 struct file;
 
-extern struct pid *pidfd_pid(const struct file *file);
+struct pid *pidfd_pid(const struct file *file);
 struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
 struct task_struct *pidfd_get_task(int pidfd, unsigned int *flags);
 int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret);
+void do_notify_pidfd(struct task_struct *task);
 
 static inline struct pid *get_pid(struct pid *pid)
 {
diff --git a/kernel/signal.c b/kernel/signal.c
index 5f48d2c4b409..9b40109f0c56 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2019,7 +2019,7 @@  int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
 	return ret;
 }
 
-static void do_notify_pidfd(struct task_struct *task)
+void do_notify_pidfd(struct task_struct *task)
 {
 	struct pid *pid;