diff options
author | Christian Brauner <brauner@kernel.org> | 2025-03-16 13:49:09 +0100 |
---|---|---|
committer | Christian Brauner <brauner@kernel.org> | 2025-03-19 14:40:18 +0100 |
commit | 68db272741834d5e528b5e1af6b83b479fec6cbb (patch) | |
tree | 70e0a7630db3e7887394bd13b1d4bca03110b583 /kernel/fork.c | |
parent | 6092c50160056dfa0f747df5e8b107124146e363 (diff) |
pidfs: ensure that PIDFS_INFO_EXIT is available
When we currently create a pidfd we check that the task hasn't been
reaped right before we create the pidfd. But it is of course possible
that by the time we return the pidfd to userspace the task has already
been reaped since we don't check again after having created a dentry for
it.
This was fine until now because that race was meaningless. But now that
we provide PIDFD_INFO_EXIT it is a problem because it is possible that
the kernel returns a reaped pidfd and it depends on the race whether
PIDFD_INFO_EXIT information is available. This depends on if the task
gets reaped before or after a dentry has been attached to struct pid.
Make this consistent and only returned pidfds for reaped tasks if
PIDFD_INFO_EXIT information is available. This is done by performing
another check whether the task has been reaped right after we attached a
dentry to struct pid.
Since pidfs_exit() is called before struct pid's task linkage is removed
the case where the task got reaped but a dentry was already attached to
struct pid and exit information was recorded and published can be
handled correctly. In that case we do return a pidfd for a reaped task
like we would've before.
Link: https://lore.kernel.org/r/20250316-kabel-fehden-66bdb6a83436@brauner
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Diffstat (limited to 'kernel/fork.c')
-rw-r--r-- | kernel/fork.c | 7 |
1 files changed, 5 insertions, 2 deletions
diff --git a/kernel/fork.c b/kernel/fork.c index 8eac9cd3385b..f11ac96b7587 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2425,8 +2425,11 @@ __latent_entropy struct task_struct *copy_process( if (clone_flags & CLONE_PIDFD) { int flags = (clone_flags & CLONE_THREAD) ? PIDFD_THREAD : 0; - /* Note that no task has been attached to @pid yet. */ - retval = __pidfd_prepare(pid, flags, &pidfile); + /* + * Note that no task has been attached to @pid yet indicate + * that via CLONE_PIDFD. + */ + retval = __pidfd_prepare(pid, flags | PIDFD_CLONE, &pidfile); if (retval < 0) goto bad_fork_free_pid; pidfd = retval; |