pid: Add the judgment of whether ns is NULL in the find_pid_ns

Message ID 20230713071713.5762-1-xuewen.yan@unisoc.com
State New
Headers
Series pid: Add the judgment of whether ns is NULL in the find_pid_ns |

Commit Message

Xuewen Yan July 13, 2023, 7:17 a.m. UTC
  There is no the judgment of whether namspace is NULL in find_pid_ns.
But there is a corner case when ns is null, for example: if user
call find_get_pid when current is in exiting, the following stack would
set thread_id be null:
release_task
    __exit_signal(p);
        __unhash_process(tsk, group_dead);
              detach_pid(p, PIDTYPE_PID);
                  __change_pid(task, type, NULL);

If user call find_get_pid at now, in find_vpid function, the
task_active_pid_ns would return NULL. As a result, it would be
error when access the ns in find_pid_ns.

So add the judgment of whether ns is NULL in the find_pid_ns to
prevent this case happen.

Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
---
 kernel/pid.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Xuewen Yan July 19, 2023, 11:46 a.m. UTC | #1
Dear all

Is there any comment about this patch?

Thanks!

On Thu, Jul 13, 2023 at 3:58 PM Xuewen Yan <xuewen.yan@unisoc.com> wrote:
>
> There is no the judgment of whether namspace is NULL in find_pid_ns.
> But there is a corner case when ns is null, for example: if user
> call find_get_pid when current is in exiting, the following stack would
> set thread_id be null:
> release_task
>     __exit_signal(p);
>         __unhash_process(tsk, group_dead);
>               detach_pid(p, PIDTYPE_PID);
>                   __change_pid(task, type, NULL);
>
> If user call find_get_pid at now, in find_vpid function, the
> task_active_pid_ns would return NULL. As a result, it would be
> error when access the ns in find_pid_ns.
>
> So add the judgment of whether ns is NULL in the find_pid_ns to
> prevent this case happen.
>
> Signed-off-by: Xuewen Yan <xuewen.yan@unisoc.com>
> ---
>  kernel/pid.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/pid.c b/kernel/pid.c
> index 6a1d23a11026..d4a9cb6f3eb9 100644
> --- a/kernel/pid.c
> +++ b/kernel/pid.c
> @@ -308,6 +308,9 @@ void disable_pid_allocation(struct pid_namespace *ns)
>
>  struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
>  {
> +       if (!ns)
> +               return NULL;
> +
>         return idr_find(&ns->idr, nr);
>  }
>  EXPORT_SYMBOL_GPL(find_pid_ns);
> --
> 2.25.1
>
  
Christian Brauner July 25, 2023, 8:26 a.m. UTC | #2
On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> There is no the judgment of whether namspace is NULL in find_pid_ns.
> But there is a corner case when ns is null, for example: if user
> call find_get_pid when current is in exiting, the following stack would
> set thread_id be null:
> release_task
>     __exit_signal(p);
>         __unhash_process(tsk, group_dead);
>               detach_pid(p, PIDTYPE_PID);
>                   __change_pid(task, type, NULL);
> 
> If user call find_get_pid at now, in find_vpid function, the

I fail to see how this can happen. The code you're referencing is in
release_task(). If current has gone through that then current obviously
can't call find_vpid() on itself anymore or anything else for that
matter.
  
Xuewen Yan July 25, 2023, 12:24 p.m. UTC | #3
On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > But there is a corner case when ns is null, for example: if user
> > call find_get_pid when current is in exiting, the following stack would
> > set thread_id be null:
> > release_task
> >     __exit_signal(p);
> >         __unhash_process(tsk, group_dead);
> >               detach_pid(p, PIDTYPE_PID);
> >                   __change_pid(task, type, NULL);
> >
> > If user call find_get_pid at now, in find_vpid function, the
>
> I fail to see how this can happen. The code you're referencing is in
> release_task(). If current has gone through that then current obviously
> can't call find_vpid() on itself anymore or anything else for that
> matter.

This happened when user calls  find_vpid() in irq.

[72117.635162] Call trace:
[72117.635595]  idr_find+0xc/0x24
[72117.636103]  find_get_pid+0x40/0x68
[72117.636812]  send_event+0x88/0x180 [demux]
[72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
[72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
[72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
[72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
[72117.641108]  tasklet_action_common+0x130/0x224
[72117.641784]  tasklet_action+0x28/0x34
[72117.642366]  __do_softirq+0x128/0x4dc
[72117.642944]  irq_exit+0xf8/0xfc
[72117.643459]  __handle_domain_irq+0xb0/0x108
[72117.644102]  gic_handle_irq+0x6c/0x124
[72117.644691]  el1_irq+0x108/0x200
[72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
[72117.645870]  release_task+0x144/0x1ac   <<<<<<
[72117.646447]  do_exit+0x524/0x94c
[72117.646970]  __do_sys_exit_group+0x0/0x14
[72117.647591]  do_group_exit+0x0/0xa0
[72117.648146]  __se_sys_exit+0x0/0x20
[72117.648704]  el0_svc_common+0xcc/0x1bc
[72117.649292]  el0_svc_handler+0x2c/0x3c
[72117.649881]  el0_svc+0x8/0xc

In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
this time, if user calls find_get_pid() in irq, because
current->thread_id is NULL,
it will handle the NULL pointer.

BR
---
xuewen
  
Christian Brauner July 25, 2023, 12:47 p.m. UTC | #4
On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > But there is a corner case when ns is null, for example: if user
> > > call find_get_pid when current is in exiting, the following stack would
> > > set thread_id be null:
> > > release_task
> > >     __exit_signal(p);
> > >         __unhash_process(tsk, group_dead);
> > >               detach_pid(p, PIDTYPE_PID);
> > >                   __change_pid(task, type, NULL);
> > >
> > > If user call find_get_pid at now, in find_vpid function, the
> >
> > I fail to see how this can happen. The code you're referencing is in
> > release_task(). If current has gone through that then current obviously
> > can't call find_vpid() on itself anymore or anything else for that
> > matter.
> 
> This happened when user calls  find_vpid() in irq.
> 
> [72117.635162] Call trace:
> [72117.635595]  idr_find+0xc/0x24
> [72117.636103]  find_get_pid+0x40/0x68
> [72117.636812]  send_event+0x88/0x180 [demux]
> [72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
> [72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> [72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
> [72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
> [72117.641108]  tasklet_action_common+0x130/0x224
> [72117.641784]  tasklet_action+0x28/0x34
> [72117.642366]  __do_softirq+0x128/0x4dc
> [72117.642944]  irq_exit+0xf8/0xfc
> [72117.643459]  __handle_domain_irq+0xb0/0x108
> [72117.644102]  gic_handle_irq+0x6c/0x124
> [72117.644691]  el1_irq+0x108/0x200
> [72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
> [72117.645870]  release_task+0x144/0x1ac   <<<<<<
> [72117.646447]  do_exit+0x524/0x94c
> [72117.646970]  __do_sys_exit_group+0x0/0x14
> [72117.647591]  do_group_exit+0x0/0xa0
> [72117.648146]  __se_sys_exit+0x0/0x20
> [72117.648704]  el0_svc_common+0xcc/0x1bc
> [72117.649292]  el0_svc_handler+0x2c/0x3c
> [72117.649881]  el0_svc+0x8/0xc
> 
> In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> this time, if user calls find_get_pid() in irq, because
> current->thread_id is NULL,
> it will handle the NULL pointer.

Uhm, where is that code from? This doesn't seem to be upstream.
  
Xuewen Yan July 26, 2023, 1:23 a.m. UTC | #5
On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > But there is a corner case when ns is null, for example: if user
> > > > call find_get_pid when current is in exiting, the following stack would
> > > > set thread_id be null:
> > > > release_task
> > > >     __exit_signal(p);
> > > >         __unhash_process(tsk, group_dead);
> > > >               detach_pid(p, PIDTYPE_PID);
> > > >                   __change_pid(task, type, NULL);
> > > >
> > > > If user call find_get_pid at now, in find_vpid function, the
> > >
> > > I fail to see how this can happen. The code you're referencing is in
> > > release_task(). If current has gone through that then current obviously
> > > can't call find_vpid() on itself anymore or anything else for that
> > > matter.
> >
> > This happened when user calls  find_vpid() in irq.
> >
> > [72117.635162] Call trace:
> > [72117.635595]  idr_find+0xc/0x24
> > [72117.636103]  find_get_pid+0x40/0x68
> > [72117.636812]  send_event+0x88/0x180 [demux]
> > [72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
> > [72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > [72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
> > [72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
> > [72117.641108]  tasklet_action_common+0x130/0x224
> > [72117.641784]  tasklet_action+0x28/0x34
> > [72117.642366]  __do_softirq+0x128/0x4dc
> > [72117.642944]  irq_exit+0xf8/0xfc
> > [72117.643459]  __handle_domain_irq+0xb0/0x108
> > [72117.644102]  gic_handle_irq+0x6c/0x124
> > [72117.644691]  el1_irq+0x108/0x200
> > [72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
> > [72117.645870]  release_task+0x144/0x1ac   <<<<<<
> > [72117.646447]  do_exit+0x524/0x94c
> > [72117.646970]  __do_sys_exit_group+0x0/0x14
> > [72117.647591]  do_group_exit+0x0/0xa0
> > [72117.648146]  __se_sys_exit+0x0/0x20
> > [72117.648704]  el0_svc_common+0xcc/0x1bc
> > [72117.649292]  el0_svc_handler+0x2c/0x3c
> > [72117.649881]  el0_svc+0x8/0xc
> >
> > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > this time, if user calls find_get_pid() in irq, because
> > current->thread_id is NULL,
> > it will handle the NULL pointer.
>
> Uhm, where is that code from? This doesn't seem to be upstream.

It's from our own platform, we found someone called  find_get_pid() in
the module, and caused the crash.

BR
  
Christian Brauner July 26, 2023, 9:25 a.m. UTC | #6
On Wed, Jul 26, 2023 at 09:23:13AM +0800, Xuewen Yan wrote:
> On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@kernel.org> wrote:
> > > >
> > > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > > But there is a corner case when ns is null, for example: if user
> > > > > call find_get_pid when current is in exiting, the following stack would
> > > > > set thread_id be null:
> > > > > release_task
> > > > >     __exit_signal(p);
> > > > >         __unhash_process(tsk, group_dead);
> > > > >               detach_pid(p, PIDTYPE_PID);
> > > > >                   __change_pid(task, type, NULL);
> > > > >
> > > > > If user call find_get_pid at now, in find_vpid function, the
> > > >
> > > > I fail to see how this can happen. The code you're referencing is in
> > > > release_task(). If current has gone through that then current obviously
> > > > can't call find_vpid() on itself anymore or anything else for that
> > > > matter.
> > >
> > > This happened when user calls  find_vpid() in irq.
> > >
> > > [72117.635162] Call trace:
> > > [72117.635595]  idr_find+0xc/0x24
> > > [72117.636103]  find_get_pid+0x40/0x68
> > > [72117.636812]  send_event+0x88/0x180 [demux]
> > > [72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
> > > [72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > > [72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
> > > [72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
> > > [72117.641108]  tasklet_action_common+0x130/0x224
> > > [72117.641784]  tasklet_action+0x28/0x34
> > > [72117.642366]  __do_softirq+0x128/0x4dc
> > > [72117.642944]  irq_exit+0xf8/0xfc
> > > [72117.643459]  __handle_domain_irq+0xb0/0x108
> > > [72117.644102]  gic_handle_irq+0x6c/0x124
> > > [72117.644691]  el1_irq+0x108/0x200
> > > [72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
> > > [72117.645870]  release_task+0x144/0x1ac   <<<<<<
> > > [72117.646447]  do_exit+0x524/0x94c
> > > [72117.646970]  __do_sys_exit_group+0x0/0x14
> > > [72117.647591]  do_group_exit+0x0/0xa0
> > > [72117.648146]  __se_sys_exit+0x0/0x20
> > > [72117.648704]  el0_svc_common+0xcc/0x1bc
> > > [72117.649292]  el0_svc_handler+0x2c/0x3c
> > > [72117.649881]  el0_svc+0x8/0xc
> > >
> > > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > > this time, if user calls find_get_pid() in irq, because
> > > current->thread_id is NULL,
> > > it will handle the NULL pointer.
> >
> > Uhm, where is that code from? This doesn't seem to be upstream.
> 
> It's from our own platform, we found someone called  find_get_pid() in
> the module, and caused the crash.

So this is a bug report for an out of tree driver which I'm sure you're
aware we consider mostly irrelevant unless this is an upstream issue.

Please work around or better fix this in your out of tree driver or
please show a reproducer how this can happen on upstream kernels.

Otherwise I don't see why we'd care.
  
Xuewen Yan July 27, 2023, 1:21 a.m. UTC | #7
On Wed, Jul 26, 2023 at 5:25 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Wed, Jul 26, 2023 at 09:23:13AM +0800, Xuewen Yan wrote:
> > On Tue, Jul 25, 2023 at 8:47 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Tue, Jul 25, 2023 at 08:24:18PM +0800, Xuewen Yan wrote:
> > > > On Tue, Jul 25, 2023 at 4:49 PM Christian Brauner <brauner@kernel.org> wrote:
> > > > >
> > > > > On Thu, Jul 13, 2023 at 03:17:13PM +0800, Xuewen Yan wrote:
> > > > > > There is no the judgment of whether namspace is NULL in find_pid_ns.
> > > > > > But there is a corner case when ns is null, for example: if user
> > > > > > call find_get_pid when current is in exiting, the following stack would
> > > > > > set thread_id be null:
> > > > > > release_task
> > > > > >     __exit_signal(p);
> > > > > >         __unhash_process(tsk, group_dead);
> > > > > >               detach_pid(p, PIDTYPE_PID);
> > > > > >                   __change_pid(task, type, NULL);
> > > > > >
> > > > > > If user call find_get_pid at now, in find_vpid function, the
> > > > >
> > > > > I fail to see how this can happen. The code you're referencing is in
> > > > > release_task(). If current has gone through that then current obviously
> > > > > can't call find_vpid() on itself anymore or anything else for that
> > > > > matter.
> > > >
> > > > This happened when user calls  find_vpid() in irq.
> > > >
> > > > [72117.635162] Call trace:
> > > > [72117.635595]  idr_find+0xc/0x24
> > > > [72117.636103]  find_get_pid+0x40/0x68
> > > > [72117.636812]  send_event+0x88/0x180 [demux]
> > > > [72117.637593]  vbvop_copy_data+0x150/0x344 [demux]
> > > > [72117.638434]  dmisr_video_parsing_mpeg12+0x29c/0x42c [demux]
> > > > [72117.639393]  dmisr_video_parsing_switch+0x68/0xec [demux]
> > > > [72117.640332]  dmisr_handle_video_pes+0x10c/0x26c [demux]
> > > > [72117.641108]  tasklet_action_common+0x130/0x224
> > > > [72117.641784]  tasklet_action+0x28/0x34
> > > > [72117.642366]  __do_softirq+0x128/0x4dc
> > > > [72117.642944]  irq_exit+0xf8/0xfc
> > > > [72117.643459]  __handle_domain_irq+0xb0/0x108
> > > > [72117.644102]  gic_handle_irq+0x6c/0x124
> > > > [72117.644691]  el1_irq+0x108/0x200
> > > > [72117.645217]  _raw_write_unlock_irq+0x2c/0x5c
> > > > [72117.645870]  release_task+0x144/0x1ac   <<<<<<
> > > > [72117.646447]  do_exit+0x524/0x94c
> > > > [72117.646970]  __do_sys_exit_group+0x0/0x14
> > > > [72117.647591]  do_group_exit+0x0/0xa0
> > > > [72117.648146]  __se_sys_exit+0x0/0x20
> > > > [72117.648704]  el0_svc_common+0xcc/0x1bc
> > > > [72117.649292]  el0_svc_handler+0x2c/0x3c
> > > > [72117.649881]  el0_svc+0x8/0xc
> > > >
> > > > In release_task, write_unlock_irq(&tasklist_lock) will open irq, at
> > > > this time, if user calls find_get_pid() in irq, because
> > > > current->thread_id is NULL,
> > > > it will handle the NULL pointer.
> > >
> > > Uhm, where is that code from? This doesn't seem to be upstream.
> >
> > It's from our own platform, we found someone called  find_get_pid() in
> > the module, and caused the crash.
>
> So this is a bug report for an out of tree driver which I'm sure you're
> aware we consider mostly irrelevant unless this is an upstream issue.
>
> Please work around or better fix this in your out of tree driver or
> please show a reproducer how this can happen on upstream kernels.
>
> Otherwise I don't see why we'd care.

Okay, Thanks a lot!

BR
  

Patch

diff --git a/kernel/pid.c b/kernel/pid.c
index 6a1d23a11026..d4a9cb6f3eb9 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -308,6 +308,9 @@  void disable_pid_allocation(struct pid_namespace *ns)
 
 struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
 {
+	if (!ns)
+		return NULL;
+
 	return idr_find(&ns->idr, nr);
 }
 EXPORT_SYMBOL_GPL(find_pid_ns);