ptrace: disable single step in __ptrace_unlink for protecting init task

Message ID 20221024094759.11434-1-chenzhang@kylinos.cn
State New
Headers
Series ptrace: disable single step in __ptrace_unlink for protecting init task |

Commit Message

chen zhang Oct. 24, 2022, 9:47 a.m. UTC
  I got below panic when doing fuzz test:

Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
CPU: 2 PID: 1 Comm: systemd Kdump: loaded Not tainted 6.1.0-rc1 #1
Hardware name: LENOVO 20L5A07XCD/20L5A07XCD, BIOS N24ET56W (1.31 ) 02/19/2020
Call Trace:
[  157.210356]  dump_stack_lvl+0x49/0x63
[  157.210364]  dump_stack+0x10/0x16
[  157.210368]  panic+0x10c/0x299
[  157.210375]  do_exit.cold+0x15/0x15
[  157.210381]  do_group_exit+0x35/0x90
[  157.210386]  get_signal+0x910/0x960
[  157.210390]  ? signal_wake_up_state+0x2e/0x40
[  157.210396]  ? complete_signal+0xd0/0x2c0
[  157.210402]  arch_do_signal_or_restart+0x37/0x7c0
[  157.210408]  ? send_signal_locked+0xf5/0x140
[  157.210416]  exit_to_user_mode_prepare+0x133/0x180
[  157.210423]  irqentry_exit_to_user_mode+0x9/0x20
[  157.210428]  noist_exc_debug+0xea/0x150
[  157.210433]  asm_exc_debug+0x34/0x40
[  157.210438] RIP: 0033:0x7fcf2a8e51c9
[  157.210442] Code: ff ff 73 01 c3 48 8b 0d c5 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 ba 00 00 00 <0f> 05 c3 0f 1f 40 00 f3 0f 1e fa b8 ea 00 00 00 0f 05 48 3d 01 f0
[  157.210446] RSP: 002b:00007ffd7dc44678 EFLAGS: 00000302
[  157.210451] RAX: 00000000000000ba RBX: 000055f7c0363170 RCX: 000055f7c04d2820
[  157.210454] RDX: 00000000ffffffff RSI: ffffffffffffffff RDI: 000055f7c0363170
[  157.210457] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000001dd0
[  157.210460] R10: 00007ffd7ddc9090 R11: 000000000000d7da R12: 0000000000000001
[  157.210463] R13: ffffffffffffffff R14: 000055f7bf3557c1 R15: 0000000000000000

If a task attaches init task and is single stepping it, when this task
exits, ptrace value will be cleaned. It causes SIGNAL_UNKILLABLE flag
cleaned, and init task will lose the protection. Init task maybe be killed
by SIGTRAP signal because of stepping enabled. So syscall tracing and
stepping should be turned off for protecting init task before ptrace value
is cleaned.

Signed-off-by: chen zhang <chenzhang@kylinos.cn>
---
 kernel/ptrace.c | 3 +++
 1 file changed, 3 insertions(+)
  

Comments

Oleg Nesterov Oct. 24, 2022, 11:19 a.m. UTC | #1
On 10/24, chen zhang wrote:
>
> I got below panic when doing fuzz test:
>
> Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000005
> CPU: 2 PID: 1 Comm: systemd Kdump: loaded Not tainted 6.1.0-rc1 #1
> Hardware name: LENOVO 20L5A07XCD/20L5A07XCD, BIOS N24ET56W (1.31 ) 02/19/2020
> Call Trace:
> [  157.210356]  dump_stack_lvl+0x49/0x63
> [  157.210364]  dump_stack+0x10/0x16
> [  157.210368]  panic+0x10c/0x299
> [  157.210375]  do_exit.cold+0x15/0x15
> [  157.210381]  do_group_exit+0x35/0x90
> [  157.210386]  get_signal+0x910/0x960
> [  157.210390]  ? signal_wake_up_state+0x2e/0x40
> [  157.210396]  ? complete_signal+0xd0/0x2c0
> [  157.210402]  arch_do_signal_or_restart+0x37/0x7c0
> [  157.210408]  ? send_signal_locked+0xf5/0x140
> [  157.210416]  exit_to_user_mode_prepare+0x133/0x180
> [  157.210423]  irqentry_exit_to_user_mode+0x9/0x20
> [  157.210428]  noist_exc_debug+0xea/0x150
> [  157.210433]  asm_exc_debug+0x34/0x40
> [  157.210438] RIP: 0033:0x7fcf2a8e51c9
> [  157.210442] Code: ff ff 73 01 c3 48 8b 0d c5 7c 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 ba 00 00 00 <0f> 05 c3 0f 1f 40 00 f3 0f 1e fa b8 ea 00 00 00 0f 05 48 3d 01 f0
> [  157.210446] RSP: 002b:00007ffd7dc44678 EFLAGS: 00000302
> [  157.210451] RAX: 00000000000000ba RBX: 000055f7c0363170 RCX: 000055f7c04d2820
> [  157.210454] RDX: 00000000ffffffff RSI: ffffffffffffffff RDI: 000055f7c0363170
> [  157.210457] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000001dd0
> [  157.210460] R10: 00007ffd7ddc9090 R11: 000000000000d7da R12: 0000000000000001
> [  157.210463] R13: ffffffffffffffff R14: 000055f7bf3557c1 R15: 0000000000000000
>
> If a task attaches init task and is single stepping it, when this task
> exits, ptrace value will be cleaned. It causes SIGNAL_UNKILLABLE flag
> cleaned, and init task will lose the protection. Init task maybe be killed
> by SIGTRAP signal because of stepping enabled.

Well. If debugger just exits then the tracee (init or a regular process) can be
killed. This is the application bug. Debugger should not exit unexpectedly, that
is all.

The kernel can't "cleanup" the state of the tracee anyway. Say, debugger installs
a breakpoint and exits. The tracee will be killed once it hits this bp. What can
the kernel do?

Not to mention I don't understand how your patch can actually help. If nothing
else,

	- debugger does ptrace(PTRACE_SINGLESTEP), this wakes the tracee up

	- the tracee enters force_sig_info_to_task(SIGTRAP) after single step

	- debugger exits, __ptrace_unlink() clears ptrace/TIF_SINGLESTEP

	- force_sig_info_to_task() clears SIGNAL_UNKILLABLE, the traced init
	  will be killed.

Oleg.
  
kernel test robot Oct. 24, 2022, 6:42 p.m. UTC | #2
Hi chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
patch link:    https://lore.kernel.org/r/20221024094759.11434-1-chenzhang%40kylinos.cn
patch subject: [PATCH] ptrace: disable single step in __ptrace_unlink for protecting init task
config: mips-bcm47xx_defconfig
compiler: mipsel-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/30b32cd2ca59b28cda083521c0689decf9fe6243
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
        git checkout 30b32cd2ca59b28cda083521c0689decf9fe6243
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=mips SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/mips/include/asm/bug.h:5,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:13,
                    from include/asm-generic/current.h:5,
                    from ./arch/mips/include/generated/asm/current.h:1,
                    from include/linux/sched.h:12,
                    from kernel/ptrace.c:13:
   kernel/ptrace.c: In function '__ptrace_unlink':
>> kernel/ptrace.c:134:55: error: '_TIF_SINGLESTEP' undeclared (first use in this function); did you mean 'PTRACE_SINGLESTEP'?
     134 |             unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
         |                                                       ^~~~~~~~~~~~~~~
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^
   kernel/ptrace.c:134:55: note: each undeclared identifier is reported only once for each function it appears in
     134 |             unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
         |                                                       ^~~~~~~~~~~~~~~
   include/linux/compiler.h:78:45: note: in definition of macro 'unlikely'
      78 | # define unlikely(x)    __builtin_expect(!!(x), 0)
         |                                             ^


vim +134 kernel/ptrace.c

  > 13	#include <linux/sched.h>
    14	#include <linux/sched/mm.h>
    15	#include <linux/sched/coredump.h>
    16	#include <linux/sched/task.h>
    17	#include <linux/errno.h>
    18	#include <linux/mm.h>
    19	#include <linux/highmem.h>
    20	#include <linux/pagemap.h>
    21	#include <linux/ptrace.h>
    22	#include <linux/security.h>
    23	#include <linux/signal.h>
    24	#include <linux/uio.h>
    25	#include <linux/audit.h>
    26	#include <linux/pid_namespace.h>
    27	#include <linux/syscalls.h>
    28	#include <linux/uaccess.h>
    29	#include <linux/regset.h>
    30	#include <linux/hw_breakpoint.h>
    31	#include <linux/cn_proc.h>
    32	#include <linux/compat.h>
    33	#include <linux/sched/signal.h>
    34	#include <linux/minmax.h>
    35	
    36	#include <asm/syscall.h>	/* for syscall_get_* */
    37	
    38	/*
    39	 * Access another process' address space via ptrace.
    40	 * Source/target buffer must be kernel space,
    41	 * Do not walk the page table directly, use get_user_pages
    42	 */
    43	int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
    44			     void *buf, int len, unsigned int gup_flags)
    45	{
    46		struct mm_struct *mm;
    47		int ret;
    48	
    49		mm = get_task_mm(tsk);
    50		if (!mm)
    51			return 0;
    52	
    53		if (!tsk->ptrace ||
    54		    (current != tsk->parent) ||
    55		    ((get_dumpable(mm) != SUID_DUMP_USER) &&
    56		     !ptracer_capable(tsk, mm->user_ns))) {
    57			mmput(mm);
    58			return 0;
    59		}
    60	
    61		ret = __access_remote_vm(mm, addr, buf, len, gup_flags);
    62		mmput(mm);
    63	
    64		return ret;
    65	}
    66	
    67	
    68	void __ptrace_link(struct task_struct *child, struct task_struct *new_parent,
    69			   const struct cred *ptracer_cred)
    70	{
    71		BUG_ON(!list_empty(&child->ptrace_entry));
    72		list_add(&child->ptrace_entry, &new_parent->ptraced);
    73		child->parent = new_parent;
    74		child->ptracer_cred = get_cred(ptracer_cred);
    75	}
    76	
    77	/*
    78	 * ptrace a task: make the debugger its new parent and
    79	 * move it to the ptrace list.
    80	 *
    81	 * Must be called with the tasklist lock write-held.
    82	 */
    83	static void ptrace_link(struct task_struct *child, struct task_struct *new_parent)
    84	{
    85		__ptrace_link(child, new_parent, current_cred());
    86	}
    87	
    88	/**
    89	 * __ptrace_unlink - unlink ptracee and restore its execution state
    90	 * @child: ptracee to be unlinked
    91	 *
    92	 * Remove @child from the ptrace list, move it back to the original parent,
    93	 * and restore the execution state so that it conforms to the group stop
    94	 * state.
    95	 *
    96	 * Unlinking can happen via two paths - explicit PTRACE_DETACH or ptracer
    97	 * exiting.  For PTRACE_DETACH, unless the ptracee has been killed between
    98	 * ptrace_check_attach() and here, it's guaranteed to be in TASK_TRACED.
    99	 * If the ptracer is exiting, the ptracee can be in any state.
   100	 *
   101	 * After detach, the ptracee should be in a state which conforms to the
   102	 * group stop.  If the group is stopped or in the process of stopping, the
   103	 * ptracee should be put into TASK_STOPPED; otherwise, it should be woken
   104	 * up from TASK_TRACED.
   105	 *
   106	 * If the ptracee is in TASK_TRACED and needs to be moved to TASK_STOPPED,
   107	 * it goes through TRACED -> RUNNING -> STOPPED transition which is similar
   108	 * to but in the opposite direction of what happens while attaching to a
   109	 * stopped task.  However, in this direction, the intermediate RUNNING
   110	 * state is not hidden even from the current ptracer and if it immediately
   111	 * re-attaches and performs a WNOHANG wait(2), it may fail.
   112	 *
   113	 * CONTEXT:
   114	 * write_lock_irq(tasklist_lock)
   115	 */
   116	void __ptrace_unlink(struct task_struct *child)
   117	{
   118		const struct cred *old_cred;
   119		BUG_ON(!child->ptrace);
   120	
   121		clear_task_syscall_work(child, SYSCALL_TRACE);
   122	#if defined(CONFIG_GENERIC_ENTRY) || defined(TIF_SYSCALL_EMU)
   123		clear_task_syscall_work(child, SYSCALL_EMU);
   124	#endif
   125	
   126		child->parent = child->real_parent;
   127		list_del_init(&child->ptrace_entry);
   128		old_cred = child->ptracer_cred;
   129		child->ptracer_cred = NULL;
   130		put_cred(old_cred);
   131	
   132		spin_lock(&child->sighand->siglock);
   133		if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
 > 134		    unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
   135			user_disable_single_step(child);
   136		child->ptrace = 0;
   137		/*
   138		 * Clear all pending traps and TRAPPING.  TRAPPING should be
   139		 * cleared regardless of JOBCTL_STOP_PENDING.  Do it explicitly.
   140		 */
   141		task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
   142		task_clear_jobctl_trapping(child);
   143	
   144		/*
   145		 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
   146		 * @child isn't dead.
   147		 */
   148		if (!(child->flags & PF_EXITING) &&
   149		    (child->signal->flags & SIGNAL_STOP_STOPPED ||
   150		     child->signal->group_stop_count)) {
   151			child->jobctl |= JOBCTL_STOP_PENDING;
   152	
   153			/*
   154			 * This is only possible if this thread was cloned by the
   155			 * traced task running in the stopped group, set the signal
   156			 * for the future reports.
   157			 * FIXME: we should change ptrace_init_task() to handle this
   158			 * case.
   159			 */
   160			if (!(child->jobctl & JOBCTL_STOP_SIGMASK))
   161				child->jobctl |= SIGSTOP;
   162		}
   163	
   164		/*
   165		 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
   166		 * @child in the butt.  Note that @resume should be used iff @child
   167		 * is in TASK_TRACED; otherwise, we might unduly disrupt
   168		 * TASK_KILLABLE sleeps.
   169		 */
   170		if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
   171			ptrace_signal_wake_up(child, true);
   172	
   173		spin_unlock(&child->sighand->siglock);
   174	}
   175
  
kernel test robot Oct. 24, 2022, 7:43 p.m. UTC | #3
Hi chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.1-rc2 next-20221024]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
patch link:    https://lore.kernel.org/r/20221024094759.11434-1-chenzhang%40kylinos.cn
patch subject: [PATCH] ptrace: disable single step in __ptrace_unlink for protecting init task
config: arm-versatile_defconfig
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 791a7ae1ba3efd6bca96338e10ffde557ba83920)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm cross compiling tool for clang build
        # apt-get install binutils-arm-linux-gnueabi
        # https://github.com/intel-lab-lkp/linux/commit/30b32cd2ca59b28cda083521c0689decf9fe6243
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review chen-zhang/ptrace-disable-single-step-in-__ptrace_unlink-for-protecting-init-task/20221024-175119
        git checkout 30b32cd2ca59b28cda083521c0689decf9fe6243
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> kernel/ptrace.c:134:48: error: use of undeclared identifier '_TIF_SINGLESTEP'
               unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
                                                         ^
   1 error generated.


vim +/_TIF_SINGLESTEP +134 kernel/ptrace.c

   125	
   126		child->parent = child->real_parent;
   127		list_del_init(&child->ptrace_entry);
   128		old_cred = child->ptracer_cred;
   129		child->ptracer_cred = NULL;
   130		put_cred(old_cred);
   131	
   132		spin_lock(&child->sighand->siglock);
   133		if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
 > 134		    unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
   135			user_disable_single_step(child);
   136		child->ptrace = 0;
   137		/*
   138		 * Clear all pending traps and TRAPPING.  TRAPPING should be
   139		 * cleared regardless of JOBCTL_STOP_PENDING.  Do it explicitly.
   140		 */
   141		task_clear_jobctl_pending(child, JOBCTL_TRAP_MASK);
   142		task_clear_jobctl_trapping(child);
   143	
   144		/*
   145		 * Reinstate JOBCTL_STOP_PENDING if group stop is in effect and
   146		 * @child isn't dead.
   147		 */
   148		if (!(child->flags & PF_EXITING) &&
   149		    (child->signal->flags & SIGNAL_STOP_STOPPED ||
   150		     child->signal->group_stop_count)) {
   151			child->jobctl |= JOBCTL_STOP_PENDING;
   152	
   153			/*
   154			 * This is only possible if this thread was cloned by the
   155			 * traced task running in the stopped group, set the signal
   156			 * for the future reports.
   157			 * FIXME: we should change ptrace_init_task() to handle this
   158			 * case.
   159			 */
   160			if (!(child->jobctl & JOBCTL_STOP_SIGMASK))
   161				child->jobctl |= SIGSTOP;
   162		}
   163	
   164		/*
   165		 * If transition to TASK_STOPPED is pending or in TASK_TRACED, kick
   166		 * @child in the butt.  Note that @resume should be used iff @child
   167		 * is in TASK_TRACED; otherwise, we might unduly disrupt
   168		 * TASK_KILLABLE sleeps.
   169		 */
   170		if (child->jobctl & JOBCTL_STOP_PENDING || task_is_traced(child))
   171			ptrace_signal_wake_up(child, true);
   172	
   173		spin_unlock(&child->sighand->siglock);
   174	}
   175
  

Patch

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 54482193e1ed..e7c41154b31e 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -130,6 +130,9 @@  void __ptrace_unlink(struct task_struct *child)
 	put_cred(old_cred);
 
 	spin_lock(&child->sighand->siglock);
+	if (unlikely(child->signal->flags & SIGNAL_UNKILLABLE) &&
+	    unlikely(task_thread_info(child)->flags & _TIF_SINGLESTEP))
+		user_disable_single_step(child);
 	child->ptrace = 0;
 	/*
 	 * Clear all pending traps and TRAPPING.  TRAPPING should be