wchan: Fix get_wchan() when task in schedule

Message ID 20230330121238.176534-1-chenzhongjin@huawei.com
State New

Commit Message

Chen Zhongjin March 30, 2023, 12:12 p.m. UTC
get_wchan() checks that the task to unwind is neither running nor about
to run with:

  state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq

However, this check cannot detect a task that is in the middle of being
scheduled out. For example, in this path:

  __wait_for_common(x, schedule_timeout, timeout, TASK_UNINTERRUPTIBLE)
  do_wait_for_common() // state == TASK_UNINTERRUPTIBLE
  schedule_timeout()
  __schedule()
    deactivate_task() // on_rq = 0

After this point get_wchan() can be run on the task even though it is
still running, and holding p->pi_lock does not help in this case.

This can trigger warnings, because the stack trace is taken on a running
task. Also checking p->on_cpu, to ensure that the task has really been
switched out, prevents this.

Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
---
 kernel/sched/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
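
The window described in the commit message can be illustrated with a small
user-space model. This is only a sketch: struct model_task, the MODEL_*
values, and the function names are hypothetical stand-ins for p->__state,
p->on_rq and p->on_cpu, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel task states (values are arbitrary). */
enum model_state {
	MODEL_RUNNING = 0,
	MODEL_UNINTERRUPTIBLE = 2,
	MODEL_WAKING = 0x200,
};

/* Hypothetical model of the three fields the check looks at. */
struct model_task {
	unsigned int state;	/* models p->__state */
	int on_rq;		/* models p->on_rq */
	int on_cpu;		/* models p->on_cpu */
};

/* Condition used before the patch. */
static bool blocked_old(const struct model_task *p)
{
	return p->state != MODEL_RUNNING && p->state != MODEL_WAKING &&
	       !p->on_rq;
}

/* Condition proposed by the patch: additionally require !on_cpu. */
static bool blocked_new(const struct model_task *p)
{
	return blocked_old(p) && !p->on_cpu;
}

void model_self_check(void)
{
	/* Inside __schedule(), after deactivate_task(): dequeued, still executing. */
	struct model_task in_schedule = { MODEL_UNINTERRUPTIBLE, 0, 1 };
	/* The same task once the context switch has completed. */
	struct model_task switched_out = { MODEL_UNINTERRUPTIBLE, 0, 0 };

	assert(blocked_old(&in_schedule));	/* old check lets the unwind proceed */
	assert(!blocked_new(&in_schedule));	/* new check rejects it */
	assert(blocked_new(&switched_out));	/* a truly blocked task still passes */
}
```

Calling model_self_check() from a test program exercises both conditions on
the two task snapshots; the model ignores locking and memory ordering.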
  

Comments

kernel test robot March 30, 2023, 1:53 p.m. UTC | #1
Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on linus/master v6.3-rc4 next-20230330]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chen-Zhongjin/wchan-Fix-get_wchan-when-task-in-schedule/20230330-201555
patch link:    https://lore.kernel.org/r/20230330121238.176534-1-chenzhongjin%40huawei.com
patch subject: [PATCH] wchan: Fix get_wchan() when task in schedule
config: sh-allmodconfig (https://download.01.org/0day-ci/archive/20230330/202303302125.7Ku9P7v5-lkp@intel.com/config)
compiler: sh4-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/d5fd727a071ab3c2241f858e77c2ae5bb3cec6f3
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Chen-Zhongjin/wchan-Fix-get_wchan-when-task-in-schedule/20230330-201555
        git checkout d5fd727a071ab3c2241f858e77c2ae5bb3cec6f3
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=sh SHELL=/bin/bash kernel/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303302125.7Ku9P7v5-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/sched/core.c: In function 'get_wchan':
>> kernel/sched/core.c:2060:28: error: 'struct task_struct' has no member named 'on_cpu'
    2060 |             !p->on_rq && !p->on_cpu)
         |                            ^~


vim +2060 kernel/sched/core.c

  2046	
  2047	unsigned long get_wchan(struct task_struct *p)
  2048	{
  2049		unsigned long ip = 0;
  2050		unsigned int state;
  2051	
  2052		if (!p || p == current)
  2053			return 0;
  2054	
  2055		/* Only get wchan if task is blocked and we can keep it that way. */
  2056		raw_spin_lock_irq(&p->pi_lock);
  2057		state = READ_ONCE(p->__state);
  2058		smp_rmb(); /* see try_to_wake_up() */
  2059		if (state != TASK_RUNNING && state != TASK_WAKING &&
> 2060		    !p->on_rq && !p->on_cpu)
  2061			ip = __get_wchan(p);
  2062		raw_spin_unlock_irq(&p->pi_lock);
  2063	
  2064		return ip;
  2065	}
  2066
  
kernel test robot March 30, 2023, 4:58 p.m. UTC | #2
Hi Chen,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/sched/core]
[also build test ERROR on linus/master v6.3-rc4 next-20230330]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Chen-Zhongjin/wchan-Fix-get_wchan-when-task-in-schedule/20230330-201555
patch link:    https://lore.kernel.org/r/20230330121238.176534-1-chenzhongjin%40huawei.com
patch subject: [PATCH] wchan: Fix get_wchan() when task in schedule
config: mips-randconfig-r006-20230329 (https://download.01.org/0day-ci/archive/20230331/202303310019.uMAqiUA4-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 67409911353323ca5edf2049ef0df54132fa1ca7)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install mips cross compiling tool for clang build
        # apt-get install binutils-mipsel-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/d5fd727a071ab3c2241f858e77c2ae5bb3cec6f3
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Chen-Zhongjin/wchan-Fix-get_wchan-when-task-in-schedule/20230330-201555
        git checkout d5fd727a071ab3c2241f858e77c2ae5bb3cec6f3
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=mips olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=mips SHELL=/bin/bash kernel/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303310019.uMAqiUA4-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/sched/core.c:2060:23: error: no member named 'on_cpu' in 'struct task_struct'
               !p->on_rq && !p->on_cpu)
                             ~  ^
   1 error generated.


vim +2060 kernel/sched/core.c

  2046	
  2047	unsigned long get_wchan(struct task_struct *p)
  2048	{
  2049		unsigned long ip = 0;
  2050		unsigned int state;
  2051	
  2052		if (!p || p == current)
  2053			return 0;
  2054	
  2055		/* Only get wchan if task is blocked and we can keep it that way. */
  2056		raw_spin_lock_irq(&p->pi_lock);
  2057		state = READ_ONCE(p->__state);
  2058		smp_rmb(); /* see try_to_wake_up() */
  2059		if (state != TASK_RUNNING && state != TASK_WAKING &&
> 2060		    !p->on_rq && !p->on_cpu)
  2061			ip = __get_wchan(p);
  2062		raw_spin_unlock_irq(&p->pi_lock);
  2063	
  2064		return ip;
  2065	}
  2066
  
Chen Zhongjin March 31, 2023, 1:08 a.m. UTC | #3
This seems to be caused by !CONFIG_SMP builds, where struct task_struct
has no on_cpu member. I'll send a new version.
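
One possible shape for a build-safe check, sketched in user space. The names
struct up_task, up_task_on_cpu() and the MODEL_SMP macro are hypothetical and
only mimic the CONFIG_SMP split; this is not the follow-up patch. The
uniprocessor fallback assumes that only the current task can be executing.

```c
#include <assert.h>
#include <stdbool.h>

/* Define MODEL_SMP to mimic CONFIG_SMP=y; leave it undefined for a UP build. */
struct up_task {
	int on_rq;
#ifdef MODEL_SMP
	int on_cpu;	/* field exists only in the SMP variant of the model */
#endif
};

/* Hypothetical helper: is the task still executing on a CPU? */
static bool up_task_on_cpu(const struct up_task *p, const struct up_task *curr)
{
#ifdef MODEL_SMP
	(void)curr;
	return p->on_cpu;
#else
	/* Assumption: with a single CPU, only the current task can be running. */
	return p == curr;
#endif
}

void up_self_check(void)
{
	struct up_task running = { .on_rq = 0 };
	struct up_task sleeping = { .on_rq = 0 };

#ifdef MODEL_SMP
	running.on_cpu = 1;
#endif
	assert(up_task_on_cpu(&running, &running));
	assert(!up_task_on_cpu(&sleeping, &running));
}
```

Hiding the field access behind a helper lets the caller compile in both
configurations, since the caller itself never names on_cpu.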

  

Patch

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0d18c3969f90..2071d1c0eaca 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2041,7 +2041,8 @@  unsigned long get_wchan(struct task_struct *p)
 	raw_spin_lock_irq(&p->pi_lock);
 	state = READ_ONCE(p->__state);
 	smp_rmb(); /* see try_to_wake_up() */
-	if (state != TASK_RUNNING && state != TASK_WAKING && !p->on_rq)
+	if (state != TASK_RUNNING && state != TASK_WAKING &&
+	    !p->on_rq && !p->on_cpu)
 		ip = __get_wchan(p);
 	raw_spin_unlock_irq(&p->pi_lock);