[1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed

Message ID 20230804124549.2562977-2-zhengyejian1@huawei.com
State New
Headers
Series tracing: Fix cpu buffers unavailable problem and add its testcase |

Commit Message

Zheng Yejian Aug. 4, 2023, 12:45 p.m. UTC
  Trace ring buffer can no longer record anything after executing like:
  # cd /sys/kernel/tracing
  # cat tracing_cpumask
  fff
  # echo 0 > tracing_cpumask
  # echo 1 > snapshot
  # echo fff > tracing_cpumask
  # echo "hello world" > trace_marker
  -bash: echo: write error: Bad file descriptor

The root cause is that:
  1. After `echo 0 > tracing_cpumask`, 'record_disabled' of cpu buffers
     in 'tr->array_buffer.buffer' became 1 (see tracing_set_cpumask());
  2. After `echo 1 > snapshot`, 'tr->array_buffer.buffer' is swapped
     with 'tr->max_buffer.buffer', then the 'record_disabled' became 0
     (see update_max_tr());
  3. After `echo fff > tracing_cpumask`, the 'record_disabled' become -1;
Then array_buffer and max_buffer are both unavailable due to value of
'record_disabled' is not 0.

To fix it, enable or disable both array_buffer and max_buffer at the same
time in tracing_set_cpumask().

Fixes: 71babb2705e2 ("tracing: change CPU ring buffer state from tracing_cpumask")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
---
 kernel/trace/trace.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

kernel test robot Aug. 4, 2023, 4:41 p.m. UTC | #1
Hi Zheng,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-buffers-unavailable-due-to-record_disabled-messed/20230804-204751
base:   linus/master
patch link:    https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com
patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
config: i386-randconfig-r025-20230731 (https://download.01.org/0day-ci/archive/20230805/202308050048.bUnVeBjV-lkp@intel.com/config)
compiler: clang version 16.0.4 (https://github.com/llvm/llvm-project.git ae42196bc493ffe877a7e3dff8be32035dea4d07)
reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050048.bUnVeBjV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308050048.bUnVeBjV-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/trace/trace.c:5280:39: error: no member named 'max_buffer' in 'struct trace_array'
                           ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
                                                          ~~  ^
   kernel/trace/trace.c:5286:38: error: no member named 'max_buffer' in 'struct trace_array'
                           ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
                                                         ~~  ^
   2 errors generated.


vim +5280 kernel/trace/trace.c

  5260	
  5261	int tracing_set_cpumask(struct trace_array *tr,
  5262				cpumask_var_t tracing_cpumask_new)
  5263	{
  5264		int cpu;
  5265	
  5266		if (!tr)
  5267			return -EINVAL;
  5268	
  5269		local_irq_disable();
  5270		arch_spin_lock(&tr->max_lock);
  5271		for_each_tracing_cpu(cpu) {
  5272			/*
  5273			 * Increase/decrease the disabled counter if we are
  5274			 * about to flip a bit in the cpumask:
  5275			 */
  5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
> 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
  5281			}
  5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
  5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
  5287			}
  5288		}
  5289		arch_spin_unlock(&tr->max_lock);
  5290		local_irq_enable();
  5291	
  5292		cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
  5293	
  5294		return 0;
  5295	}
  5296
  
Steven Rostedt Aug. 4, 2023, 4:51 p.m. UTC | #2
On Sat, 5 Aug 2023 00:41:13 +0800
kernel test robot <lkp@intel.com> wrote:

>   5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>   5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>   5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>   5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
> > 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);  

The access to max_buffer requires a:

#ifdef CONFIG_TRACER_MAX_TRACE

Around them.

-- Steve

>   5281			}
>   5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>   5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>   5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>   5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
>   5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
>   5287			}
>   5288		}
>   5289		arch_spin_unlock(&tr->max_lock);
  
kernel test robot Aug. 4, 2023, 11:01 p.m. UTC | #3
Hi Zheng,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-buffers-unavailable-due-to-record_disabled-messed/20230804-204751
base:   linus/master
patch link:    https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com
patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
config: i386-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050601.nsx7Z4CY-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050601.nsx7Z4CY-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308050601.nsx7Z4CY-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/trace/trace.c: In function 'tracing_set_cpumask':
>> kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
    5280 |                         ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
         |                                                            ^~~~~~~~~~
         |                                                            array_buffer
   kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
    5286 |                         ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
         |                                                           ^~~~~~~~~~
         |                                                           array_buffer


vim +5280 kernel/trace/trace.c

  5260	
  5261	int tracing_set_cpumask(struct trace_array *tr,
  5262				cpumask_var_t tracing_cpumask_new)
  5263	{
  5264		int cpu;
  5265	
  5266		if (!tr)
  5267			return -EINVAL;
  5268	
  5269		local_irq_disable();
  5270		arch_spin_lock(&tr->max_lock);
  5271		for_each_tracing_cpu(cpu) {
  5272			/*
  5273			 * Increase/decrease the disabled counter if we are
  5274			 * about to flip a bit in the cpumask:
  5275			 */
  5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
> 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
  5281			}
  5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
  5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
  5287			}
  5288		}
  5289		arch_spin_unlock(&tr->max_lock);
  5290		local_irq_enable();
  5291	
  5292		cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
  5293	
  5294		return 0;
  5295	}
  5296
  
kernel test robot Aug. 5, 2023, 1:15 a.m. UTC | #4
Hi Zheng,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-buffers-unavailable-due-to-record_disabled-messed/20230804-204751
base:   linus/master
patch link:    https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com
patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202308050731.PQutr3r0-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/trace/trace.c: In function 'tracing_set_cpumask':
>> kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
    5280 |                         ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
         |                                                            ^~~~~~~~~~
         |                                                            array_buffer
   kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
    5286 |                         ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
         |                                                           ^~~~~~~~~~
         |                                                           array_buffer


vim +5280 kernel/trace/trace.c

  5260	
  5261	int tracing_set_cpumask(struct trace_array *tr,
  5262				cpumask_var_t tracing_cpumask_new)
  5263	{
  5264		int cpu;
  5265	
  5266		if (!tr)
  5267			return -EINVAL;
  5268	
  5269		local_irq_disable();
  5270		arch_spin_lock(&tr->max_lock);
  5271		for_each_tracing_cpu(cpu) {
  5272			/*
  5273			 * Increase/decrease the disabled counter if we are
  5274			 * about to flip a bit in the cpumask:
  5275			 */
  5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
> 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
  5281			}
  5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
  5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
  5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
  5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
  5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
  5287			}
  5288		}
  5289		arch_spin_unlock(&tr->max_lock);
  5290		local_irq_enable();
  5291	
  5292		cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
  5293	
  5294		return 0;
  5295	}
  5296
  
Zheng Yejian Aug. 5, 2023, 1:56 a.m. UTC | #5
On 2023/8/5 09:15, kernel test robot wrote:
> Hi Zheng,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on rostedt-trace/for-next v6.5-rc4 next-20230804]
> [cannot apply to rostedt-trace/for-next-urgent]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Zheng-Yejian/tracing-Fix-cpu-buffers-unavailable-due-to-record_disabled-messed/20230804-204751
> base:   linus/master
> patch link:    https://lore.kernel.org/r/20230804124549.2562977-2-zhengyejian1%40huawei.com
> patch subject: [PATCH 1/2] tracing: Fix cpu buffers unavailable due to 'record_disabled' messed
> config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@intel.com/config)
> compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
> reproduce: (https://download.01.org/0day-ci/archive/20230805/202308050731.PQutr3r0-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202308050731.PQutr3r0-lkp@intel.com/
> 
> All errors (new ones prefixed by >>):
> 
>     kernel/trace/trace.c: In function 'tracing_set_cpumask':
>>> kernel/trace/trace.c:5280:60: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
>      5280 |                         ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
>           |                                                            ^~~~~~~~~~
>           |                                                            array_buffer
>     kernel/trace/trace.c:5286:59: error: 'struct trace_array' has no member named 'max_buffer'; did you mean 'array_buffer'?
>      5286 |                         ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
>           |                                                           ^~~~~~~~~~
>           |                                                           array_buffer
> 

Thank you, robot!
I'll fix it in v2 soon.

> 
> vim +5280 kernel/trace/trace.c
> 
>    5260	
>    5261	int tracing_set_cpumask(struct trace_array *tr,
>    5262				cpumask_var_t tracing_cpumask_new)
>    5263	{
>    5264		int cpu;
>    5265	
>    5266		if (!tr)
>    5267			return -EINVAL;
>    5268	
>    5269		local_irq_disable();
>    5270		arch_spin_lock(&tr->max_lock);
>    5271		for_each_tracing_cpu(cpu) {
>    5272			/*
>    5273			 * Increase/decrease the disabled counter if we are
>    5274			 * about to flip a bit in the cpumask:
>    5275			 */
>    5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>    5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>    5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>    5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
>> 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
>    5281			}
>    5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>    5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>    5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>    5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
>    5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
>    5287			}
>    5288		}
>    5289		arch_spin_unlock(&tr->max_lock);
>    5290		local_irq_enable();
>    5291	
>    5292		cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
>    5293	
>    5294		return 0;
>    5295	}
>    5296	
>
  
Zheng Yejian Aug. 5, 2023, 1:58 a.m. UTC | #6
On 2023/8/5 00:51, Steven Rostedt wrote:
> On Sat, 5 Aug 2023 00:41:13 +0800
> kernel test robot <lkp@intel.com> wrote:
> 
>>    5276			if (cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>>    5277					!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>>    5278				atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>>    5279				ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
>>> 5280				ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
> 
> The access to max_buffer requires a:
> 
> #ifdef CONFIG_TRACER_MAX_TRACE
> 
> Around them.
> 
> -- Steve

Thanks, Steve, I'll add it in v2.

-- Zheng Yejian

> 
>>    5281			}
>>    5282			if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
>>    5283					cpumask_test_cpu(cpu, tracing_cpumask_new)) {
>>    5284				atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
>>    5285				ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
>>    5286				ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
>>    5287			}
>>    5288		}
>>    5289		arch_spin_unlock(&tr->max_lock);
>
  

Patch

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index b8870078ef58..691ec4d4cd19 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5277,11 +5277,13 @@  int tracing_set_cpumask(struct trace_array *tr,
 				!cpumask_test_cpu(cpu, tracing_cpumask_new)) {
 			atomic_inc(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
 			ring_buffer_record_disable_cpu(tr->array_buffer.buffer, cpu);
+			ring_buffer_record_disable_cpu(tr->max_buffer.buffer, cpu);
 		}
 		if (!cpumask_test_cpu(cpu, tr->tracing_cpumask) &&
 				cpumask_test_cpu(cpu, tracing_cpumask_new)) {
 			atomic_dec(&per_cpu_ptr(tr->array_buffer.data, cpu)->disabled);
 			ring_buffer_record_enable_cpu(tr->array_buffer.buffer, cpu);
+			ring_buffer_record_enable_cpu(tr->max_buffer.buffer, cpu);
 		}
 	}
 	arch_spin_unlock(&tr->max_lock);