[v2] perf record: Fix coredump with --overwrite and --max-size

Message ID: 20221229124728.66515-1-yangjihong1@huawei.com
State: New
Series: [v2] perf record: Fix coredump with --overwrite and --max-size

Commit Message

Yang Jihong Dec. 29, 2022, 12:47 p.m. UTC
When the --overwrite and --max-size options of perf record are used together,
a segmentation fault occurs. For example:

 # perf record -e sched:sched* --overwrite --max-size 1M -a -- sleep 1
  [ perf record: Woken up 1 times to write data ]
  perf: Segmentation fault
  Obtained 1 stack frames.
  [0xc4c67f]
  Segmentation fault (core dumped)

The backtrace of the core file is as follows:

  #0  0x0000000000417990 in process_locked_synthesized_event (tool=0x0, event=0x15, sample=0x1de0, machine=0xf8) at builtin-record.c:630
  #1  0x000000000057ee53 in perf_event__synthesize_threads (nr_threads_synthesize=21, mmap_data=<optimized out>, needs_mmap=<optimized out>, machine=0x17ad9b0, process=<optimized out>, tool=0x0) at util/synthetic-events.c:1950
  #2  __machine__synthesize_threads (nr_threads_synthesize=0, data_mmap=<optimized out>, needs_mmap=<optimized out>, process=<optimized out>, threads=0x8, target=0x8, tool=0x0, machine=0x17ad9b0) at util/synthetic-events.c:1936
  #3  machine__synthesize_threads (machine=0x17ad9b0, target=0x8, threads=0x8, needs_mmap=<optimized out>, data_mmap=<optimized out>, nr_threads_synthesize=0) at util/synthetic-events.c:1947
  #4  0x000000000040165d in record__synthesize (tail=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2010
  #5  0x0000000000403989 in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2810
  #6  0x00000000004196ba in record__init_thread_user_masks (rec=0xbe2520 <record>, cpus=0x17a65f0) at builtin-record.c:3837
  #7  record__init_thread_masks (rec=0xbe2520 <record>) at builtin-record.c:3938
  #8  cmd_record (argc=1, argv=0x7ffdd692dc60) at builtin-record.c:4241
  #9  0x00000000004b701d in pager_command_config (var=0x0, value=0x15 <error: Cannot access memory at address 0x15>, data=0x1de0) at perf.c:117
  #10 0x00000000004b732b in get_leaf_frame_caller_aarch64 (sample=0xfffffffb, thread=0x0, usr_idx=<optimized out>) at util/arm64-frame-pointer-unwind-support.c:56
  #11 0x0000000000406331 in execv_dashed_external (argv=0x7ffdd692d9e8) at perf.c:410
  #12 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:431
  #13 main (argc=<optimized out>, argv=0x7ffdd692d9e8) at perf.c:562

The cause is that record__bytes_written accesses rec->thread_data after it has
been freed. The call sequence is as follows:
  __cmd_record
    -> record__free_thread_data
      -> zfree(&rec->thread_data)         // free rec->thread_data
    -> record__synthesize
      -> perf_event__synthesize_id_index
        -> process_synthesized_event
          -> record__write
            -> record__bytes_written     // access rec->thread_data

To fix this, check the value of done before computing the bytes written.
Also add a NULL check on thread_data in record__bytes_written for code hardening,
and save bytes_written in a local variable to avoid computing it twice.
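
The failure mode described above can be modelled in isolation. The following is a minimal sketch, not the perf source: the structures are cut down to the fields involved, free_thread_data and bytes_written_total are illustrative stand-ins for record__free_thread_data and record__bytes_written, and it assumes that nr_threads stays nonzero after the per-thread array is released and the pointer is set to NULL (as zfree does).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for perf's structures; only the fields used here. */
struct record_thread {
	uint64_t bytes_written;
};

struct record {
	uint64_t bytes_written;            /* bytes written by the main thread */
	int nr_threads;                    /* assumed to stay set after the free */
	struct record_thread *thread_data; /* NULL once freed */
};

/* Models the free step: release the array and NULL the pointer, like zfree(). */
void free_thread_data(struct record *rec)
{
	assert(rec != NULL);
	free(rec->thread_data);
	rec->thread_data = NULL;
}

/* Models the byte total with the NULL check added for hardening. */
uint64_t bytes_written_total(const struct record *rec)
{
	uint64_t total = rec->bytes_written;
	int t;

	if (rec->thread_data == NULL)
		return total; /* without this check the loop dereferences NULL */

	for (t = 0; t < rec->nr_threads; t++)
		total += rec->thread_data[t].bytes_written;

	return total;
}
```

With a main-thread total of 100 and two threads that wrote 10 and 20 bytes, bytes_written_total returns 130 before free_thread_data is called and 100 afterwards, instead of reading through a NULL pointer.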

Fixes: 6d57581659f7 ("perf record: Add support for limit perf output file size")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---

Changes since v1:
 - Add variable check in record__bytes_written for code hardening.
 - Save bytes_written separately to reduce one calculation.
 - Remove rec->opts.tail_synthesize check.

 tools/perf/builtin-record.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)
  

Comments

Arnaldo Carvalho de Melo Jan. 2, 2023, 4:20 p.m. UTC | #1
On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:

Namhyung, are you ok with this now?

- Arnaldo
 
>  tools/perf/builtin-record.c | 26 +++++++++++++++++---------
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index 29dcd454b8e2..acba9e43e519 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>  	u64 bytes_written = rec->bytes_written;
>  	struct record_thread *thread_data = rec->thread_data;
>  
> +	if (thread_data == NULL)
> +		return bytes_written;
> +
>  	for (t = 0; t < rec->nr_threads; t++)
>  		bytes_written += thread_data[t].bytes_written;
>  
>  	return bytes_written;
>  }
>  
> -static bool record__output_max_size_exceeded(struct record *rec)
> +static void record__check_output_max_size_exceeded(struct record *rec)
>  {
> -	return rec->output_max_size &&
> -	       (record__bytes_written(rec) >= rec->output_max_size);
> +	u64 bytes_written;
> +
> +	if (rec->output_max_size == 0 || done)
> +		return;
> +
> +	bytes_written = record__bytes_written(rec);
> +	if (bytes_written >= rec->output_max_size) {
> +		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
> +			" stopping session ]\n", bytes_written >> 10);
> +
> +		done = 1;
> +	}
>  }
>  
>  static int record__write(struct record *rec, struct mmap *map __maybe_unused,
> @@ -260,12 +273,7 @@ static int record__write(struct record *rec, struct mmap *map __maybe_unused,
>  	else
>  		rec->bytes_written += size;
>  
> -	if (record__output_max_size_exceeded(rec) && !done) {
> -		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
> -				" stopping session ]\n",
> -				record__bytes_written(rec) >> 10);
> -		done = 1;
> -	}
> +	record__check_output_max_size_exceeded(rec);
>  
>  	if (switch_output_size(rec))
>  		trigger_hit(&switch_output_trigger);
> -- 
> 2.30.GIT
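
The effect of the reworked check in the diff above can be sketched outside perf. This is an illustrative model only: struct size_limit, total_bytes, check_output_max_size and the nr_totals counter are hypothetical names introduced here, and done stands in for perf's global flag. It shows that once done is set, the byte total is no longer computed, which is what keeps later writes away from the per-thread data.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for perf's global "done" flag. */
volatile int done;

struct size_limit {
	uint64_t output_max_size; /* 0 means no limit */
	uint64_t bytes_written;
	int nr_totals;            /* how often the total was computed (for illustration) */
};

/* Stand-in for the byte total; counts how often it is called. */
static uint64_t total_bytes(struct size_limit *s)
{
	s->nr_totals++;
	return s->bytes_written;
}

/* Return early when there is no limit or the session is already stopping. */
void check_output_max_size(struct size_limit *s)
{
	uint64_t written;

	assert(s != NULL);
	if (s->output_max_size == 0 || done)
		return;

	written = total_bytes(s); /* computed once, reused in the message */
	if (written >= s->output_max_size) {
		fprintf(stderr, "[ size limit reached (%" PRIu64 " KB), stopping ]\n",
			written >> 10);
		done = 1;
	}
}
```

Calling check_output_max_size again after the limit has been hit returns immediately, so the message is printed only once and total_bytes is not invoked a further time.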
  
Namhyung Kim Jan. 3, 2023, 4:50 p.m. UTC | #2
On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>
> Namhyung, are you ok with this now?
>
> - Arnaldo
>
> >  tools/perf/builtin-record.c | 26 +++++++++++++++++---------
> >  1 file changed, 17 insertions(+), 9 deletions(-)
> >
> > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> > index 29dcd454b8e2..acba9e43e519 100644
> > --- a/tools/perf/builtin-record.c
> > +++ b/tools/perf/builtin-record.c
> > @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> >       u64 bytes_written = rec->bytes_written;
> >       struct record_thread *thread_data = rec->thread_data;
> >
> > +     if (thread_data == NULL)
> > +             return bytes_written;
> > +

Then it won't count bytes written by threads, right?
I think it needs to be saved somewhere.

Thanks,
Namhyung


  
Yang Jihong Jan. 5, 2023, 4:09 a.m. UTC | #3
Hello,

On 2023/1/4 0:50, Namhyung Kim wrote:
> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>
>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>>
>> Namhyung, are you ok with this now?
>>
>> - Arnaldo
>>
>>>   tools/perf/builtin-record.c | 26 +++++++++++++++++---------
>>>   1 file changed, 17 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>> index 29dcd454b8e2..acba9e43e519 100644
>>> --- a/tools/perf/builtin-record.c
>>> +++ b/tools/perf/builtin-record.c
>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>>        u64 bytes_written = rec->bytes_written;
>>>        struct record_thread *thread_data = rec->thread_data;
>>>
>>> +     if (thread_data == NULL)
>>> +             return bytes_written;
>>> +
> 
> Then it won't count bytes written by threads, right?
> I think it needs to be saved somewhere.
> 
I'm not sure here. Can you explain it more clearly, thanks :)
I can modify it accordingly.

I think if thread_data == NULL, it is not thread data.
In this case, we just return rec->bytes_written.

Thanks,
Yang
  
Namhyung Kim Jan. 6, 2023, 9:12 p.m. UTC | #4
Hello,

On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> Hello,
>
> On 2023/1/4 0:50, Namhyung Kim wrote:
> > On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >>
> >> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> >>
> >> Namhyung, are you ok with this now?
> >>
> >> - Arnaldo
> >>
> >>>   tools/perf/builtin-record.c | 26 +++++++++++++++++---------
> >>>   1 file changed, 17 insertions(+), 9 deletions(-)
> >>>
> >>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> >>> index 29dcd454b8e2..acba9e43e519 100644
> >>> --- a/tools/perf/builtin-record.c
> >>> +++ b/tools/perf/builtin-record.c
> >>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> >>>        u64 bytes_written = rec->bytes_written;
> >>>        struct record_thread *thread_data = rec->thread_data;
> >>>
> >>> +     if (thread_data == NULL)
> >>> +             return bytes_written;
> >>> +
> >
> > Then it won't count bytes written by threads, right?
> > I think it needs to be saved somewhere.
> >
> I'm not sure here. Can you explain it more clearly, thanks :)
> I can modify it accordingly.
>
> I think if thread_data == NULL, it is not thread data.
> In this case, we just return rec->bytes_written.

It can be thread data but freed before tail synthesis, right?
In that case, I think it needs to add bytes_written by threads
to calculate the correct data size.

Thanks,
Namhyung
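
One way to address the concern raised here is to fold the per-thread counts into a saved total before the per-thread data is released. The following is a sketch under stated assumptions, not the posted patch: the field saved_thread_bytes and the helpers release_thread_data and session_bytes_written are hypothetical names, and the structures are reduced to the fields involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct thread_stats {
	uint64_t bytes_written;
};

struct session {
	uint64_t bytes_written;      /* bytes written by the main thread */
	uint64_t saved_thread_bytes; /* hypothetical: thread totals kept after the free */
	int nr_threads;
	struct thread_stats *thread_data;
};

/* Add the per-thread counts to the saved total, then release the array. */
void release_thread_data(struct session *s)
{
	int t;

	assert(s != NULL);
	if (s->thread_data == NULL)
		return; /* already released; the totals are already saved */

	for (t = 0; t < s->nr_threads; t++)
		s->saved_thread_bytes += s->thread_data[t].bytes_written;

	free(s->thread_data);
	s->thread_data = NULL;
}

/* Total size: main thread, saved thread totals, and any live thread data. */
uint64_t session_bytes_written(const struct session *s)
{
	uint64_t total = s->bytes_written + s->saved_thread_bytes;
	int t;

	if (s->thread_data != NULL)
		for (t = 0; t < s->nr_threads; t++)
			total += s->thread_data[t].bytes_written;

	return total;
}
```

Under this scheme the reported size is the same before and after the release, so a size check made during tail synthesis would still include what the threads wrote.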
  
Yang Jihong Jan. 9, 2023, 2:46 a.m. UTC | #5
Hello,

On 2023/1/7 5:12, Namhyung Kim wrote:
> Hello,
> 
> On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> Hello,
>>
>> On 2023/1/4 0:50, Namhyung Kim wrote:
>>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>>
>>>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>>>>
>>>> Namhyung, are you ok with this now?
>>>>
>>>> - Arnaldo
>>>>
>>>>>    tools/perf/builtin-record.c | 26 +++++++++++++++++---------
>>>>>    1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>
>>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>>>> index 29dcd454b8e2..acba9e43e519 100644
>>>>> --- a/tools/perf/builtin-record.c
>>>>> +++ b/tools/perf/builtin-record.c
>>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>>>>         u64 bytes_written = rec->bytes_written;
>>>>>         struct record_thread *thread_data = rec->thread_data;
>>>>>
>>>>> +     if (thread_data == NULL)
>>>>> +             return bytes_written;
>>>>> +
>>>
>>> Then it won't count bytes written by threads, right?
>>> I think it needs to be saved somewhere.
>>>
>> I'm not sure here. Can you explain it more clearly, thanks :)
>> I can modify it accordingly.
>>
>> I think if thread_data == NULL, it is not thread data.
>> In this case, we just return rec->bytes_written.
> 
> It can be thread data but freed before tail synthesis, right?
> In that case, I think it needs to add bytes_written by threads
> to calculate the correct data size.
Hmm... In the __cmd_record function, record__stop_threads is called
before record__free_thread_data, so by the time the thread data has been
freed, the threads have already stopped and there is no thread data left.
I think it's okay to ignore the situation you mentioned above.

Thanks,
Yang
  
Namhyung Kim Jan. 10, 2023, 7:21 p.m. UTC | #6
On Sun, Jan 8, 2023 at 6:47 PM Yang Jihong <yangjihong1@huawei.com> wrote:
>
> Hello,
>
> On 2023/1/7 5:12, Namhyung Kim wrote:
> > Hello,
> >
> > On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <yangjihong1@huawei.com> wrote:
> >>
> >> Hello,
> >>
> >> On 2023/1/4 0:50, Namhyung Kim wrote:
> >>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> >>>>
> >>>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
> >>>>> When --overwrite and --max-size options of perf record are used together,
> >>>>> a segmentation fault occurs. The following is an example:
> >>>>>
> >>>>>    # perf record -e sched:sched* --overwrite --max-size 1M -a -- sleep 1
> >>>>>     [ perf record: Woken up 1 times to write data ]
> >>>>>     perf: Segmentation fault
> >>>>>     Obtained 1 stack frames.
> >>>>>     [0xc4c67f]
> >>>>>     Segmentation fault (core dumped)
> >>>>>
> >>>>> backtrace of the core file is as follows:
> >>>>>
> >>>>>     #0  0x0000000000417990 in process_locked_synthesized_event (tool=0x0, event=0x15, sample=0x1de0, machine=0xf8) at builtin-record.c:630
> >>>>>     #1  0x000000000057ee53 in perf_event__synthesize_threads (nr_threads_synthesize=21, mmap_data=<optimized out>, needs_mmap=<optimized out>, machine=0x17ad9b0, process=<optimized out>, tool=0x0) at util/synthetic-events.c:1950
> >>>>>     #2  __machine__synthesize_threads (nr_threads_synthesize=0, data_mmap=<optimized out>, needs_mmap=<optimized out>, process=<optimized out>, threads=0x8, target=0x8, tool=0x0, machine=0x17ad9b0) at util/synthetic-events.c:1936
> >>>>>     #3  machine__synthesize_threads (machine=0x17ad9b0, target=0x8, threads=0x8, needs_mmap=<optimized out>, data_mmap=<optimized out>, nr_threads_synthesize=0) at util/synthetic-events.c:1947
> >>>>>     #4  0x000000000040165d in record__synthesize (tail=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2010
> >>>>>     #5  0x0000000000403989 in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xbe2520 <record>) at builtin-record.c:2810
> >>>>>     #6  0x00000000004196ba in record__init_thread_user_masks (rec=0xbe2520 <record>, cpus=0x17a65f0) at builtin-record.c:3837
> >>>>>     #7  record__init_thread_masks (rec=0xbe2520 <record>) at builtin-record.c:3938
> >>>>>     #8  cmd_record (argc=1, argv=0x7ffdd692dc60) at builtin-record.c:4241
> >>>>>     #9  0x00000000004b701d in pager_command_config (var=0x0, value=0x15 <error: Cannot access memory at address 0x15>, data=0x1de0) at perf.c:117
> >>>>>     #10 0x00000000004b732b in get_leaf_frame_caller_aarch64 (sample=0xfffffffb, thread=0x0, usr_idx=<optimized out>) at util/arm64-frame-pointer-unwind-support.c:56
> >>>>>     #11 0x0000000000406331 in execv_dashed_external (argv=0x7ffdd692d9e8) at perf.c:410
> >>>>>     #12 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:431
> >>>>>     #13 main (argc=<optimized out>, argv=0x7ffdd692d9e8) at perf.c:562
> >>>>>
> >>>>> The reason is that record__bytes_written accesses rec->thread_data after it
> >>>>> has been freed. The call sequence is as follows:
> >>>>>     __cmd_record
> >>>>>       -> record__free_thread_data
> >>>>>         -> zfree(&rec->thread_data)         // free rec->thread_data
> >>>>>       -> record__synthesize
> >>>>>         -> perf_event__synthesize_id_index
> >>>>>           -> process_synthesized_event
> >>>>>             -> record__write
> >>>>>               -> record__bytes_written     // access rec->thread_data
> >>>>>
> >>>>> To fix this, we only need to check the value of done first, before calling
> >>>>> record__bytes_written. Also add a NULL check on thread_data in
> >>>>> record__bytes_written for code hardening, and save bytes_written in a
> >>>>> local variable to avoid computing it twice.
> >>>>>
> >>>>> Fixes: 6d57581659f7 ("perf record: Add support for limit perf output file size")
> >>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
> >>>>> ---
> >>>>>
> >>>>> Changes since v1:
> >>>>>    - Add variable check in record__bytes_written for code hardening.
> >>>>>    - Save bytes_written separately to reduce one calculation.
> >>>>>    - Remove rec->opts.tail_synthesize check.
> >>>>
> >>>> Namhyung, are you ok with this now?
> >>>>
> >>>> - Arnaldo
> >>>>
> >>>>>    tools/perf/builtin-record.c | 26 +++++++++++++++++---------
> >>>>>    1 file changed, 17 insertions(+), 9 deletions(-)
> >>>>>
> >>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> >>>>> index 29dcd454b8e2..acba9e43e519 100644
> >>>>> --- a/tools/perf/builtin-record.c
> >>>>> +++ b/tools/perf/builtin-record.c
> >>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
> >>>>>         u64 bytes_written = rec->bytes_written;
> >>>>>         struct record_thread *thread_data = rec->thread_data;
> >>>>>
> >>>>> +     if (thread_data == NULL)
> >>>>> +             return bytes_written;
> >>>>> +
> >>>
> >>> Then it won't count bytes written by threads, right?
> >>> I think it needs to be saved somewhere.
> >>>
> >> I'm not sure here. Can you explain it more clearly, thanks :)
> >> I can modify it accordingly.
> >>
> >> I think if thread_data == NULL, it is not thread data.
> >> In this case, we just return rec->bytes_written.
> >
> > It can be thread data but freed before tail synthesis, right?
> > In that case, I think it needs to add bytes_written by threads
> > to calculate the correct data size.
> Em... In the __cmd_record function, record__stop_threads is called
> before record__free_thread_data, so if the thread has been freed, there
> will be no thread data.
> I think it's okay to ignore the situation you mentioned above.

Right, the thread data is already freed, but we need the size.

I think the data written by the threads (the data.X files) was never
(and still won't be) added to rec->bytes_written, because that field
only covers the main 'data' file.  So record__bytes_written() will
return a smaller number after the threads are gone.  But I think it
should return the total data size.

Thanks,
Namhyung
  
Yang Jihong Jan. 13, 2023, 6:53 a.m. UTC | #7
Hello,

On 2023/1/11 3:21, Namhyung Kim wrote:
> On Sun, Jan 8, 2023 at 6:47 PM Yang Jihong <yangjihong1@huawei.com> wrote:
>>
>> Hello,
>>
>> On 2023/1/7 5:12, Namhyung Kim wrote:
>>> Hello,
>>>
>>> On Wed, Jan 4, 2023 at 8:09 PM Yang Jihong <yangjihong1@huawei.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> On 2023/1/4 0:50, Namhyung Kim wrote:
>>>>> On Mon, Jan 2, 2023 at 8:20 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>>>>>>
>>>>>> On Thu, Dec 29, 2022 at 12:47:28PM +0000, Yang Jihong wrote:
>>>>>>
>>>>>> Namhyung, are you ok with this now?
>>>>>>
>>>>>> - Arnaldo
>>>>>>
>>>>>>>     tools/perf/builtin-record.c | 26 +++++++++++++++++---------
>>>>>>>     1 file changed, 17 insertions(+), 9 deletions(-)
>>>>>>>
>>>>>>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>>>>>>> index 29dcd454b8e2..acba9e43e519 100644
>>>>>>> --- a/tools/perf/builtin-record.c
>>>>>>> +++ b/tools/perf/builtin-record.c
>>>>>>> @@ -230,16 +230,29 @@ static u64 record__bytes_written(struct record *rec)
>>>>>>>          u64 bytes_written = rec->bytes_written;
>>>>>>>          struct record_thread *thread_data = rec->thread_data;
>>>>>>>
>>>>>>> +     if (thread_data == NULL)
>>>>>>> +             return bytes_written;
>>>>>>> +
>>>>>
>>>>> Then it won't count bytes written by threads, right?
>>>>> I think it needs to be saved somewhere.
>>>>>
>>>> I'm not sure here. Can you explain it more clearly, thanks :)
>>>> I can modify it accordingly.
>>>>
>>>> I think if thread_data == NULL, it is not thread data.
>>>> In this case, we just return rec->bytes_written.
>>>
>>> It can be thread data but freed before tail synthesis, right?
>>> In that case, I think it needs to add bytes_written by threads
>>> to calculate the correct data size.
>> Em... In the __cmd_record function, record__stop_threads is called
>> before record__free_thread_data, so if the thread has been freed, there
>> will be no thread data.
>> I think it's okay to ignore the situation you mentioned above.
> 
> Right, the thread data is already freed, but we need the size.
> 
> I think it didn't (and won't) update to rec->bytes_written for the data
> written by the threads (data.X file) because it's only for the main
> 'data' file.  So record__bytes_written() will return a smaller number
> after the threads are gone.  But I think it should return the total
> data size.
> 
Yes, to match the intended semantics, the total data size, including the
data.X files, should be returned here, so this is a problem too. I will
fix it in the next version.

Thanks,
Yang
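
One possible way to keep the total available after the thread data is freed is to maintain a running sum alongside the per-thread counters. The following is only a sketch under stated assumptions, not the change that was posted in the next version: the structures are simplified stand-ins for perf's `struct record` and `struct record_thread`, and the field `thread_bytes_written` and the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-ins for perf's structures; not the real definitions. */
struct record_thread {
	uint64_t bytes_written;            /* bytes written to this thread's data.X file */
};

struct record {
	uint64_t bytes_written;            /* bytes written to the main 'data' file */
	uint64_t thread_bytes_written;     /* hypothetical: running sum over all data.X files */
	int nr_threads;
	struct record_thread *thread_data;
};

/* Account for a write to a per-thread data.X file. */
static void record_account_thread_write(struct record *rec, int t, uint64_t size)
{
	rec->thread_data[t].bytes_written += size;
	rec->thread_bytes_written += size; /* survives freeing thread_data */
}

/* Account for a write to the main 'data' file. */
static void record_account_main_write(struct record *rec, uint64_t size)
{
	rec->bytes_written += size;
}

/* Mirrors the effect of zfree(&rec->thread_data): free and reset to NULL. */
static void record_free_thread_data(struct record *rec)
{
	free(rec->thread_data);
	rec->thread_data = NULL;
}

/* Total size: never dereferences thread_data, so it is safe after the free. */
static uint64_t record_bytes_written(const struct record *rec)
{
	return rec->bytes_written + rec->thread_bytes_written;
}
```

With this layout the total no longer depends on the lifetime of thread_data, so a call made during tail synthesis would report the same size as one made while the threads were still running.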
  

Patch

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 29dcd454b8e2..acba9e43e519 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -230,16 +230,29 @@  static u64 record__bytes_written(struct record *rec)
 	u64 bytes_written = rec->bytes_written;
 	struct record_thread *thread_data = rec->thread_data;
 
+	if (thread_data == NULL)
+		return bytes_written;
+
 	for (t = 0; t < rec->nr_threads; t++)
 		bytes_written += thread_data[t].bytes_written;
 
 	return bytes_written;
 }
 
-static bool record__output_max_size_exceeded(struct record *rec)
+static void record__check_output_max_size_exceeded(struct record *rec)
 {
-	return rec->output_max_size &&
-	       (record__bytes_written(rec) >= rec->output_max_size);
+	u64 bytes_written;
+
+	if (rec->output_max_size == 0 || done)
+		return;
+
+	bytes_written = record__bytes_written(rec);
+	if (bytes_written >= rec->output_max_size) {
+		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
+			" stopping session ]\n", bytes_written >> 10);
+
+		done = 1;
+	}
 }
 
 static int record__write(struct record *rec, struct mmap *map __maybe_unused,
@@ -260,12 +273,7 @@  static int record__write(struct record *rec, struct mmap *map __maybe_unused,
 	else
 		rec->bytes_written += size;
 
-	if (record__output_max_size_exceeded(rec) && !done) {
-		fprintf(stderr, "[ perf record: perf size limit reached (%" PRIu64 " KB),"
-				" stopping session ]\n",
-				record__bytes_written(rec) >> 10);
-		done = 1;
-	}
+	record__check_output_max_size_exceeded(rec);
 
 	if (switch_output_size(rec))
 		trigger_hit(&switch_output_trigger);
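
The behaviour of the patched check can be modelled outside of perf. The sketch below is a simplified, standalone approximation of the v2 logic, assuming cut-down structures and renamed functions (none of these definitions are perf's own). It illustrates two points from the thread: once done is set, the check returns before touching thread_data, and the NULL guard avoids the crash but reports only the main-file size, which is the undercounting raised in review.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Cut-down stand-ins for perf's structures; not the real definitions. */
struct record_thread {
	uint64_t bytes_written;
};

struct record {
	uint64_t bytes_written;     /* main 'data' file only */
	uint64_t output_max_size;   /* 0 means no limit */
	int nr_threads;
	struct record_thread *thread_data;
};

/* Stand-in for perf's global 'done' flag. */
static volatile int done;

static uint64_t record_bytes_written(const struct record *rec)
{
	uint64_t total = rec->bytes_written;

	/* v2 hardening: thread_data may already have been freed and reset. */
	if (rec->thread_data == NULL)
		return total;

	for (int t = 0; t < rec->nr_threads; t++)
		total += rec->thread_data[t].bytes_written;

	return total;
}

static void record_check_output_max_size_exceeded(const struct record *rec)
{
	uint64_t total;

	/* Checking 'done' first means thread_data is not read after the stop. */
	if (rec->output_max_size == 0 || done)
		return;

	total = record_bytes_written(rec);
	if (total >= rec->output_max_size) {
		fprintf(stderr, "[ size limit reached (%" PRIu64 " KB), stopping ]\n",
			total >> 10);
		done = 1;
	}
}
```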