[PATCHSET,00/20] perf stat: Cleanup counter aggregation (v3)

Message ID 20221018020227.85905-1-namhyung@kernel.org
Headers
Series perf stat: Cleanup counter aggregation (v3) |

Message

Namhyung Kim Oct. 18, 2022, 2:02 a.m. UTC
  Hello,

Current perf stat code is somewhat hard to follow since it handles
many combinations of PMUs/events for given display and aggregation
options.  This is my attempt to clean it up a little. ;-)

changes in v3)
 * fix cgroup event display
 * fix perf test failures  (Arnaldo)
 * add comment on the global debug output  (Athira)

changes in v2)
 * fix a segfault in perf stat report for per-process record  (Jiri)
 * fix metric only display  (Jiri)
 * add evsel__reset_aggr_stat  (ian)
 * add more comments  (Ian)
 * add Acked-by from Ian


My first concern is that aggregation and display routines are intermixed
and processed differently depends on the aggregation mode.  I'd like to
separate them apart and make the logic clearer.

To do that, I added struct perf_stat_aggr to save the aggregated counter
values and other info.  It'll be allocated and processed according to
the aggr_mode and display logic will use it.

I've tested the following combination.

  $ cat test-matrix.sh
  #!/bin/sh

  set -e

  yes > /dev/null &
  TARGET=$!

  ./perf stat true
  ./perf stat -a true
  ./perf stat -C0 true
  ./perf stat -p $TARGET true
  ./perf stat -t $TARGET true

  ./perf stat -a -A true
  ./perf stat -a --per-node true
  ./perf stat -a --per-socket true
  ./perf stat -a --per-die true
  ./perf stat -a --per-core true
  ./perf stat -a --per-thread true

  ./perf stat -a -I 500 sleep 1
  ./perf stat -a -I 500 --summary sleep 1
  ./perf stat -a -I 500 --per-socket sleep 1
  ./perf stat -a -I 500 --summary --per-socket sleep 1

  ./perf stat -a --metric-only true
  ./perf stat -a --metric-only --per-socket true
  ./perf stat -a --metric-only -I 500 sleep 1
  ./perf stat -a --metric-only -I 500 --per-socket sleep 1

  ./perf stat record true && ./perf stat report
  ./perf stat record -p $TARGET true && ./perf stat report
  ./perf stat record -a true && ./perf stat report
  ./perf stat record -a --per-core true && ./perf stat report
  ./perf stat record -a --per-core --metric-only true && ./perf stat report
  ./perf stat record -a -I 500 sleep 1 && ./perf stat report
  ./perf stat record -a -I 500 --per-core sleep 1 && ./perf stat report
  ./perf stat record -a -I 500 --per-core --metric-only sleep 1 && ./perf stat report

  ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ true
  ./perf stat -a -A -e cpu/event=cpu-cycles,percore/ --percore-show-thread true

  kill $TARGET

The code is available at 'perf/stat-aggr-v3' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (20):
  perf tools: Save evsel->pmu in parse_events()
  perf tools: Use pmu info in evsel__is_hybrid()
  perf stat: Use evsel__is_hybrid() more
  perf stat: Add aggr id for global mode
  perf stat: Add cpu aggr id for no aggregation mode
  perf stat: Add 'needs_sort' argument to cpu_aggr_map__new()
  perf stat: Add struct perf_stat_aggr to perf_stat_evsel
  perf stat: Allocate evsel->stats->aggr properly
  perf stat: Aggregate events using evsel->stats->aggr
  perf stat: Factor out evsel__count_has_error()
  perf stat: Aggregate per-thread stats using evsel->stats->aggr
  perf stat: Allocate aggr counts for recorded data
  perf stat: Reset aggr counts for each interval
  perf stat: Split process_counters()
  perf stat: Add perf_stat_merge_counters()
  perf stat: Add perf_stat_process_percore()
  perf stat: Add perf_stat_process_shadow_stats()
  perf stat: Display event stats using aggr counts
  perf stat: Display percore events properly
  perf stat: Remove unused perf_counts.aggr field

 tools/perf/builtin-script.c                   |   4 +-
 tools/perf/builtin-stat.c                     | 186 +++++--
 tools/perf/tests/parse-metric.c               |   2 +-
 tools/perf/tests/pmu-events.c                 |   2 +-
 tools/perf/util/counts.c                      |   1 -
 tools/perf/util/counts.h                      |   1 -
 tools/perf/util/cpumap.c                      |  16 +-
 tools/perf/util/cpumap.h                      |   8 +-
 tools/perf/util/evsel.c                       |  32 +-
 tools/perf/util/parse-events.c                |   1 +
 tools/perf/util/pmu.c                         |   4 +
 .../scripting-engines/trace-event-python.c    |   6 -
 tools/perf/util/stat-display.c                | 462 +++---------------
 tools/perf/util/stat.c                        | 407 ++++++++++++---
 tools/perf/util/stat.h                        |  40 +-
 15 files changed, 633 insertions(+), 539 deletions(-)


base-commit: a3a365655a28f12f07eddf4f3fd596987b175e1d