[2/2] perf test: Skip metrics w/o event name in stat STD output linter

Message ID 20230623230139.985594-2-namhyung@kernel.org
State New
Headers
Series [1/2] perf test: Reorder event name checks in stat STD output linter |

Commit Message

Namhyung Kim June 23, 2023, 11:01 p.m. UTC
  This test checks if the output of perf stat to match event names and
metrics.  So it wants the output lines to have both event name and
metric.  Otherwise it should skip the line.

On AMD machines, the instruction event has two metrics and they are printed
in separate lines.  It makes the line without event name like below:

  # perf stat -a sleep 1

   Performance counter stats for 'system wide':

           64,383.34 msec cpu-clock                  #   64.048 CPUs utilized
              14,526      context-switches           #  225.617 /sec
                 112      cpu-migrations             #    1.740 /sec
                 190      page-faults                #    2.951 /sec
         807,558,652      cycles                     #    0.013 GHz                         (83.30%)
          69,809,799      stalled-cycles-frontend    #    8.64% frontend cycles idle        (83.30%)
         196,983,266      stalled-cycles-backend     #   24.39% backend cycles idle         (83.30%)
         424,876,008      instructions               #    0.53  insn per cycle
 (here) --->                                  #    0.46  stalled cycles per insn     (83.30%)
          97,788,321      branches                   #    1.519 M/sec                       (83.34%)
           4,147,377      branch-misses              #    4.24% of all branches             (83.46%)

         1.005241409 seconds time elapsed

Also modern Intel machines have TopDown metrics which also don't have
event names.

  # perf stat -a sleep 1

   Performance counter stats for 'system wide':

            8,015.39 msec cpu-clock                        #    7.996 CPUs utilized
               5,823      context-switches                 #  726.477 /sec
                 189      cpu-migrations                   #   23.580 /sec
                 139      page-faults                      #   17.342 /sec
         435,139,308      cycles                           #    0.054 GHz
         193,891,345      instructions                     #    0.45  insn per cycle
          42,773,028      branches                         #    5.336 M/sec
           2,298,113      branch-misses                    #    5.37% of all branches
                          TopdownL1                 #     25.5 %  tma_backend_bound
              /-->                                  #      7.9 %  tma_bad_speculation
    (here) --+                                      #     55.7 %  tma_frontend_bound
              \-->                                  #     10.9 %  tma_retiring

         1.002395924 seconds time elapsed

There is a check to skip TopdownL1 and TopdownL2 specifically but it
does not cover every affected lines.

So there is another check to skip the line if it has nothing on the left
side of # sign.  Well.. it seems ok but that's not enough too.

When aggregation mode (like --per-socket or --per-thread) is used, it
adds some prefix (e.g. CPU socket, task name and PID) in the output
line.  So the test code ignores them to normalize result.

A problem can happen for per-thread mode when task name contains one or
more spaces.  It'd only ignore the first part of the task name, and it
thinks there's something more in the line so it would not skip.

  # perf stat -a --perf-thread sleep 1
  ...
            perf-21276                  #     70.2 %  tma_backend_bound
            perf-21276                  #      3.9 %  tma_bad_speculation
            perf-21276                  #     10.5 %  tma_frontend_bound
            perf-21276                  #     15.3 %  tma_retiring
	    ^^^^^^^^^^
	    (ignored)

         my task-21328                  #     70.2 %  tma_backend_bound
         my task-21328                  #      3.9 %  tma_bad_speculation
         my task-21328                  #     10.5 %  tma_frontend_bound
         my task-21328                  #     15.3 %  tma_retiring
	 ^^
     (ignored)

So I think it should look at the metric names instead.  Add skip_metric
to hold the list of names to skip.  It would contain 'stalled cycles per
insn' and metrics started by 'tma_'.

Fixes: 99a04a48f225 ("perf test: Add test case for the standard 'perf stat' output")
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/tests/shell/stat+std_output.sh | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
  

Comments

Ian Rogers June 23, 2023, 11:23 p.m. UTC | #1
On Fri, Jun 23, 2023 at 4:01 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> This test checks if the output of perf stat to match event names and
> metrics.  So it wants the output lines to have both event name and
> metric.  Otherwise it should skip the line.
>
> On AMD machines, the instruction event has two metrics and they are printed
> in separate lines.  It makes the line without event name like below:
>
>   # perf stat -a sleep 1
>
>    Performance counter stats for 'system wide':
>
>            64,383.34 msec cpu-clock                  #   64.048 CPUs utilized
>               14,526      context-switches           #  225.617 /sec
>                  112      cpu-migrations             #    1.740 /sec
>                  190      page-faults                #    2.951 /sec
>          807,558,652      cycles                     #    0.013 GHz                         (83.30%)
>           69,809,799      stalled-cycles-frontend    #    8.64% frontend cycles idle        (83.30%)
>          196,983,266      stalled-cycles-backend     #   24.39% backend cycles idle         (83.30%)
>          424,876,008      instructions               #    0.53  insn per cycle
>  (here) --->                                  #    0.46  stalled cycles per insn     (83.30%)
>           97,788,321      branches                   #    1.519 M/sec                       (83.34%)
>            4,147,377      branch-misses              #    4.24% of all branches             (83.46%)
>
>          1.005241409 seconds time elapsed
>
> Also modern Intel machines have TopDown metrics which also don't have
> event names.
>
>   # perf stat -a sleep 1
>
>    Performance counter stats for 'system wide':
>
>             8,015.39 msec cpu-clock                        #    7.996 CPUs utilized
>                5,823      context-switches                 #  726.477 /sec
>                  189      cpu-migrations                   #   23.580 /sec
>                  139      page-faults                      #   17.342 /sec
>          435,139,308      cycles                           #    0.054 GHz
>          193,891,345      instructions                     #    0.45  insn per cycle
>           42,773,028      branches                         #    5.336 M/sec
>            2,298,113      branch-misses                    #    5.37% of all branches
>                           TopdownL1                 #     25.5 %  tma_backend_bound
>               /-->                                  #      7.9 %  tma_bad_speculation
>     (here) --+                                      #     55.7 %  tma_frontend_bound
>               \-->                                  #     10.9 %  tma_retiring
>
>          1.002395924 seconds time elapsed
>
> There is a check to skip TopdownL1 and TopdownL2 specifically but it
> does not cover every affected lines.
>
> So there is another check to skip the line if it has nothing on the left
> side of # sign.  Well.. it seems ok but that's not enough too.
>
> When aggregation mode (like --per-socket or --per-thread) is used, it
> adds some prefix (e.g. CPU socket, task name and PID) in the output
> line.  So the test code ignores them to normalize result.
>
> A problem can happen for per-thread mode when task name contains one or
> more spaces.  It'd only ignore the first part of the task name, and it
> thinks there's something more in the line so it would not skip.
>
>   # perf stat -a --perf-thread sleep 1
>   ...
>             perf-21276                  #     70.2 %  tma_backend_bound
>             perf-21276                  #      3.9 %  tma_bad_speculation
>             perf-21276                  #     10.5 %  tma_frontend_bound
>             perf-21276                  #     15.3 %  tma_retiring
>             ^^^^^^^^^^
>             (ignored)
>
>          my task-21328                  #     70.2 %  tma_backend_bound
>          my task-21328                  #      3.9 %  tma_bad_speculation
>          my task-21328                  #     10.5 %  tma_frontend_bound
>          my task-21328                  #     15.3 %  tma_retiring
>          ^^
>      (ignored)
>
> So I think it should look at the metric names instead.  Add skip_metric
> to hold the list of names to skip.  It would contain 'stalled cycles per
> insn' and metrics started by 'tma_'.
>
> Fixes: 99a04a48f225 ("perf test: Add test case for the standard 'perf stat' output")
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  tools/perf/tests/shell/stat+std_output.sh | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/tests/shell/stat+std_output.sh b/tools/perf/tests/shell/stat+std_output.sh
> index 1f70aab45184..f972b31fa0c2 100755
> --- a/tools/perf/tests/shell/stat+std_output.sh
> +++ b/tools/perf/tests/shell/stat+std_output.sh
> @@ -12,8 +12,7 @@ stat_output=$(mktemp /tmp/__perf_test.stat_output.std.XXXXX)
>
>  event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
>  event_metric=("CPUs utilized" "CPUs utilized" "/sec" "/sec" "/sec" "frontend cycles idle" "backend cycles idle" "GHz" "insn per cycle" "/sec" "of all branches")
> -
> -metricgroup_name=(TopdownL1 TopdownL2)
> +skip_metric=("stalled cycles per insn" "tma_")
>
>  cleanup() {
>    rm -f "${stat_output}"
> @@ -58,13 +57,14 @@ function commachecker()
>
>                 main_body=$(echo $line | cut -d' ' -f$prefix-)
>                 x=${main_body%#*}
> -               # Check default metricgroup
> -               y=$(echo $x | tr -d ' ')
> -               [ "$y" = "" ] && continue
> -               for i in "${!metricgroup_name[@]}"; do
> -                       [[ "$y" == *"${metricgroup_name[$i]}"* ]] && break
> +               [ "$x" = "" ] && continue
> +
> +               # Skip metrics without event name
> +               y=${main_body#*#}
> +               for i in "${!skip_metric[@]}"; do
> +                       [[ "$y" == *"${skip_metric[$i]}"* ]] && break
>                 done
> -               [[ "$y" == *"${metricgroup_name[$i]}"* ]] && continue
> +               [[ "$y" == *"${skip_metric[$i]}"* ]] && continue
>
>                 # Check default event
>                 for i in "${!event_name[@]}"; do
> --
> 2.41.0.162.gfafddb0af9-goog
>
  

Patch

diff --git a/tools/perf/tests/shell/stat+std_output.sh b/tools/perf/tests/shell/stat+std_output.sh
index 1f70aab45184..f972b31fa0c2 100755
--- a/tools/perf/tests/shell/stat+std_output.sh
+++ b/tools/perf/tests/shell/stat+std_output.sh
@@ -12,8 +12,7 @@  stat_output=$(mktemp /tmp/__perf_test.stat_output.std.XXXXX)
 
 event_name=(cpu-clock task-clock context-switches cpu-migrations page-faults stalled-cycles-frontend stalled-cycles-backend cycles instructions branches branch-misses)
 event_metric=("CPUs utilized" "CPUs utilized" "/sec" "/sec" "/sec" "frontend cycles idle" "backend cycles idle" "GHz" "insn per cycle" "/sec" "of all branches")
-
-metricgroup_name=(TopdownL1 TopdownL2)
+skip_metric=("stalled cycles per insn" "tma_")
 
 cleanup() {
   rm -f "${stat_output}"
@@ -58,13 +57,14 @@  function commachecker()
 
 		main_body=$(echo $line | cut -d' ' -f$prefix-)
 		x=${main_body%#*}
-		# Check default metricgroup
-		y=$(echo $x | tr -d ' ')
-		[ "$y" = "" ] && continue
-		for i in "${!metricgroup_name[@]}"; do
-			[[ "$y" == *"${metricgroup_name[$i]}"* ]] && break
+		[ "$x" = "" ] && continue
+
+		# Skip metrics without event name
+		y=${main_body#*#}
+		for i in "${!skip_metric[@]}"; do
+			[[ "$y" == *"${skip_metric[$i]}"* ]] && break
 		done
-		[[ "$y" == *"${metricgroup_name[$i]}"* ]] && continue
+		[[ "$y" == *"${skip_metric[$i]}"* ]] && continue
 
 		# Check default event
 		for i in "${!event_name[@]}"; do