perf tools: Add -H short option for --hierarchy
Commit Message
I found the hierarchy mode useful, but it's easy to make a typo when
using it. Let's add a short option for that.
Also update the documentation. :)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/Documentation/perf-report.txt | 29 ++++++++++++++++++++-
tools/perf/Documentation/perf-top.txt | 32 +++++++++++++++++++++++-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 2 +-
4 files changed, 61 insertions(+), 4 deletions(-)
Comments
On 26/10/23 09:26, Namhyung Kim wrote:
> I found the hierarchy mode useful, but it's easy to make a typo when
> using it. Let's add a short option for that.
>
> Also update the documentation. :)
Perhaps it would also be possible to support bash-completions for
long options
Hi Adrian,
On Wed, Oct 25, 2023 at 11:46 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 26/10/23 09:26, Namhyung Kim wrote:
> > I found the hierarchy mode useful, but it's easy to make a typo when
> > using it. Let's add a short option for that.
> >
> > Also update the documentation. :)
>
> Perhaps it would also be possible to support bash-completions for
> long options
I believe it already supports long options. But I have some setup
which doesn't work with bash completions. :-(
Thanks,
Namhyung
On Thu, Oct 26, 2023 at 09:46:02AM +0300, Adrian Hunter wrote:
> On 26/10/23 09:26, Namhyung Kim wrote:
> > I found the hierarchy mode useful, but it's easy to make a typo when
> > using it. Let's add a short option for that.
> > Also update the documentation. :)
> Perhaps it would also be possible to support bash-completions for
> long options
It works:
# . ~acme/git/linux/tools/perf/perf-completion.sh
# perf top --hi<TAB>
--hide_kernel_symbols --hide_user_symbols --hierarchy
#
And:
perf top --hie<ENTER>
works as it is unambiguous (so far).
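The prefix behaviour shown above can be reproduced outside perf. The following is only a sketch that uses the C library's getopt_long() as a stand-in; perf has its own option parser, and the function name wants_hierarchy() is made up for this illustration. It assumes a getopt_long() that accepts unambiguous long-option prefixes, as glibc does.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Illustration only: getopt_long() stands in for perf's own parser.
 * Returns 1 if hierarchy was requested, 0 if it was not, and -1 if any
 * option was unknown or an ambiguous prefix.
 */
int wants_hierarchy(int argc, char *const argv[])
{
	static const struct option longopts[] = {
		{ "hide_kernel_symbols", no_argument, NULL, 'K' },
		{ "hide_user_symbols",   no_argument, NULL, 'U' },
		{ "hierarchy",           no_argument, NULL, 'H' },
		{ NULL, 0, NULL, 0 },
	};
	int hierarchy = 0, bad = 0, c;

	assert(argv != NULL);
	optind = 1;	/* restart scanning so the function can be called repeatedly */
	opterr = 0;	/* keep getopt_long() from printing its own diagnostics */

	/* Always scan to the end so no parser state is left half-consumed. */
	while ((c = getopt_long(argc, argv, "HKU", longopts, NULL)) != -1) {
		if (c == 'H')
			hierarchy = 1;
		else if (c == '?')
			bad = 1;
	}
	return bad ? -1 : hierarchy;
}
```

With this table, "--hie" and "-H" both select the hierarchy option, while "--hi" is rejected because it matches all three long options.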
What we don't have is a way to use hierarchy by default, i.e. we should
have:
perf config top.hierarchy=1
and then:
perf top
would always use the hierarchy view.
tools/perf/Documentation/perf-config.txt has the options that can be
set, like:
# perf report | head -15
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 373K of event 'cycles:P'
# Event count (approx.): 205365133495
#
# Overhead Command Shared Object Symbol
# ........ ............... ................. ...................................
#
3.17% MediaDe~hine #6 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
2.31% swapper [kernel.vmlinux] [k] psi_group_change
1.87% MediaSu~sor #10 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
1.84% MediaSu~isor #7 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
#
Then:
# perf config report.sort_order=dso
# perf report | head -15
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 373K of event 'cycles:P'
# Event count (approx.): 205365133495
#
# Overhead Shared Object
# ........ ..............................................
#
59.52% [kernel.vmlinux]
19.79% libc.so.6
8.07% libxul.so
5.25% libopenh264.so.2.3.1
#
# cat ~/.perfconfig
# this file is auto-generated.
[report]
sort_order = dso
[root@five ~]# perf config report.sort_order
report.sort_order=dso
#
Right now 'perf top' has only:
static int perf_top_config(const char *var, const char *value, void *cb __maybe_unused)
{
if (!strcmp(var, "top.call-graph")) {
var = "call-graph.record-mode";
return perf_default_config(var, value, cb);
}
if (!strcmp(var, "top.children")) {
symbol_conf.cumulate_callchain = perf_config_bool(var, value);
return 0;
}
return 0;
}
This would be similar to what was done for --no-children on:
https://git.kernel.org/torvalds/c/104ac991bd821773cba6f262f97a4a752ed76dd5
$ git show --pretty=full 104ac991bd821773cba6f262f97a4a752ed76dd5 | head -5
commit 104ac991bd821773cba6f262f97a4a752ed76dd5
Author: Namhyung Kim <namhyung@kernel.org>
Commit: Jiri Olsa <jolsa@kernel.org>
perf top: Add top.children config option
- Arnaldo
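As a rough illustration of the suggestion above, a "top.hierarchy" key could be handled in the same shape as the existing "top.children" branch. This is a self-contained sketch, not the perf implementation: sketch_symbol_conf, sketch_config_bool() and sketch_top_config() are hypothetical stand-ins for symbol_conf, perf_config_bool() and perf_top_config(), and "top.hierarchy" is only the key proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for perf's symbol_conf; only the fields used here. */
struct sketch_symbol_conf {
	bool cumulate_callchain;
	bool report_hierarchy;
};

struct sketch_symbol_conf sketch_conf;

/*
 * Simplified stand-in for perf_config_bool(): treats "1", "true", "yes"
 * and "on" as true and every other string as false.
 */
bool sketch_config_bool(const char *value)
{
	assert(value != NULL);
	return !strcmp(value, "1") || !strcmp(value, "true") ||
	       !strcmp(value, "yes") || !strcmp(value, "on");
}

/* Same shape as perf_top_config(), plus the assumed "top.hierarchy" key. */
int sketch_top_config(const char *var, const char *value)
{
	assert(var != NULL);
	if (!strcmp(var, "top.children")) {
		sketch_conf.cumulate_callchain = sketch_config_bool(value);
		return 0;
	}
	if (!strcmp(var, "top.hierarchy")) {	/* assumption: key proposed above */
		sketch_conf.report_hierarchy = sketch_config_bool(value);
		return 0;
	}
	return 0;	/* keys this handler does not know are ignored */
}
```

Calling sketch_top_config("top.hierarchy", "1") sets report_hierarchy, which is the same flag the --hierarchy option sets in the patch.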
Hi Arnaldo,
On Thu, Oct 26, 2023 at 1:02 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Oct 26, 2023 at 09:46:02AM +0300, Adrian Hunter wrote:
> > On 26/10/23 09:26, Namhyung Kim wrote:
> > > I found the hierarchy mode useful, but it's easy to make a typo when
> > > using it. Let's add a short option for that.
>
> > > Also update the documentation. :)
>
> > Perhaps it would also be possible to support bash-completions for
> > long options
>
> It works:
>
> # . ~acme/git/linux/tools/perf/perf-completion.sh
> # perf top --hi<TAB>
> --hide_kernel_symbols --hide_user_symbols --hierarchy
> #
>
> And:
>
> perf top --hie<ENTER>
>
> works as it is unambiguous (so far).
Thanks for the test!
>
> What we don't have is a way to use hierarchy by default, i.e. we should
> have:
>
> perf config top.hierarchy=1
>
> and then:
>
> perf top
>
> would always use the hierarchy view.
>
> tools/perf/Documentation/perf-config.txt has the options that can be
> set, like:
>
> # perf report | head -15
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 373K of event 'cycles:P'
> # Event count (approx.): 205365133495
> #
> # Overhead Command Shared Object Symbol
> # ........ ............... ................. ...................................
> #
> 3.17% MediaDe~hine #6 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
> 2.31% swapper [kernel.vmlinux] [k] psi_group_change
> 1.87% MediaSu~sor #10 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
> 1.84% MediaSu~isor #7 libc.so.6 [.] pthread_mutex_lock@@GLIBC_2.2.5
> #
>
> Then:
>
> # perf config report.sort_order=dso
> # perf report | head -15
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 373K of event 'cycles:P'
> # Event count (approx.): 205365133495
> #
> # Overhead Shared Object
> # ........ ..............................................
> #
> 59.52% [kernel.vmlinux]
> 19.79% libc.so.6
> 8.07% libxul.so
> 5.25% libopenh264.so.2.3.1
> #
>
> # cat ~/.perfconfig
> # this file is auto-generated.
> [report]
> sort_order = dso
> [root@five ~]# perf config report.sort_order
> report.sort_order=dso
> #
>
> Right now 'perf top' has only:
>
> static int perf_top_config(const char *var, const char *value, void *cb __maybe_unused)
> {
> if (!strcmp(var, "top.call-graph")) {
> var = "call-graph.record-mode";
> return perf_default_config(var, value, cb);
> }
> if (!strcmp(var, "top.children")) {
> symbol_conf.cumulate_callchain = perf_config_bool(var, value);
> return 0;
> }
>
> return 0;
> }
>
> This would be similar to what was done for --no-children on:
Sure, I can add the config option later. But hierarchy mode is not
compatible with some options that change the output, like --children
and --fields. Maybe it needs some kind of priority among settings to
handle the incompatible ones.
Thanks,
Namhyung
>
> https://git.kernel.org/torvalds/c/104ac991bd821773cba6f262f97a4a752ed76dd5
>
> $ git show --pretty=full 104ac991bd821773cba6f262f97a4a752ed76dd5 | head -5
> commit 104ac991bd821773cba6f262f97a4a752ed76dd5
> Author: Namhyung Kim <namhyung@kernel.org>
> Commit: Jiri Olsa <jolsa@kernel.org>
>
> perf top: Add top.children config option
>
> - Arnaldo
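One possible reading of "priority of settings" is sketched below. It is an assumed policy, not what perf does: a setting given on the command line outranks one from the config file, and two incompatible settings from the same source are reported as a conflict. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Where a setting came from; a higher value takes priority. */
enum setting_source { SRC_UNSET = 0, SRC_CONFIG = 1, SRC_CMDLINE = 2 };

struct view_request {
	enum setting_source hierarchy;
	enum setting_source children;
	enum setting_source fields;
};

struct view_result {
	bool hierarchy;
	bool children;
	bool fields;
	bool conflict;	/* incompatible settings came from the same source */
};

/*
 * Assumed policy: hierarchy competes with the stronger of children and
 * fields. The higher-priority source wins; a tie is a conflict.
 */
struct view_result resolve_view(struct view_request req)
{
	struct view_result res = { false, false, false, false };
	enum setting_source rival =
		req.children > req.fields ? req.children : req.fields;

	if (req.hierarchy != SRC_UNSET && req.hierarchy == rival) {
		res.conflict = true;
		return res;
	}

	res.hierarchy = req.hierarchy > rival;
	if (!res.hierarchy) {
		/* Hierarchy lost or was never requested: keep the others. */
		res.children = req.children != SRC_UNSET;
		res.fields = req.fields != SRC_UNSET;
	}
	return res;
}
```

Under this policy a hypothetical top.hierarchy=1 in the config file would be dropped when --children is given on the command line, while an explicit --hierarchy would override a config-file children setting.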
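diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -528,8 +528,35 @@ include::itrace.txt[]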
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
+-H::
--hierarchy::
- Enable hierarchical output.
+ Enable hierarchical output. In the hierarchy mode, each sort key groups
+ samples based on the criteria and then sub-divides them using the lower
+ level sort key.
+
+ For example:
+ In normal output:
+
+ perf report -s dso,sym
+ # Overhead Shared Object Symbol
+ 50.00% [kernel.kallsyms] [k] kfunc1
+ 20.00% perf [.] foo
+ 15.00% [kernel.kallsyms] [k] kfunc2
+ 10.00% perf [.] bar
+ 5.00% libc.so [.] libcall
+
+ In hierarchy output:
+
+ perf report -s dso,sym --hierarchy
+ # Overhead Shared Object / Symbol
+ 65.00% [kernel.kallsyms]
+ 50.00% [k] kfunc1
+ 15.00% [k] kfunc2
+ 30.00% perf
+ 20.00% [.] foo
+ 10.00% [.] bar
+ 5.00% libc.so
+ 5.00% [.] libcall
--inline::
If a callgraph address belongs to an inlined function, the inline stack
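diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -261,8 +261,38 @@ Default is to monitor all CPUS.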
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
+-H::
--hierarchy::
- Enable hierarchy output.
+ Enable hierarchical output. In the hierarchy mode, each sort key groups
+ samples based on the criteria and then sub-divides them using the lower
+ level sort key.
+
+ For example, in normal output:
+
+ perf report -s dso,sym
+ #
+ # Overhead Shared Object Symbol
+ # ........ ................. ...........
+ 50.00% [kernel.kallsyms] [k] kfunc1
+ 20.00% perf [.] foo
+ 15.00% [kernel.kallsyms] [k] kfunc2
+ 10.00% perf [.] bar
+ 5.00% libc.so [.] libcall
+
+ In hierarchy output:
+
+ perf report -s dso,sym --hierarchy
+ #
+ # Overhead Shared Object / Symbol
+ # .......... ......................
+ 65.00% [kernel.kallsyms]
+ 50.00% [k] kfunc1
+ 15.00% [k] kfunc2
+ 30.00% perf
+ 20.00% [.] foo
+ 10.00% [.] bar
+ 5.00% libc.so
+ 5.00% [.] libcall
--overwrite::
Enable this to use just the most recent records, which helps in high core count
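diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1392,7 +1392,7 @@ int cmd_report(int argc, const char **argv)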
"only show processor socket that match with this filter"),
OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
"Show raw trace event output (do not use print fmt or plugins)"),
- OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+ OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
"'always' (default), 'never' or 'auto' only applicable to --stdio mode",
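diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1573,7 +1573,7 @@ int cmd_top(int argc, const char **argv)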
"add last branch records to call history"),
OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
"Show raw trace event output (do not use print fmt or plugins)"),
- OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+ OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
"Show entries in a hierarchy"),
OPT_BOOLEAN(0, "overwrite", &top.record_opts.overwrite,
"Use a backward ring buffer, default: no"),
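The grouping described in the documentation hunks can be sketched in a few lines. This is a simplified illustration, not perf's hist code: it regroups a flat "dso,sym" listing by its first sort key and sums the overhead for each group. The struct and function names are hypothetical, overheads are whole percents, and groups are printed in order of first appearance rather than sorted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One row of the flat "dso,sym" output. */
struct flat_entry {
	const char *dso;
	const char *sym;
	int overhead;	/* whole percent, to keep the arithmetic exact */
};

/* Sum the overhead of every row that shares the given first-level key. */
int dso_overhead(const struct flat_entry *e, size_t n, const char *dso)
{
	int total = 0;

	assert(e != NULL && dso != NULL);
	for (size_t i = 0; i < n; i++)
		if (!strcmp(e[i].dso, dso))
			total += e[i].overhead;
	return total;
}

/*
 * Print each dso once with its summed overhead, followed by its symbols
 * indented one level. Returns the number of top-level groups printed.
 */
int print_hierarchy(const struct flat_entry *e, size_t n)
{
	int groups = 0;

	for (size_t i = 0; i < n; i++) {
		int seen = 0;

		/* Skip a dso that was already printed as a group. */
		for (size_t j = 0; j < i; j++)
			if (!strcmp(e[j].dso, e[i].dso))
				seen = 1;
		if (seen)
			continue;

		groups++;
		printf("%3d%%  %s\n", dso_overhead(e, n, e[i].dso), e[i].dso);
		for (size_t k = i; k < n; k++)
			if (!strcmp(e[k].dso, e[i].dso))
				printf("  %3d%%    %s\n", e[k].overhead, e[k].sym);
	}
	return groups;
}
```

Fed the five rows from the documentation example, dso_overhead() gives 65 for [kernel.kallsyms], 30 for perf and 5 for libc.so, matching the first-level lines of the hierarchy output.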