perf tools: Add -H short option for --hierarchy

Message ID 20231026062615.3096537-1-namhyung@kernel.org
State New
Series perf tools: Add -H short option for --hierarchy

Commit Message

Namhyung Kim Oct. 26, 2023, 6:26 a.m. UTC
I found the hierarchy mode useful, but it's easy to make a typo when
using it.  Let's add a short option for that.

Also update the documentation. :)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt | 29 ++++++++++++++++++++-
 tools/perf/Documentation/perf-top.txt    | 32 +++++++++++++++++++++++-
 tools/perf/builtin-report.c              |  2 +-
 tools/perf/builtin-top.c                 |  2 +-
 4 files changed, 61 insertions(+), 4 deletions(-)
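
The hierarchy mode documented by this patch groups samples by the first sort key and then subdivides each group by the next key. As a rough illustration only (not perf's implementation), the C sketch below rebuilds the per-DSO totals from the flat `dso,sym` figures used in the documentation example. The names `sample_entry`, `dso_group`, `group_by_dso`, `print_hierarchy` and `MAX_GROUPS` are hypothetical, and the input is assumed to be already sorted by overhead, as in the flat report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 16	/* hypothetical limit, enough for the example */

/* One line of the flat "-s dso,sym" output. */
struct sample_entry {
	const char *dso;
	const char *sym;
	double overhead;	/* percent */
};

/* One first-level node of the hierarchy: a DSO and its summed overhead. */
struct dso_group {
	const char *dso;
	double total;
};

/* Sum overhead per DSO, then order the groups by descending total. */
size_t group_by_dso(const struct sample_entry *entries, size_t n,
		    struct dso_group *groups)
{
	size_t ngroups = 0;

	for (size_t i = 0; i < n; i++) {
		size_t j;

		for (j = 0; j < ngroups; j++)
			if (!strcmp(groups[j].dso, entries[i].dso))
				break;
		if (j == ngroups) {
			assert(ngroups < MAX_GROUPS);
			groups[ngroups].dso = entries[i].dso;
			groups[ngroups].total = 0.0;
			ngroups++;
		}
		groups[j].total += entries[i].overhead;
	}

	/* Insertion sort: largest group first. */
	for (size_t i = 1; i < ngroups; i++) {
		struct dso_group key = groups[i];
		size_t j = i;

		while (j > 0 && groups[j - 1].total < key.total) {
			groups[j] = groups[j - 1];
			j--;
		}
		groups[j] = key;
	}
	return ngroups;
}

/* Print each DSO followed by its symbols, indented one level deeper. */
void print_hierarchy(const struct sample_entry *entries, size_t n)
{
	struct dso_group groups[MAX_GROUPS];
	size_t ngroups = group_by_dso(entries, n, groups);

	printf("#   Overhead  Shared Object / Symbol\n");
	for (size_t g = 0; g < ngroups; g++) {
		printf("    %6.2f%%    %s\n", groups[g].total, groups[g].dso);
		for (size_t i = 0; i < n; i++)
			if (!strcmp(entries[i].dso, groups[g].dso))
				printf("      %6.2f%%    %s\n",
				       entries[i].overhead, entries[i].sym);
	}
}
```

Calling `print_hierarchy()` on the five flat entries from the example yields three first-level groups, with `[kernel.kallsyms]` first because its two symbols add up to the largest share.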
  

Comments

Adrian Hunter Oct. 26, 2023, 6:46 a.m. UTC | #1
On 26/10/23 09:26, Namhyung Kim wrote:
> I found the hierarchy mode useful, but it's easy to make a typo when
> using it.  Let's add a short option for that.
> 
> Also update the documentation. :)

Perhaps it would also be possible to support bash-completions for
long options
  
Namhyung Kim Oct. 26, 2023, 5:19 p.m. UTC | #2
Hi Adrian,

On Wed, Oct 25, 2023 at 11:46 PM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 26/10/23 09:26, Namhyung Kim wrote:
> > I found the hierarchy mode useful, but it's easy to make a typo when
> > using it.  Let's add a short option for that.
> >
> > Also update the documentation. :)
>
> Perhaps it would also be possible to support bash-completions for
> long options

I believe it already supports long options.  But I have some setup
which doesn't work with bash completions. :-(

Thanks,
Namhyung
  
Arnaldo Carvalho de Melo Oct. 26, 2023, 8:02 p.m. UTC | #3
On Thu, Oct 26, 2023 at 09:46:02AM +0300, Adrian Hunter wrote:
> On 26/10/23 09:26, Namhyung Kim wrote:
> > I found the hierarchy mode useful, but it's easy to make a typo when
> > using it.  Let's add a short option for that.

> > Also update the documentation. :)

> Perhaps it would also be possible to support bash-completions for
> long options

It works:

  # . ~acme/git/linux/tools/perf/perf-completion.sh
  # perf top --hi<TAB>
  --hide_kernel_symbols  --hide_user_symbols    --hierarchy
  #

And:

perf top --hie<ENTER>

works as it is unambiguous (so far).
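
The unambiguous-prefix behaviour above can be illustrated outside perf. The sketch below is an analogy only: perf uses its own option parser, whereas this example relies on `getopt_long()` from the C library, which also accepts unique abbreviations of long options. The table copies the three completions shown above; `parse_one` and `enum parse_result` are hypothetical names, and resetting with `optind = 0` assumes a Linux libc such as glibc or musl.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Outcome of parsing a single command-line word. */
enum parse_result {
	PARSE_NONE,		/* the word was not an option */
	PARSE_HIERARCHY,
	PARSE_HIDE_KERNEL,
	PARSE_HIDE_USER,
	PARSE_ERROR,		/* unknown or ambiguous option */
};

/* Parse one word against a table holding the three "--hi..." options. */
enum parse_result parse_one(const char *arg)
{
	static const struct option longopts[] = {
		{ "hide_kernel_symbols", no_argument, NULL, 'K' },
		{ "hide_user_symbols",   no_argument, NULL, 'U' },
		{ "hierarchy",           no_argument, NULL, 'H' },
		{ NULL, 0, NULL, 0 },
	};
	char *argv[] = { (char *)"demo", (char *)arg, NULL };

	assert(arg != NULL);
	optind = 0;	/* glibc and musl: restart scanning from scratch */
	opterr = 0;	/* report errors through the return value only */

	/* The short and long spellings share one value, like 'H' in the patch. */
	switch (getopt_long(2, argv, "HKU", longopts, NULL)) {
	case 'H':
		return PARSE_HIERARCHY;
	case 'K':
		return PARSE_HIDE_KERNEL;
	case 'U':
		return PARSE_HIDE_USER;
	case -1:
		return PARSE_NONE;
	default:
		return PARSE_ERROR;
	}
}
```

With this table, `parse_one("--hie")` and `parse_one("-H")` both select the hierarchy option, while `parse_one("--hi")` is rejected because three names share that prefix.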

What we don't have is a way to use hierarchy by default, i.e. we should
have:

perf config top.hierarchy=1

and then:

perf top

would always use the hierarchy view.

tools/perf/Documentation/perf-config.txt has the options that can be
set, like:

# perf report | head -15
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 373K of event 'cycles:P'
# Event count (approx.): 205365133495
#
# Overhead  Command          Shared Object                                     Symbol
# ........  ...............  .................    ...................................
#
     3.17%  MediaDe~hine #6  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
     2.31%  swapper          [kernel.vmlinux]     [k] psi_group_change
     1.87%  MediaSu~sor #10  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
     1.84%  MediaSu~isor #7  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
#

Then:

# perf config report.sort_order=dso
# perf report | head -15
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 373K of event 'cycles:P'
# Event count (approx.): 205365133495
#
# Overhead  Shared Object                                 
# ........  ..............................................
#
    59.52%  [kernel.vmlinux]                              
    19.79%  libc.so.6                                     
     8.07%  libxul.so                                     
     5.25%  libopenh264.so.2.3.1                          
#

# cat ~/.perfconfig
# this file is auto-generated.
[report]
	sort_order = dso
[root@five ~]# perf config report.sort_order
report.sort_order=dso
#

Right now 'perf top' has only:

static int perf_top_config(const char *var, const char *value, void *cb __maybe_unused)
{
        if (!strcmp(var, "top.call-graph")) {
                var = "call-graph.record-mode";
                return perf_default_config(var, value, cb);
        }
        if (!strcmp(var, "top.children")) {
                symbol_conf.cumulate_callchain = perf_config_bool(var, value);
                return 0;
        }

        return 0;
}

This would be similar to what was done for --no-children on:

https://git.kernel.org/torvalds/c/104ac991bd821773cba6f262f97a4a752ed76dd5

$ git show --pretty=full 104ac991bd821773cba6f262f97a4a752ed76dd5 | head -5
commit 104ac991bd821773cba6f262f97a4a752ed76dd5
Author: Namhyung Kim <namhyung@kernel.org>
Commit: Jiri Olsa <jolsa@kernel.org>

    perf top: Add top.children config option
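
A possible shape for the proposed option, following the `top.children` branch quoted above, is sketched here as a standalone program fragment rather than as a patch. Everything prefixed with `demo_` is a hypothetical stand-in: `demo_symbol_conf` models only two fields of perf's `symbol_conf`, `demo_config_bool()` is a simplified substitute for `perf_config_bool()`, and the `top.hierarchy` key is the one suggested in this thread, not an existing setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h>

/* Stand-in for the two symbol_conf fields this sketch touches. */
struct demo_symbol_conf {
	bool cumulate_callchain;
	bool report_hierarchy;
};

struct demo_symbol_conf demo_conf;

/* Simplified boolean parser; the real helper accepts more spellings. */
bool demo_config_bool(const char *value)
{
	assert(value != NULL);
	return !strcasecmp(value, "true") || !strcasecmp(value, "yes") ||
	       !strcasecmp(value, "on") || !strcmp(value, "1");
}

/* Config callback modelled on the perf_top_config() quoted above. */
int demo_top_config(const char *var, const char *value)
{
	assert(var != NULL);

	if (!strcmp(var, "top.children")) {
		demo_conf.cumulate_callchain = demo_config_bool(value);
		return 0;
	}
	/* Hypothetical key: would make hierarchy view the default. */
	if (!strcmp(var, "top.hierarchy")) {
		demo_conf.report_hierarchy = demo_config_bool(value);
		return 0;
	}

	/* Unknown keys are ignored, as in the quoted function. */
	return 0;
}
```

After `demo_top_config("top.hierarchy", "1")`, the `report_hierarchy` flag is set, which is the same field that `-H`/`--hierarchy` sets in the patch.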

- Arnaldo
  
Namhyung Kim Nov. 6, 2023, 4:43 a.m. UTC | #4
Hi Arnaldo,

On Thu, Oct 26, 2023 at 1:02 PM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Oct 26, 2023 at 09:46:02AM +0300, Adrian Hunter wrote:
> > On 26/10/23 09:26, Namhyung Kim wrote:
> > > I found the hierarchy mode useful, but it's easy to make a typo when
> > > using it.  Let's add a short option for that.
>
> > > Also update the documentation. :)
>
> > Perhaps it would also be possible to support bash-completions for
> > long options
>
> It works:
>
>   # . ~acme/git/linux/tools/perf/perf-completion.sh
>   # perf top --hi<TAB>
>   --hide_kernel_symbols  --hide_user_symbols    --hierarchy
>   #
>
> And:
>
> perf top --hie<ENTER>
>
> works as it is unambiguous (so far).

Thanks for the test!

>
> What we don't have is a way to use hierarchy by default, i.e. we should
> have:
>
> perf config top.hierarchy=1
>
> and then:
>
> perf top
>
> would always use the hierarchy view.
>
> tools/perf/Documentation/perf-config.txt has the options that can be
> set, like:
>
> # perf report | head -15
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 373K of event 'cycles:P'
> # Event count (approx.): 205365133495
> #
> # Overhead  Command          Shared Object                                     Symbol
> # ........  ...............  .................    ...................................
> #
>      3.17%  MediaDe~hine #6  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
>      2.31%  swapper          [kernel.vmlinux]     [k] psi_group_change
>      1.87%  MediaSu~sor #10  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
>      1.84%  MediaSu~isor #7  libc.so.6            [.] pthread_mutex_lock@@GLIBC_2.2.5
> #
>
> Then:
>
> # perf config report.sort_order=dso
> # perf report | head -15
> # To display the perf.data header info, please use --header/--header-only options.
> #
> #
> # Total Lost Samples: 0
> #
> # Samples: 373K of event 'cycles:P'
> # Event count (approx.): 205365133495
> #
> # Overhead  Shared Object
> # ........  ..............................................
> #
>     59.52%  [kernel.vmlinux]
>     19.79%  libc.so.6
>      8.07%  libxul.so
>      5.25%  libopenh264.so.2.3.1
> #
>
> # cat ~/.perfconfig
> # this file is auto-generated.
> [report]
>         sort_order = dso
> [root@five ~]# perf config report.sort_order
> report.sort_order=dso
> #
>
> Right now 'perf top' has only:
>
> static int perf_top_config(const char *var, const char *value, void *cb __maybe_unused)
> {
>         if (!strcmp(var, "top.call-graph")) {
>                 var = "call-graph.record-mode";
>                 return perf_default_config(var, value, cb);
>         }
>         if (!strcmp(var, "top.children")) {
>                 symbol_conf.cumulate_callchain = perf_config_bool(var, value);
>                 return 0;
>         }
>
>         return 0;
> }
>
> This would be similar to what was done for --no-children on:

Sure, I can add the config option later.  But hierarchy mode is not
compatible with some options that change the output, like --children
and --fields.  Maybe it needs some kind of priority among settings to
resolve the incompatible ones.
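
One way to picture such a priority scheme is sketched below. It is purely an assumption for discussion: the policy shown (command line beats config file, which beats the built-in default, and a more explicitly set incompatible option switches hierarchy off) is not something perf is stated to do in this thread, and all names (`setting`, `setting_source`, `setting_update`, `effective_hierarchy`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Where a setting's value came from, lowest priority first. */
enum setting_source {
	SRC_DEFAULT = 0,
	SRC_CONFIG = 1,
	SRC_CMDLINE = 2,
};

struct setting {
	bool value;
	enum setting_source source;
};

/* Accept a new value unless it comes from a lower-priority source. */
bool setting_update(struct setting *s, bool value, enum setting_source source)
{
	assert(s != NULL);
	if (source < s->source)
		return false;
	s->value = value;
	s->source = source;
	return true;
}

/*
 * Assumed policy: hierarchy stays on unless the incompatible option
 * (here --children) was enabled from a higher-priority source.
 */
bool effective_hierarchy(const struct setting *hierarchy,
			 const struct setting *children)
{
	assert(hierarchy != NULL && children != NULL);
	if (!hierarchy->value)
		return false;
	if (children->value && children->source > hierarchy->source)
		return false;
	return true;
}
```

Under this assumed policy, a hierarchy default from the config file would yield to `--children` given on the command line, while two settings from the same source would need an additional tie-breaking rule.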

Thanks,
Namhyung

>
> https://git.kernel.org/torvalds/c/104ac991bd821773cba6f262f97a4a752ed76dd5
>
> $ git show --pretty=full 104ac991bd821773cba6f262f97a4a752ed76dd5 | head -5
> commit 104ac991bd821773cba6f262f97a4a752ed76dd5
> Author: Namhyung Kim <namhyung@kernel.org>
> Commit: Jiri Olsa <jolsa@kernel.org>
>
>     perf top: Add top.children config option
>
> - Arnaldo
  

Patch

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index af068b4f1e5a..7d8916b2b7f7 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -528,8 +528,35 @@  include::itrace.txt[]
 --raw-trace::
 	When displaying traceevent output, do not use print fmt or plugins.
 
+-H::
 --hierarchy::
-	Enable hierarchical output.
+	Enable hierarchical output.  In hierarchy mode, each sort key groups
+	samples by its criterion, and each group is then subdivided using the
+	next lower-level sort key.
+
+	For example:
+	In normal output:
+
+	  perf report -s dso,sym
+	  # Overhead  Shared Object      Symbol
+	      50.00%  [kernel.kallsyms]  [k] kfunc1
+	      20.00%  perf               [.] foo
+	      15.00%  [kernel.kallsyms]  [k] kfunc2
+	      10.00%  perf               [.] bar
+	       5.00%  libc.so            [.] libcall
+
+	In hierarchy output:
+
+	  perf report -s dso,sym --hierarchy
+	  #   Overhead  Shared Object / Symbol
+	      65.00%    [kernel.kallsyms]
+	        50.00%    [k] kfunc1
+	        15.00%    [k] kfunc2
+	      30.00%    perf
+	        20.00%    [.] foo
+	        10.00%    [.] bar
+	       5.00%    libc.so
+	         5.00%    [.] libcall
 
 --inline::
 	If a callgraph address belongs to an inlined function, the inline stack
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 3c202ec080ba..a754875fa5bb 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -261,8 +261,38 @@  Default is to monitor all CPUS.
 --raw-trace::
 	When displaying traceevent output, do not use print fmt or plugins.
 
+-H::
 --hierarchy::
-	Enable hierarchy output.
+	Enable hierarchical output.  In hierarchy mode, each sort key groups
+	samples by its criterion, and each group is then subdivided using the
+	next lower-level sort key.
+
+	For example, in normal output:
+
+	  perf report -s dso,sym
+	  #
+	  # Overhead  Shared Object      Symbol
+	  # ........  .................  ...........
+	      50.00%  [kernel.kallsyms]  [k] kfunc1
+	      20.00%  perf               [.] foo
+	      15.00%  [kernel.kallsyms]  [k] kfunc2
+	      10.00%  perf               [.] bar
+	       5.00%  libc.so            [.] libcall
+
+	In hierarchy output:
+
+	  perf report -s dso,sym --hierarchy
+	  #
+	  #   Overhead  Shared Object / Symbol
+	  # ..........  ......................
+	      65.00%    [kernel.kallsyms]
+	        50.00%    [k] kfunc1
+	        15.00%    [k] kfunc2
+	      30.00%    perf
+	        20.00%    [.] foo
+	        10.00%    [.] bar
+	       5.00%    libc.so
+	         5.00%    [.] libcall
 
 --overwrite::
 	Enable this to use just the most recent records, which helps in high core count
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ca8f2331795c..b16680d0f82c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1392,7 +1392,7 @@  int cmd_report(int argc, const char **argv)
 		    "only show processor socket that match with this filter"),
 	OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
 		    "Show raw trace event output (do not use print fmt or plugins)"),
-	OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+	OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
 		    "Show entries in a hierarchy"),
 	OPT_CALLBACK_DEFAULT(0, "stdio-color", NULL, "mode",
 			     "'always' (default), 'never' or 'auto' only applicable to --stdio mode",
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index ea8c7eca5eee..3cccb2a516dc 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1573,7 +1573,7 @@  int cmd_top(int argc, const char **argv)
 		    "add last branch records to call history"),
 	OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
 		    "Show raw trace event output (do not use print fmt or plugins)"),
-	OPT_BOOLEAN(0, "hierarchy", &symbol_conf.report_hierarchy,
+	OPT_BOOLEAN('H', "hierarchy", &symbol_conf.report_hierarchy,
 		    "Show entries in a hierarchy"),
 	OPT_BOOLEAN(0, "overwrite", &top.record_opts.overwrite,
 		    "Use a backward ring buffer, default: no"),