Message ID | 20240228005230.287113-1-namhyung@kernel.org |
---|---|
Headers |
From: Namhyung Kim <namhyung@kernel.org>
To: Arnaldo Carvalho de Melo <acme@kernel.org>, Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>, Adrian Hunter <adrian.hunter@intel.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org>, LKML <linux-kernel@vger.kernel.org>, linux-perf-users@vger.kernel.org, Andi Kleen <ak@linux.intel.com>
Subject: [PATCH 0/4] perf annotate: Improve memory usage for symbol histogram
Date: Tue, 27 Feb 2024 16:52:26 -0800
Message-ID: <20240228005230.287113-1-namhyung@kernel.org>
List-Id: <linux-kernel.vger.kernel.org> |
Series |
perf annotate: Improve memory usage for symbol histogram
|
|
Message
Namhyung Kim
Feb. 28, 2024, 12:52 a.m. UTC
Hello,

This is another series of memory optimizations in perf annotate.

When perf annotate (or perf report/top with TUI) processes samples, it
needs to save the sample period (overhead) at instruction level. For
now, it allocates an array to do that for the whole symbol when it
hits any new symbol. This comes with a lot of waste since samples can
be very few and instructions span multiple bytes.

For example, when a sample hits symbol 'foo' that has a size of 100 and
that's the only sample that falls into the symbol, it needs to
allocate a symbol histogram (sym_hist) and its size would be

  16 (header) + 16 (sym_hist_entry) * 100 (symbol_size) = 1616

But actually it just needs 32 (header + sym_hist_entry) bytes. Things
get worse if the symbol size is bigger (and it doesn't have many
samples in different places). Also note that it needs a separate
histogram for each event.

Let's split the sym_hist_entry and have it in a hash table so that it
can allocate only necessary entries.

No functional change intended.

Thanks,
Namhyung


Namhyung Kim (4):
  perf annotate: Add a hashmap for symbol histogram
  perf annotate: Calculate instruction overhead using hashmap
  perf annotate: Remove sym_hist.addr[] array
  perf annotate: Add comments in the data structures

 tools/perf/ui/gtk/annotate.c |  14 ++++-
 tools/perf/util/annotate.c   | 114 ++++++++++++++++++++++-------------
 tools/perf/util/annotate.h   |  86 +++++++++++++++++++++++---
 3 files changed, 158 insertions(+), 56 deletions(-)
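The size arithmetic in the cover letter can be checked with a small sketch. The struct and function names below are hypothetical stand-ins, not the real perf definitions; they only assume what the cover letter states, namely a 16-byte header and a 16-byte per-offset entry. Hash table bookkeeping overhead is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins sized to match the cover letter's figures:
 * a 16-byte header and a 16-byte per-offset entry. */
struct demo_hist_header {
	uint64_t nr_samples;
	uint64_t period;
};

struct demo_hist_entry {
	uint64_t nr_samples;
	uint64_t period;
};

/* Dense layout: one entry per byte of the symbol, sampled or not. */
static size_t dense_hist_size(size_t symbol_size)
{
	return sizeof(struct demo_hist_header) +
	       symbol_size * sizeof(struct demo_hist_entry);
}

/* Sparse layout: one entry per offset that actually received a sample.
 * The cost of the hash table itself is deliberately left out. */
static size_t sparse_hist_size(size_t nr_sampled_offsets)
{
	return sizeof(struct demo_hist_header) +
	       nr_sampled_offsets * sizeof(struct demo_hist_entry);
}
```

With these definitions, `dense_hist_size(100)` gives the 1616 bytes from the example and `sparse_hist_size(1)` gives the 32 bytes the cover letter says are actually needed, per event.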
Comments
On Tue, Feb 27, 2024 at 4:52 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Now symbol histogram uses an array to save per-offset sample counts.
> But it wastes a lot of memory if the symbol has a few samples only.
> Add a hashmap to save values only for actual samples.
>
> For now, it has duplicate histogram (one in the existing array and
> another in the new hash map). Once it can convert to use the hash
> in all places, we can get rid of the array later.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/annotate.c | 40 +++++++++++++++++++++++++++++++++++++-
>  tools/perf/util/annotate.h |  2 ++
>  2 files changed, 41 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 107b264fa41e..7a70e4d35c9b 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -38,6 +38,7 @@
>  #include "arch/common.h"
>  #include "namespaces.h"
>  #include "thread.h"
> +#include "hashmap.h"
>  #include <regex.h>
>  #include <linux/bitops.h>
>  #include <linux/kernel.h>
> @@ -863,6 +864,17 @@ bool arch__is(struct arch *arch, const char *name)
>         return !strcmp(arch->name, name);
>  }
>
> +/* symbol histogram: key = offset << 16 | evsel->core.idx */
> +static size_t sym_hist_hash(long key, void *ctx __maybe_unused)
> +{
> +       return (key >> 16) + (key & 0xffff);
> +}
> +
> +static bool sym_hist_equal(long key1, long key2, void *ctx __maybe_unused)
> +{
> +       return key1 == key2;
> +}
> +
>  static struct annotated_source *annotated_source__new(void)
>  {
>         struct annotated_source *src = zalloc(sizeof(*src));
> @@ -877,6 +889,8 @@ static __maybe_unused void annotated_source__delete(struct annotated_source *src
>  {
>         if (src == NULL)
>                 return;
> +
> +       hashmap__free(src->samples);
>         zfree(&src->histograms);
>         free(src);
>  }
> @@ -909,6 +923,14 @@ static int annotated_source__alloc_histograms(struct annotated_source *src,
>         src->sizeof_sym_hist = sizeof_sym_hist;
>         src->nr_histograms = nr_hists;
>         src->histograms = calloc(nr_hists, sizeof_sym_hist) ;
> +
> +       if (src->histograms == NULL)
> +               return -1;
> +
> +       src->samples = hashmap__new(sym_hist_hash, sym_hist_equal, NULL);
> +       if (src->samples == NULL)
> +               zfree(&src->histograms);
> +
>         return src->histograms ? 0 : -1;
>  }
>
> @@ -920,6 +942,7 @@ void symbol__annotate_zero_histograms(struct symbol *sym)
>         if (notes->src != NULL) {
>                 memset(notes->src->histograms, 0,
>                        notes->src->nr_histograms * notes->src->sizeof_sym_hist);
> +               hashmap__clear(notes->src->samples);
>         }
>         if (notes->branch && notes->branch->cycles_hist) {
>                 memset(notes->branch->cycles_hist, 0,
> @@ -983,8 +1006,10 @@ static int __symbol__inc_addr_samples(struct map_symbol *ms,
>                                       struct perf_sample *sample)
>  {
>         struct symbol *sym = ms->sym;
> +       long hash_key;
>         unsigned offset;
>         struct sym_hist *h;
> +       struct sym_hist_entry *entry;
>
>         pr_debug3("%s: addr=%#" PRIx64 "\n", __func__, map__unmap_ip(ms->map, addr));
>
> @@ -1002,15 +1027,28 @@ static int __symbol__inc_addr_samples(struct map_symbol *ms,
>                          __func__, __LINE__, sym->name, sym->start, addr, sym->end, sym->type == STT_FUNC);
>                 return -ENOMEM;
>         }
> +
> +       hash_key = offset << 16 | evidx;
> +       if (!hashmap__find(src->samples, hash_key, &entry)) {
> +               entry = zalloc(sizeof(*entry));
> +               if (entry == NULL)
> +                       return -ENOMEM;
> +
> +               if (hashmap__add(src->samples, hash_key, entry) < 0)
> +                       return -ENOMEM;
> +       }
> +
>         h->nr_samples++;
>         h->addr[offset].nr_samples++;
>         h->period += sample->period;
>         h->addr[offset].period += sample->period;
> +       entry->nr_samples++;
> +       entry->period += sample->period;
>
>         pr_debug3("%#" PRIx64 " %s: period++ [addr: %#" PRIx64 ", %#" PRIx64
>                   ", evidx=%d] => nr_samples: %" PRIu64 ", period: %" PRIu64 "\n",
>                   sym->start, sym->name, addr, addr - sym->start, evidx,
> -                 h->addr[offset].nr_samples, h->addr[offset].period);
> +                 entry->nr_samples, entry->period);
>         return 0;
>  }
>
> diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
> index 94435607c958..a2b0c8210740 100644
> --- a/tools/perf/util/annotate.h
> +++ b/tools/perf/util/annotate.h
> @@ -12,6 +12,7 @@
>  #include "symbol_conf.h"
>  #include "mutex.h"
>  #include "spark.h"
> +#include "hashmap.h"

nit: This could just be a forward reference to keep the number of
header files down.

Thanks,
Ian

>
>  struct hist_browser_timer;
>  struct hist_entry;
> @@ -280,6 +281,7 @@ struct annotated_source {
>         size_t sizeof_sym_hist;
>         struct sym_hist *histograms;
>         struct annotation_line **offsets;
> +       struct hashmap *samples;
>         int nr_histograms;
>         int nr_entries;
>         int nr_asm_entries;
> --
> 2.44.0.rc1.240.g4c46232300-goog
>
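The forward reference suggested in the nit could look roughly like the sketch below. It is only an illustration: `demo_annotated_source` and `demo_source_has_samples` are hypothetical names, and the struct is trimmed down to a few of the members shown in the hunk. The point is that a header which only stores a `struct hashmap *` can declare the type without including its definition.

```c
#include <stddef.h>

/* Forward declaration: says the type exists without pulling in its
 * definition. That is enough to declare pointers to it. */
struct hashmap;

/* Hypothetical, trimmed-down stand-in for the struct in the patch. */
struct demo_annotated_source {
	size_t sizeof_sym_hist;
	struct hashmap *samples;	/* pointer to an incomplete type is fine */
	int nr_histograms;
};

/* Testing the pointer needs no knowledge of struct hashmap's layout. */
static int demo_source_has_samples(const struct demo_annotated_source *src)
{
	return src->samples != NULL;
}
```

Only the .c files that actually call into the hash map would then need the full header, which is the reduction in header dependencies the nit is asking for.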
On Tue, Feb 27, 2024 at 4:52 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> This is another series of memory optimizations in perf annotate.
>
> When perf annotate (or perf report/top with TUI) processes samples, it
> needs to save the sample period (overhead) at instruction level. For
> now, it allocates an array to do that for the whole symbol when it
> hits any new symbol. This comes with a lot of waste since samples can
> be very few and instructions span multiple bytes.
>
> For example, when a sample hits symbol 'foo' that has a size of 100 and
> that's the only sample that falls into the symbol, it needs to
> allocate a symbol histogram (sym_hist) and its size would be
>
>   16 (header) + 16 (sym_hist_entry) * 100 (symbol_size) = 1616
>
> But actually it just needs 32 (header + sym_hist_entry) bytes. Things
> get worse if the symbol size is bigger (and it doesn't have many
> samples in different places). Also note that it needs a separate
> histogram for each event.
>
> Let's split the sym_hist_entry and have it in a hash table so that it
> can allocate only necessary entries.
>
> No functional change intended.
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (4):
>   perf annotate: Add a hashmap for symbol histogram
>   perf annotate: Calculate instruction overhead using hashmap
>   perf annotate: Remove sym_hist.addr[] array
>   perf annotate: Add comments in the data structures

Reviewed-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

>  tools/perf/ui/gtk/annotate.c |  14 ++++-
>  tools/perf/util/annotate.c   | 114 ++++++++++++++++++++++-------------
>  tools/perf/util/annotate.h   |  86 +++++++++++++++++++++++---
>  3 files changed, 158 insertions(+), 56 deletions(-)
>
> --
> 2.44.0.rc1.240.g4c46232300-goog
>
On Tue, Feb 27, 2024 at 04:52:26PM -0800, Namhyung Kim wrote:
> This is another series of memory optimizations in perf annotate.

> When perf annotate (or perf report/top with TUI) processes samples, it
> needs to save the sample period (overhead) at instruction level. For
> now, it allocates an array to do that for the whole symbol when it
> hits any new symbol. This comes with a lot of waste since samples can
> be very few and instructions span multiple bytes.

> For example, when a sample hits symbol 'foo' that has a size of 100 and
> that's the only sample that falls into the symbol, it needs to
> allocate a symbol histogram (sym_hist) and its size would be

>   16 (header) + 16 (sym_hist_entry) * 100 (symbol_size) = 1616

> But actually it just needs 32 (header + sym_hist_entry) bytes. Things
> get worse if the symbol size is bigger (and it doesn't have many
> samples in different places). Also note that it needs a separate
> histogram for each event.

> Let's split the sym_hist_entry and have it in a hash table so that it
> can allocate only necessary entries.

> No functional change intended.

I tried this before/after this series:

  $ time perf annotate --stdio2 -i perf.data.annotate

For:

  perf record -e '{cycles,instructions,cache-misses}' make -k CORESIGHT=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

And found these odd cases:

  $ diff -u before after
  --- before  2024-02-28 15:38:25.086062812 -0300
  +++ after   2024-02-29 14:12:05.606652725 -0300
  @@ -2450826,7 +2450826,7 @@
                             ↓ je       1c62
                             → call     operator delete(void*)@plt
                             { return _M_dataplus._M_p; }
  -                    1c62:   mov      0x13c0(%rsp),%rdi
  +  0.00  0.00 100.00 1c62:   mov      0x13c0(%rsp),%rdi
                             if (_M_data() == _M_local_data())
                               lea      0x13d0(%rsp),%rax
                               cmp      %rax,%rdi
  @@ -2470648,7 +2470648,7 @@
                               mov      %rbx,%rdi
                             → call     operator delete(void*)@plt
                             using reference = T &;
  -  0.00  0.00 100.00 11c65:  mov      0x8(%r12),%rax
  +                    11c65:  mov      0x8(%r12),%rax
                             size_t size() const { return Size; }
                               mov      0x10(%r12),%ecx
                               mov      %rax,%rbp
  $

This is a large function:

Samples: 574K of events 'anon group { cpu_core/cycles/u, cpu_core/instructions/u, cpu_core/cache-misses/u }', 4000 Hz, Event count (approx.): 614695751751, [percent: local period]$
clang::CompilerInvocation::ParseCodeGenArgs(clang::CodeGenOptions&, llvm::opt::ArgList&, clang::InputKind, clang::DiagnosticsEngine&, llvm::Triple const&, std::__cxx11::basic_string<char, std
Percent

Probably when building the BPF skels in tools/perf/

- Arnaldo
Hi Arnaldo,

On Mon, Mar 4, 2024 at 5:58 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Tue, Feb 27, 2024 at 04:52:26PM -0800, Namhyung Kim wrote:
> > This is another series of memory optimizations in perf annotate.
>
> > When perf annotate (or perf report/top with TUI) processes samples, it
> > needs to save the sample period (overhead) at instruction level. For
> > now, it allocates an array to do that for the whole symbol when it
> > hits any new symbol. This comes with a lot of waste since samples can
> > be very few and instructions span multiple bytes.
>
> > For example, when a sample hits symbol 'foo' that has a size of 100 and
> > that's the only sample that falls into the symbol, it needs to
> > allocate a symbol histogram (sym_hist) and its size would be
>
> >   16 (header) + 16 (sym_hist_entry) * 100 (symbol_size) = 1616
>
> > But actually it just needs 32 (header + sym_hist_entry) bytes. Things
> > get worse if the symbol size is bigger (and it doesn't have many
> > samples in different places). Also note that it needs a separate
> > histogram for each event.
>
> > Let's split the sym_hist_entry and have it in a hash table so that it
> > can allocate only necessary entries.
>
> > No functional change intended.
>
> I tried this before/after this series:
>
>   $ time perf annotate --stdio2 -i perf.data.annotate
>
> For:
>
>   perf record -e '{cycles,instructions,cache-misses}' make -k CORESIGHT=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin
>
> And found these odd cases:
>
>   $ diff -u before after
>   --- before  2024-02-28 15:38:25.086062812 -0300
>   +++ after   2024-02-29 14:12:05.606652725 -0300
>   @@ -2450826,7 +2450826,7 @@
>                              ↓ je       1c62
>                              → call     operator delete(void*)@plt
>                              { return _M_dataplus._M_p; }
>   -                    1c62:   mov      0x13c0(%rsp),%rdi
>   +  0.00  0.00 100.00 1c62:   mov      0x13c0(%rsp),%rdi
>                              if (_M_data() == _M_local_data())
>                                lea      0x13d0(%rsp),%rax
>                                cmp      %rax,%rdi
>   @@ -2470648,7 +2470648,7 @@
>                                mov      %rbx,%rdi
>                              → call     operator delete(void*)@plt
>                              using reference = T &;
>   -  0.00  0.00 100.00 11c65:  mov      0x8(%r12),%rax
>   +                    11c65:  mov      0x8(%r12),%rax
>                              size_t size() const { return Size; }
>                                mov      0x10(%r12),%ecx
>                                mov      %rax,%rbp
>   $
>
>
> This is a large function:

Thanks for the test! I think it missed the cast to 64-bit somewhere.
I'll check and send v2 soon.

Thanks,
Namhyung

>
> Samples: 574K of events 'anon group { cpu_core/cycles/u, cpu_core/instructions/u, cpu_core/cache-misses/u }', 4000 Hz, Event count (approx.): 614695751751, [percent: local period]$
> clang::CompilerInvocation::ParseCodeGenArgs(clang::CodeGenOptions&, llvm::opt::ArgList&, clang::InputKind, clang::DiagnosticsEngine&, llvm::Triple const&, std::__cxx11::basic_string<char, std
> Percent
>
> Probably when building the BPF skels in tools/perf/
>
> - Arnaldo
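A minimal sketch of how a missing 64-bit cast could produce the symptom in the diff, assuming the key is built as `offset << 16 | evidx` from a 32-bit `offset`, as in the quoted patch 1/4. The function names are hypothetical, and fixed-width types stand in for `unsigned` and `long`. Whether this is the actual root cause, and what v2 changes, is not confirmed anywhere in the thread.

```c
#include <stdint.h>

/* Mirrors the expression from the quoted patch: the shift is done in
 * 32 bits, so offset bits 16 and above are lost before widening. */
static int64_t key_shift_32bit(uint32_t offset, int evidx)
{
	return (int64_t)((offset << 16) | (uint32_t)evidx);
}

/* Widen first, then shift: the whole offset survives in the key. */
static int64_t key_shift_64bit(uint32_t offset, int evidx)
{
	return ((int64_t)offset << 16) | evidx;
}

/* Recover the offset part of a packed key. */
static int64_t key_to_offset(int64_t key)
{
	return key >> 16;
}
```

Under these assumptions, a sample at offset 0x11c65 packed with the 32-bit shift decodes back to offset 0x1c65, which lies a few bytes past 1c62. That would fit the report, where the 100.00 moves from the line at 11c65 to the line at 1c62, if 0x1c65 falls inside that instruction. Offsets below 0x10000 produce identical keys either way, which may be why the problem only showed up in a large function.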