Message ID | ZMl8VyhdwhClTM5g@kernel.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp48153vqx; Tue, 1 Aug 2023 15:20:28 -0700 (PDT) X-Google-Smtp-Source: APBJJlGwuK8UNwBe79S/e5cuWEOd6EwmaDOKbNBHenYmhwaI1R8Qt8VZZHPdXmUlNDYFTv15R7hf X-Received: by 2002:a05:6402:517a:b0:522:582c:f427 with SMTP id d26-20020a056402517a00b00522582cf427mr5050767ede.14.1690928428646; Tue, 01 Aug 2023 15:20:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690928428; cv=none; d=google.com; s=arc-20160816; b=x2vFHjyjfph1bCvt13mb8wj3pc01NvVMKPIDbijyaY0N86zExCwjWfNhJbH9Eidiaw nxO+2bKtuGbXR4DnvpA4oNETSiOOecpejHvavVLoVyOpucbgQE4xwXJXvkPxE9pdqMBY WdX31obZU7892dXqI3Wn/TO2lXveJAXfi3Je5VgcGA2SZHy96a/MijEn0vJb0rptbbNb kDt7XjEhcmLxVGpbpnJ/I8LYCRuHhmC6OvK2BPc6qM3ZdsShQrc8fN+PoOrj8OKwsrtd EJNh/ovZQhb3jFIsWYZIJ5Uo2KKl67Ixih4zHzrdJp2HuCwStQ+IebWqfvaXJXAXADII Azlg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-disposition:mime-version:message-id :subject:cc:to:from:date:dkim-signature; bh=1kh9MnhN05RFd1xEkeOdrRQV2KTk2zxEO4IQE9HDNVs=; fh=LxBo9/j1bUDqR37YoE8t2g6WeB5RhX/4atgcZeM8BIs=; b=u5xxj658l/NrUpbwkFeoIOlAbUDR/7LNV9hPXNmcKe1ayA+BH2Wcvy3K7ahxYrG071 5Nt269sxvv+Onzk1Luh6++UXSWsNwC4GekUN60xVcsrHqEjlu9khete/hlHP2rnEf9c7 UVRYil2MJ3Njeh+kFQJiIWWx77yWgF5g59xMUXp4JM7v5nJ66Xf4CJdwPHErqW8O4gsq Q4xMCVIuu0D7BgIHVAls9VoEW1NP+GfHmL3xXv9yfzZfOCrGliimk5kIh2qPK9yyBLmk cHomcf5VcHTqCwp+IGPpqMhFFxaeek2I5HHDhEy0ocCI0CVwZvQaQknLl80B9vfog39B YvUA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=rP1HGRkj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d25-20020aa7ce19000000b0051e0f8b1699si6006579edv.185.2023.08.01.15.20.05; Tue, 01 Aug 2023 15:20:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=rP1HGRkj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232574AbjHAVmy (ORCPT <rfc822;maxi.paulin@gmail.com> + 99 others); Tue, 1 Aug 2023 17:42:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52782 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231351AbjHAVmw (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 1 Aug 2023 17:42:52 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 553D31FDA for <linux-kernel@vger.kernel.org>; Tue, 1 Aug 2023 14:42:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DDB9B6162C for <linux-kernel@vger.kernel.org>; Tue, 1 Aug 2023 21:42:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1E137C433C8; Tue, 1 Aug 2023 21:42:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1690926170; bh=sVMbvlwuoJ5vjWfQinZxmH1t6ndRFcWLp4U0FAFY/IQ=; h=Date:From:To:Cc:Subject:From; b=rP1HGRkjQ+WQQb1XzDiMaR8nmI6EwrsJTQmDhvUDiiObY/3S6uj4JTBz0qHwMWNIc jzQk31jBnJ7bKmDS7tfRTBc9udiTQGCvEEsBOvcuZH9ZEkjkScthmhKxFnOjnumtSn DI70RrO1DSF6oJcxx9rB0YUj9tjWtBOm/F7EqBhvI2CtNOiwke0I55FeFlHlle5H1n gKlMD52VO+/yDSSYGnLWKEl3mnuiDFcctvWb7gK5GqMxRSxEr5GCzpuxqZDiuN+rcq GkPdhwO/pQlwGtnglNInT8sbFIRJWd1zWQfy1w3YtyjRZcOYTt4OVpX1B+xWq7KmBq frETTWL2hhV3Q== Received: by quaco.ghostprotocols.net (Postfix, from userid 1000) id 662344012D; Tue, 1 Aug 2023 18:42:47 -0300 (-03) Date: Tue, 1 Aug 2023 18:42:47 -0300 From: Arnaldo Carvalho de Melo <acme@kernel.org> To: Artem Savkov <asavkov@redhat.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>, Namhyung Kim <namhyung@kernel.org>, Adrian Hunter <adrian.hunter@intel.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Ian Rogers <irogers@google.com>, Ingo Molnar <mingo@redhat.com>, Jiri Olsa <jolsa@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Masami Hiramatsu <mhiramat@kernel.org>, Milian Wolff <milian.wolff@kdab.com>, Peter Zijlstra <peterz@infradead.org>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Subject: [PATCH 1/1] Revert "perf report: Append inlines to non-DWARF callchains" Message-ID: <ZMl8VyhdwhClTM5g@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773066968348839506 X-GMAIL-MSGID: 1773066968348839506 |
Series |
[1/1] Revert "perf report: Append inlines to non-DWARF callchains"
|
|
Commit Message
Arnaldo Carvalho de Melo
Aug. 1, 2023, 9:42 p.m. UTC
Hi Artem,
Can you please double check this? I reproduced with:
git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e
build it
perf record -a -g sleep 5s
perf report
Do you get the same slowness and then reverting it, i.e. just
going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e.
without the inlines in the callchains?
- Arnaldo
----
This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e.
The tests were made with a specific workload, further tests on a
recently updated fedora 38 system with a system wide perf.data file
shows 'perf report' taking excessive time, so lets revert this until a
full investigation and improvement on the addr2line support code is
made.
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/machine.c | 5 -----
1 file changed, 5 deletions(-)
Comments
Hi Arnaldo, On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: > Hi Artem, > > Can you please double check this? I reproduced with: > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e > build it > perf record -a -g sleep 5s > perf report > > Do you get the same slowness and then reverting it, i.e. just > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. > without the inlines in the callchains? With a simple test like this I definitely get a slowdown, but not sure if it can be called excessive. Below are the times I got by running 'time perf report' and hitting 'q' during load so that it quits as soon as it is loads up. Tested on a freshly updated fedora 38. For 'perf record -a -g sleep 60' (Event count (approx.): 774055090): with inlines: $ time ./perf report real 0m1.477s user 0m1.324s sys 0m0.147s without inlines: $ time ./perf report real 0m1.349s user 0m1.232s sys 0m0.111s For 'perf record -a -g sleep 5' (Event count (approx.): 90179399): with inlines: $ time ./perf report real 0m0.657s user 0m0.555s sys 0m0.099s without inlines: $ time ./perf report real 0m0.559s user 0m0.498s sys 0m0.060s > - Arnaldo > > ---- > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. > > The tests were made with a specific workload, further tests on a > recently updated fedora 38 system with a system wide perf.data file > shows 'perf report' taking excessive time, so lets revert this until a > full investigation and improvement on the addr2line support code is > made. > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > Cc: Artem Savkov <asavkov@redhat.com> > Cc: Namhyung Kim <namhyung@kernel.org> > Cc: Adrian Hunter <adrian.hunter@intel.com> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > Cc: Ian Rogers <irogers@google.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Jiri Olsa <jolsa@kernel.org> > Cc: Mark Rutland <mark.rutland@arm.com> > Cc: Masami Hiramatsu <mhiramat@kernel.org> > Cc: Milian Wolff <milian.wolff@kdab.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > --- > tools/perf/util/machine.c | 5 ----- > 1 file changed, 5 deletions(-) > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 > --- a/tools/perf/util/machine.c > +++ b/tools/perf/util/machine.c > @@ -45,7 +45,6 @@ > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, > struct thread *th, bool lock); > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); > > static struct dso *machine__kernel_dso(struct machine *machine) > { > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, > ms.maps = maps__get(al.maps); > ms.map = map__get(al.map); > ms.sym = al.sym; > - > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > - goto out; > - > srcline = callchain_srcline(&ms, al.addr); > err = callchain_cursor_append(cursor, ip, &ms, > branch, flags, nr_loop_iter, > -- > 2.41.0 >
On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote: > Hi Arnaldo, > > On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: > > Hi Artem, > > > > Can you please double check this? I reproduced with: > > > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e > > build it > > perf record -a -g sleep 5s > > perf report > > > > Do you get the same slowness and then reverting it, i.e. just > > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. > > without the inlines in the callchains? > > With a simple test like this I definitely get a slowdown, but not sure > if it can be called excessive. > > Below are the times I got by running 'time perf report' and hitting 'q' > during load so that it quits as soon as it is loads up. Tested on a > freshly updated fedora 38. My bad, I had wrong debuginfo installed for the kernel I tested. I can reproduce it with the correct one. Looks like vmlinux is just too much for addr2line. Maybe we can skip it but leave other inlines in, like so: diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 11de3ca8d4fa7..fef309cd401f7 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread, ms.map = map__get(al.map); ms.sym = al.sym; - if (!branch && append_inlines(cursor, &ms, ip) == 0) + if (!branch && ms.map && ms.map->dso && + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") && + append_inlines(cursor, &ms, ip) == 0) goto out; srcline = callchain_srcline(&ms, al.addr); > > - Arnaldo > > > > ---- > > > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. > > > > The tests were made with a specific workload, further tests on a > > recently updated fedora 38 system with a system wide perf.data file > > shows 'perf report' taking excessive time, so lets revert this until a > > full investigation and improvement on the addr2line support code is > > made. > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > Cc: Artem Savkov <asavkov@redhat.com> > > Cc: Namhyung Kim <namhyung@kernel.org> > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > > Cc: Ian Rogers <irogers@google.com> > > Cc: Ingo Molnar <mingo@redhat.com> > > Cc: Jiri Olsa <jolsa@kernel.org> > > Cc: Mark Rutland <mark.rutland@arm.com> > > Cc: Masami Hiramatsu <mhiramat@kernel.org> > > Cc: Milian Wolff <milian.wolff@kdab.com> > > Cc: Peter Zijlstra <peterz@infradead.org> > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > --- > > tools/perf/util/machine.c | 5 ----- > > 1 file changed, 5 deletions(-) > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -45,7 +45,6 @@ > > > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, > > struct thread *th, bool lock); > > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); > > > > static struct dso *machine__kernel_dso(struct machine *machine) > > { > > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, > > ms.maps = maps__get(al.maps); > > ms.map = map__get(al.map); > > ms.sym = al.sym; > > - > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > - goto out; > > - > > srcline = callchain_srcline(&ms, al.addr); > > err = callchain_cursor_append(cursor, ip, &ms, > > branch, flags, nr_loop_iter, > > -- > > 2.41.0 > > > > -- > Artem
Em Mon, Aug 07, 2023 at 01:00:08PM +0200, Artem Savkov escreveu: > On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote: > > Hi Arnaldo, > > > > On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > Hi Artem, > > > > > > Can you please double check this? I reproduced with: > > > > > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e > > > build it > > > perf record -a -g sleep 5s > > > perf report > > > > > > Do you get the same slowness and then reverting it, i.e. just > > > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. > > > without the inlines in the callchains? > > > > With a simple test like this I definitely get a slowdown, but not sure > > if it can be called excessive. > > > > Below are the times I got by running 'time perf report' and hitting 'q' > > during load so that it quits as soon as it is loads up. Tested on a > > freshly updated fedora 38. > > My bad, I had wrong debuginfo installed for the kernel I tested. I can > reproduce it with the correct one. Looks like vmlinux is just too much > for addr2line. Maybe we can skip it but leave other inlines in, like so: That is a possibilit, and probably we could make it cheaper by looking at the cpumode, avoiding calling addr2line when we didn't makage to resolve the symbol, etc. We also may want to have this as an option that has to be explicitely enabled, like --resolve-inlines, as this will add overhead no matter if we stop calling addr2line and do it more efficiently, etc. Fact is, we're late in the 6.5 schedule, so the best thing now is to just revert the patch and then try again later, ok? - Arnaldo > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > index 11de3ca8d4fa7..fef309cd401f7 100644 > --- a/tools/perf/util/machine.c > +++ b/tools/perf/util/machine.c > @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread, > ms.map = map__get(al.map); > ms.sym = al.sym; > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > + if (!branch && ms.map && ms.map->dso && > + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") && > + append_inlines(cursor, &ms, ip) == 0) > goto out; > > srcline = callchain_srcline(&ms, al.addr); > > > > - Arnaldo > > > > > > ---- > > > > > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. > > > > > > The tests were made with a specific workload, further tests on a > > > recently updated fedora 38 system with a system wide perf.data file > > > shows 'perf report' taking excessive time, so lets revert this until a > > > full investigation and improvement on the addr2line support code is > > > made. > > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > Cc: Artem Savkov <asavkov@redhat.com> > > > Cc: Namhyung Kim <namhyung@kernel.org> > > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > > > Cc: Ian Rogers <irogers@google.com> > > > Cc: Ingo Molnar <mingo@redhat.com> > > > Cc: Jiri Olsa <jolsa@kernel.org> > > > Cc: Mark Rutland <mark.rutland@arm.com> > > > Cc: Masami Hiramatsu <mhiramat@kernel.org> > > > Cc: Milian Wolff <milian.wolff@kdab.com> > > > Cc: Peter Zijlstra <peterz@infradead.org> > > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > > --- > > > tools/perf/util/machine.c | 5 ----- > > > 1 file changed, 5 deletions(-) > > > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 > > > --- a/tools/perf/util/machine.c > > > +++ b/tools/perf/util/machine.c > > > @@ -45,7 +45,6 @@ > > > > > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, > > > struct thread *th, bool lock); > > > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); > > > > > > static struct dso *machine__kernel_dso(struct machine *machine) > > > { > > > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, > > > ms.maps = maps__get(al.maps); > > > ms.map = map__get(al.map); > > > ms.sym = al.sym; > > > - > > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > > - goto out; > > > - > > > srcline = callchain_srcline(&ms, al.addr); > > > err = callchain_cursor_append(cursor, ip, &ms, > > > branch, flags, nr_loop_iter, > > > -- > > > 2.41.0 > > > > > > > -- > > Artem > > -- > Artem >
On Mon, Aug 07, 2023 at 10:34:44AM -0300, Arnaldo Carvalho de Melo wrote: > Em Mon, Aug 07, 2023 at 01:00:08PM +0200, Artem Savkov escreveu: > > On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote: > > > Hi Arnaldo, > > > > > > On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > > Hi Artem, > > > > > > > > Can you please double check this? I reproduced with: > > > > > > > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e > > > > build it > > > > perf record -a -g sleep 5s > > > > perf report > > > > > > > > Do you get the same slowness and then reverting it, i.e. just > > > > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. > > > > without the inlines in the callchains? > > > > > > With a simple test like this I definitely get a slowdown, but not sure > > > if it can be called excessive. > > > > > > Below are the times I got by running 'time perf report' and hitting 'q' > > > during load so that it quits as soon as it is loads up. Tested on a > > > freshly updated fedora 38. > > > > My bad, I had wrong debuginfo installed for the kernel I tested. I can > > reproduce it with the correct one. Looks like vmlinux is just too much > > for addr2line. Maybe we can skip it but leave other inlines in, like so: > > That is a possibilit, and probably we could make it cheaper by looking > at the cpumode, avoiding calling addr2line when we didn't makage to > resolve the symbol, etc. > > We also may want to have this as an option that has to be explicitely > enabled, like --resolve-inlines, as this will add overhead no matter if > we stop calling addr2line and do it more efficiently, etc. Sounds good, I'll look into it. > Fact is, we're late in the 6.5 schedule, so the best thing now is to > just revert the patch and then try again later, ok? Yes, sure. > - Arnaldo > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > index 11de3ca8d4fa7..fef309cd401f7 100644 > > --- a/tools/perf/util/machine.c > > +++ b/tools/perf/util/machine.c > > @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread, > > ms.map = map__get(al.map); > > ms.sym = al.sym; > > > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > + if (!branch && ms.map && ms.map->dso && > > + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") && > > + append_inlines(cursor, &ms, ip) == 0) > > goto out; > > > > srcline = callchain_srcline(&ms, al.addr); > > > > > > - Arnaldo > > > > > > > > ---- > > > > > > > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. > > > > > > > > The tests were made with a specific workload, further tests on a > > > > recently updated fedora 38 system with a system wide perf.data file > > > > shows 'perf report' taking excessive time, so lets revert this until a > > > > full investigation and improvement on the addr2line support code is > > > > made. > > > > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > > Cc: Artem Savkov <asavkov@redhat.com> > > > > Cc: Namhyung Kim <namhyung@kernel.org> > > > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > > > > Cc: Ian Rogers <irogers@google.com> > > > > Cc: Ingo Molnar <mingo@redhat.com> > > > > Cc: Jiri Olsa <jolsa@kernel.org> > > > > Cc: Mark Rutland <mark.rutland@arm.com> > > > > Cc: Masami Hiramatsu <mhiramat@kernel.org> > > > > Cc: Milian Wolff <milian.wolff@kdab.com> > > > > Cc: Peter Zijlstra <peterz@infradead.org> > > > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > > > --- > > > > tools/perf/util/machine.c | 5 ----- > > > > 1 file changed, 5 deletions(-) > > > > > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > > > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 > > > > --- a/tools/perf/util/machine.c > > > > +++ b/tools/perf/util/machine.c > > > > @@ -45,7 +45,6 @@ > > > > > > > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, > > > > struct thread *th, bool lock); > > > > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); > > > > > > > > static struct dso *machine__kernel_dso(struct machine *machine) > > > > { > > > > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, > > > > ms.maps = maps__get(al.maps); > > > > ms.map = map__get(al.map); > > > > ms.sym = al.sym; > > > > - > > > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > > > - goto out; > > > > - > > > > srcline = callchain_srcline(&ms, al.addr); > > > > err = callchain_cursor_append(cursor, ip, &ms, > > > > branch, flags, nr_loop_iter, > > > > -- > > > > 2.41.0 > > > > > > > > > > -- > > > Artem > > > > -- > > Artem > > > > -- > > - Arnaldo >
On 07/08/2023 16.03, Artem Savkov wrote: > On Mon, Aug 07, 2023 at 10:34:44AM -0300, Arnaldo Carvalho de Melo wrote: >> Em Mon, Aug 07, 2023 at 01:00:08PM +0200, Artem Savkov escreveu: >>> On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote: >>>> Hi Arnaldo, >>>> >>>> On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: >>>>> Hi Artem, >>>>> >>>>> Can you please double check this? I reproduced with: >>>>> >>>>> git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e >>>>> build it >>>>> perf record -a -g sleep 5s >>>>> perf report >>>>> >>>>> Do you get the same slowness and then reverting it, i.e. just >>>>> going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. >>>>> without the inlines in the callchains? >>>> >>>> With a simple test like this I definitely get a slowdown, but not sure >>>> if it can be called excessive. >>>> >>>> Below are the times I got by running 'time perf report' and hitting 'q' >>>> during load so that it quits as soon as it is loads up. Tested on a >>>> freshly updated fedora 38. >>> I reported this problem to ACME. It is also possible to reproduce without hitting 'q' via using this cmdline with --stdio like this: time perf report -v --stdio > /dev/null 2> debug01.stderr The file 'debug01.stderr' contained a lot of addr2line output, that might help debug the issue further. >>> My bad, I had wrong debuginfo installed for the kernel I tested. I can >>> reproduce it with the correct one. Looks like vmlinux is just too much >>> for addr2line. Maybe we can skip it but leave other inlines in, like so: >> >> That is a possibilit, and probably we could make it cheaper by looking >> at the cpumode, avoiding calling addr2line when we didn't makage to >> resolve the symbol, etc. >> >> We also may want to have this as an option that has to be explicitely >> enabled, like --resolve-inlines, as this will add overhead no matter if >> we stop calling addr2line and do it more efficiently, etc. > > Sounds good, I'll look into it. > >> Fact is, we're late in the 6.5 schedule, so the best thing now is to >> just revert the patch and then try again later, ok? > > Yes, sure. > >> - Arnaldo >> >>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c >>> index 11de3ca8d4fa7..fef309cd401f7 100644 >>> --- a/tools/perf/util/machine.c >>> +++ b/tools/perf/util/machine.c >>> @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread, >>> ms.map = map__get(al.map); >>> ms.sym = al.sym; >>> >>> - if (!branch && append_inlines(cursor, &ms, ip) == 0) >>> + if (!branch && ms.map && ms.map->dso && >>> + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") && >>> + append_inlines(cursor, &ms, ip) == 0) >>> goto out; >>> >>> srcline = callchain_srcline(&ms, al.addr); >>> >>>>> - Arnaldo >>>>> >>>>> ---- >>>>> >>>>> This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. >>>>> >>>>> The tests were made with a specific workload, further tests on a >>>>> recently updated fedora 38 system with a system wide perf.data file >>>>> shows 'perf report' taking excessive time, so lets revert this until a >>>>> full investigation and improvement on the addr2line support code is >>>>> made. >>>>> Reported-by: Jesper Dangaard Brouer <hawk@kernel.org> >>>>> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> >>>>> Cc: Artem Savkov <asavkov@redhat.com> >>>>> Cc: Namhyung Kim <namhyung@kernel.org> >>>>> Cc: Adrian Hunter <adrian.hunter@intel.com> >>>>> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> >>>>> Cc: Ian Rogers <irogers@google.com> >>>>> Cc: Ingo Molnar <mingo@redhat.com> >>>>> Cc: Jiri Olsa <jolsa@kernel.org> >>>>> Cc: Mark Rutland <mark.rutland@arm.com> >>>>> Cc: Masami Hiramatsu <mhiramat@kernel.org> >>>>> Cc: Milian Wolff <milian.wolff@kdab.com> >>>>> Cc: Peter Zijlstra <peterz@infradead.org> >>>>> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> >>>>> --- >>>>> tools/perf/util/machine.c | 5 ----- >>>>> 1 file changed, 5 deletions(-) >>>>> >>>>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c >>>>> index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 >>>>> --- a/tools/perf/util/machine.c >>>>> +++ b/tools/perf/util/machine.c >>>>> @@ -45,7 +45,6 @@ >>>>> >>>>> static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, >>>>> struct thread *th, bool lock); >>>>> -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); >>>>> >>>>> static struct dso *machine__kernel_dso(struct machine *machine) >>>>> { >>>>> @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, >>>>> ms.maps = maps__get(al.maps); >>>>> ms.map = map__get(al.map); >>>>> ms.sym = al.sym; >>>>> - >>>>> - if (!branch && append_inlines(cursor, &ms, ip) == 0) >>>>> - goto out; >>>>> - >>>>> srcline = callchain_srcline(&ms, al.addr); >>>>> err = callchain_cursor_append(cursor, ip, &ms, >>>>> branch, flags, nr_loop_iter, Tested-by: Jesper Dangaard Brouer <hawk@kernel.org> --Jesper
Em Tue, Aug 08, 2023 at 03:38:37PM +0200, Jesper Dangaard Brouer escreveu: > > > On 07/08/2023 16.03, Artem Savkov wrote: > > On Mon, Aug 07, 2023 at 10:34:44AM -0300, Arnaldo Carvalho de Melo wrote: > > > Em Mon, Aug 07, 2023 at 01:00:08PM +0200, Artem Savkov escreveu: > > > > On Wed, Aug 02, 2023 at 09:43:40AM +0200, Artem Savkov wrote: > > > > > Hi Arnaldo, > > > > > > > > > > On Tue, Aug 01, 2023 at 06:42:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > > > > Hi Artem, > > > > > > > > > > > > Can you please double check this? I reproduced with: > > > > > > > > > > > > git checkout 46d21ec067490ab9cdcc89b9de5aae28786a8b8e > > > > > > build it > > > > > > perf record -a -g sleep 5s > > > > > > perf report > > > > > > > > > > > > Do you get the same slowness and then reverting it, i.e. just > > > > > > going to HEAD~ and rebuilding getting a fast 'perf report' startup, i.e. > > > > > > without the inlines in the callchains? > > > > > > > > > > With a simple test like this I definitely get a slowdown, but not sure > > > > > if it can be called excessive. > > > > > > > > > > Below are the times I got by running 'time perf report' and hitting 'q' > > > > > during load so that it quits as soon as it is loads up. Tested on a > > > > > freshly updated fedora 38. > > > > > > I reported this problem to ACME. It is also possible to reproduce > without hitting 'q' via using this cmdline with --stdio like this: > > time perf report -v --stdio > /dev/null 2> debug01.stderr > > The file 'debug01.stderr' contained a lot of addr2line output, that > might help debug the issue further. > > > > > > > My bad, I had wrong debuginfo installed for the kernel I tested. I can > > > > reproduce it with the correct one. Looks like vmlinux is just too much > > > > for addr2line. Maybe we can skip it but leave other inlines in, like so: > > > > > > That is a possibilit, and probably we could make it cheaper by looking > > > at the cpumode, avoiding calling addr2line when we didn't makage to > > > resolve the symbol, etc. > > > > > > We also may want to have this as an option that has to be explicitely > > > enabled, like --resolve-inlines, as this will add overhead no matter if > > > we stop calling addr2line and do it more efficiently, etc. > > > > Sounds good, I'll look into it. > > > > > Fact is, we're late in the 6.5 schedule, so the best thing now is to > > > just revert the patch and then try again later, ok? > > > > Yes, sure. > > > > > - Arnaldo > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > > > index 11de3ca8d4fa7..fef309cd401f7 100644 > > > > --- a/tools/perf/util/machine.c > > > > +++ b/tools/perf/util/machine.c > > > > @@ -2388,7 +2388,9 @@ static int add_callchain_ip(struct thread *thread, > > > > ms.map = map__get(al.map); > > > > ms.sym = al.sym; > > > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > > > + if (!branch && ms.map && ms.map->dso && > > > > + strcmp(ms.map->dso->short_name, "[kernel.vmlinux]") && > > > > + append_inlines(cursor, &ms, ip) == 0) > > > > goto out; > > > > srcline = callchain_srcline(&ms, al.addr); > > > > > > > > > > - Arnaldo > > > > > > > > > > > > ---- > > > > > > > > > > > > This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. > > > > > > > > > > > > The tests were made with a specific workload, further tests on a > > > > > > recently updated fedora 38 system with a system wide perf.data file > > > > > > shows 'perf report' taking excessive time, so lets revert this until a > > > > > > full investigation and improvement on the addr2line support code is > > > > > > made. > > > > > > > > Reported-by: Jesper Dangaard Brouer <hawk@kernel.org> Thanks, I did add that locally, will keep it and add your Tested-by, thanks a lot! - Arnaldo > > > > > > Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> > > > > > > Cc: Artem Savkov <asavkov@redhat.com> > > > > > > Cc: Namhyung Kim <namhyung@kernel.org> > > > > > > Cc: Adrian Hunter <adrian.hunter@intel.com> > > > > > > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> > > > > > > Cc: Ian Rogers <irogers@google.com> > > > > > > Cc: Ingo Molnar <mingo@redhat.com> > > > > > > Cc: Jiri Olsa <jolsa@kernel.org> > > > > > > Cc: Mark Rutland <mark.rutland@arm.com> > > > > > > Cc: Masami Hiramatsu <mhiramat@kernel.org> > > > > > > Cc: Milian Wolff <milian.wolff@kdab.com> > > > > > > Cc: Peter Zijlstra <peterz@infradead.org> > > > > > > Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > > > > > --- > > > > > > tools/perf/util/machine.c | 5 ----- > > > > > > 1 file changed, 5 deletions(-) > > > > > > > > > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > > > > > index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 > > > > > > --- a/tools/perf/util/machine.c > > > > > > +++ b/tools/perf/util/machine.c > > > > > > @@ -45,7 +45,6 @@ > > > > > > static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, > > > > > > struct thread *th, bool lock); > > > > > > -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); > > > > > > static struct dso *machine__kernel_dso(struct machine *machine) > > > > > > { > > > > > > @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, > > > > > > ms.maps = maps__get(al.maps); > > > > > > ms.map = map__get(al.map); > > > > > > ms.sym = al.sym; > > > > > > - > > > > > > - if (!branch && append_inlines(cursor, &ms, ip) == 0) > > > > > > - goto out; > > > > > > - > > > > > > srcline = callchain_srcline(&ms, al.addr); > > > > > > err = callchain_cursor_append(cursor, ip, &ms, > > > > > > branch, flags, nr_loop_iter, > > Tested-by: Jesper Dangaard Brouer <hawk@kernel.org> > > --Jesper
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c index 4e62843d51b7dbf9..f4cb41ee23cdbcfc 100644 --- a/tools/perf/util/machine.c +++ b/tools/perf/util/machine.c @@ -45,7 +45,6 @@ static void __machine__remove_thread(struct machine *machine, struct thread_rb_node *nd, struct thread *th, bool lock); -static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms, u64 ip); static struct dso *machine__kernel_dso(struct machine *machine) { @@ -2385,10 +2384,6 @@ static int add_callchain_ip(struct thread *thread, ms.maps = maps__get(al.maps); ms.map = map__get(al.map); ms.sym = al.sym; - - if (!branch && append_inlines(cursor, &ms, ip) == 0) - goto out; - srcline = callchain_srcline(&ms, al.addr); err = callchain_cursor_append(cursor, ip, &ms, branch, flags, nr_loop_iter,