Message ID | 20230614151625.2077-1-yangjihong1@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp34819vqr; Wed, 14 Jun 2023 08:33:37 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5Y6g2v0KvSWX1GNZpix5yeqvYYR4nwSWQI8IDDylzS1FYF2rYTSJ1qnluxWxvmi7SG68Sa X-Received: by 2002:a05:6870:5156:b0:17a:ce6b:720 with SMTP id z22-20020a056870515600b0017ace6b0720mr11737770oak.19.1686756817525; Wed, 14 Jun 2023 08:33:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686756817; cv=none; d=google.com; s=arc-20160816; b=RHhE1yYb7B8F9FtFA3TBmSrG+UUANAdCNB9S7HnpDmap71OYiuphXRIOUGc4swWEL5 hsl3kM0kQ3SrgiZnf+BUuclLzsITyKIcON+fHoSAOe9IrAmzBj3luHekRTimg9sN7sHc BwVmCgNMK91gcOfDR/yHQFISG46nzZ0y4lKvIkSfsABrjvpzGil5Kafzl+FAPfYy2BJ+ WPdlAyx1VlAGnvZilClokotsmpJKHWd3sMoxqKGCqrGbroc20BCrJPk7iSO9GE73eyQv yspauqTdGfebEC6wSBt2jHvvZivYkKQx4y67frl5t2EAFfb2zBHBAIC6bWhga5EL9pmY Xhnw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=SbM0ssjKXZUN0Zf2hWE0fJCwE0KK/cwwnnRmi5Y5B4Y=; b=HR+/UNrrcKDw8C3IP3E+hb9HbqiRQw+qQhDMVmAXUnKSPprJauREWtXM8QQkXO29sN vXYqO1618aBYBvWrnjYCvpQQJEn8HEFGSv4mk/XK/ABmBI7gKU1NNTa40H6uXfOSXJDf xZ74JIUCuNXD7WsDWdAA9JpQ6WJ3bWNPJTB1hY8xqFsqm1fY8lWsco59uPNd13ZHtQdV YmfEfS/t2PR9oCNcY0H3jPKO/9ft61Mgemm4qgLVlPRsegvsZzYozZPJVHTUxWOMsq5A CX6/z6jaNUCO/v8Y+6xeLjG/bwiAvJkGJnCOuZ8T1lrNo/RejtAw7UnuqT+3LbY0LRsd TEqQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id mn6-20020a17090b188600b0025c1d114af0si3234314pjb.93.2023.06.14.08.32.47; Wed, 14 Jun 2023 08:33:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343701AbjFNPSV (ORCPT <rfc822;n2h9z4@gmail.com> + 99 others); Wed, 14 Jun 2023 11:18:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56960 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343677AbjFNPSR (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 14 Jun 2023 11:18:17 -0400 Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 50386198; Wed, 14 Jun 2023 08:18:14 -0700 (PDT) Received: from kwepemm600003.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4Qh8BL43ksztQlf; Wed, 14 Jun 2023 23:15:38 +0800 (CST) Received: from localhost.localdomain (10.67.174.95) by kwepemm600003.china.huawei.com (7.193.23.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.23; Wed, 14 Jun 2023 23:18:07 +0800 From: Yang Jihong <yangjihong1@huawei.com> To: <peterz@infradead.org>, <mingo@redhat.com>, <acme@kernel.org>, <mark.rutland@arm.com>, <alexander.shishkin@linux.intel.com>, <jolsa@kernel.org>, <namhyung@kernel.org>, <irogers@google.com>, <adrian.hunter@intel.com>, <kan.liang@linux.intel.com>, <linux-perf-users@vger.kernel.org>, <linux-kernel@vger.kernel.org> CC: <yangjihong1@huawei.com> Subject: [PATCH] perf top & record: Fix segfault when default cycles event is not supported Date: Wed, 14 Jun 2023 15:16:25 +0000 Message-ID: <20230614151625.2077-1-yangjihong1@huawei.com> X-Mailer: git-send-email 2.30.GIT MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.67.174.95] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To kwepemm600003.china.huawei.com (7.193.23.202) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768692717008287221?= X-GMAIL-MSGID: =?utf-8?q?1768692717008287221?= |
Series |
perf top & record: Fix segfault when default cycles event is not supported
|
|
Commit Message
Yang Jihong
June 14, 2023, 3:16 p.m. UTC
The perf-record and perf-top call parse_event() to add a cycles event to
an empty evlist. For the system that does not support hardware cycles
event, such as QEMU, the evlist is empty due to the following code process:
parse_event(evlist, "cycles:P" or ""cycles:Pu")
parse_events(evlist, "cycles:P")
__parse_events
...
ret = parse_events__scanner(str, &parse_state);
// ret = 0
...
ret2 = parse_events__sort_events_and_fix_groups()
if (ret2 < 0)
return ret;
// The cycles event is not supported, here ret2 = -EINVAL,
// Here return 0.
...
evlist__splice_list_tail(evlist)
// The code here does not execute to, so the evlist is still empty.
A null pointer occurs when the content in the evlist is accessed later.
Before:
# perf list hw
List of pre-defined events (to be used in -e or -M):
# perf record true
libperf: Miscounted nr_mmaps 0 vs 1
WARNING: No sample_id_all support, falling back to unordered processing
perf: Segmentation fault
Obtained 1 stack frames.
[0xc5beff]
Segmentation fault
Solution:
If cycles event is not supported, try to fall back to cpu-clock event.
After:
# perf record true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data ]
#
Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
---
tools/perf/builtin-record.c | 4 +---
tools/perf/builtin-top.c | 3 +--
tools/perf/util/evlist.c | 18 ++++++++++++++++++
tools/perf/util/evlist.h | 1 +
4 files changed, 21 insertions(+), 5 deletions(-)
Comments
On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > The perf-record and perf-top call parse_event() to add a cycles event to > an empty evlist. For the system that does not support hardware cycles > event, such as QEMU, the evlist is empty due to the following code process: > > parse_event(evlist, "cycles:P" or ""cycles:Pu") > parse_events(evlist, "cycles:P") > __parse_events > ... > ret = parse_events__scanner(str, &parse_state); > // ret = 0 > ... > ret2 = parse_events__sort_events_and_fix_groups() > if (ret2 < 0) > return ret; > // The cycles event is not supported, here ret2 = -EINVAL, > // Here return 0. > ... > evlist__splice_list_tail(evlist) > // The code here does not execute to, so the evlist is still empty. > > A null pointer occurs when the content in the evlist is accessed later. > > Before: > > # perf list hw > > List of pre-defined events (to be used in -e or -M): > > # perf record true > libperf: Miscounted nr_mmaps 0 vs 1 > WARNING: No sample_id_all support, falling back to unordered processing > perf: Segmentation fault > Obtained 1 stack frames. > [0xc5beff] > Segmentation fault > > Solution: > If cycles event is not supported, try to fall back to cpu-clock event. > > After: > # perf record true > [ perf record: Woken up 1 times to write data ] > [ perf record: Captured and wrote 0.006 MB perf.data ] > # > > Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") > Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Thanks, useful addition. The cpu-clock fall back wasn't present before 7b100989b4f6 so is the fixes tag correct? Wrt segv, I'm beginning to think that we should always forcibly create a core PMU even if we can't find one one in sysfs, my guess is that is what triggers the segv. evlist__add_default doesn't really explain what the function is doing and default can have >1 meaning. Perhaps, evlist__add_cycles. Thanks, Ian > --- > tools/perf/builtin-record.c | 4 +--- > tools/perf/builtin-top.c | 3 +-- > tools/perf/util/evlist.c | 18 ++++++++++++++++++ > tools/perf/util/evlist.h | 1 + > 4 files changed, 21 insertions(+), 5 deletions(-) > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c > index aec18db7ff23..29ae2b84a63a 100644 > --- a/tools/perf/builtin-record.c > +++ b/tools/perf/builtin-record.c > @@ -4161,9 +4161,7 @@ int cmd_record(int argc, const char **argv) > record.opts.tail_synthesize = true; > > if (rec->evlist->core.nr_entries == 0) { > - bool can_profile_kernel = perf_event_paranoid_check(1); > - > - err = parse_event(rec->evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > + err = evlist__add_default(rec->evlist); > if (err) > goto out; > } > diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > index c363c04e16df..798cb9252a5f 100644 > --- a/tools/perf/builtin-top.c > +++ b/tools/perf/builtin-top.c > @@ -1665,8 +1665,7 @@ int cmd_top(int argc, const char **argv) > goto out_delete_evlist; > > if (!top.evlist->core.nr_entries) { > - bool can_profile_kernel = perf_event_paranoid_check(1); > - int err = parse_event(top.evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > + int err = evlist__add_default(top.evlist); > > if (err) > goto out_delete_evlist; > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c > index 7ef43f72098e..60efa762405e 100644 > --- a/tools/perf/util/evlist.c > +++ b/tools/perf/util/evlist.c > @@ -287,6 +287,24 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) > return evsel; > } > > +int evlist__add_default(struct evlist *evlist) > +{ > + bool can_profile_kernel; > + int err; > + > + can_profile_kernel = perf_event_paranoid_check(1); > + err = parse_event(evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > + if (err) > + return err; > + > + if (!evlist->core.nr_entries) { > + pr_debug("The cycles event is not supported, trying to fall back to cpu-clock event\n"); > + return parse_event(evlist, "cpu-clock"); > + } > + > + return 0; > +} > + > #ifdef HAVE_LIBTRACEEVENT > struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide) > { > diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h > index 664c6bf7b3e0..47eea809ee91 100644 > --- a/tools/perf/util/evlist.h > +++ b/tools/perf/util/evlist.h > @@ -116,6 +116,7 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs); > > int evlist__add_dummy(struct evlist *evlist); > struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide); > +int evlist__add_default(struct evlist *evlist); > static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist) > { > return evlist__add_aux_dummy(evlist, true); > -- > 2.30.GIT >
On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: > > On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > > > The perf-record and perf-top call parse_event() to add a cycles event to > > an empty evlist. For the system that does not support hardware cycles > > event, such as QEMU, the evlist is empty due to the following code process: > > > > parse_event(evlist, "cycles:P" or ""cycles:Pu") > > parse_events(evlist, "cycles:P") > > __parse_events > > ... > > ret = parse_events__scanner(str, &parse_state); > > // ret = 0 > > ... > > ret2 = parse_events__sort_events_and_fix_groups() > > if (ret2 < 0) > > return ret; > > // The cycles event is not supported, here ret2 = -EINVAL, > > // Here return 0. > > ... > > evlist__splice_list_tail(evlist) > > // The code here does not execute to, so the evlist is still empty. > > > > A null pointer occurs when the content in the evlist is accessed later. > > > > Before: > > > > # perf list hw > > > > List of pre-defined events (to be used in -e or -M): > > > > # perf record true > > libperf: Miscounted nr_mmaps 0 vs 1 > > WARNING: No sample_id_all support, falling back to unordered processing > > perf: Segmentation fault > > Obtained 1 stack frames. > > [0xc5beff] > > Segmentation fault > > > > Solution: > > If cycles event is not supported, try to fall back to cpu-clock event. > > > > After: > > # perf record true > > [ perf record: Woken up 1 times to write data ] > > [ perf record: Captured and wrote 0.006 MB perf.data ] > > # > > > > Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") > > Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > > Thanks, useful addition. The cpu-clock fall back wasn't present before > 7b100989b4f6 so is the fixes tag correct? Hmm... it should be coming from evsel__fallback: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 so we shouldn't duplicate that logic. The question is why we're not doing the fallback. Thanks, Ian > Wrt segv, I'm beginning to think that we should always forcibly create > a core PMU even if we can't find one one in sysfs, my guess is that is > what triggers the segv. > > evlist__add_default doesn't really explain what the function is doing > and default can have >1 meaning. Perhaps, evlist__add_cycles. > > Thanks, > Ian > > > --- > > tools/perf/builtin-record.c | 4 +--- > > tools/perf/builtin-top.c | 3 +-- > > tools/perf/util/evlist.c | 18 ++++++++++++++++++ > > tools/perf/util/evlist.h | 1 + > > 4 files changed, 21 insertions(+), 5 deletions(-) > > > > diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c > > index aec18db7ff23..29ae2b84a63a 100644 > > --- a/tools/perf/builtin-record.c > > +++ b/tools/perf/builtin-record.c > > @@ -4161,9 +4161,7 @@ int cmd_record(int argc, const char **argv) > > record.opts.tail_synthesize = true; > > > > if (rec->evlist->core.nr_entries == 0) { > > - bool can_profile_kernel = perf_event_paranoid_check(1); > > - > > - err = parse_event(rec->evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > > + err = evlist__add_default(rec->evlist); > > if (err) > > goto out; > > } > > diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c > > index c363c04e16df..798cb9252a5f 100644 > > --- a/tools/perf/builtin-top.c > > +++ b/tools/perf/builtin-top.c > > @@ -1665,8 +1665,7 @@ int cmd_top(int argc, const char **argv) > > goto out_delete_evlist; > > > > if (!top.evlist->core.nr_entries) { > > - bool can_profile_kernel = perf_event_paranoid_check(1); > > - int err = parse_event(top.evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > > + int err = evlist__add_default(top.evlist); > > > > if (err) > > goto out_delete_evlist; > > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c > > index 7ef43f72098e..60efa762405e 100644 > > --- a/tools/perf/util/evlist.c > > +++ b/tools/perf/util/evlist.c > > @@ -287,6 +287,24 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) > > return evsel; > > } > > > > +int evlist__add_default(struct evlist *evlist) > > +{ > > + bool can_profile_kernel; > > + int err; > > + > > + can_profile_kernel = perf_event_paranoid_check(1); > > + err = parse_event(evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); > > + if (err) > > + return err; > > + > > + if (!evlist->core.nr_entries) { > > + pr_debug("The cycles event is not supported, trying to fall back to cpu-clock event\n"); > > + return parse_event(evlist, "cpu-clock"); > > + } > > + > > + return 0; > > +} > > + > > #ifdef HAVE_LIBTRACEEVENT > > struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide) > > { > > diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h > > index 664c6bf7b3e0..47eea809ee91 100644 > > --- a/tools/perf/util/evlist.h > > +++ b/tools/perf/util/evlist.h > > @@ -116,6 +116,7 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs); > > > > int evlist__add_dummy(struct evlist *evlist); > > struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide); > > +int evlist__add_default(struct evlist *evlist); > > static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist) > > { > > return evlist__add_aux_dummy(evlist, true); > > -- > > 2.30.GIT > >
Hello, On 2023/6/15 0:18, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: >> >> The perf-record and perf-top call parse_event() to add a cycles event to >> an empty evlist. For the system that does not support hardware cycles >> event, such as QEMU, the evlist is empty due to the following code process: >> >> parse_event(evlist, "cycles:P" or ""cycles:Pu") >> parse_events(evlist, "cycles:P") >> __parse_events >> ... >> ret = parse_events__scanner(str, &parse_state); >> // ret = 0 >> ... >> ret2 = parse_events__sort_events_and_fix_groups() >> if (ret2 < 0) >> return ret; >> // The cycles event is not supported, here ret2 = -EINVAL, >> // Here return 0. >> ... >> evlist__splice_list_tail(evlist) >> // The code here does not execute to, so the evlist is still empty. >> >> A null pointer occurs when the content in the evlist is accessed later. >> >> Before: >> >> # perf list hw >> >> List of pre-defined events (to be used in -e or -M): >> >> # perf record true >> libperf: Miscounted nr_mmaps 0 vs 1 >> WARNING: No sample_id_all support, falling back to unordered processing >> perf: Segmentation fault >> Obtained 1 stack frames. >> [0xc5beff] >> Segmentation fault >> >> Solution: >> If cycles event is not supported, try to fall back to cpu-clock event. >> >> After: >> # perf record true >> [ perf record: Woken up 1 times to write data ] >> [ perf record: Captured and wrote 0.006 MB perf.data ] >> # >> >> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") >> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > > Thanks, useful addition. The cpu-clock fall back wasn't present before > 7b100989b4f6 so is the fixes tag correct? > Before 7b100989b4f6, perf-record call evlist__add_default() to create an evsel and directly add it to the evlist, it does not search for the corresponding PMU in the sysfs. Therefore, the evlist is not empty before the commit 7b100989b4f6. > Wrt segv, I'm beginning to think that we should always forcibly create > a core PMU even if we can't find one one in sysfs, my guess is that is > what triggers the segv. > Yes, that's the reason. Thanks, Yang. > evlist__add_default doesn't really explain what the function is doing > and default can have >1 meaning. Perhaps, evlist__add_cycles. > > Thanks, > Ian > >> --- >> tools/perf/builtin-record.c | 4 +--- >> tools/perf/builtin-top.c | 3 +-- >> tools/perf/util/evlist.c | 18 ++++++++++++++++++ >> tools/perf/util/evlist.h | 1 + >> 4 files changed, 21 insertions(+), 5 deletions(-) >> >> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c >> index aec18db7ff23..29ae2b84a63a 100644 >> --- a/tools/perf/builtin-record.c >> +++ b/tools/perf/builtin-record.c >> @@ -4161,9 +4161,7 @@ int cmd_record(int argc, const char **argv) >> record.opts.tail_synthesize = true; >> >> if (rec->evlist->core.nr_entries == 0) { >> - bool can_profile_kernel = perf_event_paranoid_check(1); >> - >> - err = parse_event(rec->evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); >> + err = evlist__add_default(rec->evlist); >> if (err) >> goto out; >> } >> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c >> index c363c04e16df..798cb9252a5f 100644 >> --- a/tools/perf/builtin-top.c >> +++ b/tools/perf/builtin-top.c >> @@ -1665,8 +1665,7 @@ int cmd_top(int argc, const char **argv) >> goto out_delete_evlist; >> >> if (!top.evlist->core.nr_entries) { >> - bool can_profile_kernel = perf_event_paranoid_check(1); >> - int err = parse_event(top.evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); >> + int err = evlist__add_default(top.evlist); >> >> if (err) >> goto out_delete_evlist; >> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c >> index 7ef43f72098e..60efa762405e 100644 >> --- a/tools/perf/util/evlist.c >> +++ b/tools/perf/util/evlist.c >> @@ -287,6 +287,24 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) >> return evsel; >> } >> >> +int evlist__add_default(struct evlist *evlist) >> +{ >> + bool can_profile_kernel; >> + int err; >> + >> + can_profile_kernel = perf_event_paranoid_check(1); >> + err = parse_event(evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); >> + if (err) >> + return err; >> + >> + if (!evlist->core.nr_entries) { >> + pr_debug("The cycles event is not supported, trying to fall back to cpu-clock event\n"); >> + return parse_event(evlist, "cpu-clock"); >> + } >> + >> + return 0; >> +} >> + >> #ifdef HAVE_LIBTRACEEVENT >> struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide) >> { >> diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h >> index 664c6bf7b3e0..47eea809ee91 100644 >> --- a/tools/perf/util/evlist.h >> +++ b/tools/perf/util/evlist.h >> @@ -116,6 +116,7 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs); >> >> int evlist__add_dummy(struct evlist *evlist); >> struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide); >> +int evlist__add_default(struct evlist *evlist); >> static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist) >> { >> return evlist__add_aux_dummy(evlist, true); >> -- >> 2.30.GIT >> > . >
Hello, On 2023/6/15 6:03, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: >> >> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: >>> >>> The perf-record and perf-top call parse_event() to add a cycles event to >>> an empty evlist. For the system that does not support hardware cycles >>> event, such as QEMU, the evlist is empty due to the following code process: >>> >>> parse_event(evlist, "cycles:P" or ""cycles:Pu") >>> parse_events(evlist, "cycles:P") >>> __parse_events >>> ... >>> ret = parse_events__scanner(str, &parse_state); >>> // ret = 0 >>> ... >>> ret2 = parse_events__sort_events_and_fix_groups() >>> if (ret2 < 0) >>> return ret; >>> // The cycles event is not supported, here ret2 = -EINVAL, >>> // Here return 0. >>> ... >>> evlist__splice_list_tail(evlist) >>> // The code here does not execute to, so the evlist is still empty. >>> >>> A null pointer occurs when the content in the evlist is accessed later. >>> >>> Before: >>> >>> # perf list hw >>> >>> List of pre-defined events (to be used in -e or -M): >>> >>> # perf record true >>> libperf: Miscounted nr_mmaps 0 vs 1 >>> WARNING: No sample_id_all support, falling back to unordered processing >>> perf: Segmentation fault >>> Obtained 1 stack frames. >>> [0xc5beff] >>> Segmentation fault >>> >>> Solution: >>> If cycles event is not supported, try to fall back to cpu-clock event. >>> >>> After: >>> # perf record true >>> [ perf record: Woken up 1 times to write data ] >>> [ perf record: Captured and wrote 0.006 MB perf.data ] >>> # >>> >>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") >>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> >> >> Thanks, useful addition. The cpu-clock fall back wasn't present before >> 7b100989b4f6 so is the fixes tag correct? > > Hmm... it should be coming from evsel__fallback: > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 > so we shouldn't duplicate that logic. The question is why we're not > doing the fallback. > Yes, it's a bit of the same logic as evsel__fallback, or we can call evlist__add_default() as before, simply create an evsel of hardware cycles and add it directly to evlist. Please confirm whether this solution is feasible. If it is feasible, I will send a v2 version. Thanks, Yang
On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@huawei.com> wrote: > > Hello, > > On 2023/6/15 6:03, Ian Rogers wrote: > > On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: > >> > >> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: > >>> > >>> The perf-record and perf-top call parse_event() to add a cycles event to > >>> an empty evlist. For the system that does not support hardware cycles > >>> event, such as QEMU, the evlist is empty due to the following code process: > >>> > >>> parse_event(evlist, "cycles:P" or ""cycles:Pu") > >>> parse_events(evlist, "cycles:P") > >>> __parse_events > >>> ... > >>> ret = parse_events__scanner(str, &parse_state); > >>> // ret = 0 > >>> ... > >>> ret2 = parse_events__sort_events_and_fix_groups() > >>> if (ret2 < 0) > >>> return ret; > >>> // The cycles event is not supported, here ret2 = -EINVAL, > >>> // Here return 0. > >>> ... > >>> evlist__splice_list_tail(evlist) > >>> // The code here does not execute to, so the evlist is still empty. > >>> > >>> A null pointer occurs when the content in the evlist is accessed later. > >>> > >>> Before: > >>> > >>> # perf list hw > >>> > >>> List of pre-defined events (to be used in -e or -M): > >>> > >>> # perf record true > >>> libperf: Miscounted nr_mmaps 0 vs 1 > >>> WARNING: No sample_id_all support, falling back to unordered processing > >>> perf: Segmentation fault > >>> Obtained 1 stack frames. > >>> [0xc5beff] > >>> Segmentation fault > >>> > >>> Solution: > >>> If cycles event is not supported, try to fall back to cpu-clock event. > >>> > >>> After: > >>> # perf record true > >>> [ perf record: Woken up 1 times to write data ] > >>> [ perf record: Captured and wrote 0.006 MB perf.data ] > >>> # > >>> > >>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") > >>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > >> > >> Thanks, useful addition. The cpu-clock fall back wasn't present before > >> 7b100989b4f6 so is the fixes tag correct? > > > > Hmm... it should be coming from evsel__fallback: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 > > so we shouldn't duplicate that logic. The question is why we're not > > doing the fallback. > > > > Yes, it's a bit of the same logic as evsel__fallback, or we can call > evlist__add_default() as before, simply create an evsel of hardware > cycles and add it directly to evlist. > > Please confirm whether this solution is feasible. If it is feasible, I > will send a v2 version. The previous evlist__add_default logic didn't handle wildcard PMUs for cycles, hence wanting to reuse the parse events logic. The error is that the logic now isn't doing the fallback properly. I think an evlist__add_cycles which uses evsel__fallback makes sense matching the previous logic. I'd be happy if you took a look. I'll write a patch so that the perf_pmus list of core PMUs is never empty. Thanks, Ian > Thanks, > Yang
Hello, On 2023/6/15 10:04, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@huawei.com> wrote: >> >> Hello, >> >> On 2023/6/15 6:03, Ian Rogers wrote: >>> On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: >>>> >>>> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: >>>>> >>>>> The perf-record and perf-top call parse_event() to add a cycles event to >>>>> an empty evlist. For the system that does not support hardware cycles >>>>> event, such as QEMU, the evlist is empty due to the following code process: >>>>> >>>>> parse_event(evlist, "cycles:P" or ""cycles:Pu") >>>>> parse_events(evlist, "cycles:P") >>>>> __parse_events >>>>> ... >>>>> ret = parse_events__scanner(str, &parse_state); >>>>> // ret = 0 >>>>> ... >>>>> ret2 = parse_events__sort_events_and_fix_groups() >>>>> if (ret2 < 0) >>>>> return ret; >>>>> // The cycles event is not supported, here ret2 = -EINVAL, >>>>> // Here return 0. >>>>> ... >>>>> evlist__splice_list_tail(evlist) >>>>> // The code here does not execute to, so the evlist is still empty. >>>>> >>>>> A null pointer occurs when the content in the evlist is accessed later. >>>>> >>>>> Before: >>>>> >>>>> # perf list hw >>>>> >>>>> List of pre-defined events (to be used in -e or -M): >>>>> >>>>> # perf record true >>>>> libperf: Miscounted nr_mmaps 0 vs 1 >>>>> WARNING: No sample_id_all support, falling back to unordered processing >>>>> perf: Segmentation fault >>>>> Obtained 1 stack frames. >>>>> [0xc5beff] >>>>> Segmentation fault >>>>> >>>>> Solution: >>>>> If cycles event is not supported, try to fall back to cpu-clock event. >>>>> >>>>> After: >>>>> # perf record true >>>>> [ perf record: Woken up 1 times to write data ] >>>>> [ perf record: Captured and wrote 0.006 MB perf.data ] >>>>> # >>>>> >>>>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") >>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> >>>> >>>> Thanks, useful addition. The cpu-clock fall back wasn't present before >>>> 7b100989b4f6 so is the fixes tag correct? >>> >>> Hmm... it should be coming from evsel__fallback: >>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 >>> so we shouldn't duplicate that logic. The question is why we're not >>> doing the fallback. >>> >> >> Yes, it's a bit of the same logic as evsel__fallback, or we can call >> evlist__add_default() as before, simply create an evsel of hardware >> cycles and add it directly to evlist. >> >> Please confirm whether this solution is feasible. If it is feasible, I >> will send a v2 version. > > The previous evlist__add_default logic didn't handle wildcard PMUs for > cycles, hence wanting to reuse the parse events logic. The error is > that the logic now isn't doing the fallback properly. I think an > evlist__add_cycles which uses evsel__fallback makes sense matching the > previous logic. I'd be happy if you took a look. I'll write a patch so > that the perf_pmus list of core PMUs is never empty. Do you mean to do fallback when parsing events? Instead of doing it at evsel__open? Okay, I'll re-patch it with this solution, and it looks like it should work. Thanks, Yang
Hello, On 2023/6/15 10:04, Ian Rogers wrote: > On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@huawei.com> wrote: >> >> Hello, >> >> On 2023/6/15 6:03, Ian Rogers wrote: >>> On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: >>>> >>>> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: >>>>> >>>>> The perf-record and perf-top call parse_event() to add a cycles event to >>>>> an empty evlist. For the system that does not support hardware cycles >>>>> event, such as QEMU, the evlist is empty due to the following code process: >>>>> >>>>> parse_event(evlist, "cycles:P" or ""cycles:Pu") >>>>> parse_events(evlist, "cycles:P") >>>>> __parse_events >>>>> ... >>>>> ret = parse_events__scanner(str, &parse_state); >>>>> // ret = 0 >>>>> ... >>>>> ret2 = parse_events__sort_events_and_fix_groups() >>>>> if (ret2 < 0) >>>>> return ret; >>>>> // The cycles event is not supported, here ret2 = -EINVAL, >>>>> // Here return 0. >>>>> ... >>>>> evlist__splice_list_tail(evlist) >>>>> // The code here does not execute to, so the evlist is still empty. >>>>> >>>>> A null pointer occurs when the content in the evlist is accessed later. >>>>> >>>>> Before: >>>>> >>>>> # perf list hw >>>>> >>>>> List of pre-defined events (to be used in -e or -M): >>>>> >>>>> # perf record true >>>>> libperf: Miscounted nr_mmaps 0 vs 1 >>>>> WARNING: No sample_id_all support, falling back to unordered processing >>>>> perf: Segmentation fault >>>>> Obtained 1 stack frames. >>>>> [0xc5beff] >>>>> Segmentation fault >>>>> >>>>> Solution: >>>>> If cycles event is not supported, try to fall back to cpu-clock event. >>>>> >>>>> After: >>>>> # perf record true >>>>> [ perf record: Woken up 1 times to write data ] >>>>> [ perf record: Captured and wrote 0.006 MB perf.data ] >>>>> # >>>>> >>>>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") >>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> >>>> >>>> Thanks, useful addition. The cpu-clock fall back wasn't present before >>>> 7b100989b4f6 so is the fixes tag correct? >>> >>> Hmm... it should be coming from evsel__fallback: >>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 >>> so we shouldn't duplicate that logic. The question is why we're not >>> doing the fallback. >>> >> >> Yes, it's a bit of the same logic as evsel__fallback, or we can call >> evlist__add_default() as before, simply create an evsel of hardware >> cycles and add it directly to evlist. >> >> Please confirm whether this solution is feasible. If it is feasible, I >> will send a v2 version. > > The previous evlist__add_default logic didn't handle wildcard PMUs for > cycles, hence wanting to reuse the parse events logic. The error is > that the logic now isn't doing the fallback properly. I think an > evlist__add_cycles which uses evsel__fallback makes sense matching the > previous logic. I'd be happy if you took a look. I'll write a patch so > that the perf_pmus list of core PMUs is never empty. > The gdb calltrace for core dump is as follows: (gdb) bt #0 0x00000000005ffaa2 in __perf_cpu_map__nr (cpus=0x0) at cpumap.c:283 #1 0x00000000005ffd17 in perf_cpu_map__max (map=0x0) at cpumap.c:371 #2 0x0000000000565644 in cpu_map_data__alloc (syn_data=syn_data@entry=0x7ffc843bff30, header_size=header_size@entry=8) at util/synthetic-events.c:1273 #3 0x0000000000568712 in cpu_map_event__new (map=<optimized out>) at util/synthetic-events.c:1321 #4 perf_event__synthesize_cpu_map (tool=tool@entry=0xc37580 <record>, map=<optimized out>, process=process@entry=0x413a80 <process_synthesized_event>, machine=machine@entry=0x0) at util/synthetic-events.c:1341 #5 0x000000000041426e in record__synthesize (tail=tail@entry=false, rec=0xc37580 <record>) at builtin-record.c:2050 #6 0x0000000000415a0b in __cmd_record (argc=<optimized out>, argv=<optimized out>, rec=0xc37580 <record>) at builtin-record.c:2512 #7 0x0000000000418f22 in cmd_record (argc=<optimized out>, argv=<optimized out>) at builtin-record.c:4260 #8 0x00000000004babdd in run_builtin (p=p@entry=0xc3a0e8 <commands+264>, argc=argc@entry=2, argv=argv@entry=0x7ffc843c5b30) at perf.c:323 #9 0x0000000000401632 in handle_internal_command (argv=0x7ffc843c5b30, argc=2) at perf.c:377 #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:421 #11 main (argc=2, argv=0x7ffc843c5b30) at perf.c:537 The direct cause of the problem is that rec->evlist->core.all_cpus is empty, resulting in null pointer access. The code process is as follows: cmd_record parse_event(rec->evlist) // Hardware cycle events should not be supported here, so rec->evlist is empty ... evlist__create_maps(rec->evlist) perf_evlist__set_maps(&rec->evlist->core) perf_evlist__propagate_maps(&rec->evlist->core) perf_evlist__for_each_evsel(&rec->evlist->core, evsel) // because rec->evlist is empty, don't get into that __perf_evlist__propagate_maps(), so rec->evlist->core.all_cpus is NULL. __perf_evlist__propagate_maps rec->evlist->core.all_cpus = perf_cpu_map__merge(evlist->all_cpus, evsel->cpus); ... __cmd_record record__synthesize perf_event__synthesize_cpu_map(rec->evlist->core.all_cpus) cpu_map_event__new(rec->evlist->core.all_cpus) cpu_map_data__alloc(rec->evlist->core.all_cpus) perf_cpu_map__max(rec->evlist->core.all_cpus) __perf_cpu_map__nr // Here null pointer access! ... record__open evsel__fallback // Here fallback is just starting Thanks, Yang
On Thu, Jun 15, 2023 at 4:46 AM Yang Jihong <yangjihong1@huawei.com> wrote: > > Hello, > > On 2023/6/15 10:04, Ian Rogers wrote: > > On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@huawei.com> wrote: > >> > >> Hello, > >> > >> On 2023/6/15 6:03, Ian Rogers wrote: > >>> On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: > >>>> > >>>> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: > >>>>> > >>>>> The perf-record and perf-top call parse_event() to add a cycles event to > >>>>> an empty evlist. For the system that does not support hardware cycles > >>>>> event, such as QEMU, the evlist is empty due to the following code process: > >>>>> > >>>>> parse_event(evlist, "cycles:P" or ""cycles:Pu") > >>>>> parse_events(evlist, "cycles:P") > >>>>> __parse_events > >>>>> ... > >>>>> ret = parse_events__scanner(str, &parse_state); > >>>>> // ret = 0 > >>>>> ... > >>>>> ret2 = parse_events__sort_events_and_fix_groups() > >>>>> if (ret2 < 0) > >>>>> return ret; > >>>>> // The cycles event is not supported, here ret2 = -EINVAL, > >>>>> // Here return 0. > >>>>> ... > >>>>> evlist__splice_list_tail(evlist) > >>>>> // The code here does not execute to, so the evlist is still empty. > >>>>> > >>>>> A null pointer occurs when the content in the evlist is accessed later. > >>>>> > >>>>> Before: > >>>>> > >>>>> # perf list hw > >>>>> > >>>>> List of pre-defined events (to be used in -e or -M): > >>>>> > >>>>> # perf record true > >>>>> libperf: Miscounted nr_mmaps 0 vs 1 > >>>>> WARNING: No sample_id_all support, falling back to unordered processing > >>>>> perf: Segmentation fault > >>>>> Obtained 1 stack frames. > >>>>> [0xc5beff] > >>>>> Segmentation fault > >>>>> > >>>>> Solution: > >>>>> If cycles event is not supported, try to fall back to cpu-clock event. > >>>>> > >>>>> After: > >>>>> # perf record true > >>>>> [ perf record: Woken up 1 times to write data ] > >>>>> [ perf record: Captured and wrote 0.006 MB perf.data ] > >>>>> # > >>>>> > >>>>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") > >>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> > >>>> > >>>> Thanks, useful addition. The cpu-clock fall back wasn't present before > >>>> 7b100989b4f6 so is the fixes tag correct? > >>> > >>> Hmm... it should be coming from evsel__fallback: > >>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 > >>> so we shouldn't duplicate that logic. The question is why we're not > >>> doing the fallback. > >>> > >> > >> Yes, it's a bit of the same logic as evsel__fallback, or we can call > >> evlist__add_default() as before, simply create an evsel of hardware > >> cycles and add it directly to evlist. > >> > >> Please confirm whether this solution is feasible. If it is feasible, I > >> will send a v2 version. > > > > The previous evlist__add_default logic didn't handle wildcard PMUs for > > cycles, hence wanting to reuse the parse events logic. The error is > > that the logic now isn't doing the fallback properly. I think an > > evlist__add_cycles which uses evsel__fallback makes sense matching the > > previous logic. I'd be happy if you took a look. I'll write a patch so > > that the perf_pmus list of core PMUs is never empty. > > > > The gdb calltrace for core dump is as follows: > > (gdb) bt > #0 0x00000000005ffaa2 in __perf_cpu_map__nr (cpus=0x0) at cpumap.c:283 > #1 0x00000000005ffd17 in perf_cpu_map__max (map=0x0) at cpumap.c:371 > #2 0x0000000000565644 in cpu_map_data__alloc > (syn_data=syn_data@entry=0x7ffc843bff30, > header_size=header_size@entry=8) at util/synthetic-events.c:1273 > #3 0x0000000000568712 in cpu_map_event__new (map=<optimized out>) at > util/synthetic-events.c:1321 > #4 perf_event__synthesize_cpu_map (tool=tool@entry=0xc37580 <record>, > map=<optimized out>, process=process@entry=0x413a80 > <process_synthesized_event>, machine=machine@entry=0x0) at > util/synthetic-events.c:1341 > #5 0x000000000041426e in record__synthesize (tail=tail@entry=false, > rec=0xc37580 <record>) at builtin-record.c:2050 > #6 0x0000000000415a0b in __cmd_record (argc=<optimized out>, > argv=<optimized out>, rec=0xc37580 <record>) at builtin-record.c:2512 > #7 0x0000000000418f22 in cmd_record (argc=<optimized out>, > argv=<optimized out>) at builtin-record.c:4260 > #8 0x00000000004babdd in run_builtin (p=p@entry=0xc3a0e8 > <commands+264>, argc=argc@entry=2, argv=argv@entry=0x7ffc843c5b30) at > perf.c:323 > #9 0x0000000000401632 in handle_internal_command (argv=0x7ffc843c5b30, > argc=2) at perf.c:377 > #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at > perf.c:421 > #11 main (argc=2, argv=0x7ffc843c5b30) at perf.c:537 > > The direct cause of the problem is that rec->evlist->core.all_cpus is > empty, resulting in null pointer access. > > The code process is as follows: > > cmd_record > parse_event(rec->evlist) > // Hardware cycle events should not be supported here, so rec->evlist > is empty > ... > > evlist__create_maps(rec->evlist) > perf_evlist__set_maps(&rec->evlist->core) > perf_evlist__propagate_maps(&rec->evlist->core) > perf_evlist__for_each_evsel(&rec->evlist->core, evsel) > // because rec->evlist is empty, don't get into that > __perf_evlist__propagate_maps(), so rec->evlist->core.all_cpus is NULL. > __perf_evlist__propagate_maps > rec->evlist->core.all_cpus = perf_cpu_map__merge(evlist->all_cpus, > evsel->cpus); > ... > > __cmd_record > record__synthesize > perf_event__synthesize_cpu_map(rec->evlist->core.all_cpus) > cpu_map_event__new(rec->evlist->core.all_cpus) > cpu_map_data__alloc(rec->evlist->core.all_cpus) > perf_cpu_map__max(rec->evlist->core.all_cpus) > __perf_cpu_map__nr > // Here null pointer access! > ... > > record__open > evsel__fallback > // Here fallback is just starting > Sorry, I don't follow this. I sent out a patch for the no core PMU case - please take a look: https://lore.kernel.org/lkml/20230627182834.117565-1-irogers@google.com/ I haven't got a reproduction for failing to open cycles and it's not clear to me why evsel__fallback isn't being used to fallback to clock. Were you able to look at this? Thanks, Ian > Thanks, > Yang
Hello, On 2023/6/28 2:32, Ian Rogers wrote: > On Thu, Jun 15, 2023 at 4:46 AM Yang Jihong <yangjihong1@huawei.com> wrote: >> >> Hello, >> >> On 2023/6/15 10:04, Ian Rogers wrote: >>> On Wed, Jun 14, 2023 at 6:55 PM Yang Jihong <yangjihong1@huawei.com> wrote: >>>> >>>> Hello, >>>> >>>> On 2023/6/15 6:03, Ian Rogers wrote: >>>>> On Wed, Jun 14, 2023 at 9:18 AM Ian Rogers <irogers@google.com> wrote: >>>>>> >>>>>> On Wed, Jun 14, 2023 at 8:18 AM Yang Jihong <yangjihong1@huawei.com> wrote: >>>>>>> >>>>>>> The perf-record and perf-top call parse_event() to add a cycles event to >>>>>>> an empty evlist. For the system that does not support hardware cycles >>>>>>> event, such as QEMU, the evlist is empty due to the following code process: >>>>>>> >>>>>>> parse_event(evlist, "cycles:P" or ""cycles:Pu") >>>>>>> parse_events(evlist, "cycles:P") >>>>>>> __parse_events >>>>>>> ... >>>>>>> ret = parse_events__scanner(str, &parse_state); >>>>>>> // ret = 0 >>>>>>> ... >>>>>>> ret2 = parse_events__sort_events_and_fix_groups() >>>>>>> if (ret2 < 0) >>>>>>> return ret; >>>>>>> // The cycles event is not supported, here ret2 = -EINVAL, >>>>>>> // Here return 0. >>>>>>> ... >>>>>>> evlist__splice_list_tail(evlist) >>>>>>> // The code here does not execute to, so the evlist is still empty. >>>>>>> >>>>>>> A null pointer occurs when the content in the evlist is accessed later. >>>>>>> >>>>>>> Before: >>>>>>> >>>>>>> # perf list hw >>>>>>> >>>>>>> List of pre-defined events (to be used in -e or -M): >>>>>>> >>>>>>> # perf record true >>>>>>> libperf: Miscounted nr_mmaps 0 vs 1 >>>>>>> WARNING: No sample_id_all support, falling back to unordered processing >>>>>>> perf: Segmentation fault >>>>>>> Obtained 1 stack frames. >>>>>>> [0xc5beff] >>>>>>> Segmentation fault >>>>>>> >>>>>>> Solution: >>>>>>> If cycles event is not supported, try to fall back to cpu-clock event. >>>>>>> >>>>>>> After: >>>>>>> # perf record true >>>>>>> [ perf record: Woken up 1 times to write data ] >>>>>>> [ perf record: Captured and wrote 0.006 MB perf.data ] >>>>>>> # >>>>>>> >>>>>>> Fixes: 7b100989b4f6 ("perf evlist: Remove __evlist__add_default") >>>>>>> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> >>>>>> >>>>>> Thanks, useful addition. The cpu-clock fall back wasn't present before >>>>>> 7b100989b4f6 so is the fixes tag correct? >>>>> >>>>> Hmm... it should be coming from evsel__fallback: >>>>> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/util/evsel.c?h=tmp.perf-tools-next#n2840 >>>>> so we shouldn't duplicate that logic. The question is why we're not >>>>> doing the fallback. >>>>> >>>> >>>> Yes, it's a bit of the same logic as evsel__fallback, or we can call >>>> evlist__add_default() as before, simply create an evsel of hardware >>>> cycles and add it directly to evlist. >>>> >>>> Please confirm whether this solution is feasible. If it is feasible, I >>>> will send a v2 version. >>> >>> The previous evlist__add_default logic didn't handle wildcard PMUs for >>> cycles, hence wanting to reuse the parse events logic. The error is >>> that the logic now isn't doing the fallback properly. I think an >>> evlist__add_cycles which uses evsel__fallback makes sense matching the >>> previous logic. I'd be happy if you took a look. I'll write a patch so >>> that the perf_pmus list of core PMUs is never empty. >>> >> >> The gdb calltrace for core dump is as follows: >> >> (gdb) bt >> #0 0x00000000005ffaa2 in __perf_cpu_map__nr (cpus=0x0) at cpumap.c:283 >> #1 0x00000000005ffd17 in perf_cpu_map__max (map=0x0) at cpumap.c:371 >> #2 0x0000000000565644 in cpu_map_data__alloc >> (syn_data=syn_data@entry=0x7ffc843bff30, >> header_size=header_size@entry=8) at util/synthetic-events.c:1273 >> #3 0x0000000000568712 in cpu_map_event__new (map=<optimized out>) at >> util/synthetic-events.c:1321 >> #4 perf_event__synthesize_cpu_map (tool=tool@entry=0xc37580 <record>, >> map=<optimized out>, process=process@entry=0x413a80 >> <process_synthesized_event>, machine=machine@entry=0x0) at >> util/synthetic-events.c:1341 >> #5 0x000000000041426e in record__synthesize (tail=tail@entry=false, >> rec=0xc37580 <record>) at builtin-record.c:2050 >> #6 0x0000000000415a0b in __cmd_record (argc=<optimized out>, >> argv=<optimized out>, rec=0xc37580 <record>) at builtin-record.c:2512 >> #7 0x0000000000418f22 in cmd_record (argc=<optimized out>, >> argv=<optimized out>) at builtin-record.c:4260 >> #8 0x00000000004babdd in run_builtin (p=p@entry=0xc3a0e8 >> <commands+264>, argc=argc@entry=2, argv=argv@entry=0x7ffc843c5b30) at >> perf.c:323 >> #9 0x0000000000401632 in handle_internal_command (argv=0x7ffc843c5b30, >> argc=2) at perf.c:377 >> #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at >> perf.c:421 >> #11 main (argc=2, argv=0x7ffc843c5b30) at perf.c:537 >> >> The direct cause of the problem is that rec->evlist->core.all_cpus is >> empty, resulting in null pointer access. >> >> The code process is as follows: >> >> cmd_record >> parse_event(rec->evlist) >> // Hardware cycle events should not be supported here, so rec->evlist >> is empty >> ... >> >> evlist__create_maps(rec->evlist) >> perf_evlist__set_maps(&rec->evlist->core) >> perf_evlist__propagate_maps(&rec->evlist->core) >> perf_evlist__for_each_evsel(&rec->evlist->core, evsel) >> // because rec->evlist is empty, don't get into that >> __perf_evlist__propagate_maps(), so rec->evlist->core.all_cpus is NULL. >> __perf_evlist__propagate_maps >> rec->evlist->core.all_cpus = perf_cpu_map__merge(evlist->all_cpus, >> evsel->cpus); >> ... >> >> __cmd_record >> record__synthesize >> perf_event__synthesize_cpu_map(rec->evlist->core.all_cpus) >> cpu_map_event__new(rec->evlist->core.all_cpus) >> cpu_map_data__alloc(rec->evlist->core.all_cpus) >> perf_cpu_map__max(rec->evlist->core.all_cpus) >> __perf_cpu_map__nr >> // Here null pointer access! >> ... >> >> record__open >> evsel__fallback >> // Here fallback is just starting >> > > Sorry, I don't follow this. I sent out a patch for the no core PMU > case - please take a look: > https://lore.kernel.org/lkml/20230627182834.117565-1-irogers@google.com/ Thanks for the patch, it seems to solve the problem. I'll test it now. > I haven't got a reproduction for failing to open cycles and it's not > clear to me why evsel__fallback isn't being used to fallback to clock. > Were you able to look at this? 1. In the QEMU environment that does not support hardware events, as shown in [1], the sysfs node related to the core PMU does not exist in the /sys/bus/event_source/devices/ directory. As a result, evlist is empty after perf_recod invokes parse_event(evlist, "cycles:P"). 2. When perf_record initializes the member in evlist->core, the content of some members is NULL because evlist is empty. For example, the value of evlist->core.all_cpus is NULL. 3. Access the content in evlist->core.all_cpus. The null pointer access is triggered, as shown in [2]. At this time, perf_record has not invoked evsel__fallback() yet. 4. evsel__fallback() is invoked in record__open(). Therefore, segfault occurs before fallback to clock. [1]: # ls /sys/bus/event_source/devices/ breakpoint kprobe msr software tracepoint uprobe [2]: cmd_record __cmd_record record__synthesize perf_event__synthesize_cpu_map(evlist->core.all_cpus) cpu_map_event__new(evlist->core.all_cpus) cpu_map_data__alloc(evlist->core.all_cpus) perf_cpu_map__max(evlist->core.all_cpus) __perf_cpu_map__nr // Here null pointer access ... Thanks, Yang
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index aec18db7ff23..29ae2b84a63a 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -4161,9 +4161,7 @@ int cmd_record(int argc, const char **argv) record.opts.tail_synthesize = true; if (rec->evlist->core.nr_entries == 0) { - bool can_profile_kernel = perf_event_paranoid_check(1); - - err = parse_event(rec->evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); + err = evlist__add_default(rec->evlist); if (err) goto out; } diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c index c363c04e16df..798cb9252a5f 100644 --- a/tools/perf/builtin-top.c +++ b/tools/perf/builtin-top.c @@ -1665,8 +1665,7 @@ int cmd_top(int argc, const char **argv) goto out_delete_evlist; if (!top.evlist->core.nr_entries) { - bool can_profile_kernel = perf_event_paranoid_check(1); - int err = parse_event(top.evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); + int err = evlist__add_default(top.evlist); if (err) goto out_delete_evlist; diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 7ef43f72098e..60efa762405e 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -287,6 +287,24 @@ struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide) return evsel; } +int evlist__add_default(struct evlist *evlist) +{ + bool can_profile_kernel; + int err; + + can_profile_kernel = perf_event_paranoid_check(1); + err = parse_event(evlist, can_profile_kernel ? "cycles:P" : "cycles:Pu"); + if (err) + return err; + + if (!evlist->core.nr_entries) { + pr_debug("The cycles event is not supported, trying to fall back to cpu-clock event\n"); + return parse_event(evlist, "cpu-clock"); + } + + return 0; +} + #ifdef HAVE_LIBTRACEEVENT struct evsel *evlist__add_sched_switch(struct evlist *evlist, bool system_wide) { diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h index 664c6bf7b3e0..47eea809ee91 100644 --- a/tools/perf/util/evlist.h +++ b/tools/perf/util/evlist.h @@ -116,6 +116,7 @@ int arch_evlist__cmp(const struct evsel *lhs, const struct evsel *rhs); int evlist__add_dummy(struct evlist *evlist); struct evsel *evlist__add_aux_dummy(struct evlist *evlist, bool system_wide); +int evlist__add_default(struct evlist *evlist); static inline struct evsel *evlist__add_dummy_on_all_cpus(struct evlist *evlist) { return evlist__add_aux_dummy(evlist, true);