Message ID | 20230223075800.1795777-1-changbin.du@huawei.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp232507wrd; Thu, 23 Feb 2023 02:11:53 -0800 (PST) X-Google-Smtp-Source: AK7set86FvrMuth4ApenkzjUUvzTYmHhwzvCU0UKNPICRdhyLrpZJJPRdluyaeVHWImttNOoDU14 X-Received: by 2002:a05:6a00:1804:b0:5b2:5466:34e1 with SMTP id y4-20020a056a00180400b005b2546634e1mr10771091pfa.3.1677147113645; Thu, 23 Feb 2023 02:11:53 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677147113; cv=none; d=google.com; s=arc-20160816; b=AVR7jZxfSu0SBl6gmp37cRuwQeVmgggvkrnz6WvegTVnwTSuT+JBOBzIgsoxj15NsY eFXPg2VGteqx9WpbMWqwebXT05zQ8LMJDXRIgd2YuRfYPUiqVh0jKrlgW03Igs5Yi2Rz TkTzpxmrn3qcSpFwhz3AIkR8Zbc8ORfehgXUYGI3VczW75JgqAa3P9RAys5fX3dASr8p AKeWs0sDvQXNcrGqVCMz/tSvwlzaCkrMBCJPq8WHjCG7ynNOJgyXJpODQ7MwUliFbTju P5kFQZSR5PIae1bJ1HrQCijg/ABRQPY62px2dTJiazBmbHbxBOvt7kNRkXzUERBSf0Jh tUOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from; bh=Jf2CBRCW59yTCQXFkRksIslPU6GjZWNQ1s8UsxzqZ3c=; b=08r3fpWT5WaLgpmPkA+LDKIjw0fjpJpFs61cMK0Hfj9x1TdaOyUT+EHyDR+X7YVkyI aRtlpg2v+wNm69dGOZpwzzjr8CmcVX3JJaociLTHnYKc49TzgLF45ewKnUdXRwmcmkGy 6zBogDpvvV4f5FvMOLufb9ZfWjkOCA3YDW99zbssWx+DlTkuH5QnBSjdzgGosBhGf81m sVk/Z5+hv/AGAkwMpqs7XB+ZaAZEjNdtdEKO6U1W/Ysdh9/gvgsmHYwJNZICXTeS0R1X nYShM6ATcKc7wpUTz1kVWkGyTQ/i3yw3+hYzFIYDQ9nT/SWVBVPPxRjXhGatI8UVHwXe uNEw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y18-20020a63b512000000b004fb647a765bsi7385200pge.473.2023.02.23.02.11.40; Thu, 23 Feb 2023 02:11:53 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=huawei.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233398AbjBWKGn (ORCPT <rfc822;cambridge8321@gmail.com> + 99 others); Thu, 23 Feb 2023 05:06:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233601AbjBWKGg (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 23 Feb 2023 05:06:36 -0500 Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E92F39CF1; Thu, 23 Feb 2023 02:06:29 -0800 (PST) Received: from kwepemi500013.china.huawei.com (unknown [172.30.72.57]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4PMpWd5ZDszRs8M; Thu, 23 Feb 2023 18:03:41 +0800 (CST) Received: from M910t.huawei.com (10.110.54.157) by kwepemi500013.china.huawei.com (7.221.188.120) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.17; Thu, 23 Feb 2023 18:05:21 +0800 From: Changbin Du <changbin.du@huawei.com> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org> CC: Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, <linux-perf-users@vger.kernel.org>, <linux-kernel@vger.kernel.org>, Hui Wang <hw.huiwang@huawei.com>, Changbin Du <changbin.du@huawei.com> Subject: [PATCH] perf: fix counting when initial delay configured Date: Thu, 23 Feb 2023 15:58:00 +0800 Message-ID: <20230223075800.1795777-1-changbin.du@huawei.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 7BIT Content-Type: text/plain; charset=US-ASCII X-Originating-IP: [10.110.54.157] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To kwepemi500013.china.huawei.com (7.221.188.120) X-CFilter-Loop: Reflected X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758616211929576443?= X-GMAIL-MSGID: =?utf-8?q?1758616211929576443?= |
Series |
perf: fix counting when initial delay configured
|
|
Commit Message
Changbin Du
Feb. 23, 2023, 7:58 a.m. UTC
When creating counters with initial delay configured, the enable_on_exec
field is not set. So we need to enable the counters later. The problem
is, when a workload is specified the target__none() is still true. So
we also need to check stat_config.initial_delay.
Before this fix the event is not counted:
$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled
Performance counter stats for 'sleep 2':
<not counted> instructions
1.901661124 seconds time elapsed
0.001602000 seconds user
0.000000000 seconds sys
After fix it works:
$ ./perf stat -e instructions -D 100 sleep 2
Events disabled
Events enabled
Performance counter stats for 'sleep 2':
404,214 instructions
1.901743475 seconds time elapsed
0.001617000 seconds user
0.000000000 seconds sys
Fixes: c587e77e100f ("perf stat: Do not delay the workload with --delay")
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
tools/perf/builtin-stat.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Em Thu, Feb 23, 2023 at 03:58:00PM +0800, Changbin Du escreveu: > When creating counters with initial delay configured, the enable_on_exec > field is not set. So we need to enable the counters later. The problem > is, when a workload is specified the target__none() is still true. So > we also need to check stat_config.initial_delay. > > Before this fix the event is not counted: > $ ./perf stat -e instructions -D 100 sleep 2 > Events disabled > Events enabled > > Performance counter stats for 'sleep 2': > > <not counted> instructions > > 1.901661124 seconds time elapsed > > 0.001602000 seconds user > 0.000000000 seconds sys > > After fix it works: > $ ./perf stat -e instructions -D 100 sleep 2 > Events disabled > Events enabled > > Performance counter stats for 'sleep 2': > > 404,214 instructions > > 1.901743475 seconds time elapsed > > 0.001617000 seconds user > 0.000000000 seconds sys > > Fixes: c587e77e100f ("perf stat: Do not delay the workload with --delay") Yeap, even the comment states that we need to enable when initial_delay is set :-) I added the additional test output below. Namhyung, can you please ack it? - Arnaldo Committer testing: Before: Lets use stress-ng so that we have lots of samples using a CPU stressor and also intermingle the workload output with the messages about when the events get enabled (i.e. later on in the workload): $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 Events disabled stress-ng: info: [38361] setting to a 1 second run per stressor stress-ng: info: [38361] dispatching hogs: 32 cpu Events enabled stress-ng: info: [38361] successful run completed in 1.01s Performance counter stats for 'stress-ng -c 32 -t 1': <not counted> instructions:u 0.916479141 seconds time elapsed 30.868003000 seconds user 0.049851000 seconds sys Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog $ After the fix: $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 Events disabled stress-ng: info: [40429] setting to a 1 second run per stressor stress-ng: info: [40429] dispatching hogs: 32 cpu Events enabled stress-ng: info: [40429] successful run completed in 1.01s Performance counter stats for 'stress-ng -c 32 -t 1': 154117865145 instructions:u 0.920827644 seconds time elapsed 30.864753000 seconds user 0.073862000 seconds sys $ > Signed-off-by: Changbin Du <changbin.du@huawei.com> > --- > tools/perf/builtin-stat.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index 9f3e4b257516..c71d85577de6 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -544,7 +544,7 @@ static int enable_counters(void) > * - we don't have tracee (attaching to task or cpu) > * - we have initial delay configured > */ > - if (!target__none(&target)) { > + if (!target__none(&target) || stat_config.initial_delay) { > if (!all_counters_use_bpf) > evlist__enable(evsel_list); > } > -- > 2.25.1 >
Hello, On Thu, Feb 23, 2023 at 5:45 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Em Thu, Feb 23, 2023 at 03:58:00PM +0800, Changbin Du escreveu: > > When creating counters with initial delay configured, the enable_on_exec > > field is not set. So we need to enable the counters later. The problem > > is, when a workload is specified the target__none() is still true. So > > we also need to check stat_config.initial_delay. > > > > Before this fix the event is not counted: > > $ ./perf stat -e instructions -D 100 sleep 2 > > Events disabled > > Events enabled > > > > Performance counter stats for 'sleep 2': > > > > <not counted> instructions > > > > 1.901661124 seconds time elapsed > > > > 0.001602000 seconds user > > 0.000000000 seconds sys > > > > After fix it works: > > $ ./perf stat -e instructions -D 100 sleep 2 > > Events disabled > > Events enabled > > > > Performance counter stats for 'sleep 2': > > > > 404,214 instructions > > > > 1.901743475 seconds time elapsed > > > > 0.001617000 seconds user > > 0.000000000 seconds sys > > > > Fixes: c587e77e100f ("perf stat: Do not delay the workload with --delay") > > Yeap, even the comment states that we need to enable when initial_delay > is set :-) Right, but the logic that checks the initial_delay is placed out of the function. Just checking the initial_delay value can be confusing as it can have a negative value. Maybe we can add an argument (bool force?) to the enable_counters() function. Thanks, Namhyung > > I added the additional test output below. > > Namhyung, can you please ack it? > > - Arnaldo > > Committer testing: > > Before: > > Lets use stress-ng so that we have lots of samples using a CPU stressor > and also intermingle the workload output with the messages about when > the events get enabled (i.e. later on in the workload): > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > Events disabled > stress-ng: info: [38361] setting to a 1 second run per stressor > stress-ng: info: [38361] dispatching hogs: 32 cpu > Events enabled > stress-ng: info: [38361] successful run completed in 1.01s > > Performance counter stats for 'stress-ng -c 32 -t 1': > > <not counted> instructions:u > > 0.916479141 seconds time elapsed > > 30.868003000 seconds user > 0.049851000 seconds sys > > > Some events weren't counted. Try disabling the NMI watchdog: > echo 0 > /proc/sys/kernel/nmi_watchdog > perf stat ... > echo 1 > /proc/sys/kernel/nmi_watchdog > $ > > After the fix: > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > Events disabled > stress-ng: info: [40429] setting to a 1 second run per stressor > stress-ng: info: [40429] dispatching hogs: 32 cpu > Events enabled > stress-ng: info: [40429] successful run completed in 1.01s > > Performance counter stats for 'stress-ng -c 32 -t 1': > > 154117865145 instructions:u > > 0.920827644 seconds time elapsed > > 30.864753000 seconds user > 0.073862000 seconds sys > > > $ > > > Signed-off-by: Changbin Du <changbin.du@huawei.com> > > --- > > tools/perf/builtin-stat.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > > index 9f3e4b257516..c71d85577de6 100644 > > --- a/tools/perf/builtin-stat.c > > +++ b/tools/perf/builtin-stat.c > > @@ -544,7 +544,7 @@ static int enable_counters(void) > > * - we don't have tracee (attaching to task or cpu) > > * - we have initial delay configured > > */ > > - if (!target__none(&target)) { > > + if (!target__none(&target) || stat_config.initial_delay) { > > if (!all_counters_use_bpf) > > evlist__enable(evsel_list); > > } > > -- > > 2.25.1 > > > > -- > > - Arnaldo
On Thu, Feb 23, 2023 at 02:48:26PM -0800, Namhyung Kim wrote: > Hello, > > On Thu, Feb 23, 2023 at 5:45 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > Em Thu, Feb 23, 2023 at 03:58:00PM +0800, Changbin Du escreveu: > > > When creating counters with initial delay configured, the enable_on_exec > > > field is not set. So we need to enable the counters later. The problem > > > is, when a workload is specified the target__none() is still true. So > > > we also need to check stat_config.initial_delay. > > > > > > Before this fix the event is not counted: > > > $ ./perf stat -e instructions -D 100 sleep 2 > > > Events disabled > > > Events enabled > > > > > > Performance counter stats for 'sleep 2': > > > > > > <not counted> instructions > > > > > > 1.901661124 seconds time elapsed > > > > > > 0.001602000 seconds user > > > 0.000000000 seconds sys > > > > > > After fix it works: > > > $ ./perf stat -e instructions -D 100 sleep 2 > > > Events disabled > > > Events enabled > > > > > > Performance counter stats for 'sleep 2': > > > > > > 404,214 instructions > > > > > > 1.901743475 seconds time elapsed > > > > > > 0.001617000 seconds user > > > 0.000000000 seconds sys > > > > > > Fixes: c587e77e100f ("perf stat: Do not delay the workload with --delay") > > > > Yeap, even the comment states that we need to enable when initial_delay > > is set :-) > > Right, but the logic that checks the initial_delay is placed > out of the function. Just checking the initial_delay value > can be confusing as it can have a negative value. > > Maybe we can add an argument (bool force?) to the > enable_counters() function. > Yes, it could be done. Maybe we can fold the 'initial_delay' into 'struct target', and provide consistent behaviour for all subcommands. Here I add a target__enable_counter_on_exec() to determine whether enable_on_exec should set for all counters. $ git diff -U1 diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c71d85577de6..e807be2214c7 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -541,8 +541,3 @@ static int enable_counters(void) - /* - * We need to enable counters only if: - * - we don't have tracee (attaching to task or cpu) - * - we have initial delay configured - */ - if (!target__none(&target) || stat_config.initial_delay) { + if (!target__enable_counter_on_exec(&target)) { if (!all_counters_use_bpf) @@ -916,3 +911,3 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) - if (stat_config.initial_delay) { + if (target.initial_delay) { pr_info(EVLIST_DISABLED_MSG); @@ -928,4 +923,4 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) - if (stat_config.initial_delay > 0) { - usleep(stat_config.initial_delay * USEC_PER_MSEC); + if (target.initial_delay > 0) { + usleep(target.initial_delay * USEC_PER_MSEC); err = enable_counters(); @@ -1250,3 +1245,3 @@ static struct option stat_options[] = { "aggregate counts per numa node", AGGR_NODE), - OPT_INTEGER('D', "delay", &stat_config.initial_delay, + OPT_INTEGER('D', "delay", &target.initial_delay, "ms to wait before starting measurement after program start (-1: start with events disabled)"), diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c index 534d36d26fc3..40984d124db1 100644 --- a/tools/perf/util/stat.c +++ b/tools/perf/util/stat.c @@ -848,3 +848,3 @@ int create_perf_stat_counter(struct evsel *evsel, */ - if (target__none(target) && !config->initial_delay) + if (target__enable_counter_on_exec(target)) attr->enable_on_exec = 1; diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h index daec6cba500d..a6721b644bfb 100644 --- a/tools/perf/util/target.h +++ b/tools/perf/util/target.h @@ -20,2 +20,3 @@ struct target { bool hybrid; + int initial_delay; const char *attr_map; @@ -74,2 +75,7 @@ static inline bool target__none(struct target *target) +static inline bool target__enable_counter_on_exec(struct target *target) +{ + return target__none(target) && !target->initial_delay; +} + > Thanks, > Namhyung > > > > > > I added the additional test output below. > > > > Namhyung, can you please ack it? > > > > - Arnaldo > > > > Committer testing: > > > > Before: > > > > Lets use stress-ng so that we have lots of samples using a CPU stressor > > and also intermingle the workload output with the messages about when > > the events get enabled (i.e. later on in the workload): > > > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > > Events disabled > > stress-ng: info: [38361] setting to a 1 second run per stressor > > stress-ng: info: [38361] dispatching hogs: 32 cpu > > Events enabled > > stress-ng: info: [38361] successful run completed in 1.01s > > > > Performance counter stats for 'stress-ng -c 32 -t 1': > > > > <not counted> instructions:u > > > > 0.916479141 seconds time elapsed > > > > 30.868003000 seconds user > > 0.049851000 seconds sys > > > > > > Some events weren't counted. Try disabling the NMI watchdog: > > echo 0 > /proc/sys/kernel/nmi_watchdog > > perf stat ... > > echo 1 > /proc/sys/kernel/nmi_watchdog > > $ > > > > After the fix: > > > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > > Events disabled > > stress-ng: info: [40429] setting to a 1 second run per stressor > > stress-ng: info: [40429] dispatching hogs: 32 cpu > > Events enabled > > stress-ng: info: [40429] successful run completed in 1.01s > > > > Performance counter stats for 'stress-ng -c 32 -t 1': > > > > 154117865145 instructions:u > > > > 0.920827644 seconds time elapsed > > > > 30.864753000 seconds user > > 0.073862000 seconds sys > > > > > > $ > > > > > Signed-off-by: Changbin Du <changbin.du@huawei.com> > > > --- > > > tools/perf/builtin-stat.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > > > index 9f3e4b257516..c71d85577de6 100644 > > > --- a/tools/perf/builtin-stat.c > > > +++ b/tools/perf/builtin-stat.c > > > @@ -544,7 +544,7 @@ static int enable_counters(void) > > > * - we don't have tracee (attaching to task or cpu) > > > * - we have initial delay configured > > > */ > > > - if (!target__none(&target)) { > > > + if (!target__none(&target) || stat_config.initial_delay) { > > > if (!all_counters_use_bpf) > > > evlist__enable(evsel_list); > > > } > > > -- > > > 2.25.1 > > > > > > > -- > > > > - Arnaldo >
On Thu, Feb 23, 2023 at 11:25 PM Changbin Du <changbin.du@huawei.com> wrote: > > On Thu, Feb 23, 2023 at 02:48:26PM -0800, Namhyung Kim wrote: > > Hello, > > > > On Thu, Feb 23, 2023 at 5:45 AM Arnaldo Carvalho de Melo > > <acme@kernel.org> wrote: > > > > > > Em Thu, Feb 23, 2023 at 03:58:00PM +0800, Changbin Du escreveu: > > > > When creating counters with initial delay configured, the enable_on_exec > > > > field is not set. So we need to enable the counters later. The problem > > > > is, when a workload is specified the target__none() is still true. So > > > > we also need to check stat_config.initial_delay. > > > > > > > > Before this fix the event is not counted: > > > > $ ./perf stat -e instructions -D 100 sleep 2 > > > > Events disabled > > > > Events enabled > > > > > > > > Performance counter stats for 'sleep 2': > > > > > > > > <not counted> instructions > > > > > > > > 1.901661124 seconds time elapsed > > > > > > > > 0.001602000 seconds user > > > > 0.000000000 seconds sys > > > > > > > > After fix it works: > > > > $ ./perf stat -e instructions -D 100 sleep 2 > > > > Events disabled > > > > Events enabled > > > > > > > > Performance counter stats for 'sleep 2': > > > > > > > > 404,214 instructions > > > > > > > > 1.901743475 seconds time elapsed > > > > > > > > 0.001617000 seconds user > > > > 0.000000000 seconds sys > > > > > > > > Fixes: c587e77e100f ("perf stat: Do not delay the workload with --delay") > > > > > > Yeap, even the comment states that we need to enable when initial_delay > > > is set :-) > > > > Right, but the logic that checks the initial_delay is placed > > out of the function. Just checking the initial_delay value > > can be confusing as it can have a negative value. > > > > Maybe we can add an argument (bool force?) to the > > enable_counters() function. > > > Yes, it could be done. > > Maybe we can fold the 'initial_delay' into 'struct target', and provide > consistent behaviour for all subcommands. Here I add a > target__enable_counter_on_exec() to determine whether enable_on_exec should > set for all counters. Maybe just target__enable_on_exec() ? Note that we have similar logic in the perf record. Otherwise looks good. Thanks, Namhyung > > $ git diff -U1 > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > index c71d85577de6..e807be2214c7 100644 > --- a/tools/perf/builtin-stat.c > +++ b/tools/perf/builtin-stat.c > @@ -541,8 +541,3 @@ static int enable_counters(void) > > - /* > - * We need to enable counters only if: > - * - we don't have tracee (attaching to task or cpu) > - * - we have initial delay configured > - */ > - if (!target__none(&target) || stat_config.initial_delay) { > + if (!target__enable_counter_on_exec(&target)) { > if (!all_counters_use_bpf) > @@ -916,3 +911,3 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > > - if (stat_config.initial_delay) { > + if (target.initial_delay) { > pr_info(EVLIST_DISABLED_MSG); > @@ -928,4 +923,4 @@ static int __run_perf_stat(int argc, const char **argv, int run_idx) > > - if (stat_config.initial_delay > 0) { > - usleep(stat_config.initial_delay * USEC_PER_MSEC); > + if (target.initial_delay > 0) { > + usleep(target.initial_delay * USEC_PER_MSEC); > err = enable_counters(); > @@ -1250,3 +1245,3 @@ static struct option stat_options[] = { > "aggregate counts per numa node", AGGR_NODE), > - OPT_INTEGER('D', "delay", &stat_config.initial_delay, > + OPT_INTEGER('D', "delay", &target.initial_delay, > "ms to wait before starting measurement after program start (-1: start with events disabled)"), > diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c > index 534d36d26fc3..40984d124db1 100644 > --- a/tools/perf/util/stat.c > +++ b/tools/perf/util/stat.c > @@ -848,3 +848,3 @@ int create_perf_stat_counter(struct evsel *evsel, > */ > - if (target__none(target) && !config->initial_delay) > + if (target__enable_counter_on_exec(target)) > attr->enable_on_exec = 1; > diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h > index daec6cba500d..a6721b644bfb 100644 > --- a/tools/perf/util/target.h > +++ b/tools/perf/util/target.h > @@ -20,2 +20,3 @@ struct target { > bool hybrid; > + int initial_delay; > const char *attr_map; > @@ -74,2 +75,7 @@ static inline bool target__none(struct target *target) > > +static inline bool target__enable_counter_on_exec(struct target *target) > +{ > + return target__none(target) && !target->initial_delay; > +} > + > > > Thanks, > > Namhyung > > > > > > > > > > I added the additional test output below. > > > > > > Namhyung, can you please ack it? > > > > > > - Arnaldo > > > > > > Committer testing: > > > > > > Before: > > > > > > Lets use stress-ng so that we have lots of samples using a CPU stressor > > > and also intermingle the workload output with the messages about when > > > the events get enabled (i.e. later on in the workload): > > > > > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > > > Events disabled > > > stress-ng: info: [38361] setting to a 1 second run per stressor > > > stress-ng: info: [38361] dispatching hogs: 32 cpu > > > Events enabled > > > stress-ng: info: [38361] successful run completed in 1.01s > > > > > > Performance counter stats for 'stress-ng -c 32 -t 1': > > > > > > <not counted> instructions:u > > > > > > 0.916479141 seconds time elapsed > > > > > > 30.868003000 seconds user > > > 0.049851000 seconds sys > > > > > > > > > Some events weren't counted. Try disabling the NMI watchdog: > > > echo 0 > /proc/sys/kernel/nmi_watchdog > > > perf stat ... > > > echo 1 > /proc/sys/kernel/nmi_watchdog > > > $ > > > > > > After the fix: > > > > > > $ perf stat -e instructions -D 100 stress-ng -c 32 -t 1 > > > Events disabled > > > stress-ng: info: [40429] setting to a 1 second run per stressor > > > stress-ng: info: [40429] dispatching hogs: 32 cpu > > > Events enabled > > > stress-ng: info: [40429] successful run completed in 1.01s > > > > > > Performance counter stats for 'stress-ng -c 32 -t 1': > > > > > > 154117865145 instructions:u > > > > > > 0.920827644 seconds time elapsed > > > > > > 30.864753000 seconds user > > > 0.073862000 seconds sys > > > > > > > > > $ > > > > > > > Signed-off-by: Changbin Du <changbin.du@huawei.com> > > > > --- > > > > tools/perf/builtin-stat.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c > > > > index 9f3e4b257516..c71d85577de6 100644 > > > > --- a/tools/perf/builtin-stat.c > > > > +++ b/tools/perf/builtin-stat.c > > > > @@ -544,7 +544,7 @@ static int enable_counters(void) > > > > * - we don't have tracee (attaching to task or cpu) > > > > * - we have initial delay configured > > > > */ > > > > - if (!target__none(&target)) { > > > > + if (!target__none(&target) || stat_config.initial_delay) { > > > > if (!all_counters_use_bpf) > > > > evlist__enable(evsel_list); > > > > } > > > > -- > > > > 2.25.1 > > > > > > > > > > -- > > > > > > - Arnaldo > > > > -- > Cheers, > Changbin Du
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 9f3e4b257516..c71d85577de6 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -544,7 +544,7 @@ static int enable_counters(void) * - we don't have tracee (attaching to task or cpu) * - we have initial delay configured */ - if (!target__none(&target)) { + if (!target__none(&target) || stat_config.initial_delay) { if (!all_counters_use_bpf) evlist__enable(evsel_list); }