[v2] perf vendor events amd: Fix large metrics

Message ID 20230706063440.54189-1-sandipan.das@amd.com
State New

Commit Message

Sandipan Das July 6, 2023, 6:34 a.m. UTC
  There are cases where a metric requires more events than the number of
available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
data fabric counters but the "nps1_die_to_dram" metric has eight events.
By default, the constituent events are placed in a group and since the
events cannot be scheduled at the same time, the metric is not computed.
The "all metrics" test also fails because of this.

Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
the user to run perf with "--metric-no-group".
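
The mismatch described above can be sketched in a few lines of Python.
This is only an illustration, not part of the patch: the helper names
(event_names, needs_no_group) are hypothetical, the counter budget of
four comes from the description above, and the simple tokenizer only
suits plain sums of event names such as the "nps1_die_to_dram"
expression.

```python
import re

# Assumption taken from the commit message: Zen, Zen 2 and Zen 3 have
# four data fabric counters.
DF_COUNTERS = 4


def event_names(metric_expr):
    """Return the distinct identifiers used in a MetricExpr string.

    Hypothetical helper: good enough for plain sums of event names,
    not a parser for the full metric expression grammar.
    """
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", metric_expr))


def needs_no_group(metric, counters=DF_COUNTERS):
    """True if the metric uses more events than there are counters and
    is not already marked with the NO_GROUP_EVENTS constraint."""
    too_many = len(event_names(metric["MetricExpr"])) > counters
    unmarked = metric.get("MetricConstraint") != "NO_GROUP_EVENTS"
    return too_many and unmarked


metric = {
    "MetricName": "nps1_die_to_dram",
    "MetricExpr": " + ".join(
        f"dram_channel_data_controller_{i}" for i in range(8)
    ),
}

print(len(event_names(metric["MetricExpr"])))  # 8 events, 4 counters
print(needs_no_group(metric))                  # True

metric["MetricConstraint"] = "NO_GROUP_EVENTS"
print(needs_no_group(metric))                  # False
```

A check of this kind could be run over a metrics JSON file to spot
other metrics that cannot be scheduled as a single group.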

E.g.

  $ sudo perf test -v 101

Before:

  101: perf all metrics test                                           :
  --- start ---
  test child forked, pid 37131
  Testing branch_misprediction_ratio
  Testing all_remote_links_outbound
  Testing nps1_die_to_dram
  Metric 'nps1_die_to_dram' not printed in:
  Error:
  Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
  Testing macro_ops_dispatched
  Testing all_l2_cache_accesses
  Testing all_l2_cache_hits
  Testing all_l2_cache_misses
  Testing ic_fetch_miss_ratio
  Testing l2_cache_accesses_from_l2_hwpf
  Testing l2_cache_misses_from_l2_hwpf
  Testing op_cache_fetch_miss_ratio
  Testing l3_read_miss_latency
  Testing l1_itlb_misses
  test child finished with -1
  ---- end ----
  perf all metrics test: FAILED!

After:

  101: perf all metrics test                                           :
  --- start ---
  test child forked, pid 43766
  Testing branch_misprediction_ratio
  Testing all_remote_links_outbound
  Testing nps1_die_to_dram
  Testing macro_ops_dispatched
  Testing all_l2_cache_accesses
  Testing all_l2_cache_hits
  Testing all_l2_cache_misses
  Testing ic_fetch_miss_ratio
  Testing l2_cache_accesses_from_l2_hwpf
  Testing l2_cache_misses_from_l2_hwpf
  Testing op_cache_fetch_miss_ratio
  Testing l3_read_miss_latency
  Testing l1_itlb_misses
  test child finished with 0
  ---- end ----
  perf all metrics test: Ok

Reported-by: Ayush Jain <ayush.jain3@amd.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
---

Previous versions can be found at:
v1: https://lore.kernel.org/all/20230614090710.680330-1-sandipan.das@amd.com/

Changes in v2:
- As suggested by Ian, use the NO_GROUP_EVENTS constraint instead of
  retrying the test scenario with --metric-no-group.
- Change the commit message accordingly.

 tools/perf/pmu-events/arch/x86/amdzen1/recommended.json | 3 ++-
 tools/perf/pmu-events/arch/x86/amdzen2/recommended.json | 3 ++-
 tools/perf/pmu-events/arch/x86/amdzen3/recommended.json | 3 ++-
 3 files changed, 6 insertions(+), 3 deletions(-)
  

Comments

Ian Rogers July 6, 2023, 1:49 p.m. UTC | #1
On Wed, Jul 5, 2023 at 11:34 PM Sandipan Das <sandipan.das@amd.com> wrote:
> Reported-by: Ayush Jain <ayush.jain3@amd.com>
> Suggested-by: Ian Rogers <irogers@google.com>
> Signed-off-by: Sandipan Das <sandipan.das@amd.com>

Acked-by: Ian Rogers <irogers@google.com>

Will there be a PMU driver fix so that the perf_event_open fails for
the group? That way the weak group would work.
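
For readers unfamiliar with weak groups, the idea behind the question
can be modelled roughly as follows. This is a toy Python model, not
perf or kernel code: open_events, permissive_open and validating_open
are hypothetical names, and the counter count of four is taken from
the commit message. The point is that the fallback to ungrouped events
only triggers when opening the group actually fails.

```python
NUM_COUNTERS = 4  # data fabric counters, per the commit message


def open_events(events, try_open_group, open_single):
    """Toy model of weak-group handling: try to open the events as one
    group, and fall back to opening each event on its own if the group
    is rejected."""
    try:
        try_open_group(events)
        return "grouped"
    except OSError:
        for ev in events:
            open_single(ev)
        return "ungrouped"


def permissive_open(events):
    """Models a driver without group validation: any group size is
    accepted, even one that can never be scheduled."""
    return None


def validating_open(events):
    """Models a driver with group validation: a group that needs more
    counters than exist is rejected when it is opened."""
    if len(events) > NUM_COUNTERS:
        raise OSError(22, "group needs more counters than available")


events = [f"dram_channel_data_controller_{i}" for i in range(8)]
opened = []

# Without validation the oversized group is accepted, so no fallback
# happens and the metric cannot be computed.
print(open_events(events, permissive_open, opened.append))  # grouped

# With validation the open fails and the events are opened one by one.
print(open_events(events, validating_open, opened.append))  # ungrouped
```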

Thanks,
Ian

  
Sandipan Das July 6, 2023, 2:22 p.m. UTC | #2
Hi Ian,

On 7/6/2023 7:19 PM, Ian Rogers wrote:
> 
> Acked-by: Ian Rogers <irogers@google.com>
> 
> Will there be a PMU driver fix so that the perf_event_open fails for
> the group? That way the weak group would work.
> 

Yes, that's in our plan. Ravi (in CC) and I have discussed adding
group validation in the event_init() path.

- Sandipan
  
Arnaldo Carvalho de Melo July 11, 2023, 2:51 p.m. UTC | #3
On Thu, Jul 06, 2023 at 06:49:29AM -0700, Ian Rogers wrote:
> 
> Acked-by: Ian Rogers <irogers@google.com>

Thanks, applied.

- Arnaldo

 
  
Namhyung Kim July 11, 2023, 5:34 p.m. UTC | #4
On Tue, Jul 11, 2023 at 7:51 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Jul 06, 2023 at 06:49:29AM -0700, Ian Rogers wrote:
> >
> > Acked-by: Ian Rogers <irogers@google.com>
>
> Thanks, applied.

If I'm not too late..

Tested-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung
  

Patch

diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
index bf5083c1c260..4d28177325a0 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
@@ -169,8 +169,9 @@ 
   },
   {
     "MetricName": "nps1_die_to_dram",
-    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
     "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+    "MetricConstraint": "NO_GROUP_EVENTS",
     "MetricGroup": "data_fabric",
     "PerPkg": "1",
     "ScaleUnit": "6.1e-5MiB"
diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
index a71694a043ba..60e19456d4c8 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json
@@ -169,8 +169,9 @@ 
   },
   {
     "MetricName": "nps1_die_to_dram",
-    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
     "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
+    "MetricConstraint": "NO_GROUP_EVENTS",
     "MetricGroup": "data_fabric",
     "PerPkg": "1",
     "ScaleUnit": "6.1e-5MiB"
diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
index 988cf68ae825..3e9e1781812e 100644
--- a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
+++ b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json
@@ -205,10 +205,11 @@ 
   },
   {
     "MetricName": "nps1_die_to_dram",
-    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)",
+    "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)",
     "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7",
     "MetricGroup": "data_fabric",
     "PerPkg": "1",
+    "MetricConstraint": "NO_GROUP_EVENTS",
     "ScaleUnit": "6.1e-5MiB"
   }
 ]